author     Richard Sandiford <richard.sandiford@linaro.org>  2017-11-05 17:19:35 +0000
committer  Richard Sandiford <richard.sandiford@linaro.org>  2017-11-05 17:19:35 +0000
commit     648f8fc59b2cc39abd24f4c22388b346cdebcc31 (patch)
tree       3a07eccc4c22b265261edd75c9ec3910d9c626f5 /gcc
parent     7bef5b82e4109778a0988d20e19e1ed29dadd835 (diff)
parent     8c089b5c15a7b35644750ca393f1e66071ad9aa9 (diff)
download   gcc-648f8fc59b2cc39abd24f4c22388b346cdebcc31.tar.gz
Merge trunk into sve
Diffstat (limited to 'gcc')
-rw-r--r--gcc/ChangeLog3404
-rw-r--r--gcc/DATESTAMP2
-rw-r--r--gcc/Makefile.in7
-rw-r--r--gcc/acinclude.m440
-rw-r--r--gcc/ada/ChangeLog498
-rw-r--r--gcc/ada/Makefile.rtl1
-rw-r--r--gcc/ada/ali.adb27
-rw-r--r--gcc/ada/ali.ads10
-rw-r--r--gcc/ada/bindgen.adb237
-rw-r--r--gcc/ada/bindusg.adb7
-rw-r--r--gcc/ada/checks.adb4
-rw-r--r--gcc/ada/cstand.adb49
-rw-r--r--gcc/ada/debug.adb11
-rw-r--r--gcc/ada/doc/gnat_rm/implementation_defined_aspects.rst14
-rw-r--r--gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst18
-rw-r--r--gcc/ada/doc/gnat_ugn/elaboration_order_handling_in_gnat.rst62
-rw-r--r--gcc/ada/doc/gnat_ugn/gnat_and_program_execution.rst16
-rw-r--r--gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst6
-rw-r--r--gcc/ada/einfo.ads8
-rw-r--r--gcc/ada/exp_aggr.adb1
-rw-r--r--gcc/ada/exp_attr.adb13
-rw-r--r--gcc/ada/exp_ch11.adb33
-rw-r--r--gcc/ada/exp_ch11.ads7
-rw-r--r--gcc/ada/exp_ch3.adb164
-rw-r--r--gcc/ada/exp_ch4.adb8
-rw-r--r--gcc/ada/exp_ch6.adb81
-rw-r--r--gcc/ada/exp_ch9.adb138
-rw-r--r--gcc/ada/exp_util.adb27
-rw-r--r--gcc/ada/fe.h2
-rw-r--r--gcc/ada/freeze.adb2
-rw-r--r--gcc/ada/gcc-interface/Make-lang.in2
-rw-r--r--gcc/ada/gcc-interface/Makefile.in16
-rw-r--r--gcc/ada/gcc-interface/gigi.h4
-rw-r--r--gcc/ada/gcc-interface/misc.c19
-rw-r--r--gcc/ada/gcc-interface/trans.c73
-rw-r--r--gcc/ada/gcc-interface/utils.c20
-rw-r--r--gcc/ada/gcc-interface/utils2.c24
-rw-r--r--gcc/ada/gnat_rm.texi21
-rw-r--r--gcc/ada/gnat_ugn.texi135
-rw-r--r--gcc/ada/layout.adb32
-rw-r--r--gcc/ada/layout.ads5
-rw-r--r--gcc/ada/lib-load.adb139
-rw-r--r--gcc/ada/lib-writ.adb17
-rw-r--r--gcc/ada/lib-writ.ads29
-rw-r--r--gcc/ada/lib.adb30
-rw-r--r--gcc/ada/lib.ads140
-rw-r--r--gcc/ada/libgnarl/s-osinte__linux.ads3
-rw-r--r--gcc/ada/libgnarl/s-solita.adb31
-rw-r--r--gcc/ada/libgnarl/s-taprop__linux.adb280
-rw-r--r--gcc/ada/libgnarl/s-taprop__mingw.adb11
-rw-r--r--gcc/ada/libgnarl/s-taprop__posix.adb253
-rw-r--r--gcc/ada/libgnarl/s-taprop__solaris.adb11
-rw-r--r--gcc/ada/libgnarl/s-taprop__vxworks.adb11
-rw-r--r--gcc/ada/libgnarl/s-tarest.adb189
-rw-r--r--gcc/ada/libgnarl/s-tarest.ads65
-rw-r--r--gcc/ada/libgnarl/s-taskin.adb3
-rw-r--r--gcc/ada/libgnarl/s-taskin.ads14
-rw-r--r--gcc/ada/libgnarl/s-tassta.adb93
-rw-r--r--gcc/ada/libgnarl/s-tassta.ads21
-rw-r--r--gcc/ada/libgnarl/s-tpopmo.adb283
-rw-r--r--gcc/ada/libgnarl/s-tporft.adb21
-rw-r--r--gcc/ada/libgnat/s-parame.adb28
-rw-r--r--gcc/ada/libgnat/s-parame.ads32
-rw-r--r--gcc/ada/libgnat/s-parame__ae653.ads26
-rw-r--r--gcc/ada/libgnat/s-parame__hpux.ads26
-rw-r--r--gcc/ada/libgnat/s-parame__rtems.adb48
-rw-r--r--gcc/ada/libgnat/s-parame__vxworks.adb12
-rw-r--r--gcc/ada/libgnat/s-parame__vxworks.ads26
-rw-r--r--gcc/ada/libgnat/s-secsta.adb470
-rw-r--r--gcc/ada/libgnat/s-secsta.ads198
-rw-r--r--gcc/ada/libgnat/s-soflin.adb81
-rw-r--r--gcc/ada/libgnat/s-soflin.ads50
-rw-r--r--gcc/ada/libgnat/s-soliin.adb47
-rw-r--r--gcc/ada/libgnat/s-soliin.ads48
-rw-r--r--gcc/ada/libgnat/s-thread.ads6
-rw-r--r--gcc/ada/libgnat/s-thread__ae653.adb45
-rw-r--r--gcc/ada/opt.ads28
-rw-r--r--gcc/ada/rtfinal.c4
-rw-r--r--gcc/ada/rtsfind.ads6
-rw-r--r--gcc/ada/sem_aggr.adb6
-rw-r--r--gcc/ada/sem_ch12.adb8
-rw-r--r--gcc/ada/sem_ch3.adb22
-rw-r--r--gcc/ada/sem_ch4.adb16
-rw-r--r--gcc/ada/sem_ch5.adb12
-rw-r--r--gcc/ada/sem_ch6.adb8
-rw-r--r--gcc/ada/sem_ch8.adb29
-rw-r--r--gcc/ada/sem_dim.adb58
-rw-r--r--gcc/ada/sem_elab.adb744
-rw-r--r--gcc/ada/sem_prag.adb36
-rw-r--r--gcc/ada/sem_res.adb5
-rw-r--r--gcc/ada/sem_type.adb8
-rw-r--r--gcc/ada/sem_util.adb84
-rw-r--r--gcc/ada/sem_util.ads5
-rw-r--r--gcc/ada/sem_warn.adb2
-rw-r--r--gcc/ada/sinfo.adb16
-rw-r--r--gcc/ada/sinfo.ads22
-rw-r--r--gcc/ada/sinput.ads2
-rw-r--r--gcc/ada/switch-b.adb12
-rw-r--r--gcc/ada/switch-c.adb1
-rw-r--r--gcc/ada/widechar.ads5
-rw-r--r--gcc/alias.c12
-rw-r--r--gcc/asan.c17
-rw-r--r--gcc/attribs.c45
-rw-r--r--gcc/attribs.h10
-rw-r--r--gcc/auto-profile.c24
-rw-r--r--gcc/basic-block.h16
-rw-r--r--gcc/bb-reorder.c119
-rw-r--r--gcc/brig/ChangeLog9
-rw-r--r--gcc/brig/brig-lang.c10
-rw-r--r--gcc/bt-load.c2
-rw-r--r--gcc/builtin-types.def14
-rw-r--r--gcc/builtins.c107
-rw-r--r--gcc/builtins.def42
-rw-r--r--gcc/c-family/ChangeLog49
-rw-r--r--gcc/c-family/c-attribs.c27
-rw-r--r--gcc/c-family/c-common.c26
-rw-r--r--gcc/c-family/c-cppbuiltin.c7
-rw-r--r--gcc/c-family/c-format.c45
-rw-r--r--gcc/c-family/c-gimplify.c2
-rw-r--r--gcc/c-family/c-opts.c30
-rw-r--r--gcc/c-family/c-warn.c22
-rw-r--r--gcc/c-family/c.opt12
-rw-r--r--gcc/c/ChangeLog44
-rw-r--r--gcc/c/c-decl.c45
-rw-r--r--gcc/c/c-parser.c77
-rw-r--r--gcc/c/c-typeck.c29
-rw-r--r--gcc/c/gimple-parser.c21
-rw-r--r--gcc/calls.c49
-rw-r--r--gcc/cfg.c127
-rw-r--r--gcc/cfg.h5
-rw-r--r--gcc/cfganal.c40
-rw-r--r--gcc/cfganal.h3
-rw-r--r--gcc/cfgbuild.c17
-rw-r--r--gcc/cfgcleanup.c43
-rw-r--r--gcc/cfgexpand.c64
-rw-r--r--gcc/cfghooks.c70
-rw-r--r--gcc/cfgloop.c8
-rw-r--r--gcc/cfgloopanal.c47
-rw-r--r--gcc/cfgloopmanip.c51
-rw-r--r--gcc/cfgrtl.c37
-rw-r--r--gcc/cgraph.c79
-rw-r--r--gcc/cgraph.h4
-rw-r--r--gcc/cgraphbuild.c19
-rw-r--r--gcc/cgraphunit.c27
-rw-r--r--gcc/color-macros.h108
-rw-r--r--gcc/combine.c26
-rw-r--r--gcc/common.opt47
-rw-r--r--gcc/common/config/i386/i386-common.c48
-rw-r--r--gcc/compare-elim.c204
-rw-r--r--gcc/config.gcc34
-rw-r--r--gcc/config.in6
-rw-r--r--gcc/config/aarch64/aarch64-builtins.c10
-rw-r--r--gcc/config/aarch64/aarch64-c.c3
-rw-r--r--gcc/config/aarch64/aarch64-cores.def11
-rw-r--r--gcc/config/aarch64/aarch64-modes.def32
-rw-r--r--gcc/config/aarch64/aarch64-option-extensions.def17
-rw-r--r--gcc/config/aarch64/aarch64-protos.h9
-rw-r--r--gcc/config/aarch64/aarch64-simd-builtins.def8
-rw-r--r--gcc/config/aarch64/aarch64-simd.md214
-rw-r--r--gcc/config/aarch64/aarch64-sve.md837
-rw-r--r--gcc/config/aarch64/aarch64-tune.md2
-rw-r--r--gcc/config/aarch64/aarch64.c1248
-rw-r--r--gcc/config/aarch64/aarch64.h49
-rw-r--r--gcc/config/aarch64/aarch64.md71
-rw-r--r--gcc/config/aarch64/arm_neon.h93
-rw-r--r--gcc/config/aarch64/constraints.md52
-rw-r--r--gcc/config/aarch64/iterators.md282
-rw-r--r--gcc/config/aarch64/predicates.md22
-rw-r--r--gcc/config/alpha/sync.md2
-rw-r--r--gcc/config/arc/arc.c49
-rw-r--r--gcc/config/arc/linux.h8
-rw-r--r--gcc/config/arm/arm-builtins.c14
-rw-r--r--gcc/config/arm/arm-c.c6
-rw-r--r--gcc/config/arm/arm-cpus.in27
-rw-r--r--gcc/config/arm/arm.c37
-rw-r--r--gcc/config/arm/arm.h7
-rw-r--r--gcc/config/arm/arm.md54
-rw-r--r--gcc/config/arm/arm_neon_builtins.def4
-rw-r--r--gcc/config/arm/iterators.md9
-rw-r--r--gcc/config/arm/neon.md88
-rw-r--r--gcc/config/arm/t-multilib2
-rw-r--r--gcc/config/arm/types.md8
-rw-r--r--gcc/config/arm/unspecs.md2
-rw-r--r--gcc/config/arm/vfp.md60
-rw-r--r--gcc/config/cris/cris.h2
-rw-r--r--gcc/config/dbxcoff.h4
-rw-r--r--gcc/config/ft32/ft32.c7
-rw-r--r--gcc/config/ft32/ft32.h7
-rw-r--r--gcc/config/ft32/ft32.md8
-rw-r--r--gcc/config/ft32/ft32.opt12
-rw-r--r--gcc/config/gnu-user.h6
-rw-r--r--gcc/config/i386/avx512dqintrin.h85
-rw-r--r--gcc/config/i386/avx512fintrin.h320
-rw-r--r--gcc/config/i386/cet.c76
-rw-r--r--gcc/config/i386/cetintrin.h134
-rw-r--r--gcc/config/i386/constraints.md5
-rw-r--r--gcc/config/i386/cpuid.h3
-rw-r--r--gcc/config/i386/cygming.h4
-rw-r--r--gcc/config/i386/driver-i386.c11
-rw-r--r--gcc/config/i386/gas.h4
-rw-r--r--gcc/config/i386/gfniintrin.h229
-rw-r--r--gcc/config/i386/i386-builtin-types.def8
-rw-r--r--gcc/config/i386/i386-builtin.def34
-rw-r--r--gcc/config/i386/i386-c.c14
-rw-r--r--gcc/config/i386/i386-modes.def12
-rw-r--r--gcc/config/i386/i386-passes.def2
-rw-r--r--gcc/config/i386/i386-protos.h5
-rw-r--r--gcc/config/i386/i386.c1269
-rw-r--r--gcc/config/i386/i386.h49
-rw-r--r--gcc/config/i386/i386.md449
-rw-r--r--gcc/config/i386/i386.opt24
-rw-r--r--gcc/config/i386/immintrin.h4
-rw-r--r--gcc/config/i386/linux-common.h5
-rw-r--r--gcc/config/i386/predicates.md52
-rw-r--r--gcc/config/i386/sol2.h10
-rw-r--r--gcc/config/i386/sse.md556
-rw-r--r--gcc/config/i386/subst.md17
-rw-r--r--gcc/config/i386/sync.md100
-rw-r--r--gcc/config/i386/t-cet21
-rw-r--r--gcc/config/i386/winnt.c3
-rw-r--r--gcc/config/i386/x86-tune-costs.h1435
-rw-r--r--gcc/config/i386/x86-tune.def21
-rw-r--r--gcc/config/ia64/ia64.h2
-rw-r--r--gcc/config/m68k/m68kelf.h2
-rw-r--r--gcc/config/mips/mips.h4
-rw-r--r--gcc/config/mmix/mmix.h2
-rw-r--r--gcc/config/msp430/msp430.c4
-rw-r--r--gcc/config/nds32/nds32.c2
-rw-r--r--gcc/config/nios2/constraints.md4
-rw-r--r--gcc/config/nios2/nios2-protos.h6
-rw-r--r--gcc/config/nios2/nios2.c496
-rw-r--r--gcc/config/nios2/nios2.h1
-rw-r--r--gcc/config/nios2/nios2.md81
-rw-r--r--gcc/config/nios2/nios2.opt8
-rw-r--r--gcc/config/pa/pa.h4
-rw-r--r--gcc/config/powerpcspe/powerpcspe.c4
-rw-r--r--gcc/config/riscv/pic.md11
-rw-r--r--gcc/config/riscv/riscv.c16
-rw-r--r--gcc/config/riscv/riscv.md3
-rw-r--r--gcc/config/rl78/rl78-protos.h10
-rw-r--r--gcc/config/rl78/rl78.c40
-rw-r--r--gcc/config/rl78/rl78.md20
-rw-r--r--gcc/config/rs6000/aix.h3
-rw-r--r--gcc/config/rs6000/altivec.md5
-rw-r--r--gcc/config/rs6000/darwin.h3
-rw-r--r--gcc/config/rs6000/emmintrin.h2340
-rw-r--r--gcc/config/rs6000/rs6000-builtin.def8
-rw-r--r--gcc/config/rs6000/rs6000-protos.h2
-rw-r--r--gcc/config/rs6000/rs6000.c80
-rw-r--r--gcc/config/rs6000/rs6000.md37
-rw-r--r--gcc/config/rs6000/rs6000.opt4
-rw-r--r--gcc/config/rs6000/x86intrin.h2
-rw-r--r--gcc/config/s390/s390.c111
-rw-r--r--gcc/config/spu/spu.c2
-rw-r--r--gcc/config/stormy16/stormy16.h2
-rw-r--r--gcc/config/visium/visium.c17
-rw-r--r--gcc/config/visium/visium.h5
-rw-r--r--gcc/config/vx-common.h1
-rwxr-xr-xgcc/configure109
-rw-r--r--gcc/configure.ac57
-rw-r--r--gcc/cp/ChangeLog360
-rw-r--r--gcc/cp/call.c73
-rw-r--r--gcc/cp/class.c50
-rw-r--r--gcc/cp/constexpr.c3
-rw-r--r--gcc/cp/cp-objcp-common.c53
-rw-r--r--gcc/cp/cp-tree.h195
-rw-r--r--gcc/cp/cvt.c21
-rw-r--r--gcc/cp/decl.c1109
-rw-r--r--gcc/cp/decl2.c6
-rw-r--r--gcc/cp/dump.c165
-rw-r--r--gcc/cp/error.c29
-rw-r--r--gcc/cp/init.c2
-rw-r--r--gcc/cp/lambda.c11
-rw-r--r--gcc/cp/lex.c172
-rw-r--r--gcc/cp/mangle.c58
-rw-r--r--gcc/cp/method.c19
-rw-r--r--gcc/cp/name-lookup.c16
-rw-r--r--gcc/cp/name-lookup.h13
-rw-r--r--gcc/cp/operators.def192
-rw-r--r--gcc/cp/parser.c257
-rw-r--r--gcc/cp/parser.h4
-rw-r--r--gcc/cp/pt.c87
-rw-r--r--gcc/cp/ptree.c1
-rw-r--r--gcc/cp/rtti.c6
-rw-r--r--gcc/cp/semantics.c15
-rw-r--r--gcc/cp/tree.c20
-rw-r--r--gcc/cp/typeck.c33
-rw-r--r--gcc/dbxout.c2
-rw-r--r--gcc/debug.h1
-rw-r--r--gcc/defaults.h9
-rw-r--r--gcc/diagnostic-color.c85
-rw-r--r--gcc/diagnostic-core.h17
-rw-r--r--gcc/diagnostic-show-locus.c29
-rw-r--r--gcc/diagnostic.c128
-rw-r--r--gcc/doc/cpp.texi26
-rw-r--r--gcc/doc/extend.texi86
-rw-r--r--gcc/doc/gcov.texi34
-rw-r--r--gcc/doc/generic.texi6
-rw-r--r--gcc/doc/gimple.texi12
-rw-r--r--gcc/doc/install.texi93
-rw-r--r--gcc/doc/invoke.texi276
-rw-r--r--gcc/doc/md.texi9
-rw-r--r--gcc/doc/passes.texi9
-rw-r--r--gcc/doc/poly-int.texi14
-rw-r--r--gcc/doc/rtl.texi37
-rw-r--r--gcc/doc/sourcebuild.texi64
-rw-r--r--gcc/doc/standards.texi8
-rw-r--r--gcc/doc/tm.texi71
-rw-r--r--gcc/doc/tm.texi.in60
-rw-r--r--gcc/dojump.c2
-rw-r--r--gcc/dse.c6
-rw-r--r--gcc/dwarf2cfi.c13
-rw-r--r--gcc/dwarf2out.c118
-rw-r--r--gcc/early-remat.c14
-rw-r--r--gcc/emit-rtl.c21
-rw-r--r--gcc/except.c1
-rw-r--r--gcc/explow.c19
-rw-r--r--gcc/explow.h3
-rw-r--r--gcc/expmed.c32
-rw-r--r--gcc/expr.c106
-rw-r--r--gcc/file-find.c35
-rw-r--r--gcc/file-find.h1
-rw-r--r--gcc/final.c32
-rw-r--r--gcc/flag-types.h13
-rw-r--r--gcc/fold-const-call.c5
-rw-r--r--gcc/fold-const.c208
-rw-r--r--gcc/fortran/ChangeLog172
-rw-r--r--gcc/fortran/check.c2
-rw-r--r--gcc/fortran/decl.c31
-rw-r--r--gcc/fortran/frontend-passes.c30
-rw-r--r--gcc/fortran/gfortran.h15
-rw-r--r--gcc/fortran/interface.c78
-rw-r--r--gcc/fortran/invoke.texi2
-rw-r--r--gcc/fortran/match.c7
-rw-r--r--gcc/fortran/misc.c41
-rw-r--r--gcc/fortran/openmp.c30
-rw-r--r--gcc/fortran/parse.c3
-rw-r--r--gcc/fortran/resolve.c173
-rw-r--r--gcc/fortran/scanner.c10
-rw-r--r--gcc/fortran/simplify.c7
-rw-r--r--gcc/fortran/symbol.c106
-rw-r--r--gcc/fortran/trans-decl.c13
-rw-r--r--gcc/fortran/trans-expr.c39
-rw-r--r--gcc/fortran/trans-io.c4
-rw-r--r--gcc/function.c50
-rw-r--r--gcc/gcc-ar.c8
-rw-r--r--gcc/gcc.c14
-rw-r--r--gcc/gcov.c550
-rw-r--r--gcc/gdbinit.in3
-rw-r--r--gcc/gencfn-macros.c50
-rw-r--r--gcc/genmodes.c58
-rw-r--r--gcc/gimple-fold.c3
-rw-r--r--gcc/gimple-laddress.c2
-rw-r--r--gcc/gimple-pretty-print.c36
-rw-r--r--gcc/gimple-ssa-backprop.c3
-rw-r--r--gcc/gimple-ssa-isolate-paths.c4
-rw-r--r--gcc/gimple-ssa-sprintf.c135
-rw-r--r--gcc/gimple-ssa-store-merging.c1527
-rw-r--r--gcc/gimple-ssa-warn-alloca.c85
-rw-r--r--gcc/gimple-streamer-in.c1
-rw-r--r--gcc/gimple-streamer-out.c1
-rw-r--r--gcc/gimple.c48
-rw-r--r--gcc/gimple.h22
-rw-r--r--gcc/gimplify.c23
-rw-r--r--gcc/go/gofrontend/MERGE2
-rw-r--r--gcc/go/gofrontend/expressions.cc16
-rw-r--r--gcc/graphite-dependences.c6
-rw-r--r--gcc/graphite-isl-ast-to-gimple.c387
-rw-r--r--gcc/graphite-scop-detection.c68
-rw-r--r--gcc/graphite-sese-to-poly.c98
-rw-r--r--gcc/graphite.c123
-rw-r--r--gcc/haifa-sched.c115
-rw-r--r--gcc/hsa-gen.c17
-rw-r--r--gcc/ifcvt.c2
-rw-r--r--gcc/internal-fn.c6
-rw-r--r--gcc/internal-fn.def14
-rw-r--r--gcc/ipa-cp.c7
-rw-r--r--gcc/ipa-fnsummary.c35
-rw-r--r--gcc/ipa-fnsummary.h4
-rw-r--r--gcc/ipa-icf.c6
-rw-r--r--gcc/ipa-inline-transform.c15
-rw-r--r--gcc/ipa-inline.c41
-rw-r--r--gcc/ipa-profile.c95
-rw-r--r--gcc/ipa-pure-const.c338
-rw-r--r--gcc/ipa-split.c16
-rw-r--r--gcc/ipa-utils.c48
-rw-r--r--gcc/ira-build.c7
-rw-r--r--gcc/ira-color.c9
-rw-r--r--gcc/ira.c6
-rw-r--r--gcc/jit/ChangeLog7
-rw-r--r--gcc/jit/docs/_build/texinfo/libgccjit.texi18
-rw-r--r--gcc/jit/docs/internals/index.rst10
-rw-r--r--gcc/langhooks.c4
-rw-r--r--gcc/langhooks.h8
-rw-r--r--gcc/loop-doloop.c4
-rw-r--r--gcc/loop-iv.c2
-rw-r--r--gcc/loop-unroll.c24
-rw-r--r--gcc/lower-subreg.c6
-rw-r--r--gcc/lra-constraints.c32
-rw-r--r--gcc/lra-eliminations.c24
-rw-r--r--gcc/lra-lives.c13
-rw-r--r--gcc/lra-remat.c2
-rw-r--r--gcc/lra.c5
-rw-r--r--gcc/lto-streamer-in.c10
-rw-r--r--gcc/lto-streamer-out.c1
-rw-r--r--gcc/lto/ChangeLog4
-rw-r--r--gcc/lto/lto-lang.c2
-rw-r--r--gcc/machmode.def12
-rw-r--r--gcc/match.pd27
-rw-r--r--gcc/modulo-sched.c18
-rw-r--r--gcc/objc/ChangeLog9
-rw-r--r--gcc/objc/objc-act.c9
-rw-r--r--gcc/objc/objc-gnu-runtime-abi-01.c3
-rw-r--r--gcc/omp-expand.c1
-rw-r--r--gcc/omp-grid.c2
-rw-r--r--gcc/omp-low.c6
-rw-r--r--gcc/omp-simd-clone.c4
-rw-r--r--gcc/optabs.c34
-rw-r--r--gcc/opts.c7
-rw-r--r--gcc/output.h5
-rw-r--r--gcc/params.def7
-rw-r--r--gcc/poly-int.h79
-rw-r--r--gcc/postreload-gcse.c12
-rw-r--r--gcc/postreload.c2
-rw-r--r--gcc/predict.c202
-rw-r--r--gcc/print-rtl.c2
-rw-r--r--gcc/profile-count.c44
-rw-r--r--gcc/profile-count.h230
-rw-r--r--gcc/profile.c59
-rw-r--r--gcc/recog.c9
-rw-r--r--gcc/reg-notes.def7
-rw-r--r--gcc/reg-stack.c57
-rw-r--r--gcc/regcprop.c8
-rw-r--r--gcc/regs.h6
-rw-r--r--gcc/reload.c10
-rw-r--r--gcc/reload1.c8
-rw-r--r--gcc/rtlanal.c10
-rw-r--r--gcc/sanitizer.def24
-rw-r--r--gcc/sbitmap.c99
-rw-r--r--gcc/sbitmap.h25
-rw-r--r--gcc/sched-ebb.c6
-rw-r--r--gcc/sched-int.h11
-rw-r--r--gcc/sdbout.c1661
-rw-r--r--gcc/sdbout.h26
-rw-r--r--gcc/selftest-diagnostic.c62
-rw-r--r--gcc/selftest-diagnostic.h49
-rw-r--r--gcc/selftest-run-tests.c1
-rw-r--r--gcc/selftest.h1
-rw-r--r--gcc/sese.c54
-rw-r--r--gcc/sese.h23
-rw-r--r--gcc/shrink-wrap.c12
-rw-r--r--gcc/simplify-rtx.c11
-rw-r--r--gcc/ssa-iterators.h6
-rw-r--r--gcc/stor-layout.c4
-rw-r--r--gcc/substring-locations.c21
-rw-r--r--gcc/substring-locations.h4
-rw-r--r--gcc/system.h13
-rw-r--r--gcc/target-insns.def1
-rw-r--r--gcc/target.def20
-rw-r--r--gcc/target.h2
-rw-r--r--gcc/targhooks.c32
-rw-r--r--gcc/targhooks.h2
-rw-r--r--gcc/testsuite/ChangeLog1303
-rw-r--r--gcc/testsuite/c-c++-common/Wbuiltin-declaration-mismatch-1.c4
-rw-r--r--gcc/testsuite/c-c++-common/Wno-builtin-declaration-mismatch-1.c4
-rw-r--r--gcc/testsuite/c-c++-common/attr-nocf-check-1.c30
-rw-r--r--gcc/testsuite/c-c++-common/attr-nocf-check-2.c5
-rw-r--r--gcc/testsuite/c-c++-common/attr-nocf-check-3.c29
-rw-r--r--gcc/testsuite/c-c++-common/fcf-protection-1.c4
-rw-r--r--gcc/testsuite/c-c++-common/fcf-protection-2.c4
-rw-r--r--gcc/testsuite/c-c++-common/fcf-protection-3.c4
-rw-r--r--gcc/testsuite/c-c++-common/fcf-protection-4.c2
-rw-r--r--gcc/testsuite/c-c++-common/fcf-protection-5.c4
-rw-r--r--gcc/testsuite/c-c++-common/pr44515.c14
-rw-r--r--gcc/testsuite/c-c++-common/rotate-5.c67
-rw-r--r--gcc/testsuite/c-c++-common/rotate-6.c582
-rw-r--r--gcc/testsuite/c-c++-common/rotate-6a.c6
-rw-r--r--gcc/testsuite/c-c++-common/rotate-7.c582
-rw-r--r--gcc/testsuite/c-c++-common/rotate-7a.c6
-rw-r--r--gcc/testsuite/c-c++-common/rotate-8.c171
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/attrib-5.c3
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/builtin-1.c36
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c224
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-10.c66
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-2.c82
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-3.c24
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-4.c40
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-5.c24
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-6.c24
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-8.c78
-rw-r--r--gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-9.c52
-rw-r--r--gcc/testsuite/g++.dg/asan/asan_test.C1
-rw-r--r--gcc/testsuite/g++.dg/asan/default-options-1.C2
-rw-r--r--gcc/testsuite/g++.dg/cet-notrack-1.C25
-rw-r--r--gcc/testsuite/g++.dg/concepts/pr67595.C13
-rw-r--r--gcc/testsuite/g++.dg/concepts/pr71368.C25
-rw-r--r--gcc/testsuite/g++.dg/concepts/pr71385.C12
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/alignas12.C6
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/auto21.C2
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/constexpr-61323.C26
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/constexpr-ice18.C11
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/enum35.C14
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/enum36.C14
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/missing-initializer_list-include.C2
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/noexcept31.C12
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/pr82560.C28
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/pr82725.C16
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/udlit-extern-c.C7
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/variadic-crash4.C14
-rw-r--r--gcc/testsuite/g++.dg/cpp0x/variadic-crash5.C28
-rw-r--r--gcc/testsuite/g++.dg/cpp1y/auto-fn41.C23
-rw-r--r--gcc/testsuite/g++.dg/cpp1y/auto-fn42.C21
-rw-r--r--gcc/testsuite/g++.dg/cpp1y/auto-fn43.C13
-rw-r--r--gcc/testsuite/g++.dg/cpp1y/auto-fn44.C12
-rw-r--r--gcc/testsuite/g++.dg/cpp1y/auto-fn45.C27
-rw-r--r--gcc/testsuite/g++.dg/cpp1y/constexpr-80739.C20
-rw-r--r--gcc/testsuite/g++.dg/cpp1y/constexpr-82218.C128
-rw-r--r--gcc/testsuite/g++.dg/cpp1y/lambda-generic-69078-1.C26
-rw-r--r--gcc/testsuite/g++.dg/cpp1y/lambda-generic-69078-2.C21
-rw-r--r--gcc/testsuite/g++.dg/cpp1y/var-templ56.C11
-rw-r--r--gcc/testsuite/g++.dg/cpp1z/class-deduction45.C24
-rw-r--r--gcc/testsuite/g++.dg/cpp1z/class-deduction46.C6
-rw-r--r--gcc/testsuite/g++.dg/cpp1z/constexpr-lambda18.C30
-rw-r--r--gcc/testsuite/g++.dg/cpp1z/noexcept-type13.C2
-rw-r--r--gcc/testsuite/g++.dg/cpp1z/noexcept-type18.C15
-rw-r--r--gcc/testsuite/g++.dg/cpp1z/pr81016.C4
-rw-r--r--gcc/testsuite/g++.dg/debug/dwarf2/pr77363.C8
-rw-r--r--gcc/testsuite/g++.dg/debug/dwarf2/typedef6.C2
-rw-r--r--gcc/testsuite/g++.dg/diagnostic/unclosed-extern-c.C11
-rw-r--r--gcc/testsuite/g++.dg/ext/is_trivially_constructible5.C12
-rw-r--r--gcc/testsuite/g++.dg/ext/pr81706.C32
-rw-r--r--gcc/testsuite/g++.dg/ext/typeof12.C11
-rw-r--r--gcc/testsuite/g++.dg/gcov/gcov-threads-1.C4
-rw-r--r--gcc/testsuite/g++.dg/gcov/loop.C27
-rw-r--r--gcc/testsuite/g++.dg/gcov/ternary.C12
-rw-r--r--gcc/testsuite/g++.dg/guality/pr82630.C58
-rw-r--r--gcc/testsuite/g++.dg/lang-dump.C21
-rw-r--r--gcc/testsuite/g++.dg/opt/pr82577.C22
-rw-r--r--gcc/testsuite/g++.dg/opt/pr82778.C37
-rw-r--r--gcc/testsuite/g++.dg/other/i386-2.C6
-rw-r--r--gcc/testsuite/g++.dg/other/i386-3.C6
-rw-r--r--gcc/testsuite/g++.dg/other/operator2.C2
-rw-r--r--gcc/testsuite/g++.dg/other/pr53574.C48
-rw-r--r--gcc/testsuite/g++.dg/parse/builtin2.C2
-rw-r--r--gcc/testsuite/g++.dg/pr71694.C2
-rw-r--r--gcc/testsuite/g++.dg/template/bitfield4.C6
-rw-r--r--gcc/testsuite/g++.dg/template/cast4.C4
-rw-r--r--gcc/testsuite/g++.dg/template/crash128.C19
-rw-r--r--gcc/testsuite/g++.dg/template/extern-c.C66
-rw-r--r--gcc/testsuite/g++.dg/torture/pr70971.C48
-rw-r--r--gcc/testsuite/g++.dg/torture/pr77555.C20
-rw-r--r--gcc/testsuite/g++.dg/torture/pr81659.C19
-rw-r--r--gcc/testsuite/g++.dg/torture/pr82823.C26
-rw-r--r--gcc/testsuite/g++.dg/tree-ssa/pr81702.C110
-rw-r--r--gcc/testsuite/g++.dg/ubsan/float-cast-overflow-bf.C16
-rw-r--r--gcc/testsuite/g++.dg/ubsan/pr82353-2-aux.cc32
-rw-r--r--gcc/testsuite/g++.dg/ubsan/pr82353-2.C20
-rw-r--r--gcc/testsuite/g++.dg/ubsan/pr82353-2.h31
-rw-r--r--gcc/testsuite/g++.dg/vect/slp-pr56812.cc7
-rw-r--r--gcc/testsuite/g++.dg/warn/Wbuiltin_declaration_mismatch-1.C7
-rw-r--r--gcc/testsuite/g++.dg/warn/Wreturn-local-addr-4.C18
-rw-r--r--gcc/testsuite/g++.dg/warn/pr82710.C48
-rw-r--r--gcc/testsuite/g++.old-deja/g++.jason/operator.C2
-rw-r--r--gcc/testsuite/g++.old-deja/g++.mike/p811.C2
-rw-r--r--gcc/testsuite/g++.target/aarch64/aarch64.exp38
-rw-r--r--gcc/testsuite/g++.target/aarch64/sve_catch_1.C70
-rw-r--r--gcc/testsuite/g++.target/aarch64/sve_catch_2.C5
-rw-r--r--gcc/testsuite/g++.target/aarch64/sve_catch_3.C79
-rw-r--r--gcc/testsuite/g++.target/aarch64/sve_catch_4.C5
-rw-r--r--gcc/testsuite/g++.target/aarch64/sve_catch_5.C82
-rw-r--r--gcc/testsuite/g++.target/aarch64/sve_catch_6.C5
-rw-r--r--gcc/testsuite/gcc.c-torture/compile/pr82549.c9
-rw-r--r--gcc/testsuite/gcc.c-torture/compile/pr82816.c12
-rw-r--r--gcc/testsuite/gcc.c-torture/execute/20030209-1.c16
-rw-r--r--gcc/testsuite/gcc.c-torture/execute/20040805-1.c4
-rw-r--r--gcc/testsuite/gcc.c-torture/execute/920410-1.c8
-rw-r--r--gcc/testsuite/gcc.c-torture/execute/921113-1.c8
-rw-r--r--gcc/testsuite/gcc.c-torture/execute/921208-2.c9
-rw-r--r--gcc/testsuite/gcc.c-torture/execute/comp-goto-1.c4
-rw-r--r--gcc/testsuite/gcc.c-torture/execute/pr20621-1.c7
-rw-r--r--gcc/testsuite/gcc.c-torture/execute/pr28982b.c6
-rw-r--r--gcc/testsuite/gcc.c-torture/execute/pr81423.c15
-rw-r--r--gcc/testsuite/gcc.dg/Walloca-15.c17
-rw-r--r--gcc/testsuite/gcc.dg/asan/pr82517.c43
-rw-r--r--gcc/testsuite/gcc.dg/asan/pr82545.c17
-rw-r--r--gcc/testsuite/gcc.dg/attr-alloc_size-11.c4
-rw-r--r--gcc/testsuite/gcc.dg/c17-version-1.c9
-rw-r--r--gcc/testsuite/gcc.dg/c17-version-2.c9
-rw-r--r--gcc/testsuite/gcc.dg/c90-const-expr-11.c2
-rw-r--r--gcc/testsuite/gcc.dg/debug/dwarf2/asm-line1.c2
-rw-r--r--gcc/testsuite/gcc.dg/debug/dwarf2/discriminator.c2
-rw-r--r--gcc/testsuite/gcc.dg/debug/dwarf2/pr53948.c2
-rw-r--r--gcc/testsuite/gcc.dg/debug/dwarf2/sso-1.c (renamed from gcc/testsuite/gcc.dg/debug/dwarf2/sso.c)0
-rw-r--r--gcc/testsuite/gcc.dg/debug/dwarf2/sso-2.c28
-rw-r--r--gcc/testsuite/gcc.dg/debug/dwarf2/sso-3.c31
-rw-r--r--gcc/testsuite/gcc.dg/fold-cond-2.c (renamed from gcc/testsuite/gcc.dg/fold-cond_expr-1.c)0
-rw-r--r--gcc/testsuite/gcc.dg/fold-cond-3.c35
-rw-r--r--gcc/testsuite/gcc.dg/gimplefe-27.c9
-rw-r--r--gcc/testsuite/gcc.dg/graphite/interchange-3.c2
-rw-r--r--gcc/testsuite/gcc.dg/graphite/interchange-7.c2
-rw-r--r--gcc/testsuite/gcc.dg/graphite/interchange-9.c2
-rw-r--r--gcc/testsuite/gcc.dg/graphite/pr35356-3.c3
-rw-r--r--gcc/testsuite/gcc.dg/graphite/pr81373-2.c40
-rw-r--r--gcc/testsuite/gcc.dg/graphite/pr82563.c24
-rw-r--r--gcc/testsuite/gcc.dg/graphite/scop-10.c2
-rw-r--r--gcc/testsuite/gcc.dg/graphite/scop-7.c2
-rw-r--r--gcc/testsuite/gcc.dg/graphite/scop-8.c2
-rw-r--r--gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c2
-rw-r--r--gcc/testsuite/gcc.dg/ipa/propmalloc-1.c21
-rw-r--r--gcc/testsuite/gcc.dg/ipa/propmalloc-2.c23
-rw-r--r--gcc/testsuite/gcc.dg/ipa/propmalloc-3.c24
-rw-r--r--gcc/testsuite/gcc.dg/no-strict-overflow-3.c4
-rw-r--r--gcc/testsuite/gcc.dg/noncompile/920923-1.c1
-rw-r--r--gcc/testsuite/gcc.dg/overflow-warn-5.c2
-rw-r--r--gcc/testsuite/gcc.dg/overflow-warn-8.c2
-rw-r--r--gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c4
-rw-r--r--gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c30
-rw-r--r--gcc/testsuite/gcc.dg/plugin/poly-int-tests.h238
-rw-r--r--gcc/testsuite/gcc.dg/pr7356-2.c33
-rw-r--r--gcc/testsuite/gcc.dg/pr7356.c17
-rw-r--r--gcc/testsuite/gcc.dg/pr82274-1.c16
-rw-r--r--gcc/testsuite/gcc.dg/pr82274-2.c26
-rw-r--r--gcc/testsuite/gcc.dg/pr82596.c27
-rw-r--r--gcc/testsuite/gcc.dg/pr82597.c40
-rw-r--r--gcc/testsuite/gcc.dg/pr82703.c28
-rw-r--r--gcc/testsuite/gcc.dg/pr82765.c5
-rw-r--r--gcc/testsuite/gcc.dg/pr82809.c22
-rw-r--r--gcc/testsuite/gcc.dg/spellcheck-typenames.c5
-rw-r--r--gcc/testsuite/gcc.dg/store_merging_10.c56
-rw-r--r--gcc/testsuite/gcc.dg/store_merging_11.c47
-rw-r--r--gcc/testsuite/gcc.dg/store_merging_12.c11
-rw-r--r--gcc/testsuite/gcc.dg/store_merging_13.c157
-rw-r--r--gcc/testsuite/gcc.dg/store_merging_14.c157
-rw-r--r--gcc/testsuite/gcc.dg/strict-overflow-3.c4
-rw-r--r--gcc/testsuite/gcc.dg/torture/pr52451.c55
-rw-r--r--gcc/testsuite/gcc.dg/torture/pr82129.c52
-rw-r--r--gcc/testsuite/gcc.dg/torture/pr82436-2.c45
-rw-r--r--gcc/testsuite/gcc.dg/torture/pr82473.c22
-rw-r--r--gcc/testsuite/gcc.dg/torture/pr82603.c24
-rw-r--r--gcc/testsuite/gcc.dg/torture/pr82692.c25
-rw-r--r--gcc/testsuite/gcc.dg/torture/pr82697.c23
-rw-r--r--gcc/testsuite/gcc.dg/torture/pr82762.c46
-rw-r--r--gcc/testsuite/gcc.dg/tree-prof/comp-goto-1.c4
-rw-r--r--gcc/testsuite/gcc.dg/tree-prof/switch-case-2.c6
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c4
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/dump-2.c2
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c2
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c3
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c3
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-1.c2
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-2.c3
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c2
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-8.c3
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c2
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-cd.c3
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-pr56541.c3
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-pr68583.c2
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c3
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-2.c2
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c4
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c9
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ldist-32.c29
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ldist-35.c28
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/ldist-36.c28
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/loop-1.c3
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/negneg-1.c24
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/negneg-2.c11
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/negneg-3.c15
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/negneg-4.c18
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/noreturn-1.c42
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/pr82574.c19
-rw-r--r--gcc/testsuite/gcc.dg/tree-ssa/vrp101.c2
-rw-r--r--gcc/testsuite/gcc.dg/ubsan/float-cast-overflow-bf.c44
-rw-r--r--gcc/testsuite/gcc.dg/vect/no-vfa-vect-101.c6
-rw-r--r--gcc/testsuite/gcc.dg/vect/no-vfa-vect-102.c6
-rw-r--r--gcc/testsuite/gcc.dg/vect/no-vfa-vect-102a.c6
-rw-r--r--gcc/testsuite/gcc.dg/vect/no-vfa-vect-37.c5
-rw-r--r--gcc/testsuite/gcc.dg/vect/no-vfa-vect-79.c5
-rw-r--r--gcc/testsuite/gcc.dg/vect/pr25413a.c4
-rw-r--r--gcc/testsuite/gcc.dg/vect/pr31699.c8
-rw-r--r--gcc/testsuite/gcc.dg/vect/pr45752.c6
-rw-r--r--gcc/testsuite/gcc.dg/vect/pr65947-5.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/slp-19c.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/slp-23.c6
-rw-r--r--gcc/testsuite/gcc.dg/vect/slp-28.c3
-rw-r--r--gcc/testsuite/gcc.dg/vect/slp-perm-1.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/slp-perm-4.c6
-rw-r--r--gcc/testsuite/gcc.dg/vect/slp-perm-5.c4
-rw-r--r--gcc/testsuite/gcc.dg/vect/slp-perm-6.c4
-rw-r--r--gcc/testsuite/gcc.dg/vect/slp-perm-7.c4
-rw-r--r--gcc/testsuite/gcc.dg/vect/slp-perm-8.c4
-rw-r--r--gcc/testsuite/gcc.dg/vect/slp-perm-9.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c10
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-104.c6
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-109.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-33.c4
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-42.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-44.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-50.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-56.c4
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-60.c4
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-70.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-91.c5
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-96.c4
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-peel-3.c2
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c3
-rw-r--r--gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c3
-rw-r--r--gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-compile.c73
-rw-r--r--gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c81
-rw-r--r--gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-qi.h15
-rw-r--r--gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-s8.c9
-rw-r--r--gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-u8.c9
-rw-r--r--gcc/testsuite/gcc.target/aarch64/cmpelim_mult_uses_1.c17
-rw-r--r--gcc/testsuite/gcc.target/aarch64/fix_trunc1.c23
-rw-r--r--gcc/testsuite/gcc.target/aarch64/inline-lrint_2.c2
-rw-r--r--gcc/testsuite/gcc.target/aarch64/ldp_stp_unaligned_2.c18
-rw-r--r--gcc/testsuite/gcc.target/aarch64/pr78733.c3
-rw-r--r--gcc/testsuite/gcc.target/aarch64/pr79041-2.c3
-rw-r--r--gcc/testsuite/gcc.target/aarch64/pr80295.c8
-rw-r--r--gcc/testsuite/gcc.target/aarch64/spellcheck_1.c4
-rw-r--r--gcc/testsuite/gcc.target/aarch64/spellcheck_2.c6
-rw-r--r--gcc/testsuite/gcc.target/aarch64/spellcheck_3.c6
-rw-r--r--gcc/testsuite/gcc.target/aarch64/stack-check-12.c20
-rw-r--r--gcc/testsuite/gcc.target/aarch64/stack-check-13.c28
-rw-r--r--gcc/testsuite/gcc.target/aarch64/stack-check-14.c25
-rw-r--r--gcc/testsuite/gcc.target/aarch64/stack-check-15.c24
-rw-r--r--gcc/testsuite/gcc.target/aarch64/subs_compare_1.c4
-rw-r--r--gcc/testsuite/gcc.target/aarch64/subs_compare_2.c2
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_arith_1.c58
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_cvtf_signed_1.c16
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_cvtf_signed_1_run.c50
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_cvtf_unsigned_1.c16
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_cvtf_unsigned_1_run.c50
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1.c (renamed from gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1.C)14
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1_run.c (renamed from gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1_run.C)34
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_dup_lane_1.c22
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_ext_1.c22
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_extract_1.c93
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_extract_2.c93
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_extract_3.c124
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_extract_4.c135
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fabs_1.c3
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fcvtz_signed_1.c20
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fcvtz_signed_1_run.c50
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fcvtz_unsigned_1.c20
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fcvtz_unsigned_1_run.c50
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fdiv_1.c43
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fdup_1.c24
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fdup_1_run.c21
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fmad_1.c35
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fmla_1.c35
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fmls_1.c35
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fmsb_1.c35
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fmul_1.c27
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fneg_1.c6
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fnmad_1.c35
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fnmla_1.c35
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fnmls_1.c35
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fnmsb_1.c35
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fp_arith_1.c37
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_frinta_1.c1
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_frinti_1.c1
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_frintm_1.c1
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_frintp_1.c1
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_frintx_1.c1
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_frintz_1.c1
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fsqrt_1.c1
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_fsubr_1.c23
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_index_1.c (renamed from gcc/testsuite/gcc.target/aarch64/sve_index_1.C)79
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_index_1_run.C79
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_index_1_run.c20
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_indexoffsetlarge_1.c31
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_infloop_1.c64
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_ld1r_1.c53
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_ld1r_2.C51
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_load_const_offset_1.c52
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_load_scalar_offset_1.c34
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_load_scalar_offset_2.c68
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_logical_1.c149
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_loop_add_1.c10
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_loop_add_1_run.c7
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_loop_add_5.c8
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_mad_1.c54
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_maxmin_1.c (renamed from gcc/testsuite/gcc.target/aarch64/sve_maxmin_1.C)64
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_maxmin_1_run.C88
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_maxmin_1_run.c27
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1.c (renamed from gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1.C)20
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1_run.C56
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1_run.c27
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_mla_1.c46
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_mls_1.c46
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_mov_rr_1.c2
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_msb_1.c52
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_mul_1.c34
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_neg_1.c14
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_nlogical_1.c41
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_nlogical_1_run.c13
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_pack_1.c31
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_pack_1_run.c54
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_signed_1.c7
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_signed_1_run.c20
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_unsigned_1.c7
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_unsigned_1_run.c20
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_pack_float_1.c5
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_pack_float_1_run.c17
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_popcount_1.c11
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_popcount_1_run.c19
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_reduc_1.c (renamed from gcc/testsuite/gcc.target/aarch64/sve_reduc_1.C)243
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.C117
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.c56
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_reduc_2.c (renamed from gcc/testsuite/gcc.target/aarch64/sve_reduc_2.C)201
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.C135
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.c79
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_reduc_3.c60
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_reduc_4.c59
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_revb_1.c4
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_revh_1.c13
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_revw_1.c4
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_shift_1.c125
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_single_1.c36
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_single_2.c9
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_single_3.c9
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_single_4.c9
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_store_scalar_offset_1.c34
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_store_scalar_offset_2.c53
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_subr_1.c40
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_trn1_1.c16
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_trn2_1.c2
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_signed_1.c5
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_signed_1_run.c21
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_unsigned_1.c7
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_unsigned_1_run.c19
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_unpack_float_1.c5
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_unpack_float_1_run.c15
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_unpack_signed_1.c29
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_unpack_signed_1_run.c52
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_unpack_unsigned_1.c29
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_unpack_unsigned_1_run.c52
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_uzp1_1.c15
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_uzp1_1_run.c10
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_uzp2_1.c15
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_uzp2_1_run.c10
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vcond_1.C446
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vcond_1_run.C116
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vcond_2.C310
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vcond_2.c318
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vcond_2_run.C118
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vcond_2_run.c49
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vcond_3.c (renamed from gcc/testsuite/gcc.target/aarch64/sve_vcond_3.C)42
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vcond_4_run.c13
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vcond_6.c15
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vcond_6_run.c7
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_init_1.c (renamed from gcc/testsuite/gcc.target/aarch64/sve_vec_init_1.C)9
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_init_1_run.c (renamed from gcc/testsuite/gcc.target/aarch64/sve_vec_init_1_run.C)21
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1.c17
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1_overrange_run.c19
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1_run.c12
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1.c18
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1_overrun.c21
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1_run.c12
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_single_1.c15
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_single_1_run.c10
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_perm_single_1.c26
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_vec_perm_single_1_run.c11
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_zip1_1.c16
-rw-r--r--gcc/testsuite/gcc.target/aarch64/sve_zip2_1.c2
-rw-r--r--gcc/testsuite/gcc.target/aarch64/target_attr_11.c2
-rw-r--r--gcc/testsuite/gcc.target/aarch64/target_attr_12.c2
-rw-r--r--gcc/testsuite/gcc.target/aarch64/target_attr_17.c2
-rw-r--r--gcc/testsuite/gcc.target/aarch64/vect-vcvt.c8
-rw-r--r--gcc/testsuite/gcc.target/alpha/sqrt.c25
-rwxr-xr-xgcc/testsuite/gcc.target/arc/loop-1.c12
-rw-r--r--gcc/testsuite/gcc.target/arm/peep-ldrd-1.c2
-rw-r--r--gcc/testsuite/gcc.target/arm/peep-ldrd-2.c11
-rw-r--r--gcc/testsuite/gcc.target/arm/peep-strd-1.c2
-rw-r--r--gcc/testsuite/gcc.target/arm/peep-strd-2.c9
-rw-r--r--gcc/testsuite/gcc.target/arm/require-pic-register-loc.c10
-rw-r--r--gcc/testsuite/gcc.target/arm/simd/vdot-exec.c55
-rw-r--r--gcc/testsuite/gcc.target/i386/387-ficom-1.c2
-rw-r--r--gcc/testsuite/gcc.target/i386/387-ficom-2.c2
-rw-r--r--gcc/testsuite/gcc.target/i386/attr-nocf-check-1a.c32
-rw-r--r--gcc/testsuite/gcc.target/i386/attr-nocf-check-3a.c32
-rw-r--r--gcc/testsuite/gcc.target/i386/avx-1.c16
-rw-r--r--gcc/testsuite/gcc.target/i386/avx-pr82370.c65
-rw-r--r--gcc/testsuite/gcc.target/i386/avx2-pr82370.c23
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512-check.h3
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512bw-pr82370.c33
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512bw-vpermt2w-1.c18
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512dq-vreducesd-1.c13
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512dq-vreducesd-2.c66
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512dq-vreducess-1.c12
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512dq-vreducess-2.c68
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512f-gf2p8affineinvqb-2.c74
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512f-pr82370.c33
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512f-vcmppd-1.c29
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512f-vcmppd-2.c77
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512f-vcmpps-1.c28
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512f-vcmpps-2.c78
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512f-vpermt2d-1.c6
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512f-vpermt2pd-1.c4
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512f-vpermt2ps-1.c4
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512f-vpermt2q-1.c6
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512vbmi-vpermt2b-1.c18
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c17
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512vl-pr82370.c31
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512vl-vpermt2d-1.c12
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512vl-vpermt2pd-1.c8
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512vl-vpermt2ps-1.c8
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512vl-vpermt2q-1.c12
-rw-r--r--gcc/testsuite/gcc.target/i386/avx512vlbw-pr82370.c33
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-intrin-10.c10
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-intrin-3.c33
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-intrin-4.c31
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-intrin-5.c10
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-intrin-6.c10
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-intrin-7.c18
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-intrin-8.c18
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-intrin-9.c10
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-label-2.c24
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-label.c16
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-1a.c22
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-1b.c23
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-2a.c12
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-2b.c12
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-3.c14
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-4a.c6
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-4b.c6
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-5a.c16
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-5b.c21
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-6a.c15
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-6b.c15
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-7.c15
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-icf-1.c31
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-icf-2.c30
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-icf-3.c36
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-notrack-icf-4.c35
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-property-1.c11
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-property-2.c11
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-rdssp-1.c39
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-sjlj-1.c42
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-sjlj-2.c4
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-sjlj-3.c46
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-sjlj-4.c45
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-sjlj-5.c48
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-switch-1.c26
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-switch-2.c26
-rw-r--r--gcc/testsuite/gcc.target/i386/cet-switch-3.c34
-rw-r--r--gcc/testsuite/gcc.target/i386/gfni-1.c18
-rw-r--r--gcc/testsuite/gcc.target/i386/gfni-2.c27
-rw-r--r--gcc/testsuite/gcc.target/i386/gfni-3.c17
-rw-r--r--gcc/testsuite/gcc.target/i386/gfni-4.c14
-rw-r--r--gcc/testsuite/gcc.target/i386/i386.exp15
-rw-r--r--gcc/testsuite/gcc.target/i386/naked-1.c4
-rw-r--r--gcc/testsuite/gcc.target/i386/naked-2.c4
-rw-r--r--gcc/testsuite/gcc.target/i386/pr61403.c2
-rw-r--r--gcc/testsuite/gcc.target/i386/pr70021.c2
-rw-r--r--gcc/testsuite/gcc.target/i386/pr70263-2.c23
-rw-r--r--gcc/testsuite/gcc.target/i386/pr79683.c2
-rw-r--r--gcc/testsuite/gcc.target/i386/pr81706.c32
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82002-1.c12
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82002-2a.c14
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82002-2b.c14
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82196-1.c2
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82370.c18
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82460-1.c30
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82460-2.c17
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82499-1.c21
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82499-2.c21
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82499-3.c21
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82556.c19
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82580.c39
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82618.c18
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82628.c34
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82659-1.c18
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82659-2.c17
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82659-3.c20
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82659-4.c14
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82659-5.c10
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82659-6.c18
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82662.c26
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82673.c12
-rw-r--r--gcc/testsuite/gcc.target/i386/pr82795.c23
-rw-r--r--gcc/testsuite/gcc.target/i386/sse-12.c4
-rw-r--r--gcc/testsuite/gcc.target/i386/sse-13.c14
-rw-r--r--gcc/testsuite/gcc.target/i386/sse-14.c11
-rw-r--r--gcc/testsuite/gcc.target/i386/sse-22.c9
-rw-r--r--gcc/testsuite/gcc.target/i386/sse-23.c14
-rw-r--r--gcc/testsuite/gcc.target/i386/stack-check-12.c19
-rw-r--r--gcc/testsuite/gcc.target/i386/vect-pack-trunc-2.c2
-rw-r--r--gcc/testsuite/gcc.target/mips/msa.c2
-rw-r--r--gcc/testsuite/gcc.target/nios2/cdx-branch.c4
-rw-r--r--gcc/testsuite/gcc.target/nios2/gpopt-gprel-sec.c38
-rw-r--r--gcc/testsuite/gcc.target/nios2/gpopt-r0rel-sec.c38
-rw-r--r--gcc/testsuite/gcc.target/nios2/lo-addr-bypass.c40
-rw-r--r--gcc/testsuite/gcc.target/nios2/lo-addr-char.c60
-rw-r--r--gcc/testsuite/gcc.target/nios2/lo-addr-int.c40
-rw-r--r--gcc/testsuite/gcc.target/nios2/lo-addr-pic.c38
-rw-r--r--gcc/testsuite/gcc.target/nios2/lo-addr-short.c51
-rw-r--r--gcc/testsuite/gcc.target/nios2/lo-addr-tls.c38
-rw-r--r--gcc/testsuite/gcc.target/nios2/lo-addr-uchar.c58
-rw-r--r--gcc/testsuite/gcc.target/nios2/lo-addr-ushort.c49
-rw-r--r--gcc/testsuite/gcc.target/nios2/lo-addr-volatile.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/float128-fma2.c9
-rw-r--r--gcc/testsuite/gcc.target/powerpc/float128-hw.c66
-rw-r--r--gcc/testsuite/gcc.target/powerpc/float128-hw2.c60
-rw-r--r--gcc/testsuite/gcc.target/powerpc/float128-hw3.c56
-rw-r--r--gcc/testsuite/gcc.target/powerpc/float128-sqrt2.c9
-rw-r--r--gcc/testsuite/gcc.target/powerpc/fold-vec-neg-char.c19
-rw-r--r--gcc/testsuite/gcc.target/powerpc/fold-vec-neg-floatdouble.c23
-rw-r--r--gcc/testsuite/gcc.target/powerpc/fold-vec-neg-int.c18
-rw-r--r--gcc/testsuite/gcc.target/powerpc/fold-vec-neg-longlong.c18
-rw-r--r--gcc/testsuite/gcc.target/powerpc/fold-vec-neg-short.c18
-rw-r--r--gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c2
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-addpd-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-addsd-1.c54
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-andnpd-1.c42
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-andpd-1.c49
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-check.h52
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cmppd-1.c76
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cmpsd-1.c65
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-comisd-1.c40
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-comisd-2.c40
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-comisd-3.c40
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-comisd-4.c40
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-comisd-5.c40
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-comisd-6.c40
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtdq2pd-1.c55
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtdq2ps-1.c43
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtpd2dq-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtpd2ps-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtps2dq-1.c52
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtps2pd-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2si-1.c49
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2si-2.c48
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2ss-1.c53
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtsi2sd-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtsi2sd-2.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvtss2sd-1.c52
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvttpd2dq-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvttps2dq-1.c43
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvttsd2si-1.c48
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-cvttsd2si-2.c40
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-divpd-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-divsd-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-maxpd-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-maxsd-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-minpd-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-minsd-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-mmx.c82
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movhpd-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movhpd-2.c40
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movlpd-1.c43
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movlpd-2.c40
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movmskpd-1.c61
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movq-1.c47
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movq-2.c39
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movq-3.c36
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movsd-1.c40
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movsd-2.c39
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-movsd-3.c48
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-mulpd-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-mulsd-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-orpd-1.c49
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-packssdw-1.c73
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-packsswb-1.c78
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-packuswb-1.c69
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-paddb-1.c42
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-paddd-1.c42
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-paddq-1.c42
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-paddsb-1.c74
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-paddsw-1.c65
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-paddusb-1.c74
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-paddusw-1.c52
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-paddw-1.c42
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pavgb-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pavgw-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqb-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqd-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqw-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtb-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtd-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtw-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pextrw.c65
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pinsrw.c87
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pmaddwd-1.c43
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pmaxsw-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pmaxub-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pminsw-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pminub-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pmovmskb-1.c57
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pmulhuw-1.c45
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pmulhw-1.c60
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pmullw-1.c48
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pmuludq-1.c53
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psadbw-1.c69
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pshufd-1.c51
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pshufhw-1.c65
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pshuflw-1.c65
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pslld-1.c44
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pslld-2.c55
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-pslldq-1.c65
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psllq-1.c48
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psllq-2.c48
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psllw-1.c44
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psllw-2.c45
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psrad-1.c44
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psrad-2.c45
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psraw-1.c44
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psraw-2.c45
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psrld-1.c57
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psrld-2.c59
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psrldq-1.c62
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psrlq-1.c51
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psrlq-2.c51
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psrlw-1.c48
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psrlw-2.c49
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psubb-1.c42
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psubd-1.c42
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psubq-1.c42
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psubsb-1.c51
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psubsw-1.c51
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psubusb-1.c74
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psubusw-1.c52
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-psubw-1.c42
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-punpckhbw-1.c45
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-punpckhdq-1.c45
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-punpckhqdq-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-punpckhwd-1.c45
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-punpcklbw-1.c45
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-punpckldq-1.c45
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-punpcklqdq-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-punpcklwd-1.c45
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-shufpd-1.c42
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-sqrtpd-1.c54
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-subpd-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-subsd-1.c50
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-1.c39
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-2.c39
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-3.c39
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-4.c39
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-5.c39
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-6.c39
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-unpckhpd-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-unpcklpd-1.c41
-rw-r--r--gcc/testsuite/gcc.target/powerpc/sse2-xorpd-1.c50
-rw-r--r--gcc/testsuite/gcc.target/s390/zvector/vec-cmp-2.c98
-rw-r--r--gcc/testsuite/gfortran.dg/allocate_error_7.f9012
-rw-r--r--gcc/testsuite/gfortran.dg/array_constructor_51.f9020
-rw-r--r--gcc/testsuite/gfortran.dg/assumed_size_2.f904
-rw-r--r--gcc/testsuite/gfortran.dg/class_63.f9080
-rw-r--r--gcc/testsuite/gfortran.dg/class_64.f9038
-rw-r--r--gcc/testsuite/gfortran.dg/dec_structure_22.f9038
-rw-r--r--gcc/testsuite/gfortran.dg/derived_init_4.f9060
-rw-r--r--gcc/testsuite/gfortran.dg/dtio_13.f904
-rw-r--r--gcc/testsuite/gfortran.dg/equiv_pure.f9052
-rw-r--r--gcc/testsuite/gfortran.dg/execute_command_line_3.f905
-rw-r--r--gcc/testsuite/gfortran.dg/gomp/pr82568.f9075
-rw-r--r--gcc/testsuite/gfortran.dg/graphite/pr82672.f9033
-rw-r--r--gcc/testsuite/gfortran.dg/illegal_char.f906
-rw-r--r--gcc/testsuite/gfortran.dg/implied_do_io_1.f902
-rw-r--r--gcc/testsuite/gfortran.dg/large_real_kind_2.F901
-rw-r--r--gcc/testsuite/gfortran.dg/matmul_const.f9010
-rw-r--r--gcc/testsuite/gfortran.dg/pdt_16.f0321
-rw-r--r--gcc/testsuite/gfortran.dg/pdt_17.f0311
-rw-r--r--gcc/testsuite/gfortran.dg/pdt_18.f0319
-rw-r--r--gcc/testsuite/gfortran.dg/pdt_4.f032
-rw-r--r--gcc/testsuite/gfortran.dg/pdt_8.f035
-rw-r--r--gcc/testsuite/gfortran.dg/pr81735.f9025
-rw-r--r--gcc/testsuite/gfortran.dg/spellcheck-operator.f9030
-rw-r--r--gcc/testsuite/gfortran.dg/spellcheck-parameter.f9015
-rw-r--r--gcc/testsuite/gfortran.dg/spellcheck-procedure_1.f9041
-rw-r--r--gcc/testsuite/gfortran.dg/spellcheck-procedure_2.f9035
-rw-r--r--gcc/testsuite/gfortran.dg/spellcheck-structure.f9035
-rw-r--r--gcc/testsuite/gfortran.dg/submodule_30.f0842
-rw-r--r--gcc/testsuite/gnat.dg/default_pkg_actual.adb32
-rw-r--r--gcc/testsuite/gnat.dg/default_pkg_actual2.adb27
-rw-r--r--gcc/testsuite/gnat.dg/dimensions.adb5
-rw-r--r--gcc/testsuite/gnat.dg/dimensions.ads29
-rw-r--r--gcc/testsuite/gnat.dg/opt68.adb53
-rw-r--r--gcc/testsuite/gnat.dg/opt68.ads26
-rw-r--r--gcc/testsuite/gnat.dg/remote_call_iface.adb7
-rw-r--r--gcc/testsuite/gnat.dg/remote_call_iface.ads5
-rw-r--r--gcc/testsuite/gnat.dg/specs/discr2.ads (renamed from gcc/testsuite/gnat.dg/specs/discr_private.ads)4
-rw-r--r--gcc/testsuite/gnat.dg/specs/discr3.ads (renamed from gcc/testsuite/gnat.dg/specs/discr_record_constant.ads)4
-rw-r--r--gcc/testsuite/gnat.dg/specs/discr4.ads23
-rw-r--r--gcc/testsuite/gnat.dg/specs/discr4_pkg.ads27
-rw-r--r--gcc/testsuite/gnat.dg/stack_usage4.adb11
-rw-r--r--gcc/testsuite/gnat.dg/stack_usage4_pkg.ads12
-rw-r--r--gcc/testsuite/gnat.dg/sync_iface_call.adb34
-rw-r--r--gcc/testsuite/gnat.dg/sync_iface_call_pkg.ads21
-rw-r--r--gcc/testsuite/gnat.dg/sync_iface_call_pkg2.adb8
-rw-r--r--gcc/testsuite/gnat.dg/sync_iface_call_pkg2.ads7
-rw-r--r--gcc/testsuite/jit.dg/jit.exp9
-rw-r--r--gcc/testsuite/lib/gcc-dg.exp2
-rw-r--r--gcc/testsuite/lib/gcov.exp4
-rw-r--r--gcc/testsuite/lib/gfortran-dg.exp2
-rw-r--r--gcc/testsuite/lib/scanasm.exp18
-rw-r--r--gcc/testsuite/lib/scandump.exp6
-rw-r--r--gcc/testsuite/lib/scanlang.exp4
-rw-r--r--gcc/testsuite/lib/target-supports-dg.exp15
-rw-r--r--gcc/testsuite/lib/target-supports.exp274
-rw-r--r--gcc/toplev.c32
-rw-r--r--gcc/tracer.c28
-rw-r--r--gcc/trans-mem.c23
-rw-r--r--gcc/tree-affine.c8
-rw-r--r--gcc/tree-affine.h4
-rw-r--r--gcc/tree-call-cdce.c18
-rw-r--r--gcc/tree-cfg.c125
-rw-r--r--gcc/tree-cfgcleanup.c1
-rw-r--r--gcc/tree-chkp.c3
-rw-r--r--gcc/tree-complex.c21
-rw-r--r--gcc/tree-core.h3
-rw-r--r--gcc/tree-dump.c3
-rw-r--r--gcc/tree-eh.c9
-rw-r--r--gcc/tree-if-conv.c8
-rw-r--r--gcc/tree-inline.c141
-rw-r--r--gcc/tree-loop-distribution.c155
-rw-r--r--gcc/tree-object-size.c3
-rw-r--r--gcc/tree-outof-ssa.h12
-rw-r--r--gcc/tree-pass.h1
-rw-r--r--gcc/tree-scalar-evolution.c8
-rw-r--r--gcc/tree-ssa-address.c19
-rw-r--r--gcc/tree-ssa-alias.c10
-rw-r--r--gcc/tree-ssa-ccp.c59
-rw-r--r--gcc/tree-ssa-coalesce.c2
-rw-r--r--gcc/tree-ssa-copy.c31
-rw-r--r--gcc/tree-ssa-dce.c16
-rw-r--r--gcc/tree-ssa-dom.c36
-rw-r--r--gcc/tree-ssa-dse.c4
-rw-r--r--gcc/tree-ssa-forwprop.c83
-rw-r--r--gcc/tree-ssa-ifcombine.c4
-rw-r--r--gcc/tree-ssa-loop-im.c20
-rw-r--r--gcc/tree-ssa-loop-ivcanon.c11
-rw-r--r--gcc/tree-ssa-loop-ivopts.c33
-rw-r--r--gcc/tree-ssa-loop-ivopts.h1
-rw-r--r--gcc/tree-ssa-loop-manip.c21
-rw-r--r--gcc/tree-ssa-loop-niter.c2
-rw-r--r--gcc/tree-ssa-loop-split.c6
-rw-r--r--gcc/tree-ssa-loop-unswitch.c19
-rw-r--r--gcc/tree-ssa-math-opts.c7
-rw-r--r--gcc/tree-ssa-phionlycprop.c1
-rw-r--r--gcc/tree-ssa-phiopt.c136
-rw-r--r--gcc/tree-ssa-pre.c1084
-rw-r--r--gcc/tree-ssa-propagate.c85
-rw-r--r--gcc/tree-ssa-propagate.h48
-rw-r--r--gcc/tree-ssa-reassoc.c7
-rw-r--r--gcc/tree-ssa-sccvn.c901
-rw-r--r--gcc/tree-ssa-sccvn.h1
-rw-r--r--gcc/tree-ssa-sink.c3
-rw-r--r--gcc/tree-ssa-structalias.c4
-rw-r--r--gcc/tree-ssa-tail-merge.c30
-rw-r--r--gcc/tree-ssa-threadupdate.c400
-rw-r--r--gcc/tree-ssa-uncprop.c32
-rw-r--r--gcc/tree-switch-conversion.c25
-rw-r--r--gcc/tree-tailcall.c15
-rw-r--r--gcc/tree-vect-generic.c1
-rw-r--r--gcc/tree-vect-loop-manip.c7
-rw-r--r--gcc/tree-vect-loop.c33
-rw-r--r--gcc/tree-vect-patterns.c22
-rw-r--r--gcc/tree-vect-slp.c2
-rw-r--r--gcc/tree-vect-stmts.c16
-rw-r--r--gcc/tree-vrp.c225
-rw-r--r--gcc/tree.c216
-rw-r--r--gcc/tree.def7
-rw-r--r--gcc/tree.h87
-rw-r--r--gcc/ubsan.c119
-rw-r--r--gcc/unique-ptr-tests.cc234
-rw-r--r--gcc/value-prof.c40
-rw-r--r--gcc/value-prof.h3
-rw-r--r--gcc/var-tracking.c2
-rw-r--r--gcc/varasm.c18
-rw-r--r--gcc/wide-int.cc33
-rw-r--r--gcc/xcoffout.c4
1275 files changed, 47766 insertions, 17903 deletions
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index cb9f1a392aa..806732359b6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,3364 @@
+2017-11-03 Jeff Law <law@redhat.com>
+
+ * config/i386/i386.c (ix86_emit_restore_reg_using_pop): Prototype.
+ (ix86_adjust_stack_and_probe_stack_clash): Use a push/pop sequence
+ to probe at the start of a noreturn function.
+
+2017-11-03 Jakub Jelinek <jakub@redhat.com>
+
+ PR tree-optimization/78821
+ * gimple-ssa-store-merging.c: Update the file comment.
+ (MAX_STORE_ALIAS_CHECKS): Define.
+ (struct store_operand_info): New type.
+ (store_operand_info::store_operand_info): New constructor.
+ (struct store_immediate_info): Add rhs_code and ops data members.
+ (store_immediate_info::store_immediate_info): Add rhscode, op0r
+ and op1r arguments to the ctor, initialize corresponding data members.
+ (struct merged_store_group): Add load_align_base and load_align
+ data members.
+ (merged_store_group::merged_store_group): Initialize them.
+ (merged_store_group::do_merge): Update them.
+ (merged_store_group::apply_stores): Pick the constant for
+ encode_tree_to_bitpos from one of the two operands, or skip
+ encode_tree_to_bitpos if neither operand is a constant.
+ (class pass_store_merging): Add process_store method decl. Remove
+ bool argument from terminate_all_aliasing_chains method decl.
+ (pass_store_merging::terminate_all_aliasing_chains): Remove
+ var_offset_p argument and corresponding handling.
+ (stmts_may_clobber_ref_p): New function.
+ (compatible_load_p): New function.
+ (imm_store_chain_info::coalesce_immediate_stores): Terminate group
+ if there is overlap and rhs_code is not INTEGER_CST. For
+ non-overlapping stores terminate group if rhs is not mergeable.
+ (get_alias_type_for_stmts): Change first argument from
+ auto_vec<gimple *> & to vec<gimple *> &. Add IS_LOAD, CLIQUEP and
+ BASEP arguments. If IS_LOAD is true, look at rhs1 of the stmts
+ instead of lhs. Compute *CLIQUEP and *BASEP in addition to the
+ alias type.
+ (get_location_for_stmts): Change first argument from
+ auto_vec<gimple *> & to vec<gimple *> &.
+ (struct split_store): Remove orig_stmts data member, add orig_stores.
+ (split_store::split_store): Create orig_stores rather than orig_stmts.
+ (find_constituent_stmts): Renamed to ...
+ (find_constituent_stores): ... this. Change second argument from
+ vec<gimple *> * to vec<store_immediate_info *> *, push pointers
+ to info structures rather than the statements.
+ (split_group): Rename ALLOW_UNALIGNED argument to
+ ALLOW_UNALIGNED_STORE, add ALLOW_UNALIGNED_LOAD argument and handle
+ it. Adjust find_constituent_stores caller.
+ (imm_store_chain_info::output_merged_store): Handle rhs_code other
+ than INTEGER_CST, adjust split_group, get_alias_type_for_stmts and
+ get_location_for_stmts callers. Set MR_DEPENDENCE_CLIQUE and
+ MR_DEPENDENCE_BASE on the MEM_REFs if they are the same in all stores.
+ (mem_valid_for_store_merging): New function.
+ (handled_load): New function.
+ (pass_store_merging::process_store): New method.
+ (pass_store_merging::execute): Use process_store method. Adjust
+ terminate_all_aliasing_chains caller.
+
+2017-11-03 Wilco Dijkstra <wdijkstr@arm.com>
+
+ * config/aarch64/aarch64.c (aarch64_legitimate_constant_p):
+ Return true for more constants, symbols and label references.
+ (aarch64_valid_floating_const): Remove unused function.
+
+2017-11-03 Wilco Dijkstra <wdijkstr@arm.com>
+
+ PR target/82786
+ * config/aarch64/aarch64.c (aarch64_layout_frame):
+ Undo forcing of LR at bottom of frame.
+
+2017-11-03 Jeff Law <law@redhat.com>
+
+ PR target/82823
+ * config/i386/i386.c (ix86_expand_prologue): Tighten assert
+ for int_registers_saved.
+
+ * cfganal.c (single_pred_edge_ignoring_loop_edges): New function
+ extracted from tree-ssa-dom.c.
+ * cfganal.h (single_pred_edge_ignoring_loop_edges): Prototype.
+ * tree-ssa-dom.c (single_incoming_edge_ignoring_loop_edges): Remove.
+ (record_equivalences_from_incoming_edge): Add additional argument
+ to single_pred_edge_ignoring_loop_edges call.
+ * tree-ssa-uncprop.c (single_incoming_edge_ignoring_loop_edges): Remove.
+ (uncprop_dom_walker::before_dom_children): Add additional argument
+ to single_pred_edge_ignoring_loop_edges call.
+ * tree-ssa-sccvn.c (sccvn_dom_walker::before_dom_children): Use
+ single_pred_edge_ignoring_loop_edges rather than open coding.
+ * tree-vrp.c (evrp_dom_walker::before_dom_children): Similarly.
+
+2017-11-03 Marc Glisse <marc.glisse@inria.fr>
+
+ * match.pd (-(-A)): Rewrite.
+
+2017-11-03 Segher Boessenkool <segher@kernel.crashing.org>
+
+ * config/rs6000/rs6000-protos.h (rs6000_emit_sISEL): Delete.
+ (rs6000_emit_int_cmove): New declaration.
+ * config/rs6000/rs6000.c (rs6000_emit_int_cmove): Delete declaration.
+ (rs6000_emit_sISEL): Delete.
+ (rs6000_emit_int_cmove): Make non-static.
+ * config/rs6000/rs6000.md (cstore<mode>4): Use rs6000_emit_int_cmove
+ instead of rs6000_emit_sISEL.
+
+2017-11-03 Jan Hubicka <hubicka@ucw.cz>
+
+ * asan.c (create_cond_insert_point): Maintain profile.
+ * ipa-utils.c (ipa_merge_profiles): Be sure only IPA profiles are
+ merged.
+ * basic-block.h (struct basic_block_def): Remove frequency.
+ (EDGE_FREQUENCY): Use to_frequency.
+ * bb-reorder.c (push_to_next_round_p): Use only IPA counts for global
+ heuristics.
+ (find_traces): Update to use to_frequency.
+ (find_traces_1_round): Likewise; use only IPA counts.
+ (bb_to_key): Likewise.
+ (connect_traces): Use IPA counts only.
+ (copy_bb_p): Update to use to_frequency.
+ (fix_up_crossing_landing_pad): Likewise.
+ (sanitize_hot_paths): Likewise.
+ * bt-load.c (basic_block_freq): Likewise.
+ * cfg.c (init_flow): Set count_max to uninitialized.
+ (check_bb_profile): Remove frequencies; check counts.
+ (dump_bb_info): Do not dump frequencies.
+ (update_bb_profile_for_threading): Update counts only.
+ (scale_bbs_frequencies_int): Likewise.
+ (MAX_SAFE_MULTIPLIER): Remove.
+ (scale_bbs_frequencies_gcov_type): Update counts only.
+ (scale_bbs_frequencies_profile_count): Update counts only.
+ (scale_bbs_frequencies): Update counts only.
+ * cfg.h (struct control_flow_graph): Add count_max.
+ (update_bb_profile_for_threading): Update prototype.
+ * cfgbuild.c (find_bb_boundaries): Do not update frequencies.
+ (find_many_sub_basic_blocks): Likewise.
+ * cfgcleanup.c (try_forward_edges): Likewise.
+ (try_crossjump_to_edge): Likewise.
+ * cfgexpand.c (expand_gimple_cond): Likewise.
+ (expand_gimple_tailcall): Likewise.
+ (construct_init_block): Likewise.
+ (construct_exit_block): Likewise.
+ * cfghooks.c (verify_flow_info): Check consistency of counts.
+ (dump_bb_for_graph): Do not dump frequencies.
+ (split_block_1): Do not update frequencies.
+ (split_edge): Do not update frequencies.
+ (make_forwarder_block): Do not update frequencies.
+ (duplicate_block): Do not update frequencies.
+ (account_profile_record): Do not update frequencies.
+ * cfgloop.c (find_subloop_latch_edge_by_profile): Use IPA counts
+ for global heuristics.
+ * cfgloopanal.c (average_num_loop_insns): Update to use to_frequency.
+ (expected_loop_iterations_unbounded): Use counts only.
+ * cfgloopmanip.c (scale_loop_profile): Simplify.
+ (create_empty_loop_on_edge): Simplify.
+ (loopify): Simplify.
+ (duplicate_loop_to_header_edge): Simplify.
+ * cfgrtl.c (force_nonfallthru_and_redirect): Update profile.
+ (update_br_prob_note): Take care of removing note when profile
+ becomes undefined.
+ (relink_block_chain): Do not dump frequency.
+ (rtl_account_profile_record): Use to_frequency.
+ * cgraph.c (symbol_table::create_edge): Convert count to ipa count.
+ (cgraph_edge::redirect_call_stmt_to_callee): Convert count to ipa count.
+ (cgraph_update_edges_for_call_stmt_node): Likewise.
+ (cgraph_edge::verify_count_and_frequency): Update.
+ (cgraph_node::verify_node): Temporarily disable frequency verification.
+ * cgraphbuild.c (compute_call_stmt_bb_frequency): Use
+ to_cgraph_frequency.
+ (cgraph_edge::rebuild_edges): Convert to ipa counts.
+ * cgraphunit.c (init_lowered_empty_function): Do not initialize
+ frequencies.
+ (cgraph_node::expand_thunk): Update profile.
+ * except.c (dw2_build_landing_pads): Do not update frequency.
+ * final.c (compute_alignments): Use to_frequency.
+ (dump_basic_block_info): Do not dump frequency.
+ * gimple-pretty-print.c (dump_profile): Do not dump frequency.
+ (dump_gimple_bb_header): Do not dump frequency.
+ * gimple-ssa-isolate-paths.c (isolate_path): Do not update frequency;
+ do update count.
+ * gimple-streamer-in.c (input_bb): Do not stream frequency.
+ * gimple-streamer-out.c (output_bb): Do not stream frequency.
+ * haifa-sched.c (sched_pressure_start_bb): Use to_frequency.
+ (init_before_recovery): Do not update frequency.
+ (sched_create_recovery_edges): Do not update frequency.
+ * hsa-gen.c (convert_switch_statements): Do not update frequency.
+ * ipa-cp.c (ipcp_propagate_stage): Update search for max_count.
+ (ipa_cp_c_finalize): Set max_count to uninitialized.
+ * ipa-fnsummary.c (get_minimal_bb): Use counts.
+ (param_change_prob): Use counts.
+ * ipa-profile.c (ipa_profile_generate_summary): Do not summarize
+ local profiles.
+ * ipa-split.c (consider_split): Use to_frequency.
+ (split_function): Use to_frequency.
+ * ira-build.c (loop_compare_func): Likewise.
+ (mark_loops_for_removal): Likewise.
+ (mark_all_loops_for_removal): Likewise.
+ * loop-doloop.c (doloop_modify): Do not update frequency.
+ * loop-unroll.c (unroll_loop_runtime_iterations): Do not update
+ frequency.
+ * lto-streamer-in.c (input_function): Update count_max.
+ * omp-expand.c (expand_omp_taskreg): Update count_max.
+ * omp-simd-clone.c (simd_clone_adjust): Update profile.
+ * predict.c (maybe_hot_frequency_p): Use to_frequency.
+ (maybe_hot_count_p): Use ipa counts only.
+ (maybe_hot_bb_p): Simplify.
+ (maybe_hot_edge_p): Simplify.
+ (probably_never_executed): Do not take frequency argument.
+ (probably_never_executed_bb_p): Do not pass frequency.
+ (probably_never_executed_edge_p): Likewise.
+ (combine_predictions_for_bb): Check that profile is nonzero.
+ (propagate_freq): Do not set frequency.
+ (drop_profile): Simplify.
+ (counts_to_freqs): Simplify.
+ (expensive_function_p): Use to_frequency.
+ (propagate_unlikely_bbs_forward): Simplify.
+ (determine_unlikely_bbs): Simplify.
+ (estimate_bb_frequencies): Add hack to silence graphite issues.
+ (compute_function_frequency): Use ipa counts.
+ (pass_profile::execute): Update.
+ (rebuild_frequencies): Use counts only.
+ (force_edge_cold): Use counts only.
+ * profile-count.c (profile_count::dump): Dump new count types.
+ (profile_count::differs_from_p): Check compatibility.
+ (profile_count::to_frequency): New function.
+ (profile_count::to_cgraph_frequency): New function.
+ * profile-count.h (struct function): Declare.
+ (enum profile_quality): Add profile_guessed_local and
+ profile_guessed_global0.
+ (class profile_probability): Decrease number of bits to 29;
+ update from_reg_br_prob_note and to_reg_br_prob_note.
+ (class profile_count): Update comment; decrease number of bits
+ to 61. Check compatibility.
+ (profile_count::compatible_p): New private member function.
+ (profile_count::ipa_p): New member function.
+ (profile_count::operator<): Handle global zero correctly.
+ (profile_count::operator>): Handle global zero correctly.
+ (profile_count::operator<=): Handle global zero correctly.
+ (profile_count::operator>=): Handle global zero correctly.
+ (profile_count::nonzero_p): New member function.
+ (profile_count::force_nonzero): New member function.
+ (profile_count::max): New member function.
+ (profile_count::apply_scale): Handle IPA scaling.
+ (profile_count::guessed_local): New member function.
+ (profile_count::global0): New member function.
+ (profile_count::ipa): New member function.
+ (profile_count::to_frequency): Declare.
+ (profile_count::to_cgraph_frequency): Declare.
+ * profile.c (OVERLAP_BASE): Delete.
+ (compute_frequency_overlap): Delete.
+ (compute_branch_probabilities): Do not use compute_frequency_overlap.
+ * regs.h (REG_FREQ_FROM_BB): Use to_frequency.
+ * sched-ebb.c (rank): Use counts only.
+ * shrink-wrap.c (handle_simple_exit): Use counts only.
+ (try_shrink_wrapping): Use counts only.
+ (place_prologue_for_one_component): Use counts only.
+ * tracer.c (find_best_predecessor): Use to_frequency.
+ (find_trace): Use to_frequency.
+ (tail_duplicate): Use to_frequency.
+ * trans-mem.c (expand_transaction): Do not update frequency.
+ * tree-call-cdce.c: Do not update frequency.
+ * tree-cfg.c (gimple_find_sub_bbs): Likewise.
+ (gimple_merge_blocks): Likewise.
+ (gimple_split_edge): Likewise.
+ (gimple_duplicate_sese_region): Likewise.
+ (gimple_duplicate_sese_tail): Likewise.
+ (move_sese_region_to_fn): Likewise.
+ (gimple_account_profile_record): Likewise.
+ (insert_cond_bb): Likewise.
+ * tree-complex.c (expand_complex_div_wide): Likewise.
+ * tree-eh.c (lower_resx): Update profile.
+ * tree-inline.c (copy_bb): Simplify count scaling; do not scale
+ frequencies.
+ (initialize_cfun): Do not initialize frequencies.
+ (freqs_to_counts): Delete.
+ (copy_cfg_body): Ignore count parameter.
+ (copy_body): Update.
+ (expand_call_inline): Update count_max.
+ (optimize_inline_calls): Update count_max.
+ (tree_function_versioning): Update count_max.
+ * tree-ssa-coalesce.c (coalesce_cost_bb): Use to_frequency.
+ * tree-ssa-ifcombine.c (update_profile_after_ifcombine): Do not update
+ frequency.
+ * tree-ssa-loop-im.c (execute_sm_if_changed): Use counts only.
+ * tree-ssa-loop-ivcanon.c (unloop_loops): Do not update frequency.
+ (try_peel_loop): Likewise.
+ * tree-ssa-loop-ivopts.c (get_scaled_computation_cost_at): Use
+ to_frequency.
+ * tree-ssa-loop-manip.c (niter_for_unrolled_loop): Pass -1.
+ (tree_transform_and_unroll_loop): Do not use frequencies.
+ * tree-ssa-loop-niter.c (estimate_numbers_of_iterations):
+ Use reliable prediction only.
+ * tree-ssa-loop-unswitch.c (hoist_guard): Do not use frequencies.
+ * tree-ssa-sink.c (select_best_block): Use to_frequency.
+ * tree-ssa-tail-merge.c (replace_block_by): Temporarily disable
+ probability scaling.
+ * tree-ssa-threadupdate.c (create_block_for_threading): Do
+ not update frequency.
+ (any_remaining_duplicated_blocks): Likewise.
+ (update_profile): Likewise.
+ (estimated_freqs_path): Delete.
+ (freqs_to_counts_path): Delete.
+ (clear_counts_path): Delete.
+ (ssa_fix_duplicate_block_edges): Likewise.
+ (duplicate_thread_path): Likewise.
+ * tree-switch-conversion.c (gen_inbound_check): Use counts.
+ * tree-tailcall.c (decrease_profile): Do not update frequency.
+ (eliminate_tail_call): Likewise.
+ * tree-vect-loop-manip.c (vect_do_peeling): Likewise.
+ * tree-vect-loop.c (scale_profile_for_vect_loop): Likewise.
+ (optimize_mask_stores): Likewise.
+ * tree-vect-stmts.c (vectorizable_simd_clone_call): Likewise.
+ * ubsan.c (ubsan_expand_null_ifn): Update profile.
+ (ubsan_expand_ptr_ifn): Update profile.
+ * value-prof.c (gimple_ic): Simplify.
+ * value-prof.h (gimple_ic): Update prototype.
+ * ipa-inline-transform.c (inline_transform): Fix scaling conditions.
+ * ipa-inline.c (compute_uninlined_call_time): Be sure that
+ counts are nonzero.
+ (want_inline_self_recursive_call_p): Likewise.
+ (resolve_noninline_speculation): Only accumulate defined counts.
+ (inline_small_functions): Use nonzero_p.
+ (ipa_inline): Do not access freed node.
+
+2017-11-03 Wilco Dijkstra <wdijkstr@arm.com>
+
+ * config/aarch64/aarch64.c (aarch64_override_options_internal):
+ Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.
+
+2017-11-03 Kito Cheng <kito.cheng@gmail.com>
+
+ * config/riscv/riscv.c (riscv_legitimize_move): Handle
+ non-legitimate address.
+
+2017-11-03 Segher Boessenkool <segher@kernel.crashing.org>
+
+ * config/rs6000/rs6000.md (*lt0_disi): Delete.
+ (*lt0_<mode>di, *lt0_<mode>si): New.
+
+2017-11-03 Segher Boessenkool <segher@kernel.crashing.org>
+
+ * config/rs6000/rs6000.md (move_from_CR_ov_bit): Change condition to
+ TARGET_PAIRED_FLOAT.
+
+2017-11-03 Siddhesh Poyarekar <siddhesh.poyarekar@linaro.org>
+ Jim Wilson <jim.wilson@linaro.org>
+
+ * config/aarch64/aarch64-cores.def (saphira): New CPU.
+ * config/aarch64/aarch64-tune.md: Regenerated.
+ * doc/invoke.texi (AArch64 Options/-mtune): Add "saphira".
+ * config/aarch64/aarch64.c (saphira_tunings): New tuning table.
+
+2017-11-03 Cupertino Miranda <cmiranda@synopsys.com>
+
+ * config/arc/arc.c (arc_save_restore): Corrected CFA note.
+ (arc_expand_prologue): Restore blink for millicode.
+ * config/arc/linux.h (LINK_EH_SPEC): Defined.
+
+2017-11-03 Richard Sandiford <richard.sandiford@linaro.org>
+
+ PR target/82809
+ * config/i386/i386.c (ix86_vector_duplicate_value): Use
+ gen_vec_duplicate after forcing the scalar into a register.
+
+2017-11-02 Segher Boessenkool <segher@kernel.crashing.org>
+
+ * combine.c (try_combine): Print the insns input to try_combine to the
+ dump file.
+
+2017-11-02 Steve Ellcey <sellcey@cavium.com>
+
+ PR target/79868
+ * config/aarch64/aarch64-c.c (aarch64_pragma_target_parse):
+ Remove second argument from aarch64_process_target_attr call.
+ * config/aarch64/aarch64-protos.h (aarch64_process_target_attr):
+ Ditto.
+ * config/aarch64/aarch64.c (aarch64_attribute_info): Change
+ field type.
+ (aarch64_handle_attr_arch): Remove second argument.
+ (aarch64_handle_attr_cpu): Ditto.
+ (aarch64_handle_attr_tune): Ditto.
+ (aarch64_handle_attr_isa_flags): Ditto.
+ (aarch64_process_one_target_attr): Ditto.
+ (aarch64_process_target_attr): Ditto.
+ (aarch64_option_valid_attribute_p): Remove second argument
+ from aarch64_process_target_attr call.
+
+2017-11-02 David Malcolm <dmalcolm@redhat.com>
+
+ * diagnostic.c: Include "selftest-diagnostic.h".
+ (selftest::assert_location_text): New function.
+ (selftest::test_diagnostic_get_location_text): New function.
+ (selftest::diagnostic_c_tests): Call it.
+
+2017-11-02 David Malcolm <dmalcolm@redhat.com>
+
+ * Makefile.in (OBJS-libcommon): Add selftest-diagnostic.o.
+ * diagnostic-show-locus.c: Include "selftest-diagnostic.h".
+ (class selftest::test_diagnostic_context): Move to...
+ * selftest-diagnostic.c: New file.
+ * selftest-diagnostic.h: New file.
+
+2017-11-02 James Bowman <james.bowman@ftdichip.com>
+
+ * config/ft32/ft32.c (ft32_addr_space_legitimate_address_p): Increase
+ offset range for FT32B.
+ * config/ft32/ft32.h: Option "mcompress" enables relaxation.
+ * config/ft32/ft32.md: Add TARGET_NOPM.
+ * config/ft32/ft32.opt: Add mft32b, mcompress, mnopm.
+ * doc/invoke.texi: Add mft32b, mcompress, mnopm.
+
+2017-11-02 Wilco Dijkstra <wdijkstr@arm.com>
+
+ * config/aarch64/aarch64.h (MALLOC_ABI_ALIGNMENT): New define.
+
+2017-11-02 Jeff Law <law@redhat.com>
+
+ * gimple-ssa-sprintf.c (sprintf_dom_walker): Remove
+ virtual keyword on FINAL OVERRIDE members.
+
+ * tree-ssa-propagate.h (ssa_propagation_engine): Group
+ virtuals together. Add virtual destructor.
+ (substitute_and_fold_engine): Similarly.
+
+2017-11-02 Jan Hubicka <hubicka@ucw.cz>
+
+ * x86-tune.def (X86_TUNE_USE_INCDEC): Enable for Haswell+.
+
+2017-11-02 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82795
+ * tree-if-conv.c (predicate_mem_writes): Remove bogus assert.
+
+2017-11-02 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
+
+ * acinclude.m4 (gcc_AC_INITFINI_ARRAY): Don't require
+ gcc_SUN_LD_VERSION.
+ (gcc_GAS_CHECK_FEATURE): Remove.
+ * configure.ac (ld_vers) <*-*-solaris2*>: Move comments from
+ gcc_AC_INITFINI_ARRAY here. Update for Solaris 11.4 changes.
+ * configure: Regenerate.
+
+2017-11-02 Claudiu Zissulescu <claziss@synopsys.com>
+
+ * config/arc/arc.c (hwloop_optimize): Account for empty
+ body loops.
+
+2017-11-02 Richard Biener <rguenther@suse.de>
+
+ PR middle-end/82765
+ * varasm.c (decode_addr_const): Make offset HOST_WIDE_INT.
+ Truncate ARRAY_REF index and element size.
+
+2017-11-01 Palmer Dabbelt <palmer@dabbelt.com>
+
+ * doc/invoke.texi (RISC-V Options): Use "@minus{}2 GB", not "-2 GB".
+
+2017-11-01 Jeff Law <law@redhat.com>
+
+ * tree-ssa-ccp.c (ccp_folder): New class derived from
+ substitute_and_fold_engine.
+ (ccp_folder::get_value): New member function.
+ (ccp_folder::fold_stmt): Renamed from ccp_fold_stmt.
+ (ccp_fold_stmt): Remove prototype.
+ (ccp_finalize): Call substitute_and_fold from the ccp_folder class.
+ * tree-ssa-copy.c (copy_folder): New class derived from
+ substitute_and_fold_engine.
+ (copy_folder::get_value): Renamed from get_value.
+ (fini_copy_prop): Call substitute_and_fold from copy_folder class.
+ * tree-vrp.c (vrp_folder): New class derived from
+ substitute_and_fold_engine.
+ (vrp_folder::fold_stmt): Renamed from vrp_fold_stmt.
+ (vrp_folder::get_value): New member function.
+ (vrp_finalize): Call substitute_and_fold from vrp_folder class.
+ (evrp_dom_walker::before_dom_children): Similarly for replace_uses_in.
+ * tree-ssa-propagate.h (substitute_and_fold_engine): New class to
+ provide a class interface to folder/substitute routines.
+ (ssa_prop_fold_stmt_fn): Remove typedef.
+ (ssa_prop_get_value_fn): Likewise.
+ (substitute_and_fold): Remove prototype.
+ (replace_uses_in): Likewise.
+ * tree-ssa-propagate.c (substitute_and_fold_engine::replace_uses_in):
+ Renamed from replace_uses_in. Call the virtual member function.
+ (substitute_and_fold_engine::replace_phi_args_in): Similarly.
+ (substitute_and_fold_dom_walker): Remove initialization of
+ data member entries for callbacks. Add substitute_and_fold_engine
+ member and initialize it.
+ (substitute_and_fold_dom_walker::before_dom_children): Use the
+ member functions for get_value, replace_phi_args_in,
+ replace_uses_in, and fold_stmt calls.
+ (substitute_and_fold_engine::substitute_and_fold): Renamed from
+ substitute_and_fold. Remove assert. Update ctor call.
+
+ * tree-ssa-propagate.h (ssa_prop_visit_stmt_fn): Remove typedef.
+ (ssa_prop_visit_phi_fn): Likewise.
+ (class ssa_propagation_engine): New class to provide an interface
+ into ssa_propagate.
+ * tree-ssa-propagate.c (ssa_prop_visit_stmt): Remove file scoped
+ variable.
+ (ssa_prop_visit_phi): Likewise.
+ (ssa_propagation_engine::simulate_stmt): Moved into class.
+ Call visit_phi/visit_stmt from the class rather than via
+ file scoped static variables.
+ (ssa_propagation_engine::simulate_block): Moved into class.
+ (ssa_propagation_engine::process_ssa_edge_worklist): Similarly.
+ (ssa_propagation_engine::ssa_propagate): Similarly. No longer
+ set file scoped statics for the visit_stmt/visit_phi callbacks.
+ * tree-complex.c (complex_propagate): New class derived from
+ ssa_propagation_engine.
+ (complex_propagate::visit_stmt): Renamed from complex_visit_stmt.
+ (complex_propagate::visit_phi): Renamed from complex_visit_phi.
+ (tree_lower_complex): Call ssa_propagate via the complex_propagate
+ class.
+ * tree-ssa-ccp.c (ccp_propagate): New class derived from
+ ssa_propagation_engine.
+ (ccp_propagate::visit_phi): Renamed from ccp_visit_phi_node.
+ (ccp_propagate::visit_stmt): Renamed from ccp_visit_stmt.
+ (do_ssa_ccp): Call ssa_propagate from the ccp_propagate class.
+ * tree-ssa-copy.c (copy_prop): New class derived from
+ ssa_propagation_engine.
+ (copy_prop::visit_stmt): Renamed from copy_prop_visit_stmt.
+ (copy_prop::visit_phi): Renamed from copy_prop_visit_phi_node.
+ (execute_copy_prop): Call ssa_propagate from the copy_prop class.
+ * tree-vrp.c (vrp_prop): New class derived from ssa_propagation_engine.
+ (vrp_prop::visit_stmt): Renamed from vrp_visit_stmt.
+ (vrp_prop::visit_phi): Renamed from vrp_visit_phi_node.
+ (execute_vrp): Call ssa_propagate from the vrp_prop class.
+
+2017-11-01 Jakub Jelinek <jakub@redhat.com>
+
+ PR rtl-optimization/82778
+ PR rtl-optimization/82597
+ * compare-elim.c (struct comparison): Add in_a_setter field.
+ (find_comparison_dom_walker::before_dom_children): Remove killed
+ bitmap and df_simulate_find_defs call, instead walk the defs.
+ Compute last_setter and initialize in_a_setter. Merge definitions
+ with first initialization for a few variables.
+ (try_validate_parallel): Use insn_invalid_p instead of
+ recog_memoized. Return insn rather than just the pattern.
+ (try_merge_compare): Fix up comment. Don't uselessly test if
+ in_a is a REG_P. Use cmp->in_a_setter instead of walking UD
+ chains.
+ (execute_compare_elim_after_reload): Remove df_chain_add_problem
+ call.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * config/aarch64/aarch64.c (aarch64_rtx_costs): Use
+ aarch64_hard_regno_nregs to get the number of registers
+ in a mode.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * config/aarch64/constraints.md (Upl): Rename to...
+ (Uaa): ...this.
+ * config/aarch64/aarch64.md
+ (*zero_extend<SHORT:mode><GPI:mode>2_aarch64, *addsi3_aarch64_uxtw):
+ Update accordingly.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * config/aarch64/aarch64.c (aarch64_add_constant_internal)
+ (aarch64_add_constant, aarch64_add_sp, aarch64_sub_sp): Move
+ earlier in file.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * config/aarch64/aarch64.c (aarch64_evpc_trn, aarch64_evpc_uzp)
+ (aarch64_evpc_zip, aarch64_evpc_ext, aarch64_evpc_rev)
+ (aarch64_evpc_dup): Generate rtl directly, rather than using
+ named expanders.
+ (aarch64_expand_vec_perm_const_1): Explicitly check for permutes
+ of a single element.
+ * config/aarch64/iterators.md: Add a comment above the permute
+ unspecs to say that they are generated directly by
+ aarch64_expand_vec_perm_const.
+ * config/aarch64/aarch64-simd.md: Likewise the permute instructions.
+
+2017-11-01 Nathan Sidwell <nathan@acm.org>
+
+ * tree-dump.c (dequeue_and_dump): Use HAS_DECL_ASSEMBLER_NAME_P.
+
+2017-11-01 Palmer Dabbelt <palmer@dabbelt.com>
+
+ * doc/invoke.texi (RISC-V Options): Explicitly name the medlow
+ and medany code models, and describe what they do.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+
+ Revert accidental duplicate:
+
+ * combine.c (can_change_dest_mode): Reject changes in
+ REGMODE_NATURAL_SIZE.
+
+2017-11-01 Segher Boessenkool <segher@kernel.crashing.org>
+
+ PR rtl-optimization/64682
+ PR rtl-optimization/69567
+ PR rtl-optimization/69737
+ PR rtl-optimization/82683
+ * combine.c (distribute_notes) <REG_DEAD>: If the new I2 sets the same
+ register mentioned in the note, drop the note, unless it came from I3,
+ in which case it should go to I3 again.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * tree-ssa-dse.c (normalize_ref): Check whether the ranges overlap
+ and return false if not.
+ (clear_bytes_written_by, live_bytes_read): Update accordingly.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * tree-ssa-alias.h (ranges_overlap_p): Return false if either
+ range is known to be empty.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * simplify-rtx.c (simplify_const_unary_operation): Use GET_MODE_NUNITS
+ and CONST_VECTOR_NUNITS instead of computing the number of units from
+ the byte sizes of the vector and element.
+ (simplify_binary_operation_1): Likewise.
+ (simplify_const_binary_operation): Likewise.
+ (simplify_ternary_operation): Likewise.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * var-tracking.c (INT_MEM_OFFSET): Replace with...
+ (int_mem_offset): ...this new function.
+ (var_mem_set, var_mem_delete_and_set, var_mem_delete)
+ (find_mem_expr_in_1pdv, dataflow_set_preserve_mem_locs)
+ (same_variable_part_p, use_type, add_stores, vt_get_decl_and_offset):
+ Update accordingly.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * lower-subreg.c (interesting_mode_p): New function.
+ (compute_costs, find_decomposable_subregs, decompose_register)
+ (simplify_subreg_concatn, can_decompose_p, resolve_simple_move)
+ (resolve_clobber, dump_choices): Use it.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * rtlhash.c (add_rtx): Use add_hwi for 'w' and add_int for 'i'.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * alias.c (find_base_value, find_base_term): Only process integer
+ truncations. Check the precision rather than the size.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * machmode.h (is_narrower_int_mode): New function.
+ * optabs.c (expand_float, expand_fix): Use it.
+ * dwarf2out.c (rotate_loc_descriptor): Likewise.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * rtl.h (narrower_subreg_mode): New function.
+ * ira-color.c (update_costs_from_allocno): Use it.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * optabs-query.h (convert_optab_p): New function, split out from...
+ (convert_optab_handler): ...here.
+ (widening_optab_handler): Delete.
+ (find_widening_optab_handler): Remove permit_non_widening parameter.
+ (find_widening_optab_handler_and_mode): Likewise. Provide an
+ override that operates on mode class wrappers.
+ * optabs-query.c (widening_optab_handler): Delete.
+ (find_widening_optab_handler_and_mode): Remove permit_non_widening
+ parameter. Assert that the two modes are the same class and that
+ the "from" mode is narrower than the "to" mode. Use
+ convert_optab_handler instead of widening_optab_handler.
+ * expmed.c (expmed_mult_highpart_optab): Use convert_optab_handler
+ instead of widening_optab_handler.
+ * expr.c (expand_expr_real_2): Update calls to
+ find_widening_optab_handler.
+ * optabs.c (expand_widen_pattern_expr): Likewise.
+ (expand_binop_directly): Take the insn_code as a parameter.
+ (expand_binop): Only call find_widening_optab_handler for
+ conversion optabs; use optab_handler otherwise. Update calls
+ to find_widening_optab_handler and expand_binop_directly.
+ Use convert_optab_handler instead of widening_optab_handler.
+ * tree-ssa-math-opts.c (convert_mult_to_widen): Update calls to
+ find_widening_optab_handler and use scalar_mode rather than
+ machine_mode.
+ (convert_plusminus_to_widen): Likewise.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * machmode.h (fixed_size_mode): New class.
+ * rtl.h (get_pool_mode): Return fixed_size_mode.
+ * gengtype.c (main): Add fixed_size_mode.
+ * target.def (get_raw_result_mode): Return a fixed_size_mode.
+ (get_raw_arg_mode): Likewise.
+ * doc/tm.texi: Regenerate.
+ * targhooks.h (default_get_reg_raw_mode): Return a fixed_size_mode.
+ * targhooks.c (default_get_reg_raw_mode): Likewise.
+ * config/ia64/ia64.c (ia64_get_reg_raw_mode): Likewise.
+ * config/mips/mips.c (mips_get_reg_raw_mode): Likewise.
+ * config/msp430/msp430.c (msp430_get_raw_arg_mode): Likewise.
+ (msp430_get_raw_result_mode): Likewise.
+ * config/avr/avr-protos.h (regmask): Use as_a <fixed_size_mode>.
+ * dbxout.c (dbxout_parms): Require fixed-size modes.
+ * expr.c (copy_blkmode_from_reg, copy_blkmode_to_reg): Likewise.
+ * gimple-ssa-store-merging.c (encode_tree_to_bitpos): Likewise.
+ * omp-low.c (lower_oacc_reductions): Likewise.
+ * simplify-rtx.c (simplify_immed_subreg): Take fixed_size_modes.
+ (simplify_subreg): Update accordingly.
+ * varasm.c (constant_descriptor_rtx::mode): Change to fixed_size_mode.
+ (force_const_mem): Update accordingly. Return NULL_RTX for modes
+ that aren't fixed-size.
+ (get_pool_mode): Return a fixed_size_mode.
+ (output_constant_pool_2): Take a fixed_size_mode.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * doc/rtl.texi (vec_series): Document.
+ (const): Say that the operand can be a vec_series.
+ * rtl.def (VEC_SERIES): New rtx code.
+ * rtl.h (const_vec_series_p_1): Declare.
+ (const_vec_series_p): New function.
+ * emit-rtl.h (gen_const_vec_series): Declare.
+ (gen_vec_series): Likewise.
+ * emit-rtl.c (const_vec_series_p_1, gen_const_vec_series)
+ (gen_vec_series): Likewise.
+ * optabs.c (expand_mult_highpart): Use gen_const_vec_series.
+ * simplify-rtx.c (simplify_unary_operation): Handle negations
+ of vector series.
+ (simplify_binary_operation_series): New function.
+ (simplify_binary_operation_1): Use it. Handle VEC_SERIES.
+ (test_vector_ops_series): New function.
+ (test_vector_ops): Call it.
+ * config/powerpcspe/altivec.md (altivec_lvsl): Use
+ gen_const_vec_series.
+ (altivec_lvsr): Likewise.
+ * config/rs6000/altivec.md (altivec_lvsl, altivec_lvsr): Likewise.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * doc/rtl.texi (const): Update description of address constants.
+ Say that vector constants are allowed too.
+ * common.md (E, F): Use CONSTANT_P instead of checking for
+ CONST_VECTOR.
+ * emit-rtl.c (gen_lowpart_common): Use const_vec_p instead of
+ checking for CONST_VECTOR.
+ * expmed.c (make_tree): Use build_vector_from_val for a CONST
+ VEC_DUPLICATE.
+ * expr.c (expand_expr_real_2): Check for vector modes instead
+ of checking for CONST_VECTOR.
+ * rtl.h (const_vec_p): New function.
+ (const_vec_duplicate_p): Check for a CONST VEC_DUPLICATE.
+ (unwrap_const_vec_duplicate): Handle them here too.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ David Malcolm <dmalcolm@redhat.com>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * rtl.h (vec_duplicate_p): New function.
+ * selftest-rtl.c (assert_rtx_eq_at): New function.
+ * selftest-rtl.h (ASSERT_RTX_EQ): New macro.
+ (assert_rtx_eq_at): Declare.
+ * selftest.h (selftest::simplify_rtx_c_tests): Declare.
+ * selftest-run-tests.c (selftest::run_tests): Call it.
+ * simplify-rtx.c: Include selftest.h and selftest-rtl.h.
+ (simplify_unary_operation_1): Recursively handle vector duplicates.
+ (simplify_binary_operation_1): Likewise. Handle VEC_SELECTs of
+ vector duplicates.
+ (simplify_subreg): Handle subregs of vector duplicates.
+ (make_test_reg, test_vector_ops_duplicate, test_vector_ops)
+ (selftest::simplify_rtx_c_tests): New functions.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * emit-rtl.h (gen_const_vec_duplicate): Declare.
+ (gen_vec_duplicate): Likewise.
+ * emit-rtl.c (gen_const_vec_duplicate_1): New function, split
+ out from...
+ (gen_const_vector): ...here.
+ (gen_const_vec_duplicate, gen_vec_duplicate): New functions.
+ (gen_rtx_CONST_VECTOR): Use gen_const_vec_duplicate for constants
+ whose elements are all equal.
+ * optabs.c (expand_vector_broadcast): Use gen_const_vec_duplicate.
+ * simplify-rtx.c (simplify_const_unary_operation): Likewise.
+ (simplify_relational_operation): Likewise.
+ * config/aarch64/aarch64.c (aarch64_simd_gen_const_vector_dup):
+ Likewise.
+ (aarch64_simd_dup_constant): Use gen_vec_duplicate.
+ (aarch64_expand_vector_init): Likewise.
+ * config/arm/arm.c (neon_vdup_constant): Likewise.
+ (neon_expand_vector_init): Likewise.
+ (arm_expand_vec_perm): Use gen_const_vec_duplicate.
+ (arm_block_set_unaligned_vect): Likewise.
+ (arm_block_set_aligned_vect): Likewise.
+ * config/arm/neon.md (neon_copysignf<mode>): Likewise.
+ * config/i386/i386.c (ix86_expand_vec_perm): Likewise.
+ (expand_vec_perm_even_odd_pack): Likewise.
+ (ix86_vector_duplicate_value): Use gen_vec_duplicate.
+ * config/i386/sse.md (one_cmpl<mode>2): Use CONSTM1_RTX.
+ * config/ia64/ia64.c (ia64_expand_vecint_compare): Use
+ gen_const_vec_duplicate.
+ * config/ia64/vect.md (addv2sf3, subv2sf3): Use CONST1_RTX.
+ * config/mips/mips.c (mips_gen_const_int_vector): Use
+ gen_const_vec_duplicate.
+ (mips_expand_vector_init): Use CONST0_RTX.
+ * config/powerpcspe/altivec.md (abs<mode>2, nabs<mode>2): Likewise.
+ (define_split): Use gen_const_vec_duplicate.
+ * config/rs6000/altivec.md (abs<mode>2, nabs<mode>2): Use CONST0_RTX.
+ (define_split): Use gen_const_vec_duplicate.
+ * config/s390/vx-builtins.md (vec_genmask<mode>): Likewise.
+ (vec_ctd_s64, vec_ctd_u64, vec_ctsl, vec_ctul): Likewise.
+ * config/spu/spu.c (spu_const): Likewise.
+
+2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * combine.c (can_change_dest_mode): Reject changes in
+ REGMODE_NATURAL_SIZE.
+
+2017-10-31 Sandra Loosemore <sandra@codesourcery.com>
+
+ * configure.ac (--enable-libssp): New.
+ (gcc_cv_libc_provides_ssp): Check for explicit setting before
+ trying to determine target-specific default. Adjust indentation.
+ * configure: Regenerated.
+ * doc/install.texi (Configuration): Expand --disable-libssp
+ documentation.
+
+2017-10-31 Daniel Santos <daniel.santos@pobox.com>
+
+ * config/i386/i386.c (ix86_expand_epilogue): Correct stack
+ calculation.
+
+2017-10-31 Martin Jambor <mjambor@suse.cz>
+
+ PR c++/81702
+ * gimple-fold.c (gimple_get_virt_method_for_vtable): Remove assert.
+
+2017-10-31 David Malcolm <dmalcolm@redhat.com>
+
+ * auto-profile.c (autofdo_source_profile::read): Use
+ UNKNOWN_LOCATION rather than 0.
+ * diagnostic-core.h (warning_at_rich_loc): Rename to...
+ (warning_at): ...this overload.
+ (warning_at_rich_loc_n): Rename to...
+ (warning_n): ...this overload.
+ (error_at_rich_loc): Rename to...
+ (error_at): ...this overload.
+ (pedwarn_at_rich_loc): Rename to...
+ (pedwarn): ...this overload.
+ (permerror_at_rich_loc): Rename to...
+ (permerror): ...this overload.
+ (inform_at_rich_loc): Rename to...
+ (inform): ...this overload.
+ * diagnostic.c (diagnostic_n_impl): Delete location_t-based decl.
+ (diagnostic_n_impl_richloc): Rename to...
+ (diagnostic_n_impl): ...this rich_location *-based decl.
+ (inform_at_rich_loc): Rename to...
+ (inform): ...this, and add an assertion.
+ (inform_n): Update for removal of location_t-based diagnostic_n_impl.
+ (warning_at_rich_loc): Rename to...
+ (warning_at): ...this, and add an assertion.
+ (warning_at_rich_loc_n): Rename to...
+ (warning_n): ...this, and add an assertion.
+ (warning_n): Update location_t-based implementation for removal of
+ location_t-based diagnostic_n_impl.
+ (pedwarn_at_rich_loc): Rename to...
+ (pedwarn): ...this, and add an assertion.
+ (permerror_at_rich_loc): Rename to...
+ (permerror): ...this, and add an assertion.
+ (error_n): Update for removal of location_t-based diagnostic_n_impl.
+ (error_at_rich_loc): Rename to...
+ (error_at): ...this, and add an assertion.
+ * gcc.c (do_spec_1): Use UNKNOWN_LOCATION rather than 0.
+ (driver::do_spec_on_infiles): Likewise.
+ * substring-locations.c (format_warning_va): Update for renaming
+ of inform_at_rich_loc.
+
+2017-10-31 Michael Meissner <meissner@linux.vnet.ibm.com>
+
+ * builtins.def (DEF_FLOATN_BUILTIN): Change most _Float<N> and
+ _Float<N>X built-in functions so that the variant without the
+ "__builtin_" prefix is only enabled for the GNU C and Objective C
+ languages when they are in non-strict ANSI/ISO mode.
+ (DEF_EXT_LIB_FLOATN_NX_BUILTINS): Likewise.
+ * target.def (floatn_builtin_p): Add a target hook to control
+ whether _Float<N> and _Float<N>X built-in functions without the
+ "__builtin_" prefix are enabled, and return true for C and
+ Objective C in the default hook. Include langhooks.h in
+ targhooks.c.
+ * targhooks.h (default_floatn_builtin_p): Likewise.
+ * targhooks.c (default_floatn_builtin_p): Likewise.
+ * doc/tm.texi.in (TARGET_FLOATN_BUILTIN_P): Document the
+ floatn_builtin_p target hook.
+ * doc/tm.texi (TARGET_FLOATN_BUILTIN_P): Likewise.
+
+2017-10-31 Matthew Fortune <matthew.fortune@imgtec.com>
+ Eric Botcazou <ebotcazou@adacore.com>
+
+ PR rtl-optimization/81803
+ * lra-constraints.c (curr_insn_transform): Also reload the whole
+ register for a strict subreg no wider than a word if this is for
+ a WORD_REGISTER_OPERATIONS target.
+
+2017-10-31 Jason Merrill <jason@redhat.com>
+
+ * gdbinit.in: Skip over inlines from timevar.h.
+
+2017-10-31 Martin Liska <mliska@suse.cz>
+
+ * doc/gcov.texi: Document new option.
+ * gcov.c (print_usage): Likewise print it.
+ (process_args): Support the argument.
+ (format_count): New function.
+ (format_gcov): Use the function.
+
+2017-10-31 Martin Liska <mliska@suse.cz>
+
+ * gcov.c (struct name_map): Do not use typedef.
+ Define operator== and operator<.
+ (name_search): Remove.
+ (name_sort): Remove.
+ (main): Do not allocate names.
+ (process_file): Add vertical space.
+ (generate_results): Use std::find.
+ (release_structures): Do not release memory.
+ (find_source): Use std::find.
+
+2017-10-31 Martin Liska <mliska@suse.cz>
+
+ * gcov.c (struct line_info): Remove its typedef.
+ (line_info::line_info): Add proper ctor.
+ (line_info::has_block): Do not use a typedef.
+ (struct source_info): Do not use typedef.
+ (circuit): Likewise.
+ (get_cycles_count): Likewise.
+ (output_intermediate_file): Iterate via vector iterator.
+ (add_line_counts): Use std::vector methods.
+ (accumulate_line_counts): Likewise.
+ (output_lines): Likewise.
+
+2017-10-31 Martin Liska <mliska@suse.cz>
+
+ * gcov.c (struct source_info): Remove typedef.
+ (source_info::source_info): Add proper ctor.
+ (accumulate_line_counts): Use struct, not its typedef.
+ (output_gcov_file): Likewise.
+ (output_lines): Likewise.
+ (main): Do not allocate an array.
+ (output_intermediate_file): Use size of vector container.
+ (process_file): Resize the vector.
+ (generate_results): Do not preallocate, use newly added vector
+ lines.
+ (release_structures): Do not release sources.
+ (find_source): Use vector methods.
+ (add_line_counts): Do not use typedef.
+
+2017-10-31 Martin Liska <mliska@suse.cz>
+
+ * doc/gcov.texi: Document that.
+ * gcov.c (add_line_counts): Mark lines with a non-executed
+ statement.
+ (output_line_beginning): Handle such lines.
+ (output_lines): Pass new argument.
+ (output_intermediate_file): Print it in intermediate format.
+
+2017-10-31 Martin Liska <mliska@suse.cz>
+
+ * color-macros.h: New file.
+ * diagnostic-color.c: Factor out color related to macros to
+ color-macros.h.
+ * doc/gcov.texi: Document -k option.
+ * gcov.c (INCLUDE_STRING): Include string.h.
+ (print_usage): Add -k option.
+ (process_args): Parse it.
+ (pad_count_string): New function.
+ (output_line_beginning): Likewise.
+ (DEFAULT_LINE_START): New macro.
+ (output_lines): Support color output.
+
+2017-10-31 Martin Liska <mliska@suse.cz>
+
+ PR gcov-profile/82633
+ * doc/gcov.texi: Document -fkeep-{static,inline}-functions and
+ their interaction with GCOV infrastructure.
+ * configure.ac: Add -fkeep-{inline,static}-functions to
+ coverage_flags.
+ * configure: Regenerate.
+
+2017-10-31 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/82772
+ * config/alpha/sync.md (fetchop_constr) <and>: Change to "rINM".
+
+2017-10-31 Segher Boessenkool <segher@kernel.crashing.org>
+
+ PR target/82674
+ * config/rs6000/rs6000.md (allocate_stack): Force update interval
+ into a register if it does not fit into an immediate offset field.
+
+2017-10-31 Olivier Hainque <hainque@adacore.com>
+
+ * Makefile.in (FLAGS_TO_PASS): Pass libsubdir as well.
+
+2017-10-31 Julia Koval <julia.koval@intel.com>
+
+ * config.gcc: Add gfniintrin.h.
+ * config/i386/gfniintrin.h: New.
+ * config/i386/i386-builtin-types.def
+ (__builtin_ia32_vgf2p8affineinvqb_v64qi,
+ __builtin_ia32_vgf2p8affineinvqb_v64qi_mask,
+ __builtin_ia32_vgf2p8affineinvqb_v32qi,
+ __builtin_ia32_vgf2p8affineinvqb_v32qi_mask,
+ __builtin_ia32_vgf2p8affineinvqb_v16qi,
+ __builtin_ia32_vgf2p8affineinvqb_v16qi_mask): New builtins.
+ * config/i386/i386-builtin.def (V64QI_FTYPE_V64QI_V64QI_INT_V64QI_UDI,
+ V32QI_FTYPE_V32QI_V32QI_INT_V32QI_USI,
+ V16QI_FTYPE_V16QI_V16QI_INT_V16QI_UHI,
+ V64QI_FTYPE_V64QI_V64QI_INT): New types.
+ * config/i386/i386.c (ix86_expand_args_builtin): Handle new types.
+ * config/i386/immintrin.h: Include gfniintrin.h.
+ * config/i386/sse.md (vgf2p8affineinvqb_*): New pattern.
+
+2017-10-30 Eric Botcazou <ebotcazou@adacore.com>
+
+ * gcc.c (HAVE_TARGET_EXECUTABLE_SUFFIX): Remove old kludge.
+
+2017-10-30 Wilco Dijkstra <wdijkstr@arm.com>
+
+ * config/arm/arm.md (ashldi3): Remove shift by 1 expansion.
+ (arm_ashldi3_1bit): Remove pattern.
+ (ashrdi3): Remove shift by 1 expansion.
+ (arm_ashrdi3_1bit): Remove pattern.
+ (lshrdi3): Remove shift by 1 expansion.
+ (arm_lshrdi3_1bit): Remove pattern.
+ * config/arm/arm.c (arm_rtx_costs_internal): Slightly increase
+ cost of ashldi3 by 1.
+ * config/arm/neon.md (ashldi3_neon): Remove shift by 1 expansion.
+ (<shift>di3_neon): Likewise.
+
+2017-10-30 Dominik Infuehr <dominik.infuehr@theobroma-systems.com>
+
+ * config/aarch64/aarch64-simd.md (*aarch64_simd_mov): Rename
+ both identically named patterns to (*aarch64_simd_mov<VD:mode>)
+ and (*aarch64_simd_mov<VQ:mode>).
+ (*aarch64_simd_mov<VD:mode>): Change type attribute to match
+ pattern alternative.
+ (*aarch64_simd_mov<VQ:mode>): Re-order and change type
+ attributes to match pattern alternative.
+
+2017-10-30 Steven Munroe <munroesj@gcc.gnu.org>
+
+ * config.gcc (powerpc*-*-*): Add emmintrin.h.
+ * config/rs6000/emmintrin.h: New file.
+ * config/rs6000/x86intrin.h [__ALTIVEC__]: Include emmintrin.h.
+
+2017-10-30 Wilco Dijkstra <wdijkstr@arm.com>
+
+ * config/arm/vfp.md (movdi_vfp): Merge changes from movdi_vfp_cortexa8.
+ (movdi_vfp_cortexa8): Remove pattern.
+
+2017-10-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
+
+ * doc/install.texi (Specific, alpha*-*-*): Remove DEC OSF/1
+ etc. reference.
+ (Specific, alpha*-dec-osf5.1): Remove.
+ (Specific, mips-sgi-irix5): Remove.
+ (Specific, mips-sgi-irix6): Remove.
+
+2017-10-30 Jakub Jelinek <jakub@redhat.com>
+
+ PR middle-end/22141
+ * gimple-ssa-store-merging.c (merged_store_group::apply_stores): Fix
+ arguments to clear_bit_region_be.
+
+2017-10-30 Jim Wilson <wilson@tuliptree.org>
+
+ * gimplify.c: Include memmodel.h.
+
+2017-10-30 Martin Jambor <mjambor@suse.cz>
+
+ * omp-grid.c (grid_attempt_target_gridification): Also insert a
+ condition whether loop should be executed at all.
+
+2017-10-30 Will Schmidt <will_schmidt@vnet.ibm.com>
+
+ * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
+ gimple folding of vec_madd() intrinsics.
+ * config/rs6000/altivec.md (mulv8hi3): Rename altivec_vmladduhm to
+ fmav8hi4. (altivec_vmladduhm): Rename to fmav8hi4.
+ * config/rs6000/rs6000-builtin.def: Rename vmladduhm to fmav8hi4.
+
+2017-10-30 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82762
+ Revert
+ 2017-10-23 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82129
+ Revert
+ 2017-08-01 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/81181
+ * tree-ssa-pre.c (compute_antic_aux): Defer clean() to ...
+ (compute_antic): ... end of iteration here.
+
+2017-10-30 Joseph Myers <joseph@codesourcery.com>
+
+ * doc/invoke.texi (C Dialect Options): Document -std=c17,
+ -std=iso9899:2017 and -std=gnu17.
+ * doc/standards.texi (C Language): Document C17 support.
+ * doc/cpp.texi (Overview): Mention -std=c17.
+ (Standard Predefined Macros): Document C11 and C17 values of
+ __STDC_VERSION__. Do not refer to C99 support as incomplete.
+ * doc/extend.texi (Inline): Do not list individual options for
+ standards newer than C99.
+ * dwarf2out.c (highest_c_language, gen_compile_unit_die): Handle
+ "GNU C17".
+ * config/rl78/rl78.c (rl78_option_override): Handle "GNU C17"
+ language name.
+
+2017-10-30 Maxim Ostapenko <m.ostapenko@samsung.com>
+
+ * asan.c (asan_finish_file): Align asan globals array by shadow
+ granularity.
+
+2017-10-30 Jakub Jelinek <jakub@redhat.com>
+
+ PR middle-end/22141
+ * gimple-ssa-store-merging.c: Include rtl.h and expr.h.
+ (struct store_immediate_info): Add bitregion_start and bitregion_end
+ fields.
+ (store_immediate_info::store_immediate_info): Add brs and bre
+ arguments and initialize bitregion_{start,end} from those.
+ (struct merged_store_group): Add bitregion_start, bitregion_end,
+ align_base and mask fields. Drop unnecessary struct keyword from
+ struct store_immediate_info. Add do_merge method.
+ (clear_bit_region_be): Use memset instead of loop storing zeros.
+ (merged_store_group::do_merge): New method.
+ (merged_store_group::merge_into): Use do_merge. Allow gaps in between
+ stores as long as the surrounding bitregions have no gaps.
+ (merged_store_group::merge_overlapping): Use do_merge.
+ (merged_store_group::apply_stores): Test that bitregion_{start,end}
+ is byte aligned, rather than requiring that start and width are
+ byte aligned. Drop unnecessary struct keyword from
+ struct store_immediate_info. Allocate and populate also mask array.
+ Make start of the arrays relative to bitregion_start rather than
+ start and size them according to bitregion_{end,start} difference.
+ (struct imm_store_chain_info): Drop unnecessary struct keyword from
+ struct store_immediate_info.
+ (pass_store_merging::gate): Punt if BITS_PER_UNIT or CHAR_BIT is not 8.
+ (pass_store_merging::terminate_all_aliasing_chains): Drop unnecessary
+ struct keyword from struct store_immediate_info.
+ (imm_store_chain_info::coalesce_immediate_stores): Allow gaps in
+ between stores as long as the surrounding bitregions have no gaps.
+ Formatting fixes.
+ (struct split_store): Add orig non-static data member.
+ (split_store::split_store): Initialize orig to false.
+ (find_constituent_stmts): Return store_immediate_info *, non-NULL
+ if there is exactly a single original stmt. Change stmts argument
+ to pointer from reference, if NULL, don't push anything to it. Add
+ first argument, use it to optimize skipping over orig stmts that
+ are known to be before bitpos already. Simplify.
+ (split_group): Return unsigned int count how many stores are or
+ would be needed rather than a bool. Add allow_unaligned argument.
+ Change split_stores argument from reference to pointer, if NULL,
+ only do a dry run computing how many stores would be produced.
+ Rewritten algorithm to use both alignment and misalign if
+ !allow_unaligned and handle bitfield stores with gaps.
+ (imm_store_chain_info::output_merged_store): Set start_byte_pos
+ from bitregion_start instead of start. Compute allow_unaligned
+ here, if true, do 2 split_group dry runs to compute which one
+ produces fewer stores and prefer aligned if equal. Punt if
+ new count is bigger or equal than original before emitting any
+ statements, rather than during that. Remove no longer needed
+ new_ssa_names tracking. Replace num_stmts with
+ split_stores.length (). Use 32-bit stack allocated entries
+ in split_stores auto_vec. Try to reuse original store lhs/rhs1
+ if possible. Handle bitfields with gaps.
+ (pass_store_merging::execute): Ignore bitsize == 0 stores.
+ Compute bitregion_{start,end} for the stores and construct
+ store_immediate_info with that. Formatting fixes.
+
+2017-10-30 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/82725
+ * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
+ UNSPEC_DTPOFF and UNSPEC_NTPOFF with SImode immediate offset.
+
+2017-10-29 Jim Wilson <wilson@tuliptree.org>
+
+ * gimplify.c: Include tm_p.h.
+
+ * common.opt (gcoff): Re-add as ignored option.
+ (gcoff1, gcoff2, gcoff3): Likewise.
+
+ * Makefile.in (OBJS): Delete sdbout.o.
+ (GTFILES): Delete $(srcdir)/sdbout.c.
+ * debug.h: Delete sdb_debug_hooks.
+ * final.c: Delete sdbout.h include.
+ (final_scan_insn): Delete SDB_DEBUG check.
+ (rest_of_clean_state): Likewise.
+ * output.h: Delete sdb_begin_function_line.
+ * sdbout.c: Delete.
+ * sdbout.h: Delete.
+ * toplev.c: Delete sdbout.h include.
+ (process_options): Delete SDB_DEBUG check.
+ * tree-core.h (tree_type_common): Delete pointer field of
+ tree_type_symtab.
+ * tree.c (copy_node): Clear TYPE_SYMTAB_DIE instead of
+ TYPE_SYMTAB_POINTER.
+ * tree.h (TYPE_SYMTAB_POINTER): Delete.
+ (TYPE_SYMTAB_IS_POINTER): Delete.
+ (TYPE_SYMTAB_IS_DIE): Renumber.
+ * xcoffout.c: Refer to former sdbout.c file.
+ (xcoffout_begin_prologue): Use past tense for sdbout.c reference.
+
+ * doc/install.texi (--with-stabs): Delete COFF and ECOFF info.
+ * doc/invoke.texi (SEEALSO): Delete adb and sdb references.
+ (Debugging Options): Delete -gcoff.
+ (-gstabs): Delete SDB reference.
+ (-gcoff): Delete.
+ (-gcoff@var{level}): Delete.
+ * doc/passes.texi (Debugging information output): Delete SDB and
+ sdbout.c references.
+ * doc/tm.texi: Regenerate.
+ * doc/tm.texi.in (DWARF_CIE_DATA_ALIGNMENT): Delete SDB from xref.
+ (SDB and DWARF): Change node name to DWARF and delete SDB and COFF
+ references.
+ (DEBUGGER_AUTO_OFFSET): Delete COFF and SDB references.
+ (PREFERRED_DEBUGGING_TYPE): Delete SDB_DEBUG and -gcoff references.
+ (SDB_DEBUGGING_INFO): Delete.
+	(PUT_SDB_@dots{}, SDB_DELIM, SDB_ALLOW_UNKNOWN_REFERENCES,
+ SDB_ALLOW_FORWARD_REFERENCES, SDB_OUTPUT_SOURCE_LINE): Delete.
+ * target.def (output_source_filename): Delete COFF reference.
+
+ * common.opt (gcoff): Delete.
+ (gxcoff+): Update Negative chain.
+ * defaults.h: Delete all references to SDB_DEBUGGING_INFO and
+ SDB_DEBUG.
+ * dwarf2out.c (gen_array_type_die): Change SDB to debuggers.
+ * flag-types.h (enum debug_info_type): Delete SDB_DEBUG.
+ * function.c (number_blocks): Delete SDB_DEBUGGING_INFO, SDB_DEBUG,
+ and SDB references.
+ (expand_function_start): Change sdb reference to past tense.
+ (expand_function_end): Change sdb reference to past tense.
+ * gcc.c (cpp_unique_options): Delete gcoff3 reference.
+ * opts.c (debug_type_names): Delete coff entry.
+ (common_handle_option): Delete OPT_gcoff case.
+ * system.h (SDB_DEBUG, SDB_DEBUGGING_INFO): Poison.
+
+ * config/dbxcoff.h (PREFERRED_DEBUGGING_TYPE): Set to DBX_DEBUG.
+ * config/cris/cris.h: Delete SDB reference in comment.
+ * config/i386/cygming.h: Don't define SDB_DEBUGGING_INFO.
+ (ASM_DECLARE_FUNCTION_NAME): Delete SDB reference from comment.
+ * config/i386/gas.h: Don't define SDB_DEBUGGING_INFO.
+ * config/i386/i386.c (svr4_dbx_register_map): Change SDB references
+ to past tense.
+ (ix86_expand_prologue): Likewise.
+ * config/i386/winnt.c (i386_pe_start_function): Don't check SDB_DEBUG.
+ * config/ia64/ia64.h: Likewise.
+ * config/m68k/m68kelf.h (DBX_REGISTER_NUMBER): Delete SDB reference.
+ * config/mips/mips.h (SUBTARGET_ASM_DEBUGGING_SPEC): Delete gcoff*
+ support.
+ * config/mmix/mmix.h: Likewise.
+ * config/nds32/nds32.c: Likewise.
+	* config/stormy16/stormy16.h: Likewise.
+ * config/visium/visium.h: Likewise.
+ * config/vx-common.h (SDB_DEBUGGING_INFO): Delete undef.
+
+2017-10-28 Sandra Loosemore <sandra@codesourcery.com>
+
+ * config/nios2/nios2.h (FRAME_GROWS_DOWNWARD): Define to 1.
+ * config/nios2/nios2.c (nios2_initial_elimination_offset): Make
+ FRAME_POINTER_REGNUM point at high end of local var area.
+
+2017-10-27 Eric Botcazou <ebotcazou@adacore.com>
+
+ * bb-reorder.c (find_traces_1_round): Fix off-by-one index.
+ Move comment around. Do not reset best_edge for a copiable
+ destination if the copy would cause a partition change.
+ (better_edge_p): Remove redundant check.
+
+2017-10-27 Uros Bizjak <ubizjak@gmail.com>
+
+ * config/i386/i386-protos.h (ix86_fp_compare_mode): Remove prototype.
+
+2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
+
+ * builtins.c (CASE_MATHFN_FLOATN): New helper macro to add cases
+ for math functions that have _Float<N> and _Float<N>X variants.
+ (mathfn_built_in_2): Add support for math functions that have
+ _Float<N> and _Float<N>X variants.
+ (DEF_INTERNAL_FLT_FLOATN_FN): New helper macro.
+ (expand_builtin_mathfn_ternary): Add support for fma with
+ _Float<N> and _Float<N>X variants.
+ (expand_builtin): Likewise.
+ (fold_builtin_3): Likewise.
+ * builtins.def (DEF_EXT_LIB_FLOATN_NX_BUILTINS): New macro to
+ create math function _Float<N> and _Float<N>X variants as external
+ library builtins.
+	(BUILT_IN_COPYSIGN _Float<N> and _Float<N>X variants): Use
+	DEF_EXT_LIB_FLOATN_NX_BUILTINS to make built-in functions using
+	the __builtin_ prefix and, if not strict ANSI, without the prefix.
+ (BUILT_IN_FABS _Float<N> and _Float<N>X variants): Likewise.
+ (BUILT_IN_FMA _Float<N> and _Float<N>X variants): Likewise.
+ (BUILT_IN_FMAX _Float<N> and _Float<N>X variants): Likewise.
+ (BUILT_IN_FMIN _Float<N> and _Float<N>X variants): Likewise.
+ (BUILT_IN_NAN _Float<N> and _Float<N>X variants): Likewise.
+ (BUILT_IN_SQRT _Float<N> and _Float<N>X variants): Likewise.
+ * builtin-types.def (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16): New
+ function signatures for fma _Float<N> and _Float<N>X variants.
+ (BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32): Likewise.
+ (BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64): Likewise.
+ (BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128): Likewise.
+ (BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X): Likewise.
+ (BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X): Likewise.
+ (BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X): Likewise.
+ * gencfn-macros.c (print_case_cfn): Add support for math functions
+ that have _Float<N> and _Float<N>X variants.
+ (print_define_operator_list): Likewise.
+ (fltfn_suffixes): Likewise.
+ (main): Likewise.
+ * internal-fn.def (DEF_INTERNAL_FLT_FLOATN_FN): New helper macro
+ for math functions that have _Float<N> and _Float<N>X variants.
+ (SQRT): Add support for sqrt, copysign, fmin and fmax _Float<N>
+ and _Float<N>X variants.
+ (COPYSIGN): Likewise.
+ (FMIN): Likewise.
+ (FMAX): Likewise.
+ * fold-const.c (tree_call_nonnegative_warnv_p): Add support for
+ copysign, fma, fmax, fmin, and sqrt _Float<N> and _Float<N>X
+ variants.
+	(integer_valued_real_call_p): Likewise.
+ * fold-const-call.c (fold_const_call_ss): Likewise.
+ (fold_const_call_sss): Add support for copysign, fmin, and fmax
+ _Float<N> and _Float<N>X variants.
+ (fold_const_call_ssss): Add support for fma _Float<N> and
+ _Float<N>X variants.
+ * gimple-ssa-backprop.c (backprop::process_builtin_call_use): Add
+ support for copysign and fma _Float<N> and _Float<N>X variants.
+ (backprop::process_builtin_call_use): Likewise.
+	* tree-call-cdce.c (can_test_argument_range): Add support for
+ sqrt _Float<N> and _Float<N>X variants.
+ (edom_only_function): Likewise.
+ (get_no_error_domain): Likewise.
+ * tree-ssa-math-opts.c (internal_fn_reciprocal): Likewise.
+ * tree-ssa-reassoc.c (attempt_builtin_copysign): Add support for
+ copysign _Float<N> and _Float<N>X variants.
+ * config/rs6000/rs6000-builtin.def (SQRTF128): Delete, this is now
+ handled by machine independent code.
+ (FMAF128): Likewise.
+ * doc/cpp.texi (Common Predefined Macros): Document defining
+ __FP_FAST_FMAF<N> and __FP_FAST_FMAF<N>X if the backend supports
+ fma _Float<N> and _Float<N>X variants.
+
+2017-10-27 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/82692
+ * config/i386/i386-modes.def (CCFPU): Remove definition.
+ * config/i386/i386.c (put_condition_mode): Remove CCFPU mode handling.
+ (ix86_cc_modes_compatible): Ditto.
+ (ix86_expand_carry_flag_compare): Ditto.
+ (ix86_expand_int_movcc): Ditto.
+ (ix86_expand_int_addcc): Ditto.
+ (ix86_reverse_condition): Ditto.
+ (ix86_unordered_fp_compare): Rename from ix86_fp_compare_mode.
+ Return true/false for unordered/ordered fp comparisons.
+ (ix86_cc_mode): Always return CCFPmode for float mode comparisons.
+ (ix86_prepare_fp_compare_args): Update for rename.
+ (ix86_expand_fp_compare): Update for rename. Generate unordered
+ compare RTXes wrapped with UNSPEC_NOTRAP unspec.
+ (ix86_expand_sse_compare_and_jump): Ditto.
+ * config/i386/predicates.md (fcmov_comparison_operator):
+ Remove CCFPU mode handling.
+ (ix86_comparison_operator): Ditto.
+ (ix86_carry_flag_operator): Ditto.
+ * config/i386/i386.md (UNSPEC_NOTRAP): New unspec.
+ (*cmpu<mode>_i387): Wrap compare RTX with UNSPEC_NOTRAP unspec.
+ (*cmpu<mode>_cc_i387): Ditto.
+ (FPCMP): Remove mode iterator.
+ (unord): Remove mode attribute.
+	(unord_subst): New define_subst transformation.
+ (unord): New define_subst attribute.
+ (unordered): Ditto.
+ (*cmpi<unord><MODEF:mode>): Rewrite using unord_subst transformation.
+ (*cmpi<unord>xf_i387): Ditto.
+ * config/i386/sse.md (<sse>_<unord>comi<round_saeonly_name>): Merge
+ from <sse>_comi<round_saeonly_name> and <sse>_ucomi<round_saeonly_name>
+ using unord_subst transformation.
+ * config/i386/subst.md (SUBST_A): Remove CCFP and CCFPU modes.
+ (round_saeonly): Also handle CCFP mode.
+ * reg-stack.c (subst_stack_regs_pat): Handle UNSPEC_NOTRAP unspec.
+ Remove UNSPEC_SAHF unspec handling.
+
+2017-10-27 Jan Hubicka <hubicka@ucw.cz>
+
+ * x86-tune.def (X86_TUNE_INTER_UNIT_MOVES_TO_VEC): Disable for Zen.
+
+2017-10-27 Jeff Law <law@redhat.com>
+
+ * gimple-ssa-sprintf.c: Include domwalk.h.
+ (class sprintf_dom_walker): New class, derived from dom_walker.
+ (sprintf_dom_walker::before_dom_children): New function.
+	(struct call_info): Move into sprintf_dom_walker class.
+	(compute_format_length, handle_gimple_call): Likewise.
+ (sprintf_length::execute): Call the dominator walker rather
+ than walking the statements.
+
+ * tree-vrp.c (check_all_array_refs): Do not use wi->info to smuggle
+ gimple statement locations.
+ (check_array_bounds): Corresponding changes. Get the statement's
+ location directly from wi->stmt.
+
+2017-10-27 Palmer Dabbelt <palmer@dabbelt.com>
+
+ PR target/82717
+ * doc/invoke.texi (RISC-V) <-mabi>: Correct and improve.
+
+2017-10-27 Jan Hubicka <hubicka@ucw.cz>
+
+ * config/i386/x86-tune.def (X86_TUNE_PARTIAL_REG_DEPENDENCY,
+ X86_TUNE_MOVX): Disable for Haswell and newer CPUs.
+
+2017-10-27 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82703
+ * config/i386/i386-protos.h (maybe_get_pool_constant): Removed.
+ * config/i386/i386.c (maybe_get_pool_constant): Removed.
+ (ix86_split_to_parts): Use avoid_constant_pool_reference instead of
+ maybe_get_pool_constant.
+ * config/i386/predicates.md (zero_extended_scalar_load_operand):
+ Likewise.
+
+2017-10-27 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
+
+ * doc/install.texi (Specific, i?86-*-solaris2.10): Simplify gas
+ 2.26 caveat. Update gas and gld versions.
+ (Specific, *-*-solaris2*): Update binutils version. Remove caveat
+ reference.
+
+2017-10-27 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
+
+ * cgraph.h (set_malloc_flag): Declare.
+ * cgraph.c (set_malloc_flag_1): New function.
+ (set_malloc_flag): Likewise.
+ * ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
+ * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
+ false.
+ (read_ipa_call_summary): Add support for reading is_return_callee.
+ (write_ipa_call_summary): Stream is_return_callee.
+ * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
+ * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
+ ipa-prop.h, ipa-fnsummary.h.
+ (pure_const_names): Change to static.
+ (malloc_state_e): Define.
+ (malloc_state_names): Define.
+ (funct_state_d): Add field malloc_state.
+ (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
+ (check_retval_uses): New function.
+ (malloc_candidate_p): Likewise.
+ (analyze_function): Add support for malloc attribute.
+ (pure_const_write_summary): Stream malloc_state.
+ (pure_const_read_summary): Add support for reading malloc_state.
+ (dump_malloc_lattice): New function.
+ (propagate_malloc): New function.
+ (warn_function_malloc): New function.
+ (ipa_pure_const::execute): Call propagate_malloc and
+ ipa_free_fn_summary.
+ (pass_local_pure_const::execute): Add support for malloc attribute.
+ * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.
+ * doc/invoke.texi: Document Wsuggest-attribute=malloc.
+
+2017-10-27 Martin Liska <mliska@suse.cz>
+
+ PR gcov-profile/82457
+ * doc/invoke.texi: Document that one needs a non-strict ISO mode
+ for fork-like functions to be properly instrumented.
+
+2017-10-27 Richard Biener <rguenther@suse.de>
+
+ PR middle-end/81659
+ * tree-eh.c (pass_lower_eh_dispatch::execute): Free dominator
+ info when we redirected EH.
+
+2017-10-26 Michael Collison <michael.collison@arm.com>
+
+	* config/aarch64/aarch64.md (<optab>_trunc<vf><GPI:mode>2):
+	New pattern.
+	(<optab>_trunchf<GPI:mode>2): New pattern.
+	(<optab>_trunc<vgp><GPI:mode>2): New pattern.
+ * config/aarch64/iterators.md (wv): New mode attribute.
+ (vf, VF): New mode attributes.
+ (vgp, VGP): New mode attributes.
+ (s): Update attribute with SImode and DImode prefixes.
+
+2017-10-26 Sandra Loosemore <sandra@codesourcery.com>
+
+ * config/nios2/constraints.md ("S"): Match r0rel_constant_p too.
+ * config/nios2/nios2-protos.h (r0rel_constant_p): Declare.
+	* config/nios2/nios2.c (nios2_r0rel_sec_regex): New.
+	(nios2_option_override): Initialize it.  Don't allow R0-relative
+ addressing with PIC.
+ (nios2_rtx_costs): Handle r0rel_constant_p like gprel_constant_p.
+ (nios2_symbolic_constant_p): Likewise.
+ (nios2_legitimate_address_p): Likewise.
+ (nios2_r0rel_section_name_p): New.
+ (nios2_symbol_ref_in_r0rel_data_p): New.
+ (nios2_emit_move_sequence): Handle r0rel_constant_p.
+ (r0rel_constant_p): New.
+ (nios2_print_operand_address): Handle r0rel_constant_p.
+ (nios2_cdx_narrow_form_p): Likewise.
+ * config/nios2/nios2.opt (mr0rel-sec=): New option.
+ * doc/invoke.texi (Option Summary): Add -mr0rel-sec.
+ (Nios II Options): Document -mr0rel-sec.
+
+2017-10-26 Sandra Loosemore <sandra@codesourcery.com>
+
+ * config/nios2/nios2.c: Include xregex.h.
+ (nios2_gprel_sec_regex): New.
+	(nios2_option_override): Initialize it.  Don't allow GP-relative
+ addressing with PIC.
+ (nios2_small_section_name_p): Check for regex match.
+ * config/nios2/nios2.opt (mgprel-sec=): New option.
+ * doc/invoke.texi (Option Summary): Add -mgprel-sec.
+ (Nios II Options): Document -mgprel-sec.
+
+2017-10-26 Jim Wilson <wilson@tuliptree.org>
+
+ * doc/invoke.texi (-fdebug-prefix-map): Expand documentation.
+
+2017-10-26 Tom de Vries <tom@codesourcery.com>
+
+ PR tree-optimization/82707
+ * gimple.c (gimple_copy): Fix unsharing of
+ GIMPLE_OMP_{SINGLE,TARGET,TEAMS}.
+
+2017-10-26 Olga Makhotina <olga.makhotina@intel.com>
+
+ * config/i386/avx512fintrin.h (_mm512_cmpeq_pd_mask,
+ _mm512_cmple_pd_mask, _mm512_cmplt_pd_mask,
+ _mm512_cmpneq_pd_mask, _mm512_cmpnle_pd_mask,
+ _mm512_cmpnlt_pd_mask, _mm512_cmpord_pd_mask,
+ _mm512_cmpunord_pd_mask, _mm512_mask_cmpeq_pd_mask,
+ _mm512_mask_cmple_pd_mask, _mm512_mask_cmplt_pd_mask,
+ _mm512_mask_cmpneq_pd_mask, _mm512_mask_cmpnle_pd_mask,
+ _mm512_mask_cmpnlt_pd_mask, _mm512_mask_cmpord_pd_mask,
+ _mm512_mask_cmpunord_pd_mask, _mm512_cmpeq_ps_mask,
+ _mm512_cmple_ps_mask, _mm512_cmplt_ps_mask,
+ _mm512_cmpneq_ps_mask, _mm512_cmpnle_ps_mask,
+ _mm512_cmpnlt_ps_mask, _mm512_cmpord_ps_mask,
+ _mm512_cmpunord_ps_mask, _mm512_mask_cmpeq_ps_mask,
+ _mm512_mask_cmple_ps_mask, _mm512_mask_cmplt_ps_mask,
+ _mm512_mask_cmpneq_ps_mask, _mm512_mask_cmpnle_ps_mask,
+ _mm512_mask_cmpnlt_ps_mask, _mm512_mask_cmpord_ps_mask,
+ _mm512_mask_cmpunord_ps_mask): New intrinsics.
+
+2017-10-26 Michael Meissner <meissner@linux.vnet.ibm.com>
+
+ * config/rs6000/aix.h (TARGET_IEEEQUAD_DEFAULT): Set long double
+ default to IBM.
+ * config/rs6000/darwin.h (TARGET_IEEEQUAD_DEFAULT): Likewise.
+ * config/rs6000/rs6000.opt (-mabi=ieeelongdouble): Move the
+ warning to rs6000.c. Remove the Undocumented flag, since it has
+ been documented.
+ (-mabi=ibmlongdouble): Likewise.
+ * config/rs6000/rs6000.c (TARGET_IEEEQUAD_DEFAULT): If it is not
+ already set, set the default format for long double.
+ (rs6000_debug_reg_global): Print whether long double is IBM or
+ IEEE.
+ (rs6000_option_override_internal): Rework setting long double
+ format. Only warn if the user is changing the long double default
+ and they did not use -Wno-psabi.
+ * doc/invoke.texi (PowerPC options): Update the documentation for
+ -mabi=ieeelongdouble and -mabi=ibmlongdouble.
+
+2017-10-26 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * rtl.h (wider_subreg_mode): New function.
+ * ira.h (ira_sort_regnos_for_alter_reg): Take a machine_mode *
+ rather than an unsigned int *.
+ * ira-color.c (regno_max_ref_width): Replace with...
+ (regno_max_ref_mode): ...this new variable.
+ (coalesced_pseudo_reg_slot_compare): Update accordingly.
+ Use wider_subreg_mode.
+ (ira_sort_regnos_for_alter_reg): Likewise. Take a machine_mode *
+ rather than an unsigned int *.
+ * lra-constraints.c (uses_hard_regs_p): Use wider_subreg_mode.
+ (process_alt_operands): Likewise.
+ (invariant_p): Likewise.
+ * lra-spills.c (assign_mem_slot): Likewise.
+ (add_pseudo_to_slot): Likewise.
+ * lra.c (collect_non_operand_hard_regs): Likewise.
+ (add_regs_to_insn_regno_info): Likewise.
+ * reload1.c (regno_max_ref_width): Replace with...
+ (regno_max_ref_mode): ...this new variable.
+ (reload): Update accordingly. Update call to
+ ira_sort_regnos_for_alter_reg.
+ (alter_reg): Update to use regno_max_ref_mode. Call wider_subreg_mode.
+ (init_eliminable_invariants): Update to use regno_max_ref_mode.
+ (scan_paradoxical_subregs): Likewise.
+
+2017-10-26 Wilco Dijkstra <wdijkstr@arm.com>
+
+ * config/aarch64/aarch64.h (EXIT_IGNORE_STACK): Set if alloca is used.
+ (aarch64_frame): Add emit_frame_chain boolean.
+	* config/aarch64/aarch64.c (aarch64_frame_pointer_required):
+ Move eh_return case to aarch64_layout_frame.
+ (aarch64_layout_frame): Initialize emit_frame_chain.
+ (aarch64_expand_prologue): Use emit_frame_chain.
+
+2017-10-26 Wilco Dijkstra <wdijkstr@arm.com>
+
+ * config/aarch64/aarch64.c (aarch64_layout_frame):
+ Ensure LR is always stored at the bottom of the callee-saves.
+ Remove rarely used frame layout which saves callee-saves at top of
+ frame, so the store of LR can be used as a valid probe in all cases.
+
+2017-10-26 Wilco Dijkstra <wdijkstr@arm.com>
+
+ * config/aarch64/aarch64.c (aarch64_legitimize_address_displacement):
+ Improve unaligned TImode/TFmode base/offset split.
+
+2017-10-26 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * caller-save.c (mark_referenced_regs): Use read_modify_subreg_p.
+ * combine.c (find_single_use_1): Likewise.
+ (expand_field_assignment): Likewise.
+ (move_deaths): Likewise.
+ * lra-constraints.c (simplify_operand_subreg): Likewise.
+ (curr_insn_transform): Likewise.
+ * lra.c (collect_non_operand_hard_regs): Likewise.
+ (add_regs_to_insn_regno_info): Likewise.
+ * rtlanal.c (reg_referenced_p): Likewise.
+ (covers_regno_no_parallel_p): Likewise.
+
+2017-10-26 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * wide-int-print.cc (print_hex): Loop based on extract_uhwi.
+ Don't print any bits outside the precision of the value.
+ * wide-int.cc (test_printing): Add some new tests.
+
+2017-10-26 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
+
+ * configure.ac (gcc_cv_as_ix86_xbrace_comment): Check if assembler
+ supports -xbrace_comment option.
+ * configure: Regenerate.
+ * config.in: Regenerate.
+ * config/i386/sol2.h (ASM_XBRACE_COMMENT_SPEC): Define.
+ (ASM_CPU_SPEC): Use it.
+
+2017-10-26 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * target.def (static_rtx_alignment): New hook.
+ * targhooks.h (default_static_rtx_alignment): Declare.
+ * targhooks.c (default_static_rtx_alignment): New function.
+ * doc/tm.texi.in (TARGET_STATIC_RTX_ALIGNMENT): New hook.
+ * doc/tm.texi: Regenerate.
+ * varasm.c (force_const_mem): Use targetm.static_rtx_alignment
+ instead of targetm.constant_alignment. Remove call to
+ set_mem_attributes.
+ * config/cris/cris.c (TARGET_STATIC_RTX_ALIGNMENT): Redefine.
+ (cris_preferred_mininum_alignment): New function, split out from...
+ (cris_constant_alignment): ...here.
+ (cris_static_rtx_alignment): New function.
+ * config/i386/i386.c (ix86_static_rtx_alignment): New function,
+ split out from...
+ (ix86_constant_alignment): ...here.
+ (TARGET_STATIC_RTX_ALIGNMENT): Redefine.
+ * config/mmix/mmix.c (TARGET_STATIC_RTX_ALIGNMENT): Redefine.
+ (mmix_static_rtx_alignment): New function.
+ * config/spu/spu.c (spu_static_rtx_alignment): New function.
+ (TARGET_STATIC_RTX_ALIGNMENT): Redefine.
+
+2017-10-26 Tamar Christina <tamar.christina@arm.com>
+
+ PR target/81800
+ * config/aarch64/aarch64.md (lrint<GPF:mode><GPI:mode>2):
+ Add flag_trapping_math and flag_fp_int_builtin_inexact.
+
+2017-10-25 Palmer Dabbelt <palmer@dabbelt.com>
+
+ * config/riscv/riscv.md (ZERO_EXTEND_LOAD): Define.
+ * config/riscv/pic.md (local_pic_load): Rename to local_pic_load_s,
+ mark as a sign-extending load.
+ (local_pic_load_u): Define.
+
+2017-10-25 Eric Botcazou <ebotcazou@adacore.com>
+
+ PR middle-end/82062
+ * fold-const.c (operand_equal_for_comparison_p): Also return true
+ if ARG0 is a simple variant of ARG1 with narrower precision.
+ (fold_ternary_loc): Always pass unstripped operands to the predicate.
+
+2017-10-25 Jan Hubicka <hubicka@ucw.cz>
+
+ * i386.c (ix86_builtin_vectorization_cost): Compute scatter/gather
+ cost correctly.
+ * i386.h (processor_costs): Add gather_static, gather_per_elt,
+ scatter_static, scatter_per_elt.
+ * x86-tune-costs.h: Add new cost entries.
+
+2017-10-25 Richard Biener <rguenther@suse.de>
+
+ * tree-ssa-sccvn.h (vn_eliminate): Declare.
+ * tree-ssa-pre.c (class eliminate_dom_walker, eliminate,
+ class pass_fre): Move to ...
+ * tree-ssa-sccvn.c (class eliminate_dom_walker, vn_eliminate,
+ class pass_fre): ... here and adjust for statistics.
+
+2017-10-25 Jakub Jelinek <jakub@redhat.com>
+
+ PR libstdc++/81706
+ * attribs.c (attribute_value_equal): Use omp_declare_simd_clauses_equal
+ for comparison of OMP_CLAUSEs regardless of flag_openmp{,_simd}.
+ (duplicate_one_attribute, copy_attributes_to_builtin): New functions.
+ * attribs.h (duplicate_one_attribute, copy_attributes_to_builtin): New
+ declarations.
+
+2017-10-25 Richard Biener <rguenther@suse.de>
+
+ * tree-ssa-pre.c (need_eh_cleanup, need_ab_cleanup, el_to_remove,
+ el_to_fixup, el_todo, el_avail, el_avail_stack, eliminate_avail,
+ eliminate_push_avail, eliminate_insert): Move inside...
+ (class eliminate_dom_walker): ... this class in preparation
+ of move.
+ (fini_eliminate): Remove by merging with ...
+ (eliminate): ... this function. Adjust for class changes.
+ (pass_pre::execute): Remove fini_eliminate call.
+ (pass_fre::execute): Likewise.
+
+2017-10-24 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82460
+ * config/i386/sse.md (UNSPEC_VPERMI2, UNSPEC_VPERMI2_MASK): Remove.
+ (VPERMI2, VPERMI2I): New mode iterators.
+ (<avx512>_vpermi2var<mode>3_maskz): Remove 3 define_expand patterns.
+ (<avx512>_vpermi2var<mode>3<sd_maskz_name>): Remove 3 define_insn
+ patterns.
+ (<avx512>_vpermi2var<mode>3_mask): New define_expand using VPERMI2
+ mode iterator. Remove 3 old define_insn patterns.
+ (*<avx512>_vpermi2var<mode>3_mask): 2 new define_insn patterns.
+ (<avx512>_vpermt2var<mode>3_maskz): Adjust 1 define_expand to use
+ VPERMI2 mode iterator, remove the other two expanders.
+ (<avx512>_vpermt2var<mode>3<sd_maskz_name>): Adjust 1 define_insn
+ to use VPERMI2 mode iterator, add another alternative for vpermi2*
+ instructions, remove the other two patterns.
+ (<avx512>_vpermt2var<mode>3_mask): Adjust 1 define_insn to use VPERMI2
+ mode iterator, remove the other two patterns.
+ * config/i386/i386.c (ix86_expand_vec_perm_vpermi2): Renamed to ...
+ (ix86_expand_vec_perm_vpermt2): ... this. Swap mask and op0
+ arguments, use gen_*vpermt2* expanders instead of gen_*vpermi2*
+ and adjust argument order accordingly.
+ (ix86_expand_vec_perm): Adjust caller.
+ (expand_vec_perm_1): Likewise.
+ (expand_vec_perm_vpermi2_vpshub2): Rename to ...
+ (expand_vec_perm_vpermt2_vpshub2): ... this.
+ (ix86_expand_vec_perm_const_1): Adjust caller.
+ (ix86_vectorize_vec_perm_const_ok): Adjust comments.
+
+ PR target/82370
+ * config/i386/sse.md (VIMAX_AVX2): Remove V4TImode.
+ (VIMAX_AVX2_AVX512BW, VIMAX_AVX512VL): New mode iterators.
+ (vec_shl_<mode>): Remove unused expander.
+ (avx512bw_<shift_insn><mode>3): New define_insn.
+ (<sse2_avx2>_ashl<mode>3, <sse2_avx2>_lshr<mode>3): Replaced by ...
+ (<sse2_avx2>_<shift_insn><mode>3): ... this. New define_insn.
+
+2017-10-24 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82466
+ * doc/invoke.texi ([Wbuiltin-declaration-mismatch]): Extend
+ description.
+
+2017-10-24 Wilco Dijkstra <wdijkstr@arm.com>
+
+ PR rtl-optimization/82396
+	* haifa-sched.c (ready_sort_real): Remove qsort workaround.
+ (autopref_multipass_init): Simplify initialization.
+ (autopref_rank_data): Simplify sort order.
+	* sched-int.h (autopref_multipass_data_): Remove
+ multi_mem_insn_p, min_offset and max_offset.
+
+2017-10-24 Wilco Dijkstra <wdijkstr@arm.com>
+
+ PR middle-end/60580
+	* config/aarch64/aarch64.c (aarch64_frame_pointer_required):
+ Check special value of flag_omit_frame_pointer.
+ (aarch64_can_eliminate): Likewise.
+ (aarch64_override_options_after_change_1): Simplify handling of
+ -fomit-frame-pointer and -fomit-leaf-frame-pointer.
+
+2017-10-24 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82697
+ * tree-ssa-phiopt.c (cond_store_replacement): Use alias-set
+ zero for conditional load and unconditional store.
+
+2017-10-24 H.J. Lu <hongjiu.lu@intel.com>
+
+ * doc/install.texi: Document bootstrap-cet.
+
+2017-10-24 H.J. Lu <hongjiu.lu@intel.com>
+
+ PR target/82659
+ * config/i386/i386.c (rest_of_insert_endbranch): Don't insert
+ ENDBR instruction at function entrance if function is only
+ called directly.
+
+2017-10-24 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82628
+ * config/i386/i386.md (addcarry<mode>, subborrow<mode>): Change
+ patterns to better describe from which operation the CF is computed.
+ (addcarry<mode>_0, subborrow<mode>_0): New patterns.
+ * config/i386/i386.c (ix86_expand_builtin) <case handlecarry>: Pass
+ one LTU with [DT]Imode and another one with [SD]Imode. If arg0
+ is 0, use _0 suffixed expanders instead of emitting a comparison
+ before it.
+
+2017-10-06 Sergey Shalnov <Sergey.Shalnov@intel.com>
+
+	* config/i386/i386.md (*movsf_internal, *movdf_internal):
+ Avoid 512-bit AVX modes for TARGET_PREFER_AVX256.
+
+2017-10-24 Eric Botcazou <ebotcazou@adacore.com>
+
+ PR middle-end/82569
+ * tree-outof-ssa.h (always_initialized_rtx_for_ssa_name_p): Delete.
+ * expr.c (expand_expr_real_1) <expand_decl_rtl>: Revert latest change.
+ * loop-iv.c (iv_get_reaching_def): Likewise.
+ * cfgexpand.c (expand_one_ssa_partition): Initialize the RTX if the
+ variable is promoted and the partition contains undefined values.
+
+2017-10-23 Sandra Loosemore <sandra@codesourcery.com>
+
+ * config/nios2/nios2.c (nios2_rtx_costs): Make costs better
+ reflect reality.
+ (nios2_address_cost): Define.
+ (nios2_legitimize_address): Recognize (exp + constant) directly.
+ (TARGET_ADDRESS_COST): Define.
+
+2017-10-23 Sandra Loosemore <sandra@codesourcery.com>
+
+ * config/nios2/nios2-protos.h (nios2_large_constant_p): Declare.
+ (nios2_symbolic_memory_operand_p): Declare.
+ (nios2_split_large_constant): Declare.
+ (nios2_split_symbolic_memory_operand): Declare.
+ * config/nios2/nios2.c: Adjust includes.
+ (nios2_symbolic_constant_allowed): New.
+ (nios2_symbolic_constant_p): New.
+ (nios2_plus_symbolic_constant_p): New.
+ (nios2_valid_addr_expr_p): Recognize addresses involving
+ symbolic constants.
+ (nios2_legitimate_address_p): Likewise, also LO_SUM.
+ (nios2_symbolic_memory_operand_p): New.
+ (nios2_large_constant_p): New.
+ (nios2_split_large_constant): New.
+ (nios2_split_plus_large_constant): New.
+ (nios2_split_symbolic_memory_operand): New.
+ (nios2_legitimize_address): Code refactoring. Handle addresses
+ involving symbolic constants.
+ (nios2_emit_move_sequence): Likewise.
+ (nios2_print_operand): Improve error output.
+ (nios2_print_operand_address): Handle LO_SUM.
+ (nios2_cdx_narrow_form_p): Likewise.
+ * config/nios2/nios2.md (movqi_internal): Add splitter for memory
+ operands involving symbolic constants.
+ (movhi_internal, movsi_internal): Likewise.
+ (zero_extendhisi2, zero_extendqi<mode>2): Likewise.
+ (extendhisi2, extendqi<mode>2): Likewise.
+
+2017-10-23 Sandra Loosemore <sandra@codesourcery.com>
+
+ * tree-pass.h (PROP_rtl_split_insns): Define.
+ * recog.c (pass_data_split_all_insns): Provide PROP_rtl_split_insns.
+
+2017-10-23 Sandra Loosemore <sandra@codesourcery.com>
+
+ * config/nios2/nios2.c (TARGET_LRA_P): Don't override.
+
+2017-10-23 Jakub Jelinek <jakub@redhat.com>
+
+ PR debug/82630
+ * target.def (const_not_ok_for_debug_p): Default to
+ default_const_not_ok_for_debug_p instead of hook_bool_rtx_false.
+ * targhooks.h (default_const_not_ok_for_debug_p): New declaration.
+ * targhooks.c (default_const_not_ok_for_debug_p): New function.
+ * dwarf2out.c (const_ok_for_output_1): Only reject UNSPECs for
+ which targetm.const_not_ok_for_debug_p returned true.
+ * config/arm/arm.c (arm_const_not_ok_for_debug_p): Return true
+ for UNSPECs.
+ * config/powerpcspe/powerpcspe.c (rs6000_const_not_ok_for_debug_p):
+ Likewise.
+ * config/rs6000/rs6000.c (rs6000_const_not_ok_for_debug_p): Likewise.
+ * config/i386/i386.c (ix86_delegitimize_address_1): Don't delegitimize
+ UNSPEC_GOTOFF with addend into addend - _GLOBAL_OFFSET_TABLE_ + symbol
+ if !base_term_p.
+ (ix86_const_not_ok_for_debug_p): New function.
+ (i386_asm_output_addr_const_extra): Handle UNSPEC_GOTOFF.
+ (TARGET_CONST_NOT_OK_FOR_DEBUG_P): Redefine.
+
+2017-10-23 David Malcolm <dmalcolm@redhat.com>
+
+ PR bootstrap/82610
+ * system.h: Conditionally include "unique-ptr.h" if
+ INCLUDE_UNIQUE_PTR is defined.
+ * unique-ptr-tests.cc: Remove include of "unique-ptr.h" in favor
+ of defining INCLUDE_UNIQUE_PTR before including "system.h".
+
+2017-10-23 Sebastian Perta <sebastian.perta@renesas.com>
+
+	* config/rl78/rl78.md (subdi3): New define_expand.
+
+2017-10-23 H.J. Lu <hongjiu.lu@intel.com>
+
+ PR target/82673
+ * config/i386/i386.c (ix86_finalize_stack_frame_flags): Skip
+ DF_REF_INSN if DF_REF_INSN_INFO is false.
+
+2017-10-23 Jan Hubicka <hubicka@ucw.cz>
+
+ * i386.c (dimode_scalar_chain::compute_convert_gain): Use
+ xmm_move instead of sse_move.
+ (sse_store_index): New function.
+ (ix86_register_move_cost): Be more sensible about mismatch stall;
+	model AVX moves correctly; distinguish between sse->integer and
+	integer->sse.
+	(ix86_builtin_vectorization_cost): Correctly model aligned and
+	unaligned moves; distinguish between SSE and AVX.
+ * i386.h (processor_costs): Remove sse_move; add xmm_move, ymm_move
+ and zmm_move. Increase size of sse load and store tables;
+ add unaligned load and store tables; add ssemmx_to_integer.
+ * x86-tune-costs.h: Update all entries according to real
+ move latencies from Agner Fog's manual and chip documentation.
+
+2017-10-23 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82628
+ * config/i386/predicates.md (x86_64_dwzext_immediate_operand): New.
+ * config/i386/constraints.md (Wf): New constraint.
+ * config/i386/i386.md (UNSPEC_SBB): New unspec.
+ (cmp<dwi>_doubleword): Removed.
+ (sub<mode>3_carry_ccc, *sub<mode>3_carry_ccc_1): New patterns.
+ (sub<mode>3_carry_ccgz): Use unspec instead of compare.
+ * config/i386/i386.c (ix86_expand_branch) <case E_TImode>: Don't
+ expand with cmp<dwi>_doubleword. For LTU and GEU use
+ sub<mode>3_carry_ccc instead of sub<mode>3_carry_ccgz and use CCCmode.
+
+ * common.opt (gcolumn-info): Enable by default.
+ * doc/invoke.texi (gcolumn-info): Document new default.
+
+2017-10-23 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82672
+ * graphite-isl-ast-to-gimple.c (graphite_copy_stmts_from_block):
+ Fold the stmt if we propagated into it.
+
+2017-10-23 Richard Biener <rguenther@suse.de>
+
+ * tree-ssa-pre.c (bitmap_remove_from_set): Rename to...
+ (bitmap_remove_expr_from_set): ... this. All callers call this
+ for non-constant values.
+ (bitmap_set_subtract): Rename to...
+ (bitmap_set_subtract_expressions): ... this. Adjust and
+ optimize.
+ (bitmap_set_contains_value): Remove superfluous check.
+ (bitmap_set_replace_value): Inline into single caller ...
+ (bitmap_value_replace_in_set): ... here and simplify.
+ (dependent_clean): Merge into ...
+ (clean): ... this using an overload. Adjust.
+ (prune_clobbered_mems): Adjust.
+ (compute_antic_aux): Likewise.
+ (compute_partial_antic_aux): Likewise.
+
+2017-10-23 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82129
+ Revert
+ 2017-08-01 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/81181
+ * tree-ssa-pre.c (compute_antic_aux): Defer clean() to ...
+ (compute_antic): ... end of iteration here.
+
+2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * target.def (starting_frame_offset): New hook.
+ * doc/tm.texi (STARTING_FRAME_OFFSET): Remove in favor of...
+ (TARGET_STARTING_FRAME_OFFSET): ...this new hook.
+ * doc/tm.texi.in: Regenerate.
+ * hooks.h (hook_hwi_void_0): Declare.
+ * hooks.c (hook_hwi_void_0): New function.
+ * doc/rtl.texi: Refer to TARGET_STARTING_FRAME_OFFSET instead of
+ STARTING_FRAME_OFFSET.
+ * builtins.c (expand_builtin_setjmp_receiver): Likewise.
+ * reload1.c (reload): Likewise.
+ * cfgexpand.c (expand_used_vars): Use targetm.starting_frame_offset
+ instead of STARTING_FRAME_OFFSET.
+ * function.c (try_fit_stack_local): Likewise.
+ (assign_stack_local_1): Likewise
+ (instantiate_virtual_regs): Likewise.
+ * rtlanal.c (rtx_addr_can_trap_p_1): Likewise.
+ * config/avr/avr.md (nonlocal_goto_receiver): Likewise.
+ * config/aarch64/aarch64.h (STARTING_FRAME_OFFSET): Delete.
+ * config/alpha/alpha.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/arc/arc.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/arm/arm.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/bfin/bfin.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/c6x/c6x.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/cr16/cr16.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/cris/cris.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/fr30/fr30.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/frv/frv.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/ft32/ft32.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/h8300/h8300.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/i386/i386.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/ia64/ia64.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/m32c/m32c.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/m68k/m68k.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/mcore/mcore.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/mn10300/mn10300.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/moxie/moxie.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/msp430/msp430.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/nds32/nds32.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/nios2/nios2.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/nvptx/nvptx.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/pdp11/pdp11.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/riscv/riscv.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/rl78/rl78.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/rx/rx.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/s390/s390.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/sh/sh.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/sparc/sparc.c (sparc_compute_frame_size): Likewise.
+ * config/sparc/sparc.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/spu/spu.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/stormy16/stormy16.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/tilegx/tilegx.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/tilepro/tilepro.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/v850/v850.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/visium/visium.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/avr/avr.h (STARTING_FRAME_OFFSET): Likewise.
+ * config/avr/avr-protos.h (avr_starting_frame_offset): Likewise.
+ * config/avr/avr.c (avr_starting_frame_offset): Make static and
+ return a HOST_WIDE_INT.
+ (avr_builtin_setjmp_frame_value): Use it instead of
+ STARTING_FRAME_OFFSET.
+ (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ * config/epiphany/epiphany.h (STARTING_FRAME_OFFSET): Delete.
+ * config/epiphany/epiphany.c (epiphany_starting_frame_offset):
+ New function.
+ (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ * config/iq2000/iq2000.h (STARTING_FRAME_OFFSET): Delete.
+ * config/iq2000/iq2000.c (iq2000_starting_frame_offset): New function.
+	(TARGET_STARTING_FRAME_OFFSET): Redefine.
+ * config/lm32/lm32.h (STARTING_FRAME_OFFSET): Delete.
+ * config/lm32/lm32.c (lm32_starting_frame_offset): New function.
+ (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ * config/m32r/m32r.h (STARTING_FRAME_OFFSET): Delete.
+ * config/m32r/m32r.c (m32r_starting_frame_offset): New function.
+ (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ * config/microblaze/microblaze.h (STARTING_FRAME_OFFSET): Delete.
+ * config/microblaze/microblaze.c (microblaze_starting_frame_offset):
+ New function.
+ (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ * config/mips/mips.h (STARTING_FRAME_OFFSET): Delete.
+ * config/mips/mips.c (mips_compute_frame_info): Refer to
+ TARGET_STARTING_FRAME_OFFSET instead of STARTING_FRAME_OFFSET.
+ (mips_starting_frame_offset): New function.
+ (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ * config/mmix/mmix.h (STARTING_FRAME_OFFSET): Delete.
+ * config/mmix/mmix-protos.h (mmix_starting_frame_offset): Delete.
+ * config/mmix/mmix.c (mmix_starting_frame_offset): Make static
+ and return a HOST_WIDE_INT.
+ (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ (mmix_initial_elimination_offset): Refer to
+ TARGET_STARTING_FRAME_OFFSET instead of STARTING_FRAME_OFFSET.
+ * config/pa/pa.h (STARTING_FRAME_OFFSET): Delete.
+ * config/pa/pa.c (pa_starting_frame_offset): New function.
+ (pa_compute_frame_size): Use it instead of STARTING_FRAME_OFFSET.
+ (pa_expand_prologue): Likewise.
+ (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ * config/powerpcspe/aix.h (STARTING_FRAME_OFFSET): Split out
+ !FRAME_GROWS_DOWNWARD handling to...
+ (RS6000_STARTING_FRAME_OFFSET): ...this new macro.
+ * config/powerpcspe/darwin.h (STARTING_FRAME_OFFSET): Split out
+ !FRAME_GROWS_DOWNWARD handling to...
+ (RS6000_STARTING_FRAME_OFFSET): ...this new macro.
+ * config/powerpcspe/powerpcspe.h (STARTING_FRAME_OFFSET): Split out
+ !FRAME_GROWS_DOWNWARD handling to...
+ (RS6000_STARTING_FRAME_OFFSET): ...this new macro.
+ * config/powerpcspe/powerpcspe.c (TARGET_STARTING_FRAME_OFFSET):
+ Redefine.
+ (rs6000_starting_frame_offset): New function.
+ * config/rs6000/aix.h (STARTING_FRAME_OFFSET): Split out
+ !FRAME_GROWS_DOWNWARD handling to...
+ (RS6000_STARTING_FRAME_OFFSET): ...this new macro.
+ * config/rs6000/darwin.h (STARTING_FRAME_OFFSET): Split out
+ !FRAME_GROWS_DOWNWARD handling to...
+ (RS6000_STARTING_FRAME_OFFSET): ...this new macro.
+ * config/rs6000/rs6000.h (STARTING_FRAME_OFFSET): Split out
+ !FRAME_GROWS_DOWNWARD handling to...
+ (RS6000_STARTING_FRAME_OFFSET): ...this new macro.
+	* config/rs6000/rs6000.c (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ (rs6000_starting_frame_offset): New function.
+ * config/vax/elf.h (STARTING_FRAME_OFFSET): Delete.
+ * config/vax/vax.h (STARTING_FRAME_OFFSET): Delete.
+ * config/vax/vax.c (vax_starting_frame_offset): New function.
+ (vax_expand_prologue): Use it instead of STARTING_FRAME_OFFSET.
+ (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ * config/xtensa/xtensa.h (STARTING_FRAME_OFFSET): Delete.
+ * config/xtensa/xtensa.c (xtensa_starting_frame_offset): New function.
+ (TARGET_STARTING_FRAME_OFFSET): Redefine.
+ * system.h (STARTING_FRAME_OFFSET): Poison.
+
+2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * tree-vect-loop.c (vect_create_epilog_for_reduction): Use
+ SCALAR_TYPE_MODE instead of TYPE_MODE.
+
+2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+	* dwarf2out.c (loc_list_from_tree_1): Use SCALAR_INT_TYPE_MODE.
+
+2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * expmed.c (expand_shift_1): Use scalar_mode for scalar_mode.
+
+2017-10-23 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82129
+ * tree-ssa-pre.c (bitmap_set_and): Remove.
+ (compute_antic_aux): Compute ANTIC_OUT intersection in a way
+ canonicalizing expressions in the set to those with lowest
+ ID rather than taking that from the first edge.
+
+2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * combine.c (rtx_equal_for_field_assignment_p): Use
+ byte_lowpart_offset.
+
+2017-10-22 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * internal-fn.c (expand_direct_optab_fn): Don't assign directly
+ to a SUBREG_PROMOTED_VAR.
+
+2017-10-22 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * cfgexpand.c (expand_debug_expr): Use GET_MODE_UNIT_PRECISION.
+ (expand_debug_source_expr): Likewise.
+ * combine.c (combine_simplify_rtx): Likewise.
+ * cse.c (fold_rtx): Likewise.
+ * optabs.c (expand_float): Likewise.
+ * simplify-rtx.c (simplify_unary_operation_1): Likewise.
+ (simplify_binary_operation_1): Likewise.
+
+2017-10-22 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * combine.c (simplify_comparison): Use HWI_COMPUTABLE_MODE_P.
+ (record_promoted_value): Likewise.
+ * expr.c (expand_expr_real_2): Likewise.
+ * ree.c (update_reg_equal_equiv_notes): Likewise.
+ (combine_set_extension): Likewise.
+ * rtlanal.c (low_bitmask_len): Likewise.
+ * simplify-rtx.c (neg_const_int): Likewise.
+ (simplify_binary_operation_1): Likewise.
+
+2017-10-22 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * lra-spills.c (assign_mem_slot): Use subreg_size_lowpart_offset.
+ * regcprop.c (maybe_mode_change): Likewise.
+ * reload1.c (alter_reg): Likewise.
+
+2017-10-22 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * inchash.h (inchash::hash::add_wide_int): New function.
+ * lto-streamer-out.c (hash_tree): Use it.
+
+2017-10-22 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * inchash.h (inchash::hash::add_wide_int): Rename to...
+ (inchash::hash::add_hwi): ...this.
+ * ipa-devirt.c (hash_odr_vtable): Update accordingly.
+ (polymorphic_call_target_hasher::hash): Likewise.
+ * ipa-icf.c (sem_function::get_hash, sem_function::init): Likewise.
+ (sem_item::add_expr, sem_item::add_type, sem_variable::get_hash)
+ (sem_item_optimizer::update_hash_by_addr_refs): Likewise.
+ * lto-streamer-out.c (hash_tree): Likewise.
+ * optc-save-gen.awk: Likewise.
+ * tree.c (add_expr): Likewise.
+
+2017-10-22 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/52451
+ * config/i386/i386.c (ix86_fp_compare_mode): Return CCFPmode
+ for ordered inequality comparisons even with TARGET_IEEE_FP.
+
+2017-10-22 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/82628
+ * config/i386/i386.md (cmp<dwi>_doubleword): New pattern.
+ * config/i386/i386.c (ix86_expand_branch) <case E_TImode>:
+ Expand with cmp<dwi>_doubleword.
+
+2017-10-21 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
+
+	* extend.texi: Add x86-specific notes to the 'nocf_check' attribute.
+	List CET intrinsics.
+	* invoke.texi: Add -mcet, -mibt, -mshstk options.  Add x86-specific
+	notes to the -fcf-protection option.
+
+2017-10-21 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
+
+ * common/config/i386/i386-common.c (OPTION_MASK_ISA_IBT_SET): New.
+ (OPTION_MASK_ISA_SHSTK_SET): Likewise.
+ (OPTION_MASK_ISA_IBT_UNSET): Likewise.
+ (OPTION_MASK_ISA_SHSTK_UNSET): Likewise.
+ (ix86_handle_option): Add -mibt, -mshstk, -mcet handling.
+ * config.gcc (extra_headers): Add cetintrin.h for x86 targets.
+ (extra_objs): Add cet.o for Linux/x86 targets.
+ (tmake_file): Add i386/t-cet for Linux/x86 targets.
+ * config/i386/cet.c: New file.
+ * config/i386/cetintrin.h: Likewise.
+ * config/i386/t-cet: Likewise.
+ * config/i386/cpuid.h (bit_SHSTK): New.
+ (bit_IBT): Likewise.
+ * config/i386/driver-i386.c (host_detect_local_cpu): Detect and
+ pass IBT and SHSTK bits.
+ * config/i386/i386-builtin-types.def
+ (VOID_FTYPE_UNSIGNED_PVOID): New.
+ (VOID_FTYPE_UINT64_PVOID): Likewise.
+ * config/i386/i386-builtin.def: Add CET intrinsics.
+ * config/i386/i386-c.c (ix86_target_macros_internal): Add
+ OPTION_MASK_ISA_IBT, OPTION_MASK_ISA_SHSTK handling.
+ * config/i386/i386-passes.def: Add pass_insert_endbranch pass.
+ * config/i386/i386-protos.h (make_pass_insert_endbranch): New
+ prototype.
+ * config/i386/i386.c (rest_of_insert_endbranch): New.
+ (pass_data_insert_endbranch): Likewise.
+ (pass_insert_endbranch): Likewise.
+ (make_pass_insert_endbranch): Likewise.
+ (ix86_notrack_prefixed_insn_p): Likewise.
+ (ix86_target_string): Add -mibt, -mshstk flags.
+ (ix86_option_override_internal): Add flag_cf_protection
+ processing.
+ (ix86_valid_target_attribute_inner_p): Set OPT_mibt, OPT_mshstk.
+ (ix86_print_operand): Add 'notrack' prefix output.
+ (ix86_init_mmx_sse_builtins): Add CET intrinsics.
+ (ix86_expand_builtin): Expand CET intrinsics.
+ (x86_output_mi_thunk): Add 'endbranch' instruction.
+ * config/i386/i386.h (TARGET_IBT): New.
+ (TARGET_IBT_P): Likewise.
+ (TARGET_SHSTK): Likewise.
+ (TARGET_SHSTK_P): Likewise.
+ * config/i386/i386.md (unspecv): Add UNSPECV_NOP_RDSSP,
+ UNSPECV_INCSSP, UNSPECV_SAVEPREVSSP, UNSPECV_RSTORSSP,
+ UNSPECV_WRSS, UNSPECV_WRUSS, UNSPECV_SETSSBSY, UNSPECV_CLRSSBSY.
+ (builtin_setjmp_setup): New pattern.
+ (builtin_longjmp): Likewise.
+ (rdssp<mode>): Likewise.
+ (incssp<mode>): Likewise.
+ (saveprevssp): Likewise.
+ (rstorssp): Likewise.
+ (wrss<mode>): Likewise.
+ (wruss<mode>): Likewise.
+ (setssbsy): Likewise.
+ (clrssbsy): Likewise.
+ (nop_endbr): Likewise.
+ * config/i386/i386.opt: Add -mcet, -mibt, -mshstk and -mcet-switch
+ options.
+ * config/i386/immintrin.h: Include <cetintrin.h>.
+ * config/i386/linux-common.h
+ (file_end_indicate_exec_stack_and_cet): New prototype.
+ (TARGET_ASM_FILE_END): New.
+
+2017-10-20 Jan Hubicka <hubicka@ucw.cz>
+
+ * i386.c (ix86_builtin_vectorization_cost): Use existing rtx_cost
+	latencies instead of having a separate table; distinguish between
+	integer and float costs.
+ * i386.h (processor_costs): Remove scalar_stmt_cost,
+ scalar_load_cost, scalar_store_cost, vec_stmt_cost, vec_to_scalar_cost,
+ scalar_to_vec_cost, vec_align_load_cost, vec_unalign_load_cost,
+ vec_store_cost.
+	* x86-tune-costs.h: Remove entries that have been removed from
+	processor_costs from all tables; make cond_taken_branch_cost
+ and cond_not_taken_branch_cost COST_N_INSNS based.
+
+2017-10-20 Jan Hubicka <hubicka@ucw.cz>
+
+ * x86-tune-costs.h (intel_cost, generic_cost): Fix move costs.
+
+2017-10-20 Jakub Jelinek <jakub@redhat.com>
+
+ * config/i386/i386.md (isa): Remove fma_avx512f.
+ * config/i386/sse.md (<avx512>_fmadd_<mode>_mask<round_name>,
+ <avx512>_fmadd_<mode>_mask3<round_name>,
+ <avx512>_fmsub_<mode>_mask<round_name>,
+ <avx512>_fmsub_<mode>_mask3<round_name>,
+ <avx512>_fnmadd_<mode>_mask<round_name>,
+ <avx512>_fnmadd_<mode>_mask3<round_name>,
+ <avx512>_fnmsub_<mode>_mask<round_name>,
+ <avx512>_fnmsub_<mode>_mask3<round_name>,
+ <avx512>_fmaddsub_<mode>_mask<round_name>,
+ <avx512>_fmaddsub_<mode>_mask3<round_name>,
+ <avx512>_fmsubadd_<mode>_mask<round_name>,
+ <avx512>_fmsubadd_<mode>_mask3<round_name>): Remove isa attribute.
+ (*vec_widen_umult_even_v16si<mask_name>,
+ *vec_widen_smult_even_v16si<mask_name>): Likewise.
+ (<mask_codefor>avx512bw_dbpsadbw<mode><mask_name>): Likewise.
+
+2017-10-20 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
+
+ * extend.texi: Add 'nocf_check' documentation.
+ * gimple.texi: Add second parameter to
+ gimple_build_call_from_tree.
+ * invoke.texi: Add -fcf-protection documentation.
+	* rtl.texi: Add REG_CALL_NOTRACK documentation.
+
+2017-10-20 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82473
+ * tree-vect-loop.c (vectorizable_reduction): Properly get at
+ the largest input type.
+
+2017-10-20 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
+
+ * c-attribs.c (handle_nocf_check_attribute): New function.
+ (c_common_attribute_table): Add 'nocf_check' handling.
+ * gimple-parser.c: Add second argument NULL to
+ gimple_build_call_from_tree.
+	* attribs.c (comp_type_attributes): Check nocf_check attribute.
+ * cfgexpand.c (expand_call_stmt): Set REG_CALL_NOCF_CHECK for
+ call insn.
+ * combine.c (distribute_notes): Add REG_CALL_NOCF_CHECK handling.
+ * common.opt: Add fcf-protection flag.
+ * emit-rtl.c (try_split): Add REG_CALL_NOCF_CHECK handling.
+ * flag-types.h: Add enum cf_protection_level.
+ * gimple.c (gimple_build_call_from_tree): Add second parameter.
+ Add 'nocf_check' attribute propagation to gimple call.
+ * gimple.h (gf_mask): Add GF_CALL_NOCF_CHECK.
+ (gimple_build_call_from_tree): Update prototype.
+ (gimple_call_nocf_check_p): New function.
+ (gimple_call_set_nocf_check): Likewise.
+ * gimplify.c: Add second argument to gimple_build_call_from_tree.
+ * ipa-icf.c: Add nocf_check attribute in statement hash.
+ * recog.c (peep2_attempt): Add REG_CALL_NOCF_CHECK handling.
+ * reg-notes.def: Add REG_NOTE (CALL_NOCF_CHECK).
+ * toplev.c (process_options): Add flag_cf_protection handling.
+
+2017-10-19 Jan Hubicka <hubicka@ucw.cz>
+
+ * x86-tune-costs.h (core_cost): Fix div, move and sqrt latencies.
+
+2017-10-20 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82603
+ * tree-if-conv.c (predicate_mem_writes): Make sure to only
+ remove false predicated stores.
+
+2017-10-20 Richard Biener <rguenther@suse.de>
+
+ * graphite-isl-ast-to-gimple.c
+ (translate_isl_ast_to_gimple::graphite_copy_stmts_from_block):
+ Remove return value and simplify, dump copied stmt after lhs
+ adjustment.
+ (translate_isl_ast_to_gimple::translate_isl_ast_node_user):
+ Reduce dump verbosity.
+ (gsi_insert_earliest): Likewise.
+ (translate_isl_ast_to_gimple::copy_bb_and_scalar_dependences): Adjust.
+ * graphite.c (print_global_statistics): Adjust dumping.
+ (print_graphite_scop_statistics): Likewise.
+ (print_graphite_statistics): Do not dump loops here.
+ (graphite_transform_loops): But here.
+
+2017-10-20 Nicolas Roche <roche@adacore.com>
+
+ * configure.ac (ACX_PROG_GNAT): Append "libgnat" to include search dir.
+ * configure: Regenerate.
+
+2017-10-20 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82158
+ * tree-cfg.c (pass_warn_function_return::execute): In noreturn
+ functions when optimizing replace GIMPLE_RETURN stmts with
+ calls to __builtin_unreachable ().
+
+ PR sanitizer/82595
+ * config/gnu-user.h (LIBTSAN_EARLY_SPEC): Add libtsan_preinit.o
+ for -fsanitize=thread link of executables.
+ (LIBLSAN_EARLY_SPEC): Add liblsan_preinit.o for -fsanitize=leak
+ link of executables.
+
+ PR target/82370
+ * config/i386/sse.md (VI248_AVX2, VI248_AVX512BW, VI248_AVX512BW_2):
+ New mode iterators.
+ (<shift_insn><mode>3<mask_name>): Change the last of the 3
+ define_insns for logical vector shifts to use VI248_AVX512BW
+ iterator instead of VI48_AVX512, remove <mask_mode512bit_condition>
+ condition, useless isa and prefix attributes. Change the first
+ 2 of these define_insns to ...
+ (<mask_codefor><shift_insn><mode>3<mask_name>): ... this, new
+ define_insn for avx512vl.
+ (<shift_insn><mode>3): ... and this, new define_insn without
+ masking for non-avx512vl.
+
+ PR target/82370
+ * config/i386/sse.md (*andnot<mode>3,
+ <mask_codefor><code><mode>3<mask_name>, *<code><mode>3): Split
+ (=v,v,vm) alternative into (=x,x,xm) and (=v,v,vm), for 128-bit
+ and 256-bit vectors, the (=x,x,xm) alternative and when mask is
+ not applied use empty suffix even for TARGET_AVX512VL.
+ * config/i386/subst.md (mask_prefix3, mask_prefix4): When mask
+ is applied, supply evex,evex or evex,evex,evex instead of just
+ evex.
+
+2017-10-20 Julia Koval <julia.koval@intel.com>
+
+ * common/config/i386/i386-common.c (OPTION_MASK_ISA_GFNI_SET,
+	OPTION_MASK_ISA_GFNI_UNSET): New.
+ (ix86_handle_option): Handle OPT_mgfni.
+ * config/i386/cpuid.h (bit_GFNI): New.
+ * config/i386/driver-i386.c (host_detect_local_cpu): Detect gfni.
+ * config/i386/i386-c.c (ix86_target_macros_internal): Define __GFNI__.
+ * config/i386/i386.c (ix86_target_string): Add -mgfni.
+ (ix86_valid_target_attribute_inner_p): Add OPT_mgfni.
+ * config/i386/i386.h (TARGET_GFNI, TARGET_GFNI_P): New.
+ * config/i386/i386.opt: Add mgfni.
+
+2017-10-20 Orlando Arias <oarias@knights.ucf.edu>
+
+ * config/msp430/msp430.c (msp430_option_override): Disable
+ -fdelete-null-pointer-checks.
+	* doc/invoke.texi (-fdelete-null-pointer-checks): Document that.
+
+2017-10-19 Jan Hubicka <hubicka@ucw.cz>
+
+ * x86-tune-costs.h (generic_cost, core_cost): Correct costs
+ of x87 and SSE instructions.
+
+2017-10-19 Jan Hubicka <hubicka@ucw.cz>
+
+ * asan.c (create_cond_insert_point): Do not update edge count.
+ * auto-profile.c (afdo_propagate_edge): Update for edge count removal.
+ (afdo_propagate_circuit): Likewise.
+ (afdo_calculate_branch_prob): Likewise.
+ (afdo_annotate_cfg): Likewise.
+ * basic-block.h (struct edge_def): Remove count.
+ (edge_def::count): New accessor.
+ * bb-reorder.c (rotate_loop): Update.
+ (find_traces_1_round): Update.
+ (connect_traces): Update.
+ (sanitize_hot_paths): Update.
+ * cfg.c (unchecked_make_edge): Update.
+ (make_single_succ_edge): Update.
+ (check_bb_profile): Update.
+ (dump_edge_info): Update.
+ (update_bb_profile_for_threading): Update.
+ (scale_bbs_frequencies_int): Update.
+ (scale_bbs_frequencies_gcov_type): Update.
+ (scale_bbs_frequencies_profile_count): Update.
+ (scale_bbs_frequencies): Update.
+ * cfganal.c (connect_infinite_loops_to_exit): Update.
+ * cfgbuild.c (compute_outgoing_frequencies): Update.
+ (find_many_sub_basic_blocks): Update.
+ * cfgcleanup.c (try_forward_edges): Update.
+	(try_crossjump_to_edge): Update.
+	* cfgexpand.c (expand_gimple_cond): Update.
+	(expand_gimple_tailcall): Update.
+	(construct_exit_block): Update.
+	* cfghooks.c (verify_flow_info): Update.
+	(redirect_edge_succ_nodup): Update.
+	(split_edge): Update.
+	(make_forwarder_block): Update.
+	(duplicate_block): Update.
+	(account_profile_record): Update.
+ * cfgloop.c (find_subloop_latch_edge_by_profile): Update.
+ * cfgloopanal.c (expected_loop_iterations_unbounded): Update.
+ * cfgloopmanip.c (scale_loop_profile): Update.
+ (loopify): Update.
+ (lv_adjust_loop_entry_edge): Update.
+ * cfgrtl.c (try_redirect_by_replacing_jump): Update.
+ (force_nonfallthru_and_redirect): Update.
+ (purge_dead_edges): Update.
+ (rtl_flow_call_edges_add): Update.
+ * cgraphunit.c (init_lowered_empty_function): Update.
+ (cgraph_node::expand_thunk): Update.
+ * gimple-pretty-print.c (dump_probability): Update.
+ (dump_edge_probability): Update.
+ * gimple-ssa-isolate-paths.c (isolate_path): Update.
+ * haifa-sched.c (sched_create_recovery_edges): Update.
+ * hsa-gen.c (convert_switch_statements): Update.
+ * ifcvt.c (dead_or_predicable): Update.
+ * ipa-inline-transform.c (inline_transform): Update.
+ * ipa-split.c (split_function): Update.
+ * ipa-utils.c (ipa_merge_profiles): Update.
+ * loop-doloop.c (add_test): Update.
+ * loop-unroll.c (unroll_loop_runtime_iterations): Update.
+ * lto-streamer-in.c (input_cfg): Update.
+ (input_function): Update.
+ * lto-streamer-out.c (output_cfg): Update.
+ * modulo-sched.c (sms_schedule): Update.
+ * postreload-gcse.c (eliminate_partially_redundant_load): Update.
+ * predict.c (maybe_hot_edge_p): Update.
+ (unlikely_executed_edge_p): Update.
+ (probably_never_executed_edge_p): Update.
+ (dump_prediction): Update.
+ (drop_profile): Update.
+ (propagate_unlikely_bbs_forward): Update.
+ (determine_unlikely_bbs): Update.
+ (force_edge_cold): Update.
+ * profile.c (compute_branch_probabilities): Update.
+ * reg-stack.c (better_edge): Update.
+ * shrink-wrap.c (handle_simple_exit): Update.
+ * tracer.c (better_p): Update.
+ * trans-mem.c (expand_transaction): Update.
+ (split_bb_make_tm_edge): Update.
+ * tree-call-cdce.c: Update.
+ * tree-cfg.c (gimple_find_sub_bbs): Update.
+ (gimple_split_edge): Update.
+ (gimple_duplicate_sese_region): Update.
+ (gimple_duplicate_sese_tail): Update.
+ (gimple_flow_call_edges_add): Update.
+ (insert_cond_bb): Update.
+ (execute_fixup_cfg): Update.
+ * tree-cfgcleanup.c (cleanup_control_expr_graph): Update.
+ * tree-complex.c (expand_complex_div_wide): Update.
+ * tree-eh.c (lower_resx): Update.
+ (unsplit_eh): Update.
+ (cleanup_empty_eh_move_lp): Update.
+ * tree-inline.c (copy_edges_for_bb): Update.
+ (freqs_to_counts): Update.
+ (copy_cfg_body): Update.
+ * tree-ssa-dce.c (remove_dead_stmt): Update.
+ * tree-ssa-ifcombine.c (update_profile_after_ifcombine): Update.
+ * tree-ssa-loop-im.c (execute_sm_if_changed): Update.
+ * tree-ssa-loop-ivcanon.c (remove_exits_and_undefined_stmts): Update.
+ (unloop_loops): Update.
+ * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Update.
+ * tree-ssa-loop-split.c (connect_loops): Update.
+ (split_loop): Update.
+ * tree-ssa-loop-unswitch.c (hoist_guard): Update.
+ * tree-ssa-phionlycprop.c (propagate_rhs_into_lhs): Update.
+ * tree-ssa-phiopt.c (replace_phi_edge_with_variable): Update.
+ * tree-ssa-reassoc.c (branch_fixup): Update.
+ * tree-ssa-tail-merge.c (replace_block_by): Update.
+ * tree-ssa-threadupdate.c (remove_ctrl_stmt_and_useless_edges): Update.
+ (compute_path_counts): Update.
+ (update_profile): Update.
+ (recompute_probabilities): Update.
+ (update_joiner_offpath_counts): Update.
+ (estimated_freqs_path): Update.
+ (freqs_to_counts_path): Update.
+ (clear_counts_path): Update.
+ (ssa_fix_duplicate_block_edges): Update.
+ (duplicate_thread_path): Update.
+ * tree-switch-conversion.c (hoist_edge_and_branch_if_true): Update.
+ (case_bit_test_cmp): Update.
+ (collect_switch_conv_info): Update.
+ (gen_inbound_check): Update.
+ (do_jump_if_equal): Update.
+ (emit_cmp_and_jump_insns): Update.
+ * tree-tailcall.c (decrease_profile): Update.
+ (eliminate_tail_call): Update.
+ * tree-vect-loop-manip.c (slpeel_add_loop_guard): Update.
+ (vect_do_peeling): Update.
+ * tree-vect-loop.c (scale_profile_for_vect_loop): Update.
+ * ubsan.c (ubsan_expand_null_ifn): Update.
+ (ubsan_expand_ptr_ifn): Update.
+ * value-prof.c (gimple_divmod_fixed_value): Update.
+ (gimple_mod_pow2): Update.
+ (gimple_mod_subtract): Update.
+ (gimple_ic): Update.
+ (gimple_stringop_fixed_value): Update.
+
+2017-10-19 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/82618
+ * config/i386/i386.md (sub to cmp): New peephole2 pattern.
+
+2017-10-19 Alexander Monakov <amonakov@ispras.ru>
+
+ PR rtl-optimization/82395
+ * ira-color.c (allocno_priority_compare_func): Fix comparison step
+ based on non_spilled_static_chain_regno_p.
+
+2017-10-19 Uros Bizjak <ubizjak@gmail.com>
+
+ * config/i386/i386.c (output_387_binary_op): Rewrite SSE part.
+ (ix86_emit_mode_set): Rewrite insn mnemonic construction.
+ (ix86_prepare_fp_compare_args): Redefine is_sse as bool.
+
+2017-10-19 Martin Sebor <msebor@redhat.com>
+
+ PR tree-optimization/82596
+ * tree.c (array_at_struct_end_p): Handle STRING_CST.
+
+2017-10-19 Eric Botcazou <ebotcazou@adacore.com>
+
+ * asan.c (handle_builtin_alloca): Deal with all alloca variants.
+ (get_mem_refs_of_builtin_call): Likewise.
+ * builtins.c (expand_builtin_apply): Adjust call to
+ allocate_dynamic_stack_space.
+ (expand_builtin_alloca): For __builtin_alloca_with_align_and_max, pass
+ the third argument to allocate_dynamic_stack_space, otherwise -1.
+ (expand_builtin): Deal with all alloca variants.
+ (is_inexpensive_builtin): Likewise.
+ * builtins.def (BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX): New.
+ * calls.c (special_function_p): Deal with all alloca variants.
+ (initialize_argument_information): Adjust call to
+ allocate_dynamic_stack_space.
+ (expand_call): Likewise.
+ * cfgexpand.c (expand_call_stmt): Deal with all alloca variants.
+	* doc/extend.texi (Built-ins): Add __builtin_alloca_with_align_and_max.
+ * explow.c (allocate_dynamic_stack_space): Add MAX_SIZE parameter and
+ use it for the stack usage computation.
+ * explow.h (allocate_dynamic_stack_space): Adjust prototype.
+ * function.c (gimplify_parameters): Call build_alloca_call_expr.
+ * gimple-ssa-warn-alloca.c (alloca_call_type): Simplify control flow.
+ Take into account 3rd argument of __builtin_alloca_with_align_and_max.
+ (in_loop_p): Remove first argument and useless check.
+ (pass_walloca::execute): Remove useless test and adjust call to above.
+	* gimple.c (gimple_build_call_from_tree): Deal with all alloca variants.
+ * gimplify.c (gimplify_vla_decl): Call build_alloca_call_expr.
+ (gimplify_call_expr): Deal with all alloca variants.
+ * hsa-gen.c (gen_hsa_alloca): Likewise.
+ (gen_hsa_insns_for_call): Likewise.
+ * ipa-pure-const.c (special_builtin_state): Likewise.
+ * tree-chkp.c (chkp_build_returned_bound): Likewise.
+ * tree-object-size.c (alloc_object_size): Likewise.
+ * tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Likewise.
+ (call_may_clobber_ref_p_1): Likewise.
+ * tree-ssa-ccp.c (evaluate_stmt): Likewise.
+ (ccp_fold_stmt): Likewise.
+ (optimize_stack_restore): Likewise.
+ * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Likewise.
+ (mark_all_reaching_defs_necessary_1): Likewise.
+ (propagate_necessity): Likewise.
+ (eliminate_unnecessary_stmts): Likewise.
+ * tree.c (build_common_builtin_nodes): Build
+ BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX.
+ (build_alloca_call_expr): New function.
+ * tree.h (ALLOCA_FUNCTION_CODE_P): New macro.
+ (CASE_BUILT_IN_ALLOCA): Likewise.
+ (build_alloca_call_expr): Declare.
+ * varasm.c (incorporeal_function_p): Deal with all alloca variants.
+
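
Editorial note (not part of the ChangeLog above): a minimal usage sketch of the new builtin. The argument order (size, alignment in bits, maximum size) is an assumption based on the documentation added to extend.texi and on the existing __builtin_alloca_with_align; it is not spelled out in the entry itself.

    /* Sketch only.  Allocates n bytes on the stack, 64-bit aligned, with a
       declared upper bound of 4096 bytes; the entry says the new MAX_SIZE
       argument feeds the stack usage computation in
       allocate_dynamic_stack_space.  */
    #include <string.h>

    void
    zero_fill (char *dst, unsigned int n)
    {
      char *buf = __builtin_alloca_with_align_and_max (n, 64, 4096);
      memset (buf, 0, n);
      memcpy (dst, buf, n);
    }
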
+2017-10-19 Eric Botcazou <ebotcazou@adacore.com>
+
+ PR debug/82509
+ * dwarf2out.c (new_die_raw): New static inline function.
+ (new_die): Use it to create the DIE.
+ (add_AT_external_die_ref): Likewise.
+ (clone_die): Likewise.
+ (clone_as_declaration): Likewise.
+ (dwarf2out_vms_debug_main_pointer): Likewise.
+ (base_type_die): Likewise. Remove early return for corner cases.
+ Do not call add_pubtype on the DIE here.
+ (is_base_type): Remove ERROR_MARK and return 0 for VOID_TYPE.
+ (modified_type_die): Adjust the lookup for reverse order DIEs. Skip
+ typedefs for base types with DW_AT_endianity. Make sure a DIE with
+ native order exists for base types, attach the DIE manually and call
+ add_pubtype on it. Do not equate a reverse order DIE to the type.
+
+2017-10-19 Richard Earnshaw <rearnsha@arm.com>
+
+ * config/arm/arm.c (align_ok_ldrd_strd): New function.
+ (mem_ok_for_ldrd_strd): New parameter align. Extract the alignment of
+ the mem into it.
+ (gen_operands_ldrd_strd): Validate the alignment of the accesses.
+
+2017-10-19 Jakub Jelinek <jakub@redhat.com>
+
+ * flag-types.h (enum sanitize_code): Add SANITIZE_BUILTIN. Or
+ SANITIZE_BUILTIN into SANITIZE_UNDEFINED.
+ * sanitizer.def (BUILT_IN_UBSAN_HANDLE_INVALID_BUILTIN,
+ BUILT_IN_UBSAN_HANDLE_INVALID_BUILTIN_ABORT): New builtins.
+ * opts.c (sanitizer_opts): Add builtin.
+ * ubsan.c (instrument_builtin): New function.
+ (pass_ubsan::execute): Call it.
+ (pass_ubsan::gate): Enable even for SANITIZE_BUILTIN.
+ * doc/invoke.texi: Document -fsanitize=builtin.
+
+ * ubsan.c (ubsan_expand_null_ifn): Use _v1 suffixed type mismatch
+ builtins, store max (log2 (align), 0) into uchar field instead of
+ align into uptr field.
+ (ubsan_expand_objsize_ifn): Use _v1 suffixed type mismatch builtins,
+ store uchar 0 field instead of uptr 0 field.
+ (instrument_nonnull_return): Use _v1 suffixed nonnull return builtin,
+ instead of passing one address of struct with 2 locations pass
+ two addresses of structs with 1 location each.
+ * sanitizer.def (BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH,
+ BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_ABORT,
+ BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN,
+ BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_ABORT): Removed.
+ (BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1,
+ BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1_ABORT,
+ BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_V1,
+ BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_V1_ABORT): New builtins.
+
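
Editorial note (not part of the ChangeLog above): a hedged sketch of the kind of misuse the new -fsanitize=builtin instrumentation from the first hunk of this entry is meant to diagnose, assuming it covers zero arguments to the count-zeros builtins. Compile with -fsanitize=builtin and run to trigger the inserted runtime check.

    /* Sketch only: __builtin_ctz has undefined behaviour for a zero
       argument; the volatile keeps the compiler from folding the call so
       the runtime check actually fires.  */
    #include <stdio.h>

    int
    main (void)
    {
      volatile unsigned int x = 0;
      printf ("%d\n", __builtin_ctz (x));
      return 0;
    }
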
+2017-10-19 Martin Liska <mliska@suse.cz>
+
+ PR driver/81829
+ * file-find.c (remove_prefix): Remove.
+ * file-find.h (remove_prefix): Likewise.
+ * gcc-ar.c: Remove smartness of lookup.
+
+2017-10-19 Segher Boessenkool <segher@kernel.crashing.org>
+
+ * config/rs6000/rs6000.md (*call_indirect_aix<mode>,
+ *call_value_indirect_aix<mode>, *call_indirect_elfv2<mode>,
+ *call_value_indirect_elfv2<mode>): Add correct mode to the unspec.
+
+2017-10-19 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82580
+ * config/i386/i386.md (setcc + movzbl to xor + setcc): New peephole2.
+ (setcc + and to xor + setcc): New peephole2.
+
+2017-10-19 Tom de Vries <tom@codesourcery.com>
+
+ * doc/sourcebuild.texi (Test Directives, Variants of
+ dg-require-support): Add dg-require-stack-size.
+
+2017-10-19 Martin Liska <mliska@suse.cz>
+
+ PR sanitizer/82517
+ * gimplify.c (gimplify_decl_expr): Do not instrument variables
+ that have a large alignment.
+ (gimplify_target_expr): Likewise.
+
+2017-10-18 Segher Boessenkool <segher@kernel.crashing.org>
+
+ PR rtl-optimization/82602
+ * ira.c (rtx_moveable_p): Return false for volatile asm.
+
+2017-10-18 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/82580
+ * config/i386/i386-modes.def (CCGZ): New CC mode.
+ * config/i386/i386.md (sub<mode>3_carry_ccgz): New insn pattern.
+ * config/i386/predicates.md (ix86_comparison_operator):
+ Handle CCGZmode.
+ * config/i386/i386.c (ix86_expand_branch) <case E_TImode>:
+ Emulate LE, LEU, GT, GTU, LT, LTU, GE and GEU double-word comparisons
+ with double-word subtraction.
+ (put_condition_code): Handle CCGZmode.
+
+2017-10-18 Aldy Hernandez <aldyh@redhat.com>
+
+ * wide-int.cc (debug (const wide_int &)): New.
+ (debug (const wide_int *)): New.
+ (debug (const widest_int &)): New.
+ (debug (const widest_int *)): New.
+
+2017-10-18 Vladimir Makarov <vmakarov@redhat.com>
+
+ PR middle-end/82556
+ * lra-constraints.c (curr_insn_transform): Use non-input operand
+ instead of output one for matched reload.
+
+2017-10-18 Bin Cheng <bin.cheng@arm.com>
+
+ * tree-loop-distribution.c (INCLUDE_ALGORITHM): New header file.
+ (tree-ssa-loop-ivopts.h): New header file.
+ (struct builtin_info): New fields.
+ (classify_builtin_1): Compute and record base and offset parts for
+ memset builtin partition by calling strip_offset.
+ (offset_cmp, fuse_memset_builtins): New functions.
+ (finalize_partitions): Fuse adjacent memset partitions by calling
+ above function.
+ * tree-ssa-loop-ivopts.c (strip_offset): Delete static declaration.
+ Expose the interface.
+ * tree-ssa-loop-ivopts.h (strip_offset): New declaration.
+
+2017-10-18 Bin Cheng <bin.cheng@arm.com>
+
+ PR tree-optimization/82574
+	* tree-loop-distribution.c (find_single_drs): New parameter. Check
+	that the data reference is executed exactly once per iteration
+	against the outermost loop in the nest.
+ (classify_partition): Update call to above function.
+
+2017-10-18 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82591
+ * graphite.c (graphite_transform_loops): Move code gen message
+ printing ...
+ * graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl):
+ Here. Handle scop_to_isl_ast failing.
+ (scop_to_isl_ast): Limit the number of ISL operations.
+
+2017-10-18 Richard Biener <rguenther@suse.de>
+
+ * graphite-isl-ast-to-gimple.c
+ (translate_isl_ast_to_gimple::set_rename): Simplify.
+ (translate_isl_ast_to_gimple::set_rename_for_each_def): Inline...
+ (graphite_copy_stmts_from_block): ... here.
+ (copy_bb_and_scalar_dependences): Simplify.
+ (add_parameters_to_ivs_params): Canonicalize.
+ (generate_entry_out_of_ssa_copies): Simplify.
+ * graphite-sese-to-poly.c (extract_affine_name): Simplify
+ by passing in ISL dimension.
+ (parameter_index_in_region_1): Rename to ...
+ (parameter_index_in_region): ... this.
+ (extract_affine): Adjust assert, pass down parameter index.
+ (add_param_constraints): Use range-info when available.
+ (build_scop_context): Adjust.
+ * sese.c (new_sese_info): Adjust.
+ (free_sese_info): Likewise.
+ * sese.h (bb_map_t, rename_map_t, phi_rename, init_back_edge_pair_t):
+ Remove unused typedefs.
+ (struct sese_info_t): Simplify rename_map, remove incomplete_phis.
+
+2017-10-18 Martin Liska <mliska@suse.cz>
+
+ * combine.c (simplify_compare_const): Add gcc_fallthrough.
+
+2017-10-18 Robin Dapp <rdapp@linux.vnet.ibm.com>
+
+ * config/s390/s390.c (s390_bb_fallthru_entry_likely): New function.
+ (s390_sched_init): Do not reset s390_sched_state if we entered the
+ current basic block via a fallthru edge and all others are unlikely.
+
+2017-10-18 Robin Dapp <rdapp@linux.vnet.ibm.com>
+
+ * config/s390/s390.c (NUM_SIDES): New variable.
+ (LONGRUNNING_THRESHOLD): New variable.
+ (LATENCY_FACTOR): New variable.
+ (s390_sched_score): Decrease score for long-running instructions on
+ wrong side.
+ (s390_sched_variable_issue): Perform bookkeeping for long-running
+ instructions.
+
+2017-10-18 Richard Biener <rguenther@suse.de>
+
+ * graphite-isl-ast-to-gimple.c (gcc_expression_from_isl_ast_expr_id):
+ Simplify with removal of the parameter rename map.
+ (set_rename): Likewise.
+ (should_copy_to_new_region): Likewise.
+ (graphite_copy_stmts_from_block): Likewise.
+ (copy_bb_and_scalar_dependences): Remove initialization of
+ unused copied_bb_map.
+ (copy_def): Remove.
+ (copy_internal_parameters): Likewise.
+ (graphite_regenerate_ast_isl): Do not call copy_internal_parameters.
+ * graphite-scop-detection.c (scop_detection::stmt_simple_for_scop_p):
+ Use INTEGRAL_TYPE_P.
+ (parameter_index_in_region_1): Rename to ...
+ (assign_parameter_index_in_region): ... this. Assert we have
+ a parameter we handle.
+ (scan_tree_for_params): Adjust.
+ * sese.h (parameter_rename_map_t): Remove.
+ (struct sese_info_t): Remove unused parameter_rename_map and
+ copied_bb_map members.
+ * sese.c (new_sese_info): Adjust.
+ (free_sese_info): Likewise.
+
+2017-10-18 Martin Liska <mliska@suse.cz>
+
+ PR sanitizer/82545
+ * asan.c (asan_expand_poison_ifn): Do not put gimple stmt
+ on an abnormal edge.
+
+2017-10-18 Sebastian Huber <sebastian.huber@embedded-brains.de>
+
+ * doc/invoke.texi (ffunction-sections and fdata-sections):
+ Update.
+
+2017-10-17 Eric Botcazou <ebotcazou@adacore.com>
+
+ * tree-ssa-loop-ivopts.c (add_autoinc_candidates): Bail out only if
+ the use statement can throw internally.
+
+2017-10-17 Eric Botcazou <ebotcazou@adacore.com>
+
+ * config/visium/visium.c (visium_select_cc_mode): Return CCmode for
+ any RTX present on the RHS of a SET.
+ * compare-elim.c (try_eliminate_compare): Restore comment.
+
+2017-10-17 Jakub Jelinek <jakub@redhat.com>
+
+ * langhooks.h (struct lang_hooks): Document that tree_size langhook
+ may be also called on tcc_type nodes.
+ * langhooks.c (lhd_tree_size): Likewise.
+
+2017-10-17 David Malcolm <dmalcolm@redhat.com>
+
+ * gimple-ssa-sprintf.c (fmtwarn): Update for changed signature of
+ format_warning_at_substring.
+ (maybe_warn): Convert source_range * param to a location_t. Pass
+ UNKNOWN_LOCATION rather than NULL to fmtwarn.
+ (format_directive): Remove code to extract source_ranges and
+ source_range * in favor of just a location_t.
+ (parse_directive): Pass UNKNOWN_LOCATION rather than NULL to
+ fmtwarn.
+ * substring-locations.c (format_warning_va): Convert
+ source_range * param to a location_t.
+ (format_warning_at_substring): Likewise.
+ * substring-locations.h (format_warning_va): Likewise.
+ (format_warning_at_substring): Likewise.
+
+2017-10-17 Jan Hubicka <hubicka@ucw.cz>
+
+ * target.h (enum vect_cost_for_stmt): Add vec_gather_load and
+	vec_scatter_store.
+ * tree-vect-stmts.c (record_stmt_cost): Make difference between normal
+ and scatter/gather ops.
+
+ * aarch64/aarch64.c (aarch64_builtin_vectorization_cost): Add
+ vec_gather_load and vec_scatter_store.
+ * arm/arm.c (arm_builtin_vectorization_cost): Likewise.
+ * powerpcspe/powerpcspe.c (rs6000_builtin_vectorization_cost): Likewise.
+ * rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Likewise.
+ * s390/s390.c (s390_builtin_vectorization_cost): Likewise.
+ * spu/spu.c (spu_builtin_vectorization_cost): Likewise.
+ * i386/i386.c (x86_builtin_vectorization_cost): Likewise.
+
+2017-10-17 Uros Bizjak <ubizjak@gmail.com>
+
+ * reg-stack.c (compare_for_stack_reg): Add bool argument.
+ Detect FTST instruction and handle its register pops. Only pop
+ second operand if can_pop_second_op is true.
+ (subst_stack_regs_pat) <case COMPARE>: Detect FCOMI instruction to
+ set can_pop_second_op to false in the compare_for_stack_reg call.
+
+ * config/i386/i386.md (*cmpi<FPCMP:unord><MODEF:mode>): Only call
+ output_fp_compare for stack register operands.
+ * config/i386/i386.c (output_fp_compare): Do not output SSE compare
+ instructions here. Do not emit stack register pops here. Assert
+ that FCOMPP pops next to top stack register. Rewrite function.
+
+2017-10-17 Nathan Sidwell <nathan@acm.org>
+
+ PR middle-end/82577
+ * alias.c (compare_base_decls): Check HAS_DECL_ASSEMBLER_NAME_P,
+ use DECL_ASSEMBLER_NAME_RAW.
+
+ PR middle-end/82546
+ * tree.c (tree_code_size): Reformat. Punt to lang hook for unknown
+ TYPE nodes.
+
+2017-10-17 Qing Zhao <qing.zhao@oracle.com>
+ Wilco Dijkstra <wilco.dijkstra@arm.com>
+
+ * builtins.c (expand_builtin_update_setjmp_buf): Add a
+	conversion to Pmode from the buf_addr.
+
+2017-10-17 Richard Biener <rguenther@suse.de>
+
+ * graphite-dependences.c (scop_get_reads_and_writes): Change
+ output parameters to references.
+
+2017-10-17 Jackson Woodruff <jackson.woodruff@arm.com>
+
+	PR tree-optimization/71026
+	* fold-const.c (distribute_real_division): Removed.
+	(fold_binary_loc): Remove calls to distribute_real_division.
+
+2017-10-17 Richard Biener <rguenther@suse.de>
+
+ * graphite-scop-detection.c
+ (scop_detection::stmt_has_simple_data_refs_p): Always use
+ the full nest as region.
+ (try_generate_gimple_bb): Likewise.
+ * sese.c (scalar_evolution_in_region): Simplify now that
+ SCEV can handle instantiation in regions.
+ * tree-scalar-evolution.c (instantiate_scev_name): Also instantiate
+ in the non-loop part of a function if requested.
+
+2017-10-17 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82563
+ * graphite-isl-ast-to-gimple.c (generate_entry_out_of_ssa_copies):
+ New function.
+ (graphite_regenerate_ast_isl): Call it.
+ * graphite-scop-detection.c (build_scops): Remove entry edge split.
+
+2017-10-17 Jakub Jelinek <jakub@redhat.com>
+
+ PR tree-optimization/82549
+ * fold-const.c (optimize_bit_field_compare, fold_truth_andor_1):
+ Formatting fixes. Instead of calling make_bit_field_ref with negative
+ bitpos return 0.
+
+2017-10-17 Olga Makhotina <olga.makhotina@intel.com>
+
+ * config/i386/avx512dqintrin.h (_mm_mask_reduce_sd,
+	_mm_maskz_reduce_sd, _mm_mask_reduce_ss,
+ _mm_maskz_reduce_ss): New.
+ * config/i386/i386-builtin.def (__builtin_ia32_reducesd_mask,
+	__builtin_ia32_reducess_mask): Ditto.
+ (__builtin_ia32_reducesd, __builtin_ia32_reducess): Remove.
+ * config/i386/sse.md (reduces<mode>): Renamed to ...
+ (reduces<mode><mask_scalar_name>): ... this.
+ (vreduce<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}):
+ Changed to ...
+ (vreduce<ssescalarmodesuffix>\t{%3, %2, %1, %0<mask_scalar_operand4>|
+ %0<mask_scalar_operand4>, %1, %2, %3}): ... this.
+
+2017-10-16 David Malcolm <dmalcolm@redhat.com>
+
+ * Makefile.in (OBJS): Add unique-ptr-tests.o.
+ * selftest-run-tests.c (selftest::run_tests): Call
+ selftest::unique_ptr_tests_cc_tests.
+ * selftest.h (selftest::unique_ptr_tests_cc_tests): New decl.
+ * unique-ptr-tests.cc: New file.
+
+2017-10-16 Vladimir Makarov <vmakarov@redhat.com>
+
+ PR sanitizer/82353
+ * lra.c (collect_non_operand_hard_regs): Don't ignore operator
+ locations.
+ * lra-lives.c (bb_killed_pseudos, bb_gen_pseudos): Move up.
+ (make_hard_regno_born, make_hard_regno_dead): Update
+ bb_killed_pseudos and bb_gen_pseudos for fixed regs.
+
+2017-10-16 Jeff Law <law@redhat.com>
+
+ * tree-ssa-dse.c (live_bytes_read): Fix thinko.
+
+2017-10-16 Jan Hubicka <hubicka@ucw.cz>
+
+ * x86-tune-costs.h (znver1_cost): Fix move cost tables.
+
+2017-10-16 Olivier Hainque <hainque@adacore.com>
+
+	* config.gcc (powerpc*-*-*spe*): Pick 8548 as the default
+ with_cpu if we were configured for an e500v2 target cpu name.
+
+2017-10-16 Thomas Preud'homme <thomas.preudhomme@arm.com>
+
+ * config/arm/arm-cpus.in (cortex-m33): Add nodsp option.
+ * doc/invoke.texi: Document +nodsp as a valid extension for
+ -mcpu=cortex-m33.
+
+2017-10-16 Martin Liska <mliska@suse.cz>
+
+ * sbitmap.c (bitmap_bit_in_range_p_checking): New function.
+ (test_set_range): Likewise.
+ (test_range_functions): Rename to ...
+ (test_bit_in_range): ... this.
+ (sbitmap_c_tests): Add new test.
+
+2017-10-16 Tamar Christina <tamar.christina@arm.com>
+
+ * config/aarch64/arm_neon.h (vdot_u32, vdotq_u32, vdot_s32, vdotq_s32):
+ New.
+ (vdot_lane_u32, vdot_laneq_u32, vdotq_lane_u32, vdotq_laneq_u32): New.
+ (vdot_lane_s32, vdot_laneq_s32, vdotq_lane_s32, vdotq_laneq_s32): New.
+
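
Editorial note (not part of the ChangeLog above): a rough C sketch of the new dot-product intrinsics. The prototype follows the Arm ACLE dot-product extension and is an assumption here rather than text from the patch; it needs an AArch64 compiler targeting Armv8.2-A with the dotprod extension enabled.

    /* Each 32-bit lane of the accumulator gains the dot product of the
       corresponding four unsigned 8-bit lanes of a and b.  */
    #include <arm_neon.h>

    uint32x2_t
    udot2 (uint32x2_t acc, uint8x8_t a, uint8x8_t b)
    {
      return vdot_u32 (acc, a, b);
    }
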
+2017-10-16 Tamar Christina <tamar.christina@arm.com>
+
+ * config/aarch64/aarch64-builtins.c
+ (aarch64_types_quadopu_lane_qualifiers): New.
+ (TYPES_QUADOPU_LANE): New.
+ * config/aarch64/aarch64-simd.md (aarch64_<sur>dot<vsi2qi>): New.
+ (<sur>dot_prod<vsi2qi>, aarch64_<sur>dot_lane<vsi2qi>): New.
+ (aarch64_<sur>dot_laneq<vsi2qi>): New.
+ * config/aarch64/aarch64-simd-builtins.def (sdot, udot): New.
+ (sdot_lane, udot_lane, sdot_laneq, udot_laneq): New.
+ * config/aarch64/iterators.md (sur): Add UNSPEC_SDOT, UNSPEC_UDOT.
+ (Vdottype, DOTPROD): New.
+ (sur): Add SDOT and UDOT.
+
+2017-10-16 Tamar Christina <tamar.christina@arm.com>
+
+ * config/aarch64/aarch64.h (AARCH64_FL_DOTPROD): New.
+ (AARCH64_ISA_DOTPROD, TARGET_DOTPROD): New.
+ * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
+ Add TARGET_DOTPROD.
+ * config/aarch64/aarch64-option-extensions.def (dotprod): New.
+ * config/aarch64/aarch64-cores.def (cortex-a55, cortex-a75):
+ Enable TARGET_DOTPROD.
+ (cortex-a75.cortex-a55): Likewise.
+ * doc/invoke.texi (aarch64-feature-modifiers): Document dotprod.
+
+2017-10-16 Tamar Christina <tamar.christina@arm.com>
+
+ * config/arm/arm-builtins.c (arm_unsigned_uternop_qualifiers): New.
+ (UTERNOP_QUALIFIERS, arm_umac_lane_qualifiers, UMAC_LANE_QUALIFIERS):
+ New.
+ * config/arm/arm_neon_builtins.def (sdot, udot, sdot_lane, udot_lane):
+ New.
+ * config/arm/iterators.md (DOTPROD, VSI2QI, vsi2qi): New.
+ (UNSPEC_DOT_S, UNSPEC_DOT_U, opsuffix): New.
+ * config/arm/neon.md (neon_<sup>dot<vsi2qi>): New.
+ (neon_<sup>dot_lane<vsi2qi>, <sup>dot_prod<vsi2qi>): New.
+ * config/arm/types.md (neon_dot, neon_dot_q): New.
+ * config/arm/unspecs.md (sup): Add UNSPEC_DOT_S, UNSPEC_DOT_U.
+
+2017-10-16 Tamar Christina <tamar.christina@arm.com>
+
+ * config/arm/arm.h (TARGET_DOTPROD): New.
+ * config/arm/arm.c (arm_arch_dotprod): New.
+ (arm_option_reconfigure_globals): Add arm_arch_dotprod.
+ * config/arm/arm-c.c (__ARM_FEATURE_DOTPROD): New.
+ * config/arm/arm-cpus.in (armv8.2-a): Enabled +dotprod.
+ (feature dotprod, group dotprod, ALL_SIMD_INTERNAL): New.
+ (ALL_FPU_INTERNAL): Use ALL_SIMD_INTERNAL.
+ * config/arm/t-multilib (v8_2_a_simd_variants): Add dotprod.
+	* doc/invoke.texi (armv8.2-a): Document dotprod.
+
+2017-10-14 Jan Hubicka <hubicka@ucw.cz>
+
+ * i386.c (ix86_vec_cost): New function.
+ (ix86_rtx_costs): Handle vector operations better.
+ * i386.h (struct processor_costs): Add sse_op, fmasd, fmass.
+ * x86-tune-costs.h: Add new costs to all tables.
+
+2017-10-14 Jan Hubicka <hubicka@ucw.cz>
+
+ * i386.c (ix86_rtx_costs): Make difference between x87 and SSE
+ operations.
+ * i386.h (struct processor_costs): Add addss, mulss, mulsd, divss,
+	divsd, sqrtss and sqrtsd.
+ * x86-tune-costs.h: Add new entries to all costs.
+ (znver1_cost): Fix to match real instruction latencies.
+
+2017-10-14 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
+ Michael Collison <michael.collison@arm.com>
+
+ * compare-elim.c: Include emit-rtl.h.
+ (can_merge_compare_into_arith): New function.
+ (try_validate_parallel): Likewise.
+ (try_merge_compare): Likewise.
+ (try_eliminate_compare): Call the above when no previous clobber
+ is available.
+ (execute_compare_elim_after_reload): Add DF_UD_CHAIN and DF_DU_CHAIN
+ dataflow problems.
+
+2017-10-14 Jakub Jelinek <jakub@redhat.com>
+
+ PR middle-end/62263
+ PR middle-end/82498
+ * tree-ssa-phiopt.c (value_replacement): Comment fix. Handle
+ up to 2 preparation statements for ASSIGN in MIDDLE_BB.
+
+ PR middle-end/62263
+ PR middle-end/82498
+ * tree-ssa-forwprop.c (simplify_rotate): Allow def_arg1[N]
+ to be any operand_equal_p operands. For & (B - 1) require
+ B to be power of 2. Recognize
+ (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1))) and similar patterns.
+
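
Editorial note (not part of the ChangeLog above): the second hunk of this entry names a source-level rotate idiom that simplify_rotate now recognizes. A minimal C instance of that pattern, with B taken as the bit width 32 so B - 1 masks the shift counts and the expression is well defined for every y, looks like this:

    /* Rotate-left by y using the masked-shift idiom from the entry:
       (x << (y & 31)) | (x >> (-y & 31)); with the recognition in place
       GCC can turn this into a single rotate instruction.  */
    unsigned int
    rotl32 (unsigned int x, unsigned int y)
    {
      return (x << (y & 31)) | (x >> (-y & 31));
    }
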
+2017-10-14 Uros Bizjak <ubizjak@gmail.com>
+
+ PR bootstrap/82553
+ * optabs.c (expand_memory_blockage): Fix call of
+ targetm.have_memory_blockage.
+
+2017-10-14 Jakub Jelinek <jakub@redhat.com>
+
+ PR bootstrap/82548
+ * config.gcc (*-*-solaris2*, i[34567]86-*-cygwin*,
+ x86_64-*-cygwin*, i[34567]86-*-mingw* | x86_64-*-mingw*): Append
+ objects to extra_objs instead of overwriting it.
+
+2017-10-14 Uros Bizjak <ubizjak@gmail.com>
+
+ * config/i386/sync.md (FILD_ATOMIC/FIST_ATOMIC FP load peephole2):
+ Use any_fp_register_operand as operand[3] predicate. Simplify
+ equality test for operands[2] and operands[4] memory location.
+ (LDX_ATOMIC/STX_ATOMIC FP load peephole2): Ditto.
+ (FILD_ATOMIC/FIST_ATOMIC FP load peephole2 with mem blockage): New.
+ (LDX_ATOMIC/LDX_ATOMIC FP load peephole2 with mem blockage): Ditto.
+ (FILD_ATOMIC/FIST_ATOMIC FP store peephole2): Use
+ any_fp_register_operand as operand[1] predicate. Simplify
+ equality test for operands[0] and operands[3] memory location.
+ (LDX_ATOMIC/STX_ATOMIC FP store peephole2): Ditto.
+ (FILD_ATOMIC/FIST_ATOMIC FP store peephole2 with mem blockage): New.
+	(LDX_ATOMIC/LDX_ATOMIC FP store peephole2 with mem blockage): Ditto.
+
+2017-10-14 Uros Bizjak <ubizjak@gmail.com>
+
+ * target-insns.def: Add memory_blockage.
+ * optabs.c (expand_memory_blockage): New function.
+ (expand_asm_memory_barrier): Rename ...
+ (expand_asm_memory_blockage): ... to this.
+ (expand_mem_thread_fence): Call expand_memory_blockage
+ instead of expand_asm_memory_barrier.
+	(expand_mem_signal_fence): Ditto.
+ (expand_atomic_load): Ditto.
+ (expand_atomic_store): Ditto.
+ * doc/md.texi (Standard Pattern Names For Generation):
+ Document memory_blockage instruction pattern.
+
+2017-10-13 Sebastian Perta <sebastian.perta@renesas.com>
+
+ * config/rl78/rl78.c (rl78_emit_libcall): New function.
+ * config/rl78/rl78-protos.h (rl78_emit_libcall): New function.
+ * config/rl78/rl78.md: New define_expand "adddi3".
+
+2017-10-13 Jan Hubicka <hubicka@ucw.cz>
+
+ * cfghooks.c (verify_flow_info): Disable check that all probabilities
+ are set correctly.
+
+2017-10-13 Jeff Law <law@redhat.com>
+
+	* tree-ssa-reassoc.c (reassociate_bb): Clarify code slightly.
+
+2017-10-13 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82274
+ * internal-fn.c (expand_mul_overflow): If both operands have
+ the same highpart of -1 or 0 and the topmost bit of lowpart
+ is different, overflow is if res <= 0 rather than res < 0.
+
+2017-10-13 Pat Haugen <pthaugen@us.ibm.com>
+
+ * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Remove
+ TARGET_P9_VECTOR code for unaligned_load case.
+
+2017-10-13 Jan Hubicka <hubicka@ucw.cz>
+
+ * cfghooks.c (verify_flow_info): Check that edge probabilities are set.
+
+2017-10-13 Nathan Sidwell <nathan@acm.org>
+
+ * tree-core.h (tree_contains_struct): Make bool.
+ * tree.c (tree_contains_struct): Likewise.
+ * tree.h (MARK_TS_BASE): Remove do ... while (0) idiom.
+ (MARK_TS_TYPED, MARK_TS_COMMON, MARK_TS_TYPE_COMMON,
+ MARK_TS_TYPE_WITH_LANG_SPECIFIC, MARK_TS_DECL_MINIMAL,
+ MARK_TS_DECL_COMMON, MARK_TS_DECL_WRTL, MARK_TS_DECL_WITH_VIS,
+ MARK_TS_DECL_NON_COMMON): Likewise, use comma operator.
+
+2017-10-13 Richard Biener <rguenther@suse.de>
+
+ * graphite-isl-ast-to-gimple.c
+ (translate_isl_ast_to_gimple::get_rename_from_scev): Remove unused
+ parameters and dominance check.
+ (translate_isl_ast_to_gimple::graphite_copy_stmts_from_block): Adjust.
+ (translate_isl_ast_to_gimple::copy_bb_and_scalar_dependences): Likewise.
+ (translate_isl_ast_to_gimple::graphite_regenerate_ast_isl):
+ Do not update SSA form here or do intermediate IL verification.
+ * graphite.c: Include tree-ssa.h and tree-into-ssa.h.
+ (graphite_initialize): Remove check on the number of loops in
+ the function and inline into graphite_transform_loops.
+ (graphite_finalize): Inline into graphite_transform_loops.
+ (graphite_transform_loops): Perform SSA update and IL verification
+ here.
+ * params.def (PARAM_GRAPHITE_MIN_LOOPS_PER_FUNCTION): Remove.
+
+2017-10-13 Richard Biener <rguenther@suse.de>
+
+ * graphite-isl-ast-to-gimple.c (max_mode_int_precision,
+ graphite_expression_type_precision): Avoid global constructor
+ by moving ...
+ (translate_isl_ast_to_gimple::translate_isl_ast_to_gimple): Here.
+ (translate_isl_ast_to_gimple::graphite_expr_type): Add type member.
+ (translate_isl_ast_to_gimple::translate_isl_ast_node_for): Use it.
+ (translate_isl_ast_to_gimple::build_iv_mapping): Likewise.
+ (translate_isl_ast_to_gimple::graphite_create_new_guard): Likewise.
+ * graphite-sese-to-poly.c (build_original_schedule): Return nothing.
+
+2017-10-13 H.J. Lu <hongjiu.lu@intel.com>
+
+ PR target/82499
+ * config/i386/i386.h (ix86_red_zone_size): New.
+ * config/i386/i386.md (push peephole2s): Replace
+ "!ix86_using_red_zone ()" with "ix86_red_zone_size == 0".
+
+2017-10-13 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * combine.c (can_change_dest_mode): Reject changes in
+ REGMODE_NATURAL_SIZE.
+
+2017-10-13 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * cfgexpand.c (expand_debug_expr): Use GET_MODE_UNIT_BITSIZE.
+ (expand_debug_source_expr): Likewise.
+ * combine.c (combine_simplify_rtx): Likewise.
+ * cse.c (fold_rtx): Likewise.
+ * fwprop.c (canonicalize_address): Likewise.
+ * targhooks.c (default_shift_truncation_mask): Likewise.
+
+2017-10-13 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * optabs.c (add_equal_note): Use GET_MODE_UNIT_SIZE.
+ (widened_mode): Likewise.
+ (expand_unop): Likewise.
+ * ree.c (transform_ifelse): Likewise.
+ (merge_def_and_ext): Likewise.
+ (combine_reaching_defs): Likewise.
+ * simplify-rtx.c (simplify_unary_operation_1): Likewise.
+
+2017-10-13 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * caller-save.c (replace_reg_with_saved_mem): Use byte_lowpart_offset.
+ * combine.c (gen_lowpart_for_combine): Likewise.
+ * dwarf2out.c (rtl_for_decl_location): Likewise.
+ * final.c (alter_subreg): Likewise.
+ * rtlhooks.c (gen_lowpart_general): Likewise.
+ (gen_lowpart_if_possible): Likewise.
+
+2017-10-13 Richard Sandiford <richard.sandiford@linaro.org>
+ Alan Hayward <alan.hayward@arm.com>
+ David Sherwood <david.sherwood@arm.com>
+
+ * calls.c (expand_call): Use subreg_lowpart_offset.
+ * cse.c (cse_insn): Likewise.
+ * regcprop.c (copy_value): Likewise.
+ (copyprop_hardreg_forward_1): Likewise.
+
2017-10-13 Jakub Jelinek <jakub@redhat.com>
PR target/82524
@@ -185,7 +3546,7 @@
2017-10-12 Jan Hubicka <hubicka@ucw.cz>
- * x86-tune-sched.c (ix86_adjust_cost): Fix Zen support.
+ * config/i386/x86-tune-sched.c (ix86_adjust_cost): Fix Zen support.
2017-10-12 Uros Bizjak <ubizjak@gmail.com>
@@ -385,7 +3746,7 @@
2017-10-11 Jan Hubicka <hubicka@ucw.cz>
* config.gcc (i386, x86_64): Add extra objects.
- * i386/i386-protos.h (ix86_rip_relative_addr_p): Declare.
+ * config/i386/i386-protos.h (ix86_rip_relative_addr_p): Declare.
(ix86_min_insn_size): Declare.
(ix86_issue_rate): Declare.
(ix86_adjust_cost): Declare.
@@ -396,7 +3757,7 @@
(ix86_bd_do_dispatch): Declare.
(ix86_core2i7_init_hooks): Declare.
(ix86_atom_sched_reorder): Declare.
- * i386/i386.c Move all CPU cost tables to x86-tune-costs.h.
+ * config/i386/i386.c Move all CPU cost tables to x86-tune-costs.h.
(COSTS_N_BYTES): Move to x86-tune-costs.h.
(DUMMY_STRINGOP_ALGS):Move to x86-tune-costs.h.
(rip_relative_addr_p): Rename to ...
@@ -467,12 +3828,12 @@
(debug_ready_dispatch): Move to ix86-tune-sched-bd.c.
(do_dispatch): Move to ix86-tune-sched-bd.c.
(has_dispatch): Move to ix86-tune-sched-bd.c.
- * i386/t-i386: Add new object files.
- * i386/x86-tune-costs.h: New file.
- * i386/x86-tune-sched-atom.c: New file.
- * i386/x86-tune-sched-bd.c: New file.
- * i386/x86-tune-sched-core.c: New file.
- * i386/x86-tune-sched.c: New file.
+ * config/i386/t-i386: Add new object files.
+ * config/i386/x86-tune-costs.h: New file.
+ * config/i386/x86-tune-sched-atom.c: New file.
+ * config/i386/x86-tune-sched-bd.c: New file.
+ * config/i386/x86-tune-sched-core.c: New file.
+ * config/i386/x86-tune-sched.c: New file.
2017-10-11 Liu Hao <lh_mouse@126.com>
@@ -973,12 +4334,12 @@
2017-10-08 Jan Hubicka <hubicka@ucw.cz>
- * i386.c (ix86_expand_set_or_movmem): Disable 512bit loops for targets
- that preffer 128bit.
+ * config/i386/i386.c (ix86_expand_set_or_movmem): Disable 512bit loops
+ for targets that preffer 128bit.
2017-10-08 Jan Hubicka <hubicka@ucw.cz>
- * i386.c (has_dispatch): Disable for Ryzen.
+ * config/i386/i386.c (has_dispatch): Disable for Ryzen.
2017-10-08 Olivier Hainque <hainque@adacore.com>
@@ -1175,8 +4536,8 @@
2017-10-05 Jan Hubicka <hubicka@ucw.cz>
- * i386.c (ia32_multipass_dfa_lookahead): Default to issue rate
- for post-reload scheduling.
+ * config/i386/i386.c (ia32_multipass_dfa_lookahead): Default to issue
+ rate for post-reload scheduling.
2017-10-05 Tamar Christina <tamar.christina@arm.com>
@@ -1184,13 +4545,13 @@
2017-10-05 Jan Hubicka <hubicka@ucw.cz>
- * i386.c (znver1_cost): Set branch_cost to 3 (instead of 2)
+ * config/i386/i386.c (znver1_cost): Set branch_cost to 3 (instead of 2)
to improve monte carlo in scimark.
2017-10-05 Jan Hubicka <hubicka@ucw.cz>
- * i386.c (ix86_size_cost, i386_cost, i486_cost, pentium_cost,
- lakemont_cost, pentiumpro_cost, geode_cost, k6_cost,
+ * config/i386/i386.c (ix86_size_cost, i386_cost, i486_cost,
+ pentium_cost, lakemont_cost, pentiumpro_cost, geode_cost, k6_cost,
athlon_cost, k8_cost, amdfam10_cost, btver1_cost, btver2_cost,
pentium4_cost, nocona_cost): Set reassociation width to 1.
(bdver1_cost, bdver2_cost, bdver3_cost, bdver4_cost): Set reassociation
@@ -1205,7 +4566,7 @@
(ix86_reassociation_width): Rewrite using cost table; special case
plus/minus on Zen; honor X86_TUNE_SSE_SPLIT_REGS
and TARGET_AVX128_OPTIMAL.
- * i386.h (processor_costs): Add
+ * config/i386/i386.h (processor_costs): Add
reassoc_int, reassoc_fp, reassoc_vec_int, reassoc_vec_fp.
(TARGET_VECTOR_PARALLEL_EXECUTION, TARGET_REASSOC_INT_TO_PARALLEL,
TARGET_REASSOC_FP_TO_PARALLEL): Remove.
@@ -25475,11 +28836,6 @@
* doc/invoke.texi: Replace inequality signs with square brackets
for -Wnormalized.
-2017-02-22 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
-
- PR tree-optimization/68644
- * gcc.dg/tree-ssa/ivopts-lt-2.c: Skip for powerpc*-*-*.
-
2017-02-22 Matthew Fortune <matthew.fortune@imgtec.com>
PR target/78660
@@ -27052,8 +30408,6 @@
* tree-vrp.c (process_assert_insertions): Properly adjust common
when removing a duplicate.
- * gcc.dg/torture/pr79276.c: New testcase.
-
2017-01-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/79256
diff --git a/gcc/DATESTAMP b/gcc/DATESTAMP
index 96354492ed7..2c700d42332 100644
--- a/gcc/DATESTAMP
+++ b/gcc/DATESTAMP
@@ -1 +1 @@
-20171013
+20171104
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 0f7110d227a..7e23a230793 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1161,6 +1161,7 @@ FLAGS_TO_PASS = \
"libexecsubdir=$(libexecsubdir)" \
"datarootdir=$(datarootdir)" \
"datadir=$(datadir)" \
+ "libsubdir=$(libsubdir)" \
"localedir=$(localedir)"
#
# Lists of files for various purposes.
@@ -1447,7 +1448,6 @@ OBJS = \
sched-deps.o \
sched-ebb.o \
sched-rgn.o \
- sdbout.o \
sel-sched-ir.o \
sel-sched-dump.o \
sel-sched.o \
@@ -1569,6 +1569,7 @@ OBJS = \
tree-vrp.o \
tree.o \
typed-splay-tree.o \
+ unique-ptr-tests.o \
valtrack.o \
value-prof.o \
var-tracking.o \
@@ -1591,7 +1592,7 @@ OBJS-libcommon = diagnostic.o diagnostic-color.o diagnostic-show-locus.o \
pretty-print.o intl.o \
sbitmap.o \
vec.o input.o version.o hash-table.o ggc-none.o memory-block.o \
- selftest.o
+ selftest.o selftest-diagnostic.o
# Objects in libcommon-target.a, used by drivers and by the core
# compiler and containing target-dependent code.
@@ -2525,7 +2526,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
$(srcdir)/lists.c $(srcdir)/optabs-libfuncs.c \
$(srcdir)/profile.c $(srcdir)/mcf.c \
$(srcdir)/reg-stack.c $(srcdir)/cfgrtl.c \
- $(srcdir)/sdbout.c $(srcdir)/stor-layout.c \
+ $(srcdir)/stor-layout.c \
$(srcdir)/stringpool.c $(srcdir)/tree.c $(srcdir)/varasm.c \
$(srcdir)/gimple.h \
$(srcdir)/gimple-ssa.h \
diff --git a/gcc/acinclude.m4 b/gcc/acinclude.m4
index dbc0ba7e003..da4ddfd39ed 100644
--- a/gcc/acinclude.m4
+++ b/gcc/acinclude.m4
@@ -277,8 +277,7 @@ fi
fi])
AC_DEFUN([gcc_AC_INITFINI_ARRAY],
-[AC_REQUIRE([gcc_SUN_LD_VERSION])dnl
-AC_ARG_ENABLE(initfini-array,
+[AC_ARG_ENABLE(initfini-array,
[ --enable-initfini-array use .init_array/.fini_array sections],
[], [
AC_CACHE_CHECK(for .preinit_array/.init_array/.fini_array support,
@@ -556,43 +555,6 @@ if test $[$2] != yes; then
$8
fi])])
-dnl gcc_SUN_LD_VERSION
-dnl
-dnl Determines Sun linker version numbers, setting gcc_cv_sun_ld_vers to
-dnl the complete version number and gcc_cv_sun_ld_vers_{major, minor} to
-dnl the corresponding fields.
-dnl
-dnl ld and ld.so.1 are guaranteed to be updated in lockstep, so ld version
-dnl numbers can be used in ld.so.1 feature checks even if a different
-dnl linker is configured.
-dnl
-AC_DEFUN([gcc_SUN_LD_VERSION],
-[changequote(,)dnl
-if test "x${build}" = "x${target}" && test "x${build}" = "x${host}"; then
- case "${target}" in
- *-*-solaris2*)
- #
- # Solaris 2 ld -V output looks like this for a regular version:
- #
- # ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1699
- #
- # but test versions add stuff at the end:
- #
- # ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1701:onnv-ab196087-6931056-03/25/10
- #
- gcc_cv_sun_ld_ver=`/usr/ccs/bin/ld -V 2>&1`
- if echo "$gcc_cv_sun_ld_ver" | grep 'Solaris Link Editors' > /dev/null; then
- gcc_cv_sun_ld_vers=`echo $gcc_cv_sun_ld_ver | sed -n \
- -e 's,^.*: 5\.[0-9][0-9]*-\([0-9]\.[0-9][0-9]*\).*$,\1,p'`
- gcc_cv_sun_ld_vers_major=`expr "$gcc_cv_sun_ld_vers" : '\([0-9]*\)'`
- gcc_cv_sun_ld_vers_minor=`expr "$gcc_cv_sun_ld_vers" : '[0-9]*\.\([0-9]*\)'`
- fi
- ;;
- esac
-fi
-changequote([,])dnl
-])
-
dnl GCC_TARGET_TEMPLATE(KEY)
dnl ------------------------
dnl Define KEY as a valid configure key on the target machine.
diff --git a/gcc/ada/ChangeLog b/gcc/ada/ChangeLog
index 3e1f53762c0..6e2a7ffd099 100644
--- a/gcc/ada/ChangeLog
+++ b/gcc/ada/ChangeLog
@@ -1,3 +1,501 @@
+2017-10-31 Eric Botcazou <ebotcazou@adacore.com>
+
+ PR ada/82785
+ * gcc-interface/Makefile.in (m68k/Linux): Fix typo.
+
+2017-10-21 Eric Botcazou <ebotcazou@adacore.com>
+
+ * gcc-interface/Makefile.in: Remove bogus settings for VxWorks.
+
+2017-10-21 Eric Botcazou <ebotcazou@adacore.com>
+
+ * gcc-interface/utils.c (pad_type_hash): Use hashval_t for hash value.
+ (convert): Do not use an unchecked conversion for converting from a
+ type to another type padding it.
+
+2017-10-20 Doug Rupp <rupp@adacore.com>
+
+ * libgnarl/s-osinte__linux.ads (Relative_Timed_Wait): Add variable
+ needed for using monotonic clock.
+ * libgnarl/s-taprop__linux.adb: Revert previous monotonic clock
+ changes.
+ * libgnarl/s-taprop__linux.adb, s-taprop__posix.adb: Unify and factor
+ out monotonic clock related functions body.
+	(Timed_Sleep, Timed_Delay, Monotonic_Clock, RT_Resolution,
+ Compute_Deadline): Move to...
+ * libgnarl/s-tpopmo.adb: ... here. New separate package body.
+
+2017-10-20 Ed Schonberg <schonberg@adacore.com>
+
+ * sem_util.adb (Is_Controlling_Limited_Procedure): Handle properly the
+ case where the controlling formal is an anonymous access to interface
+ type.
+ * exp_ch9.adb (Extract_Dispatching_Call): If controlling actual is an
+	access type, handle properly the constructed dereference that
+ designates the object used in the rewritten synchronized call.
+ (Parameter_Block_Pack): If the type of the actual is by-copy, its
+ generated declaration in the parameter block does not need an
+ initialization even if the type is a null-excluding access type,
+ because it will be initialized with the value of the actual later on.
+ (Parameter_Block_Pack): Do not add controlling actual to parameter
+ block when its type is by-copy.
+
+2017-10-20 Justin Squirek <squirek@adacore.com>
+
+ * sem_ch8.adb (Update_Use_Clause_Chain): Add sanity check to verify
+ scope stack traversal into the context clause.
+
+2017-10-20 Bob Duff <duff@adacore.com>
+
+ * sinfo.ads: Fix a comment typo.
+
+2017-10-20 Eric Botcazou <ebotcazou@adacore.com>
+
+ * doc/gnat_ugn/building_executable_programs_with_gnat.rst (-flto): Add
+ warning against usage in conjunction with -gnatn.
+ (-fdump-xref): Delete entry.
+ * doc/gnat_ugn/gnat_utility_programs.rst (--ext): Remove mention of
+ -fdump-xref switch.
+ * gnat_ugn.texi: Regenerate.
+
+2017-10-20 Hristian Kirtchev <kirtchev@adacore.com>
+
+ * sem_type.adb, exp_util.adb, sem_util.adb, sem_dim.adb, sem_elab.adb:
+ Minor reformatting.
+
+2017-10-20 Yannick Moy <moy@adacore.com>
+
+	* sem_dim.adb (Analyze_Dimension_Binary_Op): Accept, with a warning,
+	the comparison of a dimensioned expression with a literal.
+ (Dim_Warning_For_Numeric_Literal): Do not issue a warning for the
+ special value zero.
+ * doc/gnat_ugn/gnat_and_program_execution.rst: Update description of
+ dimensionality system in GNAT.
+ * gnat_ugn.texi: Regenerate.
+
+2017-10-20 Yannick Moy <moy@adacore.com>
+
+ * sem_ch6.adb (Analyze_Expression_Function.Freeze_Expr_Types): Remove
+ inadequate silencing of errors.
+ * sem_util.adb (Check_Part_Of_Reference): Do not issue an error when
+ checking the subprogram body generated from an expression function,
+ when this is done as part of the preanalysis done on expression
+ functions, as the subprogram body may not yet be attached in the AST.
+ The error if any will be issued later during the analysis of the body.
+ (Is_Aliased_View): Trivial rewrite with Is_Formal_Object.
+
+2017-10-20 Arnaud Charlet <charlet@adacore.com>
+
+ * sem_ch8.adb (Update_Chain_In_Scope): Add missing [-gnatwu] marker for
+ warning on ineffective use clause.
+
+2017-10-20 Eric Botcazou <ebotcazou@adacore.com>
+
+ * exp_ch11.ads (Warn_If_No_Local_Raise): Declare.
+ * exp_ch11.adb (Expand_Exception_Handlers): Use Warn_If_No_Local_Raise
+ to issue the warning on the absence of local raise.
+ (Possible_Local_Raise): Do not issue the warning for Call_Markers.
+ (Warn_If_No_Local_Raise): New procedure to issue the warning on the
+ absence of local raise.
+ * sem_elab.adb: Add with and use clauses for Exp_Ch11.
+ (Record_Elaboration_Scenario): Call Possible_Local_Raise in the cases
+ where a scenario could give rise to raising Program_Error.
+ * sem_elab.adb: Typo fixes.
+ * fe.h (Warn_If_No_Local_Raise): Declare.
+ * gcc-interface/gigi.h (get_exception_label): Change return type.
+ * gcc-interface/trans.c (gnu_constraint_error_label_stack): Change to
+ simple vector of Entity_Id.
+ (gnu_storage_error_label_stack): Likewise.
+ (gnu_program_error_label_stack): Likewise.
+ (gigi): Adjust to above changes.
+ (Raise_Error_to_gnu): Likewise.
+ (gnat_to_gnu) <N_Goto_Statement>: Set TREE_USED on the label.
+ (N_Push_Constraint_Error_Label): Push the label onto the stack.
+ (N_Push_Storage_Error_Label): Likewise.
+ (N_Push_Program_Error_Label): Likewise.
+ (N_Pop_Constraint_Error_Label): Pop the label from the stack and issue
+ a warning on the absence of local raise.
+ (N_Pop_Storage_Error_Label): Likewise.
+ (N_Pop_Program_Error_Label): Likewise.
+ (push_exception_label_stack): Delete.
+ (get_exception_label): Change return type to Entity_Id and adjust.
+ * gcc-interface/utils2.c (build_goto_raise): Change type of first
+ parameter to Entity_Id and adjust. Set TREE_USED on the label.
+ (build_call_raise): Adjust calls to get_exception_label and also
+ build_goto_raise.
+ (build_call_raise_column): Likewise.
+ (build_call_raise_range): Likewise.
+ * doc/gnat_ugn/building_executable_programs_with_gnat.rst (-gnatw.x):
+ Document actual default behavior.
+
+2017-10-20 Piotr Trojanek <trojanek@adacore.com>
+
+ * einfo.ads: Minor consistent punctuation in comment. All numbered
+ items in the comment of Is_Internal are now terminated with a period.
+
+2017-10-20 Piotr Trojanek <trojanek@adacore.com>
+
+ * exp_util.adb (Build_Temporary): Mark created temporary entity as
+ internal.
+
+2017-10-20 Piotr Trojanek <trojanek@adacore.com>
+
+ * sem_type.adb (In_Generic_Actual): Simplified.
+
+2017-10-20 Justin Squirek <squirek@adacore.com>
+
+ * sem_ch12.adb (Check_Formal_Package_Instance): Add sanity check to
+ verify a renaming exists for a generic formal before comparing it to
+ the actual as defaulted formals will not have a renamed_object.
+
+2017-10-20 Javier Miranda <miranda@adacore.com>
+
+ * exp_ch6.adb (Replace_Returns): Fix wrong management of
+ N_Block_Statement nodes.
+
+2017-10-20 Bob Duff <duff@adacore.com>
+
+ * exp_aggr.adb (Initialize_Array_Component): Avoid adjusting a
+ component of an array aggregate if it is initialized by a
+ build-in-place function call.
+ * exp_ch6.adb (Is_Build_In_Place_Result_Type): Use -gnatd.9 to disable
+ bip for nonlimited types.
+ * debug.adb: Document -gnatd.9.
+
+2017-10-20 Bob Duff <duff@adacore.com>
+
+ * sem_ch12.adb: Remove redundant setting of Parent.
+
+2017-10-20 Eric Botcazou <ebotcazou@adacore.com>
+
+ * sem_ch4.adb (Find_Concatenation_Types): Filter out operators if one
+ of the operands is a string literal.
+
+2017-10-20 Bob Duff <duff@adacore.com>
+
+ * einfo.ads: Comment fix.
+
+2017-10-20 Clement Fumex <fumex@adacore.com>
+
+ * switch-c.adb: Remove -gnatwm from the switches triggered by -gnateC.
+
+2017-10-20 Ed Schonberg <schonberg@adacore.com>
+
+ * sem_dim.adb (Extract_Power): Accept dimension values that are not
+ non-negative integers when the dimensioned base type is an Integer
+ type.
+
+2017-10-20 Bob Duff <duff@adacore.com>
+
+ * sinfo.ads, sinfo.adb (Alloc_For_BIP_Return): New flag to indicate
+ that an allocator came from a b-i-p return statement.
+ * exp_ch4.adb (Expand_Allocator_Expression): Avoid adjusting the return
+ object of a nonlimited build-in-place function call.
+ * exp_ch6.adb (Expand_N_Extended_Return_Statement): Set the
+ Alloc_For_BIP_Return flag on generated allocators.
+ * sem_ch5.adb (Analyze_Assignment): Move Assert to where it can't fail.
+ If the N_Assignment_Statement has been transformed into something else,
+ then Should_Transform_BIP_Assignment won't work.
+ * exp_ch3.adb (Expand_N_Object_Declaration): A previous revision said,
+ "Remove Adjust if we're building the return object of an extended
+ return statement in place." Back out that part of the change, because
+ the Alloc_For_BIP_Return flag is now used for that.
+
+2017-10-19 Bob Duff <duff@adacore.com>
+
+ * exp_ch6.adb (Is_Build_In_Place_Result_Type): Fix silly bug -- "Typ"
+ should be "T". Handle case of a subtype of a class-wide type.
+
+2017-10-19 Bob Duff <duff@adacore.com>
+
+ * exp_util.adb: (Process_Statements_For_Controlled_Objects): Clarify
+ which node kinds can legitimately be ignored, and raise Program_Error
+ for others.
+
+2017-10-19 Hristian Kirtchev <kirtchev@adacore.com>
+
+ * sem_elab.adb (Compilation_Unit): Handle the case of a subprogram
+ instantiation that acts as a compilation unit.
+ (Find_Code_Unit): Reimplemented.
+ (Find_Top_Unit): Reimplemented.
+ (Find_Unit_Entity): New routine.
+ (Process_Instantiation_SPARK): Correct the elaboration requirement a
+ package instantiation imposes on a unit.
+
+2017-10-19 Bob Duff <duff@adacore.com>
+
+ * exp_ch6.adb (Is_Build_In_Place_Result_Type): Enable build-in-place
+ for a narrow set of controlled types.
+
+2017-10-19 Eric Botcazou <ebotcazou@adacore.com>
+
+ * sinput.ads (Line_Start): Add pragma Inline.
+ * widechar.ads (Is_Start_Of_Wide_Char): Likewise.
+
+2017-10-19 Bob Duff <duff@adacore.com>
+
+ * exp_attr.adb (Expand_N_Attribute_Reference): Disable
+ Make_Build_In_Place_Call_... for F(...)'Old, where F(...) is a
+ build-in-place function call so that the temp is declared in the right
+ place.
+
+2017-10-18 Eric Botcazou <ebotcazou@adacore.com>
+
+ * gcc-interface/misc.c (gnat_tree_size): Move around.
+
+ * gcc-interface/utils.c (max_size): Deal with SSA names.
+
+2017-10-17 Jakub Jelinek <jakub@redhat.com>
+
+ * gcc-interface/misc.c (gnat_tree_size): New function.
+ (LANG_HOOKS_TREE_SIZE): Redefine.
+
+2017-10-14 Hristian Kirtchev <kirtchev@adacore.com>
+
+ * sem_elab.adb (In_Preelaborated_Context): A generic package subject to
+	Remote_Call_Interface is not a suitable preelaborated context when the
+ call appears in the package body.
+
+2017-10-14 Eric Botcazou <ebotcazou@adacore.com>
+
+ * layout.ads (Set_Elem_Alignment): Add Align parameter defaulted to 0.
+ * layout.adb (Set_Elem_Alignment): Likewise. Use M name as maximum
+ alignment for consistency. If Align is non-zero, use the minimum of
+ Align and M for the alignment.
+ * cstand.adb (Build_Float_Type): Use Set_Elem_Alignment instead of
+ setting the alignment directly.
+
+2017-10-14 Ed Schonberg <schonberg@adacore.com>
+
+ * sem_prag.adb (Analyze_Pragma, case Check): Defer evaluation of the
+ optional string in an Assert pragma until the expansion of the pragma
+ has rewritten it as a conditional statement, so that the string
+	argument is only evaluated if the assertion fails. This is mandated by
+ RM 11.4.2.
+
+2017-10-14 Hristian Kirtchev <kirtchev@adacore.com>
+
+ * debug.adb: Switch -gnatd.v and associated flag are now used to
+ enforce the SPARK rules for elaboration in SPARK code.
+ * sem_elab.adb: Describe switch -gnatd.v.
+ (Process_Call): Verify the SPARK rules only when -gnatd.v is in effect.
+ (Process_Instantiation): Verify the SPARK rules only when -gnatd.v is
+ in effect.
+ (Process_Variable_Assignment): Clarify why variable assignments are
+	processed regardless of whether -gnatd.v is in effect.
+ * doc/gnat_ugn/elaboration_order_handling_in_gnat.rst: Update the
+ sections on elaboration code and compilation switches.
+ * gnat_ugn.texi: Regenerate.
+
+2017-10-14 Gary Dismukes <dismukes@adacore.com>
+
+ * exp_util.adb, freeze.adb, sem_aggr.adb, sem_util.ads, sem_util.adb,
+ sem_warn.adb: Minor reformattings.
+
+2017-10-14 Ed Schonberg <schonberg@adacore.com>
+
+ * doc/gnat_rm/implementation_defined_aspects.rst: Add documentation
+ for reverse iteration over formal containers.
+ * gnat_rm.texi: Regenerate.
+
+2017-10-14 Hristian Kirtchev <kirtchev@adacore.com>
+
+ * sem_elab.adb (Ensure_Dynamic_Prior_Elaboration): Renamed to
+ Ensure_Prior_Elaboration_Dynamic for consistency reasons.
+ (Ensure_Static_Prior_Elaboration): Renamed to
+ Ensure_Prior_Elaboration_Static for consistency reasons.
+ (Info_Variable_Reference): Renamed to Info_Variable_Read in order to
+ reflect its new purpose.
+ (Is_Initialized): New routine.
+ (Is_Suitable_Variable_Reference): Renamed to Is_Suitable_Variable_Read
+ in order to reflect its new purpose.
+ (Is_Variable_Read): New routine.
+ (Output_Variable_Reference): Renamed to Output_Variable_Read in order
+ to reflect its new purpose.
+ (Process_Variable_Assignment): This routine now acts as a top level
+ dispatcher for variable assignments.
+ (Process_Variable_Assignment_Ada): New routine.
+ (Process_Variable_Assignment_SPARK): New routine.
+ (Process_Variable_Reference): Renamed to Process_Variable_Read in order
+ to reflects its new purpose. A reference to a variable is now suitable
+	to reflect its new purpose. A reference to a variable is now suitable
+ reflects the latest SPARK elaboration rules.
+
+2017-10-14 Justin Squirek <squirek@adacore.com>
+
+ * sem_ch8.adb (Analyze_Subprogram_Renaming): Modify condition that
+ triggers marking on formal subprograms.
+
+2017-10-14 Javier Miranda <miranda@adacore.com>
+
+ * checks.adb (Ensure_Valid): Do not skip adding the validity check on
+ renamings of objects that come from the sources.
+
+2017-10-14 Eric Botcazou <ebotcazou@adacore.com>
+
+ * cstand.adb (Build_Float_Type): Move down Siz parameter, add Align
+ parameter and set the alignment of the type to Align.
+ (Copy_Float_Type): Adjust call to Build_Float_Type.
+ (Register_Float_Type): Add pragma Unreferenced for Precision. Adjust
+ call to Build_Float_Type and do not set RM_Size and Alignment.
+
+2017-10-14 Patrick Bernardi <bernardi@adacore.com>
+
+ * Makefile.rtl (GNATRTL_NONTASKING_OBJ): Add s-soliin to
+ GNATRTL_NONTASKING_OBJ.
+
+2017-10-14 Bob Duff <duff@adacore.com>
+
+ * exp_ch6.adb (Is_Build_In_Place_Result_Type): Include code for
+ enabling b-i-p for nonlimited controlled types (but disabled).
+
+2017-10-14 Justin Squirek <squirek@adacore.com>
+
+ * sem_elab.adb (Is_Suitable_Variable_Assignment): Replace call to
+ Has_Warnings_Off with Warnings_Off.
+
+2017-10-14 Piotr Trojanek <trojanek@adacore.com>
+
+ * sinfo.ads (Generic_Parent): Remove wrong (possibly obsolete) comment.
+
+2017-10-14 Hristian Kirtchev <kirtchev@adacore.com>
+
+ * sem_ch3.adb (Analyze_Declarations): Analyze the contract of an
+ enclosing package at the end of the visible declarations.
+ * sem_prag.adb (Analyze_Initialization_Item): Suppress the analysis of
+ an initialization item which is undefined due to some illegality.
+
+2017-10-14 Patrick Bernardi <bernardi@adacore.com>
+
+	* ali.adb: Add new ALI line 'T' to read the number of tasks contained
+ within each unit that require a default-sized primary and secondary
+ stack to be generated by the binder.
+ (Scan_ALI): Scan new 'T' lines.
+ * ali.ads: Add Primary_Stack_Count and Sec_Stack_Count to Unit_Record.
+ * bindgen.adb (Gen_Output_File): Count the number of default-sized
+ stacks within the closure that are to be created by the binder.
+ (Gen_Adainit, Gen_Output_File_Ada): Generate default-sized secondary
+	stacks and record these in System.Secondary_Stack.
+ (Resolve_Binder_Options): Check if System.Secondary_Stack is in the
+ closure of the program being bound.
+	* bindusg.adb (Display): Add "-Q" switch. Remove rogue "--RTS" comment.
+ * exp_ch3.adb (Count_Default_Sized_Task_Stacks): New routine.
+ (Expand_N_Object_Declaration): Count the number of default-sized stacks
+ used by task objects contained within the object whose declaration is
+ being expanded. Only performed when either the restrictions
+ No_Implicit_Heap_Allocations or No_Implicit_Task_Allocations are in
+ effect.
+ * exp_ch9.adb (Create_Secondary_Stack_For_Task): New routine.
+ (Expand_N_Task_Type_Declaration): Create a secondary stack as part of
+ the expansion of a task type if the size of the stack is known at
+ run-time and the restrictions No_Implicit_Heap_Allocations or
+ No_Implicit_Task_Allocations are in effect.
+ (Make_Task_Create_Call): If using a restricted profile provide
+ secondary stack parameter: either the statically created stack or null.
+ * lib-load.adb (Create_Dummy_Package_Unit, Load_Unit,
+ Load_Main_Source): Include Primary_Stack_Count and Sec_Stack_Count in
+ Unit_Record initialization expressions.
+ * lib-writ.adb (Add_Preprocessing_Dependency,
+ Ensure_System_Dependency): Include Primary_Stack_Count and
+ Sec_Stack_Count in Unit_Record initialization expression.
+ (Write_ALI): Write T lines.
+ (Write_Unit_Information): Do not output 'T' lines if there are no
+ stacks for the binder to generate.
+ * lib-writ.ads: Updated library information documentation to include
+ new T line entry.
+ * lib.adb (Increment_Primary_Stack_Count): New routine.
+ (Increment_Sec_Stack_Count): New routine.
+ (Primary_Stack_Count): New routine.
+ (Sec_Stack_Count): New routine.
+ * lib.ads: Add Primary_Stack_Count and Sec_Stack_Count components to
+ Unit_Record and updated documentation.
+ (Increment_Primary_Stack_Count): New routine along with pragma Inline.
+ (Increment_Sec_Stack_Count): New routine along with pragma Inline.
+ (Primary_Stack_Count): New routine along with pragma Inline.
+ (Sec_Stack_Count): New routine along with pragma Inline.
+ * opt.ads: New constant No_Stack_Size. Flag Default_Stack_Size
+ redefined. New flags Default_Sec_Stack_Size and
+ Quantity_Of_Default_Size_Sec_Stacks.
+ * rtfinal.c: Fixed erroneous comment.
+ * rtsfind.ads: Moved RE_Default_Secondary_Stack_Size from
+ System.Secondary_Stack to System.Parameters. Add RE_SS_Stack.
+ * sem_util.adb (Number_Of_Elements_In_Array): New routine.
+ * sem_util.ads (Number_Of_Elements_In_Array): New routine.
+ * switch-b.adb (Scan_Binder_Switches): Scan "-Q" switch.
+ * libgnarl/s-solita.adb (Get_Sec_Stack_Addr): Removed routine.
+ (Set_Sec_Stack_Addr): Removed routine.
+ (Get_Sec_Stack): New routine.
+ (Set_Sec_Stack): New routine.
+ (Init_Tasking_Soft_Links): Update System.Soft_Links reference to
+ reflect new procedure and global names.
+ * libgnarl/s-taprop__linux.adb, libgnarl/s-taprop__mingw.adb,
+ libgnarl/s-taprop__posix.adb, libgnarl/s-taprop__solaris.adb,
+ libgnarl/s-taprop__vxworks.adb (Register_Foreign_Thread): Update
+ parameter profile to allow the secondary stack size to be specified.
+ * libgnarl/s-tarest.adb (Create_Restricted_Task): Update the parameter
+ profile to include Sec_Stack_Address. Update Tasking.Initialize_ATCB
+ call to remove Secondary_Stack_Size reference. Add secondary stack
+ address and size to SSL.Create_TSD call.
+ (Task_Wrapper): Remove secondary stack creation.
+ * libgnarl/s-tarest.ads (Create_Restricted_Task,
+ Create_Restricted_Task_Sequential): Update parameter profile to include
+ Sec_Stack_Address and clarify the Size parameter.
+ * libgnarl/s-taskin.adb (Initialize_ATCB): Remove Secondary_Stack_Size
+ from profile and body.
+ (Initialize): Remove Secondary_Stack_Size from Initialize_ATCB call.
+ * libgnarl/s-taskin.ads: Removed component Secondary_Stack_Size from
+ Common_ATCB.
+ (Initialize_ATCB): Update the parameter profile to remove
+ Secondary_Stack_Size.
+ * libgnarl/s-tassta.adb (Create_Task): Updated parameter profile and
+ call to Initialize_ATCB. Add secondary stack address and size to
+ SSL.Create_TSD call, and catch any storage exception from the call.
+ (Finalize_Global_Tasks): Update System.Soft_Links references to reflect
+ new subprogram and component names.
+ (Task_Wrapper): Remove secondary stack creation.
+ (Vulnerable_Complete_Master): Update to reflect TSD changes.
+ * libgnarl/s-tassta.ads: Reformat comments.
+ (Create_Task): Update parameter profile.
+ * libgnarl/s-tporft.adb (Register_Foreign_Thread): Update parameter
+ profile to include the secondary stack size. Remove the secondary
+ stack size parameter from the Initialize_ATCB call and add it to the
+ Create_TSD call.
+ * libgnat/s-parame.adb, libgnat/s-parame__rtems.adb,
+ libgnat/s-parame__vxworks.adb (Default_Sec_Stack_Size): New routine.
+ * libgnat/s-parame.ads, libgnat/s-parame__ae653.ads,
+ libgnat/s-parame__hpux.ads, libgnat/s-parame__vxworks.ads: Remove type
+ Percentage. Remove constants Dynamic, Sec_Stack_Percentage and
+ Sec_Stack_Dynamic. Add constants Runtime_Default_Sec_Stack_Size and
+ Sec_Stack_Dynamic.
+ (Default_Sec_Stack_Size): New routine.
+ * libgnat/s-secsta.adb, libgnat/s-secsta.ads: New implementation. Is
+ now Preelaborate.
+ * libgnat/s-soflin.adb: Removed unused with-clauses. With
+ System.Soft_Links.Initialize to initialize non-tasking TSD.
+ (Create_TSD): Update parameter profile. Initialize the TSD and
+ unconditionally call SS_Init.
+ (Destroy_TSD): Update SST.SS_Free call.
+ (Get_Sec_Stack_Addr_NT, Get_Sec_Stack_Addr_Soft, Set_Sec_Stack_Addr_NT,
+ Set_Sec_Stack_Addr_Soft): Remove routines.
+ (Get_Sec_Stack_NT, Get_Sec_Stack_Soft, Set_Sec_Stack_NT,
+ Set_Sec_Stack_Soft): Add routines.
+ (NT_TSD): Move to private part of package specification.
+ * libgnat/s-soflin.ads: New types Get_Stack_Call and Set_Stack_Call
+ with suppressed access checks. Renamed *_Sec_Stack_Addr_* routines and
+ objects to *_Sec_Stack_*. TSD: removed warning suppression and
+ component initialization. Changed Sec_Stack_Addr to Sec_Stack_Ptr.
+ (Create_TSD): Update parameter profile.
+ (NT_TSD): Move to private section from body.
+ * libgnat/s-soliin.adb, libgnat/s-soliin.ads: New files.
+ * libgnat/s-thread.ads (Thread_Body_Enter): Update parameter profile.
+ * libgnat/s-thread__ae653.adb (Get_Sec_Stack_Addr, Set_Sec_Stack_Addr):
+ Remove routines.
+ (Get_Sec_Stack, Set_Sec_Stack): Add routines.
+ (Thread_Body_Enter): Update parameter profile and body to adapt to new
+ System.Secondary_Stack.
+ (Init_RTS): Update body for new System.Soft_Links names.
+ * gcc-interface/Make-lang.in (GNAT_ADA_OBJS, GNATBIND_OBJS): Add
+ s-soliin.o.
+
2017-10-10 Richard Sandiford <richard.sandiford@linaro.org>
* gcc-interface/decl.c (annotate_value): Use wi::to_wide when
diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 021da824c0d..ed43ae5273c 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -659,6 +659,7 @@ GNATRTL_NONTASKING_OBJS= \
s-sequio$(objext) \
s-shasto$(objext) \
s-soflin$(objext) \
+ s-soliin$(objext) \
s-spsufi$(objext) \
s-stache$(objext) \
s-stalib$(objext) \
diff --git a/gcc/ada/ali.adb b/gcc/ada/ali.adb
index 2b1d472baba..959b3058728 100644
--- a/gcc/ada/ali.adb
+++ b/gcc/ada/ali.adb
@@ -58,6 +58,7 @@ package body ALI is
'Z' => True, -- implicit with from instantiation
'C' => True, -- SCO information
'F' => True, -- SPARK cross-reference information
+ 'T' => True, -- task stack information
others => False);
--------------------
@@ -842,7 +843,7 @@ package body ALI is
if Read_Xref then
Ignore :=
- ('U' | 'W' | 'Y' | 'Z' | 'D' | 'X' => False, others => True);
+ ('T' | 'U' | 'W' | 'Y' | 'Z' | 'D' | 'X' => False, others => True);
-- Read_Lines parameter given
@@ -1744,6 +1745,8 @@ package body ALI is
UL.Elaborate_Body_Desirable := False;
UL.Optimize_Alignment := 'O';
UL.Has_Finalizer := False;
+ UL.Primary_Stack_Count := 0;
+ UL.Sec_Stack_Count := 0;
if Debug_Flag_U then
Write_Str (" ----> reading unit ");
@@ -2096,6 +2099,28 @@ package body ALI is
Units.Table (Units.Last).Last_With := Withs.Last;
Units.Table (Units.Last).Last_Arg := Args.Last;
+ -- Scan out task stack information for the unit if present
+
+ Check_Unknown_Line;
+
+ if C = 'T' then
+ if Ignore ('T') then
+ Skip_Line;
+
+ else
+ Checkc (' ');
+ Skip_Space;
+
+ Units.Table (Units.Last).Primary_Stack_Count := Get_Nat;
+ Skip_Space;
+ Units.Table (Units.Last).Sec_Stack_Count := Get_Nat;
+ Skip_Space;
+ Skip_Eol;
+ end if;
+
+ C := Getc;
+ end if;
+
-- If there are linker options lines present, scan them
Name_Len := 0;
diff --git a/gcc/ada/ali.ads b/gcc/ada/ali.ads
index e15a1c455bd..3fa4d99fb09 100644
--- a/gcc/ada/ali.ads
+++ b/gcc/ada/ali.ads
@@ -388,11 +388,19 @@ package ALI is
-- together as possible.
Optimize_Alignment : Character;
- -- Optimize_Alignment setting. Set to L/S/T/O for OL/OS/OT/OO present
+ -- Optimize_Alignment setting. Set to L/S/T/O for OL/OS/OT/OO present.
Has_Finalizer : Boolean;
-- Indicates whether a package body or a spec has a library-level
-- finalization routine.
+
+ Primary_Stack_Count : Int;
+ -- Indicates the number of task objects declared in this unit that have
+ -- default sized primary stacks.
+
+ Sec_Stack_Count : Int;
+ -- Indicates the number of task objects declared in this unit that have
+ -- default sized secondary stacks.
end record;
package Units is new Table.Table (
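[Editorial illustration; not part of the patch.] Based on the Scan_ALI changes above (Checkc (' ') followed by two Get_Nat calls), a unit whose declarations need two default-sized primary stacks and one default-sized secondary stack would carry an ALI line of roughly the form

   T 2 1

with the first value read into Primary_Stack_Count and the second into Sec_Stack_Count. The authoritative description of the line is the library information documentation updated in lib-writ.ads by this patch.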
diff --git a/gcc/ada/bindgen.adb b/gcc/ada/bindgen.adb
index a9ea20ebd9b..e3d875bc8cc 100644
--- a/gcc/ada/bindgen.adb
+++ b/gcc/ada/bindgen.adb
@@ -59,6 +59,14 @@ package body Bindgen is
Num_Elab_Calls : Nat := 0;
-- Number of generated calls to elaboration routines
+ Num_Primary_Stacks : Int := 0;
+ -- Number of default-sized primary stacks the binder needs to allocate for
+ -- task objects declared in the program.
+
+ Num_Sec_Stacks : Int := 0;
+ -- Number of default-sized secondary stacks the binder needs to allocate
+ -- for task objects declared in the program.
+
System_Restrictions_Used : Boolean := False;
-- Flag indicating whether the unit System.Restrictions is in the closure
-- of the partition. This is set by Resolve_Binder_Options, and is used
@@ -74,6 +82,12 @@ package body Bindgen is
-- domains just before calling the main procedure from the environment
-- task.
+ System_Secondary_Stack_Used : Boolean := False;
+ -- Flag indicating whether the unit System.Secondary_Stack is in the
+ -- closure of the partition. This is set by Resolve_Binder_Options, and
+ -- is used to initialize the package in cases where the run-time brings
+ -- in the package but the secondary stack is not used.
+
System_Tasking_Restricted_Stages_Used : Boolean := False;
-- Flag indicating whether the unit System.Tasking.Restricted.Stages is in
-- the closure of the partition. This is set by Resolve_Binder_Options,
@@ -179,8 +193,11 @@ package body Bindgen is
-- Exception_Tracebacks_Symbolic : Integer;
-- Detect_Blocking : Integer;
-- Default_Stack_Size : Integer;
+ -- Default_Secondary_Stack_Size : System.Parameters.Size_Type;
-- Leap_Seconds_Support : Integer;
-- Main_CPU : Integer;
+ -- Default_Sized_SS_Pool : System.Address;
+ -- Binder_Sec_Stacks_Count : Natural;
-- Main_Priority is the priority value set by pragma Priority in the main
-- program. If no such pragma is present, the value is -1.
@@ -261,6 +278,9 @@ package body Bindgen is
-- Default_Stack_Size is the default stack size used when creating an Ada
-- task with no explicit Storage_Size clause.
+ -- Default_Secondary_Stack_Size is the default secondary stack size used
+ -- when creating an Ada task with no explicit Secondary_Stack_Size clause.
+
-- Leap_Seconds_Support denotes whether leap seconds have been enabled or
-- disabled. A value of zero indicates that leap seconds are turned "off",
-- while a value of one signifies "on" status.
@@ -268,6 +288,14 @@ package body Bindgen is
-- Main_CPU is the processor set by pragma CPU in the main program. If no
-- such pragma is present, the value is -1.
+ -- Default_Sized_SS_Pool is set to the address of the default-sized
+ -- secondary stacks array generated by the binder. This pool of stacks is
+ -- generated when either the restriction No_Implicit_Heap_Allocations
+ -- or No_Implicit_Task_Allocations is active.
+
+ -- Binder_Sec_Stacks_Count is the number of generated secondary stacks in
+ -- the Default_Sized_SS_Pool.
+
procedure WBI (Info : String) renames Osint.B.Write_Binder_Info;
-- Convenient shorthand used throughout
@@ -554,6 +582,32 @@ package body Bindgen is
WBI (" procedure Start_Slave_CPUs;");
WBI (" pragma Import (C, Start_Slave_CPUs," &
" ""__gnat_start_slave_cpus"");");
+ WBI ("");
+ end if;
+
+ -- A restricted run-time may attempt to initialize the main task's
+ -- secondary stack even if the stack is not used. Consequently,
+ -- the binder needs to initialize Binder_Sec_Stacks_Count whenever
+ -- System.Secondary_Stack is in the closure of the partition.
+
+ if System_Secondary_Stack_Used then
+ WBI (" Binder_Sec_Stacks_Count : Natural;");
+ WBI (" pragma Import (Ada, Binder_Sec_Stacks_Count, " &
+ """__gnat_binder_ss_count"");");
+ WBI ("");
+ end if;
+
+ if Sec_Stack_Used then
+ WBI (" Default_Secondary_Stack_Size : " &
+ "System.Parameters.Size_Type;");
+ WBI (" pragma Import (C, Default_Secondary_Stack_Size, " &
+ """__gnat_default_ss_size"");");
+
+ WBI (" Default_Sized_SS_Pool : System.Address;");
+ WBI (" pragma Import (Ada, Default_Sized_SS_Pool, " &
+ """__gnat_default_ss_pool"");");
+
+ WBI ("");
end if;
WBI (" begin");
@@ -588,6 +642,50 @@ package body Bindgen is
WBI (" null;");
end if;
+ -- Generate default-sized secondary stack pool and set secondary
+ -- stack globals.
+
+ if Sec_Stack_Used then
+
+ -- Elaborate the body of the binder to initialize the default-
+ -- sized secondary stack pool.
+
+ WBI ("");
+ WBI (" " & Get_Ada_Main_Name & "'Elab_Body;");
+
+ -- Generate the default-sized secondary stack pool and set the
+ -- related secondary stack globals.
+
+ Set_String (" Default_Secondary_Stack_Size := ");
+
+ if Opt.Default_Sec_Stack_Size /= Opt.No_Stack_Size then
+ Set_Int (Opt.Default_Sec_Stack_Size);
+ else
+ Set_String ("System.Parameters.Runtime_Default_Sec_Stack_Size");
+ end if;
+
+ Set_Char (';');
+ Write_Statement_Buffer;
+
+ Set_String (" Binder_Sec_Stacks_Count := ");
+ Set_Int (Num_Sec_Stacks);
+ Set_Char (';');
+ Write_Statement_Buffer;
+
+ WBI (" Default_Sized_SS_Pool := " &
+ "Sec_Default_Sized_Stacks'Address;");
+ WBI ("");
+
+ -- When a restricted run-time initializes the main task's secondary
+ -- stack but the program does not use it, no secondary stack is
+ -- generated. Binder_Sec_Stacks_Count is set to zero so the run-time
+ -- is aware that the lack of pre-allocated secondary stack is
+ -- expected.
+
+ elsif System_Secondary_Stack_Used then
+ WBI (" Binder_Sec_Stacks_Count := 0;");
+ end if;
+
-- Normal case (standard library not suppressed). Set all global values
-- used by the run time.
@@ -647,6 +745,10 @@ package body Bindgen is
WBI (" Default_Stack_Size : Integer;");
WBI (" pragma Import (C, Default_Stack_Size, " &
"""__gl_default_stack_size"");");
+ WBI (" Default_Secondary_Stack_Size : " &
+ "System.Parameters.Size_Type;");
+ WBI (" pragma Import (C, Default_Secondary_Stack_Size, " &
+ """__gnat_default_ss_size"");");
WBI (" Leap_Seconds_Support : Integer;");
WBI (" pragma Import (C, Leap_Seconds_Support, " &
"""__gl_leap_seconds_support"");");
@@ -730,6 +832,18 @@ package body Bindgen is
& """__gnat_freeze_dispatching_domains"");");
end if;
+ -- Secondary stack global variables
+
+ WBI (" Binder_Sec_Stacks_Count : Natural;");
+ WBI (" pragma Import (Ada, Binder_Sec_Stacks_Count, " &
+ """__gnat_binder_ss_count"");");
+
+ WBI (" Default_Sized_SS_Pool : System.Address;");
+ WBI (" pragma Import (Ada, Default_Sized_SS_Pool, " &
+ """__gnat_default_ss_pool"");");
+
+ WBI ("");
+
-- Start of processing for Adainit
WBI (" begin");
@@ -870,9 +984,51 @@ package body Bindgen is
WBI (" Bind_Env_Addr := Bind_Env'Address;");
end if;
- -- Generate call to Install_Handler
-
WBI ("");
+
+ -- Generate default-sized secondary stack pool and set secondary
+ -- stack globals.
+
+ if Sec_Stack_Used then
+
+ -- Elaborate the body of the binder to initialize the default-
+ -- sized secondary stack pool.
+
+ WBI (" " & Get_Ada_Main_Name & "'Elab_Body;");
+
+ -- Generate the default-sized secondary stack pool and set the
+ -- related secondary stack globals.
+
+ Set_String (" Default_Secondary_Stack_Size := ");
+
+ if Opt.Default_Sec_Stack_Size /= Opt.No_Stack_Size then
+ Set_Int (Opt.Default_Sec_Stack_Size);
+ else
+ Set_String ("System.Parameters.Runtime_Default_Sec_Stack_Size");
+ end if;
+
+ Set_Char (';');
+ Write_Statement_Buffer;
+
+ Set_String (" Binder_Sec_Stacks_Count := ");
+ Set_Int (Num_Sec_Stacks);
+ Set_Char (';');
+ Write_Statement_Buffer;
+
+ Set_String (" Default_Sized_SS_Pool := ");
+
+ if Num_Sec_Stacks > 0 then
+ Set_String ("Sec_Default_Sized_Stacks'Address;");
+ else
+ Set_String ("System.Null_Address;");
+ end if;
+
+ Write_Statement_Buffer;
+ WBI ("");
+ end if;
+
+ -- Generate call to Runtime_Initialize
+
WBI (" Runtime_Initialize (1);");
end if;
@@ -888,17 +1044,6 @@ package body Bindgen is
Write_Statement_Buffer;
end if;
- -- Generate assignment of default secondary stack size if set
-
- if Sec_Stack_Used and then Default_Sec_Stack_Size /= -1 then
- WBI ("");
- Set_String (" System.Secondary_Stack.");
- Set_String ("Default_Secondary_Stack_Size := ");
- Set_Int (Opt.Default_Sec_Stack_Size);
- Set_Char (';');
- Write_Statement_Buffer;
- end if;
-
-- Initialize stack limit variable of the environment task if the stack
-- check method is stack limit and stack check is enabled.
@@ -2044,6 +2189,26 @@ package body Bindgen is
end if;
end loop;
+ -- Count the number of statically allocated stacks to be generated by
+ -- the binder. If the user has specified the number of default-sized
+ -- secondary stacks, use that number. Otherwise start the count at one
+ -- as the binder is responsible for creating a secondary stack for the
+ -- main task.
+
+ if Opt.Quantity_Of_Default_Size_Sec_Stacks /= -1 then
+ Num_Sec_Stacks := Quantity_Of_Default_Size_Sec_Stacks;
+ elsif Sec_Stack_Used then
+ Num_Sec_Stacks := 1;
+ end if;
+
+ for J in Units.First .. Units.Last loop
+ Num_Primary_Stacks :=
+ Num_Primary_Stacks + Units.Table (J).Primary_Stack_Count;
+
+ Num_Sec_Stacks :=
+ Num_Sec_Stacks + Units.Table (J).Sec_Stack_Count;
+ end loop;
+
-- Generate output file in appropriate language
Gen_Output_File_Ada (Filename, Elab_Order);
@@ -2114,9 +2279,11 @@ package body Bindgen is
WBI ("with System.Scalar_Values;");
end if;
- -- Generate with of System.Secondary_Stack if active
+ -- Generate withs of System.Secondary_Stack and System.Parameters to
+ -- allow the generation of the default-sized secondary stack pool.
- if Sec_Stack_Used and then Default_Sec_Stack_Size /= -1 then
+ if Sec_Stack_Used then
+ WBI ("with System.Parameters;");
WBI ("with System.Secondary_Stack;");
end if;
@@ -2156,10 +2323,10 @@ package body Bindgen is
end if;
end if;
- -- Define exit status. Again in normal mode, this is in the
- -- run-time library, and is initialized there, but in the
- -- configurable runtime case, the variable is declared and
- -- initialized in this file.
+ -- Define exit status. Again in normal mode, this is in the run-time
+ -- library, and is initialized there, but in the configurable
+ -- run-time case, the variable is declared and initialized in this
+ -- file.
WBI ("");
@@ -2358,6 +2525,29 @@ package body Bindgen is
Gen_Elab_Externals (Elab_Order);
+ -- Generate default-sized secondary stacks pool. At least one stack is
+ -- created and assigned to the environment task if secondary stacks are
+ -- used by the program.
+
+ if Sec_Stack_Used then
+ Set_String (" Sec_Default_Sized_Stacks");
+ Set_String (" : array (1 .. ");
+ Set_Int (Num_Sec_Stacks);
+ Set_String (") of aliased System.Secondary_Stack.SS_Stack (");
+
+ if Opt.Default_Sec_Stack_Size /= No_Stack_Size then
+ Set_Int (Opt.Default_Sec_Stack_Size);
+ else
+ Set_String ("System.Parameters.Runtime_Default_Sec_Stack_Size");
+ end if;
+
+ Set_String (");");
+ Write_Statement_Buffer;
+ WBI ("");
+ end if;
+
+ -- Generate reference
+
if not CodePeer_Mode then
if not Suppress_Standard_Library_On_Target then
@@ -2389,8 +2579,8 @@ package body Bindgen is
if not Suppress_Standard_Library_On_Target then
- -- The B.1(39) implementation advice says that the adainit
- -- and adafinal routines should be idempotent. Generate a flag to
+ -- The B.1(39) implementation advice says that the adainit and
+ -- adafinal routines should be idempotent. Generate a flag to
-- ensure that. This is not needed if we are suppressing the
-- standard library since it would never be referenced.
@@ -2873,6 +3063,11 @@ package body Bindgen is
Check_Package (System_Restrictions_Used, "system.restrictions%s");
+ -- Ditto for the use of System.Secondary_Stack
+
+ Check_Package
+ (System_Secondary_Stack_Used, "system.secondary_stack%s");
+
-- Ditto for use of an SMP bareboard runtime
Check_Package (System_BB_CPU_Primitives_Multiprocessors_Used,
diff --git a/gcc/ada/bindusg.adb b/gcc/ada/bindusg.adb
index 6cf7710219e..7c17f939514 100644
--- a/gcc/ada/bindusg.adb
+++ b/gcc/ada/bindusg.adb
@@ -210,6 +210,11 @@ package body Bindusg is
Write_Line
(" -P Generate binder file suitable for CodePeer");
+ -- Line for Q switch
+
+ Write_Line
+ (" -Qnnn Generate nnn default-sized secondary stacks");
+
-- Line for -r switch
Write_Line
@@ -309,8 +314,6 @@ package body Bindusg is
Write_Line
(" -z No main subprogram (zero main)");
- -- Line for --RTS
-
-- Line for -Z switch
Write_Line
diff --git a/gcc/ada/checks.adb b/gcc/ada/checks.adb
index a99da08c733..b2c26ca4981 100644
--- a/gcc/ada/checks.adb
+++ b/gcc/ada/checks.adb
@@ -5940,6 +5940,10 @@ package body Checks is
-- In addition, we force a check if Force_Validity_Checks is set
elsif not Comes_From_Source (Expr)
+ and then not
+ (Nkind (Expr) = N_Identifier
+ and then Present (Renamed_Object (Entity (Expr)))
+ and then Comes_From_Source (Renamed_Object (Entity (Expr))))
and then not Force_Validity_Checks
and then (Nkind (Expr) /= N_Unchecked_Type_Conversion
or else Kill_Range_Check (Expr))
diff --git a/gcc/ada/cstand.adb b/gcc/ada/cstand.adb
index fe480beb426..e45c0542f26 100644
--- a/gcc/ada/cstand.adb
+++ b/gcc/ada/cstand.adb
@@ -62,15 +62,22 @@ package body CStand is
-----------------------
procedure Build_Float_Type
- (E : Entity_Id;
- Siz : Int;
- Rep : Float_Rep_Kind;
- Digs : Int);
+ (E : Entity_Id;
+ Digs : Int;
+ Rep : Float_Rep_Kind;
+ Siz : Int;
+ Align : Int);
-- Procedure to build standard predefined float base type. The first
- -- parameter is the entity for the type, and the second parameter is the
- -- size in bits. The third parameter indicates the kind of representation
- -- to be used. The fourth parameter is the digits value. Each type
+ -- parameter is the entity for the type. The second parameter is the
+ -- digits value. The third parameter indicates the representation to
+ -- be used for the type. The fourth parameter is the size in bits.
+ -- The fifth parameter is the alignment in storage units. Each type
-- is added to the list of predefined floating point types.
+ --
+ -- Note that both RM_Size and Esize are set to the specified size, i.e.
+ -- we do not set the RM_Size to the precision passed by the back end.
+ -- This is consistent with the semantics of 'Size specified in the RM
+ -- because we cannot pack components of the type tighter than this size.
procedure Build_Signed_Integer_Type (E : Entity_Id; Siz : Nat);
-- Procedure to build standard predefined signed integer subtype. The
@@ -189,10 +196,11 @@ package body CStand is
----------------------
procedure Build_Float_Type
- (E : Entity_Id;
- Siz : Int;
- Rep : Float_Rep_Kind;
- Digs : Int)
+ (E : Entity_Id;
+ Digs : Int;
+ Rep : Float_Rep_Kind;
+ Siz : Int;
+ Align : Int)
is
begin
Set_Type_Definition (Parent (E),
@@ -201,10 +209,10 @@ package body CStand is
Set_Ekind (E, E_Floating_Point_Type);
Set_Etype (E, E);
- Set_Float_Rep (E, Rep);
- Init_Size (E, Siz);
- Set_Elem_Alignment (E);
Init_Digits_Value (E, Digs);
+ Set_Float_Rep (E, Rep);
+ Init_Size (E, Siz);
+ Set_Elem_Alignment (E, Align);
Set_Float_Bounds (E);
Set_Is_Frozen (E);
Set_Is_Public (E);
@@ -295,8 +303,9 @@ package body CStand is
procedure Copy_Float_Type (To : Entity_Id; From : Entity_Id) is
begin
- Build_Float_Type (To, UI_To_Int (Esize (From)), Float_Rep (From),
- UI_To_Int (Digits_Value (From)));
+ Build_Float_Type
+ (To, UI_To_Int (Digits_Value (From)), Float_Rep (From),
+ UI_To_Int (Esize (From)), UI_To_Int (Alignment (From)));
end Copy_Float_Type;
----------------------
@@ -2065,15 +2074,17 @@ package body CStand is
Size : Positive;
Alignment : Natural)
is
+ pragma Unreferenced (Precision);
+ -- See Build_Float_Type for the rationale
+
Ent : constant Entity_Id := New_Standard_Entity;
begin
Set_Defining_Identifier (New_Node (N_Full_Type_Declaration, Stloc), Ent);
Make_Name (Ent, Name);
Set_Scope (Ent, Standard_Standard);
- Build_Float_Type (Ent, Int (Size), Float_Rep, Pos (Digs));
- Set_RM_Size (Ent, UI_From_Int (Int (Precision)));
- Set_Alignment (Ent, UI_From_Int (Int (Alignment / 8)));
+ Build_Float_Type
+ (Ent, Pos (Digs), Float_Rep, Int (Size), Int (Alignment / 8));
if No (Back_End_Float_Types) then
Back_End_Float_Types := New_Elmt_List;
diff --git a/gcc/ada/debug.adb b/gcc/ada/debug.adb
index 4e747203394..442ce0873e5 100644
--- a/gcc/ada/debug.adb
+++ b/gcc/ada/debug.adb
@@ -112,7 +112,7 @@ package body Debug is
-- d.s Strict secondary stack management
-- d.t Disable static allocation of library level dispatch tables
-- d.u Enable Modify_Tree_For_C (update tree for c)
- -- d.v
+ -- d.v Enforce SPARK elaboration rules in SPARK code
-- d.w Do not check for infinite loops
-- d.x No exception handlers
-- d.y Disable implicit pragma Elaborate_All on task bodies
@@ -163,7 +163,7 @@ package body Debug is
-- d.6 Do not avoid declaring unreferenced types in C code
-- d.7
-- d.8
- -- d.9 Enable build-in-place for nonlimited types
+ -- d.9 Disable build-in-place for nonlimited types
-- Debug flags for binder (GNATBIND)
@@ -600,6 +600,13 @@ package body Debug is
-- d.u Sets Modify_Tree_For_C mode in which tree is modified to make it
-- easier to generate code using a C compiler.
+ -- d.v This flag enforces the elaboration rules defined in the SPARK
+ -- Reference Manual, chapter 7.7, to all SPARK code within a unit. As
+ -- a result, constructs which violate the rules in chapter 7.7 are no
+ -- longer accepted, even if the implementation is able to statically
+ -- ensure that accepting these constructs does not introduce the
+ -- possibility of failing an elaboration check.
+
-- d.w This flag turns off the scanning of loops to detect possible
-- infinite loops.
diff --git a/gcc/ada/doc/gnat_rm/implementation_defined_aspects.rst b/gcc/ada/doc/gnat_rm/implementation_defined_aspects.rst
index be7338f7436..c6018227b06 100644
--- a/gcc/ada/doc/gnat_rm/implementation_defined_aspects.rst
+++ b/gcc/ada/doc/gnat_rm/implementation_defined_aspects.rst
@@ -302,11 +302,15 @@ Aspect Iterable
This aspect provides a light-weight mechanism for loops and quantified
expressions over container types, without the overhead imposed by the tampering
checks of standard Ada 2012 iterators. The value of the aspect is an aggregate
-with four named components: ``First``, ``Next``, ``Has_Element``, and ``Element`` (the
-last one being optional). When only 3 components are specified, only the
-``for .. in`` form of iteration over cursors is available. When all 4 components
-are specified, both this form and the ``for .. of`` form of iteration over
-elements are available. The following is a typical example of use:
+with six named components, of which the last three are optional: ``First``,
+``Next``, ``Has_Element``, ``Element``, ``Last``, and ``Previous``.
+When only the first three components are specified, only the
+``for .. in`` form of iteration over cursors is available. When ``Element``
+is specified, both this form and the ``for .. of`` form of iteration over
+elements are available. If the last two components are specified, reverse
+iteration over the container is also available (analogous to what can be done
+over predefined containers that support the Reverse_Iterator interface).
+The following is a typical example of use:
.. code-block:: ada
diff --git a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
index 046fe35a825..b6447d05dd6 100644
--- a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
+++ b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
@@ -1243,21 +1243,13 @@ Alphabetical List of All Switches
:file:`scos.adb`.
-.. index:: -fdump-xref (gcc)
-
-:switch:`-fdump-xref`
- Generates cross reference information in GLI files for C and C++ sources.
- The GLI files have the same syntax as the ALI files for Ada, and can be used
- for source navigation in IDEs and on the command line using e.g. gnatxref
- and the :switch:`--ext=gli` switch.
-
-
.. index:: -flto (gcc)
:switch:`-flto[={n}]`
Enables Link Time Optimization. This switch must be used in conjunction
- with the traditional :switch:`-Ox` switches and instructs the compiler to
- defer most optimizations until the link stage. The advantage of this
+ with the :switch:`-Ox` switches (but not with the :switch:`-gnatn` switch
+ since it is a full replacement for the latter) and instructs the compiler
+ to defer most optimizations until the link stage. The advantage of this
approach is that the compiler can do a whole-program analysis and choose
the best interprocedural optimization strategy based on a complete view
of the program, instead of a fragmentary view with the usual approach.
@@ -3898,8 +3890,8 @@ of the pragma in the :title:`GNAT_Reference_manual`).
This switch activates warnings for exception usage when pragma Restrictions
(No_Exception_Propagation) is in effect. Warnings are given for implicit or
explicit exception raises which are not covered by a local handler, and for
- exception handlers which do not cover a local raise. The default is that these
- warnings are not given.
+ exception handlers which do not cover a local raise. The default is that
+ these warnings are given for units that contain exception handlers.
:switch:`-gnatw.X`
diff --git a/gcc/ada/doc/gnat_ugn/elaboration_order_handling_in_gnat.rst b/gcc/ada/doc/gnat_ugn/elaboration_order_handling_in_gnat.rst
index d943c716d3f..c45d3fcdbee 100644
--- a/gcc/ada/doc/gnat_ugn/elaboration_order_handling_in_gnat.rst
+++ b/gcc/ada/doc/gnat_ugn/elaboration_order_handling_in_gnat.rst
@@ -133,8 +133,43 @@ Elaboration Order
=================
The sequence by which the elaboration code of all units within a partition is
-executed is referred to as **elaboration order**. The elaboration order depends
-on the following factors:
+executed is referred to as **elaboration order**.
+
+Within a single unit, elaboration code is executed in sequential order.
+
+::
+
+ package body Client is
+ Result : ... := Server.Func;
+
+ procedure Proc is
+ package Inst is new Server.Gen;
+ begin
+ Inst.Eval (Result);
+ end Proc;
+ begin
+ Proc;
+ end Client;
+
+In the example above, the elaboration order within package body ``Client`` is
+as follows:
+
+1. The object declaration of ``Result`` is elaborated.
+
+ * Function ``Server.Func`` is invoked.
+
+2. The subprogram body of ``Proc`` is elaborated.
+
+3. Procedure ``Proc`` is invoked.
+
+ * Generic unit ``Server.Gen`` is instantiated as ``Inst``.
+
+ * Instance ``Inst`` is elaborated.
+
+ * Procedure ``Inst.Eval`` is invoked.
+
+The elaboration order of all units within a partition depends on the following
+factors:
* |withed| units
@@ -571,7 +606,7 @@ elaboration order and to diagnose elaboration problems.
a partition is elaboration code. GNAT performs very few diagnostics and
generates run-time checks to verify the elaboration order of a program. This
behavior is identical to that specified by the Ada Reference Manual. The
- dynamic model is enabled with compilation switch :switch:`-gnatE`.
+ dynamic model is enabled with compiler switch :switch:`-gnatE`.
.. index:: Static elaboration model
@@ -860,7 +895,7 @@ SPARK Elaboration Model in GNAT
The SPARK model is identical to the static model in its handling of internal
targets. The SPARK model, however, requires explicit ``Elaborate`` or
``Elaborate_All`` pragmas to be present in the program when a target is
-external, and emits hard errors instead of warnings:
+external, and compiler switch :switch:`-gnatd.v` is in effect.
::
@@ -987,7 +1022,7 @@ available.
* *Switch to more permissive elaboration model*
If the compilation was performed using the static model, enable the dynamic
- model with compilation switch :switch:`-gnatE`. GNAT will no longer generate
+ model with compiler switch :switch:`-gnatE`. GNAT will no longer generate
implicit ``Elaborate`` and ``Elaborate_All`` pragmas, resulting in a behavior
identical to that specified by the Ada Reference Manual. The binder will
generate an executable program that may or may not raise ``Program_Error``,
@@ -1504,6 +1539,17 @@ the elaboration order chosen by the binder.
When this switch is in effect, GNAT will ignore ``'Access`` of an entry,
operator, or subprogram when the static model is in effect.
+.. index:: -gnatd.v (gnat)
+
+:switch:`-gnatd.v`
+ Enforce SPARK elaboration rules in SPARK code
+
+ When this switch is in effect, GNAT will enforce the SPARK rules of
+ elaboration as defined in the SPARK Reference Manual, section 7.7. As a
+ result, constructs which violate the SPARK elaboration rules are no longer
+ accepted, even if GNAT is able to statically ensure that these constructs
+ will not lead to ABE problems.
+
.. index:: -gnatd.y (gnat)
:switch:`-gnatd.y`
@@ -1558,7 +1604,7 @@ the elaboration order chosen by the binder.
- *SPARK model*
GNAT will indicate how an elaboration requirement is met by the context of
- a unit.
+ a unit. This diagnostic requires compiler switch :switch:`-gnatd.v`.
::
@@ -1612,8 +1658,8 @@ none of the binder or compiler switches. If the binder succeeds in finding an
elaboration order, then apart from possible cases involving dispatching calls
and access-to-subprogram types, the program is free of elaboration errors.
If it is important for the program to be portable to compilers other than GNAT,
-then the programmer should use compilation switch :switch:`-gnatel` and
-consider the messages about missing or implicitly created ``Elaborate`` and
+then the programmer should use compiler switch :switch:`-gnatel` and consider
+the messages about missing or implicitly created ``Elaborate`` and
``Elaborate_All`` pragmas.
If the binder reports an elaboration circularity, the programmer has several
diff --git a/gcc/ada/doc/gnat_ugn/gnat_and_program_execution.rst b/gcc/ada/doc/gnat_ugn/gnat_and_program_execution.rst
index ac45cee3305..8f9f37cc0d8 100644
--- a/gcc/ada/doc/gnat_ugn/gnat_and_program_execution.rst
+++ b/gcc/ada/doc/gnat_ugn/gnat_and_program_execution.rst
@@ -3611,20 +3611,26 @@ combine a dimensioned and dimensionless value. Thus an expression such as
``Acceleration``.
The dimensionality checks for relationals use the same rules as
-for "+" and "-"; thus
+for "+" and "-", except when comparing to a literal; thus
.. code-block:: ada
- acc > 10.0
+ acc > len
is equivalent to
.. code-block:: ada
- acc-10.0 > 0.0
+ acc-len > 0.0
+
+and is thus illegal, but
+
+ .. code-block:: ada
+
+ acc > 10.0
-and is thus illegal. Analogously a conditional expression
-requires the same dimension vector for each branch.
+is accepted with a warning. Analogously a conditional expression requires the
+same dimension vector for each branch (with no exception for literals).
The dimension vector of a type conversion :samp:`T({expr})` is defined
as follows, based on the nature of ``T``:
diff --git a/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst b/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst
index 3f5f2d64c6b..855bb8f3d4d 100644
--- a/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst
+++ b/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst
@@ -586,9 +586,9 @@ The following switches are available for ``gnatxref``:
:switch:`--ext={extension}`
Specify an alternate ali file extension. The default is ``ali`` and other
- extensions (e.g. ``gli`` for C/C++ sources when using :switch:`-fdump-xref`)
- may be specified via this switch. Note that if this switch overrides the
- default, which means that only the new extension will be considered.
+ extensions (e.g. ``gli`` for C/C++ sources) may be specified via this switch.
+ Note that this switch overrides the default, which means that only the
+ new extension will be considered.
.. index:: --RTS (gnatxref)
diff --git a/gcc/ada/einfo.ads b/gcc/ada/einfo.ads
index d20440bcbf2..bfe14fcae7c 100644
--- a/gcc/ada/einfo.ads
+++ b/gcc/ada/einfo.ads
@@ -1312,9 +1312,9 @@ package Einfo is
-- that represents an activation record pointer is an extra formal.
-- Extra_Formals (Node28)
--- Applies to subprograms and subprogram types, and also to entries
--- and entry families. Returns first extra formal of the subprogram
--- or entry. Returns Empty if there are no extra formals.
+-- Applies to subprograms, subprogram types, entries, and entry
+-- families. Returns first extra formal of the subprogram or entry.
+-- Returns Empty if there are no extra formals.
-- Finalization_Master (Node23) [root type only]
-- Defined in access-to-controlled or access-to-class-wide types. The
@@ -2756,7 +2756,7 @@ package Einfo is
-- 1) Internal entities (such as temporaries generated for the result
-- of an inlined function call or dummy variables generated for the
-- debugger). Set to indicate that they need not be initialized, even
--- when scalars are initialized or normalized;
+-- when scalars are initialized or normalized.
--
-- 2) Predefined primitives of tagged types. Set to mark that they
-- have specific properties: first they are primitives even if they
diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index 9faed933b9f..86621a4a06a 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -1251,6 +1251,7 @@ package body Exp_Aggr is
if Finalization_OK
and then not Is_Limited_Type (Comp_Typ)
+ and then not Is_Build_In_Place_Function_Call (Init_Expr)
and then not
(Is_Array_Type (Comp_Typ)
and then Is_Controlled (Component_Type (Comp_Typ))
diff --git a/gcc/ada/exp_attr.adb b/gcc/ada/exp_attr.adb
index 719699566e4..70d39b7a916 100644
--- a/gcc/ada/exp_attr.adb
+++ b/gcc/ada/exp_attr.adb
@@ -1756,7 +1756,18 @@ package body Exp_Attr is
-- and access to it must be passed to the function.
if Is_Build_In_Place_Function_Call (Pref) then
- Make_Build_In_Place_Call_In_Anonymous_Context (Pref);
+
+ -- If attribute is 'Old, the context is a postcondition, and
+ -- the temporary must go in the corresponding subprogram, not
+ -- the postcondition function or any created blocks, as when
+ -- the attribute appears in a quantified expression. This is
+ -- handled below in the expansion of the attribute.
+
+ if Attribute_Name (Parent (Pref)) = Name_Old then
+ null;
+ else
+ Make_Build_In_Place_Call_In_Anonymous_Context (Pref);
+ end if;
-- Ada 2005 (AI-318-02): Specialization of the previous case for prefix
-- containing build-in-place function calls whose returned object covers
diff --git a/gcc/ada/exp_ch11.adb b/gcc/ada/exp_ch11.adb
index 8711c89d0eb..7941cbd2ca6 100644
--- a/gcc/ada/exp_ch11.adb
+++ b/gcc/ada/exp_ch11.adb
@@ -6,7 +6,7 @@
-- --
-- B o d y --
-- --
--- Copyright (C) 1992-2016, Free Software Foundation, Inc. --
+-- Copyright (C) 1992-2017, Free Software Foundation, Inc. --
-- --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
@@ -64,7 +64,7 @@ package body Exp_Ch11 is
procedure Warn_If_No_Propagation (N : Node_Id);
-- Called for an exception raise that is not a local raise (and thus can
- -- not be optimized to a goto. Issues warning if No_Exception_Propagation
+ -- not be optimized to a goto). Issues warning if No_Exception_Propagation
-- restriction is set. N is the node for the raise or equivalent call.
---------------------------
@@ -998,15 +998,10 @@ package body Exp_Ch11 is
-- if a source generated handler was not the target of a local raise.
else
- if Restriction_Active (No_Exception_Propagation)
- and then not Has_Local_Raise (Handler)
+ if not Has_Local_Raise (Handler)
and then Comes_From_Source (Handler)
- and then Warn_On_Non_Local_Exception
then
- Warn_No_Exception_Propagation_Active (Handler);
- Error_Msg_N
- ("\?X?this handler can never be entered, "
- & "and has been removed", Handler);
+ Warn_If_No_Local_Raise (Handler);
end if;
if No_Exception_Propagation_Active then
@@ -1859,8 +1854,12 @@ package body Exp_Ch11 is
-- Otherwise, if the No_Exception_Propagation restriction is active
-- and the warning is enabled, generate the appropriate warnings.
+ -- ??? Do not do it for the Call_Marker nodes inserted by the ABE
+ -- mechanism because this generates too many false positives.
+
elsif Warn_On_Non_Local_Exception
and then Restriction_Active (No_Exception_Propagation)
+ and then Nkind (N) /= N_Call_Marker
then
Warn_No_Exception_Propagation_Active (N);
@@ -2155,6 +2154,22 @@ package body Exp_Ch11 is
end Get_RT_Exception_Name;
----------------------------
+ -- Warn_If_No_Local_Raise --
+ ----------------------------
+
+ procedure Warn_If_No_Local_Raise (N : Node_Id) is
+ begin
+ if Restriction_Active (No_Exception_Propagation)
+ and then Warn_On_Non_Local_Exception
+ then
+ Warn_No_Exception_Propagation_Active (N);
+
+ Error_Msg_N
+ ("\?X?this handler can never be entered, and has been removed", N);
+ end if;
+ end Warn_If_No_Local_Raise;
+
+ ----------------------------
-- Warn_If_No_Propagation --
----------------------------
diff --git a/gcc/ada/exp_ch11.ads b/gcc/ada/exp_ch11.ads
index cdd53de626e..99efdeb2305 100644
--- a/gcc/ada/exp_ch11.ads
+++ b/gcc/ada/exp_ch11.ads
@@ -6,7 +6,7 @@
-- --
-- S p e c --
-- --
--- Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+-- Copyright (C) 1992-2017, Free Software Foundation, Inc. --
-- --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
@@ -90,4 +90,9 @@ package Exp_Ch11 is
-- is a local handler marking that it has a local raise. E is the entity
-- of the corresponding exception.
+ procedure Warn_If_No_Local_Raise (N : Node_Id);
+ -- Called for an exception handler that is not the target of a local raise.
+ -- Issues warning if No_Exception_Propagation restriction is set. N is the
+ -- node for the handler.
+
end Exp_Ch11;
diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index 29e79dcead9..043a02c64ba 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -43,6 +43,7 @@ with Exp_Tss; use Exp_Tss;
with Exp_Util; use Exp_Util;
with Freeze; use Freeze;
with Ghost; use Ghost;
+with Lib; use Lib;
with Namet; use Namet;
with Nlists; use Nlists;
with Nmake; use Nmake;
@@ -5580,6 +5581,15 @@ package body Exp_Ch3 is
-- arithmetic might yield a meaningless value for the length of the
-- array, or its corresponding attribute.
+ procedure Count_Default_Sized_Task_Stacks
+ (Typ : Entity_Id;
+ Pri_Stacks : out Int;
+ Sec_Stacks : out Int);
+ -- Count the number of default-sized primary and secondary task stacks
+ -- required for task objects contained within type Typ. If the number of
+ -- task objects contained within the type is not known at compile time,
+ -- the procedure returns stack counts of zero.
+
procedure Default_Initialize_Object (After : Node_Id);
-- Generate all default initialization actions for object Def_Id. Any
-- new code is inserted after node After.
@@ -5772,6 +5782,119 @@ package body Exp_Ch3 is
end if;
end Check_Large_Modular_Array;
+ -------------------------------------
+ -- Count_Default_Sized_Task_Stacks --
+ -------------------------------------
+
+ procedure Count_Default_Sized_Task_Stacks
+ (Typ : Entity_Id;
+ Pri_Stacks : out Int;
+ Sec_Stacks : out Int)
+ is
+ Component : Entity_Id;
+
+ begin
+ -- To calculate the number of default-sized task stacks required for
+ -- an object of Typ, a depth-first recursive traversal of the AST
+ -- from the Typ entity node is undertaken. Only type nodes containing
+ -- task objects are visited.
+
+ Pri_Stacks := 0;
+ Sec_Stacks := 0;
+
+ if not Has_Task (Typ) then
+ return;
+ end if;
+
+ case Ekind (Typ) is
+ when E_Task_Subtype
+ | E_Task_Type
+ =>
+ -- A task type is found marking the bottom of the descent. If
+ -- the type has no representation aspect for the corresponding
+ -- stack then that stack is using the default size.
+
+ if Present (Get_Rep_Item (Typ, Name_Storage_Size)) then
+ Pri_Stacks := 0;
+ else
+ Pri_Stacks := 1;
+ end if;
+
+ if Present (Get_Rep_Item (Typ, Name_Secondary_Stack_Size)) then
+ Sec_Stacks := 0;
+ else
+ Sec_Stacks := 1;
+ end if;
+
+ when E_Array_Subtype
+ | E_Array_Type
+ =>
+ -- First find the number of default stacks contained within an
+ -- array component.
+
+ Count_Default_Sized_Task_Stacks
+ (Component_Type (Typ),
+ Pri_Stacks,
+ Sec_Stacks);
+
+ -- Then multiply the result by the size of the array
+
+ declare
+ Quantity : constant Int := Number_Of_Elements_In_Array (Typ);
+ -- Number_Of_Elements_In_Array is non-trivial, so its
+ -- result is captured here as an optimization.
+
+ begin
+ Pri_Stacks := Pri_Stacks * Quantity;
+ Sec_Stacks := Sec_Stacks * Quantity;
+ end;
+
+ when E_Protected_Subtype
+ | E_Protected_Type
+ | E_Record_Subtype
+ | E_Record_Type
+ =>
+ Component := First_Component_Or_Discriminant (Typ);
+
+ -- Recursively descend each component of the composite type
+ -- looking for tasks, but only if the component is marked as
+ -- having a task.
+
+ while Present (Component) loop
+ if Has_Task (Etype (Component)) then
+ declare
+ P : Int;
+ S : Int;
+
+ begin
+ Count_Default_Sized_Task_Stacks
+ (Etype (Component), P, S);
+ Pri_Stacks := Pri_Stacks + P;
+ Sec_Stacks := Sec_Stacks + S;
+ end;
+ end if;
+
+ Next_Component_Or_Discriminant (Component);
+ end loop;
+
+ when E_Limited_Private_Subtype
+ | E_Limited_Private_Type
+ | E_Record_Subtype_With_Private
+ | E_Record_Type_With_Private
+ =>
+ -- Switch to the full view of the private type to continue
+ -- search.
+
+ Count_Default_Sized_Task_Stacks
+ (Full_View (Typ), Pri_Stacks, Sec_Stacks);
+
+ -- Other types should not contain tasks
+
+ when others =>
+ raise Program_Error;
+ end case;
+ end Count_Default_Sized_Task_Stacks;
+
-------------------------------
-- Default_Initialize_Object --
-------------------------------
@@ -6198,6 +6321,37 @@ package body Exp_Ch3 is
Check_Large_Modular_Array;
+ -- If No_Implicit_Heap_Allocations or No_Implicit_Task_Allocations
+ -- restrictions are active then default-sized secondary stacks are
+ -- generated by the binder and allocated by SS_Init. To provide the
+ -- binder with the number of stacks to generate, the number of
+ -- default-sized stacks required for task objects contained within the
+ -- object declaration N is calculated here, since it is at this point
+ -- that unconstrained types become constrained. The result is stored
+ -- in the enclosing unit's Unit_Record.
+
+ -- Note if N is an array object declaration that has an initialization
+ -- expression, a second object declaration for the initialization
+ -- expression is created by the compiler. To prevent double counting
+ -- of the stacks in this scenario, the stacks of the first array are
+ -- not counted.
+
+ if Has_Task (Typ)
+ and then not Restriction_Active (No_Secondary_Stack)
+ and then (Restriction_Active (No_Implicit_Heap_Allocations)
+ or else Restriction_Active (No_Implicit_Task_Allocations))
+ and then not (Ekind_In (Ekind (Typ), E_Array_Type, E_Array_Subtype)
+ and then (Has_Init_Expression (N)))
+ then
+ declare
+ PS_Count, SS_Count : Int := 0;
+ begin
+ Count_Default_Sized_Task_Stacks (Typ, PS_Count, SS_Count);
+ Increment_Primary_Stack_Count (PS_Count);
+ Increment_Sec_Stack_Count (SS_Count);
+ end;
+ end if;
+
-- Default initialization required, and no expression present
if No (Expr) then
@@ -6649,15 +6803,7 @@ package body Exp_Ch3 is
-- adjustment is required if we are going to rewrite the object
-- declaration into a renaming declaration.
- if Is_Build_In_Place_Result_Type (Typ)
- and then Nkind (Parent (N)) = N_Extended_Return_Statement
- and then
- not Is_Definite_Subtype (Etype (Return_Applies_To
- (Return_Statement_Entity (Parent (N)))))
- then
- null;
-
- elsif Needs_Finalization (Typ)
+ if Needs_Finalization (Typ)
and then not Is_Limited_View (Typ)
and then not Rewrite_As_Renaming
then
diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
index 770341ce9eb..abf6d635451 100644
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -1069,12 +1069,15 @@ package body Exp_Ch4 is
-- object can be limited but not inherently limited if this allocator
-- came from a return statement (we're allocating the result on the
-- secondary stack). In that case, the object will be moved, so we do
- -- want to Adjust.
+ -- want to Adjust. However, if it's a nonlimited build-in-place
+ -- function call, Adjust is not wanted.
if Needs_Finalization (DesigT)
and then Needs_Finalization (T)
and then not Aggr_In_Place
and then not Is_Limited_View (T)
+ and then not Alloc_For_BIP_Return (N)
+ and then not Is_Build_In_Place_Function_Call (Expression (N))
then
-- An unchecked conversion is needed in the classwide case because
-- the designated type can be an ancestor of the subtype mark of
@@ -5561,6 +5564,7 @@ package body Exp_Ch4 is
declare
Cnn : constant Entity_Id := Make_Temporary (Loc, 'C', N);
Ptr_Typ : constant Entity_Id := Make_Temporary (Loc, 'A');
+
begin
-- Generate:
-- type Ann is access all Typ;
@@ -5638,6 +5642,7 @@ package body Exp_Ch4 is
then
declare
Cnn : constant Node_Id := Make_Temporary (Loc, 'C', N);
+
begin
Insert_Action (N,
Make_Object_Declaration (Loc,
@@ -5678,6 +5683,7 @@ package body Exp_Ch4 is
declare
Cnn : constant Node_Id := Make_Temporary (Loc, 'C', N);
+
begin
Decl :=
Make_Object_Declaration (Loc,
diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index 6c27741d37c..bca7e5deae4 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -712,7 +712,8 @@ package body Exp_Ch6 is
Stmt := First (Stmts);
while Present (Stmt) loop
if Nkind (Stmt) = N_Block_Statement then
- Replace_Returns (Param_Id, Statements (Stmt));
+ Replace_Returns (Param_Id,
+ Statements (Handled_Statement_Sequence (Stmt)));
elsif Nkind (Stmt) = N_Case_Statement then
declare
@@ -5145,11 +5146,19 @@ package body Exp_Ch6 is
Set_No_Initialization (Heap_Allocator);
end if;
+ -- Set the flag indicating that the allocator came from
+ -- a build-in-place return statement, so we can avoid
+ -- adjusting the allocated object. Note that this flag
+ -- will be inherited by the copies made below.
+
+ Set_Alloc_For_BIP_Return (Heap_Allocator);
+
-- The Pool_Allocator is just like the Heap_Allocator,
-- except we set Storage_Pool and Procedure_To_Call so
-- it will use the user-defined storage pool.
Pool_Allocator := New_Copy_Tree (Heap_Allocator);
+ pragma Assert (Alloc_For_BIP_Return (Pool_Allocator));
-- Do not generate the renaming of the build-in-place
-- pool parameter on ZFP because the parameter is not
@@ -5191,6 +5200,7 @@ package body Exp_Ch6 is
else
SS_Allocator := New_Copy_Tree (Heap_Allocator);
+ pragma Assert (Alloc_For_BIP_Return (SS_Allocator));
-- The heap and pool allocators are marked as
-- Comes_From_Source since they correspond to an
@@ -7239,8 +7249,68 @@ package body Exp_Ch6 is
if Is_Limited_View (Typ) then
return Ada_Version >= Ada_2005 and then not Debug_Flag_Dot_L;
+
else
- return Debug_Flag_Dot_9;
+ if Debug_Flag_Dot_9 then
+ return False;
+ end if;
+
+ if Has_Interfaces (Typ) then
+ return False;
+ end if;
+
+ declare
+ T : Entity_Id := Typ;
+ begin
+ -- For T'Class, return True if it's True for T. This is necessary
+ -- because a class-wide function might say "return F (...)", where
+ -- F returns the corresponding specific type. We need a loop in
+ -- case T is a subtype of a class-wide type.
+
+ while Is_Class_Wide_Type (T) loop
+ T := Etype (T);
+ end loop;
+
+ -- If this is a generic formal type in an instance, return True if
+ -- it's True for the generic actual type.
+
+ if Nkind (Parent (T)) = N_Subtype_Declaration
+ and then Present (Generic_Parent_Type (Parent (T)))
+ then
+ T := Entity (Subtype_Indication (Parent (T)));
+
+ if Present (Full_View (T)) then
+ T := Full_View (T);
+ end if;
+ end if;
+
+ if Present (Underlying_Type (T)) then
+ T := Underlying_Type (T);
+ end if;
+
+ declare
+ Result : Boolean;
+ -- So we can stop here in the debugger
+ begin
+ -- ???For now, enable build-in-place for a very narrow set of
+ -- controlled types. Change "if True" to "if False" to
+ -- experiment with more controlled types. Eventually, we would
+ -- like to enable build-in-place for all tagged types, all
+ -- types that need finalization, and all caller-unknown-size
+ -- types.
+
+ if True then
+ Result := Is_Controlled (T)
+ and then Present (Enclosing_Subprogram (T))
+ and then not Is_Compilation_Unit (Enclosing_Subprogram (T))
+ and then Ekind (Enclosing_Subprogram (T)) = E_Procedure;
+ else
+ Result := Is_Controlled (T);
+ end if;
+
+ return Result;
+ end;
+ end;
end if;
end Is_Build_In_Place_Result_Type;
@@ -7326,7 +7396,12 @@ package body Exp_Ch6 is
raise Program_Error;
end if;
- return Is_Build_In_Place_Function (Function_Id);
+ declare
+ Result : constant Boolean := Is_Build_In_Place_Function (Function_Id);
+ -- So we can stop here in the debugger
+ begin
+ return Result;
+ end;
end Is_Build_In_Place_Function_Call;
-----------------------
diff --git a/gcc/ada/exp_ch9.adb b/gcc/ada/exp_ch9.adb
index aca0c18e3b6..063b812f9bc 100644
--- a/gcc/ada/exp_ch9.adb
+++ b/gcc/ada/exp_ch9.adb
@@ -339,6 +339,14 @@ package body Exp_Ch9 is
-- same parameter names and the same resolved types, but with new entities
-- for the formals.
+ function Create_Secondary_Stack_For_Task (T : Node_Id) return Boolean;
+ -- Return whether a secondary stack for the task T should be created by the
+ -- expander. The secondary stack for a task will be created by the expander
+ -- if the size of the stack has been specified by the Secondary_Stack_Size
+ -- representation aspect and either the No_Implicit_Heap_Allocations or
+ -- No_Implicit_Task_Allocations restrictions are in effect and the
+ -- No_Secondary_Stack restriction is not.
+
procedure Debug_Private_Data_Declarations (Decls : List_Id);
-- Decls is a list which may contain the declarations created by Install_
-- Private_Data_Declarations. All generated entities are marked as needing
@@ -5415,6 +5423,20 @@ package body Exp_Ch9 is
end Convert_Concurrent;
-------------------------------------
+ -- Create_Secondary_Stack_For_Task --
+ -------------------------------------
+
+ function Create_Secondary_Stack_For_Task (T : Node_Id) return Boolean is
+ begin
+ return
+ (Restriction_Active (No_Implicit_Heap_Allocations)
+ or else Restriction_Active (No_Implicit_Task_Allocations))
+ and then not Restriction_Active (No_Secondary_Stack)
+ and then Has_Rep_Item
+ (T, Name_Secondary_Stack_Size, Check_Parents => False);
+ end Create_Secondary_Stack_For_Task;
+
+ -------------------------------------
-- Debug_Private_Data_Declarations --
-------------------------------------
@@ -11712,6 +11734,7 @@ package body Exp_Ch9 is
Body_Decl : Node_Id;
Cdecls : List_Id;
Decl_Stack : Node_Id;
+ Decl_SS : Node_Id;
Elab_Decl : Node_Id;
Ent_Stack : Entity_Id;
Proc_Spec : Node_Id;
@@ -11939,6 +11962,57 @@ package body Exp_Ch9 is
end if;
+ -- Declare a static secondary stack if the conditions for a statically
+ -- generated stack are met.
+
+ if Create_Secondary_Stack_For_Task (TaskId) then
+ declare
+ Ritem : Node_Id;
+ Size_Expr : Node_Id;
+
+ begin
+ -- First extract the secondary stack size from the task type's
+ -- representation aspect.
+
+ Ritem :=
+ Get_Rep_Item
+ (TaskId, Name_Secondary_Stack_Size, Check_Parents => False);
+
+ -- Get Secondary_Stack_Size expression. Can be a pragma or aspect.
+
+ if Nkind (Ritem) = N_Pragma then
+ Size_Expr :=
+ Expression
+ (First (Pragma_Argument_Associations (Ritem)));
+ else
+ Size_Expr := Expression (Ritem);
+ end if;
+
+ pragma Assert (Compile_Time_Known_Value (Size_Expr));
+
+ -- Create the secondary stack for the task
+
+ Decl_SS :=
+ Make_Component_Declaration (Loc,
+ Defining_Identifier =>
+ Make_Defining_Identifier (Loc, Name_uSecondary_Stack),
+ Component_Definition =>
+ Make_Component_Definition (Loc,
+ Aliased_Present => True,
+ Subtype_Indication =>
+ Make_Subtype_Indication (Loc,
+ Subtype_Mark =>
+ New_Occurrence_Of (RTE (RE_SS_Stack), Loc),
+ Constraint =>
+ Make_Index_Or_Discriminant_Constraint (Loc,
+ Constraints => New_List (
+ Make_Integer_Literal (Loc,
+ Expr_Value (Size_Expr)))))));
+
+ Append_To (Cdecls, Decl_SS);
+ end;
+ end if;
+
-- Add components for entry families
Collect_Entry_Families (Loc, Cdecls, Size_Decl, Tasktyp);
@@ -12835,11 +12909,14 @@ package body Exp_Ch9 is
end if;
-- If the type of the dispatching object is an access type then return
- -- an explicit dereference.
+ -- an explicit dereference of a copy of the object, and note that
+ -- this is the controlling actual of the call.
if Is_Access_Type (Etype (Object)) then
- Object := Make_Explicit_Dereference (Sloc (N), Object);
+ Object :=
+ Make_Explicit_Dereference (Sloc (N), New_Copy_Tree (Object));
Analyze (Object);
+ Set_Is_Controlling_Actual (Object);
end if;
end Extract_Dispatching_Call;
@@ -14136,11 +14213,33 @@ package body Exp_Ch9 is
New_Occurrence_Of (Storage_Size_Variable (Ttyp), Loc));
end if;
- -- Secondary_Stack_Size parameter. Set Default_Secondary_Stack_Size
- -- unless there is a Secondary_Stack_Size rep item, in which case we
- -- take the value from the rep item. If the restriction
- -- No_Secondary_Stack is active then a size of 0 is passed regardless
- -- to prevent the allocation of the unused stack.
+ -- Secondary_Stack parameter used for restricted profiles
+
+ if Restricted_Profile then
+
+ -- If the secondary stack has been allocated by the expander then
+ -- pass its access pointer. Otherwise, pass null.
+
+ if Create_Secondary_Stack_For_Task (Ttyp) then
+ Append_To (Args,
+ Make_Attribute_Reference (Loc,
+ Prefix =>
+ Make_Selected_Component (Loc,
+ Prefix => Make_Identifier (Loc, Name_uInit),
+ Selector_Name =>
+ Make_Identifier (Loc, Name_uSecondary_Stack)),
+ Attribute_Name => Name_Unrestricted_Access));
+
+ else
+ Append_To (Args, Make_Null (Loc));
+ end if;
+ end if;
+
+ -- Secondary_Stack_Size parameter. Set RE_Unspecified_Size unless there
+ -- is a Secondary_Stack_Size rep item, in which case take the value from
+ -- the rep item. If the restriction No_Secondary_Stack is active then a
+ -- size of 0 is passed regardless to prevent the allocation of the
+ -- unused stack.
if Restriction_Active (No_Secondary_Stack) then
Append_To (Args, Make_Integer_Literal (Loc, 0));
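For context, a hedged configuration sketch (typical gnat.adc contents, illustrative only): with this restriction in effect the size argument discussed above is forced to 0, so no secondary stack is allocated for any task, and constructs that would need one (for instance functions returning unconstrained results) are rejected.

   pragma Restrictions (No_Secondary_Stack);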
@@ -14465,6 +14564,12 @@ package body Exp_Ch9 is
Object_Definition =>
New_Occurrence_Of (Etype (Formal), Loc)));
+ -- The object is initialized with an explicit assignment
+ -- later. Indicate that it does not need an initialization
+ -- to prevent spurious warnings if the type excludes null.
+
+ Set_No_Initialization (Last (Decls));
+
if Ekind (Formal) /= E_Out_Parameter then
-- Generate:
@@ -14481,15 +14586,22 @@ package body Exp_Ch9 is
Expression => New_Copy_Tree (Actual)));
end if;
- -- Generate:
+ -- If the actual is not controlling, generate:
+
-- Jnn'unchecked_access
- Append_To (Params,
- Make_Attribute_Reference (Loc,
- Attribute_Name => Name_Unchecked_Access,
- Prefix => New_Occurrence_Of (Temp_Nam, Loc)));
+ -- and add it to the aggregate for access to formals. Note that
+ -- the actual may be by-copy but still be a controlling actual
+ -- if it is an access to a class-wide interface.
- Has_Param := True;
+ if not Is_Controlling_Actual (Actual) then
+ Append_To (Params,
+ Make_Attribute_Reference (Loc,
+ Attribute_Name => Name_Unchecked_Access,
+ Prefix => New_Occurrence_Of (Temp_Nam, Loc)));
+
+ Has_Param := True;
+ end if;
-- The controlling parameter is omitted
diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
index b1ab606f055..8fdd8aa8200 100644
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -10817,8 +10817,17 @@ package body Exp_Util is
Analyze (Block);
end if;
- when others =>
+ -- Could be e.g. a loop that was transformed into a block or null
+ -- statement. Do nothing for terminate alternatives.
+
+ when N_Block_Statement
+ | N_Null_Statement
+ | N_Terminate_Alternative
+ =>
null;
+
+ when others =>
+ raise Program_Error;
end case;
end Process_Statements_For_Controlled_Objects;
@@ -10969,7 +10978,8 @@ package body Exp_Util is
Related_Nod : Node_Id := Empty) return Entity_Id;
-- Create an external symbol of the form xxx_FIRST/_LAST if Related_Nod
-- is present (xxx is taken from the Chars field of Related_Nod),
- -- otherwise it generates an internal temporary.
+ -- otherwise it generates an internal temporary. The created temporary
+ -- entity is marked as internal.
---------------------
-- Build_Temporary --
@@ -10980,6 +10990,7 @@ package body Exp_Util is
Id : Character;
Related_Nod : Node_Id := Empty) return Entity_Id
is
+ Temp_Id : Entity_Id;
Temp_Nam : Name_Id;
begin
@@ -10992,13 +11003,17 @@ package body Exp_Util is
Temp_Nam := New_External_Name (Chars (Related_Id), "_LAST");
end if;
- return Make_Defining_Identifier (Loc, Temp_Nam);
+ Temp_Id := Make_Defining_Identifier (Loc, Temp_Nam);
-- Otherwise generate an internal temporary
else
- return Make_Temporary (Loc, Id, Related_Nod);
+ Temp_Id := Make_Temporary (Loc, Id, Related_Nod);
end if;
+
+ Set_Is_Internal (Temp_Id);
+
+ return Temp_Id;
end Build_Temporary;
-- Local variables
@@ -11249,7 +11264,7 @@ package body Exp_Util is
-- Exp_Ch2.Expand_Renaming). Otherwise the temporary must be
-- elaborated by gigi, and is of course not to be replaced in-line
-- by the expression it renames, which would defeat the purpose of
- -- removing the side-effect.
+ -- removing the side effect.
if Nkind_In (Exp, N_Selected_Component, N_Indexed_Component)
and then Has_Non_Standard_Rep (Etype (Prefix (Exp)))
@@ -12650,7 +12665,7 @@ package body Exp_Util is
and then Variable_Ref
then
-- Exception is a prefix that is the result of a previous removal
- -- of side-effects.
+ -- of side effects.
return Is_Entity_Name (Prefix (N))
and then not Comes_From_Source (Prefix (N))
diff --git a/gcc/ada/fe.h b/gcc/ada/fe.h
index 513cfa97daa..6b6d524bcd7 100644
--- a/gcc/ada/fe.h
+++ b/gcc/ada/fe.h
@@ -109,10 +109,12 @@ extern Nat Serious_Errors_Detected;
#define Get_Local_Raise_Call_Entity exp_ch11__get_local_raise_call_entity
#define Get_RT_Exception_Entity exp_ch11__get_rt_exception_entity
#define Get_RT_Exception_Name exp_ch11__get_rt_exception_name
+#define Warn_If_No_Local_Raise exp_ch11__warn_if_no_local_raise
extern Entity_Id Get_Local_Raise_Call_Entity (void);
extern Entity_Id Get_RT_Exception_Entity (int);
extern void Get_RT_Exception_Name (int);
+extern void Warn_If_No_Local_Raise (int);
/* exp_code: */
diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
index 794fdf3d095..a106d68ae86 100644
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -8450,7 +8450,7 @@ package body Freeze is
-- The analysis of the expression may generate insert actions,
-- which of course must not be executed. We wrap those actions
-- in a procedure that is not called, and later on eliminated.
- -- The following cases have no side-effects, and are analyzed
+ -- The following cases have no side effects, and are analyzed
-- directly.
if Nkind (Dcopy) = N_Identifier
diff --git a/gcc/ada/gcc-interface/Make-lang.in b/gcc/ada/gcc-interface/Make-lang.in
index 113c84f390b..9c7b6e1496f 100644
--- a/gcc/ada/gcc-interface/Make-lang.in
+++ b/gcc/ada/gcc-interface/Make-lang.in
@@ -390,6 +390,7 @@ GNAT_ADA_OBJS = \
ada/libgnat/s-restri.o \
ada/libgnat/s-secsta.o \
ada/libgnat/s-soflin.o \
+ ada/libgnat/s-soliin.o \
ada/libgnat/s-sopco3.o \
ada/libgnat/s-sopco4.o \
ada/libgnat/s-sopco5.o \
@@ -579,6 +580,7 @@ GNATBIND_OBJS = \
ada/libgnat/s-restri.o \
ada/libgnat/s-secsta.o \
ada/libgnat/s-soflin.o \
+ ada/libgnat/s-soliin.o \
ada/libgnat/s-sopco3.o \
ada/libgnat/s-sopco4.o \
ada/libgnat/s-sopco5.o \
diff --git a/gcc/ada/gcc-interface/Makefile.in b/gcc/ada/gcc-interface/Makefile.in
index 2fa47caa547..b1621d11b11 100644
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -627,10 +627,10 @@ ifeq ($(strip $(filter-out %86 x86_64 wrs vxworks vxworks7,$(target_cpu) $(targe
ifeq ($(strip $(filter-out x86_64, $(target_cpu))),)
X86CPU=x86_64
- LIBGNAT_TARGET_PAIRS=s-atocou.adb<libgnat/s-atocou__builtin.adb
+ LIBGNAT_TARGET_PAIRS=$(X86_64_TARGET_PAIRS)
else
X86CPU=x86
- LIBGNAT_TARGET_PAIRS=s-atocou.adb<libgnat/s-atocou__x86.adb
+ LIBGNAT_TARGET_PAIRS=$(X86_TARGET_PAIRS)
endif
LIBGNAT_TARGET_PAIRS+= \
@@ -653,10 +653,7 @@ ifeq ($(strip $(filter-out %86 x86_64 wrs vxworks vxworks7,$(target_cpu) $(targe
g-socthi.ads<libgnat/g-socthi__vxworks.ads \
g-socthi.adb<libgnat/g-socthi__vxworks.adb \
g-stsifd.adb<libgnat/g-stsifd__sockets.adb \
- $(ATOMICS_TARGET_PAIRS) \
- $(CERTMATH_TARGET_PAIRS) \
- $(CERTMATH_TARGET_PAIRS_SQRT_FPU) \
- $(CERTMATH_TARGET_PAIRS_X86TRA)
+ $(ATOMICS_TARGET_PAIRS)
TOOLS_TARGET_PAIRS=indepsw.adb<indepsw-gnu.adb
@@ -745,8 +742,7 @@ ifeq ($(strip $(filter-out %86 x86_64 wrs vxworks vxworks7,$(target_cpu) $(targe
endif
endif
- EXTRA_GNATRTL_NONTASKING_OBJS += s-stchop.o \
- $(CERTMATH_GNATRTL_OBJS) $(CERTMATH_GNATRTL_X86TRA_OBJS)
+ EXTRA_GNATRTL_NONTASKING_OBJS += s-stchop.o
EXTRA_GNATRTL_TASKING_OBJS += i-vxinco.o s-vxwork.o s-vxwext.o
EXTRA_LIBGNAT_OBJS+=vx_stack_info.o
@@ -845,7 +841,7 @@ ifeq ($(strip $(filter-out arm% coff wrs vx%,$(target_cpu) $(target_vendor) $(ta
endif
endif
- EXTRA_GNATRTL_NONTASKING_OBJS=i-vxwork.o i-vxwoio.o $(CERTMATH_GNATRTL_OBJS) \
+ EXTRA_GNATRTL_NONTASKING_OBJS=i-vxwork.o i-vxwoio.o \
s-stchop.o
EXTRA_GNATRTL_TASKING_OBJS=i-vxinco.o s-vxwork.o s-vxwext.o
@@ -1633,7 +1629,7 @@ ifeq ($(strip $(filter-out m68k% linux%,$(target_cpu) $(target_os))),)
a-intnam.ads<libgnarl/a-intnam__linux.ads \
s-inmaop.adb<libgnarl/s-inmaop__posix.adb \
s-intman.adb<libgnarl/s-intman__posix.adb \
- s-linux.ads<libgnat/s-linux.ads \
+ s-linux.ads<libgnarl/s-linux.ads \
s-osinte.adb<libgnarl/s-osinte__posix.adb \
s-osinte.ads<libgnarl/s-osinte__linux.ads \
s-osprim.adb<libgnat/s-osprim__posix.adb \
diff --git a/gcc/ada/gcc-interface/gigi.h b/gcc/ada/gcc-interface/gigi.h
index 4ddd0f0a8d2..a957de5e589 100644
--- a/gcc/ada/gcc-interface/gigi.h
+++ b/gcc/ada/gcc-interface/gigi.h
@@ -312,9 +312,9 @@ extern void post_error_ne_tree (const char *msg, Node_Id node, Entity_Id ent,
extern void post_error_ne_tree_2 (const char *msg, Node_Id node, Entity_Id ent,
tree t, int num);
-/* Return a label to branch to for the exception type in KIND or NULL_TREE
+/* Return a label to branch to for the exception type in KIND or Empty
if none. */
-extern tree get_exception_label (char kind);
+extern Entity_Id get_exception_label (char kind);
/* If nonzero, pretend we are allocating at global level. */
extern int force_global;
diff --git a/gcc/ada/gcc-interface/misc.c b/gcc/ada/gcc-interface/misc.c
index 7e4b2e30286..4d7f432bff2 100644
--- a/gcc/ada/gcc-interface/misc.c
+++ b/gcc/ada/gcc-interface/misc.c
@@ -1373,6 +1373,23 @@ gnat_init_ts (void)
MARK_TS_TYPED (EXIT_STMT);
}
+/* Return the size of a tree with CODE, which is a language-specific tree code
+ in category tcc_constant, tcc_exceptional or tcc_type. The default expects
+ never to be called. */
+
+static size_t
+gnat_tree_size (enum tree_code code)
+{
+ gcc_checking_assert (code >= NUM_TREE_CODES);
+ switch (code)
+ {
+ case UNCONSTRAINED_ARRAY_TYPE:
+ return sizeof (tree_type_non_common);
+ default:
+ gcc_unreachable ();
+ }
+}
+
/* Return the lang specific structure attached to NODE. Allocate it (cleared)
if needed. */
@@ -1390,6 +1407,8 @@ get_lang_specific (tree node)
#define LANG_HOOKS_NAME "GNU Ada"
#undef LANG_HOOKS_IDENTIFIER_SIZE
#define LANG_HOOKS_IDENTIFIER_SIZE sizeof (struct tree_identifier)
+#undef LANG_HOOKS_TREE_SIZE
+#define LANG_HOOKS_TREE_SIZE gnat_tree_size
#undef LANG_HOOKS_INIT
#define LANG_HOOKS_INIT gnat_init
#undef LANG_HOOKS_OPTION_LANG_MASK
diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index 8b094733806..d22d82ad610 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -211,9 +211,9 @@ typedef struct loop_info_d *loop_info;
static GTY(()) vec<loop_info, va_gc> *gnu_loop_stack;
/* The stacks for N_{Push,Pop}_*_Label. */
-static GTY(()) vec<tree, va_gc> *gnu_constraint_error_label_stack;
-static GTY(()) vec<tree, va_gc> *gnu_storage_error_label_stack;
-static GTY(()) vec<tree, va_gc> *gnu_program_error_label_stack;
+static vec<Entity_Id> gnu_constraint_error_label_stack;
+static vec<Entity_Id> gnu_storage_error_label_stack;
+static vec<Entity_Id> gnu_program_error_label_stack;
/* Map GNAT tree codes to GCC tree codes for simple expressions. */
static enum tree_code gnu_codes[Number_Node_Kinds];
@@ -226,7 +226,6 @@ static void record_code_position (Node_Id);
static void insert_code_for (Node_Id);
static void add_cleanup (tree, Node_Id);
static void add_stmt_list (List_Id);
-static void push_exception_label_stack (vec<tree, va_gc> **, Entity_Id);
static tree build_stmt_group (List_Id, bool);
static inline bool stmt_group_may_fallthru (void);
static enum gimplify_status gnat_gimplify_stmt (tree *);
@@ -647,9 +646,10 @@ gigi (Node_Id gnat_root,
gnat_install_builtins ();
vec_safe_push (gnu_except_ptr_stack, NULL_TREE);
- vec_safe_push (gnu_constraint_error_label_stack, NULL_TREE);
- vec_safe_push (gnu_storage_error_label_stack, NULL_TREE);
- vec_safe_push (gnu_program_error_label_stack, NULL_TREE);
+
+ gnu_constraint_error_label_stack.safe_push (Empty);
+ gnu_storage_error_label_stack.safe_push (Empty);
+ gnu_program_error_label_stack.safe_push (Empty);
/* Process any Pragma Ident for the main unit. */
if (Present (Ident_String (Main_Unit)))
@@ -5614,7 +5614,7 @@ Raise_Error_to_gnu (Node_Id gnat_node, tree *gnu_result_type_p)
const bool with_extra_info
= Exception_Extra_Info
&& !No_Exception_Handlers_Set ()
- && !get_exception_label (kind);
+ && No (get_exception_label (kind));
tree gnu_result = NULL_TREE, gnu_cond = NULL_TREE;
/* The following processing is not required for correctness. Its purpose is
@@ -7271,8 +7271,9 @@ gnat_to_gnu (Node_Id gnat_node)
break;
case N_Goto_Statement:
- gnu_result
- = build1 (GOTO_EXPR, void_type_node, gnat_to_gnu (Name (gnat_node)));
+ gnu_expr = gnat_to_gnu (Name (gnat_node));
+ gnu_result = build1 (GOTO_EXPR, void_type_node, gnu_expr);
+ TREE_USED (gnu_expr) = 1;
break;
/***************************/
@@ -7492,30 +7493,36 @@ gnat_to_gnu (Node_Id gnat_node)
break;
case N_Push_Constraint_Error_Label:
- push_exception_label_stack (&gnu_constraint_error_label_stack,
- Exception_Label (gnat_node));
+ gnu_constraint_error_label_stack.safe_push (Exception_Label (gnat_node));
break;
case N_Push_Storage_Error_Label:
- push_exception_label_stack (&gnu_storage_error_label_stack,
- Exception_Label (gnat_node));
+ gnu_storage_error_label_stack.safe_push (Exception_Label (gnat_node));
break;
case N_Push_Program_Error_Label:
- push_exception_label_stack (&gnu_program_error_label_stack,
- Exception_Label (gnat_node));
+ gnu_program_error_label_stack.safe_push (Exception_Label (gnat_node));
break;
case N_Pop_Constraint_Error_Label:
- gnu_constraint_error_label_stack->pop ();
+ gnat_temp = gnu_constraint_error_label_stack.pop ();
+ if (Present (gnat_temp)
+ && !TREE_USED (gnat_to_gnu_entity (gnat_temp, NULL_TREE, false)))
+ Warn_If_No_Local_Raise (gnat_temp);
break;
case N_Pop_Storage_Error_Label:
- gnu_storage_error_label_stack->pop ();
+ gnat_temp = gnu_storage_error_label_stack.pop ();
+ if (Present (gnat_temp)
+ && !TREE_USED (gnat_to_gnu_entity (gnat_temp, NULL_TREE, false)))
+ Warn_If_No_Local_Raise (gnat_temp);
break;
case N_Pop_Program_Error_Label:
- gnu_program_error_label_stack->pop ();
+ gnat_temp = gnu_program_error_label_stack.pop ();
+ if (Present (gnat_temp)
+ && !TREE_USED (gnat_to_gnu_entity (gnat_temp, NULL_TREE, false)))
+ Warn_If_No_Local_Raise (gnat_temp);
break;
/******************************/
@@ -8029,20 +8036,6 @@ gnat_to_gnu_external (Node_Id gnat_node)
return gnu_result;
}
-/* Subroutine of above to push the exception label stack. GNU_STACK is
- a pointer to the stack to update and GNAT_LABEL, if present, is the
- label to push onto the stack. */
-
-static void
-push_exception_label_stack (vec<tree, va_gc> **gnu_stack, Entity_Id gnat_label)
-{
- tree gnu_label = (Present (gnat_label)
- ? gnat_to_gnu_entity (gnat_label, NULL_TREE, false)
- : NULL_TREE);
-
- vec_safe_push (*gnu_stack, gnu_label);
-}
-
/* Return true if the statement list STMT_LIST is empty. */
static bool
@@ -10226,28 +10219,28 @@ post_error_ne_tree_2 (const char *msg, Node_Id node, Entity_Id ent, tree t,
post_error_ne_tree (msg, node, ent, t);
}
-/* Return a label to branch to for the exception type in KIND or NULL_TREE
+/* Return a label to branch to for the exception type in KIND or Empty
if none. */
-tree
+Entity_Id
get_exception_label (char kind)
{
switch (kind)
{
case N_Raise_Constraint_Error:
- return gnu_constraint_error_label_stack->last ();
+ return gnu_constraint_error_label_stack.last ();
case N_Raise_Storage_Error:
- return gnu_storage_error_label_stack->last ();
+ return gnu_storage_error_label_stack.last ();
case N_Raise_Program_Error:
- return gnu_program_error_label_stack->last ();
+ return gnu_program_error_label_stack.last ();
default:
- break;
+ return Empty;
}
- return NULL_TREE;
+ gcc_unreachable ();
}
/* Return the decl for the current elaboration procedure. */
diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index 6718da45a9a..bad5aeade13 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -101,7 +101,7 @@ static tree handle_vector_type_attribute (tree *, tree, tree, int, bool *);
/* Fake handler for attributes we don't properly support, typically because
they'd require dragging a lot of the common-c front-end circuitry. */
-static tree fake_attribute_handler (tree *, tree, tree, int, bool *);
+static tree fake_attribute_handler (tree *, tree, tree, int, bool *);
/* Table of machine-independent internal attributes for Ada. We support
this minimal set of attributes to accommodate the needs of builtins. */
@@ -222,8 +222,9 @@ static GTY((deletable)) tree free_block_chain;
/* A hash table of padded types. It is modelled on the generic type
hash table in tree.c, which must thus be used as a reference. */
-struct GTY((for_user)) pad_type_hash {
- unsigned long hash;
+struct GTY((for_user)) pad_type_hash
+{
+ hashval_t hash;
tree type;
};
@@ -3595,6 +3596,10 @@ max_size (tree exp, bool max_p)
case tcc_constant:
return exp;
+ case tcc_exceptional:
+ gcc_assert (code == SSA_NAME);
+ return exp;
+
case tcc_vl_exp:
if (code == CALL_EXPR)
{
@@ -4245,10 +4250,13 @@ convert (tree type, tree expr)
return convert (type, TREE_OPERAND (expr, 0));
/* If the inner type is of self-referential size and the expression type
- is a record, do this as an unchecked conversion. But first pad the
- expression if possible to have the same size on both sides. */
+ is a record, do this as an unchecked conversion unless both types are
+ essentially the same. But first pad the expression if possible to
+ have the same size on both sides. */
if (ecode == RECORD_TYPE
- && CONTAINS_PLACEHOLDER_P (DECL_SIZE (TYPE_FIELDS (type))))
+ && CONTAINS_PLACEHOLDER_P (DECL_SIZE (TYPE_FIELDS (type)))
+ && TYPE_MAIN_VARIANT (etype)
+ != TYPE_MAIN_VARIANT (TREE_TYPE (TYPE_FIELDS (type))))
{
if (TREE_CODE (TYPE_SIZE (etype)) == INTEGER_CST)
expr = convert (maybe_pad_type (etype, TYPE_SIZE (type), 0, Empty,
diff --git a/gcc/ada/gcc-interface/utils2.c b/gcc/ada/gcc-interface/utils2.c
index 321bdbfbe47..7f3a3d3ff1a 100644
--- a/gcc/ada/gcc-interface/utils2.c
+++ b/gcc/ada/gcc-interface/utils2.c
@@ -1788,9 +1788,10 @@ build_call_n_expr (tree fndecl, int n, ...)
MSG gives the exception's identity for the call to Local_Raise, if any. */
static tree
-build_goto_raise (tree label, int msg)
+build_goto_raise (Entity_Id gnat_label, int msg)
{
- tree gnu_result = build1 (GOTO_EXPR, void_type_node, label);
+ tree gnu_label = gnat_to_gnu_entity (gnat_label, NULL_TREE, false);
+ tree gnu_result = build1 (GOTO_EXPR, void_type_node, gnu_label);
Entity_Id local_raise = Get_Local_Raise_Call_Entity ();
/* If Local_Raise is present, build Local_Raise (Exception'Identity). */
@@ -1808,6 +1809,7 @@ build_goto_raise (tree label, int msg)
= build2 (COMPOUND_EXPR, void_type_node, gnu_call, gnu_result);
}
+ TREE_USED (gnu_label) = 1;
return gnu_result;
}
@@ -1860,13 +1862,13 @@ expand_sloc (Node_Id gnat_node, tree *filename, tree *line, tree *col)
tree
build_call_raise (int msg, Node_Id gnat_node, char kind)
{
+ Entity_Id gnat_label = get_exception_label (kind);
tree fndecl = gnat_raise_decls[msg];
- tree label = get_exception_label (kind);
tree filename, line;
/* If this is to be done as a goto, handle that case. */
- if (label)
- return build_goto_raise (label, msg);
+ if (Present (gnat_label))
+ return build_goto_raise (gnat_label, msg);
expand_sloc (gnat_node, &filename, &line, NULL);
@@ -1884,13 +1886,13 @@ build_call_raise (int msg, Node_Id gnat_node, char kind)
tree
build_call_raise_column (int msg, Node_Id gnat_node, char kind)
{
+ Entity_Id gnat_label = get_exception_label (kind);
tree fndecl = gnat_raise_decls_ext[msg];
- tree label = get_exception_label (kind);
tree filename, line, col;
/* If this is to be done as a goto, handle that case. */
- if (label)
- return build_goto_raise (label, msg);
+ if (Present (gnat_label))
+ return build_goto_raise (gnat_label, msg);
expand_sloc (gnat_node, &filename, &line, &col);
@@ -1909,13 +1911,13 @@ tree
build_call_raise_range (int msg, Node_Id gnat_node, char kind,
tree index, tree first, tree last)
{
+ Entity_Id gnat_label = get_exception_label (kind);
tree fndecl = gnat_raise_decls_ext[msg];
- tree label = get_exception_label (kind);
tree filename, line, col;
/* If this is to be done as a goto, handle that case. */
- if (label)
- return build_goto_raise (label, msg);
+ if (Present (gnat_label))
+ return build_goto_raise (gnat_label, msg);
expand_sloc (gnat_node, &filename, &line, &col);
diff --git a/gcc/ada/gnat_rm.texi b/gcc/ada/gnat_rm.texi
index 8ed58c4fc7f..b042e2be3e1 100644
--- a/gcc/ada/gnat_rm.texi
+++ b/gcc/ada/gnat_rm.texi
@@ -21,7 +21,7 @@
@copying
@quotation
-GNAT Reference Manual , Sep 29, 2017
+GNAT Reference Manual , Oct 14, 2017
AdaCore
@@ -9413,11 +9413,20 @@ that it is separately controllable using pragma @code{Assertion_Policy}.
This aspect provides a light-weight mechanism for loops and quantified
expressions over container types, without the overhead imposed by the tampering
checks of standard Ada 2012 iterators. The value of the aspect is an aggregate
-with four named components: @code{First}, @code{Next}, @code{Has_Element}, and @code{Element} (the
-last one being optional). When only 3 components are specified, only the
-@code{for .. in} form of iteration over cursors is available. When all 4 components
-are specified, both this form and the @code{for .. of} form of iteration over
-elements are available. The following is a typical example of use:
+with six named components, of which the last three are optional: @code{First},
+@code{Next}, @code{Has_Element}, @code{Element}, @code{Last}, and @code{Previous}.
+
+When only the first three components are specified, only the
+@code{for .. in} form of iteration over cursors is available. When @code{Element}
+is specified, both this form and the @code{for .. of} form of iteration over
+elements are available. If the last two components are specified, reverse
+iteration over the container is also available (analogous to what can be done
+over predefined containers that support the Reverse_Iterator interface).
+The following is a typical example of use:
@example
type List is private with
diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
index a39c2572be0..947506799a5 100644
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -21,7 +21,7 @@
@copying
@quotation
-GNAT User's Guide for Native Platforms , Oct 09, 2017
+GNAT User's Guide for Native Platforms , Oct 20, 2017
AdaCore
@@ -8809,19 +8809,6 @@ in the compiler sources for details in files @code{scos.ads} and
@code{scos.adb}.
@end table
-@geindex -fdump-xref (gcc)
-
-
-@table @asis
-
-@item @code{-fdump-xref}
-
-Generates cross reference information in GLI files for C and C++ sources.
-The GLI files have the same syntax as the ALI files for Ada, and can be used
-for source navigation in IDEs and on the command line using e.g. gnatxref
-and the @code{--ext=gli} switch.
-@end table
-
@geindex -flto (gcc)
@@ -8830,8 +8817,9 @@ and the @code{--ext=gli} switch.
@item @code{-flto[=@emph{n}]}
Enables Link Time Optimization. This switch must be used in conjunction
-with the traditional @code{-Ox} switches and instructs the compiler to
-defer most optimizations until the link stage. The advantage of this
+with the @code{-Ox} switches (but not with the @code{-gnatn} switch
+since it is a full replacement for the latter) and instructs the compiler
+to defer most optimizations until the link stage. The advantage of this
approach is that the compiler can do a whole-program analysis and choose
the best interprocedural optimization strategy based on a complete view
of the program, instead of a fragmentary view with the usual approach.
@@ -12474,8 +12462,8 @@ should not complain at you.
This switch activates warnings for exception usage when pragma Restrictions
(No_Exception_Propagation) is in effect. Warnings are given for implicit or
explicit exception raises which are not covered by a local handler, and for
-exception handlers which do not cover a local raise. The default is that these
-warnings are not given.
+exception handlers which do not cover a local raise. The default is that
+these warnings are given for units that contain exception handlers.
@item @code{-gnatw.X}
@@ -17949,9 +17937,9 @@ Do not look for library files in the system default directory.
@item @code{--ext=@emph{extension}}
Specify an alternate ali file extension. The default is @code{ali} and other
-extensions (e.g. @code{gli} for C/C++ sources when using @code{-fdump-xref})
-may be specified via this switch. Note that if this switch overrides the
-default, which means that only the new extension will be considered.
+extensions (e.g. @code{gli} for C/C++ sources) may be specified via this switch.
+Note that this switch overrides the default, which means that only the
+new extension will be considered.
@end table
@geindex --RTS (gnatxref)
@@ -22901,12 +22889,12 @@ combine a dimensioned and dimensionless value. Thus an expression such as
@code{Acceleration}.
The dimensionality checks for relationals use the same rules as
-for "+" and "-"; thus
+for "+" and "-", except when comparing to a literal; thus
@quotation
@example
-acc > 10.0
+acc > len
@end example
@end quotation
@@ -22915,12 +22903,21 @@ is equivalent to
@quotation
@example
-acc-10.0 > 0.0
+acc-len > 0.0
+@end example
+@end quotation
+
+and is thus illegal, but
+
+@quotation
+
+@example
+acc > 10.0
@end example
@end quotation
-and is thus illegal. Analogously a conditional expression
-requires the same dimension vector for each branch.
+is accepted with a warning. Analogously, a conditional expression requires the
+same dimension vector for each branch (with no exception for literals).
The dimension vector of a type conversion @code{T(@emph{expr})} is defined
as follows, based on the nature of @code{T}:
@@ -27187,8 +27184,62 @@ elaborated.
The sequence by which the elaboration code of all units within a partition is
-executed is referred to as @strong{elaboration order}. The elaboration order depends
-on the following factors:
+executed is referred to as @strong{elaboration order}.
+
+Within a single unit, elaboration code is executed in sequential order.
+
+@example
+package body Client is
+ Result : ... := Server.Func;
+
+ procedure Proc is
+ package Inst is new Server.Gen;
+ begin
+ Inst.Eval (Result);
+ end Proc;
+begin
+ Proc;
+end Client;
+@end example
+
+In the example above, the elaboration order within package body @code{Client} is
+as follows:
+
+
+@enumerate
+
+@item
+The object declaration of @code{Result} is elaborated.
+
+
+@itemize *
+
+@item
+Function @code{Server.Func} is invoked.
+@end itemize
+
+@item
+The subprogram body of @code{Proc} is elaborated.
+
+@item
+Procedure @code{Proc} is invoked.
+
+
+@itemize *
+
+@item
+Generic unit @code{Server.Gen} is instantiated as @code{Inst}.
+
+@item
+Instance @code{Inst} is elaborated.
+
+@item
+Procedure @code{Inst.Eval} is invoked.
+@end itemize
+@end enumerate
+
+The elaboration order of all units within a partition depends on the following
+factors:
@itemize *
@@ -27689,7 +27740,7 @@ dynamic model is in effect, GNAT assumes that all code within all units in
a partition is elaboration code. GNAT performs very few diagnostics and
generates run-time checks to verify the elaboration order of a program. This
behavior is identical to that specified by the Ada Reference Manual. The
-dynamic model is enabled with compilation switch @code{-gnatE}.
+dynamic model is enabled with compiler switch @code{-gnatE}.
@end itemize
@geindex Static elaboration model
@@ -28001,7 +28052,7 @@ elaborated prior to the body of @code{Static_Model}.
The SPARK model is identical to the static model in its handling of internal
targets. The SPARK model, however, requires explicit @code{Elaborate} or
@code{Elaborate_All} pragmas to be present in the program when a target is
-external, and emits hard errors instead of warnings:
+external and compiler switch @code{-gnatd.v} is in effect:
@example
1. with Server;
@@ -28146,7 +28197,7 @@ code.
@emph{Switch to more permissive elaboration model}
If the compilation was performed using the static model, enable the dynamic
-model with compilation switch @code{-gnatE}. GNAT will no longer generate
+model with compiler switch @code{-gnatE}. GNAT will no longer generate
implicit @code{Elaborate} and @code{Elaborate_All} pragmas, resulting in a behavior
identical to that specified by the Ada Reference Manual. The binder will
generate an executable program that may or may not raise @code{Program_Error},
@@ -28711,6 +28762,22 @@ When this switch is in effect, GNAT will ignore @code{'Access} of an entry,
operator, or subprogram when the static model is in effect.
@end table
+@geindex -gnatd.v (gnat)
+
+
+@table @asis
+
+@item @code{-gnatd.v}
+
+Enforce SPARK elaboration rules in SPARK code
+
+When this switch is in effect, GNAT will enforce the SPARK rules of
+elaboration as defined in the SPARK Reference Manual, section 7.7. As a
+result, constructs which violate the SPARK elaboration rules are no longer
+accepted, even if GNAT is able to statically ensure that these constructs
+will not lead to ABE problems.
+@end table
+
@geindex -gnatd.y (gnat)
@@ -28785,7 +28852,7 @@ it will provide detailed traceback when an implicit @code{Elaborate} or
@emph{SPARK model}
GNAT will indicate how an elaboration requirement is met by the context of
-a unit.
+a unit. This diagnostic requires compiler switch @code{-gnatd.v}.
@example
1. with Server; pragma Elaborate_All (Server);
@@ -28846,8 +28913,8 @@ none of the binder or compiler switches. If the binder succeeds in finding an
elaboration order, then apart from possible cases involving dispatching calls
and access-to-subprogram types, the program is free of elaboration errors.
If it is important for the program to be portable to compilers other than GNAT,
-then the programmer should use compilation switch @code{-gnatel} and
-consider the messages about missing or implicitly created @code{Elaborate} and
+then the programmer should use compiler switch @code{-gnatel} and consider
+the messages about missing or implicitly created @code{Elaborate} and
@code{Elaborate_All} pragmas.
If the binder reports an elaboration circularity, the programmer has several
diff --git a/gcc/ada/layout.adb b/gcc/ada/layout.adb
index 34c5b5d0f9a..52e84526ca4 100644
--- a/gcc/ada/layout.adb
+++ b/gcc/ada/layout.adb
@@ -843,7 +843,7 @@ package body Layout is
-- Set_Elem_Alignment --
------------------------
- procedure Set_Elem_Alignment (E : Entity_Id) is
+ procedure Set_Elem_Alignment (E : Entity_Id; Align : Nat := 0) is
begin
-- Do not set alignment for packed array types, this is handled in the
-- backend.
@@ -869,15 +869,12 @@ package body Layout is
return;
end if;
- -- Here we calculate the alignment as the largest power of two multiple
- -- of System.Storage_Unit that does not exceed either the object size of
- -- the type, or the maximum allowed alignment.
+ -- We attempt to set the alignment in all the other cases
declare
S : Int;
A : Nat;
-
- Max_Alignment : Nat;
+ M : Nat;
begin
-- The given Esize may be larger that int'last because of a previous
@@ -908,7 +905,7 @@ package body Layout is
and then S = 8
and then Is_Floating_Point_Type (E)
then
- Max_Alignment := Ttypes.Target_Double_Float_Alignment;
+ M := Ttypes.Target_Double_Float_Alignment;
-- If the default alignment of "double" or larger scalar types is
-- specifically capped, enforce the cap.
@@ -917,18 +914,27 @@ package body Layout is
and then S >= 8
and then Is_Scalar_Type (E)
then
- Max_Alignment := Ttypes.Target_Double_Scalar_Alignment;
+ M := Ttypes.Target_Double_Scalar_Alignment;
-- Otherwise enforce the overall alignment cap
else
- Max_Alignment := Ttypes.Maximum_Alignment;
+ M := Ttypes.Maximum_Alignment;
end if;
- A := 1;
- while 2 * A <= Max_Alignment and then 2 * A <= S loop
- A := 2 * A;
- end loop;
+ -- We calculate the alignment as the largest power-of-two multiple
+ -- of System.Storage_Unit that does not exceed the object size of
+ -- the type and the maximum allowed alignment, if none was specified.
+ -- Otherwise we only cap it to the maximum allowed alignment.
+
+ if Align = 0 then
+ A := 1;
+ while 2 * A <= S and then 2 * A <= M loop
+ A := 2 * A;
+ end loop;
+ else
+ A := Nat'Min (Align, M);
+ end if;
-- If alignment is currently not set, then we can safely set it to
-- this new calculated value.
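A standalone sketch of the power-of-two computation described above (the name Default_Alignment and its parameters are illustrative, not the compiler's): starting from 1, the candidate alignment is doubled while it stays within both the object size in storage units and the applicable cap.

   function Default_Alignment (Size_In_SU : Positive; Cap : Positive) return Positive is
      A : Positive := 1;
   begin
      --  Largest power of two not exceeding either Size_In_SU or Cap
      while 2 * A <= Size_In_SU and then 2 * A <= Cap loop
         A := 2 * A;
      end loop;
      return A;
   end Default_Alignment;

   --  For example, Default_Alignment (6, 8) = 4 and Default_Alignment (16, 8) = 8,
   --  matching the loop in Set_Elem_Alignment when no external Align is supplied.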
diff --git a/gcc/ada/layout.ads b/gcc/ada/layout.ads
index 57aa93e4f5a..246970fd8fd 100644
--- a/gcc/ada/layout.ads
+++ b/gcc/ada/layout.ads
@@ -74,10 +74,11 @@ package Layout is
-- types, the RM_Size is simply set to zero. This routine also sets
-- the Is_Constrained flag in Def_Id.
- procedure Set_Elem_Alignment (E : Entity_Id);
+ procedure Set_Elem_Alignment (E : Entity_Id; Align : Nat := 0);
-- The front end always sets alignments for elementary types by calling
-- this procedure. Note that we have to do this for discrete types (since
-- the Alignment attribute is static), so we might as well do it for all
- -- elementary types, since the processing is the same.
+ -- elementary types, as the processing is the same. If Align is nonzero,
+ -- it is an external alignment setting that we must respect.
end Layout;
diff --git a/gcc/ada/lib-load.adb b/gcc/ada/lib-load.adb
index 977567d4983..0b0ea7f5057 100644
--- a/gcc/ada/lib-load.adb
+++ b/gcc/ada/lib-load.adb
@@ -214,34 +214,36 @@ package body Lib.Load is
Unum := Units.Last;
Units.Table (Unum) :=
- (Cunit => Cunit,
- Cunit_Entity => Cunit_Entity,
- Dependency_Num => 0,
- Dynamic_Elab => False,
- Error_Location => Sloc (With_Node),
- Expected_Unit => Spec_Name,
- Fatal_Error => Error_Detected,
- Generate_Code => False,
- Has_RACW => False,
- Filler => False,
- Ident_String => Empty,
+ (Cunit => Cunit,
+ Cunit_Entity => Cunit_Entity,
+ Dependency_Num => 0,
+ Dynamic_Elab => False,
+ Error_Location => Sloc (With_Node),
+ Expected_Unit => Spec_Name,
+ Fatal_Error => Error_Detected,
+ Generate_Code => False,
+ Has_RACW => False,
+ Filler => False,
+ Ident_String => Empty,
Is_Predefined_Renaming => Ren_Name,
Is_Predefined_Unit => Pre_Name or Ren_Name,
Is_Internal_Unit => Pre_Name or Ren_Name or GNAT_Name,
Filler2 => False,
- Loading => False,
- Main_Priority => Default_Main_Priority,
- Main_CPU => Default_Main_CPU,
- Munit_Index => 0,
- No_Elab_Code_All => False,
- Serial_Number => 0,
- Source_Index => No_Source_File,
- Unit_File_Name => Fname,
- Unit_Name => Spec_Name,
- Version => 0,
- OA_Setting => 'O');
+ Loading => False,
+ Main_Priority => Default_Main_Priority,
+ Main_CPU => Default_Main_CPU,
+ Primary_Stack_Count => 0,
+ Sec_Stack_Count => 0,
+ Munit_Index => 0,
+ No_Elab_Code_All => False,
+ Serial_Number => 0,
+ Source_Index => No_Source_File,
+ Unit_File_Name => Fname,
+ Unit_Name => Spec_Name,
+ Version => 0,
+ OA_Setting => 'O');
Set_Comes_From_Source_Default (Save_CS);
Set_Error_Posted (Cunit_Entity);
@@ -350,34 +352,37 @@ package body Lib.Load is
end if;
Units.Table (Main_Unit) :=
- (Cunit => Empty,
- Cunit_Entity => Empty,
- Dependency_Num => 0,
- Dynamic_Elab => False,
- Error_Location => No_Location,
- Expected_Unit => No_Unit_Name,
- Fatal_Error => None,
- Generate_Code => False,
- Has_RACW => False,
- Filler => False,
- Ident_String => Empty,
+ (Cunit => Empty,
+ Cunit_Entity => Empty,
+ Dependency_Num => 0,
+ Dynamic_Elab => False,
+ Error_Location => No_Location,
+ Expected_Unit => No_Unit_Name,
+ Fatal_Error => None,
+ Generate_Code => False,
+ Has_RACW => False,
+ Filler => False,
+ Ident_String => Empty,
Is_Predefined_Renaming => Ren_Name,
Is_Predefined_Unit => Pre_Name or Ren_Name,
Is_Internal_Unit => Pre_Name or Ren_Name or GNAT_Name,
Filler2 => False,
- Loading => True,
- Main_Priority => Default_Main_Priority,
- Main_CPU => Default_Main_CPU,
- Munit_Index => 0,
- No_Elab_Code_All => False,
- Serial_Number => 0,
- Source_Index => Main_Source_File,
- Unit_File_Name => Fname,
- Unit_Name => No_Unit_Name,
- Version => Version,
- OA_Setting => 'O');
+ Loading => True,
+ Main_Priority => Default_Main_Priority,
+ Main_CPU => Default_Main_CPU,
+ Primary_Stack_Count => 0,
+ Sec_Stack_Count => 0,
+ Munit_Index => 0,
+ No_Elab_Code_All => False,
+ Serial_Number => 0,
+ Source_Index => Main_Source_File,
+ Unit_File_Name => Fname,
+ Unit_Name => No_Unit_Name,
+ Version => Version,
+ OA_Setting => 'O');
end if;
end Load_Main_Source;
@@ -728,34 +733,36 @@ package body Lib.Load is
if Src_Ind > No_Source_File then
Units.Table (Unum) :=
- (Cunit => Empty,
- Cunit_Entity => Empty,
- Dependency_Num => 0,
- Dynamic_Elab => False,
- Error_Location => Sloc (Error_Node),
- Expected_Unit => Uname_Actual,
- Fatal_Error => None,
- Generate_Code => False,
- Has_RACW => False,
- Filler => False,
- Ident_String => Empty,
+ (Cunit => Empty,
+ Cunit_Entity => Empty,
+ Dependency_Num => 0,
+ Dynamic_Elab => False,
+ Error_Location => Sloc (Error_Node),
+ Expected_Unit => Uname_Actual,
+ Fatal_Error => None,
+ Generate_Code => False,
+ Has_RACW => False,
+ Filler => False,
+ Ident_String => Empty,
Is_Predefined_Renaming => Ren_Name,
Is_Predefined_Unit => Pre_Name or Ren_Name,
Is_Internal_Unit => Pre_Name or Ren_Name or GNAT_Name,
Filler2 => False,
- Loading => True,
- Main_Priority => Default_Main_Priority,
- Main_CPU => Default_Main_CPU,
- Munit_Index => 0,
- No_Elab_Code_All => False,
- Serial_Number => 0,
- Source_Index => Src_Ind,
- Unit_File_Name => Fname,
- Unit_Name => Uname_Actual,
- Version => Source_Checksum (Src_Ind),
- OA_Setting => 'O');
+ Loading => True,
+ Main_Priority => Default_Main_Priority,
+ Main_CPU => Default_Main_CPU,
+ Primary_Stack_Count => 0,
+ Sec_Stack_Count => 0,
+ Munit_Index => 0,
+ No_Elab_Code_All => False,
+ Serial_Number => 0,
+ Source_Index => Src_Ind,
+ Unit_File_Name => Fname,
+ Unit_Name => Uname_Actual,
+ Version => Source_Checksum (Src_Ind),
+ OA_Setting => 'O');
-- Parse the new unit
diff --git a/gcc/ada/lib-writ.adb b/gcc/ada/lib-writ.adb
index d263b05dc1c..47109b4e3f9 100644
--- a/gcc/ada/lib-writ.adb
+++ b/gcc/ada/lib-writ.adb
@@ -96,6 +96,8 @@ package body Lib.Writ is
Main_CPU => -1,
Munit_Index => 0,
No_Elab_Code_All => False,
+ Primary_Stack_Count => 0,
+ Sec_Stack_Count => 0,
Serial_Number => 0,
Version => 0,
Error_Location => No_Location,
@@ -157,6 +159,8 @@ package body Lib.Writ is
Main_CPU => -1,
Munit_Index => 0,
No_Elab_Code_All => False,
+ Primary_Stack_Count => 0,
+ Sec_Stack_Count => 0,
Serial_Number => 0,
Version => 0,
Error_Location => No_Location,
@@ -616,6 +620,19 @@ package body Lib.Writ is
Write_With_Lines;
+ -- Generate task stack lines
+
+ if Primary_Stack_Count (Unit_Num) > 0
+ or else Sec_Stack_Count (Unit_Num) > 0
+ then
+ Write_Info_Initiate ('T');
+ Write_Info_Char (' ');
+ Write_Info_Int (Primary_Stack_Count (Unit_Num));
+ Write_Info_Char (' ');
+ Write_Info_Int (Sec_Stack_Count (Unit_Num));
+ Write_Info_EOL;
+ end if;
+
-- Generate the linker option lines
for J in 1 .. Linker_Option_Lines.Last loop
diff --git a/gcc/ada/lib-writ.ads b/gcc/ada/lib-writ.ads
index f113b0a5993..a959e94e2fc 100644
--- a/gcc/ada/lib-writ.ads
+++ b/gcc/ada/lib-writ.ads
@@ -6,7 +6,7 @@
-- --
-- S p e c --
-- --
--- Copyright (C) 1992-2016, Free Software Foundation, Inc. --
+-- Copyright (C) 1992-2017, Free Software Foundation, Inc. --
-- --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
@@ -670,14 +670,33 @@ package Lib.Writ is
-- binder do the consistency check, but not include the unit in the
-- partition closure (unless it is properly With'ed somewhere).
+ -- --------------------
+ -- -- T Task Stacks --
+ -- --------------------
+
+ -- Following the W lines (if any, or the U line if not) is an optional
+ -- line that identifies the number of default-sized primary and secondary
+ -- stacks that the binder needs to create for the tasks declared within
+ -- the unit. When present, the line has the form:
+
+ -- T primary-stack-quantity secondary-stack-quantity
+
+ -- The first parameter of T defines the number of task objects declared
+ -- in the unit that have no Storage_Size specified. The second parameter
+ -- defines the number of task objects declared in the unit that have no
+ -- Secondary_Stack_Size specified. These values are non-zero only if the
+ -- No_Implicit_Heap_Allocations or No_Implicit_Task_Allocations
+ -- restriction is active.
+
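+ -- For example (illustrative values only), a unit declaring two such task
+ -- objects without Storage_Size and one without Secondary_Stack_Size would
+ -- carry the line:
+
+ -- T 2 1
+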
-- -----------------------
-- -- L Linker_Options --
-- -----------------------
- -- Following the W lines (if any, or the U line if not), are an optional
- -- series of lines that indicates the usage of the pragma Linker_Options in
- -- the associated unit. For each appearance of a pragma Linker_Options (or
- -- Link_With) in the unit, a line is present with the form:
+ -- Following the T and W lines (if any, or the U line if not) is an
+ -- optional series of lines that indicates the usage of the pragma
+ -- Linker_Options in the associated unit. For each appearance of a pragma
+ -- Linker_Options (or Link_With) in the unit, a line is present with the
+ -- form:
-- L "string"
diff --git a/gcc/ada/lib.adb b/gcc/ada/lib.adb
index 8de6f355d0c..02eb1987d8e 100644
--- a/gcc/ada/lib.adb
+++ b/gcc/ada/lib.adb
@@ -178,6 +178,16 @@ package body Lib is
return Units.Table (U).OA_Setting;
end OA_Setting;
+ function Primary_Stack_Count (U : Unit_Number_Type) return Int is
+ begin
+ return Units.Table (U).Primary_Stack_Count;
+ end Primary_Stack_Count;
+
+ function Sec_Stack_Count (U : Unit_Number_Type) return Int is
+ begin
+ return Units.Table (U).Sec_Stack_Count;
+ end Sec_Stack_Count;
+
function Source_Index (U : Unit_Number_Type) return Source_File_Index is
begin
return Units.Table (U).Source_Index;
@@ -1027,6 +1037,26 @@ package body Lib is
return Get_Source_Unit (N1) = Get_Source_Unit (N2);
end In_Same_Source_Unit;
+ -----------------------------------
+ -- Increment_Primary_Stack_Count --
+ -----------------------------------
+
+ procedure Increment_Primary_Stack_Count (Increment : Int) is
+ PSC : Int renames Units.Table (Current_Sem_Unit).Primary_Stack_Count;
+ begin
+ PSC := PSC + Increment;
+ end Increment_Primary_Stack_Count;
+
+ -------------------------------
+ -- Increment_Sec_Stack_Count --
+ -------------------------------
+
+ procedure Increment_Sec_Stack_Count (Increment : Int) is
+ SSC : Int renames Units.Table (Current_Sem_Unit).Sec_Stack_Count;
+ begin
+ SSC := SSC + Increment;
+ end Increment_Sec_Stack_Count;
+
-----------------------------
-- Increment_Serial_Number --
-----------------------------
diff --git a/gcc/ada/lib.ads b/gcc/ada/lib.ads
index be6864a3e83..c9686992f5a 100644
--- a/gcc/ada/lib.ads
+++ b/gcc/ada/lib.ads
@@ -370,6 +370,20 @@ package Lib is
-- This is a character field containing L if Optimize_Alignment mode
-- was set locally, and O/T/S for Off/Time/Space default if not.
+ -- Primary_Stack_Count
+ -- The number of primary stacks belonging to tasks defined within the
+ -- unit that have no Storage_Size specified when either restriction
+ -- No_Implicit_Heap_Allocations or No_Implicit_Task_Allocations is
+ -- active. Only used by the binder to generate stacks for these tasks
+ -- at bind time.
+
+ -- Sec_Stack_Count
+ -- The number of secondary stacks belonging to tasks defined within the
+ -- unit that have no Secondary_Stack_Size specified when either the
+ -- No_Implicit_Heap_Allocations or No_Implicit_Task_Allocations
+ -- restriction is active. Only used by the binder to generate stacks
+ -- for these tasks at bind time.
+
-- Serial_Number
-- This field holds a serial number used by New_Internal_Name to
-- generate unique temporary numbers on a unit by unit basis. The
@@ -441,15 +455,20 @@ package Lib is
function Generate_Code (U : Unit_Number_Type) return Boolean;
function Ident_String (U : Unit_Number_Type) return Node_Id;
function Has_RACW (U : Unit_Number_Type) return Boolean;
- function Is_Predefined_Renaming (U : Unit_Number_Type) return Boolean;
- function Is_Internal_Unit (U : Unit_Number_Type) return Boolean;
- function Is_Predefined_Unit (U : Unit_Number_Type) return Boolean;
+ function Is_Predefined_Renaming
+ (U : Unit_Number_Type) return Boolean;
+ function Is_Internal_Unit (U : Unit_Number_Type) return Boolean;
+ function Is_Predefined_Unit
+ (U : Unit_Number_Type) return Boolean;
function Loading (U : Unit_Number_Type) return Boolean;
function Main_CPU (U : Unit_Number_Type) return Int;
function Main_Priority (U : Unit_Number_Type) return Int;
function Munit_Index (U : Unit_Number_Type) return Nat;
function No_Elab_Code_All (U : Unit_Number_Type) return Boolean;
function OA_Setting (U : Unit_Number_Type) return Character;
+ function Primary_Stack_Count
+ (U : Unit_Number_Type) return Int;
+ function Sec_Stack_Count (U : Unit_Number_Type) return Int;
function Source_Index (U : Unit_Number_Type) return Source_File_Index;
function Unit_File_Name (U : Unit_Number_Type) return File_Name_Type;
function Unit_Name (U : Unit_Number_Type) return Unit_Name_Type;
@@ -662,6 +681,13 @@ package Lib is
-- source unit, the criterion being that Get_Source_Unit yields the
-- same value for each argument.
+ procedure Increment_Primary_Stack_Count (Increment : Int);
+ -- Increment the Primary_Stack_Count field for the current unit by
+ -- Increment.
+
+ procedure Increment_Sec_Stack_Count (Increment : Int);
+ -- Increment the Sec_Stack_Count field for the current unit by Increment
+
function Increment_Serial_Number return Nat;
-- Increment Serial_Number field for current unit, and return the
-- incremented value.
@@ -794,6 +820,8 @@ private
pragma Inline (Fatal_Error);
pragma Inline (Generate_Code);
pragma Inline (Has_RACW);
+ pragma Inline (Increment_Primary_Stack_Count);
+ pragma Inline (Increment_Sec_Stack_Count);
pragma Inline (Increment_Serial_Number);
pragma Inline (Loading);
pragma Inline (Main_CPU);
@@ -809,6 +837,8 @@ private
pragma Inline (Is_Predefined_Renaming);
pragma Inline (Is_Internal_Unit);
pragma Inline (Is_Predefined_Unit);
+ pragma Inline (Primary_Stack_Count);
+ pragma Inline (Sec_Stack_Count);
pragma Inline (Set_Loading);
pragma Inline (Set_Main_CPU);
pragma Inline (Set_Main_Priority);
@@ -822,28 +852,30 @@ private
-- The Units Table
type Unit_Record is record
- Unit_File_Name : File_Name_Type;
- Unit_Name : Unit_Name_Type;
- Munit_Index : Nat;
- Expected_Unit : Unit_Name_Type;
- Source_Index : Source_File_Index;
- Cunit : Node_Id;
- Cunit_Entity : Entity_Id;
- Dependency_Num : Int;
- Ident_String : Node_Id;
- Main_Priority : Int;
- Main_CPU : Int;
- Serial_Number : Nat;
- Version : Word;
- Error_Location : Source_Ptr;
- Fatal_Error : Fatal_Type;
- Generate_Code : Boolean;
- Has_RACW : Boolean;
- Dynamic_Elab : Boolean;
- No_Elab_Code_All : Boolean;
- Filler : Boolean;
- Loading : Boolean;
- OA_Setting : Character;
+ Unit_File_Name : File_Name_Type;
+ Unit_Name : Unit_Name_Type;
+ Munit_Index : Nat;
+ Expected_Unit : Unit_Name_Type;
+ Source_Index : Source_File_Index;
+ Cunit : Node_Id;
+ Cunit_Entity : Entity_Id;
+ Dependency_Num : Int;
+ Ident_String : Node_Id;
+ Main_Priority : Int;
+ Main_CPU : Int;
+ Primary_Stack_Count : Int;
+ Sec_Stack_Count : Int;
+ Serial_Number : Nat;
+ Version : Word;
+ Error_Location : Source_Ptr;
+ Fatal_Error : Fatal_Type;
+ Generate_Code : Boolean;
+ Has_RACW : Boolean;
+ Dynamic_Elab : Boolean;
+ No_Elab_Code_All : Boolean;
+ Filler : Boolean;
+ Loading : Boolean;
+ OA_Setting : Character;
Is_Predefined_Renaming : Boolean;
Is_Internal_Unit : Boolean;
@@ -856,36 +888,38 @@ private
-- written by Tree_Gen, we do not write uninitialized values to the file.
for Unit_Record use record
- Unit_File_Name at 0 range 0 .. 31;
- Unit_Name at 4 range 0 .. 31;
- Munit_Index at 8 range 0 .. 31;
- Expected_Unit at 12 range 0 .. 31;
- Source_Index at 16 range 0 .. 31;
- Cunit at 20 range 0 .. 31;
- Cunit_Entity at 24 range 0 .. 31;
- Dependency_Num at 28 range 0 .. 31;
- Ident_String at 32 range 0 .. 31;
- Main_Priority at 36 range 0 .. 31;
- Main_CPU at 40 range 0 .. 31;
- Serial_Number at 44 range 0 .. 31;
- Version at 48 range 0 .. 31;
- Error_Location at 52 range 0 .. 31;
- Fatal_Error at 56 range 0 .. 7;
- Generate_Code at 57 range 0 .. 7;
- Has_RACW at 58 range 0 .. 7;
- Dynamic_Elab at 59 range 0 .. 7;
- No_Elab_Code_All at 60 range 0 .. 7;
- Filler at 61 range 0 .. 7;
- OA_Setting at 62 range 0 .. 7;
- Loading at 63 range 0 .. 7;
-
- Is_Predefined_Renaming at 64 range 0 .. 7;
- Is_Internal_Unit at 65 range 0 .. 7;
- Is_Predefined_Unit at 66 range 0 .. 7;
- Filler2 at 67 range 0 .. 7;
+ Unit_File_Name at 0 range 0 .. 31;
+ Unit_Name at 4 range 0 .. 31;
+ Munit_Index at 8 range 0 .. 31;
+ Expected_Unit at 12 range 0 .. 31;
+ Source_Index at 16 range 0 .. 31;
+ Cunit at 20 range 0 .. 31;
+ Cunit_Entity at 24 range 0 .. 31;
+ Dependency_Num at 28 range 0 .. 31;
+ Ident_String at 32 range 0 .. 31;
+ Main_Priority at 36 range 0 .. 31;
+ Main_CPU at 40 range 0 .. 31;
+ Primary_Stack_Count at 44 range 0 .. 31;
+ Sec_Stack_Count at 48 range 0 .. 31;
+ Serial_Number at 52 range 0 .. 31;
+ Version at 56 range 0 .. 31;
+ Error_Location at 60 range 0 .. 31;
+ Fatal_Error at 64 range 0 .. 7;
+ Generate_Code at 65 range 0 .. 7;
+ Has_RACW at 66 range 0 .. 7;
+ Dynamic_Elab at 67 range 0 .. 7;
+ No_Elab_Code_All at 68 range 0 .. 7;
+ Filler at 69 range 0 .. 7;
+ OA_Setting at 70 range 0 .. 7;
+ Loading at 71 range 0 .. 7;
+
+ Is_Predefined_Renaming at 72 range 0 .. 7;
+ Is_Internal_Unit at 73 range 0 .. 7;
+ Is_Predefined_Unit at 74 range 0 .. 7;
+ Filler2 at 75 range 0 .. 7;
end record;
- for Unit_Record'Size use 68 * 8;
+ for Unit_Record'Size use 76 * 8;
-- This ensures that we did not leave out any fields
package Units is new Table.Table (
diff --git a/gcc/ada/libgnarl/s-osinte__linux.ads b/gcc/ada/libgnarl/s-osinte__linux.ads
index 87da7ff01a5..a2ba537fb37 100644
--- a/gcc/ada/libgnarl/s-osinte__linux.ads
+++ b/gcc/ada/libgnarl/s-osinte__linux.ads
@@ -448,6 +448,9 @@ package System.OS_Interface is
abstime : access timespec) return int;
pragma Import (C, pthread_cond_timedwait, "pthread_cond_timedwait");
+ Relative_Timed_Wait : constant Boolean := False;
+ -- pthread_cond_timedwait requires an absolute delay time
+
--------------------------
-- POSIX.1c Section 13 --
--------------------------
diff --git a/gcc/ada/libgnarl/s-solita.adb b/gcc/ada/libgnarl/s-solita.adb
index bb38578b06f..a5485aa268d 100644
--- a/gcc/ada/libgnarl/s-solita.adb
+++ b/gcc/ada/libgnarl/s-solita.adb
@@ -44,6 +44,7 @@ with Ada.Exceptions.Is_Null_Occurrence;
with System.Task_Primitives.Operations;
with System.Tasking;
with System.Stack_Checking;
+with System.Secondary_Stack;
package body System.Soft_Links.Tasking is
@@ -52,6 +53,8 @@ package body System.Soft_Links.Tasking is
use Ada.Exceptions;
+ use type System.Secondary_Stack.SS_Stack_Ptr;
+
use type System.Tasking.Task_Id;
use type System.Tasking.Termination_Handler;
@@ -71,8 +74,8 @@ package body System.Soft_Links.Tasking is
procedure Set_Jmpbuf_Address (Addr : Address);
-- Get/Set Jmpbuf_Address for current task
- function Get_Sec_Stack_Addr return Address;
- procedure Set_Sec_Stack_Addr (Addr : Address);
+ function Get_Sec_Stack return SST.SS_Stack_Ptr;
+ procedure Set_Sec_Stack (Stack : SST.SS_Stack_Ptr);
-- Get/Set location of current task's secondary stack
procedure Timed_Delay_T (Time : Duration; Mode : Integer);
@@ -93,14 +96,14 @@ package body System.Soft_Links.Tasking is
return STPO.Self.Common.Compiler_Data.Jmpbuf_Address;
end Get_Jmpbuf_Address;
- function Get_Sec_Stack_Addr return Address is
+ function Get_Sec_Stack return SST.SS_Stack_Ptr is
begin
- return Result : constant Address :=
- STPO.Self.Common.Compiler_Data.Sec_Stack_Addr
+ return Result : constant SST.SS_Stack_Ptr :=
+ STPO.Self.Common.Compiler_Data.Sec_Stack_Ptr
do
- pragma Assert (Result /= Null_Address);
+ pragma Assert (Result /= null);
end return;
- end Get_Sec_Stack_Addr;
+ end Get_Sec_Stack;
function Get_Stack_Info return Stack_Checking.Stack_Access is
begin
@@ -116,10 +119,10 @@ package body System.Soft_Links.Tasking is
STPO.Self.Common.Compiler_Data.Jmpbuf_Address := Addr;
end Set_Jmpbuf_Address;
- procedure Set_Sec_Stack_Addr (Addr : Address) is
+ procedure Set_Sec_Stack (Stack : SST.SS_Stack_Ptr) is
begin
- STPO.Self.Common.Compiler_Data.Sec_Stack_Addr := Addr;
- end Set_Sec_Stack_Addr;
+ STPO.Self.Common.Compiler_Data.Sec_Stack_Ptr := Stack;
+ end Set_Sec_Stack;
-------------------
-- Timed_Delay_T --
@@ -213,20 +216,20 @@ package body System.Soft_Links.Tasking is
SSL.Get_Jmpbuf_Address := Get_Jmpbuf_Address'Access;
SSL.Set_Jmpbuf_Address := Set_Jmpbuf_Address'Access;
- SSL.Get_Sec_Stack_Addr := Get_Sec_Stack_Addr'Access;
+ SSL.Get_Sec_Stack := Get_Sec_Stack'Access;
SSL.Get_Stack_Info := Get_Stack_Info'Access;
- SSL.Set_Sec_Stack_Addr := Set_Sec_Stack_Addr'Access;
+ SSL.Set_Sec_Stack := Set_Sec_Stack'Access;
SSL.Timed_Delay := Timed_Delay_T'Access;
SSL.Task_Termination_Handler := Task_Termination_Handler_T'Access;
-- No need to create a new secondary stack, since we will use the
-- default one created in s-secsta.adb.
- SSL.Set_Sec_Stack_Addr (SSL.Get_Sec_Stack_Addr_NT);
+ SSL.Set_Sec_Stack (SSL.Get_Sec_Stack_NT);
SSL.Set_Jmpbuf_Address (SSL.Get_Jmpbuf_Address_NT);
end if;
- pragma Assert (Get_Sec_Stack_Addr /= Null_Address);
+ pragma Assert (Get_Sec_Stack /= null);
end Init_Tasking_Soft_Links;
end System.Soft_Links.Tasking;
diff --git a/gcc/ada/libgnarl/s-taprop__linux.adb b/gcc/ada/libgnarl/s-taprop__linux.adb
index 1dfcf39dd81..5da10824a15 100644
--- a/gcc/ada/libgnarl/s-taprop__linux.adb
+++ b/gcc/ada/libgnarl/s-taprop__linux.adb
@@ -38,9 +38,7 @@ pragma Polling (Off);
-- Turn off polling, we do not want ATC polling to take place during tasking
-- operations. It causes infinite loops and other problems.
-with Interfaces.C; use Interfaces;
-use type Interfaces.C.int;
-use type Interfaces.C.long;
+with Interfaces.C; use Interfaces; use type Interfaces.C.int;
with System.Task_Info;
with System.Tasking.Debug;
@@ -112,8 +110,6 @@ package body System.Task_Primitives.Operations is
-- Constant to indicate that the thread identifier has not yet been
-- initialized.
- Base_Monotonic_Clock : Duration := 0.0;
-
--------------------
-- Local Packages --
--------------------
@@ -141,6 +137,38 @@ package body System.Task_Primitives.Operations is
package body Specific is separate;
-- The body of this package is target specific
+ package Monotonic is
+
+ function Monotonic_Clock return Duration;
+ pragma Inline (Monotonic_Clock);
+ -- Returns "absolute" time, represented as an offset relative to "the
+ -- Epoch", which is Jan 1, 1970. This clock implementation is immune to
+ -- the system's clock changes.
+
+ function RT_Resolution return Duration;
+ pragma Inline (RT_Resolution);
+ -- Returns resolution of the underlying clock used to implement RT_Clock
+
+ procedure Timed_Sleep
+ (Self_ID : ST.Task_Id;
+ Time : Duration;
+ Mode : ST.Delay_Modes;
+ Reason : System.Tasking.Task_States;
+ Timedout : out Boolean;
+ Yielded : out Boolean);
+ -- Combination of Sleep (above) and Timed_Delay
+
+ procedure Timed_Delay
+ (Self_ID : ST.Task_Id;
+ Time : Duration;
+ Mode : ST.Delay_Modes);
+ -- Implement the semantics of the delay statement.
+ -- The caller should be abort-deferred and should not hold any locks.
+
+ end Monotonic;
+
+ package body Monotonic is separate;
+
----------------------------------
-- ATCB allocation/deallocation --
----------------------------------
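A user-level sketch of what the Monotonic package introduced above ultimately supports (Demo_Monotonic is a hypothetical name): Ada.Real_Time and "delay until" rest on this kind of monotonic clock, so the delay below is not disturbed if the system wall-clock time is changed while it is pending.

   with Ada.Real_Time; use Ada.Real_Time;
   procedure Demo_Monotonic is
      Start : constant Time := Clock;
   begin
      --  Wakes up 100 ms after Start on the monotonic timeline
      delay until Start + Milliseconds (100);
   end Demo_Monotonic;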
@@ -152,11 +180,16 @@ package body System.Task_Primitives.Operations is
-- Support for foreign threads --
---------------------------------
- function Register_Foreign_Thread (Thread : Thread_Id) return Task_Id;
- -- Allocate and Initialize a new ATCB for the current Thread
+ function Register_Foreign_Thread
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size) return Task_Id;
+ -- Allocate and initialize a new ATCB for the current Thread. The size of
+ -- the secondary stack can be optionally specified.
function Register_Foreign_Thread
- (Thread : Thread_Id) return Task_Id is separate;
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size)
+ return Task_Id is separate;
-----------------------
-- Local Subprograms --
@@ -164,11 +197,6 @@ package body System.Task_Primitives.Operations is
procedure Abort_Handler (signo : Signal);
- function Compute_Base_Monotonic_Clock return Duration;
- -- The monotonic clock epoch is set to some undetermined time in the past
- -- (typically system boot time). In order to use the monotonic clock for
- -- absolute time, the offset from a known epoch is needed.
-
function GNAT_pthread_condattr_setup
(attr : access pthread_condattr_t) return C.int;
pragma Import
@@ -270,100 +298,6 @@ package body System.Task_Primitives.Operations is
end if;
end Abort_Handler;
- ----------------------------------
- -- Compute_Base_Monotonic_Clock --
- ----------------------------------
-
- function Compute_Base_Monotonic_Clock return Duration is
- Aft : Duration;
- Bef : Duration;
- Mon : Duration;
- Res_A : Interfaces.C.int;
- Res_B : Interfaces.C.int;
- Res_M : Interfaces.C.int;
- TS_Aft : aliased timespec;
- TS_Aft0 : aliased timespec;
- TS_Bef : aliased timespec;
- TS_Bef0 : aliased timespec;
- TS_Mon : aliased timespec;
- TS_Mon0 : aliased timespec;
-
- begin
- Res_B :=
- clock_gettime
- (clock_id => OSC.CLOCK_REALTIME,
- tp => TS_Bef0'Unchecked_Access);
- pragma Assert (Res_B = 0);
-
- Res_M :=
- clock_gettime
- (clock_id => OSC.CLOCK_RT_Ada,
- tp => TS_Mon0'Unchecked_Access);
- pragma Assert (Res_M = 0);
-
- Res_A :=
- clock_gettime
- (clock_id => OSC.CLOCK_REALTIME,
- tp => TS_Aft0'Unchecked_Access);
- pragma Assert (Res_A = 0);
-
- for I in 1 .. 10 loop
-
- -- Guard against a leap second that will cause CLOCK_REALTIME to jump
- -- backwards. In the extrenmely unlikely event we call clock_gettime
- -- before and after the jump the epoch, the result will be off
- -- slightly.
- -- Use only results where the tv_sec values match, for the sake of
- -- convenience.
- -- Also try to calculate the most accurate epoch by taking the
- -- minimum difference of 10 tries.
-
- Res_B :=
- clock_gettime
- (clock_id => OSC.CLOCK_REALTIME,
- tp => TS_Bef'Unchecked_Access);
- pragma Assert (Res_B = 0);
-
- Res_M :=
- clock_gettime
- (clock_id => OSC.CLOCK_RT_Ada,
- tp => TS_Mon'Unchecked_Access);
- pragma Assert (Res_M = 0);
-
- Res_A :=
- clock_gettime
- (clock_id => OSC.CLOCK_REALTIME,
- tp => TS_Aft'Unchecked_Access);
- pragma Assert (Res_A = 0);
-
- -- The calls to clock_gettime before the loop were no good
-
- if (TS_Bef0.tv_sec /= TS_Aft0.tv_sec
- and then TS_Bef.tv_sec = TS_Aft.tv_sec)
-
- -- The most recent calls to clock_gettime were better
-
- or else
- (TS_Bef0.tv_sec = TS_Aft0.tv_sec
- and then TS_Bef.tv_sec = TS_Aft.tv_sec
- and then (TS_Aft.tv_nsec - TS_Bef.tv_nsec
- < TS_Aft0.tv_nsec - TS_Bef0.tv_nsec))
- then
- TS_Bef0 := TS_Bef;
- TS_Aft0 := TS_Aft;
- TS_Mon0 := TS_Mon;
- end if;
- end loop;
-
- Bef := To_Duration (TS_Bef0);
- Mon := To_Duration (TS_Mon0);
- Aft := To_Duration (TS_Aft0);
-
- -- Distribute the division, to avoid potential type overflow someday
-
- return Bef / 2 + Aft / 2 - Mon;
- end Compute_Base_Monotonic_Clock;
-
--------------
-- Lock_RTS --
--------------
@@ -685,56 +619,7 @@ package body System.Task_Primitives.Operations is
Mode : ST.Delay_Modes;
Reason : System.Tasking.Task_States;
Timedout : out Boolean;
- Yielded : out Boolean)
- is
- pragma Unreferenced (Reason);
-
- Base_Time : constant Duration := Monotonic_Clock;
- Check_Time : Duration := Base_Time - Base_Monotonic_Clock;
- Abs_Time : Duration;
- Request : aliased timespec;
- Result : C.int;
-
- begin
- Timedout := True;
- Yielded := False;
-
- Abs_Time :=
- (if Mode = Relative
- then Duration'Min (Time, Max_Sensible_Delay) + Check_Time
- else Duration'Min (Check_Time + Max_Sensible_Delay,
- Time - Base_Monotonic_Clock));
-
- if Abs_Time > Check_Time then
- Request := To_Timespec (Abs_Time);
-
- loop
- exit when Self_ID.Pending_ATC_Level < Self_ID.ATC_Nesting_Level;
-
- Result :=
- pthread_cond_timedwait
- (cond => Self_ID.Common.LL.CV'Access,
- mutex => (if Single_Lock
- then Single_RTS_Lock'Access
- else Self_ID.Common.LL.L'Access),
- abstime => Request'Access);
-
- Check_Time := Monotonic_Clock;
- exit when Abs_Time + Base_Monotonic_Clock <= Check_Time
- or else Check_Time < Base_Time;
-
- if Result in 0 | EINTR then
-
- -- Somebody may have called Wakeup for us
-
- Timedout := False;
- exit;
- end if;
-
- pragma Assert (Result = ETIMEDOUT);
- end loop;
- end if;
- end Timed_Sleep;
+ Yielded : out Boolean) renames Monotonic.Timed_Sleep;
-----------------
-- Timed_Delay --
@@ -746,92 +631,19 @@ package body System.Task_Primitives.Operations is
procedure Timed_Delay
(Self_ID : Task_Id;
Time : Duration;
- Mode : ST.Delay_Modes)
- is
- Base_Time : constant Duration := Monotonic_Clock;
- Check_Time : Duration := Base_Time - Base_Monotonic_Clock;
- Abs_Time : Duration;
- Request : aliased timespec;
-
- Result : C.int;
- pragma Warnings (Off, Result);
-
- begin
- if Single_Lock then
- Lock_RTS;
- end if;
-
- Write_Lock (Self_ID);
-
- Abs_Time :=
- (if Mode = Relative
- then Time + Check_Time
- else Duration'Min (Check_Time + Max_Sensible_Delay,
- Time - Base_Monotonic_Clock));
-
- if Abs_Time > Check_Time then
- Request := To_Timespec (Abs_Time);
- Self_ID.Common.State := Delay_Sleep;
-
- loop
- exit when Self_ID.Pending_ATC_Level < Self_ID.ATC_Nesting_Level;
-
- Result :=
- pthread_cond_timedwait
- (cond => Self_ID.Common.LL.CV'Access,
- mutex => (if Single_Lock
- then Single_RTS_Lock'Access
- else Self_ID.Common.LL.L'Access),
- abstime => Request'Access);
-
- Check_Time := Monotonic_Clock;
- exit when Abs_Time + Base_Monotonic_Clock <= Check_Time
- or else Check_Time < Base_Time;
-
- pragma Assert (Result in 0 | ETIMEDOUT | EINTR);
- end loop;
-
- Self_ID.Common.State := Runnable;
- end if;
-
- Unlock (Self_ID);
-
- if Single_Lock then
- Unlock_RTS;
- end if;
-
- Result := sched_yield;
- end Timed_Delay;
+ Mode : ST.Delay_Modes) renames Monotonic.Timed_Delay;
---------------------
-- Monotonic_Clock --
---------------------
- function Monotonic_Clock return Duration is
- TS : aliased timespec;
- Result : Interfaces.C.int;
- begin
- Result := clock_gettime
- (clock_id => OSC.CLOCK_RT_Ada, tp => TS'Unchecked_Access);
- pragma Assert (Result = 0);
-
- return Base_Monotonic_Clock + To_Duration (TS);
- end Monotonic_Clock;
+ function Monotonic_Clock return Duration renames Monotonic.Monotonic_Clock;
-------------------
-- RT_Resolution --
-------------------
- function RT_Resolution return Duration is
- TS : aliased timespec;
- Result : C.int;
-
- begin
- Result := clock_getres (OSC.CLOCK_REALTIME, TS'Unchecked_Access);
- pragma Assert (Result = 0);
-
- return To_Duration (TS);
- end RT_Resolution;
+ function RT_Resolution return Duration renames Monotonic.RT_Resolution;
------------
-- Wakeup --
@@ -1607,8 +1419,6 @@ package body System.Task_Primitives.Operations is
Interrupt_Management.Initialize;
- Base_Monotonic_Clock := Compute_Base_Monotonic_Clock;
-
-- Prepare the set of signals that should be unblocked in all tasks
Result := sigemptyset (Unblocked_Signal_Mask'Access);
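
The clock and delay subprograms above are now completed as renamings of the nested Monotonic package (whose body is the new s-tpopmo.adb subunit further down in this patch). As a minimal, hedged sketch of that renaming-as-body idiom, the self-contained units below use hypothetical names (Renames_Demo, Inner, Demo_Main) that are not part of this change:

package Renames_Demo is
   function Clock return Duration;
end Renames_Demo;

package body Renames_Demo is

   --  Nested package playing the role of Monotonic in the patch; here its
   --  body is written inline rather than provided as a separate subunit.
   package Inner is
      function Clock return Duration;
   end Inner;

   package body Inner is
      function Clock return Duration is
      begin
         return 0.0;  --  placeholder reading
      end Clock;
   end Inner;

   --  Renaming-as-body: the spec's Clock is completed by delegating to
   --  Inner.Clock, mirroring "renames Monotonic.Timed_Sleep" above.
   function Clock return Duration renames Inner.Clock;

end Renames_Demo;

with Ada.Text_IO;
with Renames_Demo;

procedure Demo_Main is
   package Duration_IO is new Ada.Text_IO.Fixed_IO (Duration);
begin
   Duration_IO.Put (Renames_Demo.Clock);
   Ada.Text_IO.New_Line;
end Demo_Main;
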
diff --git a/gcc/ada/libgnarl/s-taprop__mingw.adb b/gcc/ada/libgnarl/s-taprop__mingw.adb
index fa966514568..b14444ad185 100644
--- a/gcc/ada/libgnarl/s-taprop__mingw.adb
+++ b/gcc/ada/libgnarl/s-taprop__mingw.adb
@@ -190,11 +190,16 @@ package body System.Task_Primitives.Operations is
-- Support for foreign threads --
---------------------------------
- function Register_Foreign_Thread (Thread : Thread_Id) return Task_Id;
- -- Allocate and Initialize a new ATCB for the current Thread
+ function Register_Foreign_Thread
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size) return Task_Id;
+ -- Allocate and initialize a new ATCB for the current Thread. The size of
+ -- the secondary stack can be optionally specified.
function Register_Foreign_Thread
- (Thread : Thread_Id) return Task_Id is separate;
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size)
+ return Task_Id is separate;
----------------------------------
-- Condition Variable Functions --
diff --git a/gcc/ada/libgnarl/s-taprop__posix.adb b/gcc/ada/libgnarl/s-taprop__posix.adb
index 3efc1e0de1a..d9ee078b364 100644
--- a/gcc/ada/libgnarl/s-taprop__posix.adb
+++ b/gcc/ada/libgnarl/s-taprop__posix.adb
@@ -145,6 +145,38 @@ package body System.Task_Primitives.Operations is
package body Specific is separate;
-- The body of this package is target specific
+ package Monotonic is
+
+ function Monotonic_Clock return Duration;
+ pragma Inline (Monotonic_Clock);
+ -- Returns "absolute" time, represented as an offset relative to "the
+ -- Epoch", which is Jan 1, 1970. This clock implementation is immune to
+ -- the system's clock changes.
+
+ function RT_Resolution return Duration;
+ pragma Inline (RT_Resolution);
+ -- Returns the resolution of the underlying clock used to implement RT_Clock
+
+ procedure Timed_Sleep
+ (Self_ID : ST.Task_Id;
+ Time : Duration;
+ Mode : ST.Delay_Modes;
+ Reason : System.Tasking.Task_States;
+ Timedout : out Boolean;
+ Yielded : out Boolean);
+ -- Combination of Sleep (above) and Timed_Delay
+
+ procedure Timed_Delay
+ (Self_ID : ST.Task_Id;
+ Time : Duration;
+ Mode : ST.Delay_Modes);
+ -- Implement the semantics of the delay statement.
+ -- The caller should be abort-deferred and should not hold any locks.
+
+ end Monotonic;
+
+ package body Monotonic is separate;
+
----------------------------------
-- ATCB allocation/deallocation --
----------------------------------
@@ -156,11 +188,16 @@ package body System.Task_Primitives.Operations is
-- Support for foreign threads --
---------------------------------
- function Register_Foreign_Thread (Thread : Thread_Id) return Task_Id;
- -- Allocate and Initialize a new ATCB for the current Thread
+ function Register_Foreign_Thread
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size) return Task_Id;
+ -- Allocate and initialize a new ATCB for the current Thread. The size of
+ -- the secondary stack can be optionally specified.
function Register_Foreign_Thread
- (Thread : Thread_Id) return Task_Id is separate;
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size)
+ return Task_Id is separate;
-----------------------
-- Local Subprograms --
@@ -178,18 +215,6 @@ package body System.Task_Primitives.Operations is
pragma Import (C,
GNAT_pthread_condattr_setup, "__gnat_pthread_condattr_setup");
- procedure Compute_Deadline
- (Time : Duration;
- Mode : ST.Delay_Modes;
- Check_Time : out Duration;
- Abs_Time : out Duration;
- Rel_Time : out Duration);
- -- Helper for Timed_Sleep and Timed_Delay: given a deadline specified by
- -- Time and Mode, compute the current clock reading (Check_Time), and the
- -- target absolute and relative clock readings (Abs_Time, Rel_Time). The
- -- epoch for Time depends on Mode; the epoch for Check_Time and Abs_Time
- -- is always that of CLOCK_RT_Ada.
-
-------------------
-- Abort_Handler --
-------------------
@@ -248,67 +273,6 @@ package body System.Task_Primitives.Operations is
end if;
end Abort_Handler;
- ----------------------
- -- Compute_Deadline --
- ----------------------
-
- procedure Compute_Deadline
- (Time : Duration;
- Mode : ST.Delay_Modes;
- Check_Time : out Duration;
- Abs_Time : out Duration;
- Rel_Time : out Duration)
- is
- begin
- Check_Time := Monotonic_Clock;
-
- -- Relative deadline
-
- if Mode = Relative then
- Abs_Time := Duration'Min (Time, Max_Sensible_Delay) + Check_Time;
-
- if Relative_Timed_Wait then
- Rel_Time := Duration'Min (Max_Sensible_Delay, Time);
- end if;
-
- pragma Warnings (Off);
- -- Comparison "OSC.CLOCK_RT_Ada = OSC.CLOCK_REALTIME" is compile
- -- time known.
-
- -- Absolute deadline specified using the tasking clock (CLOCK_RT_Ada)
-
- elsif Mode = Absolute_RT
- or else OSC.CLOCK_RT_Ada = OSC.CLOCK_REALTIME
- then
- pragma Warnings (On);
- Abs_Time := Duration'Min (Check_Time + Max_Sensible_Delay, Time);
-
- if Relative_Timed_Wait then
- Rel_Time := Duration'Min (Max_Sensible_Delay, Time - Check_Time);
- end if;
-
- -- Absolute deadline specified using the calendar clock, in the
- -- case where it is not the same as the tasking clock: compensate for
- -- difference between clock epochs (Base_Time - Base_Cal_Time).
-
- else
- declare
- Cal_Check_Time : constant Duration := OS_Primitives.Clock;
- RT_Time : constant Duration :=
- Time + Check_Time - Cal_Check_Time;
-
- begin
- Abs_Time :=
- Duration'Min (Check_Time + Max_Sensible_Delay, RT_Time);
-
- if Relative_Timed_Wait then
- Rel_Time :=
- Duration'Min (Max_Sensible_Delay, RT_Time - Check_Time);
- end if;
- end;
- end if;
- end Compute_Deadline;
-
-----------------
-- Stack_Guard --
-----------------
@@ -595,60 +559,7 @@ package body System.Task_Primitives.Operations is
Mode : ST.Delay_Modes;
Reason : Task_States;
Timedout : out Boolean;
- Yielded : out Boolean)
- is
- pragma Unreferenced (Reason);
-
- Base_Time : Duration;
- Check_Time : Duration;
- Abs_Time : Duration;
- Rel_Time : Duration;
-
- Request : aliased timespec;
- Result : Interfaces.C.int;
-
- begin
- Timedout := True;
- Yielded := False;
-
- Compute_Deadline
- (Time => Time,
- Mode => Mode,
- Check_Time => Check_Time,
- Abs_Time => Abs_Time,
- Rel_Time => Rel_Time);
- Base_Time := Check_Time;
-
- if Abs_Time > Check_Time then
- Request :=
- To_Timespec (if Relative_Timed_Wait then Rel_Time else Abs_Time);
-
- loop
- exit when Self_ID.Pending_ATC_Level < Self_ID.ATC_Nesting_Level;
-
- Result :=
- pthread_cond_timedwait
- (cond => Self_ID.Common.LL.CV'Access,
- mutex => (if Single_Lock
- then Single_RTS_Lock'Access
- else Self_ID.Common.LL.L'Access),
- abstime => Request'Access);
-
- Check_Time := Monotonic_Clock;
- exit when Abs_Time <= Check_Time or else Check_Time < Base_Time;
-
- if Result = 0 or Result = EINTR then
-
- -- Somebody may have called Wakeup for us
-
- Timedout := False;
- exit;
- end if;
-
- pragma Assert (Result = ETIMEDOUT);
- end loop;
- end if;
- end Timed_Sleep;
+ Yielded : out Boolean) renames Monotonic.Timed_Sleep;
-----------------
-- Timed_Delay --
@@ -660,95 +571,19 @@ package body System.Task_Primitives.Operations is
procedure Timed_Delay
(Self_ID : Task_Id;
Time : Duration;
- Mode : ST.Delay_Modes)
- is
- Base_Time : Duration;
- Check_Time : Duration;
- Abs_Time : Duration;
- Rel_Time : Duration;
- Request : aliased timespec;
-
- Result : Interfaces.C.int;
- pragma Warnings (Off, Result);
-
- begin
- if Single_Lock then
- Lock_RTS;
- end if;
-
- Write_Lock (Self_ID);
-
- Compute_Deadline
- (Time => Time,
- Mode => Mode,
- Check_Time => Check_Time,
- Abs_Time => Abs_Time,
- Rel_Time => Rel_Time);
- Base_Time := Check_Time;
-
- if Abs_Time > Check_Time then
- Request :=
- To_Timespec (if Relative_Timed_Wait then Rel_Time else Abs_Time);
- Self_ID.Common.State := Delay_Sleep;
-
- loop
- exit when Self_ID.Pending_ATC_Level < Self_ID.ATC_Nesting_Level;
-
- Result :=
- pthread_cond_timedwait
- (cond => Self_ID.Common.LL.CV'Access,
- mutex => (if Single_Lock
- then Single_RTS_Lock'Access
- else Self_ID.Common.LL.L'Access),
- abstime => Request'Access);
-
- Check_Time := Monotonic_Clock;
- exit when Abs_Time <= Check_Time or else Check_Time < Base_Time;
-
- pragma Assert (Result = 0
- or else Result = ETIMEDOUT
- or else Result = EINTR);
- end loop;
-
- Self_ID.Common.State := Runnable;
- end if;
-
- Unlock (Self_ID);
-
- if Single_Lock then
- Unlock_RTS;
- end if;
-
- Result := sched_yield;
- end Timed_Delay;
+ Mode : ST.Delay_Modes) renames Monotonic.Timed_Delay;
---------------------
-- Monotonic_Clock --
---------------------
- function Monotonic_Clock return Duration is
- TS : aliased timespec;
- Result : Interfaces.C.int;
- begin
- Result := clock_gettime
- (clock_id => OSC.CLOCK_RT_Ada, tp => TS'Unchecked_Access);
- pragma Assert (Result = 0);
- return To_Duration (TS);
- end Monotonic_Clock;
+ function Monotonic_Clock return Duration renames Monotonic.Monotonic_Clock;
-------------------
-- RT_Resolution --
-------------------
- function RT_Resolution return Duration is
- TS : aliased timespec;
- Result : Interfaces.C.int;
- begin
- Result := clock_getres (OSC.CLOCK_REALTIME, TS'Unchecked_Access);
- pragma Assert (Result = 0);
-
- return To_Duration (TS);
- end RT_Resolution;
+ function RT_Resolution return Duration renames Monotonic.RT_Resolution;
------------
-- Wakeup --
diff --git a/gcc/ada/libgnarl/s-taprop__solaris.adb b/gcc/ada/libgnarl/s-taprop__solaris.adb
index e97662c12b1..26d83e584d6 100644
--- a/gcc/ada/libgnarl/s-taprop__solaris.adb
+++ b/gcc/ada/libgnarl/s-taprop__solaris.adb
@@ -237,11 +237,16 @@ package body System.Task_Primitives.Operations is
-- Support for foreign threads --
---------------------------------
- function Register_Foreign_Thread (Thread : Thread_Id) return Task_Id;
- -- Allocate and Initialize a new ATCB for the current Thread
+ function Register_Foreign_Thread
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size) return Task_Id;
+ -- Allocate and initialize a new ATCB for the current Thread. The size of
+ -- the secondary stack can be optionally specified.
function Register_Foreign_Thread
- (Thread : Thread_Id) return Task_Id is separate;
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size)
+ return Task_Id is separate;
------------
-- Checks --
diff --git a/gcc/ada/libgnarl/s-taprop__vxworks.adb b/gcc/ada/libgnarl/s-taprop__vxworks.adb
index b77fb106b37..83ebc22312e 100644
--- a/gcc/ada/libgnarl/s-taprop__vxworks.adb
+++ b/gcc/ada/libgnarl/s-taprop__vxworks.adb
@@ -149,11 +149,16 @@ package body System.Task_Primitives.Operations is
-- Support for foreign threads --
---------------------------------
- function Register_Foreign_Thread (Thread : Thread_Id) return Task_Id;
- -- Allocate and Initialize a new ATCB for the current Thread
+ function Register_Foreign_Thread
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size) return Task_Id;
+ -- Allocate and initialize a new ATCB for the current Thread. The size of
+ -- the secondary stack can be optionally specified.
function Register_Foreign_Thread
- (Thread : Thread_Id) return Task_Id is separate;
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size)
+ return Task_Id is separate;
-----------------------
-- Local Subprograms --
diff --git a/gcc/ada/libgnarl/s-tarest.adb b/gcc/ada/libgnarl/s-tarest.adb
index daff5c1c3ae..7b9f260927e 100644
--- a/gcc/ada/libgnarl/s-tarest.adb
+++ b/gcc/ada/libgnarl/s-tarest.adb
@@ -47,12 +47,6 @@ with Ada.Exceptions;
with System.Task_Primitives.Operations;
with System.Soft_Links.Tasking;
-with System.Storage_Elements;
-
-with System.Secondary_Stack;
-pragma Elaborate_All (System.Secondary_Stack);
--- Make sure the body of Secondary_Stack is elaborated before calling
--- Init_Tasking_Soft_Links. See comments for this routine for explanation.
with System.Soft_Links;
-- Used for the non-tasking routines (*_NT) that refer to global data. They
@@ -65,8 +59,6 @@ package body System.Tasking.Restricted.Stages is
package STPO renames System.Task_Primitives.Operations;
package SSL renames System.Soft_Links;
- package SSE renames System.Storage_Elements;
- package SST renames System.Secondary_Stack;
use Ada.Exceptions;
@@ -115,17 +107,18 @@ package body System.Tasking.Restricted.Stages is
-- This should only be called by the Task_Wrapper procedure.
procedure Create_Restricted_Task
- (Priority : Integer;
- Stack_Address : System.Address;
- Size : System.Parameters.Size_Type;
- Secondary_Stack_Size : System.Parameters.Size_Type;
- Task_Info : System.Task_Info.Task_Info_Type;
- CPU : Integer;
- State : Task_Procedure_Access;
- Discriminants : System.Address;
- Elaborated : Access_Boolean;
- Task_Image : String;
- Created_Task : Task_Id);
+ (Priority : Integer;
+ Stack_Address : System.Address;
+ Stack_Size : System.Parameters.Size_Type;
+ Sec_Stack_Address : System.Secondary_Stack.SS_Stack_Ptr;
+ Sec_Stack_Size : System.Parameters.Size_Type;
+ Task_Info : System.Task_Info.Task_Info_Type;
+ CPU : Integer;
+ State : Task_Procedure_Access;
+ Discriminants : System.Address;
+ Elaborated : Access_Boolean;
+ Task_Image : String;
+ Created_Task : Task_Id);
-- Code shared between Create_Restricted_Task (the concurrent version) and
-- Create_Restricted_Task_Sequential. See comment of the former in the
-- specification of this package.
@@ -205,54 +198,6 @@ package body System.Tasking.Restricted.Stages is
--
-- DO NOT delete ID. As noted, it is needed on some targets.
- function Secondary_Stack_Size return Storage_Elements.Storage_Offset;
- -- Returns the size of the secondary stack for the task. For fixed
- -- secondary stacks, the function will return the ATCB field
- -- Secondary_Stack_Size if it is not set to Unspecified_Size,
- -- otherwise a percentage of the stack is reserved using the
- -- System.Parameters.Sec_Stack_Percentage property.
-
- -- Dynamic secondary stacks are allocated in System.Soft_Links.
- -- Create_TSD and thus the function returns 0 to suppress the
- -- creation of the fixed secondary stack in the primary stack.
-
- --------------------------
- -- Secondary_Stack_Size --
- --------------------------
-
- function Secondary_Stack_Size return Storage_Elements.Storage_Offset is
- use System.Storage_Elements;
- use System.Secondary_Stack;
-
- begin
- if Parameters.Sec_Stack_Dynamic then
- return 0;
-
- elsif Self_ID.Common.Secondary_Stack_Size = Unspecified_Size then
- return (Self_ID.Common.Compiler_Data.Pri_Stack_Info.Size
- * SSE.Storage_Offset (Sec_Stack_Percentage) / 100);
- else
- -- Use the size specified by aspect Secondary_Stack_Size padded
- -- by the amount of space used by the stack data structure.
-
- return Storage_Offset (Self_ID.Common.Secondary_Stack_Size) +
- Storage_Offset (Minimum_Secondary_Stack_Size);
- end if;
- end Secondary_Stack_Size;
-
- Secondary_Stack : aliased Storage_Elements.Storage_Array
- (1 .. Secondary_Stack_Size);
- for Secondary_Stack'Alignment use Standard'Maximum_Alignment;
- -- This is the secondary stack data. Note that it is critical that this
- -- have maximum alignment, since any kind of data can be allocated here.
-
- pragma Warnings (Off);
- Secondary_Stack_Address : System.Address := Secondary_Stack'Address;
- pragma Warnings (On);
- -- Address of secondary stack. In the fixed secondary stack case, this
- -- value is not modified, causing a warning, hence the bracketing with
- -- Warnings (Off/On).
-
Cause : Cause_Of_Termination := Normal;
-- Indicates the reason why this task terminates. Normal corresponds to
-- a task terminating due to completing the last statement of its body.
@@ -266,15 +211,7 @@ package body System.Tasking.Restricted.Stages is
-- execution of its task body, then EO will contain the associated
-- exception occurrence. Otherwise, it will contain Null_Occurrence.
- -- Start of processing for Task_Wrapper
-
begin
- if not Parameters.Sec_Stack_Dynamic then
- Self_ID.Common.Compiler_Data.Sec_Stack_Addr :=
- Secondary_Stack'Address;
- SST.SS_Init (Secondary_Stack_Address, Integer (Secondary_Stack'Last));
- end if;
-
-- Initialize low-level TCB components, that cannot be initialized by
-- the creator.
@@ -539,17 +476,18 @@ package body System.Tasking.Restricted.Stages is
----------------------------
procedure Create_Restricted_Task
- (Priority : Integer;
- Stack_Address : System.Address;
- Size : System.Parameters.Size_Type;
- Secondary_Stack_Size : System.Parameters.Size_Type;
- Task_Info : System.Task_Info.Task_Info_Type;
- CPU : Integer;
- State : Task_Procedure_Access;
- Discriminants : System.Address;
- Elaborated : Access_Boolean;
- Task_Image : String;
- Created_Task : Task_Id)
+ (Priority : Integer;
+ Stack_Address : System.Address;
+ Stack_Size : System.Parameters.Size_Type;
+ Sec_Stack_Address : System.Secondary_Stack.SS_Stack_Ptr;
+ Sec_Stack_Size : System.Parameters.Size_Type;
+ Task_Info : System.Task_Info.Task_Info_Type;
+ CPU : Integer;
+ State : Task_Procedure_Access;
+ Discriminants : System.Address;
+ Elaborated : Access_Boolean;
+ Task_Image : String;
+ Created_Task : Task_Id)
is
Self_ID : constant Task_Id := STPO.Self;
Base_Priority : System.Any_Priority;
@@ -608,8 +546,7 @@ package body System.Tasking.Restricted.Stages is
Initialize_ATCB
(Self_ID, State, Discriminants, Self_ID, Elaborated, Base_Priority,
- Base_CPU, null, Task_Info, Size, Secondary_Stack_Size,
- Created_Task, Success);
+ Base_CPU, null, Task_Info, Stack_Size, Created_Task, Success);
-- If we do our job right then there should never be any failures, which
-- was probably said about the Titanic; so just to be safe, let's retain
@@ -639,25 +576,31 @@ package body System.Tasking.Restricted.Stages is
Unlock_RTS;
end if;
- -- Create TSD as early as possible in the creation of a task, since it
- -- may be used by the operation of Ada code within the task.
+ -- Create TSD as early as possible in the creation of a task, since
+ -- it may be used by the operation of Ada code within the task. If the
+ -- compiler has not allocated a secondary stack, a stack will be
+ -- allocated from the binder-generated pool.
- SSL.Create_TSD (Created_Task.Common.Compiler_Data);
+ SSL.Create_TSD
+ (Created_Task.Common.Compiler_Data,
+ Sec_Stack_Address,
+ Sec_Stack_Size);
end Create_Restricted_Task;
procedure Create_Restricted_Task
- (Priority : Integer;
- Stack_Address : System.Address;
- Size : System.Parameters.Size_Type;
- Secondary_Stack_Size : System.Parameters.Size_Type;
- Task_Info : System.Task_Info.Task_Info_Type;
- CPU : Integer;
- State : Task_Procedure_Access;
- Discriminants : System.Address;
- Elaborated : Access_Boolean;
- Chain : in out Activation_Chain;
- Task_Image : String;
- Created_Task : Task_Id)
+ (Priority : Integer;
+ Stack_Address : System.Address;
+ Stack_Size : System.Parameters.Size_Type;
+ Sec_Stack_Address : System.Secondary_Stack.SS_Stack_Ptr;
+ Sec_Stack_Size : System.Parameters.Size_Type;
+ Task_Info : System.Task_Info.Task_Info_Type;
+ CPU : Integer;
+ State : Task_Procedure_Access;
+ Discriminants : System.Address;
+ Elaborated : Access_Boolean;
+ Chain : in out Activation_Chain;
+ Task_Image : String;
+ Created_Task : Task_Id)
is
begin
if Partition_Elaboration_Policy = 'S' then
@@ -668,14 +611,14 @@ package body System.Tasking.Restricted.Stages is
-- sequential, activation must be deferred.
Create_Restricted_Task_Sequential
- (Priority, Stack_Address, Size, Secondary_Stack_Size,
- Task_Info, CPU, State, Discriminants, Elaborated,
+ (Priority, Stack_Address, Stack_Size, Sec_Stack_Address,
+ Sec_Stack_Size, Task_Info, CPU, State, Discriminants, Elaborated,
Task_Image, Created_Task);
else
Create_Restricted_Task
- (Priority, Stack_Address, Size, Secondary_Stack_Size,
- Task_Info, CPU, State, Discriminants, Elaborated,
+ (Priority, Stack_Address, Stack_Size, Sec_Stack_Address,
+ Sec_Stack_Size, Task_Info, CPU, State, Discriminants, Elaborated,
Task_Image, Created_Task);
-- Append this task to the activation chain
@@ -690,22 +633,24 @@ package body System.Tasking.Restricted.Stages is
---------------------------------------
procedure Create_Restricted_Task_Sequential
- (Priority : Integer;
- Stack_Address : System.Address;
- Size : System.Parameters.Size_Type;
- Secondary_Stack_Size : System.Parameters.Size_Type;
- Task_Info : System.Task_Info.Task_Info_Type;
- CPU : Integer;
- State : Task_Procedure_Access;
- Discriminants : System.Address;
- Elaborated : Access_Boolean;
- Task_Image : String;
- Created_Task : Task_Id) is
+ (Priority : Integer;
+ Stack_Address : System.Address;
+ Stack_Size : System.Parameters.Size_Type;
+ Sec_Stack_Address : System.Secondary_Stack.SS_Stack_Ptr;
+ Sec_Stack_Size : System.Parameters.Size_Type;
+ Task_Info : System.Task_Info.Task_Info_Type;
+ CPU : Integer;
+ State : Task_Procedure_Access;
+ Discriminants : System.Address;
+ Elaborated : Access_Boolean;
+ Task_Image : String;
+ Created_Task : Task_Id)
+ is
begin
- Create_Restricted_Task (Priority, Stack_Address, Size,
- Secondary_Stack_Size, Task_Info,
- CPU, State, Discriminants, Elaborated,
- Task_Image, Created_Task);
+ Create_Restricted_Task
+ (Priority, Stack_Address, Stack_Size, Sec_Stack_Address,
+ Sec_Stack_Size, Task_Info, CPU, State, Discriminants, Elaborated,
+ Task_Image, Created_Task);
-- Append this task to the activation chain
diff --git a/gcc/ada/libgnarl/s-tarest.ads b/gcc/ada/libgnarl/s-tarest.ads
index ccc5683bd31..e51fa58ca61 100644
--- a/gcc/ada/libgnarl/s-tarest.ads
+++ b/gcc/ada/libgnarl/s-tarest.ads
@@ -43,8 +43,9 @@
-- The restricted GNARLI is also composed of System.Protected_Objects and
-- System.Protected_Objects.Single_Entry
-with System.Task_Info;
with System.Parameters;
+with System.Secondary_Stack;
+with System.Task_Info;
package System.Tasking.Restricted.Stages is
pragma Elaborate_Body;
@@ -128,33 +129,38 @@ package System.Tasking.Restricted.Stages is
-- by the binder generated code, before calling elaboration code.
procedure Create_Restricted_Task
- (Priority : Integer;
- Stack_Address : System.Address;
- Size : System.Parameters.Size_Type;
- Secondary_Stack_Size : System.Parameters.Size_Type;
- Task_Info : System.Task_Info.Task_Info_Type;
- CPU : Integer;
- State : Task_Procedure_Access;
- Discriminants : System.Address;
- Elaborated : Access_Boolean;
- Chain : in out Activation_Chain;
- Task_Image : String;
- Created_Task : Task_Id);
+ (Priority : Integer;
+ Stack_Address : System.Address;
+ Stack_Size : System.Parameters.Size_Type;
+ Sec_Stack_Address : System.Secondary_Stack.SS_Stack_Ptr;
+ Sec_Stack_Size : System.Parameters.Size_Type;
+ Task_Info : System.Task_Info.Task_Info_Type;
+ CPU : Integer;
+ State : Task_Procedure_Access;
+ Discriminants : System.Address;
+ Elaborated : Access_Boolean;
+ Chain : in out Activation_Chain;
+ Task_Image : String;
+ Created_Task : Task_Id);
-- Compiler interface only. Do not call from within the RTS.
-- This must be called to create a new task, when the partition
-- elaboration policy is not specified (or is concurrent).
--
-- Priority is the task's priority (assumed to be in the
- -- System.Any_Priority'Range)
+ -- System.Any_Priority'Range).
--
-- Stack_Address is the start address of the stack associated to the task,
-- in case it has been preallocated by the compiler; it is equal to
-- Null_Address when the stack needs to be allocated by the underlying
-- operating system.
--
- -- Size is the stack size of the task to create
+ -- Stack_Size is the stack size of the task to create.
+ --
+ -- Sec_Stack_Address is the pointer to the secondary stack created by the
+ -- compiler. If null, the secondary stack is allocated either by the binder
+ -- or by the run-time.
--
- -- Secondary_Stack_Size is the secondary stack size of the task to create
+ -- Sec_Stack_Size is the secondary stack size of the task to create.
--
-- Task_Info is the task info associated with the created task, or
-- Unspecified_Task_Info if none.
@@ -164,7 +170,7 @@ package System.Tasking.Restricted.Stages is
-- checks are performed when analyzing the pragma, and dynamic ones are
-- performed before setting the affinity at run time.
--
- -- State is the compiler generated task's procedure body
+ -- State is the compiler generated task's procedure body.
--
-- Discriminants is a pointer to a limited record whose discriminants are
-- those of the task to create. This parameter should be passed as the
@@ -182,20 +188,21 @@ package System.Tasking.Restricted.Stages is
--
-- Created_Task is the resulting task.
--
- -- This procedure can raise Storage_Error if the task creation fails
+ -- This procedure can raise Storage_Error if the task creation fails.
procedure Create_Restricted_Task_Sequential
- (Priority : Integer;
- Stack_Address : System.Address;
- Size : System.Parameters.Size_Type;
- Secondary_Stack_Size : System.Parameters.Size_Type;
- Task_Info : System.Task_Info.Task_Info_Type;
- CPU : Integer;
- State : Task_Procedure_Access;
- Discriminants : System.Address;
- Elaborated : Access_Boolean;
- Task_Image : String;
- Created_Task : Task_Id);
+ (Priority : Integer;
+ Stack_Address : System.Address;
+ Stack_Size : System.Parameters.Size_Type;
+ Sec_Stack_Address : System.Secondary_Stack.SS_Stack_Ptr;
+ Sec_Stack_Size : System.Parameters.Size_Type;
+ Task_Info : System.Task_Info.Task_Info_Type;
+ CPU : Integer;
+ State : Task_Procedure_Access;
+ Discriminants : System.Address;
+ Elaborated : Access_Boolean;
+ Task_Image : String;
+ Created_Task : Task_Id);
-- Compiler interface only. Do not call from within the RTS.
-- This must be called to create a new task, when the sequential partition
-- elaboration policy is used.
diff --git a/gcc/ada/libgnarl/s-taskin.adb b/gcc/ada/libgnarl/s-taskin.adb
index 462e229645c..d9fc6e3213b 100644
--- a/gcc/ada/libgnarl/s-taskin.adb
+++ b/gcc/ada/libgnarl/s-taskin.adb
@@ -96,7 +96,6 @@ package body System.Tasking is
Domain : Dispatching_Domain_Access;
Task_Info : System.Task_Info.Task_Info_Type;
Stack_Size : System.Parameters.Size_Type;
- Secondary_Stack_Size : System.Parameters.Size_Type;
T : Task_Id;
Success : out Boolean)
is
@@ -147,7 +146,6 @@ package body System.Tasking is
T.Common.Specific_Handler := null;
T.Common.Debug_Events := (others => False);
T.Common.Task_Image_Len := 0;
- T.Common.Secondary_Stack_Size := Secondary_Stack_Size;
if T.Common.Parent = null then
@@ -244,7 +242,6 @@ package body System.Tasking is
Domain => System_Domain,
Task_Info => Task_Info.Unspecified_Task_Info,
Stack_Size => 0,
- Secondary_Stack_Size => Parameters.Unspecified_Size,
T => T,
Success => Success);
pragma Assert (Success);
diff --git a/gcc/ada/libgnarl/s-taskin.ads b/gcc/ada/libgnarl/s-taskin.ads
index cd53cf93471..7c8b44b952c 100644
--- a/gcc/ada/libgnarl/s-taskin.ads
+++ b/gcc/ada/libgnarl/s-taskin.ads
@@ -37,12 +37,12 @@
with Ada.Exceptions;
with Ada.Unchecked_Conversion;
+with System.Multiprocessors;
with System.Parameters;
-with System.Task_Info;
with System.Soft_Links;
-with System.Task_Primitives;
with System.Stack_Usage;
-with System.Multiprocessors;
+with System.Task_Info;
+with System.Task_Primitives;
package System.Tasking is
pragma Preelaborate;
@@ -702,13 +702,6 @@ package System.Tasking is
-- need to do different things depending on the situation.
--
-- Protection: Self.L
-
- Secondary_Stack_Size : System.Parameters.Size_Type;
- -- Secondary_Stack_Size is the size of the secondary stack for the
- -- task. Defined here since it is the responsibility of the task to
- -- creates its own secondary stack.
- --
- -- Protected: Only accessed by Self
end record;
---------------------------------------
@@ -1173,7 +1166,6 @@ package System.Tasking is
Domain : Dispatching_Domain_Access;
Task_Info : System.Task_Info.Task_Info_Type;
Stack_Size : System.Parameters.Size_Type;
- Secondary_Stack_Size : System.Parameters.Size_Type;
T : Task_Id;
Success : out Boolean);
-- Initialize fields of the TCB for task T, and link into global TCB
diff --git a/gcc/ada/libgnarl/s-tassta.adb b/gcc/ada/libgnarl/s-tassta.adb
index 44c054fec3e..518a02c8b48 100644
--- a/gcc/ada/libgnarl/s-tassta.adb
+++ b/gcc/ada/libgnarl/s-tassta.adb
@@ -71,11 +71,11 @@ package body System.Tasking.Stages is
package STPO renames System.Task_Primitives.Operations;
package SSL renames System.Soft_Links;
package SSE renames System.Storage_Elements;
- package SST renames System.Secondary_Stack;
use Ada.Exceptions;
use Parameters;
+ use Secondary_Stack;
use Task_Primitives;
use Task_Primitives.Operations;
@@ -465,7 +465,7 @@ package body System.Tasking.Stages is
procedure Create_Task
(Priority : Integer;
- Size : System.Parameters.Size_Type;
+ Stack_Size : System.Parameters.Size_Type;
Secondary_Stack_Size : System.Parameters.Size_Type;
Task_Info : System.Task_Info.Task_Info_Type;
CPU : Integer;
@@ -604,8 +604,7 @@ package body System.Tasking.Stages is
end if;
Initialize_ATCB (Self_ID, State, Discriminants, P, Elaborated,
- Base_Priority, Base_CPU, Domain, Task_Info, Size,
- Secondary_Stack_Size, T, Success);
+ Base_Priority, Base_CPU, Domain, Task_Info, Stack_Size, T, Success);
if not Success then
Free (T);
@@ -692,10 +691,18 @@ package body System.Tasking.Stages is
Dispatching_Domain_Tasks (Base_CPU) + 1;
end if;
- -- Create TSD as early as possible in the creation of a task, since it
- -- may be used by the operation of Ada code within the task.
+ -- Create the secondary stack for the task as early as possible in the
+ -- creation of the task, since it may be used by the operation of Ada
+ -- code within the task.
+
+ begin
+ SSL.Create_TSD (T.Common.Compiler_Data, null, Secondary_Stack_Size);
+ exception
+ when others =>
+ Initialization.Undefer_Abort_Nestable (Self_ID);
+ raise Storage_Error with "Secondary stack could not be allocated";
+ end;
- SSL.Create_TSD (T.Common.Compiler_Data);
T.Common.Activation_Link := Chain.T_ID;
Chain.T_ID := T;
Created_Task := T;
@@ -914,8 +921,8 @@ package body System.Tasking.Stages is
SSL.Unlock_Task := SSL.Task_Unlock_NT'Access;
SSL.Get_Jmpbuf_Address := SSL.Get_Jmpbuf_Address_NT'Access;
SSL.Set_Jmpbuf_Address := SSL.Set_Jmpbuf_Address_NT'Access;
- SSL.Get_Sec_Stack_Addr := SSL.Get_Sec_Stack_Addr_NT'Access;
- SSL.Set_Sec_Stack_Addr := SSL.Set_Sec_Stack_Addr_NT'Access;
+ SSL.Get_Sec_Stack := SSL.Get_Sec_Stack_NT'Access;
+ SSL.Set_Sec_Stack := SSL.Set_Sec_Stack_NT'Access;
SSL.Check_Abort_Status := SSL.Check_Abort_Status_NT'Access;
SSL.Get_Stack_Info := SSL.Get_Stack_Info_NT'Access;
@@ -1014,7 +1021,6 @@ package body System.Tasking.Stages is
-- at-end handler that the compiler generates.
procedure Task_Wrapper (Self_ID : Task_Id) is
- use type SSE.Storage_Offset;
use System.Standard_Library;
use System.Stack_Usage;
@@ -1027,52 +1033,6 @@ package body System.Tasking.Stages is
Use_Alternate_Stack : constant Boolean := Alternate_Stack_Size /= 0;
-- Whether to use above alternate signal stack for stack overflows
- function Secondary_Stack_Size return Storage_Elements.Storage_Offset;
- -- Returns the size of the secondary stack for the task. For fixed
- -- secondary stacks, the function will return the ATCB field
- -- Secondary_Stack_Size if it is not set to Unspecified_Size,
- -- otherwise a percentage of the stack is reserved using the
- -- System.Parameters.Sec_Stack_Percentage property.
-
- -- Dynamic secondary stacks are allocated in System.Soft_Links.
- -- Create_TSD and thus the function returns 0 to suppress the
- -- creation of the fixed secondary stack in the primary stack.
-
- --------------------------
- -- Secondary_Stack_Size --
- --------------------------
-
- function Secondary_Stack_Size return Storage_Elements.Storage_Offset is
- use System.Storage_Elements;
-
- begin
- if Parameters.Sec_Stack_Dynamic then
- return 0;
-
- elsif Self_ID.Common.Secondary_Stack_Size = Unspecified_Size then
- return (Self_ID.Common.Compiler_Data.Pri_Stack_Info.Size
- * SSE.Storage_Offset (Sec_Stack_Percentage) / 100);
- else
- -- Use the size specified by aspect Secondary_Stack_Size padded
- -- by the amount of space used by the stack data structure.
-
- return Storage_Offset (Self_ID.Common.Secondary_Stack_Size) +
- Storage_Offset (SST.Minimum_Secondary_Stack_Size);
- end if;
- end Secondary_Stack_Size;
-
- Secondary_Stack : aliased Storage_Elements.Storage_Array
- (1 .. Secondary_Stack_Size);
- for Secondary_Stack'Alignment use Standard'Maximum_Alignment;
- -- Actual area allocated for secondary stack. Note that it is critical
- -- that this have maximum alignment, since any kind of data can be
- -- allocated here.
-
- Secondary_Stack_Address : System.Address := Secondary_Stack'Address;
- -- Address of secondary stack. In the fixed secondary stack case, this
- -- value is not modified, causing a warning, hence the bracketing with
- -- Warnings (Off/On). But why is so much *more* bracketed???
-
SEH_Table : aliased SSE.Storage_Array (1 .. 8);
-- Structured Exception Registration table (2 words)
@@ -1136,14 +1096,6 @@ package body System.Tasking.Stages is
Debug.Master_Hook
(Self_ID, Self_ID.Common.Parent, Self_ID.Master_of_Task);
- -- Assume a size of the stack taken at this stage
-
- if not Parameters.Sec_Stack_Dynamic then
- Self_ID.Common.Compiler_Data.Sec_Stack_Addr :=
- Secondary_Stack'Address;
- SST.SS_Init (Secondary_Stack_Address, Integer (Secondary_Stack'Last));
- end if;
-
if Use_Alternate_Stack then
Self_ID.Common.Task_Alternate_Stack := Task_Alternate_Stack'Address;
end if;
@@ -1197,15 +1149,6 @@ package body System.Tasking.Stages is
Stack_Base := Bottom_Of_Stack'Address;
- -- Also reduce the size of the stack to take into account the
- -- secondary stack array declared in this frame. This is for
- -- sure very conservative.
-
- if not Parameters.Sec_Stack_Dynamic then
- Pattern_Size :=
- Pattern_Size - Natural (Secondary_Stack_Size);
- end if;
-
-- Adjustments for inner frames
Pattern_Size := Pattern_Size -
@@ -1973,10 +1916,10 @@ package body System.Tasking.Stages is
then
Initialization.Task_Lock (Self_ID);
- -- If Sec_Stack_Addr is not null, it means that Destroy_TSD
+ -- If Sec_Stack_Ptr is not null, it means that Destroy_TSD
-- has not been called yet (case of an unactivated task).
- if T.Common.Compiler_Data.Sec_Stack_Addr /= Null_Address then
+ if T.Common.Compiler_Data.Sec_Stack_Ptr /= null then
SSL.Destroy_TSD (T.Common.Compiler_Data);
end if;
diff --git a/gcc/ada/libgnarl/s-tassta.ads b/gcc/ada/libgnarl/s-tassta.ads
index bc837fc9af8..a1129a1085a 100644
--- a/gcc/ada/libgnarl/s-tassta.ads
+++ b/gcc/ada/libgnarl/s-tassta.ads
@@ -70,7 +70,7 @@ package System.Tasking.Stages is
-- tE : aliased boolean := false;
-- tZ : size_type := unspecified_size;
-- type tV (discr : integer) is limited record
- -- _task_id : task_id;
+ -- _task_id : task_id;
-- end record;
-- procedure tB (_task : access tV);
-- freeze tV [
@@ -168,7 +168,7 @@ package System.Tasking.Stages is
procedure Create_Task
(Priority : Integer;
- Size : System.Parameters.Size_Type;
+ Stack_Size : System.Parameters.Size_Type;
Secondary_Stack_Size : System.Parameters.Size_Type;
Task_Info : System.Task_Info.Task_Info_Type;
CPU : Integer;
@@ -187,31 +187,44 @@ package System.Tasking.Stages is
--
-- Priority is the task's priority (assumed to be in range of type
-- System.Any_Priority)
- -- Size is the stack size of the task to create
- -- Secondary_Stack_Size is the secondary stack size of the task to create
+ --
+ -- Stack_Size is the stack size of the task to create
+ --
+ -- Secondary_Stack_Size is the size of the secondary stack to be used by
+ -- the task.
+ --
-- Task_Info is the task info associated with the created task, or
-- Unspecified_Task_Info if none.
+ --
-- CPU is the task affinity. Passed as an Integer because the undefined
-- value is not in the range of CPU_Range. Static range checks are
-- performed when analyzing the pragma, and dynamic ones are performed
-- before setting the affinity at run time.
+ --
-- Relative_Deadline is the relative deadline associated with the created
-- task by means of a pragma Relative_Deadline, or 0.0 if none.
+ --
-- Domain is the dispatching domain associated with the created task by
-- means of a Dispatching_Domain pragma or aspect, or null if none.
+ --
-- State is the compiler generated task's procedure body
+ --
-- Discriminants is a pointer to a limited record whose discriminants
-- are those of the task to create. This parameter should be passed as
-- the single argument to State.
+ --
-- Elaborated is a pointer to a Boolean that must be set to true on exit
-- if the task could be successfully elaborated.
+ --
-- Chain is a linked list of task that needs to be created. On exit,
-- Created_Task.Activation_Link will be Chain.T_ID, and Chain.T_ID
-- will be Created_Task (e.g the created task will be linked at the front
-- of Chain).
+ --
-- Task_Image is a string created by the compiler that the
-- run time can store to ease the debugging and the
-- Ada.Task_Identification facility.
+ --
-- Created_Task is the resulting task.
--
-- This procedure can raise Storage_Error if the task creation failed.
diff --git a/gcc/ada/libgnarl/s-tpopmo.adb b/gcc/ada/libgnarl/s-tpopmo.adb
new file mode 100644
index 00000000000..b6164aa19ed
--- /dev/null
+++ b/gcc/ada/libgnarl/s-tpopmo.adb
@@ -0,0 +1,283 @@
+------------------------------------------------------------------------------
+-- --
+-- GNAT RUN-TIME LIBRARY (GNARL) COMPONENTS --
+-- --
+-- SYSTEM.TASK_PRIMITIVES.OPERATIONS.MONOTONIC --
+-- --
+-- B o d y --
+-- --
+-- Copyright (C) 1992-2017, Free Software Foundation, Inc. --
+-- --
+-- GNARL is free software; you can redistribute it and/or modify it under --
+-- terms of the GNU General Public License as published by the Free Soft- --
+-- ware Foundation; either version 3, or (at your option) any later ver- --
+-- sion. GNAT is distributed in the hope that it will be useful, but WITH- --
+-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY --
+-- or FITNESS FOR A PARTICULAR PURPOSE. --
+-- --
+-- As a special exception under Section 7 of GPL version 3, you are granted --
+-- additional permissions described in the GCC Runtime Library Exception, --
+-- version 3.1, as published by the Free Software Foundation. --
+-- --
+-- You should have received a copy of the GNU General Public License and --
+-- a copy of the GCC Runtime Library Exception along with this program; --
+-- see the files COPYING3 and COPYING.RUNTIME respectively. If not, see --
+-- <http://www.gnu.org/licenses/>. --
+-- --
+-- GNARL was developed by the GNARL team at Florida State University. --
+-- Extensive contributions were provided by Ada Core Technologies, Inc. --
+-- --
+------------------------------------------------------------------------------
+
+-- This is the Monotonic version of this package for Posix and Linux targets.
+
+separate (System.Task_Primitives.Operations)
+package body Monotonic is
+
+ -----------------------
+ -- Local Subprograms --
+ -----------------------
+
+ procedure Compute_Deadline
+ (Time : Duration;
+ Mode : ST.Delay_Modes;
+ Check_Time : out Duration;
+ Abs_Time : out Duration;
+ Rel_Time : out Duration);
+ -- Helper for Timed_Sleep and Timed_Delay: given a deadline specified by
+ -- Time and Mode, compute the current clock reading (Check_Time), and the
+ -- target absolute and relative clock readings (Abs_Time, Rel_Time). The
+ -- epoch for Time depends on Mode; the epoch for Check_Time and Abs_Time
+ -- is always that of CLOCK_RT_Ada.
+
+ ---------------------
+ -- Monotonic_Clock --
+ ---------------------
+
+ function Monotonic_Clock return Duration is
+ TS : aliased timespec;
+ Result : Interfaces.C.int;
+ begin
+ Result := clock_gettime
+ (clock_id => OSC.CLOCK_RT_Ada, tp => TS'Unchecked_Access);
+ pragma Assert (Result = 0);
+
+ return To_Duration (TS);
+ end Monotonic_Clock;
+
+ -------------------
+ -- RT_Resolution --
+ -------------------
+
+ function RT_Resolution return Duration is
+ TS : aliased timespec;
+ Result : Interfaces.C.int;
+
+ begin
+ Result := clock_getres (OSC.CLOCK_REALTIME, TS'Unchecked_Access);
+ pragma Assert (Result = 0);
+
+ return To_Duration (TS);
+ end RT_Resolution;
+
+ ----------------------
+ -- Compute_Deadline --
+ ----------------------
+
+ procedure Compute_Deadline
+ (Time : Duration;
+ Mode : ST.Delay_Modes;
+ Check_Time : out Duration;
+ Abs_Time : out Duration;
+ Rel_Time : out Duration)
+ is
+ begin
+ Check_Time := Monotonic_Clock;
+
+ -- Relative deadline
+
+ if Mode = Relative then
+ Abs_Time := Duration'Min (Time, Max_Sensible_Delay) + Check_Time;
+
+ if Relative_Timed_Wait then
+ Rel_Time := Duration'Min (Max_Sensible_Delay, Time);
+ end if;
+
+ pragma Warnings (Off);
+ -- Comparison "OSC.CLOCK_RT_Ada = OSC.CLOCK_REALTIME" is compile
+ -- time known.
+
+ -- Absolute deadline specified using the tasking clock (CLOCK_RT_Ada)
+
+ elsif Mode = Absolute_RT
+ or else OSC.CLOCK_RT_Ada = OSC.CLOCK_REALTIME
+ then
+ pragma Warnings (On);
+ Abs_Time := Duration'Min (Check_Time + Max_Sensible_Delay, Time);
+
+ if Relative_Timed_Wait then
+ Rel_Time := Duration'Min (Max_Sensible_Delay, Time - Check_Time);
+ end if;
+
+ -- Absolute deadline specified using the calendar clock, in the
+ -- case where it is not the same as the tasking clock: compensate for
+ -- difference between clock epochs (Base_Time - Base_Cal_Time).
+
+ else
+ declare
+ Cal_Check_Time : constant Duration := OS_Primitives.Clock;
+ RT_Time : constant Duration :=
+ Time + Check_Time - Cal_Check_Time;
+
+ begin
+ Abs_Time :=
+ Duration'Min (Check_Time + Max_Sensible_Delay, RT_Time);
+
+ if Relative_Timed_Wait then
+ Rel_Time :=
+ Duration'Min (Max_Sensible_Delay, RT_Time - Check_Time);
+ end if;
+ end;
+ end if;
+ end Compute_Deadline;
+
+ -----------------
+ -- Timed_Sleep --
+ -----------------
+
+ -- This is for use within the run-time system, so abort is
+ -- assumed to be already deferred, and the caller should be
+ -- holding its own ATCB lock.
+
+ procedure Timed_Sleep
+ (Self_ID : ST.Task_Id;
+ Time : Duration;
+ Mode : ST.Delay_Modes;
+ Reason : System.Tasking.Task_States;
+ Timedout : out Boolean;
+ Yielded : out Boolean)
+ is
+ pragma Unreferenced (Reason);
+
+ Base_Time : Duration;
+ Check_Time : Duration;
+ Abs_Time : Duration;
+ Rel_Time : Duration;
+
+ Request : aliased timespec;
+ Result : Interfaces.C.int;
+
+ begin
+ Timedout := True;
+ Yielded := False;
+
+ Compute_Deadline
+ (Time => Time,
+ Mode => Mode,
+ Check_Time => Check_Time,
+ Abs_Time => Abs_Time,
+ Rel_Time => Rel_Time);
+ Base_Time := Check_Time;
+
+ if Abs_Time > Check_Time then
+ Request :=
+ To_Timespec (if Relative_Timed_Wait then Rel_Time else Abs_Time);
+
+ loop
+ exit when Self_ID.Pending_ATC_Level < Self_ID.ATC_Nesting_Level;
+
+ Result :=
+ pthread_cond_timedwait
+ (cond => Self_ID.Common.LL.CV'Access,
+ mutex => (if Single_Lock
+ then Single_RTS_Lock'Access
+ else Self_ID.Common.LL.L'Access),
+ abstime => Request'Access);
+
+ Check_Time := Monotonic_Clock;
+ exit when Abs_Time <= Check_Time or else Check_Time < Base_Time;
+
+ if Result in 0 | EINTR then
+
+ -- Somebody may have called Wakeup for us
+
+ Timedout := False;
+ exit;
+ end if;
+
+ pragma Assert (Result = ETIMEDOUT);
+ end loop;
+ end if;
+ end Timed_Sleep;
+
+ -----------------
+ -- Timed_Delay --
+ -----------------
+
+ -- This is for use in implementing delay statements, so we assume the
+ -- caller is abort-deferred but is holding no locks.
+
+ procedure Timed_Delay
+ (Self_ID : ST.Task_Id;
+ Time : Duration;
+ Mode : ST.Delay_Modes)
+ is
+ Base_Time : Duration;
+ Check_Time : Duration;
+ Abs_Time : Duration;
+ Rel_Time : Duration;
+ Request : aliased timespec;
+
+ Result : Interfaces.C.int;
+ pragma Warnings (Off, Result);
+
+ begin
+ if Single_Lock then
+ Lock_RTS;
+ end if;
+
+ Write_Lock (Self_ID);
+
+ Compute_Deadline
+ (Time => Time,
+ Mode => Mode,
+ Check_Time => Check_Time,
+ Abs_Time => Abs_Time,
+ Rel_Time => Rel_Time);
+ Base_Time := Check_Time;
+
+ if Abs_Time > Check_Time then
+ Request :=
+ To_Timespec (if Relative_Timed_Wait then Rel_Time else Abs_Time);
+ Self_ID.Common.State := Delay_Sleep;
+
+ loop
+ exit when Self_ID.Pending_ATC_Level < Self_ID.ATC_Nesting_Level;
+
+ Result :=
+ pthread_cond_timedwait
+ (cond => Self_ID.Common.LL.CV'Access,
+ mutex => (if Single_Lock
+ then Single_RTS_Lock'Access
+ else Self_ID.Common.LL.L'Access),
+ abstime => Request'Access);
+
+ Check_Time := Monotonic_Clock;
+ exit when Abs_Time <= Check_Time or else Check_Time < Base_Time;
+
+ pragma Assert (Result in 0 | ETIMEDOUT | EINTR);
+ end loop;
+
+ Self_ID.Common.State := Runnable;
+ end if;
+
+ Unlock (Self_ID);
+
+ if Single_Lock then
+ Unlock_RTS;
+ end if;
+
+ Result := sched_yield;
+ end Timed_Delay;
+
+end Monotonic;
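
To make the deadline arithmetic in Compute_Deadline easier to follow, here is a hedged, standalone sketch of its Relative branch only; the values of Max_Sensible_Delay and Check_Time and the program name are stand-ins chosen for the example, not values taken from the run-time:

with Ada.Text_IO;

procedure Show_Relative_Deadline is
   package Duration_IO is new Ada.Text_IO.Fixed_IO (Duration);

   Max_Sensible_Delay : constant Duration := 86_400.0;  --  stand-in bound
   Check_Time         : constant Duration := 1_000.0;   --  pretend clock reading
   Time               : constant Duration := 2.5;       --  requested relative delay

   --  Relative branch: clamp the request to the sensible bound, then add the
   --  current clock reading to obtain an absolute wake-up time on that clock.
   Abs_Time : constant Duration :=
     Duration'Min (Time, Max_Sensible_Delay) + Check_Time;
begin
   Duration_IO.Put (Abs_Time);  --  prints roughly 1002.5
   Ada.Text_IO.New_Line;
end Show_Relative_Deadline;
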
diff --git a/gcc/ada/libgnarl/s-tporft.adb b/gcc/ada/libgnarl/s-tporft.adb
index 7b8a59276f8..56eda26e6a1 100644
--- a/gcc/ada/libgnarl/s-tporft.adb
+++ b/gcc/ada/libgnarl/s-tporft.adb
@@ -29,16 +29,16 @@
-- --
------------------------------------------------------------------------------
-with System.Task_Info;
--- Use for Unspecified_Task_Info
-
-with System.Soft_Links;
--- used to initialize TSD for a C thread, in function Self
-
with System.Multiprocessors;
+with System.Soft_Links;
+with System.Task_Info;
separate (System.Task_Primitives.Operations)
-function Register_Foreign_Thread (Thread : Thread_Id) return Task_Id is
+function Register_Foreign_Thread
+ (Thread : Thread_Id;
+ Sec_Stack_Size : Size_Type := Unspecified_Size)
+ return Task_Id
+is
Local_ATCB : aliased Ada_Task_Control_Block (0);
Self_Id : Task_Id;
Succeeded : Boolean;
@@ -66,7 +66,7 @@ begin
(Self_Id, null, Null_Address, Null_Task,
Foreign_Task_Elaborated'Access,
System.Priority'First, System.Multiprocessors.Not_A_Specific_CPU, null,
- Task_Info.Unspecified_Task_Info, 0, 0, Self_Id, Succeeded);
+ Task_Info.Unspecified_Task_Info, 0, Self_Id, Succeeded);
Unlock_RTS;
pragma Assert (Succeeded);
@@ -92,7 +92,10 @@ begin
Self_Id.Common.Task_Alternate_Stack := Null_Address;
- System.Soft_Links.Create_TSD (Self_Id.Common.Compiler_Data);
+ -- Create the TSD for the task
+
+ System.Soft_Links.Create_TSD
+ (Self_Id.Common.Compiler_Data, null, Sec_Stack_Size);
Enter_Task (Self_Id);
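
Register_Foreign_Thread and Create_TSD now take a secondary stack size that defaults to Unspecified_Size, so the run-time can fall back to its default when no size is supplied. A small self-contained sketch of that sentinel-default idiom follows; every name in it (Show_Sentinel_Default, Effective_Size, the sizes) is hypothetical and chosen only for illustration:

with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Sentinel_Default is
   Unspecified_Size : constant Integer := Integer'First;  --  sentinel value
   Default_Size     : constant Integer := 10 * 1024;      --  fallback size

   --  When the caller does not supply a size, the sentinel default is
   --  detected and replaced by the fallback, in the spirit of what the
   --  run-time does for secondary stacks.
   function Effective_Size
     (Requested : Integer := Unspecified_Size) return Integer
   is (if Requested = Unspecified_Size then Default_Size else Requested);

begin
   Put_Line (Integer'Image (Effective_Size));          --  uses the fallback
   Put_Line (Integer'Image (Effective_Size (2_048)));  --  caller-specified
end Show_Sentinel_Default;
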
diff --git a/gcc/ada/libgnat/s-parame.adb b/gcc/ada/libgnat/s-parame.adb
index 0f4d45f2da8..359edacb95e 100644
--- a/gcc/ada/libgnat/s-parame.adb
+++ b/gcc/ada/libgnat/s-parame.adb
@@ -50,6 +50,34 @@ package body System.Parameters is
end if;
end Adjust_Storage_Size;
+ ----------------------------
+ -- Default_Sec_Stack_Size --
+ ----------------------------
+
+ function Default_Sec_Stack_Size return Size_Type is
+ Default_SS_Size : Integer;
+ pragma Import (C, Default_SS_Size,
+ "__gnat_default_ss_size");
+ begin
+ -- There are two situations where the default secondary stack size is
+ -- set to zero:
+ --
+ -- * The user sets it to zero erroneously thinking it will disable
+ -- the secondary stack.
+ --
+ -- * Or more likely, we are building with an old compiler and
+ -- Default_SS_Size is never set.
+ --
+ -- In both cases, set the default secondary stack size to the run-time
+ -- default.
+
+ if Default_SS_Size > 0 then
+ return Size_Type (Default_SS_Size);
+ else
+ return Runtime_Default_Sec_Stack_Size;
+ end if;
+ end Default_Sec_Stack_Size;
+
------------------------
-- Default_Stack_Size --
------------------------
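
The new Default_Sec_Stack_Size falls back to Runtime_Default_Sec_Stack_Size whenever the binder-exported value is zero. A minimal, hedged sketch of that selection rule is shown below; a plain constant stands in for the imported __gnat_default_ss_size so the example compiles on its own:

with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Default_SS_Size is
   Runtime_Default : constant Integer := 10 * 1024;
   --  Stand-in for Runtime_Default_Sec_Stack_Size

   Binder_Value : constant Integer := 0;
   --  Stand-in for __gnat_default_ss_size; zero models both an erroneous
   --  user setting and an old binder that never sets the value.

   Chosen : constant Integer :=
     (if Binder_Value > 0 then Binder_Value else Runtime_Default);
begin
   Put_Line ("Default secondary stack size:" & Integer'Image (Chosen));
end Show_Default_SS_Size;
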
diff --git a/gcc/ada/libgnat/s-parame.ads b/gcc/ada/libgnat/s-parame.ads
index f48c7e0973f..60a5e997021 100644
--- a/gcc/ada/libgnat/s-parame.ads
+++ b/gcc/ada/libgnat/s-parame.ads
@@ -64,20 +64,6 @@ package System.Parameters is
Unspecified_Size : constant Size_Type := Size_Type'First;
-- Value used to indicate that no size type is set
- subtype Percentage is Size_Type range -1 .. 100;
- Dynamic : constant Size_Type := -1;
- -- The secondary stack ratio is a constant between 0 and 100 which
- -- determines the percentage of the allocated task stack that is
- -- used by the secondary stack (the rest being the primary stack).
- -- The special value of minus one indicates that the secondary
- -- stack is to be allocated from the heap instead.
-
- Sec_Stack_Percentage : constant Percentage := Dynamic;
- -- This constant defines the handling of the secondary stack
-
- Sec_Stack_Dynamic : constant Boolean := Sec_Stack_Percentage = Dynamic;
- -- Convenient Boolean for testing for dynamic secondary stack
-
function Default_Stack_Size return Size_Type;
-- Default task stack size used if none is specified
@@ -94,15 +80,27 @@ package System.Parameters is
-- otherwise return given Size
Default_Env_Stack_Size : constant Size_Type := 8_192_000;
- -- Assumed size of the environment task, if no other information
- -- is available. This value is used when stack checking is
- -- enabled and no GNAT_STACK_LIMIT environment variable is set.
+ -- Assumed size of the environment task, if no other information is
+ -- available. This value is used when stack checking is enabled and
+ -- no GNAT_STACK_LIMIT environment variable is set.
Stack_Grows_Down : constant Boolean := True;
-- This constant indicates whether the stack grows up (False) or
-- down (True) in memory as functions are called. It is used for
-- proper implementation of the stack overflow check.
+ Runtime_Default_Sec_Stack_Size : constant Size_Type := 10 * 1024;
+ -- The run-time chosen default size for secondary stacks, which may be
+ -- overridden by the user with the binder -D switch.
+
+ function Default_Sec_Stack_Size return Size_Type;
+ -- The default initial size for secondary stacks that reflects any
+ -- user-specified default via the binder -D switch.
+
+ Sec_Stack_Dynamic : constant Boolean := True;
+ -- Indicates if secondary stacks can grow and shrink at run-time. If False,
+ -- the size of a secondary stack is fixed at the point of its creation.
+
----------------------------------------------
-- Characteristics of types in Interfaces.C --
----------------------------------------------
diff --git a/gcc/ada/libgnat/s-parame__ae653.ads b/gcc/ada/libgnat/s-parame__ae653.ads
index 8a787f007bc..42d438e72ea 100644
--- a/gcc/ada/libgnat/s-parame__ae653.ads
+++ b/gcc/ada/libgnat/s-parame__ae653.ads
@@ -62,20 +62,6 @@ package System.Parameters is
Unspecified_Size : constant Size_Type := Size_Type'First;
-- Value used to indicate that no size type is set
- subtype Percentage is Size_Type range -1 .. 100;
- Dynamic : constant Size_Type := -1;
- -- The secondary stack ratio is a constant between 0 and 100 which
- -- determines the percentage of the allocated task stack that is
- -- used by the secondary stack (the rest being the primary stack).
- -- The special value of minus one indicates that the secondary
- -- stack is to be allocated from the heap instead.
-
- Sec_Stack_Percentage : constant Percentage := 25;
- -- This constant defines the handling of the secondary stack
-
- Sec_Stack_Dynamic : constant Boolean := Sec_Stack_Percentage = Dynamic;
- -- Convenient Boolean for testing for dynamic secondary stack
-
function Default_Stack_Size return Size_Type;
-- Default task stack size used if none is specified
@@ -103,6 +89,18 @@ package System.Parameters is
-- down (True) in memory as functions are called. It is used for
-- proper implementation of the stack overflow check.
+ Runtime_Default_Sec_Stack_Size : constant Size_Type := 10 * 1024;
+ -- The run-time chosen default size for secondary stacks, which may be
+ -- overridden by the user with the binder -D switch.
+
+ function Default_Sec_Stack_Size return Size_Type;
+ -- The default size for secondary stacks that reflects any user-specified
+ -- default via the binder -D switch.
+
+ Sec_Stack_Dynamic : constant Boolean := False;
+ -- Indicates if secondary stacks can grow and shrink at run-time. If False,
+ -- the size of a secondary stack is fixed at the point of its creation.
+
----------------------------------------------
-- Characteristics of types in Interfaces.C --
----------------------------------------------
diff --git a/gcc/ada/libgnat/s-parame__hpux.ads b/gcc/ada/libgnat/s-parame__hpux.ads
index f20cfbebe7e..846b165561e 100644
--- a/gcc/ada/libgnat/s-parame__hpux.ads
+++ b/gcc/ada/libgnat/s-parame__hpux.ads
@@ -62,20 +62,6 @@ package System.Parameters is
Unspecified_Size : constant Size_Type := Size_Type'First;
-- Value used to indicate that no size type is set
- subtype Percentage is Size_Type range -1 .. 100;
- Dynamic : constant Size_Type := -1;
- -- The secondary stack ratio is a constant between 0 and 100 which
- -- determines the percentage of the allocated task stack that is
- -- used by the secondary stack (the rest being the primary stack).
- -- The special value of minus one indicates that the secondary
- -- stack is to be allocated from the heap instead.
-
- Sec_Stack_Percentage : constant Percentage := Dynamic;
- -- This constant defines the handling of the secondary stack
-
- Sec_Stack_Dynamic : constant Boolean := Sec_Stack_Percentage = Dynamic;
- -- Convenient Boolean for testing for dynamic secondary stack
-
function Default_Stack_Size return Size_Type;
-- Default task stack size used if none is specified
@@ -101,6 +87,18 @@ package System.Parameters is
-- down (True) in memory as functions are called. It is used for
-- proper implementation of the stack overflow check.
+ Runtime_Default_Sec_Stack_Size : constant Size_Type := 10 * 1024;
+ -- The default size for secondary stacks chosen by the run-time, which
+ -- may be overridden by the user with the binder -D switch.
+
+ function Default_Sec_Stack_Size return Size_Type;
+ -- The default initial size for secondary stacks, reflecting any
+ -- user-specified default set via the binder -D switch.
+
+ Sec_Stack_Dynamic : constant Boolean := True;
+ -- Indicates if secondary stacks can grow and shrink at run-time. If False,
+ -- the size of a secondary stack is fixed at the point of its creation.
+
----------------------------------------------
-- Characteristics of Types in Interfaces.C --
----------------------------------------------
diff --git a/gcc/ada/libgnat/s-parame__rtems.adb b/gcc/ada/libgnat/s-parame__rtems.adb
index aa131147eb6..5a19c4396da 100644
--- a/gcc/ada/libgnat/s-parame__rtems.adb
+++ b/gcc/ada/libgnat/s-parame__rtems.adb
@@ -6,7 +6,7 @@
-- --
-- B o d y --
-- --
--- Copyright (C) 1997-2009 Free Software Foundation, Inc. --
+-- Copyright (C) 1997-2017, Free Software Foundation, Inc. --
-- --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
@@ -39,6 +39,35 @@ package body System.Parameters is
pragma Import (C, ada_pthread_minimum_stack_size,
"_ada_pthread_minimum_stack_size");
+ -------------------------
+ -- Adjust_Storage_Size --
+ -------------------------
+
+ function Adjust_Storage_Size (Size : Size_Type) return Size_Type is
+ begin
+ if Size = Unspecified_Size then
+ return Default_Stack_Size;
+
+ elsif Size < Minimum_Stack_Size then
+ return Minimum_Stack_Size;
+
+ else
+ return Size;
+ end if;
+ end Adjust_Storage_Size;
+
+ ----------------------------
+ -- Default_Sec_Stack_Size --
+ ----------------------------
+
+ function Default_Sec_Stack_Size return Size_Type is
+ Default_SS_Size : Integer;
+ pragma Import (C, Default_SS_Size,
+ "__gnat_default_ss_size");
+ begin
+ return Size_Type (Default_SS_Size);
+ end Default_Sec_Stack_Size;
+
------------------------
-- Default_Stack_Size --
------------------------
@@ -58,21 +87,4 @@ package body System.Parameters is
return Size_Type (ada_pthread_minimum_stack_size);
end Minimum_Stack_Size;
- -------------------------
- -- Adjust_Storage_Size --
- -------------------------
-
- function Adjust_Storage_Size (Size : Size_Type) return Size_Type is
- begin
- if Size = Unspecified_Size then
- return Default_Stack_Size;
-
- elsif Size < Minimum_Stack_Size then
- return Minimum_Stack_Size;
-
- else
- return Size;
- end if;
- end Adjust_Storage_Size;
-
end System.Parameters;
diff --git a/gcc/ada/libgnat/s-parame__vxworks.adb b/gcc/ada/libgnat/s-parame__vxworks.adb
index 325aa2e4f08..97d74b6932e 100644
--- a/gcc/ada/libgnat/s-parame__vxworks.adb
+++ b/gcc/ada/libgnat/s-parame__vxworks.adb
@@ -48,6 +48,18 @@ package body System.Parameters is
end if;
end Adjust_Storage_Size;
+ ----------------------------
+ -- Default_Sec_Stack_Size --
+ ----------------------------
+
+ function Default_Sec_Stack_Size return Size_Type is
+ Default_SS_Size : Integer;
+ pragma Import (C, Default_SS_Size,
+ "__gnat_default_ss_size");
+ begin
+ return Size_Type (Default_SS_Size);
+ end Default_Sec_Stack_Size;
+
------------------------
-- Default_Stack_Size --
------------------------
diff --git a/gcc/ada/libgnat/s-parame__vxworks.ads b/gcc/ada/libgnat/s-parame__vxworks.ads
index 919361ad10d..e395e017b05 100644
--- a/gcc/ada/libgnat/s-parame__vxworks.ads
+++ b/gcc/ada/libgnat/s-parame__vxworks.ads
@@ -62,20 +62,6 @@ package System.Parameters is
Unspecified_Size : constant Size_Type := Size_Type'First;
-- Value used to indicate that no size type is set
- subtype Percentage is Size_Type range -1 .. 100;
- Dynamic : constant Size_Type := -1;
- -- The secondary stack ratio is a constant between 0 and 100 which
- -- determines the percentage of the allocated task stack that is
- -- used by the secondary stack (the rest being the primary stack).
- -- The special value of minus one indicates that the secondary
- -- stack is to be allocated from the heap instead.
-
- Sec_Stack_Percentage : constant Percentage := Dynamic;
- -- This constant defines the handling of the secondary stack
-
- Sec_Stack_Dynamic : constant Boolean := Sec_Stack_Percentage = Dynamic;
- -- Convenient Boolean for testing for dynamic secondary stack
-
function Default_Stack_Size return Size_Type;
-- Default task stack size used if none is specified
@@ -103,6 +89,18 @@ package System.Parameters is
-- down (True) in memory as functions are called. It is used for
-- proper implementation of the stack overflow check.
+ Runtime_Default_Sec_Stack_Size : constant Size_Type := 10 * 1024;
+ -- The default size for secondary stacks chosen by the run-time, which
+ -- may be overridden by the user with the binder -D switch.
+
+ function Default_Sec_Stack_Size return Size_Type;
+ -- The default initial size for secondary stacks, reflecting any
+ -- user-specified default set via the binder -D switch.
+
+ Sec_Stack_Dynamic : constant Boolean := True;
+ -- Indicates if secondary stacks can grow and shrink at run-time. If False,
+ -- the size of a secondary stack is fixed at the point of its creation.
+
----------------------------------------------
-- Characteristics of types in Interfaces.C --
----------------------------------------------
diff --git a/gcc/ada/libgnat/s-secsta.adb b/gcc/ada/libgnat/s-secsta.adb
index 0449ee4dbcd..b39cf0dc33d 100644
--- a/gcc/ada/libgnat/s-secsta.adb
+++ b/gcc/ada/libgnat/s-secsta.adb
@@ -31,203 +31,65 @@
pragma Compiler_Unit_Warning;
-with System.Soft_Links;
-with System.Parameters;
-
with Ada.Unchecked_Conversion;
with Ada.Unchecked_Deallocation;
+with System.Soft_Links;
package body System.Secondary_Stack is
package SSL renames System.Soft_Links;
- use type SSE.Storage_Offset;
use type System.Parameters.Size_Type;
- SS_Ratio_Dynamic : constant Boolean :=
- Parameters.Sec_Stack_Percentage = Parameters.Dynamic;
- -- There are two entirely different implementations of the secondary
- -- stack mechanism in this unit, and this Boolean is used to select
- -- between them (at compile time, so the generated code will contain
- -- only the code for the desired variant). If SS_Ratio_Dynamic is
- -- True, then the secondary stack is dynamically allocated from the
- -- heap in a linked list of chunks. If SS_Ration_Dynamic is False,
- -- then the secondary stack is allocated statically by grabbing a
- -- section of the primary stack and using it for this purpose.
-
- type Memory is array (SS_Ptr range <>) of SSE.Storage_Element;
- for Memory'Alignment use Standard'Maximum_Alignment;
- -- This is the type used for actual allocation of secondary stack
- -- areas. We require maximum alignment for all such allocations.
-
- ---------------------------------------------------------------
- -- Data Structures for Dynamically Allocated Secondary Stack --
- ---------------------------------------------------------------
-
- -- The following is a diagram of the data structures used for the
- -- case of a dynamically allocated secondary stack, where the stack
- -- is allocated as a linked list of chunks allocated from the heap.
-
- -- +------------------+
- -- | Next |
- -- +------------------+
- -- | | Last (200)
- -- | |
- -- | |
- -- | |
- -- | |
- -- | |
- -- | | First (101)
- -- +------------------+
- -- +----------> | | |
- -- | +--------- | ------+
- -- | ^ |
- -- | | |
- -- | | V
- -- | +------ | ---------+
- -- | | | |
- -- | +------------------+
- -- | | | Last (100)
- -- | | C |
- -- | | H |
- -- +-----------------+ | +------->| U |
- -- | Current_Chunk ----+ | | N |
- -- +-----------------+ | | K |
- -- | Top --------+ | | First (1)
- -- +-----------------+ +------------------+
- -- | Default_Size | | Prev |
- -- +-----------------+ +------------------+
- --
-
- type Chunk_Id (First, Last : SS_Ptr);
- type Chunk_Ptr is access all Chunk_Id;
-
- type Chunk_Id (First, Last : SS_Ptr) is record
- Prev, Next : Chunk_Ptr;
- Mem : Memory (First .. Last);
- end record;
-
- type Stack_Id is record
- Top : SS_Ptr;
- Default_Size : SSE.Storage_Count;
- Current_Chunk : Chunk_Ptr;
- end record;
-
- type Stack_Ptr is access Stack_Id;
- -- Pointer to record used to represent a dynamically allocated secondary
- -- stack descriptor for a secondary stack chunk.
-
procedure Free is new Ada.Unchecked_Deallocation (Chunk_Id, Chunk_Ptr);
-- Free a dynamically allocated chunk
- function To_Stack_Ptr is new
- Ada.Unchecked_Conversion (Address, Stack_Ptr);
- function To_Addr is new
- Ada.Unchecked_Conversion (Stack_Ptr, Address);
- -- Convert to and from address stored in task data structures
-
- --------------------------------------------------------------
- -- Data Structures for Statically Allocated Secondary Stack --
- --------------------------------------------------------------
-
- -- For the static case, the secondary stack is a single contiguous
- -- chunk of storage, carved out of the primary stack, and represented
- -- by the following data structure
-
- type Fixed_Stack_Id is record
- Top : SS_Ptr;
- -- Index of next available location in Mem. This is initialized to
- -- 0, and then incremented on Allocate, and Decremented on Release.
-
- Last : SS_Ptr;
- -- Length of usable Mem array, which is thus the index past the
- -- last available location in Mem. Mem (Last-1) can be used. This
- -- is used to check that the stack does not overflow.
-
- Max : SS_Ptr;
- -- Maximum value of Top. Initialized to 0, and then may be incremented
- -- on Allocate, but is never Decremented. The last used location will
- -- be Mem (Max - 1), so Max is the maximum count of used stack space.
-
- Mem : Memory (0 .. 0);
- -- This is the area that is actually used for the secondary stack.
- -- Note that the upper bound is a dummy value properly defined by
- -- the value of Last. We never actually allocate objects of type
- -- Fixed_Stack_Id, so the bounds declared here do not matter.
- end record;
-
- Dummy_Fixed_Stack : Fixed_Stack_Id;
- pragma Warnings (Off, Dummy_Fixed_Stack);
- -- Well it is not quite true that we never allocate an object of the
- -- type. This dummy object is allocated for the purpose of getting the
- -- offset of the Mem field via the 'Position attribute (such a nuisance
- -- that we cannot apply this to a field of a type).
-
- type Fixed_Stack_Ptr is access Fixed_Stack_Id;
- -- Pointer to record used to describe statically allocated sec stack
-
- function To_Fixed_Stack_Ptr is new
- Ada.Unchecked_Conversion (Address, Fixed_Stack_Ptr);
- -- Convert from address stored in task data structures
-
- ----------------------------------
- -- Minimum_Secondary_Stack_Size --
- ----------------------------------
-
- function Minimum_Secondary_Stack_Size return Natural is
- begin
- return Dummy_Fixed_Stack.Mem'Position;
- end Minimum_Secondary_Stack_Size;
-
- --------------
- -- Allocate --
- --------------
+ -----------------
+ -- SS_Allocate --
+ -----------------
procedure SS_Allocate
(Addr : out Address;
Storage_Size : SSE.Storage_Count)
is
- Max_Align : constant SS_Ptr := SS_Ptr (Standard'Maximum_Alignment);
- Max_Size : constant SS_Ptr :=
- ((SS_Ptr (Storage_Size) + Max_Align - 1) / Max_Align) *
- Max_Align;
-
+ Max_Align : constant SS_Ptr := SS_Ptr (Standard'Maximum_Alignment);
+ Mem_Request : constant SS_Ptr :=
+ ((SS_Ptr (Storage_Size) + Max_Align - 1) / Max_Align) *
+ Max_Align;
+ -- Round up Storage_Size to the nearest multiple of the max alignment
+ -- value for the target. This ensures efficient stack access.
+
+ Stack : constant SS_Stack_Ptr := SSL.Get_Sec_Stack.all;
begin
- -- Case of fixed allocation secondary stack
-
- if not SS_Ratio_Dynamic then
- declare
- Fixed_Stack : constant Fixed_Stack_Ptr :=
- To_Fixed_Stack_Ptr (SSL.Get_Sec_Stack_Addr.all);
+ -- Case of fixed secondary stack
- begin
- -- Check if max stack usage is increasing
+ if not SP.Sec_Stack_Dynamic then
+ -- Check if max stack usage is increasing
- if Fixed_Stack.Top + Max_Size > Fixed_Stack.Max then
+ if Stack.Top + Mem_Request > Stack.Max then
- -- If so, check if max size is exceeded
+ -- If so, check if the stack is exceeded, noting Stack.Top points
+ -- to the first free byte (so the value of Stack.Top on a fully
+ -- allocated stack will be Stack.Size + 1).
- if Fixed_Stack.Top + Max_Size > Fixed_Stack.Last then
- raise Storage_Error;
- end if;
+ if Stack.Top + Mem_Request > Stack.Size + 1 then
+ raise Storage_Error;
+ end if;
- -- Record new max usage
+ -- Record new max usage
- Fixed_Stack.Max := Fixed_Stack.Top + Max_Size;
- end if;
+ Stack.Max := Stack.Top + Mem_Request;
+ end if;
- -- Set resulting address and update top of stack pointer
+ -- Set resulting address and update top of stack pointer
- Addr := Fixed_Stack.Mem (Fixed_Stack.Top)'Address;
- Fixed_Stack.Top := Fixed_Stack.Top + Max_Size;
- end;
+ Addr := Stack.Internal_Chunk.Mem (Stack.Top)'Address;
+ Stack.Top := Stack.Top + Mem_Request;
- -- Case of dynamically allocated secondary stack
+ -- Case of dynamic secondary stack
else
declare
- Stack : constant Stack_Ptr :=
- To_Stack_Ptr (SSL.Get_Sec_Stack_Addr.all);
Chunk : Chunk_Ptr;
To_Be_Released_Chunk : Chunk_Ptr;
@@ -235,7 +97,7 @@ package body System.Secondary_Stack is
begin
Chunk := Stack.Current_Chunk;
- -- The Current_Chunk may not be the good one if a lot of release
+ -- The Current_Chunk may not be the best one if a lot of release
-- operations have taken place. Go down the stack if necessary.
while Chunk.First > Stack.Top loop
@@ -246,7 +108,7 @@ package body System.Secondary_Stack is
-- sufficient, if not, go to the next one and eventually create
-- the necessary room.
- while Chunk.Last - Stack.Top + 1 < Max_Size loop
+ while Chunk.Last - Stack.Top + 1 < Mem_Request loop
if Chunk.Next /= null then
-- Release unused non-first empty chunk
@@ -262,11 +124,11 @@ package body System.Secondary_Stack is
-- Create new chunk of default size unless it is not sufficient
-- to satisfy the current request.
- elsif SSE.Storage_Count (Max_Size) <= Stack.Default_Size then
+ elsif Mem_Request <= Stack.Size then
Chunk.Next :=
new Chunk_Id
(First => Chunk.Last + 1,
- Last => Chunk.Last + SS_Ptr (Stack.Default_Size));
+ Last => Chunk.Last + SS_Ptr (Stack.Size));
Chunk.Next.Prev := Chunk;
@@ -276,7 +138,7 @@ package body System.Secondary_Stack is
Chunk.Next :=
new Chunk_Id
(First => Chunk.Last + 1,
- Last => Chunk.Last + Max_Size);
+ Last => Chunk.Last + Mem_Request);
Chunk.Next.Prev := Chunk;
end if;
@@ -288,8 +150,15 @@ package body System.Secondary_Stack is
-- Resulting address is the address pointed by Stack.Top
Addr := Chunk.Mem (Stack.Top)'Address;
- Stack.Top := Stack.Top + Max_Size;
+ Stack.Top := Stack.Top + Mem_Request;
Stack.Current_Chunk := Chunk;
+
+ -- Record new max usage
+
+ if Stack.Top > Stack.Max then
+ Stack.Max := Stack.Top;
+ end if;
+
end;
end if;
end SS_Allocate;
@@ -298,40 +167,39 @@ package body System.Secondary_Stack is
-- SS_Free --
-------------
- procedure SS_Free (Stk : in out Address) is
+ procedure SS_Free (Stack : in out SS_Stack_Ptr) is
+ procedure Free is
+ new Ada.Unchecked_Deallocation (SS_Stack, SS_Stack_Ptr);
begin
- -- Case of statically allocated secondary stack, nothing to free
-
- if not SS_Ratio_Dynamic then
- return;
+ -- If using dynamic secondary stack, free any external chunks
- -- Case of dynamically allocated secondary stack
-
- else
+ if SP.Sec_Stack_Dynamic then
declare
- Stack : Stack_Ptr := To_Stack_Ptr (Stk);
Chunk : Chunk_Ptr;
procedure Free is
- new Ada.Unchecked_Deallocation (Stack_Id, Stack_Ptr);
+ new Ada.Unchecked_Deallocation (Chunk_Id, Chunk_Ptr);
begin
Chunk := Stack.Current_Chunk;
- while Chunk.Prev /= null loop
- Chunk := Chunk.Prev;
- end loop;
+ -- Go to top of linked list and free backwards. Do not free the
+ -- internal chunk as it is part of SS_Stack.
while Chunk.Next /= null loop
Chunk := Chunk.Next;
- Free (Chunk.Prev);
end loop;
- Free (Chunk);
- Free (Stack);
- Stk := Null_Address;
+ while Chunk.Prev /= null loop
+ Chunk := Chunk.Prev;
+ Free (Chunk.Next);
+ end loop;
end;
end if;
+
+ if Stack.Freeable then
+ Free (Stack);
+ end if;
end SS_Free;
----------------
@@ -339,17 +207,13 @@ package body System.Secondary_Stack is
----------------
function SS_Get_Max return Long_Long_Integer is
+ Stack : constant SS_Stack_Ptr := SSL.Get_Sec_Stack.all;
begin
- if SS_Ratio_Dynamic then
- return -1;
- else
- declare
- Fixed_Stack : constant Fixed_Stack_Ptr :=
- To_Fixed_Stack_Ptr (SSL.Get_Sec_Stack_Addr.all);
- begin
- return Long_Long_Integer (Fixed_Stack.Max);
- end;
- end if;
+ -- Stack.Max points to the first untouched byte in the stack, so the
+ -- maximum number of bytes that have been allocated on the stack is one
+ -- less than the value of Stack.Max.
+
+ return Long_Long_Integer (Stack.Max - 1);
end SS_Get_Max;
-------------
@@ -357,32 +221,25 @@ package body System.Secondary_Stack is
-------------
procedure SS_Info is
+ Stack : constant SS_Stack_Ptr := SSL.Get_Sec_Stack.all;
begin
Put_Line ("Secondary Stack information:");
-- Case of fixed secondary stack
- if not SS_Ratio_Dynamic then
- declare
- Fixed_Stack : constant Fixed_Stack_Ptr :=
- To_Fixed_Stack_Ptr (SSL.Get_Sec_Stack_Addr.all);
-
- begin
- Put_Line (" Total size : "
- & SS_Ptr'Image (Fixed_Stack.Last)
- & " bytes");
+ if not SP.Sec_Stack_Dynamic then
+ Put_Line (" Total size : "
+ & SS_Ptr'Image (Stack.Size)
+ & " bytes");
- Put_Line (" Current allocated space : "
- & SS_Ptr'Image (Fixed_Stack.Top)
- & " bytes");
- end;
+ Put_Line (" Current allocated space : "
+ & SS_Ptr'Image (Stack.Top - 1)
+ & " bytes");
- -- Case of dynamically allocated secondary stack
+ -- Case of dynamic secondary stack
else
declare
- Stack : constant Stack_Ptr :=
- To_Stack_Ptr (SSL.Get_Sec_Stack_Addr.all);
Nb_Chunks : Integer := 1;
Chunk : Chunk_Ptr := Stack.Current_Chunk;
@@ -414,7 +271,7 @@ package body System.Secondary_Stack is
& Integer'Image (Nb_Chunks));
Put_Line (" Default size of Chunks : "
- & SSE.Storage_Count'Image (Stack.Default_Size));
+ & SP.Size_Type'Image (Stack.Size));
end;
end if;
end SS_Info;
@@ -424,42 +281,86 @@ package body System.Secondary_Stack is
-------------
procedure SS_Init
- (Stk : in out Address;
- Size : Natural := Default_Secondary_Stack_Size)
+ (Stack : in out SS_Stack_Ptr;
+ Size : SP.Size_Type := SP.Unspecified_Size)
is
- begin
- -- Case of fixed size secondary stack
-
- if not SS_Ratio_Dynamic then
- declare
- Fixed_Stack : constant Fixed_Stack_Ptr :=
- To_Fixed_Stack_Ptr (Stk);
-
- begin
- Fixed_Stack.Top := 0;
- Fixed_Stack.Max := 0;
-
- if Size <= Dummy_Fixed_Stack.Mem'Position then
- Fixed_Stack.Last := 0;
- else
- Fixed_Stack.Last :=
- SS_Ptr (Size) - Dummy_Fixed_Stack.Mem'Position;
- end if;
- end;
-
- -- Case of dynamically allocated secondary stack
+ use Parameters;
- else
- declare
- Stack : Stack_Ptr;
- begin
- Stack := new Stack_Id;
- Stack.Current_Chunk := new Chunk_Id (1, SS_Ptr (Size));
- Stack.Top := 1;
- Stack.Default_Size := SSE.Storage_Count (Size);
- Stk := To_Addr (Stack);
- end;
+ Stack_Size : Size_Type;
+ begin
+ -- If Stack is not null then the stack has been allocated outside the
+ -- package (by the compiler or the user) and all that is left to do is
+ -- initialize the stack. Otherwise, SS_Init will allocate a secondary
+ -- stack from either the heap or the default-sized secondary stack pool
+ -- generated by the binder. In the latter case, this pool is generated
+ -- only when either the No_Implicit_Heap_Allocations or
+ -- No_Implicit_Task_Allocations restriction is active, and SS_Init will
+ -- allocate all requests for a secondary stack of Unspecified_Size from
+ -- this pool.
+
+ if Stack = null then
+ if Size = Unspecified_Size then
+ Stack_Size := Default_Sec_Stack_Size;
+ else
+ Stack_Size := Size;
+ end if;
+
+ if Size = Unspecified_Size
+ and then Binder_SS_Count > 0
+ and then Num_Of_Assigned_Stacks < Binder_SS_Count
+ then
+ -- The default-sized secondary stack pool is passed from the
+ -- binder to this package as an Address since it is not possible
+ -- to have a pointer to an array of unconstrained objects. A
+ -- pointer to the pool is obtainable via an unchecked conversion
+ -- to a constrained array of SS_Stacks that mirrors the one used
+ -- by the binder.
+
+ -- However, Ada understandably does not allow a local pointer to
+ -- a stack in the pool to be stored in a pointer outside of this
+ -- scope. While the conversion is safe in this case, since a view
+ -- of a global object is being used, using Unchecked_Access
+ -- would prevent users from specifying the restriction
+ -- No_Unchecked_Access whenever the secondary stack is used. As
+ -- a workaround, the local stack pointer is converted to a global
+ -- pointer via System.Address.
+
+ declare
+ type Stk_Pool_Array is array (1 .. Binder_SS_Count) of
+ aliased SS_Stack (Default_SS_Size);
+ type Stk_Pool_Access is access Stk_Pool_Array;
+
+ function To_Stack_Pool is new
+ Ada.Unchecked_Conversion (Address, Stk_Pool_Access);
+
+ pragma Warnings (Off);
+ function To_Global_Ptr is new
+ Ada.Unchecked_Conversion (Address, SS_Stack_Ptr);
+ pragma Warnings (On);
+ -- Suppress aliasing warning since the pointer we return will
+ -- be the only access to the stack.
+
+ Local_Stk_Address : System.Address;
+
+ begin
+ Num_Of_Assigned_Stacks := Num_Of_Assigned_Stacks + 1;
+
+ Local_Stk_Address :=
+ To_Stack_Pool
+ (Default_Sized_SS_Pool) (Num_Of_Assigned_Stacks)'Address;
+ Stack := To_Global_Ptr (Local_Stk_Address);
+ end;
+
+ Stack.Freeable := False;
+ else
+ Stack := new SS_Stack (Stack_Size);
+ Stack.Freeable := True;
+ end if;
end if;
+
+ Stack.Top := 1;
+ Stack.Max := 1;
+ Stack.Current_Chunk := Stack.Internal_Chunk'Access;
end SS_Init;
-------------
@@ -467,13 +368,9 @@ package body System.Secondary_Stack is
-------------
function SS_Mark return Mark_Id is
- Sstk : constant System.Address := SSL.Get_Sec_Stack_Addr.all;
+ Stack : constant SS_Stack_Ptr := SSL.Get_Sec_Stack.all;
begin
- if SS_Ratio_Dynamic then
- return (Sstk => Sstk, Sptr => To_Stack_Ptr (Sstk).Top);
- else
- return (Sstk => Sstk, Sptr => To_Fixed_Stack_Ptr (Sstk).Top);
- end if;
+ return (Sec_Stack => Stack, Sptr => Stack.Top);
end SS_Mark;
----------------
@@ -482,66 +379,7 @@ package body System.Secondary_Stack is
procedure SS_Release (M : Mark_Id) is
begin
- if SS_Ratio_Dynamic then
- To_Stack_Ptr (M.Sstk).Top := M.Sptr;
- else
- To_Fixed_Stack_Ptr (M.Sstk).Top := M.Sptr;
- end if;
+ M.Sec_Stack.Top := M.Sptr;
end SS_Release;
- -------------------------
- -- Package Elaboration --
- -------------------------
-
- -- Allocate a secondary stack for the main program to use
-
- -- We make sure that the stack has maximum alignment. Some systems require
- -- this (e.g. Sparc), and in any case it is a good idea for efficiency.
-
- Stack : aliased Stack_Id;
- for Stack'Alignment use Standard'Maximum_Alignment;
-
- Static_Secondary_Stack_Size : constant := 10 * 1024;
- -- Static_Secondary_Stack_Size must be static so that Chunk is allocated
- -- statically, and not via dynamic memory allocation.
-
- Chunk : aliased Chunk_Id (1, Static_Secondary_Stack_Size);
- for Chunk'Alignment use Standard'Maximum_Alignment;
- -- Default chunk used, unless gnatbind -D is specified with a value greater
- -- than Static_Secondary_Stack_Size.
-
-begin
- declare
- Chunk_Address : Address;
- Chunk_Access : Chunk_Ptr;
-
- begin
- if Default_Secondary_Stack_Size <= Static_Secondary_Stack_Size then
-
- -- Normally we allocate the secondary stack for the main program
- -- statically, using the default secondary stack size.
-
- Chunk_Access := Chunk'Access;
-
- else
- -- Default_Secondary_Stack_Size was increased via gnatbind -D, so we
- -- need to allocate a chunk dynamically.
-
- Chunk_Access :=
- new Chunk_Id (1, SS_Ptr (Default_Secondary_Stack_Size));
- end if;
-
- if SS_Ratio_Dynamic then
- Stack.Top := 1;
- Stack.Current_Chunk := Chunk_Access;
- Stack.Default_Size :=
- SSE.Storage_Offset (Default_Secondary_Stack_Size);
- System.Soft_Links.Set_Sec_Stack_Addr_NT (Stack'Address);
-
- else
- Chunk_Address := Chunk_Access.all'Address;
- SS_Init (Chunk_Address, Default_Secondary_Stack_Size);
- System.Soft_Links.Set_Sec_Stack_Addr_NT (Chunk_Address);
- end if;
- end;
end System.Secondary_Stack;
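A note on the Mem_Request computation in SS_Allocate above: the requested size is rounded up to the target's maximum alignment before the stack pointer is bumped. The standalone sketch below reproduces that arithmetic; the SS_Ptr range and the 13-byte request are invented for illustration.

with Ada.Text_IO; use Ada.Text_IO;

procedure Round_To_Max_Align is
   --  Mirrors the rounding applied to Storage_Size in SS_Allocate
   type SS_Ptr is range 0 .. 2 ** 31 - 1;

   Max_Align    : constant SS_Ptr := SS_Ptr (Standard'Maximum_Alignment);
   Storage_Size : constant SS_Ptr := 13;
   Mem_Request  : constant SS_Ptr :=
     ((Storage_Size + Max_Align - 1) / Max_Align) * Max_Align;
begin
   --  On a target whose maximum alignment is 16, this prints 16
   Put_Line ("Mem_Request =" & SS_Ptr'Image (Mem_Request));
end Round_To_Max_Align;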
diff --git a/gcc/ada/libgnat/s-secsta.ads b/gcc/ada/libgnat/s-secsta.ads
index 534708d1a6f..ae5ec888453 100644
--- a/gcc/ada/libgnat/s-secsta.ads
+++ b/gcc/ada/libgnat/s-secsta.ads
@@ -31,41 +31,27 @@
pragma Compiler_Unit_Warning;
+with System.Parameters;
with System.Storage_Elements;
package System.Secondary_Stack is
+ pragma Preelaborate;
+ package SP renames System.Parameters;
package SSE renames System.Storage_Elements;
- Default_Secondary_Stack_Size : Natural := 10 * 1024;
- -- Default size of a secondary stack. May be modified by binder -D switch
- -- which causes the binder to generate an appropriate assignment in the
- -- binder generated file.
+ type SS_Stack (Size : SP.Size_Type) is private;
+ -- Data structure for secondary stacks
- function Minimum_Secondary_Stack_Size return Natural;
- -- The minimum size of the secondary stack so that the internal
- -- requirements of the stack are met.
+ type SS_Stack_Ptr is access all SS_Stack;
+ -- Pointer to secondary stack objects
procedure SS_Init
- (Stk : in out Address;
- Size : Natural := Default_Secondary_Stack_Size);
- -- Initialize the secondary stack with a main stack of the given Size.
- --
- -- If System.Parameters.Sec_Stack_Percentage equals Dynamic, Stk is really
- -- an OUT parameter that will be allocated on the heap. Then all further
- -- allocations which do not overflow the main stack will not generate
- -- dynamic (de)allocation calls. If the main Stack overflows, a new
- -- chuck of at least the same size will be allocated and linked to the
- -- previous chunk.
- --
- -- Otherwise (Sec_Stack_Percentage between 0 and 100), Stk is an IN
- -- parameter that is already pointing to a Stack_Id. The secondary stack
- -- in this case is fixed, and any attempt to allocate more than the initial
- -- size will result in a Storage_Error being raised.
- --
- -- Note: the reason that Stk is passed is that SS_Init is called before
- -- the proper interface is established to obtain the address of the
- -- stack using System.Soft_Links.Get_Sec_Stack_Addr.
+ (Stack : in out SS_Stack_Ptr;
+ Size : SP.Size_Type := SP.Unspecified_Size);
+ -- Initialize the secondary stack Stack. If Stack is null, allocate a
+ -- stack from the heap, or from the default-sized secondary stack pool if
+ -- the pool exists and the requested size is Unspecified_Size.
procedure SS_Allocate
(Addr : out Address;
@@ -73,10 +59,9 @@ package System.Secondary_Stack is
-- Allocate enough space for a 'Storage_Size' bytes object with Maximum
-- alignment. The address of the allocated space is returned in Addr.
- procedure SS_Free (Stk : in out Address);
- -- Release the memory allocated for the Secondary Stack. That is
- -- to say, all the allocated chunks. Upon return, Stk will be set
- -- to System.Null_Address.
+ procedure SS_Free (Stack : in out SS_Stack_Ptr);
+ -- Release the memory allocated for the Stack. If the stack was statically
+ -- allocated, the SS_Stack record is not freed.
type Mark_Id is private;
-- Type used to mark the stack for mark/release processing
@@ -85,17 +70,11 @@ package System.Secondary_Stack is
-- Return the Mark corresponding to the current state of the stack
procedure SS_Release (M : Mark_Id);
- -- Restore the state of the stack corresponding to the mark M. If an
- -- additional chunk have been allocated, it will never be freed during a
- -- ??? missing comment here
+ -- Restore the state of the stack corresponding to the mark M
function SS_Get_Max return Long_Long_Integer;
- -- Return maximum used space in storage units for the current secondary
- -- stack. For a dynamically allocated secondary stack, the returned
- -- result is always -1. For a statically allocated secondary stack,
- -- the returned value shows the largest amount of space allocated so
- -- far during execution of the program to the current secondary stack,
- -- i.e. the secondary stack for the current task.
+ -- Return the high water mark of the current task's secondary stack, in
+ -- bytes.
generic
with procedure Put_Line (S : String);
@@ -109,15 +88,142 @@ private
-- Unused entity that is just present to ease the sharing of the pool
-- mechanism for specific allocation/deallocation in the compiler
- type SS_Ptr is new SSE.Integer_Address;
- -- Stack pointer value for secondary stack
+ -------------------------------------
+ -- Secondary Stack Data Structures --
+ -------------------------------------
+
+ -- This package provides fixed and dynamically sized secondary stack
+ -- implementations centered around a common data structure SS_Stack. This
+ -- record contains an initial secondary stack allocation of the requested
+ -- size, and markers for the current top of the stack and the high-water
+ -- mark of the stack. An SS_Stack can either be pre-allocated outside the
+ -- package, or SS_Init can allocate a stack from the heap or from the
+ -- default-sized secondary stack pool generated by the binder.
+
+ -- For dynamically allocated secondary stacks, the stack can grow via a
+ -- linked list of stack chunks allocated from the heap. New chunks are
+ -- allocated once the initial static allocation and any existing chunks are
+ -- exhausted. The following diagram illustrates the data structures used
+ -- for a dynamically allocated secondary stack:
+ --
+ -- +------------------+
+ -- | Next |
+ -- +------------------+
+ -- | | Last (300)
+ -- | |
+ -- | |
+ -- | |
+ -- | |
+ -- | |
+ -- | | First (201)
+ -- +------------------+
+ -- +-----------------+ +------> | | |
+ -- | | (100) | +--------- | ------+
+ -- | | | ^ |
+ -- | | | | |
+ -- | | | | V
+ -- | | | +------ | ---------+
+ -- | | | | | |
+ -- | | | +------------------+
+ -- | | | | | Last (200)
+ -- | | | | C |
+ -- | | (1) | | H |
+ -- +-----------------+ | +---->| U |
+ -- | Current_Chunk ---------+ | | N |
+ -- +-----------------+ | | K |
+ -- | Top ------------+ | | First (101)
+ -- +-----------------+ +------------------+
+ -- | Size | | Prev |
+ -- +-----------------+ +------------------+
+ --
+ -- The implementation used by the runtime is controlled via the constant
+ -- System.Parameters.Sec_Stack_Dynamic. If True, the implementation is
+ -- permitted to grow the secondary stack at runtime. The implementation is
+ -- designed so that the compiler includes only the code supporting the
+ -- desired secondary stack behavior.
+
+ subtype SS_Ptr is SP.Size_Type;
+ -- Stack pointer value for the current position within the secondary stack.
+ -- Size_Type is used as the base type since the Size discriminant of
+ -- SS_Stack forms the bounds of the internal memory array.
+
+ type Memory is array (SS_Ptr range <>) of SSE.Storage_Element;
+ for Memory'Alignment use Standard'Maximum_Alignment;
+ -- The region of memory that holds the stack itself. Requires maximum
+ -- alignment for efficient stack operations.
+
+ -- Chunk_Id
+
+ -- Chunk_Id is a contiguous block of dynamically allocated stack. First
+ -- and Last indicate the range of secondary stack addresses present in the
+ -- chunk. Chunk_Ptr points to a Chunk_Id block.
+
+ type Chunk_Id (First, Last : SS_Ptr);
+ type Chunk_Ptr is access all Chunk_Id;
+
+ type Chunk_Id (First, Last : SS_Ptr) is record
+ Prev, Next : Chunk_Ptr;
+ Mem : Memory (First .. Last);
+ end record;
+
+ -- Secondary stack data structure
+
+ type SS_Stack (Size : SP.Size_Type) is record
+ Top : SS_Ptr;
+ -- Index of next available location in the stack. Initialized to 1 and
+ -- then incremented on Allocate and decremented on Release.
+
+ Max : SS_Ptr;
+ -- Contains the high-water mark of Top. Initialized to 1 and then
+ -- may be incremented on Allocate but never decremented. Since
+ -- Top = Size + 1 represents a fully used stack, Max - 1 indicates
+ -- the size of the stack used in bytes.
+
+ Current_Chunk : Chunk_Ptr;
+ -- A link to the chunk containing the highest range of the stack
+
+ Freeable : Boolean;
+ -- Indicates if an object of this type can be freed
+
+ Internal_Chunk : aliased Chunk_Id (1, Size);
+ -- Initial memory allocation of the secondary stack
+ end record;
type Mark_Id is record
- Sstk : System.Address;
- Sptr : SS_Ptr;
+ Sec_Stack : SS_Stack_Ptr;
+ Sptr : SS_Ptr;
end record;
- -- A mark value contains the address of the secondary stack structure,
- -- as returned by System.Soft_Links.Get_Sec_Stack_Addr, and a stack
- -- pointer value corresponding to the point of the mark call.
+ -- Contains the pointer to the secondary stack object and the stack pointer
+ -- value corresponding to the top of the stack at the time of the mark
+ -- call.
+
+ ------------------------------------
+ -- Binder Allocated Stack Support --
+ ------------------------------------
+
+ -- When the No_Implicit_Heap_Allocations or No_Implicit_Task_Allocations
+ -- restrictions are in effect, the binder statically generates secondary
+ -- stacks for tasks that use a default-sized secondary stack. Assignment
+ -- of these stacks to tasks is handled by SS_Init. The following variables
+ -- assist SS_Init and are defined here so the runtime does not depend on
+ -- the binder.
+
+ Binder_SS_Count : Natural;
+ pragma Export (Ada, Binder_SS_Count, "__gnat_binder_ss_count");
+ -- The number of default-sized secondary stacks allocated by the binder
+
+ Default_SS_Size : SP.Size_Type;
+ pragma Export (Ada, Default_SS_Size, "__gnat_default_ss_size");
+ -- The default size for secondary stacks. Defined here and not in init.c/
+ -- System.Init because these locations are not present on ZFP or
+ -- Ravenscar-SFP run-times.
+
+ Default_Sized_SS_Pool : System.Address;
+ pragma Export (Ada, Default_Sized_SS_Pool, "__gnat_default_ss_pool");
+ -- Address of the secondary stack pool generated by the binder that
+ -- contains default-sized stacks.
+
+ Num_Of_Assigned_Stacks : Natural := 0;
+ -- The number of currently allocated secondary stacks
end System.Secondary_Stack;
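To make the mark/release protocol of the new spec concrete, here is a minimal usage sketch. It follows the pattern the compiler expands around constructs that allocate on the secondary stack; the procedure name and the 128-byte request are invented, and SS_Allocate implicitly operates on the calling task's stack through the soft links.

with System;
with System.Secondary_Stack; use System.Secondary_Stack;

procedure Mark_Release_Sketch is
   M    : constant Mark_Id := SS_Mark;
   Addr : System.Address;
begin
   --  Carve 128 bytes (rounded up to maximum alignment) from the
   --  current secondary stack
   SS_Allocate (Addr, 128);

   --  ... the storage designated by Addr is used here ...

   --  Pop everything allocated since the mark was taken
   SS_Release (M);
end Mark_Release_Sketch;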
diff --git a/gcc/ada/libgnat/s-soflin.adb b/gcc/ada/libgnat/s-soflin.adb
index f604f4df3be..94ead0306fa 100644
--- a/gcc/ada/libgnat/s-soflin.adb
+++ b/gcc/ada/libgnat/s-soflin.adb
@@ -35,25 +35,19 @@ pragma Polling (Off);
-- We must turn polling off for this unit, because otherwise we get an
-- infinite loop from the code within the Poll routine itself.
-with System.Parameters;
-
pragma Warnings (Off);
--- Disable warnings since System.Secondary_Stack is currently not Preelaborate
-with System.Secondary_Stack;
+-- Disable warnings as System.Soft_Links.Initialize is not Preelaborate. It is
+-- safe to with this unit as its elaboration routine will only be initializing
+-- NT_TSD, which is part of this package spec.
+with System.Soft_Links.Initialize;
pragma Warnings (On);
package body System.Soft_Links is
- package SST renames System.Secondary_Stack;
-
- NT_TSD : TSD;
- -- Note: we rely on the default initialization of NT_TSD
-
- -- Needed for Vx6Cert (Vx653mc) GOS cert and ravenscar-cert runtimes,
- -- VxMILS cert, ravenscar-cert and full runtimes, Vx 5 default runtime
Stack_Limit : aliased System.Address := System.Null_Address;
-
pragma Export (C, Stack_Limit, "__gnat_stack_limit");
+ -- Needed for Vx6Cert (Vx653mc) GOS cert and ravenscar-cert runtimes,
+ -- VxMILS cert, ravenscar-cert and full runtimes, Vx 5 default runtime
--------------------
-- Abort_Defer_NT --
@@ -125,14 +119,16 @@ package body System.Soft_Links is
-- Create_TSD --
----------------
- procedure Create_TSD (New_TSD : in out TSD) is
- use Parameters;
- SS_Ratio_Dynamic : constant Boolean := Sec_Stack_Percentage = Dynamic;
+ procedure Create_TSD
+ (New_TSD : in out TSD;
+ Sec_Stack : SST.SS_Stack_Ptr;
+ Sec_Stack_Size : System.Parameters.Size_Type)
+ is
begin
- if SS_Ratio_Dynamic then
- SST.SS_Init
- (New_TSD.Sec_Stack_Addr, SST.Default_Secondary_Stack_Size);
- end if;
+ New_TSD.Jmpbuf_Address := Null_Address;
+
+ New_TSD.Sec_Stack_Ptr := Sec_Stack;
+ SST.SS_Init (New_TSD.Sec_Stack_Ptr, Sec_Stack_Size);
end Create_TSD;
-----------------------
@@ -150,7 +146,7 @@ package body System.Soft_Links is
procedure Destroy_TSD (Old_TSD : in out TSD) is
begin
- SST.SS_Free (Old_TSD.Sec_Stack_Addr);
+ SST.SS_Free (Old_TSD.Sec_Stack_Ptr);
end Destroy_TSD;
---------------------
@@ -198,23 +194,23 @@ package body System.Soft_Links is
return Get_Jmpbuf_Address.all;
end Get_Jmpbuf_Address_Soft;
- ---------------------------
- -- Get_Sec_Stack_Addr_NT --
- ---------------------------
+ ----------------------
+ -- Get_Sec_Stack_NT --
+ ----------------------
- function Get_Sec_Stack_Addr_NT return Address is
+ function Get_Sec_Stack_NT return SST.SS_Stack_Ptr is
begin
- return NT_TSD.Sec_Stack_Addr;
- end Get_Sec_Stack_Addr_NT;
+ return NT_TSD.Sec_Stack_Ptr;
+ end Get_Sec_Stack_NT;
-----------------------------
- -- Get_Sec_Stack_Addr_Soft --
+ -- Get_Sec_Stack_Soft --
-----------------------------
- function Get_Sec_Stack_Addr_Soft return Address is
+ function Get_Sec_Stack_Soft return SST.SS_Stack_Ptr is
begin
- return Get_Sec_Stack_Addr.all;
- end Get_Sec_Stack_Addr_Soft;
+ return Get_Sec_Stack.all;
+ end Get_Sec_Stack_Soft;
-----------------------
-- Get_Stack_Info_NT --
@@ -254,23 +250,23 @@ package body System.Soft_Links is
Set_Jmpbuf_Address (Addr);
end Set_Jmpbuf_Address_Soft;
- ---------------------------
- -- Set_Sec_Stack_Addr_NT --
- ---------------------------
+ ----------------------
+ -- Set_Sec_Stack_NT --
+ ----------------------
- procedure Set_Sec_Stack_Addr_NT (Addr : Address) is
+ procedure Set_Sec_Stack_NT (Stack : SST.SS_Stack_Ptr) is
begin
- NT_TSD.Sec_Stack_Addr := Addr;
- end Set_Sec_Stack_Addr_NT;
+ NT_TSD.Sec_Stack_Ptr := Stack;
+ end Set_Sec_Stack_NT;
- -----------------------------
- -- Set_Sec_Stack_Addr_Soft --
- -----------------------------
+ ------------------------
+ -- Set_Sec_Stack_Soft --
+ ------------------------
- procedure Set_Sec_Stack_Addr_Soft (Addr : Address) is
+ procedure Set_Sec_Stack_Soft (Stack : SST.SS_Stack_Ptr) is
begin
- Set_Sec_Stack_Addr (Addr);
- end Set_Sec_Stack_Addr_Soft;
+ Set_Sec_Stack (Stack);
+ end Set_Sec_Stack_Soft;
------------------
-- Task_Lock_NT --
@@ -308,5 +304,4 @@ package body System.Soft_Links is
begin
null;
end Task_Unlock_NT;
-
end System.Soft_Links;
diff --git a/gcc/ada/libgnat/s-soflin.ads b/gcc/ada/libgnat/s-soflin.ads
index 402ea84818b..4242fcee7ee 100644
--- a/gcc/ada/libgnat/s-soflin.ads
+++ b/gcc/ada/libgnat/s-soflin.ads
@@ -40,11 +40,15 @@
pragma Compiler_Unit_Warning;
with Ada.Exceptions;
+with System.Parameters;
+with System.Secondary_Stack;
with System.Stack_Checking;
package System.Soft_Links is
pragma Preelaborate;
+ package SST renames System.Secondary_Stack;
+
subtype EOA is Ada.Exceptions.Exception_Occurrence_Access;
subtype EO is Ada.Exceptions.Exception_Occurrence;
@@ -89,6 +93,11 @@ package System.Soft_Links is
type Set_EO_Call is access procedure (Excep : EO);
pragma Favor_Top_Level (Set_EO_Call);
+ type Get_Stack_Call is access function return SST.SS_Stack_Ptr;
+ pragma Favor_Top_Level (Get_Stack_Call);
+ type Set_Stack_Call is access procedure (Stack : SST.SS_Stack_Ptr);
+ pragma Favor_Top_Level (Set_Stack_Call);
+
type Special_EO_Call is access
procedure (Excep : EO := Current_Target_Exception);
pragma Favor_Top_Level (Special_EO_Call);
@@ -118,6 +127,8 @@ package System.Soft_Links is
pragma Suppress (Access_Check, Set_Integer_Call);
pragma Suppress (Access_Check, Get_EOA_Call);
pragma Suppress (Access_Check, Set_EOA_Call);
+ pragma Suppress (Access_Check, Get_Stack_Call);
+ pragma Suppress (Access_Check, Set_Stack_Call);
pragma Suppress (Access_Check, Timed_Delay_Call);
pragma Suppress (Access_Check, Get_Stack_Access_Call);
pragma Suppress (Access_Check, Task_Name_Call);
@@ -228,11 +239,11 @@ package System.Soft_Links is
Get_Jmpbuf_Address : Get_Address_Call := Get_Jmpbuf_Address_NT'Access;
Set_Jmpbuf_Address : Set_Address_Call := Set_Jmpbuf_Address_NT'Access;
- function Get_Sec_Stack_Addr_NT return Address;
- procedure Set_Sec_Stack_Addr_NT (Addr : Address);
+ function Get_Sec_Stack_NT return SST.SS_Stack_Ptr;
+ procedure Set_Sec_Stack_NT (Stack : SST.SS_Stack_Ptr);
- Get_Sec_Stack_Addr : Get_Address_Call := Get_Sec_Stack_Addr_NT'Access;
- Set_Sec_Stack_Addr : Set_Address_Call := Set_Sec_Stack_Addr_NT'Access;
+ Get_Sec_Stack : Get_Stack_Call := Get_Sec_Stack_NT'Access;
+ Set_Sec_Stack : Set_Stack_Call := Set_Sec_Stack_NT'Access;
function Get_Current_Excep_NT return EOA;
@@ -320,19 +331,14 @@ package System.Soft_Links is
-- must be initialized to the tasks requested stack size before the task
-- can do its first stack check.
- pragma Warnings (Off);
- -- Needed because we are giving a non-static default to an object in
- -- a preelaborated unit, which is formally not permitted, but OK here.
-
- Jmpbuf_Address : System.Address := System.Null_Address;
+ Jmpbuf_Address : System.Address;
-- Address of jump buffer used to store the address of the current
-- longjmp/setjmp buffer for exception management. These buffers are
-- threaded into a stack, and the address here is the top of the stack.
-- A null address means that no exception handler is currently active.
- Sec_Stack_Addr : System.Address := System.Null_Address;
- pragma Warnings (On);
- -- Address of currently allocated secondary stack
+ Sec_Stack_Ptr : SST.SS_Stack_Ptr;
+ -- Pointer of the allocated secondary stack
Current_Excep : aliased EO;
-- Exception occurrence that contains the information for the current
@@ -344,7 +350,10 @@ package System.Soft_Links is
-- exception mechanism, organized as a stack with the most recent first.
end record;
- procedure Create_TSD (New_TSD : in out TSD);
+ procedure Create_TSD
+ (New_TSD : in out TSD;
+ Sec_Stack : SST.SS_Stack_Ptr;
+ Sec_Stack_Size : System.Parameters.Size_Type);
pragma Inline (Create_TSD);
-- Called from s-tassta when a new thread is created to perform
-- any required initialization of the TSD.
@@ -370,10 +379,10 @@ package System.Soft_Links is
pragma Inline (Get_Jmpbuf_Address_Soft);
pragma Inline (Set_Jmpbuf_Address_Soft);
- function Get_Sec_Stack_Addr_Soft return Address;
- procedure Set_Sec_Stack_Addr_Soft (Addr : Address);
- pragma Inline (Get_Sec_Stack_Addr_Soft);
- pragma Inline (Set_Sec_Stack_Addr_Soft);
+ function Get_Sec_Stack_Soft return SST.SS_Stack_Ptr;
+ procedure Set_Sec_Stack_Soft (Stack : SST.SS_Stack_Ptr);
+ pragma Inline (Get_Sec_Stack_Soft);
+ pragma Inline (Set_Sec_Stack_Soft);
-- The following is a dummy record designed to mimic Communication_Block as
-- defined in s-tpobop.ads:
@@ -396,4 +405,11 @@ package System.Soft_Links is
Comp_3 : Boolean;
end record;
+private
+ NT_TSD : TSD;
+ -- The task-specific data for the main task when the Ada tasking run-time
+ -- is not used. It relies on the default initialization of NT_TSD. It is
+ -- placed here and not in the body to ensure the default initialization
+ -- does not clobber the secondary stack initialization that occurs as part
+ -- of System.Soft_Links.Initialize.
end System.Soft_Links;
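Retargeting of the new Get_Sec_Stack/Set_Sec_Stack soft links is shown concretely in s-thread__ae653.adb further down. As a standalone illustration, the hypothetical two-unit sketch below installs its own hooks the same way; the unit and subprogram names are invented, and per-task storage is reduced to a single variable.

with System.Secondary_Stack;

package Sec_Stack_Hooks_Sketch is
   package SST renames System.Secondary_Stack;

   function  Get_Hook return SST.SS_Stack_Ptr;
   procedure Set_Hook (Stack : SST.SS_Stack_Ptr);

   procedure Install;
   --  Point the secondary stack soft links at the hooks above
end Sec_Stack_Hooks_Sketch;

with System.Soft_Links;

package body Sec_Stack_Hooks_Sketch is
   package SSL renames System.Soft_Links;

   Current : SST.SS_Stack_Ptr;
   --  Stand-in for per-task storage; a real run-time would read and write
   --  the task's TSD instead

   function Get_Hook return SST.SS_Stack_Ptr is
   begin
      return Current;
   end Get_Hook;

   procedure Set_Hook (Stack : SST.SS_Stack_Ptr) is
   begin
      Current := Stack;
   end Set_Hook;

   procedure Install is
   begin
      SSL.Get_Sec_Stack := Get_Hook'Access;
      SSL.Set_Sec_Stack := Set_Hook'Access;
   end Install;
end Sec_Stack_Hooks_Sketch;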
diff --git a/gcc/ada/libgnat/s-soliin.adb b/gcc/ada/libgnat/s-soliin.adb
new file mode 100644
index 00000000000..5364e46f6f4
--- /dev/null
+++ b/gcc/ada/libgnat/s-soliin.adb
@@ -0,0 +1,47 @@
+------------------------------------------------------------------------------
+-- --
+-- GNAT COMPILER COMPONENTS --
+-- --
+-- S Y S T E M . S O F T _ L I N K S . I N I T I A L I Z E --
+-- --
+-- B o d y --
+-- --
+-- Copyright (C) 2017, Free Software Foundation, Inc. --
+-- --
+-- GNAT is free software; you can redistribute it and/or modify it under --
+-- terms of the GNU General Public License as published by the Free Soft- --
+-- ware Foundation; either version 3, or (at your option) any later ver- --
+-- sion. GNAT is distributed in the hope that it will be useful, but WITH- --
+-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY --
+-- or FITNESS FOR A PARTICULAR PURPOSE. --
+-- --
+-- As a special exception under Section 7 of GPL version 3, you are granted --
+-- additional permissions described in the GCC Runtime Library Exception, --
+-- version 3.1, as published by the Free Software Foundation. --
+-- --
+-- You should have received a copy of the GNU General Public License and --
+-- a copy of the GCC Runtime Library Exception along with this program; --
+-- see the files COPYING3 and COPYING.RUNTIME respectively. If not, see --
+-- <http://www.gnu.org/licenses/>. --
+-- --
+-- GNAT was originally developed by the GNAT team at New York University. --
+-- Extensive contributions were provided by Ada Core Technologies Inc. --
+-- --
+------------------------------------------------------------------------------
+
+with System.Secondary_Stack;
+
+package body System.Soft_Links.Initialize is
+
+ package SSS renames System.Secondary_Stack;
+
+begin
+ -- Initialize the TSD of the main task
+
+ NT_TSD.Jmpbuf_Address := System.Null_Address;
+
+ -- Allocate and initialize the secondary stack for the main task
+
+ NT_TSD.Sec_Stack_Ptr := null;
+ SSS.SS_Init (NT_TSD.Sec_Stack_Ptr);
+end System.Soft_Links.Initialize;
diff --git a/gcc/ada/libgnat/s-soliin.ads b/gcc/ada/libgnat/s-soliin.ads
new file mode 100644
index 00000000000..ba9cf745f48
--- /dev/null
+++ b/gcc/ada/libgnat/s-soliin.ads
@@ -0,0 +1,48 @@
+------------------------------------------------------------------------------
+-- --
+-- GNAT COMPILER COMPONENTS --
+-- --
+-- S Y S T E M . S O F T _ L I N K S . I N I T I A L I Z E --
+-- --
+-- S p e c --
+-- --
+-- Copyright (C) 2017, Free Software Foundation, Inc. --
+-- --
+-- GNAT is free software; you can redistribute it and/or modify it under --
+-- terms of the GNU General Public License as published by the Free Soft- --
+-- ware Foundation; either version 3, or (at your option) any later ver- --
+-- sion. GNAT is distributed in the hope that it will be useful, but WITH- --
+-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY --
+-- or FITNESS FOR A PARTICULAR PURPOSE. --
+-- --
+-- As a special exception under Section 7 of GPL version 3, you are granted --
+-- additional permissions described in the GCC Runtime Library Exception, --
+-- version 3.1, as published by the Free Software Foundation. --
+-- --
+-- You should have received a copy of the GNU General Public License and --
+-- a copy of the GCC Runtime Library Exception along with this program; --
+-- see the files COPYING3 and COPYING.RUNTIME respectively. If not, see --
+-- <http://www.gnu.org/licenses/>. --
+-- --
+-- GNAT was originally developed by the GNAT team at New York University. --
+-- Extensive contributions were provided by Ada Core Technologies Inc. --
+-- --
+------------------------------------------------------------------------------
+
+-- This package exists to initialize the TSD record of the main task and, in
+-- the process, to allocate and initialize the secondary stack for the main
+-- task. The initialization routine is contained within its own package
+-- because System.Soft_Links and System.Secondary_Stack are both Preelaborate
+-- packages that are parents of other Preelaborate System packages.
+
+-- Ideally, the secondary stack would be set up via __gnat_runtime_initialize
+-- to have the secondary stack active as early as possible and to remove the
+-- awkwardness of System.Soft_Links depending on a non-Preelaborate package.
+-- However, as this procedure has only existed since 2014, the elaboration
+-- mechanism is used instead for bootstrapping purposes to perform these
+-- functions.
+
+package System.Soft_Links.Initialize is
+ pragma Elaborate_Body;
+ -- Allow this package to have a body
+end System.Soft_Links.Initialize;
diff --git a/gcc/ada/libgnat/s-thread.ads b/gcc/ada/libgnat/s-thread.ads
index cd4faaec1ed..185141b1f1b 100644
--- a/gcc/ada/libgnat/s-thread.ads
+++ b/gcc/ada/libgnat/s-thread.ads
@@ -42,10 +42,13 @@ with Ada.Unchecked_Conversion;
with Interfaces.C;
+with System.Secondary_Stack;
with System.Soft_Links;
package System.Threads is
+ package SST renames System.Secondary_Stack;
+
type ATSD is limited private;
-- Type of the Ada thread specific data. It contains datas needed
-- by the GNAT runtime.
@@ -71,8 +74,7 @@ package System.Threads is
-- wrapper in the APEX process registration package.
procedure Thread_Body_Enter
- (Sec_Stack_Address : System.Address;
- Sec_Stack_Size : Natural;
+ (Sec_Stack_Ptr : SST.SS_Stack_Ptr;
Process_ATSD_Address : System.Address);
-- Enter thread body, see above for details
diff --git a/gcc/ada/libgnat/s-thread__ae653.adb b/gcc/ada/libgnat/s-thread__ae653.adb
index ca871286fce..9e8b2abb946 100644
--- a/gcc/ada/libgnat/s-thread__ae653.adb
+++ b/gcc/ada/libgnat/s-thread__ae653.adb
@@ -37,15 +37,11 @@ pragma Restrictions (No_Tasking);
-- will be checked by the binder.
with System.OS_Versions; use System.OS_Versions;
-with System.Secondary_Stack;
-pragma Elaborate_All (System.Secondary_Stack);
package body System.Threads is
use Interfaces.C;
- package SSS renames System.Secondary_Stack;
-
package SSL renames System.Soft_Links;
Current_ATSD : aliased System.Address := System.Null_Address;
@@ -94,17 +90,16 @@ package body System.Threads is
procedure Install_Handler;
pragma Import (C, Install_Handler, "__gnat_install_handler");
- function Get_Sec_Stack_Addr return Address;
+ function Get_Sec_Stack return SST.SS_Stack_Ptr;
- procedure Set_Sec_Stack_Addr (Addr : Address);
+ procedure Set_Sec_Stack (Stack : SST.SS_Stack_Ptr);
-----------------------
-- Thread_Body_Enter --
-----------------------
procedure Thread_Body_Enter
- (Sec_Stack_Address : System.Address;
- Sec_Stack_Size : Natural;
+ (Sec_Stack_Ptr : SST.SS_Stack_Ptr;
Process_ATSD_Address : System.Address)
is
-- Current_ATSD must already be a taskVar of taskIdSelf.
@@ -115,8 +110,8 @@ package body System.Threads is
begin
- TSD.Sec_Stack_Addr := Sec_Stack_Address;
- SSS.SS_Init (TSD.Sec_Stack_Addr, Sec_Stack_Size);
+ TSD.Sec_Stack_Ptr := Sec_Stack_Ptr;
+ SST.SS_Init (TSD.Sec_Stack_Ptr);
Current_ATSD := Process_ATSD_Address;
Install_Handler;
@@ -166,23 +161,23 @@ package body System.Threads is
pragma Assert (Result /= ERROR);
begin
- Main_ATSD.Sec_Stack_Addr := SSL.Get_Sec_Stack_Addr_NT;
+ Main_ATSD.Sec_Stack_Ptr := SSL.Get_Sec_Stack_NT;
Current_ATSD := Main_ATSD'Address;
Install_Handler;
- SSL.Get_Sec_Stack_Addr := Get_Sec_Stack_Addr'Access;
- SSL.Set_Sec_Stack_Addr := Set_Sec_Stack_Addr'Access;
+ SSL.Get_Sec_Stack := Get_Sec_Stack'Access;
+ SSL.Set_Sec_Stack := Set_Sec_Stack'Access;
end Init_RTS;
- ------------------------
- -- Get_Sec_Stack_Addr --
- ------------------------
+ -------------------
+ -- Get_Sec_Stack --
+ -------------------
- function Get_Sec_Stack_Addr return Address is
+ function Get_Sec_Stack return SST.SS_Stack_Ptr is
CTSD : constant ATSD_Access := From_Address (Current_ATSD);
begin
pragma Assert (CTSD /= null);
- return CTSD.Sec_Stack_Addr;
- end Get_Sec_Stack_Addr;
+ return CTSD.Sec_Stack_Ptr;
+ end Get_Sec_Stack;
--------------
-- Register --
@@ -229,16 +224,16 @@ package body System.Threads is
return Result;
end Register;
- ------------------------
- -- Set_Sec_Stack_Addr --
- ------------------------
+ -------------------
+ -- Set_Sec_Stack --
+ -------------------
- procedure Set_Sec_Stack_Addr (Addr : Address) is
+ procedure Set_Sec_Stack (Stack : SST.SS_Stack_Ptr) is
CTSD : constant ATSD_Access := From_Address (Current_ATSD);
begin
pragma Assert (CTSD /= null);
- CTSD.Sec_Stack_Addr := Addr;
- end Set_Sec_Stack_Addr;
+ CTSD.Sec_Stack_Ptr := Stack;
+ end Set_Sec_Stack;
begin
-- Initialize run-time library
diff --git a/gcc/ada/opt.ads b/gcc/ada/opt.ads
index 687d1eb75b9..96e2f3e2f92 100644
--- a/gcc/ada/opt.ads
+++ b/gcc/ada/opt.ads
@@ -462,18 +462,21 @@ package Opt is
-- otherwise: "pragma Default_Storage_Pool (X);" applies, and
-- this points to the name X.
-- Push_Scope and Pop_Scope in Sem_Ch8 save and restore this value.
- Default_Stack_Size : Int := -1;
+
+ No_Stack_Size : constant := -1;
+
+ Default_Stack_Size : Int := No_Stack_Size;
-- GNATBIND
- -- Set to default primary stack size in units of bytes. Set by
- -- the -dnnn switch for the binder. A value of -1 indicates that no
- -- default was set by the binder.
+ -- Set to default primary stack size in units of bytes. Set by the -dnnn
+ -- switch for the binder. A value of No_Stack_Size indicates that
+ -- no default was set by the binder.
- Default_Sec_Stack_Size : Int := -1;
+ Default_Sec_Stack_Size : Int := No_Stack_Size;
-- GNATBIND
- -- Set to default secondary stack size in units of bytes. Set by
- -- the -Dnnn switch for the binder. A value of -1 indicates that no
- -- default was set by the binder, and that the default should be the
- -- initial value of System.Secondary_Stack.Default_Secondary_Stack_Size.
+ -- Set to default secondary stack size in units of bytes. Set by the -Dnnn
+ -- switch for the binder. A value of No_Stack_Size indicates that no
+ -- default was set by the binder and the run-time value should be used
+ -- instead.
Default_SSO : Character := ' ';
-- GNAT
@@ -1313,6 +1316,13 @@ package Opt is
-- Indicates if a project file is used or not. Set to In_Use by the first
-- SFNP pragma.
+ Quantity_Of_Default_Size_Sec_Stacks : Int := -1;
+ -- GNATBIND
+ -- The number of default-sized secondary stacks that the binder should
+ -- generate. Allows ZFP users to have the binder generate extra stacks if
+ -- needed to support multithreaded applications. A value of -1 indicates
+ -- that no value was set by the binder.
+
Queuing_Policy : Character := ' ';
-- GNAT, GNATBIND
-- Set to ' ' for the default case (no queuing policy specified). Reset to
diff --git a/gcc/ada/rtfinal.c b/gcc/ada/rtfinal.c
index 8f7e163cded..9398af393ba 100644
--- a/gcc/ada/rtfinal.c
+++ b/gcc/ada/rtfinal.c
@@ -6,7 +6,7 @@
* *
* C Implementation File *
* *
- * Copyright (C) 2014, Free Software Foundation, Inc. *
+ * Copyright (C) 2014-2017, Free Software Foundation, Inc. *
* *
* GNAT is free software; you can redistribute it and/or modify it under *
* terms of the GNU General Public License as published by the Free Soft- *
@@ -40,7 +40,7 @@ extern void __gnat_runtime_finalize (void);
at all, the intention is that this be replaced by system specific code
where finalization is required.
- Note that __gnat_runtime_initialize() is called in adafinal() */
+ Note that __gnat_runtime_finalize() is called in adafinal() */
extern int __gnat_rt_init_count;
/* see initialize.c */
diff --git a/gcc/ada/rtsfind.ads b/gcc/ada/rtsfind.ads
index bdad2520fd4..c4d7d3c80c6 100644
--- a/gcc/ada/rtsfind.ads
+++ b/gcc/ada/rtsfind.ads
@@ -1249,6 +1249,7 @@ package Rtsfind is
RE_Set_63, -- System.Pack_63
RE_Adjust_Storage_Size, -- System.Parameters
+ RE_Default_Secondary_Stack_Size, -- System.Parameters
RE_Default_Stack_Size, -- System.Parameters
RE_Garbage_Collected, -- System.Parameters
RE_Size_Type, -- System.Parameters
@@ -1424,12 +1425,12 @@ package Rtsfind is
RE_IS_Ilf, -- System.Scalar_Values
RE_IS_Ill, -- System.Scalar_Values
- RE_Default_Secondary_Stack_Size, -- System.Secondary_Stack
RE_Mark_Id, -- System.Secondary_Stack
RE_SS_Allocate, -- System.Secondary_Stack
RE_SS_Pool, -- System.Secondary_Stack
RE_SS_Mark, -- System.Secondary_Stack
RE_SS_Release, -- System.Secondary_Stack
+ RE_SS_Stack, -- System.Secondary_Stack
RE_Shared_Var_Lock, -- System.Shared_Storage
RE_Shared_Var_Unlock, -- System.Shared_Storage
@@ -2487,6 +2488,7 @@ package Rtsfind is
RE_Set_63 => System_Pack_63,
RE_Adjust_Storage_Size => System_Parameters,
+ RE_Default_Secondary_Stack_Size => System_Parameters,
RE_Default_Stack_Size => System_Parameters,
RE_Garbage_Collected => System_Parameters,
RE_Size_Type => System_Parameters,
@@ -2662,12 +2664,12 @@ package Rtsfind is
RE_IS_Ilf => System_Scalar_Values,
RE_IS_Ill => System_Scalar_Values,
- RE_Default_Secondary_Stack_Size => System_Secondary_Stack,
RE_Mark_Id => System_Secondary_Stack,
RE_SS_Allocate => System_Secondary_Stack,
RE_SS_Mark => System_Secondary_Stack,
RE_SS_Pool => System_Secondary_Stack,
RE_SS_Release => System_Secondary_Stack,
+ RE_SS_Stack => System_Secondary_Stack,
RE_Shared_Var_Lock => System_Shared_Storage,
RE_Shared_Var_Unlock => System_Shared_Storage,
diff --git a/gcc/ada/sem_aggr.adb b/gcc/ada/sem_aggr.adb
index 677d59999dd..6c29b38b93a 100644
--- a/gcc/ada/sem_aggr.adb
+++ b/gcc/ada/sem_aggr.adb
@@ -1594,7 +1594,7 @@ package body Sem_Aggr is
-- unless the expression covers a single component, or the
-- expander is inactive.
- -- In SPARK mode, expressions that can perform side-effects will
+ -- In SPARK mode, expressions that can perform side effects will
-- be recognized by the gnat2why back-end, and the whole
-- subprogram will be ignored. So semantic analysis can be
-- performed safely.
@@ -3605,7 +3605,7 @@ package body Sem_Aggr is
-- This is redundant if the others_choice covers only
-- one component (small optimization possible???), but
-- indispensable otherwise, because each one must be
- -- expanded individually to preserve side-effects.
+ -- expanded individually to preserve side effects.
-- Ada 2005 (AI-287): In case of default initialization
-- of components, we duplicate the corresponding default
@@ -3881,7 +3881,7 @@ package body Sem_Aggr is
-- expansion is delayed until the enclosing aggregate is expanded
-- into assignments. In that case, do not generate checks on the
-- expression, because they will be generated later, and will other-
- -- wise force a copy (to remove side-effects) that would leave a
+ -- wise force a copy (to remove side effects) that would leave a
-- dynamic-sized aggregate in the code, something that gigi cannot
-- handle.
diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
index 223703d2a43..ac5035fd1bc 100644
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -5305,8 +5305,7 @@ package body Sem_Ch12 is
Valid_Operator_Definition (Act_Decl_Id);
end if;
- Set_Alias (Act_Decl_Id, Anon_Id);
- Set_Parent (Act_Decl_Id, Parent (Anon_Id));
+ Set_Alias (Act_Decl_Id, Anon_Id);
Set_Has_Completion (Act_Decl_Id);
Set_Related_Instance (Pack_Id, Act_Decl_Id);
@@ -6460,10 +6459,11 @@ package body Sem_Ch12 is
elsif Ekind (E1) = E_Package then
Check_Mismatch
(Ekind (E1) /= Ekind (E2)
- or else Renamed_Object (E1) /= Renamed_Object (E2));
+ or else (Present (Renamed_Object (E2))
+ and then Renamed_Object (E1) /=
+ Renamed_Object (E2)));
elsif Is_Overloadable (E1) then
-
-- Verify that the actual subprograms match. Note that actuals
-- that are attributes are rewritten as subprograms. If the
-- subprogram in the formal package is defaulted, no check is
diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
index c163aab8e78..1e3b78ccf2f 100644
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -2820,24 +2820,10 @@ package body Sem_Ch3 is
-- Analyze the contracts of packages and their bodies
- if Nkind (Context) = N_Package_Specification then
-
- -- When a package has private declarations, its contract must be
- -- analyzed at the end of the said declarations. This way both the
- -- analysis and freeze actions are properly synchronized in case
- -- of private type use within the contract.
-
- if L = Private_Declarations (Context) then
- Analyze_Package_Contract (Defining_Entity (Context));
-
- -- Otherwise the contract is analyzed at the end of the visible
- -- declarations.
-
- elsif L = Visible_Declarations (Context)
- and then No (Private_Declarations (Context))
- then
- Analyze_Package_Contract (Defining_Entity (Context));
- end if;
+ if Nkind (Context) = N_Package_Specification
+ and then L = Visible_Declarations (Context)
+ then
+ Analyze_Package_Contract (Defining_Entity (Context));
elsif Nkind (Context) = N_Package_Body then
Analyze_Package_Body_Contract (Defining_Entity (Context));
diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
index fad52ebd106..538023524e3 100644
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -6431,10 +6431,24 @@ package body Sem_Ch4 is
Op_Id : Entity_Id;
N : Node_Id)
is
- Op_Type : constant Entity_Id := Etype (Op_Id);
+ Is_String : constant Boolean := Nkind (L) = N_String_Literal
+ or else
+ Nkind (R) = N_String_Literal;
+ Op_Type : constant Entity_Id := Etype (Op_Id);
begin
if Is_Array_Type (Op_Type)
+
+ -- Small but very effective optimization: if at least one operand is a
+ -- string literal, then the type of the operator must be either array
+ -- of characters or array of strings.
+
+ and then (not Is_String
+ or else
+ Is_Character_Type (Component_Type (Op_Type))
+ or else
+ Is_String_Type (Component_Type (Op_Type)))
+
and then not Is_Limited_Type (Op_Type)
and then (Has_Compatible_Type (L, Op_Type)
diff --git a/gcc/ada/sem_ch5.adb b/gcc/ada/sem_ch5.adb
index 8c92669876c..10002ea08c2 100644
--- a/gcc/ada/sem_ch5.adb
+++ b/gcc/ada/sem_ch5.adb
@@ -1090,12 +1090,14 @@ package body Sem_Ch5 is
-- the context of the assignment statement. Restore the expander mode
-- now so that assignment statement can be properly expanded.
- if Nkind (N) = N_Assignment_Statement and then Has_Target_Names (N) then
- Expander_Mode_Restore;
- Full_Analysis := Save_Full_Analysis;
- end if;
+ if Nkind (N) = N_Assignment_Statement then
+ if Has_Target_Names (N) then
+ Expander_Mode_Restore;
+ Full_Analysis := Save_Full_Analysis;
+ end if;
- pragma Assert (not Should_Transform_BIP_Assignment (Typ => T1));
+ pragma Assert (not Should_Transform_BIP_Assignment (Typ => T1));
+ end if;
end Analyze_Assignment;
-----------------------------
diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
index a85ca60cd5f..4f719e9b81c 100644
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -442,18 +442,12 @@ package body Sem_Ch6 is
begin
-- Preanalyze a duplicate of the expression to have available the
-- minimum decoration needed to locate referenced unfrozen types
- -- without adding any decoration to the function expression. This
- -- preanalysis is performed with errors disabled to avoid reporting
- -- spurious errors on Ghost entities (since the expression is not
- -- fully analyzed).
+ -- without adding any decoration to the function expression.
Push_Scope (Def_Id);
Install_Formals (Def_Id);
- Ignore_Errors_Enable := Ignore_Errors_Enable + 1;
Preanalyze_Spec_Expression (Dup_Expr, Etype (Def_Id));
-
- Ignore_Errors_Enable := Ignore_Errors_Enable - 1;
End_Scope;
-- Restore certain attributes of Def_Id since the preanalysis may
diff --git a/gcc/ada/sem_ch8.adb b/gcc/ada/sem_ch8.adb
index aa53045498b..bdc8aba1e1f 100644
--- a/gcc/ada/sem_ch8.adb
+++ b/gcc/ada/sem_ch8.adb
@@ -3644,19 +3644,16 @@ package body Sem_Ch8 is
-- and mark any use_package_clauses that affect the visibility of the
-- implicit generic actual.
- if From_Default (N)
- and then Is_Generic_Actual_Subprogram (New_S)
- and then Present (Alias (New_S))
+ if Is_Generic_Actual_Subprogram (New_S)
+ and then (Is_Intrinsic_Subprogram (New_S) or else From_Default (N))
then
- Mark_Use_Clauses (Alias (New_S));
+ Mark_Use_Clauses (New_S);
- -- Check intrinsic operators used as generic actuals since they may
- -- make a use_type_clause effective.
+ -- Handle overloaded subprograms
- elsif Is_Generic_Actual_Subprogram (New_S)
- and then Is_Intrinsic_Subprogram (New_S)
- then
- Mark_Use_Clauses (New_S);
+ if Present (Alias (New_S)) then
+ Mark_Use_Clauses (Alias (New_S));
+ end if;
end if;
end Analyze_Subprogram_Renaming;
@@ -9078,7 +9075,7 @@ package body Sem_Ch8 is
then
Error_Msg_Node_1 := Entity (N);
Error_Msg_NE
- ("use clause for package &? has no effect",
+ ("use clause for package & has no effect?u?",
Curr, Entity (N));
end if;
@@ -9087,7 +9084,7 @@ package body Sem_Ch8 is
else
Error_Msg_Node_1 := Etype (N);
Error_Msg_NE
- ("use clause for }? has no effect", Curr, Etype (N));
+ ("use clause for } has no effect?u?", Curr, Etype (N));
end if;
end if;
@@ -9111,10 +9108,10 @@ package body Sem_Ch8 is
-- Deal with use clauses within the context area if the current
-- scope is a compilation unit.
- if Is_Compilation_Unit (Current_Scope) then
-
- pragma Assert (Scope_Stack.Last /= Scope_Stack.First);
-
+ if Is_Compilation_Unit (Current_Scope)
+ and then Sloc (Scope_Stack.Table
+ (Scope_Stack.Last - 1).Entity) = Standard_Location
+ then
Update_Chain_In_Scope (Scope_Stack.Last - 1);
end if;
end Update_Use_Clause_Chain;
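The sem_ch8.adb message changes above attach the ?u? warning tag, so the "use clause ... has no effect" diagnostics are governed by the -gnatwu/-gnatwU warning switches. A small hedged example of code that would be flagged (names invented for illustration):

with Ada.Text_IO; use Ada.Text_IO;

procedure Demo_Redundant_Use is
   use Ada.Text_IO;
   --  Redundant with the context clause above; reported as having no
   --  effect when -gnatwu warnings are enabled
begin
   Put_Line ("hello");
end Demo_Redundant_Use;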
diff --git a/gcc/ada/sem_dim.adb b/gcc/ada/sem_dim.adb
index 6330703e071..a271ca55960 100644
--- a/gcc/ada/sem_dim.adb
+++ b/gcc/ada/sem_dim.adb
@@ -518,25 +518,17 @@ package body Sem_Dim is
Position : Dimension_Position)
is
begin
- -- Integer case
-
- if Is_Integer_Type (Def_Id) then
-
- -- Dimension value must be an integer literal
-
- if Nkind (Expr) = N_Integer_Literal then
- Dimensions (Position) := +Whole (UI_To_Int (Intval (Expr)));
- else
- Error_Msg_N ("integer literal expected", Expr);
- end if;
+ Dimensions (Position) := Create_Rational_From (Expr, True);
+ Processed (Position) := True;
- -- Float case
+ -- If the dimensioned root type is an integer type, it is not
+ -- particularly useful, and fractional dimensions do not make
+ -- much sense for such types, so previously we used to reject
+ -- dimensions of integer types that were not integer literals.
+ -- However, the manipulation of dimensions does not depend on
+ -- the kind of root type, so we can accept this usage for rare
+ -- cases where dimensions are specified for integer values.
- else
- Dimensions (Position) := Create_Rational_From (Expr, True);
- end if;
-
- Processed (Position) := True;
end Extract_Power;
------------------------
@@ -1585,6 +1577,20 @@ package body Sem_Dim is
then
null;
+ -- Numeric literal case. Issue a warning to indicate the
+ -- literal is treated as if its dimension matches the type
+ -- dimension.
+
+ elsif Nkind_In (Original_Node (L), N_Integer_Literal,
+ N_Real_Literal)
+ then
+ Dim_Warning_For_Numeric_Literal (L, Etype (R));
+
+ elsif Nkind_In (Original_Node (R), N_Integer_Literal,
+ N_Real_Literal)
+ then
+ Dim_Warning_For_Numeric_Literal (R, Etype (L));
+
else
Error_Dim_Msg_For_Binary_Op (N, L, R);
end if;
@@ -2732,6 +2738,24 @@ package body Sem_Dim is
procedure Dim_Warning_For_Numeric_Literal (N : Node_Id; Typ : Entity_Id) is
begin
+ -- Consider the literal zero (integer 0 or real 0.0) to be of any
+ -- dimension.
+
+ case Nkind (Original_Node (N)) is
+ when N_Real_Literal =>
+ if Expr_Value_R (N) = Ureal_0 then
+ return;
+ end if;
+
+ when N_Integer_Literal =>
+ if Expr_Value (N) = Uint_0 then
+ return;
+ end if;
+
+ when others =>
+ null;
+ end case;
+
-- Initialize name buffer
Name_Len := 0;
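A hedged user-level illustration of the sem_dim.adb behavior above, assuming GNAT's standard System.Dim.Mks package with subtype Length and unit constant m: the literal 0.0 is now accepted as being of any dimension, while a nonzero literal mixed into a dimensioned expression draws a warning rather than an error, the literal being treated as if its dimension matched the type's.

with System.Dim.Mks; use System.Dim.Mks;

procedure Demo_Dim_Literals is
   Distance : Length := 2.0 * m;
begin
   Distance := Distance + 0.0;
   --  Zero literal: treated as being of any dimension, no message

   Distance := Distance + 1.0;
   --  Nonzero literal: warning that the literal is treated as if its
   --  dimension matched that of Length
end Demo_Dim_Literals;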
diff --git a/gcc/ada/sem_elab.adb b/gcc/ada/sem_elab.adb
index 5ba6938cf97..8dec4280eb3 100644
--- a/gcc/ada/sem_elab.adb
+++ b/gcc/ada/sem_elab.adb
@@ -27,6 +27,7 @@ with Atree; use Atree;
with Debug; use Debug;
with Einfo; use Einfo;
with Errout; use Errout;
+with Exp_Ch11; use Exp_Ch11;
with Exp_Tss; use Exp_Tss;
with Exp_Util; use Exp_Util;
with Lib; use Lib;
@@ -159,7 +160,7 @@ package body Sem_Elab is
--
-- - Instantiations
--
- -- - References to variables
+ -- - Reads of variables
--
-- - Task activation
--
@@ -175,7 +176,7 @@ package body Sem_Elab is
--
-- - For instantiations, the target is the generic template
--
- -- - For references to variables, the target is the variable
+ -- - For reads of variables, the target is the variable
--
-- - For task activation, the target is the task body
--
@@ -292,7 +293,7 @@ package body Sem_Elab is
-- | | |
-- | +--> Process_Variable_Assignment |
-- | | |
- -- | +--> Process_Variable_Reference |
+ -- | +--> Process_Variable_Read |
-- | |
-- +------------------------- Processing phase -------------------------+
@@ -348,7 +349,7 @@ package body Sem_Elab is
-- ABE mechanism effectively ignores all calls which cause the
-- elaboration flow to "leave" the instance.
--
- -- -gnatd.o conservarive elaboration order for indirect calls
+ -- -gnatd.o conservative elaboration order for indirect calls
--
-- The ABE mechanism treats '[Unrestricted_]Access of an entry,
-- operator, or subprogram as an immediate invocation of the
@@ -361,6 +362,13 @@ package body Sem_Elab is
-- entries, operators, and subprograms. As a result, the scenarios
-- are not recorded or processed.
--
+ -- -gnatd.v enforce SPARK elaboration rules in SPARK code
+ --
+ -- The ABE mechanism applies some of the SPARK elaboration rules
+ -- defined in the SPARK reference manual, chapter 7.7. Note that
+ -- certain rules are always enforced, regardless of whether the
+ -- switch is active.
+ --
-- -gnatd.y disable implicit pragma Elaborate_All on task bodies
--
-- The ABE mechanism does not generate implicit Elaborate_All when
@@ -776,14 +784,6 @@ package body Sem_Elab is
-- message, otherwise it emits an error. If flag In_SPARK is set, then
-- string " in SPARK" is added to the end of the message.
- procedure Ensure_Dynamic_Prior_Elaboration
- (N : Node_Id;
- Unit_Id : Entity_Id;
- Prag_Nam : Name_Id);
- -- Guarantee the elaboration of unit Unit_Id with respect to the main unit
- -- by suggesting the use of Elaborate[_All] with name Prag_Nam. N denotes
- -- the related scenario.
-
procedure Ensure_Prior_Elaboration
(N : Node_Id;
Unit_Id : Entity_Id;
@@ -792,7 +792,15 @@ package body Sem_Elab is
-- N denotes the related scenario. Flag In_Task_Body should be set when the
-- need for elaboration is initiated from a task body.
- procedure Ensure_Static_Prior_Elaboration
+ procedure Ensure_Prior_Elaboration_Dynamic
+ (N : Node_Id;
+ Unit_Id : Entity_Id;
+ Prag_Nam : Name_Id);
+ -- Guarantee the elaboration of unit Unit_Id with respect to the main unit
+ -- by suggesting the use of Elaborate[_All] with name Prag_Nam. N denotes
+ -- the related scenario.
+
+ procedure Ensure_Prior_Elaboration_Static
(N : Node_Id;
Unit_Id : Entity_Id;
Prag_Nam : Name_Id);
@@ -808,6 +816,7 @@ package body Sem_Elab is
(Call : Node_Id;
Target_Id : out Entity_Id;
Attrs : out Call_Attributes);
+ pragma Inline (Extract_Call_Attributes);
-- Obtain attributes Attrs associated with call Call. Target_Id is the
-- entity of the call target.
@@ -828,6 +837,7 @@ package body Sem_Elab is
Inst_Id : out Entity_Id;
Gen_Id : out Entity_Id;
Attrs : out Instantiation_Attributes);
+ pragma Inline (Extract_Instantiation_Attributes);
-- Obtain attributes Attrs associated with expanded instantiation Exp_Inst.
-- Inst is the instantiation. Inst_Id is the entity of the instance. Gen_Id
-- is the entity of the generic unit being instantiated.
@@ -841,13 +851,15 @@ package body Sem_Elab is
procedure Extract_Task_Attributes
(Typ : Entity_Id;
Attrs : out Task_Attributes);
+ pragma Inline (Extract_Task_Attributes);
-- Obtain attributes Attrs associated with task type Typ
procedure Extract_Variable_Reference_Attributes
(Ref : Node_Id;
Var_Id : out Entity_Id;
Attrs : out Variable_Attributes);
- -- Obtain attributes Attrs associated with reference Ref which mentions
+ pragma Inline (Extract_Variable_Reference_Attributes);
+ -- Obtain attributes Attrs associated with reference Ref that mentions
-- variable Var_Id.
function Find_Code_Unit (N : Node_Or_Entity_Id) return Entity_Id;
@@ -872,6 +884,10 @@ package body Sem_Elab is
-- is obtained by logically unwinding instantiations and subunits when N
-- resides within one.
+ function Find_Unit_Entity (N : Node_Id) return Entity_Id;
+ pragma Inline (Find_Unit_Entity);
+ -- Return the entity of unit N
+
function First_Formal_Type (Subp_Id : Entity_Id) return Entity_Id;
pragma Inline (First_Formal_Type);
-- Return the type of subprogram Subp_Id's first formal parameter. If the
@@ -908,6 +924,7 @@ package body Sem_Elab is
function In_External_Instance
(N : Node_Id;
Target_Decl : Node_Id) return Boolean;
+ pragma Inline (In_External_Instance);
-- Determine whether a target described by its declaration Target_Decl
-- resides in a package instance which is external to scenario N.
@@ -931,28 +948,30 @@ package body Sem_Elab is
In_SPARK : Boolean);
-- Output information concerning call Call which invokes target Target_Id.
-- If flag Info_Msg is set, the routine emits an information message,
- -- otherwise it emits an error. If flag In_SPARK is set, then string " in
- -- SPARK" is added to the end of the message.
+ -- otherwise it emits an error. If flag In_SPARK is set, then the string
+ -- " in SPARK" is added to the end of the message.
procedure Info_Instantiation
(Inst : Node_Id;
Gen_Id : Entity_Id;
Info_Msg : Boolean;
In_SPARK : Boolean);
+ pragma Inline (Info_Instantiation);
-- Output information concerning instantiation Inst which instantiates
-- generic unit Gen_Id. If flag Info_Msg is set, the routine emits an
-- information message, otherwise it emits an error. If flag In_SPARK
-- is set, then string " in SPARK" is added to the end of the message.
- procedure Info_Variable_Reference
+ procedure Info_Variable_Read
(Ref : Node_Id;
Var_Id : Entity_Id;
Info_Msg : Boolean;
In_SPARK : Boolean);
- -- Output information concerning reference Ref which mentions variable
- -- Var_Id. If flag Info_Msg is set, the routine emits an information
- -- message, otherwise it emits an error. If flag In_SPARK is set, then
- -- string " in SPARK" is added to the end of the message.
+ pragma Inline (Info_Variable_Read);
+ -- Output information concerning reference Ref which reads variable Var_Id.
+ -- If flag Info_Msg is set, the routine emits an information message,
+ -- otherwise it emits an error. If flag In_SPARK is set, then string " in
+ -- SPARK" is added to the end of the message.
function Insertion_Node (N : Node_Id; Ins_Nod : Node_Id) return Node_Id;
pragma Inline (Insertion_Node);
@@ -1026,6 +1045,7 @@ package body Sem_Elab is
(N : Node_Id;
Target_Decl : Node_Id;
Target_Body : Node_Id) return Boolean;
+ pragma Inline (Is_Guaranteed_ABE);
-- Determine whether scenario N with a target described by its initial
-- declaration Target_Decl and body Target_Body results in a guaranteed
-- ABE.
@@ -1035,6 +1055,10 @@ package body Sem_Elab is
-- Determine whether arbitrary entity Id denotes internally generated
-- routine Initial_Condition.
+ function Is_Initialized (Obj_Decl : Node_Id) return Boolean;
+ pragma Inline (Is_Initialized);
+ -- Determine whether object declaration Obj_Decl is initialized
+
function Is_Invariant_Proc (Id : Entity_Id) return Boolean;
pragma Inline (Is_Invariant_Proc);
-- Determine whether arbitrary entity Id denotes an invariant procedure
@@ -1139,10 +1163,10 @@ package body Sem_Elab is
-- Determine whether arbitrary node N denotes a suitable assignment for ABE
-- processing.
- function Is_Suitable_Variable_Reference (N : Node_Id) return Boolean;
- pragma Inline (Is_Suitable_Variable_Reference);
- -- Determine whether arbitrary node N is a suitable reference to a variable
- -- for ABE processing.
+ function Is_Suitable_Variable_Read (N : Node_Id) return Boolean;
+ pragma Inline (Is_Suitable_Variable_Read);
+ -- Determine whether arbitrary node N is a suitable variable read for ABE
+ -- processing.
function Is_Task_Entry (Id : Entity_Id) return Boolean;
pragma Inline (Is_Task_Entry);
@@ -1234,7 +1258,7 @@ package body Sem_Elab is
Call_Attrs : Call_Attributes;
Target_Id : Entity_Id;
In_Task_Body : Boolean);
- -- Top level dispatcher for processing of calls. Perform ABE checks and
+ -- Top-level dispatcher for processing of calls. Perform ABE checks and
-- diagnostics for call Call which invokes target Target_Id. Call_Attrs
-- are the attributes of the call. Flag In_Task_Body should be set when
-- the processing is initiated from a task body.
@@ -1334,10 +1358,24 @@ package body Sem_Elab is
-- should be set when the processing is initiated from a task body.
procedure Process_Variable_Assignment (Asmt : Node_Id);
- -- Perform ABE checks and diagnostics for assignment statement Asmt
-
- procedure Process_Variable_Reference (Ref : Node_Id);
- -- Perform ABE checks and diagnostics for variable reference Ref
+ -- Top-level dispatcher for processing of variable assignments. Perform ABE
+ -- checks and diagnostics for assignment statement Asmt.
+
+ procedure Process_Variable_Assignment_Ada
+ (Asmt : Node_Id;
+ Var_Id : Entity_Id);
+ -- Perform ABE checks and diagnostics for assignment statement Asmt that
+ -- updates the value of variable Var_Id using the Ada rules.
+
+ procedure Process_Variable_Assignment_SPARK
+ (Asmt : Node_Id;
+ Var_Id : Entity_Id);
+ -- Perform ABE checks and diagnostics for assignment statement Asmt that
+ -- updates the value of variable Var_Id using the SPARK rules.
+
+ procedure Process_Variable_Read (Ref : Node_Id);
+ -- Perform ABE checks and diagnostics for reference Ref that reads a
+ -- variable.
procedure Push_Active_Scenario (N : Node_Id);
pragma Inline (Push_Active_Scenario);
@@ -1359,6 +1397,7 @@ package body Sem_Elab is
-- should be set when the traversal is initiated from a task body.
procedure Update_Elaboration_Scenario (New_N : Node_Id; Old_N : Node_Id);
+ pragma Inline (Update_Elaboration_Scenario);
-- Update all relevant internal data structures when scenario Old_N is
-- transformed into scenario New_N by Atree.Rewrite.
@@ -1774,7 +1813,7 @@ package body Sem_Elab is
-- be on another machine.
if Ekind (Body_Id) = E_Package_Body
- and then Ekind (Spec_Id) = E_Package
+ and then Ekind_In (Spec_Id, E_Generic_Package, E_Package)
and then (Is_Remote_Call_Interface (Spec_Id)
or else Is_Remote_Types (Spec_Id))
then
@@ -1870,7 +1909,20 @@ package body Sem_Elab is
Comp_Unit := Parent (Unit_Declaration_Node (Unit_Id));
end if;
- if Nkind (Comp_Unit) = N_Subunit then
+ -- Handle the case where a subprogram instantiation which acts as a
+ -- compilation unit is expanded into an anonymous package that wraps
+ -- the instantiated subprogram.
+
+ if Nkind (Comp_Unit) = N_Package_Specification
+ and then Nkind_In (Original_Node (Parent (Comp_Unit)),
+ N_Function_Instantiation,
+ N_Procedure_Instantiation)
+ then
+ Comp_Unit := Parent (Parent (Comp_Unit));
+
+ -- Handle the case where the compilation unit is a subunit
+
+ elsif Nkind (Comp_Unit) = N_Subunit then
Comp_Unit := Parent (Comp_Unit);
end if;
@@ -1939,97 +1991,6 @@ package body Sem_Elab is
return Elaboration_Context_Index (Key mod Elaboration_Context_Max);
end Elaboration_Context_Hash;
- --------------------------------------
- -- Ensure_Dynamic_Prior_Elaboration --
- --------------------------------------
-
- procedure Ensure_Dynamic_Prior_Elaboration
- (N : Node_Id;
- Unit_Id : Entity_Id;
- Prag_Nam : Name_Id)
- is
- procedure Info_Missing_Pragma;
- pragma Inline (Info_Missing_Pragma);
- -- Output information concerning missing Elaborate or Elaborate_All
- -- pragma with name Prag_Nam for scenario N which ensures the prior
- -- elaboration of Unit_Id.
-
- -------------------------
- -- Info_Missing_Pragma --
- -------------------------
-
- procedure Info_Missing_Pragma is
- begin
- -- Internal units are ignored as they cause unnecessary noise
-
- if not In_Internal_Unit (Unit_Id) then
-
- -- The name of the unit subjected to the elaboration pragma is
- -- fully qualified to improve the clarity of the info message.
-
- Error_Msg_Name_1 := Prag_Nam;
- Error_Msg_Qual_Level := Nat'Last;
-
- Error_Msg_NE ("info: missing pragma % for unit &", N, Unit_Id);
- Error_Msg_Qual_Level := 0;
- end if;
- end Info_Missing_Pragma;
-
- -- Local variables
-
- Elab_Attrs : Elaboration_Attributes;
- Level : Enclosing_Level_Kind;
-
- -- Start of processing for Ensure_Dynamic_Prior_Elaboration
-
- begin
- Elab_Attrs := Elaboration_Context.Get (Unit_Id);
-
- -- Nothing to do when the unit is guaranteed prior elaboration by means
- -- of a source Elaborate[_All] pragma.
-
- if Present (Elab_Attrs.Source_Pragma) then
- return;
- end if;
-
- -- Output extra information on a missing Elaborate[_All] pragma when
- -- switch -gnatel (info messages on implicit Elaborate[_All] pragmas
- -- is in effect.
-
- if Elab_Info_Messages then
-
- -- Performance note: parent traversal
-
- Level := Find_Enclosing_Level (N);
-
- -- Declaration level scenario
-
- if (Is_Suitable_Call (N) or else Is_Suitable_Instantiation (N))
- and then Level = Declaration_Level
- then
- null;
-
- -- Library level scenario
-
- elsif Level in Library_Level then
- null;
-
- -- Instantiation library level scenario
-
- elsif Level = Instantiation then
- null;
-
- -- Otherwise the scenario does not appear at the proper level and
- -- cannot possibly act as a top level scenario.
-
- else
- return;
- end if;
-
- Info_Missing_Pragma;
- end if;
- end Ensure_Dynamic_Prior_Elaboration;
-
------------------------------
-- Ensure_Prior_Elaboration --
------------------------------
@@ -2147,7 +2108,7 @@ package body Sem_Elab is
-- effect.
elsif Dynamic_Elaboration_Checks then
- Ensure_Dynamic_Prior_Elaboration
+ Ensure_Prior_Elaboration_Dynamic
(N => N,
Unit_Id => Unit_Id,
Prag_Nam => Prag_Nam);
@@ -2158,18 +2119,109 @@ package body Sem_Elab is
else
pragma Assert (Static_Elaboration_Checks);
- Ensure_Static_Prior_Elaboration
+ Ensure_Prior_Elaboration_Static
(N => N,
Unit_Id => Unit_Id,
Prag_Nam => Prag_Nam);
end if;
end Ensure_Prior_Elaboration;
+ --------------------------------------
+ -- Ensure_Prior_Elaboration_Dynamic --
+ --------------------------------------
+
+ procedure Ensure_Prior_Elaboration_Dynamic
+ (N : Node_Id;
+ Unit_Id : Entity_Id;
+ Prag_Nam : Name_Id)
+ is
+ procedure Info_Missing_Pragma;
+ pragma Inline (Info_Missing_Pragma);
+ -- Output information concerning missing Elaborate or Elaborate_All
+ -- pragma with name Prag_Nam for scenario N, which would ensure the
+ -- prior elaboration of Unit_Id.
+
+ -------------------------
+ -- Info_Missing_Pragma --
+ -------------------------
+
+ procedure Info_Missing_Pragma is
+ begin
+ -- Internal units are ignored as they cause unnecessary noise
+
+ if not In_Internal_Unit (Unit_Id) then
+
+ -- The name of the unit subjected to the elaboration pragma is
+ -- fully qualified to improve the clarity of the info message.
+
+ Error_Msg_Name_1 := Prag_Nam;
+ Error_Msg_Qual_Level := Nat'Last;
+
+ Error_Msg_NE ("info: missing pragma % for unit &", N, Unit_Id);
+ Error_Msg_Qual_Level := 0;
+ end if;
+ end Info_Missing_Pragma;
+
+ -- Local variables
+
+ Elab_Attrs : Elaboration_Attributes;
+ Level : Enclosing_Level_Kind;
+
+ -- Start of processing for Ensure_Prior_Elaboration_Dynamic
+
+ begin
+ Elab_Attrs := Elaboration_Context.Get (Unit_Id);
+
+ -- Nothing to do when the unit is guaranteed prior elaboration by means
+ -- of a source Elaborate[_All] pragma.
+
+ if Present (Elab_Attrs.Source_Pragma) then
+ return;
+ end if;
+
+ -- Output extra information on a missing Elaborate[_All] pragma when
+ -- switch -gnatel (info messages on implicit Elaborate[_All] pragmas)
+ -- is in effect.
+
+ if Elab_Info_Messages then
+
+ -- Performance note: parent traversal
+
+ Level := Find_Enclosing_Level (N);
+
+ -- Declaration-level scenario
+
+ if (Is_Suitable_Call (N) or else Is_Suitable_Instantiation (N))
+ and then Level = Declaration_Level
+ then
+ null;
+
+ -- Library-level scenario
+
+ elsif Level in Library_Level then
+ null;
+
+ -- Instantiation library-level scenario
+
+ elsif Level = Instantiation then
+ null;
+
+ -- Otherwise the scenario does not appear at the proper level and
+ -- cannot possibly act as a top-level scenario.
+
+ else
+ return;
+ end if;
+
+ Info_Missing_Pragma;
+ end if;
+ end Ensure_Prior_Elaboration_Dynamic;
+
-------------------------------------
- -- Ensure_Static_Prior_Elaboration --
+ -- Ensure_Prior_Elaboration_Static --
-------------------------------------
- procedure Ensure_Static_Prior_Elaboration
+ procedure Ensure_Prior_Elaboration_Static
(N : Node_Id;
Unit_Id : Entity_Id;
Prag_Nam : Name_Id)
@@ -2177,8 +2229,9 @@ package body Sem_Elab is
function Find_With_Clause
(Items : List_Id;
Withed_Id : Entity_Id) return Node_Id;
- -- Find a non-limited with clause in the list of context items Items
- -- which withs unit Withed_Id. Return Empty if no such clause is found.
+ pragma Inline (Find_With_Clause);
+ -- Find a nonlimited with clause in the list of context items Items
+ -- that withs unit Withed_Id. Return Empty if no such clause is found.
procedure Info_Implicit_Pragma;
pragma Inline (Info_Implicit_Pragma);
@@ -2253,7 +2306,7 @@ package body Sem_Elab is
Elab_Attrs : Elaboration_Attributes;
Items : List_Id;
- -- Start of processing for Ensure_Static_Prior_Elaboration
+ -- Start of processing for Ensure_Prior_Elaboration_Static
begin
Elab_Attrs := Elaboration_Context.Get (Unit_Id);
@@ -2347,7 +2400,7 @@ package body Sem_Elab is
if Elab_Info_Messages then
Info_Implicit_Pragma;
end if;
- end Ensure_Static_Prior_Elaboration;
+ end Ensure_Prior_Elaboration_Static;
-----------------------------
-- Extract_Assignment_Name --
@@ -2898,10 +2951,8 @@ package body Sem_Elab is
--------------------
function Find_Code_Unit (N : Node_Or_Entity_Id) return Entity_Id is
- N_Unit : constant Node_Id := Unit (Cunit (Get_Code_Unit (N)));
-
begin
- return Defining_Entity (N_Unit, Concurrent_Subunit => True);
+ return Find_Unit_Entity (Unit (Cunit (Get_Code_Unit (N))));
end Find_Code_Unit;
---------------------------
@@ -2921,7 +2972,7 @@ package body Sem_Elab is
Full_Context : Boolean);
-- Add unit Unit_Id to the elaboration context. Prag denotes the pragma
-- which prompted the inclusion of the unit to the elaboration context.
- -- If flag Full_Context is set, examine the non-limited clauses of unit
+ -- If flag Full_Context is set, examine the nonlimited clauses of unit
-- Unit_Id and add each withed unit to the context.
procedure Find_Elaboration_Context (Comp_Unit : Node_Id);
@@ -3018,7 +3069,7 @@ package body Sem_Elab is
if Full_Context then
- -- Process all non-limited with clauses found in the context of
+ -- Process all nonlimited with clauses found in the context of
-- the current unit. Note that limited clauses do not impose an
-- elaboration order.
@@ -3370,12 +3421,47 @@ package body Sem_Elab is
-------------------
function Find_Top_Unit (N : Node_Or_Entity_Id) return Entity_Id is
- N_Unit : constant Node_Id := Unit (Cunit (Get_Top_Level_Code_Unit (N)));
-
begin
- return Defining_Entity (N_Unit, Concurrent_Subunit => True);
+ return Find_Unit_Entity (Unit (Cunit (Get_Top_Level_Code_Unit (N))));
end Find_Top_Unit;
+ ----------------------
+ -- Find_Unit_Entity --
+ ----------------------
+
+ function Find_Unit_Entity (N : Node_Id) return Entity_Id is
+ Context : constant Node_Id := Parent (N);
+ Orig_N : constant Node_Id := Original_Node (N);
+
+ begin
+ -- The unit denotes a package body of an instantiation which acts as
+ -- a compilation unit. The proper entity is that of the package spec.
+
+ if Nkind (N) = N_Package_Body
+ and then Nkind (Orig_N) = N_Package_Instantiation
+ and then Nkind (Context) = N_Compilation_Unit
+ then
+ return Corresponding_Spec (N);
+
+ -- The unit denotes an anonymous package created to wrap a subprogram
+ -- instantiation which acts as a compilation unit. The proper entity is
+ -- that of the "related instance".
+
+ elsif Nkind (N) = N_Package_Declaration
+ and then Nkind_In (Orig_N, N_Function_Instantiation,
+ N_Procedure_Instantiation)
+ and then Nkind (Context) = N_Compilation_Unit
+ then
+ return
+ Related_Instance (Defining_Entity (N, Concurrent_Subunit => True));
+
+ -- Otherwise the proper entity is the defining entity
+
+ else
+ return Defining_Entity (N, Concurrent_Subunit => True);
+ end if;
+ end Find_Unit_Entity;
+
-----------------------
-- First_Formal_Type --
-----------------------
@@ -4140,11 +4226,11 @@ package body Sem_Elab is
In_SPARK => In_SPARK);
end Info_Instantiation;
- -----------------------------
- -- Info_Variable_Reference --
- -----------------------------
+ ------------------------
+ -- Info_Variable_Read --
+ ------------------------
- procedure Info_Variable_Reference
+ procedure Info_Variable_Read
(Ref : Node_Id;
Var_Id : Entity_Id;
Info_Msg : Boolean;
@@ -4152,12 +4238,12 @@ package body Sem_Elab is
is
begin
Elab_Msg_NE
- (Msg => "reference to variable & during elaboration",
+ (Msg => "read of variable & during elaboration",
N => Ref,
Id => Var_Id,
Info_Msg => Info_Msg,
In_SPARK => In_SPARK);
- end Info_Variable_Reference;
+ end Info_Variable_Read;
--------------------
-- Insertion_Node --
@@ -4642,6 +4728,18 @@ package body Sem_Elab is
Ekind (Id) = E_Procedure and then Is_Initial_Condition_Procedure (Id);
end Is_Initial_Condition_Proc;
+ --------------------
+ -- Is_Initialized --
+ --------------------
+
+ function Is_Initialized (Obj_Decl : Node_Id) return Boolean is
+ begin
+ -- To qualify, the object declaration must have an expression
+
+ return
+ Present (Expression (Obj_Decl)) or else Has_Init_Expression (Obj_Decl);
+ end Is_Initialized;
+
-----------------------
-- Is_Invariant_Proc --
-----------------------
@@ -5102,7 +5200,7 @@ package body Sem_Elab is
or else Is_Suitable_Call (N)
or else Is_Suitable_Instantiation (N)
or else Is_Suitable_Variable_Assignment (N)
- or else Is_Suitable_Variable_Reference (N);
+ or else Is_Suitable_Variable_Read (N);
end Is_Suitable_Scenario;
-------------------------------------
@@ -5182,11 +5280,7 @@ package body Sem_Elab is
-- To qualify, the assignment must meet the following prerequisites:
return
-
- -- The variable must be a source entity and susceptible to warnings
-
Comes_From_Source (Var_Id)
- and then not Has_Warnings_Off (Var_Id)
-- The variable must be declared in the spec of compilation unit U
@@ -5196,29 +5290,23 @@ package body Sem_Elab is
and then Find_Enclosing_Level (Var_Decl) = Package_Spec
- -- The variable must lack initialization
-
- and then not Has_Init_Expression (Var_Decl)
- and then No (Expression (Var_Decl))
-
-- The assignment must occur in the body of compilation unit U
and then Nkind (N_Unit) = N_Package_Body
and then Present (Corresponding_Body (Var_Unit))
- and then Corresponding_Body (Var_Unit) = N_Unit_Id
-
- -- The package spec must lack pragma Elaborate_Body
-
- and then not Has_Pragma_Elaborate_Body (Var_Unit_Id);
+ and then Corresponding_Body (Var_Unit) = N_Unit_Id;
end Is_Suitable_Variable_Assignment;
- ------------------------------------
- -- Is_Suitable_Variable_Reference --
- ------------------------------------
+ -------------------------------
+ -- Is_Suitable_Variable_Read --
+ -------------------------------
- function Is_Suitable_Variable_Reference (N : Node_Id) return Boolean is
+ function Is_Suitable_Variable_Read (N : Node_Id) return Boolean is
function In_Pragma (Nod : Node_Id) return Boolean;
- -- Determine whether arbitrary node N appears within a pragma
+ -- Determine whether arbitrary node Nod appears within a pragma
+
+ function Is_Variable_Read (Ref : Node_Id) return Boolean;
+ -- Determine whether variable reference Ref constitutes a read
---------------
-- In_Pragma --
@@ -5245,12 +5333,88 @@ package body Sem_Elab is
return False;
end In_Pragma;
+ ----------------------
+ -- Is_Variable_Read --
+ ----------------------
+
+ function Is_Variable_Read (Ref : Node_Id) return Boolean is
+ function Is_Out_Actual (Call : Node_Id) return Boolean;
+ -- Determine whether the corresponding formal of actual Ref which
+ -- appears in call Call has mode OUT.
+
+ -------------------
+ -- Is_Out_Actual --
+ -------------------
+
+ function Is_Out_Actual (Call : Node_Id) return Boolean is
+ Actual : Node_Id;
+ Call_Attrs : Call_Attributes;
+ Formal : Entity_Id;
+ Target_Id : Entity_Id;
+
+ begin
+ Extract_Call_Attributes
+ (Call => Call,
+ Target_Id => Target_Id,
+ Attrs => Call_Attrs);
+
+ -- Inspect the actual and formal parameters, trying to find the
+ -- corresponding formal for Ref.
+
+ Actual := First_Actual (Call);
+ Formal := First_Formal (Target_Id);
+ while Present (Actual) and then Present (Formal) loop
+ if Actual = Ref then
+ return Ekind (Formal) = E_Out_Parameter;
+ end if;
+
+ Next_Actual (Actual);
+ Next_Formal (Formal);
+ end loop;
+
+ return False;
+ end Is_Out_Actual;
+
+ -- Local variables
+
+ Context : constant Node_Id := Parent (Ref);
+
+ -- Start of processing for Is_Variable_Read
+
+ begin
+ -- The majority of variable references are reads, and they can appear
+ -- in a great number of contexts. To determine whether a reference is
+ -- a read, it is more practical to find out whether it is a write.
+
+ -- A reference is a write when it appears immediately on the left-
+ -- hand side of an assignment.
+
+ if Nkind (Context) = N_Assignment_Statement
+ and then Name (Context) = Ref
+ then
+ return False;
+
+ -- A reference is a write when it acts as an actual in a subprogram
+ -- call and the corresponding formal has mode OUT.
+
+ elsif Nkind_In (Context, N_Function_Call,
+ N_Procedure_Call_Statement)
+ and then Is_Out_Actual (Context)
+ then
+ return False;
+ end if;
+
+ -- Any other reference is a read
+
+ return True;
+ end Is_Variable_Read;
+
-- Local variables
Prag : Node_Id;
Var_Id : Entity_Id;
- -- Start of processing for Is_Suitable_Variable_Reference
+ -- Start of processing for Is_Suitable_Variable_Read
begin
-- This scenario is relevant only when the static model is in effect
@@ -5262,8 +5426,7 @@ package body Sem_Elab is
return False;
-- Attributes and operator symbols are not considered to be suitable
- -- references to variables even though they are part of predicate
- -- Is_Entity_Name.
+ -- references even though they are part of predicate Is_Entity_Name.
elsif not Nkind_In (N, N_Expanded_Name, N_Identifier) then
return False;
@@ -5303,6 +5466,10 @@ package body Sem_Elab is
and then Get_SPARK_Mode_From_Annotation (Prag) = On
and then Is_SPARK_Mode_On_Node (N)
+ -- The reference must denote a variable read
+
+ and then Is_Variable_Read (N)
+
-- The reference must not be considered when it appears in a pragma.
-- If the pragma has run-time semantics, then the reference will be
-- reconsidered once the pragma is expanded.
@@ -5310,7 +5477,7 @@ package body Sem_Elab is
-- Performance note: parent traversal
and then not In_Pragma (N);
- end Is_Suitable_Variable_Reference;
+ end Is_Suitable_Variable_Read;
-------------------
-- Is_Task_Entry --
@@ -5485,8 +5652,8 @@ package body Sem_Elab is
Info_Msg => False,
In_SPARK => True);
- elsif Is_Suitable_Variable_Reference (N) then
- Info_Variable_Reference
+ elsif Is_Suitable_Variable_Read (N) then
+ Info_Variable_Read
(Ref => N,
Var_Id => Target_Id,
Info_Msg => False,
@@ -5650,8 +5817,9 @@ package body Sem_Elab is
procedure Output_Variable_Assignment (N : Node_Id);
-- Emit a specific diagnostic message for assignment statement N
- procedure Output_Variable_Reference (N : Node_Id);
- -- Emit a specific diagnostic message for variable reference N
+ procedure Output_Variable_Read (N : Node_Id);
+ -- Emit a specific diagnostic message for reference N which reads a
+ -- variable.
-------------------
-- Output_Access --
@@ -5980,11 +6148,11 @@ package body Sem_Elab is
Error_Msg_NE ("\\ variable & assigned #", Error_Nod, Var_Id);
end Output_Variable_Assignment;
- -------------------------------
- -- Output_Variable_Reference --
- -------------------------------
+ --------------------------
+ -- Output_Variable_Read --
+ --------------------------
- procedure Output_Variable_Reference (N : Node_Id) is
+ procedure Output_Variable_Read (N : Node_Id) is
Dummy : Variable_Attributes;
Var_Id : Entity_Id;
@@ -5995,8 +6163,8 @@ package body Sem_Elab is
Attrs => Dummy);
Error_Msg_Sloc := Sloc (N);
- Error_Msg_NE ("\\ variable & referenced #", Error_Nod, Var_Id);
- end Output_Variable_Reference;
+ Error_Msg_NE ("\\ variable & read #", Error_Nod, Var_Id);
+ end Output_Variable_Read;
-- Local variables
@@ -6057,10 +6225,10 @@ package body Sem_Elab is
elsif Nkind (N) = N_Assignment_Statement then
Output_Variable_Assignment (N);
- -- Variable references
+ -- Variable read
- elsif Is_Suitable_Variable_Reference (N) then
- Output_Variable_Reference (N);
+ elsif Is_Suitable_Variable_Read (N) then
+ Output_Variable_Read (N);
else
pragma Assert (False);
@@ -6166,7 +6334,7 @@ package body Sem_Elab is
end if;
-- Treat the attribute as an immediate invocation of the target when
- -- switch -gnatd.o (conservarive elaboration order for indirect calls)
+ -- switch -gnatd.o (conservative elaboration order for indirect calls)
-- is in effect. Note that the prior elaboration of the unit containing
-- the target is ensured processing the corresponding call marker.
@@ -6781,16 +6949,18 @@ package body Sem_Elab is
elsif Is_Up_Level_Target (Target_Attrs.Spec_Decl) then
return;
- -- The SPARK rules are in effect
+ -- The SPARK rules are verified only when -gnatd.v (enforce SPARK
+ -- elaboration rules in SPARK code) is in effect.
- elsif SPARK_Rules_On then
+ elsif SPARK_Rules_On and Debug_Flag_Dot_V then
Process_Call_SPARK
(Call => Call,
Call_Attrs => Call_Attrs,
Target_Id => Target_Id,
Target_Attrs => Target_Attrs);
- -- Otherwise the Ada rules are in effect
+ -- Otherwise the Ada rules are in effect, or SPARK code is allowed to
+ -- violate the SPARK rules.
else
Process_Call_Ada
@@ -7349,9 +7519,10 @@ package body Sem_Elab is
elsif Is_Up_Level_Target (Gen_Attrs.Spec_Decl) then
return;
- -- The SPARK rules are in effect
+ -- The SPARK rules are verified only when -gnatd.v (enforce SPARK
+ -- elaboration rules in SPARK code) is in effect.
- elsif SPARK_Rules_On then
+ elsif SPARK_Rules_On and Debug_Flag_Dot_V then
Process_Instantiation_SPARK
(Exp_Inst => Exp_Inst,
Inst => Inst,
@@ -7359,7 +7530,8 @@ package body Sem_Elab is
Gen_Id => Gen_Id,
Gen_Attrs => Gen_Attrs);
- -- Otherwise the Ada rules are in effect
+ -- Otherwise the Ada rules are in effect, or SPARK code is allowed to
+ -- violate the SPARK rules.
else
Process_Instantiation_Ada
@@ -7675,9 +7847,9 @@ package body Sem_Elab is
-- ABE ramifications of the instantiation.
if Nkind (Inst) = N_Package_Instantiation then
- Req_Nam := Name_Elaborate;
- else
Req_Nam := Name_Elaborate_All;
+ else
+ Req_Nam := Name_Elaborate;
end if;
Meet_Elaboration_Requirement
@@ -7732,31 +7904,76 @@ package body Sem_Elab is
---------------------------------
procedure Process_Variable_Assignment (Asmt : Node_Id) is
- Var_Id : constant Entity_Id := Entity (Extract_Assignment_Name (Asmt));
- Spec_Id : Entity_Id;
+ Var_Id : constant Entity_Id := Entity (Extract_Assignment_Name (Asmt));
+ Prag : constant Node_Id := SPARK_Pragma (Var_Id);
+
+ SPARK_Rules_On : Boolean;
+ -- This flag is set when the SPARK rules are in effect
begin
+ -- The SPARK rules are in effect when both the assignment and the
+ -- variable are subject to SPARK_Mode On.
+
+ SPARK_Rules_On :=
+ Present (Prag)
+ and then Get_SPARK_Mode_From_Annotation (Prag) = On
+ and then Is_SPARK_Mode_On_Node (Asmt);
+
-- Output relevant information when switch -gnatel (info messages on
-- implicit Elaborate[_All] pragmas) is in effect.
if Elab_Info_Messages then
- Error_Msg_NE
- ("info: assignment to & during elaboration", Asmt, Var_Id);
+ Elab_Msg_NE
+ (Msg => "assignment to & during elaboration",
+ N => Asmt,
+ Id => Var_Id,
+ Info_Msg => True,
+ In_SPARK => SPARK_Rules_On);
end if;
- Spec_Id := Find_Top_Unit (Var_Id);
+ -- The SPARK rules are in effect. These rules are applied regardless of
+ -- whether -gnatd.v (enforce SPARK elaboration rules in SPARK code) is
+ -- in effect because the static model cannot ensure safe assignment of
+ -- variables.
- -- Generate an implicit Elaborate_Body in the spec
+ if SPARK_Rules_On then
+ Process_Variable_Assignment_SPARK
+ (Asmt => Asmt,
+ Var_Id => Var_Id);
- Set_Elaborate_Body_Desirable (Spec_Id);
+ -- Otherwise the Ada rules are in effect
- -- No warning is emitted for internal uses. This behaviour parallels
- -- that of the old ABE mechanism.
+ else
+ Process_Variable_Assignment_Ada
+ (Asmt => Asmt,
+ Var_Id => Var_Id);
+ end if;
+ end Process_Variable_Assignment;
- if GNAT_Mode then
- null;
+ -------------------------------------
+ -- Process_Variable_Assignment_Ada --
+ -------------------------------------
+
+ procedure Process_Variable_Assignment_Ada
+ (Asmt : Node_Id;
+ Var_Id : Entity_Id)
+ is
+ Var_Decl : constant Node_Id := Declaration_Node (Var_Id);
+ Spec_Id : constant Entity_Id := Find_Top_Unit (Var_Decl);
+
+ begin
+ -- Emit a warning when an uninitialized variable declared in a package
+ -- spec without a pragma Elaborate_Body is initialized by elaboration
+ -- code within the corresponding body.
+
+ if not Warnings_Off (Var_Id)
+ and then not Is_Initialized (Var_Decl)
+ and then not Has_Pragma_Elaborate_Body (Spec_Id)
+ then
+ -- Generate an implicit Elaborate_Body in the spec
+
+ Set_Elaborate_Body_Desirable (Spec_Id);
- else
Error_Msg_NE
("??variable & can be accessed by clients before this "
& "initialization", Asmt, Var_Id);
@@ -7767,13 +7984,44 @@ package body Sem_Elab is
Output_Active_Scenarios (Asmt);
end if;
- end Process_Variable_Assignment;
+ end Process_Variable_Assignment_Ada;
- --------------------------------
- -- Process_Variable_Reference --
- --------------------------------
+ ---------------------------------------
+ -- Process_Variable_Assignment_SPARK --
+ ---------------------------------------
+
+ procedure Process_Variable_Assignment_SPARK
+ (Asmt : Node_Id;
+ Var_Id : Entity_Id)
+ is
+ Var_Decl : constant Node_Id := Declaration_Node (Var_Id);
+ Spec_Id : constant Entity_Id := Find_Top_Unit (Var_Decl);
+
+ begin
+ -- Emit an error when an initialized variable declared in a package spec
+ -- without pragma Elaborate_Body is further modified by elaboration code
+ -- within the corresponding body.
+
+ if Is_Initialized (Var_Decl)
+ and then not Has_Pragma_Elaborate_Body (Spec_Id)
+ then
+ Error_Msg_NE
+ ("variable & modified by elaboration code in package body",
+ Asmt, Var_Id);
- procedure Process_Variable_Reference (Ref : Node_Id) is
+ Error_Msg_NE
+ ("\add pragma ""Elaborate_Body"" to spec & to ensure full "
+ & "initialization", Asmt, Spec_Id);
+
+ Output_Active_Scenarios (Asmt);
+ end if;
+ end Process_Variable_Assignment_SPARK;
+
+ ---------------------------
+ -- Process_Variable_Read --
+ ---------------------------
+
+ procedure Process_Variable_Read (Ref : Node_Id) is
Var_Attrs : Variable_Attributes;
Var_Id : Entity_Id;
@@ -7788,22 +8036,42 @@ package body Sem_Elab is
if Elab_Info_Messages then
Elab_Msg_NE
- (Msg => "reference to variable & during elaboration",
+ (Msg => "read of variable & during elaboration",
N => Ref,
Id => Var_Id,
Info_Msg => True,
In_SPARK => True);
end if;
- -- A source variable reference imposes an Elaborate_All requirement on
- -- the context of the main unit. Determine whethe the context has a
- -- pragma strong enough to meet the requirement.
+ -- Nothing to do when the variable appears within the main unit because
+ -- diagnostics on reads are relevant only for external variables.
- Meet_Elaboration_Requirement
- (N => Ref,
- Target_Id => Var_Id,
- Req_Nam => Name_Elaborate_All);
- end Process_Variable_Reference;
+ if Is_Same_Unit (Var_Attrs.Unit_Id, Cunit_Entity (Main_Unit)) then
+ null;
+
+ -- Nothing to do when the variable is already initialized. Note that the
+ -- variable may be further modified by the external unit.
+
+ elsif Is_Initialized (Declaration_Node (Var_Id)) then
+ null;
+
+ -- Nothing to do when the external unit guarantees the initialization of
+ -- the variable by means of pragma Elaborate_Body.
+
+ elsif Has_Pragma_Elaborate_Body (Var_Attrs.Unit_Id) then
+ null;
+
+ -- A variable read imposes an Elaborate requirement on the context of
+ -- the main unit. Determine whether the context has a pragma strong
+ -- enough to meet the requirement.
+
+ else
+ Meet_Elaboration_Requirement
+ (N => Ref,
+ Target_Id => Var_Id,
+ Req_Nam => Name_Elaborate);
+ end if;
+ end Process_Variable_Read;
--------------------------
-- Push_Active_Scenario --
@@ -7874,10 +8142,10 @@ package body Sem_Elab is
elsif Is_Suitable_Variable_Assignment (N) then
Process_Variable_Assignment (N);
- -- Variable references
+ -- Variable read
- elsif Is_Suitable_Variable_Reference (N) then
- Process_Variable_Reference (N);
+ elsif Is_Suitable_Variable_Read (N) then
+ Process_Variable_Read (N);
end if;
-- Remove the current scenario from the stack of active scenarios once
@@ -7938,20 +8206,40 @@ package body Sem_Elab is
-- listed below are not considered. The categories are:
-- 'Access for entries, operators, and subprograms
+ -- Assignments to variables
-- Calls (includes task activation)
-- Instantiations
- -- Variable assignments
- -- Variable references
+ -- Reads of variables
- elsif Is_Suitable_Access (N)
- or else Is_Suitable_Variable_Assignment (N)
- or else Is_Suitable_Variable_Reference (N)
- then
- null;
+ elsif Is_Suitable_Access (N) then
+
+ -- Signal any enclosing local exception handlers that the 'Access may
+ -- raise Program_Error due to a failed ABE check when switch -gnatd.o
+ -- (conservative elaboration order for indirect calls) is in effect.
+ -- Marking the exception handlers ensures proper expansion by both
+ -- the front and back end when restriction No_Exception_Propagation
+ -- is in effect.
+
+ if Debug_Flag_Dot_O then
+ Possible_Local_Raise (N, Standard_Program_Error);
+ end if;
elsif Is_Suitable_Call (N) or else Is_Suitable_Instantiation (N) then
Declaration_Level_OK := True;
+ -- Signal any enclosing local exception handlers that the call or
+ -- instantiation may raise Program_Error due to a failed ABE check.
+ -- Marking the exception handlers ensures proper expansion by both
+ -- the front and back end when restriction No_Exception_Propagation
+ -- is in effect.
+
+ Possible_Local_Raise (N, Standard_Program_Error);
+
+ elsif Is_Suitable_Variable_Assignment (N)
+ or else Is_Suitable_Variable_Read (N)
+ then
+ null;
+
-- Otherwise the input does not denote a suitable scenario
else
@@ -8004,7 +8292,7 @@ package body Sem_Elab is
-- Mark a scenario which may produce run-time conditional ABE checks or
-- guaranteed ABE failures as recorded. The flag ensures that scenario
- -- rewritting performed by Atree.Rewrite will be properly reflected in
+ -- rewriting performed by Atree.Rewrite will be properly reflected in
-- all relevant internal data structures.
if Is_Check_Emitting_Scenario (N) then
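The Process_Variable_Read machinery above concerns reads of variables declared in other units during elaboration. Below is a hedged two-unit sketch of the scenario it diagnoses (unit names invented for illustration): under the SPARK elaboration rules, the read of Data.Count during the elaboration of Client calls for pragma Elaborate (Data), unless Data initializes the variable or carries pragma Elaborate_Body.

package Data is
   Count : Integer;
   --  Uninitialized, and the package has no pragma Elaborate_Body
end Data;

with Data;
--  pragma Elaborate (Data);  --  what the new check asks for

package Client
  with SPARK_Mode => On
is
   Copy : constant Integer := Data.Count;
   --  Read of an external variable during elaboration of Client
end Client;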
diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index 0456101092a..eae149805fa 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -2818,10 +2818,16 @@ package body Sem_Prag is
E_Constant,
E_Variable)
then
+ -- When the initialization item is undefined, it appears as
+ -- Any_Id. Do not continue with the analysis of the item.
+
+ if Item_Id = Any_Id then
+ null;
+
-- The state or variable must be declared in the visible
-- declarations of the package (SPARK RM 7.1.5(7)).
- if not Contains (States_And_Objs, Item_Id) then
+ elsif not Contains (States_And_Objs, Item_Id) then
Error_Msg_Name_1 := Chars (Pack_Id);
SPARK_Msg_NE
("initialization item & must appear in the visible "
@@ -13236,23 +13242,21 @@ package body Sem_Prag is
Set_SCO_Pragma_Enabled (Loc);
end if;
- -- Deal with analyzing the string argument
+ -- Deal with analyzing the string argument. If checks are not
+ -- on we don't want any expansion (since such expansion would
+ -- not get properly deleted) but we do want to analyze (to get
+ -- proper references). The Preanalyze_And_Resolve routine does
+ -- just what we want. Ditto if pragma is active, because it will
+ -- be rewritten as an if-statement whose analysis will complete
+ -- analysis and expansion of the string message. This makes a
+ -- difference in the unusual case where the expression for the
+ -- string may have a side effect, such as raising an exception.
+ -- This is mandated by RM 11.4.2, which specifies that the string
+ -- expression is only evaluated if the check fails and
+ -- Assertion_Error is to be raised.
if Arg_Count = 3 then
-
- -- If checks are not on we don't want any expansion (since
- -- such expansion would not get properly deleted) but
- -- we do want to analyze (to get proper references).
- -- The Preanalyze_And_Resolve routine does just what we want
-
- if Is_Ignored (N) then
- Preanalyze_And_Resolve (Str, Standard_String);
-
- -- Otherwise we need a proper analysis and expansion
-
- else
- Analyze_And_Resolve (Str, Standard_String);
- end if;
+ Preanalyze_And_Resolve (Str, Standard_String);
end if;
-- Now you might think we could just do the same with the Boolean
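The sem_prag.adb change above relies on RM 11.4.2: the message expression of an assertion is evaluated only when the check fails, so the string argument must only be preanalyzed rather than expanded eagerly. A small hedged illustration (Boom is an invented name):

procedure Demo_Assert_Message is
   function Boom return String is
   begin
      raise Program_Error with "message expression evaluated";
   end Boom;
begin
   pragma Assert (True, Boom);
   --  The condition holds, so per RM 11.4.2 Boom is never called and no
   --  Program_Error is raised here.
end Demo_Assert_Message;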
diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
index 68c1a0892a6..f5c5f9e96dc 100644
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -4843,9 +4843,8 @@ package body Sem_Res is
(Comes_From_Source (Parent (N))
or else
(Ekind (Current_Scope) = E_Function
- and then Nkind
- (Original_Node (Unit_Declaration_Node (Current_Scope)))
- = N_Expression_Function))
+ and then Nkind (Original_Node (Unit_Declaration_Node
+ (Current_Scope))) = N_Expression_Function))
and then not In_Instance_Body
then
if not OK_For_Limited_Init (Etype (E), Expression (E)) then
diff --git a/gcc/ada/sem_type.adb b/gcc/ada/sem_type.adb
index 05315852511..e2b3afdf898 100644
--- a/gcc/ada/sem_type.adb
+++ b/gcc/ada/sem_type.adb
@@ -2838,11 +2838,9 @@ package body Sem_Type is
return False;
elsif Nkind (Par) in N_Declaration then
- if Nkind (Par) = N_Object_Declaration then
- return Present (Corresponding_Generic_Association (Par));
- else
- return False;
- end if;
+ return
+ Nkind (Par) = N_Object_Declaration
+ and then Present (Corresponding_Generic_Association (Par));
elsif Nkind (Par) = N_Object_Renaming_Declaration then
return Present (Corresponding_Generic_Association (Par));
diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index f003ef5a8ac..3698bbf16bd 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -3354,10 +3354,13 @@ package body Sem_Util is
and then not Comes_From_Source (Par)
then
-- Continue to examine the context if the reference appears in a
- -- subprogram body which was previously an expression function.
+ -- subprogram body which was previously an expression function,
+ -- unless this is during preanalysis (when In_Spec_Expression is
+ -- True), as the body may not yet be inserted in the tree.
if Nkind (Par) = N_Subprogram_Body
and then Was_Expression_Function (Par)
+ and then not In_Spec_Expression
then
null;
@@ -12545,10 +12548,8 @@ package body Sem_Util is
or else (Present (Renamed_Object (E))
and then Is_Aliased_View (Renamed_Object (E)))))
- or else ((Is_Formal (E)
- or else Ekind_In (E, E_Generic_In_Out_Parameter,
- E_Generic_In_Parameter))
- and then Is_Tagged_Type (Etype (E)))
+ or else ((Is_Formal (E) or else Is_Formal_Object (E))
+ and then Is_Tagged_Type (Etype (E)))
or else (Is_Concurrent_Type (E) and then In_Open_Scopes (E))
@@ -13185,17 +13186,29 @@ package body Sem_Util is
function Is_Controlling_Limited_Procedure
(Proc_Nam : Entity_Id) return Boolean
is
+ Param : Node_Id;
Param_Typ : Entity_Id := Empty;
begin
if Ekind (Proc_Nam) = E_Procedure
and then Present (Parameter_Specifications (Parent (Proc_Nam)))
then
- Param_Typ := Etype (Parameter_Type (First (
- Parameter_Specifications (Parent (Proc_Nam)))));
+ Param := Parameter_Type (First (
+ Parameter_Specifications (Parent (Proc_Nam))));
- -- In this case where an Itype was created, the procedure call has been
- -- rewritten.
+ -- The formal may be an anonymous access type.
+
+ if Nkind (Param) = N_Access_Definition then
+ Param_Typ := Entity (Subtype_Mark (Param));
+
+ else
+ Param_Typ := Etype (Param);
+ end if;
+
+ -- In the case where an Itype was created for a dispatching call, the
+ -- procedure call has been rewritten. The actual may be an access to
+ -- interface type in which case it is the designated type that is the
+ -- controlling type.
elsif Present (Associated_Node_For_Itype (Proc_Nam))
and then Present (Original_Node (Associated_Node_For_Itype (Proc_Nam)))
@@ -13206,6 +13219,10 @@ package body Sem_Util is
Param_Typ :=
Etype (First (Parameter_Associations
(Associated_Node_For_Itype (Proc_Nam))));
+
+ if Ekind (Param_Typ) = E_Anonymous_Access_Type then
+ Param_Typ := Directly_Designated_Type (Param_Typ);
+ end if;
end if;
if Present (Param_Typ) then
@@ -13387,7 +13404,7 @@ package body Sem_Util is
end if;
-- A discriminant check on a selected component may be expanded
- -- into a dereference when removing side-effects. Recover the
+ -- into a dereference when removing side effects. Recover the
-- original node and its type, which may be unconstrained.
elsif Nkind (P) = N_Explicit_Dereference
@@ -20584,6 +20601,51 @@ package body Sem_Util is
return False;
end Null_To_Null_Address_Convert_OK;
+ ---------------------------------
+ -- Number_Of_Elements_In_Array --
+ ---------------------------------
+
+ function Number_Of_Elements_In_Array (T : Entity_Id) return Int is
+ Indx : Node_Id;
+ Typ : Entity_Id;
+ Low : Node_Id;
+ High : Node_Id;
+ Num : Int := 1;
+
+ begin
+ pragma Assert (Is_Array_Type (T));
+
+ Indx := First_Index (T);
+ while Present (Indx) loop
+ Typ := Underlying_Type (Etype (Indx));
+
+ -- Never look at junk bounds of a generic type
+
+ if Is_Generic_Type (Typ) then
+ return 0;
+ end if;
+
+ -- Check the array bounds are known at compile time and return zero
+ -- if they are not.
+
+ Low := Type_Low_Bound (Typ);
+ High := Type_High_Bound (Typ);
+
+ if not Compile_Time_Known_Value (Low) then
+ return 0;
+ elsif not Compile_Time_Known_Value (High) then
+ return 0;
+ else
+ Num :=
+ Num * UI_To_Int ((Expr_Value (High) - Expr_Value (Low) + 1));
+ end if;
+
+ Next_Index (Indx);
+ end loop;
+
+ return Num;
+ end Number_Of_Elements_In_Array;
+
-------------------------
-- Object_Access_Level --
-------------------------
@@ -20603,7 +20665,7 @@ package body Sem_Util is
-- This construct appears in the context of dispatching calls.
function Reference_To (Obj : Node_Id) return Node_Id;
- -- An explicit dereference is created when removing side-effects from
+ -- An explicit dereference is created when removing side effects from
-- expressions for constraint checking purposes. In this case a local
-- access type is created for it. The correct access level is that of
-- the original source node. We detect this case by noting that the
diff --git a/gcc/ada/sem_util.ads b/gcc/ada/sem_util.ads
index 2ebd54f3989..c6958cb1aaa 100644
--- a/gcc/ada/sem_util.ads
+++ b/gcc/ada/sem_util.ads
@@ -2275,6 +2275,11 @@ package Sem_Util is
-- 2) N is a comparison operator, one of the operands is null, and the
-- type of the other operand is a descendant of System.Address.
+ function Number_Of_Elements_In_Array (T : Entity_Id) return Int;
+ -- Returns the number of elements in the array T if the index bounds of T
+ -- are known at compile time. If the bounds are not known at compile time,
+ -- the function returns the value zero.
+
function Object_Access_Level (Obj : Node_Id) return Uint;
-- Return the accessibility level of the view of the object Obj. For
-- convenience, qualified expressions applied to object names are also
diff --git a/gcc/ada/sem_warn.adb b/gcc/ada/sem_warn.adb
index 91f430a29f5..0e498d3e6cb 100644
--- a/gcc/ada/sem_warn.adb
+++ b/gcc/ada/sem_warn.adb
@@ -509,7 +509,7 @@ package body Sem_Warn is
end if;
-- If the condition contains a function call, we consider it may
- -- be modified by side-effects from a procedure call. Otherwise,
+ -- be modified by side effects from a procedure call. Otherwise,
-- we consider the condition may not be modified, although that
-- might happen if Variable is itself a by-reference parameter,
-- and the procedure called modifies the global object referred to
diff --git a/gcc/ada/sinfo.adb b/gcc/ada/sinfo.adb
index e4f8608eb73..dc4e8fb2c1a 100644
--- a/gcc/ada/sinfo.adb
+++ b/gcc/ada/sinfo.adb
@@ -203,6 +203,14 @@ package body Sinfo is
return Flag4 (N);
end Aliased_Present;
+ function Alloc_For_BIP_Return
+ (N : Node_Id) return Boolean is
+ begin
+ pragma Assert (False
+ or else NT (N).Nkind = N_Allocator);
+ return Flag1 (N);
+ end Alloc_For_BIP_Return;
+
function All_Others
(N : Node_Id) return Boolean is
begin
@@ -3626,6 +3634,14 @@ package body Sinfo is
Set_Flag4 (N, Val);
end Set_Aliased_Present;
+ procedure Set_Alloc_For_BIP_Return
+ (N : Node_Id; Val : Boolean := True) is
+ begin
+ pragma Assert (False
+ or else NT (N).Nkind = N_Allocator);
+ Set_Flag1 (N, Val);
+ end Set_Alloc_For_BIP_Return;
+
procedure Set_All_Others
(N : Node_Id; Val : Boolean := True) is
begin
diff --git a/gcc/ada/sinfo.ads b/gcc/ada/sinfo.ads
index 247d127982d..cf220e4e563 100644
--- a/gcc/ada/sinfo.ads
+++ b/gcc/ada/sinfo.ads
@@ -770,7 +770,7 @@ package Sinfo is
-- The following flag fields appear in all nodes:
-- Analyzed
- -- This flag is used to indicate that a node (and all its children have
+ -- This flag is used to indicate that a node (and all its children) have
-- been analyzed. It is used to avoid reanalysis of a node that has
-- already been analyzed, both for efficiency and functional correctness
-- reasons.
@@ -903,6 +903,10 @@ package Sinfo is
-- known at compile time, this field points to an N_Range node with those
-- bounds. Otherwise Empty.
+ -- Alloc_For_BIP_Return (Flag1-Sem)
+ -- Present in N_Allocator nodes. True if the allocator is one of those
+ -- generated for a build-in-place return statement.
+
-- All_Others (Flag11-Sem)
-- Present in an N_Others_Choice node. This flag is set for an others
-- exception where all exceptions are to be caught, even those that are
@@ -1472,10 +1476,7 @@ package Sinfo is
-- Generic_Parent (Node5-Sem)
-- Generic_Parent is defined on declaration nodes that are instances. The
-- value of Generic_Parent is the generic entity from which the instance
- -- is obtained. Generic_Parent is also defined for the renaming
- -- declarations and object declarations created for the actuals in an
- -- instantiation. The generic parent of such a declaration is the
- -- corresponding generic association in the Instantiation node.
+ -- is obtained.
-- Generic_Parent_Type (Node4-Sem)
-- Generic_Parent_Type is defined on Subtype_Declaration nodes for the
@@ -4776,6 +4777,7 @@ package Sinfo is
-- Subpool_Handle_Name (Node4) (set to Empty if not present)
-- Storage_Pool (Node1-Sem)
-- Procedure_To_Call (Node2-Sem)
+ -- Alloc_For_BIP_Return (Flag1-Sem)
-- Null_Exclusion_Present (Flag11)
-- No_Initialization (Flag13-Sem)
-- Is_Static_Coextension (Flag14-Sem)
@@ -7840,7 +7842,7 @@ package Sinfo is
-- The required semantics is that the set of actions is executed in
-- the order in which it appears, as though they appeared by themselves
- -- in the enclosing list of declarations of statements. Unlike what
+ -- in the enclosing list of declarations or statements. Unlike what
-- happens when using an N_Block_Statement, no new scope is introduced.
-- Note: for the time being, this is used only as a transient
@@ -9128,6 +9130,9 @@ package Sinfo is
function Aliased_Present
(N : Node_Id) return Boolean; -- Flag4
+ function Alloc_For_BIP_Return
+ (N : Node_Id) return Boolean; -- Flag1
+
function All_Others
(N : Node_Id) return Boolean; -- Flag11
@@ -10217,6 +10222,9 @@ package Sinfo is
procedure Set_Aliased_Present
(N : Node_Id; Val : Boolean := True); -- Flag4
+ procedure Set_Alloc_For_BIP_Return
+ (N : Node_Id; Val : Boolean := True); -- Flag1
+
procedure Set_All_Others
(N : Node_Id; Val : Boolean := True); -- Flag11
@@ -13066,6 +13074,7 @@ package Sinfo is
pragma Inline (Address_Warning_Posted);
pragma Inline (Aggregate_Bounds);
pragma Inline (Aliased_Present);
+ pragma Inline (Alloc_For_BIP_Return);
pragma Inline (All_Others);
pragma Inline (All_Present);
pragma Inline (Alternatives);
@@ -13426,6 +13435,7 @@ package Sinfo is
pragma Inline (Set_Address_Warning_Posted);
pragma Inline (Set_Aggregate_Bounds);
pragma Inline (Set_Aliased_Present);
+ pragma Inline (Set_Alloc_For_BIP_Return);
pragma Inline (Set_All_Others);
pragma Inline (Set_All_Present);
pragma Inline (Set_Alternatives);
diff --git a/gcc/ada/sinput.ads b/gcc/ada/sinput.ads
index bde59b131dd..ecbe83cdd88 100644
--- a/gcc/ada/sinput.ads
+++ b/gcc/ada/sinput.ads
@@ -755,6 +755,8 @@ private
pragma Inline (Num_Source_Files);
pragma Inline (Num_Source_Lines);
+ pragma Inline (Line_Start);
+
No_Instance_Id : constant Instance_Id := 0;
-------------------------
diff --git a/gcc/ada/switch-b.adb b/gcc/ada/switch-b.adb
index 52a72e4de40..61fe4404b7d 100644
--- a/gcc/ada/switch-b.adb
+++ b/gcc/ada/switch-b.adb
@@ -391,6 +391,18 @@ package body Switch.B is
Ptr := Ptr + 1;
Quiet_Output := True;
+ -- Processing for Q switch
+
+ when 'Q' =>
+ if Ptr = Max then
+ Bad_Switch (Switch_Chars);
+ end if;
+
+ Ptr := Ptr + 1;
+ Scan_Pos
+ (Switch_Chars, Max, Ptr,
+ Quantity_Of_Default_Size_Sec_Stacks, C);
+
-- Processing for r switch
when 'r' =>
diff --git a/gcc/ada/switch-c.adb b/gcc/ada/switch-c.adb
index cd6b2006e22..5ad10e348a5 100644
--- a/gcc/ada/switch-c.adb
+++ b/gcc/ada/switch-c.adb
@@ -548,7 +548,6 @@ package body Switch.C is
Warn_On_Bad_Fixed_Value := True; -- -gnatwb
Warn_On_Biased_Representation := True; -- -gnatw.b
Warn_On_Export_Import := True; -- -gnatwx
- Warn_On_Modified_Unread := True; -- -gnatwm
Warn_On_No_Value_Assigned := True; -- -gnatwv
Warn_On_Object_Renames_Function := True; -- -gnatw.r
Warn_On_Overlap := True; -- -gnatw.i
diff --git a/gcc/ada/widechar.ads b/gcc/ada/widechar.ads
index a6e8293ae5d..3d2f9170976 100644
--- a/gcc/ada/widechar.ads
+++ b/gcc/ada/widechar.ads
@@ -6,7 +6,7 @@
-- --
-- S p e c --
-- --
--- Copyright (C) 1992-2014, Free Software Foundation, Inc. --
+-- Copyright (C) 1992-2017, Free Software Foundation, Inc. --
-- --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
@@ -95,4 +95,7 @@ package Widechar is
P : Source_Ptr) return Boolean;
-- Determines if S (P) is the start of a wide character sequence
+private
+ pragma Inline (Is_Start_Of_Wide_Char);
+
end Widechar;
diff --git a/gcc/alias.c b/gcc/alias.c
index f288299ec32..c69ef410eda 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2054,13 +2054,15 @@ compare_base_decls (tree base1, tree base2)
return 1;
/* If we have two register decls with register specification we
- cannot decide unless their assembler name is the same. */
+ cannot decide unless their assembler names are the same. */
if (DECL_REGISTER (base1)
&& DECL_REGISTER (base2)
+ && HAS_DECL_ASSEMBLER_NAME_P (base1)
+ && HAS_DECL_ASSEMBLER_NAME_P (base2)
&& DECL_ASSEMBLER_NAME_SET_P (base1)
&& DECL_ASSEMBLER_NAME_SET_P (base2))
{
- if (DECL_ASSEMBLER_NAME (base1) == DECL_ASSEMBLER_NAME (base2))
+ if (DECL_ASSEMBLER_NAME_RAW (base1) == DECL_ASSEMBLER_NAME_RAW (base2))
return 1;
return -1;
}
@@ -2331,7 +2333,7 @@ addr_side_effect_eval (rtx addr, poly_int64 size, int n_refs)
static inline bool
offset_overlap_p (poly_int64 c, poly_int64 xsize, poly_int64 ysize)
{
- if (known_zero (xsize) || known_zero (ysize))
+ if (must_eq (xsize, 0) || must_eq (ysize, 0))
return true;
if (may_ge (c, 0))
@@ -2561,7 +2563,7 @@ memrefs_conflict_p (poly_int64 xsize, rtx x, poly_int64 ysize, rtx y,
{
if (may_gt (xsize, 0))
xsize = -xsize;
- if (maybe_nonzero (xsize))
+ if (may_ne (xsize, 0))
xsize += sc + 1;
c -= sc + 1;
return memrefs_conflict_p (xsize, canon_rtx (XEXP (x, 0)),
@@ -2576,7 +2578,7 @@ memrefs_conflict_p (poly_int64 xsize, rtx x, poly_int64 ysize, rtx y,
{
if (may_gt (ysize, 0))
ysize = -ysize;
- if (maybe_nonzero (ysize))
+ if (may_ne (ysize, 0))
ysize += sc + 1;
c += sc + 1;
return memrefs_conflict_p (xsize, x,
diff --git a/gcc/asan.c b/gcc/asan.c
index 779aa78976d..d00089d04dc 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -628,10 +628,9 @@ handle_builtin_alloca (gcall *call, gimple_stmt_iterator *iter)
tree ptr_type = gimple_call_lhs (call) ? TREE_TYPE (gimple_call_lhs (call))
: ptr_type_node;
tree partial_size = NULL_TREE;
- bool alloca_with_align
- = DECL_FUNCTION_CODE (callee) == BUILT_IN_ALLOCA_WITH_ALIGN;
unsigned int align
- = alloca_with_align ? tree_to_uhwi (gimple_call_arg (call, 1)) : 0;
+ = DECL_FUNCTION_CODE (callee) == BUILT_IN_ALLOCA
+ ? 0 : tree_to_uhwi (gimple_call_arg (call, 1));
/* If ALIGN > ASAN_RED_ZONE_SIZE, we embed left redzone into first ALIGN
bytes of allocated space. Otherwise, align alloca to ASAN_RED_ZONE_SIZE
@@ -793,8 +792,7 @@ get_mem_refs_of_builtin_call (gcall *call,
handle_builtin_stack_restore (call, iter);
break;
- case BUILT_IN_ALLOCA_WITH_ALIGN:
- case BUILT_IN_ALLOCA:
+ CASE_BUILT_IN_ALLOCA:
handle_builtin_alloca (call, iter);
break;
/* And now the __atomic* and __sync builtins.
@@ -1804,13 +1802,13 @@ create_cond_insert_point (gimple_stmt_iterator *iter,
? profile_probability::very_unlikely ()
: profile_probability::very_likely ();
e->probability = fallthrough_probability.invert ();
+ then_bb->count = e->count ();
if (create_then_fallthru_edge)
make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU);
/* Set up the fallthrough basic block. */
e = find_edge (cond_bb, fallthru_bb);
e->flags = EDGE_FALSE_VALUE;
- e->count = cond_bb->count;
e->probability = fallthrough_probability;
/* Update dominance info for the newly created then_bb; note that
@@ -2946,6 +2944,9 @@ asan_finish_file (void)
TREE_CONSTANT (ctor) = 1;
TREE_STATIC (ctor) = 1;
DECL_INITIAL (var) = ctor;
+ SET_DECL_ALIGN (var, MAX (DECL_ALIGN (var),
+ ASAN_SHADOW_GRANULARITY * BITS_PER_UNIT));
+
varpool_node::finalize_decl (var);
tree fn = builtin_decl_implicit (BUILT_IN_ASAN_REGISTER_GLOBALS);
@@ -3401,6 +3402,10 @@ asan_expand_poison_ifn (gimple_stmt_iterator *iter,
{
edge e = gimple_phi_arg_edge (phi, i);
+ /* Do not insert on an edge we can't split. */
+ if (e->flags & EDGE_ABNORMAL)
+ continue;
+
if (call_to_insert == NULL)
call_to_insert = gimple_copy (call);
diff --git a/gcc/attribs.c b/gcc/attribs.c
index 4ef35b861f8..809f4c3a8d5 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -1125,9 +1125,9 @@ attribute_value_equal (const_tree attr1, const_tree attr2)
TREE_VALUE (attr2)) == 1);
}
- if ((flag_openmp || flag_openmp_simd)
- && TREE_VALUE (attr1) && TREE_VALUE (attr2)
+ if (TREE_VALUE (attr1)
&& TREE_CODE (TREE_VALUE (attr1)) == OMP_CLAUSE
+ && TREE_VALUE (attr2)
&& TREE_CODE (TREE_VALUE (attr2)) == OMP_CLAUSE)
return omp_declare_simd_clauses_equal (TREE_VALUE (attr1),
TREE_VALUE (attr2));
@@ -1182,6 +1182,9 @@ comp_type_attributes (const_tree type1, const_tree type2)
}
if (lookup_attribute ("transaction_safe", CONST_CAST_TREE (a)))
return 0;
+ if ((lookup_attribute ("nocf_check", TYPE_ATTRIBUTES (type1)) != NULL)
+ ^ (lookup_attribute ("nocf_check", TYPE_ATTRIBUTES (type2)) != NULL))
+ return 0;
/* As some type combinations - like default calling-convention - might
be compatible, we have to call the target hook to get the final result. */
return targetm.comp_type_attributes (type1, type2);
@@ -1319,6 +1322,44 @@ merge_decl_attributes (tree olddecl, tree newdecl)
DECL_ATTRIBUTES (newdecl));
}
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+ they are missing there. */
+
+void
+duplicate_one_attribute (tree *attrs, tree attr, const char *name)
+{
+ attr = lookup_attribute (name, attr);
+ if (!attr)
+ return;
+ tree a = lookup_attribute (name, *attrs);
+ while (attr)
+ {
+ tree a2;
+ for (a2 = a; a2; a2 = lookup_attribute (name, TREE_CHAIN (a2)))
+ if (attribute_value_equal (attr, a2))
+ break;
+ if (!a2)
+ {
+ a2 = copy_node (attr);
+ TREE_CHAIN (a2) = *attrs;
+ *attrs = a2;
+ }
+ attr = lookup_attribute (name, TREE_CHAIN (attr));
+ }
+}
+
+/* Duplicate all attributes that should be propagated from user DECL to the
+ corresponding builtin. */
+
+void
+copy_attributes_to_builtin (tree decl)
+{
+ tree b = builtin_decl_explicit (DECL_FUNCTION_CODE (decl));
+ if (b)
+ duplicate_one_attribute (&DECL_ATTRIBUTES (b),
+ DECL_ATTRIBUTES (decl), "omp declare simd");
+}
+
#if TARGET_DLLIMPORT_DECL_ATTRIBUTES
/* Specialization of merge_decl_attributes for various Windows targets.
diff --git a/gcc/attribs.h b/gcc/attribs.h
index 65e002ce988..f4bfe03e467 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -77,6 +77,16 @@ extern tree remove_attribute (const char *, tree);
extern tree merge_attributes (tree, tree);
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+ they are missing there. */
+
+extern void duplicate_one_attribute (tree *, tree, const char *);
+
+/* Duplicate all attributes that should be propagated from user DECL to the
+ corresponding builtin. */
+
+extern void copy_attributes_to_builtin (tree);
+
/* Given two Windows decl attributes lists, possibly including
dllimport, return a list of their union . */
extern tree merge_dllimport_decl_attributes (tree, tree);
diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 9226e202d50..130d8df5b1e 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -852,7 +852,7 @@ autofdo_source_profile::read ()
{
if (gcov_read_unsigned () != GCOV_TAG_AFDO_FUNCTION)
{
- inform (0, "Not expected TAG.");
+ inform (UNKNOWN_LOCATION, "Not expected TAG.");
return false;
}
@@ -1234,7 +1234,7 @@ afdo_propagate_edge (bool is_succ, bb_set *annotated_bb,
if (!is_edge_annotated (e, *annotated_edge))
num_unknown_edge++, unknown_edge = e;
else
- total_known_count += e->count;
+ total_known_count += e->count ();
if (num_unknown_edge == 0)
{
@@ -1251,7 +1251,8 @@ afdo_propagate_edge (bool is_succ, bb_set *annotated_bb,
}
else if (num_unknown_edge == 1 && is_bb_annotated (bb, *annotated_bb))
{
- unknown_edge->count = bb->count - total_known_count;
+ unknown_edge->probability
+ = total_known_count.probability_in (bb->count);
set_edge_annotated (unknown_edge, annotated_edge);
changed = true;
}
@@ -1349,15 +1350,13 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge)
if (!e->probability.initialized_p ()
&& !is_edge_annotated (ep, *annotated_edge))
{
- ep->probability = profile_probability::never ();
- ep->count = profile_count::zero ().afdo ();
+ ep->probability = profile_probability::never ().afdo ();
set_edge_annotated (ep, annotated_edge);
}
}
if (total == 1 && !is_edge_annotated (only_one, *annotated_edge))
{
only_one->probability = e->probability;
- only_one->count = e->count;
set_edge_annotated (only_one, annotated_edge);
}
}
@@ -1433,23 +1432,16 @@ afdo_calculate_branch_prob (bb_set *annotated_bb, edge_set *annotated_edge)
if (!is_edge_annotated (e, *annotated_edge))
num_unknown_succ++;
else
- total_count += e->count;
+ total_count += e->count ();
}
if (num_unknown_succ == 0 && total_count > profile_count::zero ())
{
FOR_EACH_EDGE (e, ei, bb->succs)
- e->probability = e->count.probability_in (total_count);
+ e->probability = e->count ().probability_in (total_count);
}
}
FOR_ALL_BB_FN (bb, cfun)
- {
- edge e;
- edge_iterator ei;
-
- FOR_EACH_EDGE (e, ei, bb->succs)
- e->count = bb->count.apply_probability (e->probability);
bb->aux = NULL;
- }
loop_optimizer_finalize ();
free_dominance_info (CDI_DOMINATORS);
@@ -1551,7 +1543,7 @@ afdo_annotate_cfg (const stmt_set &promoted_stmts)
counters are zero when not seen by autoFDO. */
bb->count = profile_count::zero ().afdo ();
FOR_EACH_EDGE (e, ei, bb->succs)
- e->count = profile_count::zero ().afdo ();
+ e->probability = profile_probability::uninitialized ();
if (afdo_set_bb_count (bb, promoted_stmts))
set_bb_annotated (bb, &annotated_bb);
diff --git a/gcc/basic-block.h b/gcc/basic-block.h
index c0c47784c02..5a5ddbfcb6d 100644
--- a/gcc/basic-block.h
+++ b/gcc/basic-block.h
@@ -46,8 +46,9 @@ struct GTY((user)) edge_def {
int flags; /* see cfg-flags.def */
profile_probability probability;
- profile_count count; /* Expected number of executions calculated
- in profile.c */
+
+ /* Return count of edge E. */
+ inline profile_count count () const;
};
/* Masks for edge.flags. */
@@ -147,9 +148,6 @@ struct GTY((chain_next ("%h.next_bb"), chain_prev ("%h.prev_bb"))) basic_block_d
/* Expected number of executions: calculated in profile.c. */
profile_count count;
- /* Expected frequency. Normalized to be in range 0 to BB_FREQ_MAX. */
- int frequency;
-
/* The discriminator for this block. The discriminator distinguishes
among several basic blocks that share a common locus, allowing for
more accurate sample-based profiling. */
@@ -300,7 +298,7 @@ enum cfg_bb_flags
? EDGE_SUCC ((bb), 1) : EDGE_SUCC ((bb), 0))
/* Return expected execution frequency of the edge E. */
-#define EDGE_FREQUENCY(e) e->probability.apply (e->src->frequency)
+#define EDGE_FREQUENCY(e) e->count ().to_frequency (cfun)
/* Compute a scale factor (or probability) suitable for scaling of
gcov_type values via apply_probability() and apply_scale(). */
@@ -639,4 +637,10 @@ has_abnormal_call_or_eh_pred_edge_p (basic_block bb)
return false;
}
+/* Return count of edge E. */
+inline profile_count edge_def::count () const
+{
+ return src->count.apply_probability (probability);
+}
+
#endif /* GCC_BASIC_BLOCK_H */
diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index 4dad298fe59..f7c1f4c971e 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -256,8 +256,8 @@ push_to_next_round_p (const_basic_block bb, int round, int number_of_rounds,
there_exists_another_round = round < number_of_rounds - 1;
- block_not_hot_enough = (bb->frequency < exec_th
- || bb->count < count_th
+ block_not_hot_enough = (bb->count.to_frequency (cfun) < exec_th
+ || bb->count.ipa () < count_th
|| probably_never_executed_bb_p (cfun, bb));
if (there_exists_another_round
@@ -293,9 +293,9 @@ find_traces (int *n_traces, struct trace *traces)
{
bbd[e->dest->index].heap = heap;
bbd[e->dest->index].node = heap->insert (bb_to_key (e->dest), e->dest);
- if (e->dest->frequency > max_entry_frequency)
- max_entry_frequency = e->dest->frequency;
- if (e->dest->count.initialized_p () && e->dest->count > max_entry_count)
+ if (e->dest->count.to_frequency (cfun) > max_entry_frequency)
+ max_entry_frequency = e->dest->count.to_frequency (cfun);
+ if (e->dest->count.ipa_p () && e->dest->count > max_entry_count)
max_entry_count = e->dest->count;
}
@@ -329,8 +329,10 @@ find_traces (int *n_traces, struct trace *traces)
for (bb = traces[i].first;
bb != traces[i].last;
bb = (basic_block) bb->aux)
- fprintf (dump_file, "%d [%d] ", bb->index, bb->frequency);
- fprintf (dump_file, "%d [%d]\n", bb->index, bb->frequency);
+ fprintf (dump_file, "%d [%d] ", bb->index,
+ bb->count.to_frequency (cfun));
+ fprintf (dump_file, "%d [%d]\n", bb->index,
+ bb->count.to_frequency (cfun));
}
fflush (dump_file);
}
@@ -374,11 +376,11 @@ rotate_loop (edge back_edge, struct trace *trace, int trace_n)
{
/* The current edge E is also preferred. */
int freq = EDGE_FREQUENCY (e);
- if (freq > best_freq || e->count > best_count)
+ if (freq > best_freq || e->count () > best_count)
{
best_freq = freq;
- if (e->count.initialized_p ())
- best_count = e->count;
+ if (e->count ().initialized_p ())
+ best_count = e->count ();
best_edge = e;
best_bb = bb;
}
@@ -392,17 +394,17 @@ rotate_loop (edge back_edge, struct trace *trace, int trace_n)
/* The current edge E is preferred. */
is_preferred = true;
best_freq = EDGE_FREQUENCY (e);
- best_count = e->count;
+ best_count = e->count ();
best_edge = e;
best_bb = bb;
}
else
{
int freq = EDGE_FREQUENCY (e);
- if (!best_edge || freq > best_freq || e->count > best_count)
+ if (!best_edge || freq > best_freq || e->count () > best_count)
{
best_freq = freq;
- best_count = e->count;
+ best_count = e->count ();
best_edge = e;
best_bb = bb;
}
@@ -529,7 +531,7 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
if (dump_file)
fprintf (dump_file, "Basic block %d was visited in trace %d\n",
- bb->index, *n_traces - 1);
+ bb->index, *n_traces);
ends_in_call = block_ends_with_call_p (bb);
@@ -545,11 +547,13 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
&& bb_visited_trace (e->dest) != *n_traces)
continue;
+ /* If partitioning hot/cold basic blocks, don't consider edges
+ that cross section boundaries. */
if (BB_PARTITION (e->dest) != BB_PARTITION (bb))
continue;
prob = e->probability;
- freq = e->dest->frequency;
+ freq = e->dest->count.to_frequency (cfun);
/* The only sensible preference for a call instruction is the
fallthru edge. Don't bother selecting anything else. */
@@ -571,12 +575,9 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
|| !prob.initialized_p ()
|| ((prob.to_reg_br_prob_base () < branch_th
|| EDGE_FREQUENCY (e) < exec_th
- || e->count < count_th) && (!for_size)))
+ || e->count ().ipa () < count_th) && (!for_size)))
continue;
- /* If partitioning hot/cold basic blocks, don't consider edges
- that cross section boundaries. */
-
if (better_edge_p (bb, e, prob, freq, best_prob, best_freq,
best_edge))
{
@@ -586,12 +587,28 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
}
}
- /* If the best destination has multiple predecessors, and can be
- duplicated cheaper than a jump, don't allow it to be added
- to a trace. We'll duplicate it when connecting traces. */
- if (best_edge && EDGE_COUNT (best_edge->dest->preds) >= 2
+ /* If the best destination has multiple predecessors and can be
+ duplicated cheaper than a jump, don't allow it to be added to
+ a trace; we'll duplicate it when connecting the traces later.
+ However, we need to check that this duplication wouldn't leave
+ the best destination with only crossing predecessors, because
+ this would change its effective partition from hot to cold. */
+ if (best_edge
+ && EDGE_COUNT (best_edge->dest->preds) >= 2
&& copy_bb_p (best_edge->dest, 0))
- best_edge = NULL;
+ {
+ bool only_crossing_preds = true;
+ edge e;
+ edge_iterator ei;
+ FOR_EACH_EDGE (e, ei, best_edge->dest->preds)
+ if (e != best_edge && !(e->flags & EDGE_CROSSING))
+ {
+ only_crossing_preds = false;
+ break;
+ }
+ if (!only_crossing_preds)
+ best_edge = NULL;
+ }
/* If the best destination has multiple successors or predecessors,
don't allow it to be added when optimizing for size. This makes
@@ -656,7 +673,7 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
|| !prob.initialized_p ()
|| prob.to_reg_br_prob_base () < branch_th
|| freq < exec_th
- || e->count < count_th)
+ || e->count ().ipa () < count_th)
{
/* When partitioning hot/cold basic blocks, make sure
the cold blocks (and only the cold blocks) all get
@@ -691,7 +708,7 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
if (best_edge->dest != bb)
{
if (EDGE_FREQUENCY (best_edge)
- > 4 * best_edge->dest->frequency / 5)
+ > 4 * best_edge->dest->count.to_frequency (cfun) / 5)
{
/* The loop has at least 4 iterations. If the loop
header is not the first block of the function
@@ -768,8 +785,8 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
& EDGE_CAN_FALLTHRU)
&& !(single_succ_edge (e->dest)->flags & EDGE_COMPLEX)
&& single_succ (e->dest) == best_edge->dest
- && (2 * e->dest->frequency >= EDGE_FREQUENCY (best_edge)
- || for_size))
+ && (2 * e->dest->count.to_frequency (cfun)
+ >= EDGE_FREQUENCY (best_edge) || for_size))
{
best_edge = e;
if (dump_file)
@@ -930,9 +947,9 @@ bb_to_key (basic_block bb)
if (priority)
/* The block with priority should have significantly lower key. */
- return -(100 * BB_FREQ_MAX + 100 * priority + bb->frequency);
+ return -(100 * BB_FREQ_MAX + 100 * priority + bb->count.to_frequency (cfun));
- return -bb->frequency;
+ return -bb->count.to_frequency (cfun);
}
/* Return true when the edge E from basic block BB is better than the temporary
@@ -988,16 +1005,6 @@ better_edge_p (const_basic_block bb, const_edge e, profile_probability prob,
else
is_better_edge = false;
- /* If we are doing hot/cold partitioning, make sure that we always favor
- non-crossing edges over crossing edges. */
-
- if (!is_better_edge
- && flag_reorder_blocks_and_partition
- && cur_best_edge
- && (cur_best_edge->flags & EDGE_CROSSING)
- && !(e->flags & EDGE_CROSSING))
- is_better_edge = true;
-
return is_better_edge;
}
@@ -1285,7 +1292,7 @@ connect_traces (int n_traces, struct trace *traces)
&& !connected[bbd[di].start_of_trace]
&& BB_PARTITION (e2->dest) == current_partition
&& EDGE_FREQUENCY (e2) >= freq_threshold
- && e2->count >= count_threshold
+ && e2->count ().ipa () >= count_threshold
&& (!best2
|| e2->probability > best2->probability
|| (e2->probability == best2->probability
@@ -1311,8 +1318,8 @@ connect_traces (int n_traces, struct trace *traces)
&& copy_bb_p (best->dest,
optimize_edge_for_speed_p (best)
&& EDGE_FREQUENCY (best) >= freq_threshold
- && (!best->count.initialized_p ()
- || best->count >= count_threshold)))
+ && (!best->count ().initialized_p ()
+ || best->count ().ipa () >= count_threshold)))
{
basic_block new_bb;
@@ -1370,7 +1377,7 @@ copy_bb_p (const_basic_block bb, int code_may_grow)
int max_size = uncond_jump_length;
rtx_insn *insn;
- if (!bb->frequency)
+ if (!bb->count.to_frequency (cfun))
return false;
if (EDGE_COUNT (bb->preds) < 2)
return false;
@@ -1454,7 +1461,6 @@ fix_up_crossing_landing_pad (eh_landing_pad old_lp, basic_block old_bb)
last_bb = EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb;
new_bb = create_basic_block (new_label, jump, last_bb);
new_bb->aux = last_bb->aux;
- new_bb->frequency = post_bb->frequency;
new_bb->count = post_bb->count;
last_bb->aux = new_bb;
@@ -1512,7 +1518,6 @@ sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count,
edge_iterator ei;
profile_probability highest_probability
= profile_probability::uninitialized ();
- int highest_freq = 0;
profile_count highest_count = profile_count::uninitialized ();
bool found = false;
@@ -1528,7 +1533,7 @@ sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count,
/* Do not expect profile insanities when profile was not adjusted. */
if (e->probability == profile_probability::never ()
- || e->count == profile_count::zero ())
+ || e->count () == profile_count::zero ())
continue;
if (BB_PARTITION (reach_bb) != BB_COLD_PARTITION)
@@ -1539,11 +1544,8 @@ sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count,
/* The following loop will look for the hottest edge via
the edge count, if it is non-zero, then fallback to the edge
frequency and finally the edge probability. */
- if (!highest_count.initialized_p () || e->count > highest_count)
- highest_count = e->count;
- int edge_freq = EDGE_FREQUENCY (e);
- if (edge_freq > highest_freq)
- highest_freq = edge_freq;
+ if (!(e->count () > highest_count))
+ highest_count = e->count ();
if (!highest_probability.initialized_p ()
|| e->probability > highest_probability)
highest_probability = e->probability;
@@ -1563,22 +1565,17 @@ sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count,
continue;
/* Do not expect profile insanities when profile was not adjusted. */
if (e->probability == profile_probability::never ()
- || e->count == profile_count::zero ())
+ || e->count () == profile_count::zero ())
continue;
/* Select the hottest edge using the edge count, if it is non-zero,
then fallback to the edge frequency and finally the edge
probability. */
- if (highest_count > 0)
- {
- if (e->count < highest_count)
- continue;
- }
- else if (highest_freq)
+ if (highest_count.initialized_p ())
{
- if (EDGE_FREQUENCY (e) < highest_freq)
+ if (!(e->count () >= highest_count))
continue;
}
- else if (e->probability < highest_probability)
+ else if (!(e->probability >= highest_probability))
continue;
basic_block reach_bb = walk_up ? e->src : e->dest;
diff --git a/gcc/brig/ChangeLog b/gcc/brig/ChangeLog
index fa7668486b2..dede3f405f5 100644
--- a/gcc/brig/ChangeLog
+++ b/gcc/brig/ChangeLog
@@ -1,3 +1,12 @@
+2017-10-31 Henry Linjamäki <henry.linjamaki@parmance.com>
+
+ * brig-lang.c (brig_langhook_type_for_mode): Fix PR 82771.
+
+2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * brig-lang.c (brig_langhook_type_for_mode): Use scalar_int_mode
+ and scalar_float_mode.
+
2017-10-09 Pekka Jääskeläinen <pekka.jaaskelainen@parmance.com>
* brigfrontend/brig-to-generic.cc: Support BRIG_KIND_NONE
diff --git a/gcc/brig/brig-lang.c b/gcc/brig/brig-lang.c
index cff605541d0..f34d9587632 100644
--- a/gcc/brig/brig-lang.c
+++ b/gcc/brig/brig-lang.c
@@ -280,9 +280,9 @@ brig_langhook_type_for_mode (machine_mode mode, int unsignedp)
scalar_int_mode imode;
scalar_float_mode fmode;
- if (is_int_mode (mode, &imode))
+ if (is_float_mode (mode, &fmode))
{
- switch (GET_MODE_BITSIZE (imode))
+ switch (GET_MODE_BITSIZE (fmode))
{
case 32:
return float_type_node;
@@ -291,15 +291,15 @@ brig_langhook_type_for_mode (machine_mode mode, int unsignedp)
default:
/* We have to check for long double in order to support
i386 excess precision. */
- if (imode == TYPE_MODE (long_double_type_node))
+ if (fmode == TYPE_MODE (long_double_type_node))
return long_double_type_node;
gcc_unreachable ();
return NULL_TREE;
}
}
- else if (is_float_mode (mode, &fmode))
- return brig_langhook_type_for_size (GET_MODE_BITSIZE (fmode), unsignedp);
+ else if (is_int_mode (mode, &imode))
+ return brig_langhook_type_for_size (GET_MODE_BITSIZE (imode), unsignedp);
else
{
/* E.g., build_common_builtin_nodes () asks for modes/builtins
diff --git a/gcc/bt-load.c b/gcc/bt-load.c
index 1da0ad62f1e..90922082e7b 100644
--- a/gcc/bt-load.c
+++ b/gcc/bt-load.c
@@ -185,7 +185,7 @@ static int first_btr, last_btr;
static int
basic_block_freq (const_basic_block bb)
{
- return bb->frequency;
+ return bb->count.to_frequency (cfun);
}
/* If the rtx at *XP references (sets or reads) any branch target
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 18946033502..877f2aef424 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -544,6 +544,20 @@ DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_DOUBLE_DOUBLE_DOUBLE,
BT_DOUBLE, BT_DOUBLE, BT_DOUBLE, BT_DOUBLE)
DEF_FUNCTION_TYPE_3 (BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE,
BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16,
+ BT_FLOAT16, BT_FLOAT16, BT_FLOAT16, BT_FLOAT16)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32,
+ BT_FLOAT32, BT_FLOAT32, BT_FLOAT32, BT_FLOAT32)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64,
+ BT_FLOAT64, BT_FLOAT64, BT_FLOAT64, BT_FLOAT64)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128,
+ BT_FLOAT128, BT_FLOAT128, BT_FLOAT128, BT_FLOAT128)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X,
+ BT_FLOAT32X, BT_FLOAT32X, BT_FLOAT32X, BT_FLOAT32X)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X,
+ BT_FLOAT64X, BT_FLOAT64X, BT_FLOAT64X, BT_FLOAT64X)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X,
+ BT_FLOAT128X, BT_FLOAT128X, BT_FLOAT128X, BT_FLOAT128X)
DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT_FLOAT_FLOAT_INTPTR,
BT_FLOAT, BT_FLOAT, BT_FLOAT, BT_INT_PTR)
DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_DOUBLE_DOUBLE_INTPTR,
diff --git a/gcc/builtins.c b/gcc/builtins.c
index c50d7f43f76..b0fe2a42980 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -1208,6 +1208,7 @@ void
expand_builtin_update_setjmp_buf (rtx buf_addr)
{
machine_mode sa_mode = STACK_SAVEAREA_MODE (SAVE_NONLOCAL);
+ buf_addr = convert_memory_address (Pmode, buf_addr);
rtx stack_save
= gen_rtx_MEM (sa_mode,
memory_address
@@ -1615,7 +1616,7 @@ expand_builtin_apply (rtx function, rtx arguments, rtx argsize)
arguments to the outgoing arguments address. We can pass TRUE
as the 4th argument because we just saved the stack pointer
and will restore it right after the call. */
- allocate_dynamic_stack_space (argsize, 0, BIGGEST_ALIGNMENT, true);
+ allocate_dynamic_stack_space (argsize, 0, BIGGEST_ALIGNMENT, -1, true);
/* Set DRAP flag to true, even though allocate_dynamic_stack_space
may have already set current_function_calls_alloca to true.
@@ -1822,14 +1823,26 @@ expand_builtin_classify_type (tree exp)
return GEN_INT (no_type_class);
}
-/* This helper macro, meant to be used in mathfn_built_in below,
- determines which among a set of three builtin math functions is
- appropriate for a given type mode. The `F' and `L' cases are
- automatically generated from the `double' case. */
+/* This helper macro, meant to be used in mathfn_built_in below, determines
+ which among a set of builtin math functions is appropriate for a given type
+ mode. The `F' (float) and `L' (long double) cases are automatically generated
+ from the `double' case. If a function supports the _Float<N> and _Float<N>X
+ types, there are additional types that are considered with 'F32', 'F64',
+ 'F128', etc. suffixes. */
#define CASE_MATHFN(MATHFN) \
CASE_CFN_##MATHFN: \
fcode = BUILT_IN_##MATHFN; fcodef = BUILT_IN_##MATHFN##F ; \
fcodel = BUILT_IN_##MATHFN##L ; break;
+/* Similar to the above, but also add support for the _Float<N> and _Float<N>X
+ types. */
+#define CASE_MATHFN_FLOATN(MATHFN) \
+ CASE_CFN_##MATHFN: \
+ fcode = BUILT_IN_##MATHFN; fcodef = BUILT_IN_##MATHFN##F ; \
+ fcodel = BUILT_IN_##MATHFN##L ; fcodef16 = BUILT_IN_##MATHFN##F16 ; \
+ fcodef32 = BUILT_IN_##MATHFN##F32; fcodef64 = BUILT_IN_##MATHFN##F64 ; \
+ fcodef128 = BUILT_IN_##MATHFN##F128 ; fcodef32x = BUILT_IN_##MATHFN##F32X ; \
+ fcodef64x = BUILT_IN_##MATHFN##F64X ; fcodef128x = BUILT_IN_##MATHFN##F128X ;\
+ break;
/* Similar to above, but appends _R after any F/L suffix. */
#define CASE_MATHFN_REENT(MATHFN) \
case CFN_BUILT_IN_##MATHFN##_R: \
@@ -1846,7 +1859,15 @@ expand_builtin_classify_type (tree exp)
static built_in_function
mathfn_built_in_2 (tree type, combined_fn fn)
{
+ tree mtype;
built_in_function fcode, fcodef, fcodel;
+ built_in_function fcodef16 = END_BUILTINS;
+ built_in_function fcodef32 = END_BUILTINS;
+ built_in_function fcodef64 = END_BUILTINS;
+ built_in_function fcodef128 = END_BUILTINS;
+ built_in_function fcodef32x = END_BUILTINS;
+ built_in_function fcodef64x = END_BUILTINS;
+ built_in_function fcodef128x = END_BUILTINS;
switch (fn)
{
@@ -1860,7 +1881,7 @@ mathfn_built_in_2 (tree type, combined_fn fn)
CASE_MATHFN (CBRT)
CASE_MATHFN (CEIL)
CASE_MATHFN (CEXPI)
- CASE_MATHFN (COPYSIGN)
+ CASE_MATHFN_FLOATN (COPYSIGN)
CASE_MATHFN (COS)
CASE_MATHFN (COSH)
CASE_MATHFN (DREM)
@@ -1873,9 +1894,9 @@ mathfn_built_in_2 (tree type, combined_fn fn)
CASE_MATHFN (FABS)
CASE_MATHFN (FDIM)
CASE_MATHFN (FLOOR)
- CASE_MATHFN (FMA)
- CASE_MATHFN (FMAX)
- CASE_MATHFN (FMIN)
+ CASE_MATHFN_FLOATN (FMA)
+ CASE_MATHFN_FLOATN (FMAX)
+ CASE_MATHFN_FLOATN (FMIN)
CASE_MATHFN (FMOD)
CASE_MATHFN (FREXP)
CASE_MATHFN (GAMMA)
@@ -1929,7 +1950,7 @@ mathfn_built_in_2 (tree type, combined_fn fn)
CASE_MATHFN (SIN)
CASE_MATHFN (SINCOS)
CASE_MATHFN (SINH)
- CASE_MATHFN (SQRT)
+ CASE_MATHFN_FLOATN (SQRT)
CASE_MATHFN (TAN)
CASE_MATHFN (TANH)
CASE_MATHFN (TGAMMA)
@@ -1942,12 +1963,27 @@ mathfn_built_in_2 (tree type, combined_fn fn)
return END_BUILTINS;
}
- if (TYPE_MAIN_VARIANT (type) == double_type_node)
+ mtype = TYPE_MAIN_VARIANT (type);
+ if (mtype == double_type_node)
return fcode;
- else if (TYPE_MAIN_VARIANT (type) == float_type_node)
+ else if (mtype == float_type_node)
return fcodef;
- else if (TYPE_MAIN_VARIANT (type) == long_double_type_node)
+ else if (mtype == long_double_type_node)
return fcodel;
+ else if (mtype == float16_type_node)
+ return fcodef16;
+ else if (mtype == float32_type_node)
+ return fcodef32;
+ else if (mtype == float64_type_node)
+ return fcodef64;
+ else if (mtype == float128_type_node)
+ return fcodef128;
+ else if (mtype == float32x_type_node)
+ return fcodef32x;
+ else if (mtype == float64x_type_node)
+ return fcodef64x;
+ else if (mtype == float128x_type_node)
+ return fcodef128x;
else
return END_BUILTINS;
}
@@ -2001,6 +2037,9 @@ associated_internal_fn (tree fndecl)
{
#define DEF_INTERNAL_FLT_FN(NAME, FLAGS, OPTAB, TYPE) \
CASE_FLT_FN (BUILT_IN_##NAME): return IFN_##NAME;
+#define DEF_INTERNAL_FLT_FLOATN_FN(NAME, FLAGS, OPTAB, TYPE) \
+ CASE_FLT_FN (BUILT_IN_##NAME): return IFN_##NAME; \
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_##NAME): return IFN_##NAME;
#define DEF_INTERNAL_INT_FN(NAME, FLAGS, OPTAB, TYPE) \
CASE_INT_FN (BUILT_IN_##NAME): return IFN_##NAME;
#include "internal-fn.def"
@@ -2074,6 +2113,7 @@ expand_builtin_mathfn_ternary (tree exp, rtx target, rtx subtarget)
switch (DECL_FUNCTION_CODE (fndecl))
{
CASE_FLT_FN (BUILT_IN_FMA):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
builtin_optab = fma_optab; break;
default:
gcc_unreachable ();
@@ -4864,19 +4904,22 @@ expand_builtin_alloca (tree exp)
rtx result;
unsigned int align;
tree fndecl = get_callee_fndecl (exp);
- bool alloca_with_align = (DECL_FUNCTION_CODE (fndecl)
- == BUILT_IN_ALLOCA_WITH_ALIGN);
+ HOST_WIDE_INT max_size;
+ enum built_in_function fcode = DECL_FUNCTION_CODE (fndecl);
bool alloca_for_var = CALL_ALLOCA_FOR_VAR_P (exp);
bool valid_arglist
- = (alloca_with_align
- ? validate_arglist (exp, INTEGER_TYPE, INTEGER_TYPE, VOID_TYPE)
- : validate_arglist (exp, INTEGER_TYPE, VOID_TYPE));
+ = (fcode == BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX
+ ? validate_arglist (exp, INTEGER_TYPE, INTEGER_TYPE, INTEGER_TYPE,
+ VOID_TYPE)
+ : fcode == BUILT_IN_ALLOCA_WITH_ALIGN
+ ? validate_arglist (exp, INTEGER_TYPE, INTEGER_TYPE, VOID_TYPE)
+ : validate_arglist (exp, INTEGER_TYPE, VOID_TYPE));
if (!valid_arglist)
return NULL_RTX;
- if ((alloca_with_align && !warn_vla_limit)
- || (!alloca_with_align && !warn_alloca_limit))
+ if ((alloca_for_var && !warn_vla_limit)
+ || (!alloca_for_var && !warn_alloca_limit))
{
/* -Walloca-larger-than and -Wvla-larger-than settings override
the more general -Walloc-size-larger-than so unless either of
@@ -4891,13 +4934,19 @@ expand_builtin_alloca (tree exp)
op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
/* Compute the alignment. */
- align = (alloca_with_align
- ? TREE_INT_CST_LOW (CALL_EXPR_ARG (exp, 1))
- : BIGGEST_ALIGNMENT);
+ align = (fcode == BUILT_IN_ALLOCA
+ ? BIGGEST_ALIGNMENT
+ : TREE_INT_CST_LOW (CALL_EXPR_ARG (exp, 1)));
+
+ /* Compute the maximum size. */
+ max_size = (fcode == BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX
+ ? TREE_INT_CST_LOW (CALL_EXPR_ARG (exp, 2))
+ : -1);
/* Allocate the desired space. If the allocation stems from the declaration
of a variable-sized object, it cannot accumulate. */
- result = allocate_dynamic_stack_space (op0, 0, align, alloca_for_var);
+ result
+ = allocate_dynamic_stack_space (op0, 0, align, max_size, alloca_for_var);
result = convert_memory_address (ptr_mode, result);
return result;
@@ -6491,8 +6540,7 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
&& fcode != BUILT_IN_EXECLE
&& fcode != BUILT_IN_EXECVP
&& fcode != BUILT_IN_EXECVE
- && fcode != BUILT_IN_ALLOCA
- && fcode != BUILT_IN_ALLOCA_WITH_ALIGN
+ && !ALLOCA_FUNCTION_CODE_P (fcode)
&& fcode != BUILT_IN_FREE
&& fcode != BUILT_IN_CHKP_SET_PTR_BOUNDS
&& fcode != BUILT_IN_CHKP_INIT_PTR_BOUNDS
@@ -6568,6 +6616,7 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
break;
CASE_FLT_FN (BUILT_IN_FMA):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
target = expand_builtin_mathfn_ternary (exp, target, subtarget);
if (target)
return target;
@@ -6721,8 +6770,7 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
else
return XEXP (DECL_RTL (DECL_RESULT (current_function_decl)), 0);
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
target = expand_builtin_alloca (exp);
if (target)
return target;
@@ -10416,8 +10464,7 @@ is_inexpensive_builtin (tree decl)
switch (DECL_FUNCTION_CODE (decl))
{
case BUILT_IN_ABS:
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
case BUILT_IN_BSWAP16:
case BUILT_IN_BSWAP32:
case BUILT_IN_BSWAP64:
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 1c1efceea21..26118f1766b 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -128,6 +128,29 @@ along with GCC; see the file COPYING3. If not see
DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, \
TYPE, true, true, true, ATTRS, false, true)
+/* A set of GCC builtins for _FloatN and _FloatNx types. TYPE_MACRO is called
+ with an argument such as FLOAT32 to produce the enum value for the type. If
+ we are compiling for the C language with GNU extensions, we enable the name
+ without the __builtin_ prefix as well as the name with the __builtin_
+ prefix. C++ does not enable these names by default because they don't have
+ the _Float<N> and _Float<N>X keywords, and a class-based library should use
+ the __builtin_ names. */
+#undef DEF_FLOATN_BUILTIN
+#define DEF_FLOATN_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
+ DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE, \
+ targetm.floatn_builtin_p ((int) ENUM), true, true, ATTRS, \
+ false, true)
+#undef DEF_EXT_LIB_FLOATN_NX_BUILTINS
+#define DEF_EXT_LIB_FLOATN_NX_BUILTINS(ENUM, NAME, TYPE_MACRO, ATTRS) \
+ DEF_FLOATN_BUILTIN (ENUM ## F16, NAME "f16", TYPE_MACRO (FLOAT16), ATTRS) \
+ DEF_FLOATN_BUILTIN (ENUM ## F32, NAME "f32", TYPE_MACRO (FLOAT32), ATTRS) \
+ DEF_FLOATN_BUILTIN (ENUM ## F64, NAME "f64", TYPE_MACRO (FLOAT64), ATTRS) \
+ DEF_FLOATN_BUILTIN (ENUM ## F128, NAME "f128", TYPE_MACRO (FLOAT128), ATTRS) \
+ DEF_FLOATN_BUILTIN (ENUM ## F32X, NAME "f32x", TYPE_MACRO (FLOAT32X), ATTRS) \
+ DEF_FLOATN_BUILTIN (ENUM ## F64X, NAME "f64x", TYPE_MACRO (FLOAT64X), ATTRS) \
+ DEF_FLOATN_BUILTIN (ENUM ## F128X, NAME "f128x", TYPE_MACRO (FLOAT128X), \
+ ATTRS)
+
/* Like DEF_LIB_BUILTIN, except that the function is only a part of
the standard in C94 or above. */
#undef DEF_C94_BUILTIN
@@ -324,7 +347,7 @@ DEF_C99_BUILTIN (BUILT_IN_COPYSIGN, "copysign", BT_FN_DOUBLE_DOUBLE_DOUBL
DEF_C99_BUILTIN (BUILT_IN_COPYSIGNF, "copysignf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
DEF_C99_BUILTIN (BUILT_IN_COPYSIGNL, "copysignl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
#define COPYSIGN_TYPE(F) BT_FN_##F##_##F##_##F
-DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_COPYSIGN, "copysign", COPYSIGN_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_COPYSIGN, "copysign", COPYSIGN_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
#undef COPYSIGN_TYPE
DEF_LIB_BUILTIN (BUILT_IN_COS, "cos", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
DEF_C99_C90RES_BUILTIN (BUILT_IN_COSF, "cosf", BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
@@ -357,7 +380,7 @@ DEF_LIB_BUILTIN (BUILT_IN_FABS, "fabs", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_N
DEF_C99_C90RES_BUILTIN (BUILT_IN_FABSF, "fabsf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
DEF_C99_C90RES_BUILTIN (BUILT_IN_FABSL, "fabsl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
#define FABS_TYPE(F) BT_FN_##F##_##F
-DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_FABS, "fabs", FABS_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_FABS, "fabs", FABS_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
#undef FABS_TYPE
DEF_GCC_BUILTIN (BUILT_IN_FABSD32, "fabsd32", BT_FN_DFLOAT32_DFLOAT32, ATTR_CONST_NOTHROW_LEAF_LIST)
DEF_GCC_BUILTIN (BUILT_IN_FABSD64, "fabsd64", BT_FN_DFLOAT64_DFLOAT64, ATTR_CONST_NOTHROW_LEAF_LIST)
@@ -382,12 +405,21 @@ DEF_C99_C90RES_BUILTIN (BUILT_IN_FLOORL, "floorl", BT_FN_LONGDOUBLE_LONGDOUBLE,
DEF_C99_BUILTIN (BUILT_IN_FMA, "fma", BT_FN_DOUBLE_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
DEF_C99_BUILTIN (BUILT_IN_FMAF, "fmaf", BT_FN_FLOAT_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
DEF_C99_BUILTIN (BUILT_IN_FMAL, "fmal", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING)
+#define FMA_TYPE(F) BT_FN_##F##_##F##_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_FMA, "fma", FMA_TYPE, ATTR_MATHFN_FPROUNDING)
+#undef FMA_TYPE
DEF_C99_BUILTIN (BUILT_IN_FMAX, "fmax", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
DEF_C99_BUILTIN (BUILT_IN_FMAXF, "fmaxf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
DEF_C99_BUILTIN (BUILT_IN_FMAXL, "fmaxl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define FMAX_TYPE(F) BT_FN_##F##_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_FMAX, "fmax", FMAX_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef FMAX_TYPE
DEF_C99_BUILTIN (BUILT_IN_FMIN, "fmin", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
DEF_C99_BUILTIN (BUILT_IN_FMINF, "fminf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
DEF_C99_BUILTIN (BUILT_IN_FMINL, "fminl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define FMIN_TYPE(F) BT_FN_##F##_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_FMIN, "fmin", FMIN_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef FMIN_TYPE
DEF_LIB_BUILTIN (BUILT_IN_FMOD, "fmod", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
DEF_C99_C90RES_BUILTIN (BUILT_IN_FMODF, "fmodf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
DEF_C99_C90RES_BUILTIN (BUILT_IN_FMODL, "fmodl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
@@ -495,7 +527,7 @@ DEF_C99_BUILTIN (BUILT_IN_NAN, "nan", BT_FN_DOUBLE_CONST_STRING, ATTR_CON
DEF_C99_BUILTIN (BUILT_IN_NANF, "nanf", BT_FN_FLOAT_CONST_STRING, ATTR_CONST_NOTHROW_NONNULL)
DEF_C99_BUILTIN (BUILT_IN_NANL, "nanl", BT_FN_LONGDOUBLE_CONST_STRING, ATTR_CONST_NOTHROW_NONNULL)
#define NAN_TYPE(F) BT_FN_##F##_CONST_STRING
-DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_NAN, "nan", NAN_TYPE, ATTR_CONST_NOTHROW_NONNULL)
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_NAN, "nan", NAN_TYPE, ATTR_CONST_NOTHROW_NONNULL)
DEF_GCC_BUILTIN (BUILT_IN_NAND32, "nand32", BT_FN_DFLOAT32_CONST_STRING, ATTR_CONST_NOTHROW_NONNULL)
DEF_GCC_BUILTIN (BUILT_IN_NAND64, "nand64", BT_FN_DFLOAT64_CONST_STRING, ATTR_CONST_NOTHROW_NONNULL)
DEF_GCC_BUILTIN (BUILT_IN_NAND128, "nand128", BT_FN_DFLOAT128_CONST_STRING, ATTR_CONST_NOTHROW_NONNULL)
@@ -564,6 +596,9 @@ DEF_C99_C90RES_BUILTIN (BUILT_IN_SINL, "sinl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR
DEF_LIB_BUILTIN (BUILT_IN_SQRT, "sqrt", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
DEF_C99_C90RES_BUILTIN (BUILT_IN_SQRTF, "sqrtf", BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
DEF_C99_C90RES_BUILTIN (BUILT_IN_SQRTL, "sqrtl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
+#define SQRT_TYPE(F) BT_FN_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_SQRT, "sqrt", SQRT_TYPE, ATTR_MATHFN_FPROUNDING_ERRNO)
+#undef SQRT_TYPE
DEF_LIB_BUILTIN (BUILT_IN_TAN, "tan", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
DEF_C99_C90RES_BUILTIN (BUILT_IN_TANF, "tanf", BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
DEF_LIB_BUILTIN (BUILT_IN_TANH, "tanh", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
@@ -921,6 +956,7 @@ DEF_BUILTIN_STUB (BUILT_IN_SETJMP_RECEIVER, "__builtin_setjmp_receiver")
DEF_BUILTIN_STUB (BUILT_IN_STACK_SAVE, "__builtin_stack_save")
DEF_BUILTIN_STUB (BUILT_IN_STACK_RESTORE, "__builtin_stack_restore")
DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN, "__builtin_alloca_with_align")
+DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX, "__builtin_alloca_with_align_and_max")
/* An internal version of memcmp, used when the result is only tested for
equality with zero. */
diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
index ee6fc87dd6f..e8476426fcb 100644
--- a/gcc/c-family/ChangeLog
+++ b/gcc/c-family/ChangeLog
@@ -1,3 +1,52 @@
+2017-10-31 David Malcolm <dmalcolm@redhat.com>
+
+ * c-common.c (binary_op_error): Update for renaming of
+ error_at_rich_loc.
+ (c_parse_error): Likewise.
+ * c-warn.c (warn_logical_not_parentheses): Likewise for
+ renaming of inform_at_rich_loc.
+ (warn_for_restrict): Likewise for renaming of
+ warning_at_rich_loc_n.
+
+2017-10-30 Joseph Myers <joseph@codesourcery.com>
+
+ * c.opt (std=c17, std=gnu17, std=iso9899:2017): New options.
+ * c-opts.c (set_std_c17): New function.
+ (c_common_init_options): Use gnu17 as default C version.
+ (c_common_handle_option): Handle -std=c17 and -std=gnu17.
+
+2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
+
+ * c-cppbuiltin.c (mode_has_fma): Add support for PowerPC KFmode.
+ (c_cpp_builtins): If a machine has a fast fma _Float<N> and
+ _Float<N>X variant, define __FP_FAST_FMA<N> and/or
+ __FP_FAST_FMA<N>X.
+
+2017-10-23 Marek Polacek <polacek@redhat.com>
+
+ PR c/82681
+ * c-warn.c (warnings_for_convert_and_check): Fix typos.
+
+2017-10-19 Eric Botcazou <ebotcazou@adacore.com>
+
+ * c-common.c (check_builtin_function_arguments): Also check arguments
+ of __builtin_alloca_with_align_and_max.
+
+2017-10-17 David Malcolm <dmalcolm@redhat.com>
+
+ * c-format.c (format_warning_at_char): Pass UNKNOWN_LOCATION
+ rather than NULL to format_warning_va.
+ (check_format_types): Likewise when calling format_type_warning.
+ Remove code to extract source_ranges and source_range * in favor
+ of just a location_t.
+ (format_type_warning): Convert source_range * param to a
+ location_t.
+
+2017-10-13 Jakub Jelinek <jakub@redhat.com>
+
+ * c-gimplify.c (c_gimplify_expr): Handle [LR]ROTATE_EXPR like
+ [LR]SHIFT_EXPR.
+
2017-10-12 David Malcolm <dmalcolm@redhat.com>
* c-common.c (enum missing_token_insertion_kind): New enum.
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index bd8ca306c2d..bb75cba4c39 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -65,6 +65,7 @@ static tree handle_asan_odr_indicator_attribute (tree *, tree, tree, int,
static tree handle_stack_protect_attribute (tree *, tree, tree, int, bool *);
static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
static tree handle_noclone_attribute (tree *, tree, tree, int, bool *);
+static tree handle_nocf_check_attribute (tree *, tree, tree, int, bool *);
static tree handle_noicf_attribute (tree *, tree, tree, int, bool *);
static tree handle_noipa_attribute (tree *, tree, tree, int, bool *);
static tree handle_leaf_attribute (tree *, tree, tree, int, bool *);
@@ -367,6 +368,8 @@ const struct attribute_spec c_common_attribute_table[] =
{ "patchable_function_entry", 1, 2, true, false, false,
handle_patchable_function_entry_attribute,
false },
+ { "nocf_check", 0, 0, false, true, true,
+ handle_nocf_check_attribute, true },
{ NULL, 0, 0, false, false, false, NULL, false }
};
@@ -772,6 +775,30 @@ handle_noclone_attribute (tree *node, tree name,
return NULL_TREE;
}
+/* Handle a "nocf_check" attribute; arguments as in
+ struct attribute_spec.handler. */
+
+static tree
+handle_nocf_check_attribute (tree *node, tree name,
+ tree ARG_UNUSED (args),
+ int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+ if (TREE_CODE (*node) != FUNCTION_TYPE
+ && TREE_CODE (*node) != METHOD_TYPE)
+ {
+ warning (OPT_Wattributes, "%qE attribute ignored", name);
+ *no_add_attrs = true;
+ }
+ else if (!(flag_cf_protection & CF_BRANCH))
+ {
+ warning (OPT_Wattributes, "%qE attribute ignored. Use "
+ "-fcf-protection option to enable it", name);
+ *no_add_attrs = true;
+ }
+
+ return NULL_TREE;
+}
+
/* Handle a "no_icf" attribute; arguments as in
struct attribute_spec.handler. */
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 1f2bf646e76..83c6aadda27 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -2704,9 +2704,9 @@ binary_op_error (rich_location *richloc, enum tree_code code,
default:
gcc_unreachable ();
}
- error_at_rich_loc (richloc,
- "invalid operands to binary %s (have %qT and %qT)",
- opname, type0, type1);
+ error_at (richloc,
+ "invalid operands to binary %s (have %qT and %qT)",
+ opname, type0, type1);
}
/* Given an expression as a tree, return its original type. Do this
@@ -5705,6 +5705,16 @@ check_builtin_function_arguments (location_t loc, vec<location_t> arg_loc,
switch (DECL_FUNCTION_CODE (fndecl))
{
+ case BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX:
+ if (!tree_fits_uhwi_p (args[2]))
+ {
+ error_at (ARG_LOCATION (2),
+ "third argument to function %qE must be a constant integer",
+ fndecl);
+ return false;
+ }
+ /* fall through */
+
case BUILT_IN_ALLOCA_WITH_ALIGN:
{
/* Get the requested alignment (in bits) if it's a constant
@@ -5944,7 +5954,7 @@ c_parse_error (const char *gmsgid, enum cpp_ttype token_type,
else
message = catenate_messages (gmsgid, " before %s'\\x%x'");
- error_at_rich_loc (richloc, message, prefix, val);
+ error_at (richloc, message, prefix, val);
free (message);
message = NULL;
}
@@ -5972,7 +5982,7 @@ c_parse_error (const char *gmsgid, enum cpp_ttype token_type,
else if (token_type == CPP_NAME)
{
message = catenate_messages (gmsgid, " before %qE");
- error_at_rich_loc (richloc, message, value);
+ error_at (richloc, message, value);
free (message);
message = NULL;
}
@@ -5985,16 +5995,16 @@ c_parse_error (const char *gmsgid, enum cpp_ttype token_type,
else if (token_type < N_TTYPES)
{
message = catenate_messages (gmsgid, " before %qs token");
- error_at_rich_loc (richloc, message, cpp_type2name (token_type, token_flags));
+ error_at (richloc, message, cpp_type2name (token_type, token_flags));
free (message);
message = NULL;
}
else
- error_at_rich_loc (richloc, gmsgid);
+ error_at (richloc, gmsgid);
if (message)
{
- error_at_rich_loc (richloc, message);
+ error_at (richloc, message);
free (message);
}
#undef catenate_messages
diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 4330c9102d9..2ac9616b72f 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -82,6 +82,11 @@ mode_has_fma (machine_mode mode)
return !!HAVE_fmadf4;
#endif
+#ifdef HAVE_fmakf4 /* PowerPC if long double != __float128. */
+ case E_KFmode:
+ return !!HAVE_fmakf4;
+#endif
+
#ifdef HAVE_fmaxf4
case E_XFmode:
return !!HAVE_fmaxf4;
@@ -1119,7 +1124,7 @@ c_cpp_builtins (cpp_reader *pfile)
floatn_nx_types[i].extended ? "X" : "");
sprintf (csuffix, "F%d%s", floatn_nx_types[i].n,
floatn_nx_types[i].extended ? "x" : "");
- builtin_define_float_constants (prefix, csuffix, "%s", NULL,
+ builtin_define_float_constants (prefix, csuffix, "%s", csuffix,
FLOATN_NX_TYPE_NODE (i));
}
diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 0dba9793311..164d0353967 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -97,7 +97,8 @@ format_warning_at_char (location_t fmt_string_loc, tree format_string_cst,
substring_loc fmt_loc (fmt_string_loc, string_type, char_idx, char_idx,
char_idx);
- bool warned = format_warning_va (fmt_loc, NULL, NULL, opt, gmsgid, &ap);
+ bool warned = format_warning_va (fmt_loc, UNKNOWN_LOCATION, NULL, opt,
+ gmsgid, &ap);
va_end (ap);
return warned;
@@ -1039,7 +1040,7 @@ static void check_format_types (const substring_loc &fmt_loc,
char conversion_char,
vec<location_t> *arglocs);
static void format_type_warning (const substring_loc &fmt_loc,
- source_range *param_range,
+ location_t param_loc,
format_wanted_type *, tree,
tree,
const format_kind_info *fki,
@@ -3073,8 +3074,9 @@ check_format_types (const substring_loc &fmt_loc,
cur_param = types->param;
if (!cur_param)
{
- format_type_warning (fmt_loc, NULL, types, wanted_type, NULL, fki,
- offset_to_type_start, conversion_char);
+ format_type_warning (fmt_loc, UNKNOWN_LOCATION, types, wanted_type,
+ NULL, fki, offset_to_type_start,
+ conversion_char);
continue;
}
@@ -3084,23 +3086,15 @@ check_format_types (const substring_loc &fmt_loc,
orig_cur_type = cur_type;
char_type_flag = 0;
- source_range param_range;
- source_range *param_range_ptr;
+ location_t param_loc = UNKNOWN_LOCATION;
if (EXPR_HAS_LOCATION (cur_param))
- {
- param_range = EXPR_LOCATION_RANGE (cur_param);
- param_range_ptr = &param_range;
- }
+ param_loc = EXPR_LOCATION (cur_param);
else if (arglocs)
{
/* arg_num is 1-based. */
gcc_assert (types->arg_num > 0);
- location_t param_loc = (*arglocs)[types->arg_num - 1];
- param_range = get_range_from_loc (line_table, param_loc);
- param_range_ptr = &param_range;
+ param_loc = (*arglocs)[types->arg_num - 1];
}
- else
- param_range_ptr = NULL;
STRIP_NOPS (cur_param);
@@ -3166,7 +3160,7 @@ check_format_types (const substring_loc &fmt_loc,
}
else
{
- format_type_warning (fmt_loc, param_range_ptr,
+ format_type_warning (fmt_loc, param_loc,
types, wanted_type, orig_cur_type, fki,
offset_to_type_start, conversion_char);
break;
@@ -3236,7 +3230,7 @@ check_format_types (const substring_loc &fmt_loc,
&& TYPE_PRECISION (cur_type) == TYPE_PRECISION (wanted_type))
continue;
/* Now we have a type mismatch. */
- format_type_warning (fmt_loc, param_range_ptr, types,
+ format_type_warning (fmt_loc, param_loc, types,
wanted_type, orig_cur_type, fki,
offset_to_type_start, conversion_char);
}
@@ -3544,8 +3538,9 @@ get_corrected_substring (const substring_loc &fmt_loc,
/* Give a warning about a format argument of different type from that expected.
The range of the diagnostic is taken from WHOLE_FMT_LOC; the caret location
is based on the location of the char at TYPE->offset_loc.
- If non-NULL, PARAM_RANGE is the source range of the
- relevant argument. WANTED_TYPE is the type the argument should have,
+ PARAM_LOC is the location of the relevant argument, or UNKNOWN_LOCATION
+ if this is unavailable.
+ WANTED_TYPE is the type the argument should have,
possibly stripped of pointer dereferences. The description (such as "field
precision"), the placement in the format string, a possibly more
friendly name of WANTED_TYPE, and the number of pointer dereferences
@@ -3566,7 +3561,7 @@ get_corrected_substring (const substring_loc &fmt_loc,
V~~~~~~~~ : range of WHOLE_FMT_LOC, from cols 23-31
sprintf (d, "before %-+*.*lld after", int_expr, int_expr, long_expr);
^ ^ ^~~~~~~~~
- | ` CONVERSION_CHAR: 'd' *PARAM_RANGE
+ | ` CONVERSION_CHAR: 'd' PARAM_LOC
type starts here
OFFSET_TO_TYPE_START is 13, the offset to the "lld" within the
@@ -3574,7 +3569,7 @@ get_corrected_substring (const substring_loc &fmt_loc,
static void
format_type_warning (const substring_loc &whole_fmt_loc,
- source_range *param_range,
+ location_t param_loc,
format_wanted_type *type,
tree wanted_type, tree arg_type,
const format_kind_info *fki,
@@ -3636,7 +3631,7 @@ format_type_warning (const substring_loc &whole_fmt_loc,
{
if (arg_type)
format_warning_at_substring
- (fmt_loc, param_range,
+ (fmt_loc, param_loc,
corrected_substring, OPT_Wformat_,
"%s %<%s%.*s%> expects argument of type %<%s%s%>, "
"but argument %d has type %qT",
@@ -3646,7 +3641,7 @@ format_type_warning (const substring_loc &whole_fmt_loc,
wanted_type_name, p, arg_num, arg_type);
else
format_warning_at_substring
- (fmt_loc, param_range,
+ (fmt_loc, param_loc,
corrected_substring, OPT_Wformat_,
"%s %<%s%.*s%> expects a matching %<%s%s%> argument",
gettext (kind_descriptions[kind]),
@@ -3657,7 +3652,7 @@ format_type_warning (const substring_loc &whole_fmt_loc,
{
if (arg_type)
format_warning_at_substring
- (fmt_loc, param_range,
+ (fmt_loc, param_loc,
corrected_substring, OPT_Wformat_,
"%s %<%s%.*s%> expects argument of type %<%T%s%>, "
"but argument %d has type %qT",
@@ -3667,7 +3662,7 @@ format_type_warning (const substring_loc &whole_fmt_loc,
wanted_type, p, arg_num, arg_type);
else
format_warning_at_substring
- (fmt_loc, param_range,
+ (fmt_loc, param_loc,
corrected_substring, OPT_Wformat_,
"%s %<%s%.*s%> expects a matching %<%T%s%> argument",
gettext (kind_descriptions[kind]),
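
Replacing the source_range parameter with a single PARAM_LOC does not change the user-visible diagnostic for mismatches like the one below; the argument is still highlighted, now via one location rather than a stored range:

#include <stdio.h>

void
report (int n)
{
  /* -Wformat: "%ld" expects 'long int', but argument 2 has type 'int';
     the caret sits on the directive, PARAM_LOC marks 'n'.  */
  printf ("count: %ld\n", n);
}
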
diff --git a/gcc/c-family/c-gimplify.c b/gcc/c-family/c-gimplify.c
index 6a4b7c77a34..91f9bf9c7a3 100644
--- a/gcc/c-family/c-gimplify.c
+++ b/gcc/c-family/c-gimplify.c
@@ -229,6 +229,8 @@ c_gimplify_expr (tree *expr_p, gimple_seq *pre_p ATTRIBUTE_UNUSED,
{
case LSHIFT_EXPR:
case RSHIFT_EXPR:
+ case LROTATE_EXPR:
+ case RROTATE_EXPR:
{
/* We used to convert the right operand of a shift-expression
to an integer_type_node in the FEs. But it is unnecessary
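
The rotate codes newly listed above are reached through idioms like the following, which the compiler folds into a single rotate on targets that have one; its count operand now takes the same gimplification path as ordinary shift counts:

unsigned int
rotl32 (unsigned int x, unsigned int n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));   /* recognized as a rotate */
}
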
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 6bd535532d3..32120e636c2 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -115,6 +115,7 @@ static void set_std_cxx2a (int);
static void set_std_c89 (int, int);
static void set_std_c99 (int);
static void set_std_c11 (int);
+static void set_std_c17 (int);
static void check_deps_environment_vars (void);
static void handle_deferred_opts (void);
static void sanitize_cpp_opts (void);
@@ -236,8 +237,8 @@ c_common_init_options (unsigned int decoded_options_count,
if (c_language == clk_c)
{
- /* The default for C is gnu11. */
- set_std_c11 (false /* ISO */);
+ /* The default for C is gnu17. */
+ set_std_c17 (false /* ISO */);
/* If preprocessing assembly language, accept any of the C-family
front end options since the driver may pass them through. */
@@ -675,6 +676,16 @@ c_common_handle_option (size_t scode, const char *arg, int value,
set_std_c11 (false /* ISO */);
break;
+ case OPT_std_c17:
+ if (!preprocessing_asm_p)
+ set_std_c17 (true /* ISO */);
+ break;
+
+ case OPT_std_gnu17:
+ if (!preprocessing_asm_p)
+ set_std_c17 (false /* ISO */);
+ break;
+
case OPT_trigraphs:
cpp_opts->trigraphs = 1;
break;
@@ -1559,6 +1570,21 @@ set_std_c11 (int iso)
lang_hooks.name = "GNU C11";
}
+/* Set the C 17 standard (without GNU extensions if ISO). */
+static void
+set_std_c17 (int iso)
+{
+ cpp_set_lang (parse_in, iso ? CLK_STDC17: CLK_GNUC17);
+ flag_no_asm = iso;
+ flag_no_nonansi_builtin = iso;
+ flag_iso = iso;
+ flag_isoc11 = 1;
+ flag_isoc99 = 1;
+ flag_isoc94 = 1;
+ lang_hooks.name = "GNU C17";
+}
+
+
/* Set the C++ 98 standard (without GNU extensions if ISO). */
static void
set_std_cxx98 (int iso)
diff --git a/gcc/c-family/c-warn.c b/gcc/c-family/c-warn.c
index cb1db0327c3..09ef6856cf9 100644
--- a/gcc/c-family/c-warn.c
+++ b/gcc/c-family/c-warn.c
@@ -496,8 +496,8 @@ warn_logical_not_parentheses (location_t location, enum tree_code code,
rich_location richloc (line_table, lhs_loc);
richloc.add_fixit_insert_before (lhs_loc, "(");
richloc.add_fixit_insert_after (lhs_loc, ")");
- inform_at_rich_loc (&richloc, "add parentheses around left hand side "
- "expression to silence this warning");
+ inform (&richloc, "add parentheses around left hand side "
+ "expression to silence this warning");
}
}
@@ -1215,12 +1215,12 @@ warnings_for_convert_and_check (location_t loc, tree type, tree expr,
if (cst)
warning_at (loc, OPT_Woverflow,
"overflow in conversion from %qT to %qT "
- "chages value from %qE to %qE",
+ "changes value from %qE to %qE",
exprtype, type, expr, result);
else
warning_at (loc, OPT_Woverflow,
"overflow in conversion from %qT to %qT "
- "chages the value of %qE",
+ "changes the value of %qE",
exprtype, type, expr);
}
else
@@ -2391,13 +2391,13 @@ warn_for_restrict (unsigned param_pos, tree *argarray, unsigned nargs)
richloc.add_range (EXPR_LOCATION (arg), false);
}
- warning_at_rich_loc_n (&richloc, OPT_Wrestrict, arg_positions.length (),
- "passing argument %i to restrict-qualified parameter"
- " aliases with argument %Z",
- "passing argument %i to restrict-qualified parameter"
- " aliases with arguments %Z",
- param_pos + 1, arg_positions.address (),
- arg_positions.length ());
+ warning_n (&richloc, OPT_Wrestrict, arg_positions.length (),
+ "passing argument %i to restrict-qualified parameter"
+ " aliases with argument %Z",
+ "passing argument %i to restrict-qualified parameter"
+ " aliases with arguments %Z",
+ param_pos + 1, arg_positions.address (),
+ arg_positions.length ());
}
/* Callback function to determine whether an expression TP or one of its
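
The warning_n call above fires when an argument aliases a restrict-qualified parameter. A small self-contained example of the situation it reports (diagnostic wording abbreviated in the comment):

#include <string.h>

/* Both pointer parameters promise, via restrict, not to alias.  */
static void
combine (char *restrict dst, const char *restrict src, size_t n)
{
  memmove (dst, src, n);
}

int
main (void)
{
  char p[8] = "abcdefg";
  combine (p, p, 4);   /* -Wrestrict: the two arguments alias */
  return 0;
}
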
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 13d2a59b8a5..dae124ac1c2 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1944,6 +1944,10 @@ std=c1x
C ObjC Alias(std=c11)
Deprecated in favor of -std=c11.
+std=c17
+C ObjC
+Conform to the ISO 2017 C standard.
+
std=c89
C ObjC Alias(std=c90)
Conform to the ISO 1990 C standard.
@@ -2006,6 +2010,10 @@ std=gnu1x
C ObjC Alias(std=gnu11)
Deprecated in favor of -std=gnu11.
+std=gnu17
+C ObjC
+Conform to the ISO 2017 C standard with GNU extensions.
+
std=gnu89
C ObjC Alias(std=gnu90)
Conform to the ISO 1990 C standard with GNU extensions.
@@ -2042,6 +2050,10 @@ std=iso9899:2011
C ObjC Alias(std=c11)
Conform to the ISO 2011 C standard.
+std=iso9899:2017
+C ObjC Alias(std=c17)
+Conform to the ISO 2017 C standard.
+
traditional
Driver
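
C17 is a bug-fix revision of C11, so beyond the new option spellings above (and gnu17 becoming the C default in c-opts.c) the most easily observable difference is the language version macro:

#include <stdio.h>

int
main (void)
{
  /* gcc -std=c17 test.c   (or -std=gnu17, now the default for C) */
#if __STDC_VERSION__ >= 201710L
  puts ("compiled as C17 or later");
#else
  puts ("compiled as C11 or earlier");
#endif
  return 0;
}
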
diff --git a/gcc/c/ChangeLog b/gcc/c/ChangeLog
index 1f697f17f99..60feeea9022 100644
--- a/gcc/c/ChangeLog
+++ b/gcc/c/ChangeLog
@@ -1,3 +1,47 @@
+2017-10-31 David Malcolm <dmalcolm@redhat.com>
+
+ * c-decl.c (implicit_decl_warning): Update for renaming of
+ pedwarn_at_rich_loc and warning_at_rich_loc.
+ (implicitly_declare): Likewise for renaming of inform_at_rich_loc.
+ (undeclared_variable): Likewise for renaming of error_at_rich_loc.
+ * c-parser.c (c_parser_declaration_or_fndef): Likewise.
+ (c_parser_struct_or_union_specifier): Likewise for renaming of
+ pedwarn_at_rich_loc.
+ (c_parser_parameter_declaration): Likewise for renaming of
+ error_at_rich_loc.
+ * c-typeck.c (build_component_ref): Likewise.
+ (build_unary_op): Likewise for renaming of inform_at_rich_loc.
+ (pop_init_level): Likewise for renaming of warning_at_rich_loc.
+ (set_init_label): Likewise for renaming of error_at_rich_loc.
+
+2017-10-30 Richard Biener <rguenther@suse.de>
+
+ * gimple-parser.c (c_parser_gimple_statement): Parse conditional
+ stmts.
+
+2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
+
+ * c-decl.c (header_for_builtin_fn): Add support for copysign, fma,
+ fmax, fmin, and sqrt _Float<N> and _Float<N>X variants.
+
+2017-10-25 David Malcolm <dmalcolm@redhat.com>
+
+ PR c/7356
+ * c-parser.c (c_parser_declaration_or_fndef): Detect missing
+ semicolons.
+
+2017-10-25 Jakub Jelinek <jakub@redhat.com>
+
+ PR libstdc++/81706
+ * c-decl.c (merge_decls): Copy "omp declare simd" attributes from
+ newdecl to corresponding __builtin_ if any.
+
+2017-10-24 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82466
+ * c-decl.c (diagnose_mismatched_decls): Use
+ OPT_Wbuiltin_declaration_mismatch.
+
2017-10-12 David Malcolm <dmalcolm@redhat.com>
* c-parser.c (c_parser_require): Add "type_is_unique" param and
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 26b34ab3e50..d95a2b6ea4f 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -1837,7 +1837,8 @@ diagnose_mismatched_decls (tree newdecl, tree olddecl,
locate_old_decl (olddecl);
}
else if (TREE_PUBLIC (newdecl))
- warning (0, "built-in function %q+D declared as non-function",
+ warning (OPT_Wbuiltin_declaration_mismatch,
+ "built-in function %q+D declared as non-function",
newdecl);
else
warning (OPT_Wshadow, "declaration of %q+D shadows "
@@ -2569,6 +2570,8 @@ merge_decls (tree newdecl, tree olddecl, tree newtype, tree oldtype)
set_builtin_decl_declared_p (fncode, true);
break;
}
+
+ copy_attributes_to_builtin (newdecl);
}
}
else
@@ -3116,10 +3119,10 @@ implicit_decl_warning (location_t loc, tree id, tree olddecl)
{
gcc_rich_location richloc (loc);
richloc.add_fixit_replace (hint);
- warned = pedwarn_at_rich_loc
- (&richloc, OPT_Wimplicit_function_declaration,
- "implicit declaration of function %qE; did you mean %qs?",
- id, hint);
+ warned = pedwarn (&richloc, OPT_Wimplicit_function_declaration,
+ "implicit declaration of function %qE;"
+ " did you mean %qs?",
+ id, hint);
}
else
warned = pedwarn (loc, OPT_Wimplicit_function_declaration,
@@ -3129,7 +3132,7 @@ implicit_decl_warning (location_t loc, tree id, tree olddecl)
{
gcc_rich_location richloc (loc);
richloc.add_fixit_replace (hint);
- warned = warning_at_rich_loc
+ warned = warning_at
(&richloc, OPT_Wimplicit_function_declaration,
G_("implicit declaration of function %qE; did you mean %qs?"),
id, hint);
@@ -3160,6 +3163,7 @@ header_for_builtin_fn (enum built_in_function fcode)
CASE_FLT_FN (BUILT_IN_CBRT):
CASE_FLT_FN (BUILT_IN_CEIL):
CASE_FLT_FN (BUILT_IN_COPYSIGN):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_COPYSIGN):
CASE_FLT_FN (BUILT_IN_COS):
CASE_FLT_FN (BUILT_IN_COSH):
CASE_FLT_FN (BUILT_IN_ERF):
@@ -3168,11 +3172,15 @@ header_for_builtin_fn (enum built_in_function fcode)
CASE_FLT_FN (BUILT_IN_EXP2):
CASE_FLT_FN (BUILT_IN_EXPM1):
CASE_FLT_FN (BUILT_IN_FABS):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_FABS):
CASE_FLT_FN (BUILT_IN_FDIM):
CASE_FLT_FN (BUILT_IN_FLOOR):
CASE_FLT_FN (BUILT_IN_FMA):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
CASE_FLT_FN (BUILT_IN_FMAX):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMAX):
CASE_FLT_FN (BUILT_IN_FMIN):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMIN):
CASE_FLT_FN (BUILT_IN_FMOD):
CASE_FLT_FN (BUILT_IN_FREXP):
CASE_FLT_FN (BUILT_IN_HYPOT):
@@ -3204,6 +3212,7 @@ header_for_builtin_fn (enum built_in_function fcode)
CASE_FLT_FN (BUILT_IN_SINH):
CASE_FLT_FN (BUILT_IN_SINCOS):
CASE_FLT_FN (BUILT_IN_SQRT):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_SQRT):
CASE_FLT_FN (BUILT_IN_TAN):
CASE_FLT_FN (BUILT_IN_TANH):
CASE_FLT_FN (BUILT_IN_TGAMMA):
@@ -3392,10 +3401,9 @@ implicitly_declare (location_t loc, tree functionid)
{
rich_location richloc (line_table, loc);
maybe_add_include_fixit (&richloc, header);
- inform_at_rich_loc
- (&richloc,
- "include %qs or provide a declaration of %qD",
- header, decl);
+ inform (&richloc,
+ "include %qs or provide a declaration of %qD",
+ header, decl);
}
newtype = TREE_TYPE (decl);
}
@@ -3463,10 +3471,10 @@ undeclared_variable (location_t loc, tree id)
{
gcc_rich_location richloc (loc);
richloc.add_fixit_replace (guessed_id);
- error_at_rich_loc (&richloc,
- "%qE undeclared here (not in a function);"
- " did you mean %qs?",
- id, guessed_id);
+ error_at (&richloc,
+ "%qE undeclared here (not in a function);"
+ " did you mean %qs?",
+ id, guessed_id);
}
else
error_at (loc, "%qE undeclared here (not in a function)", id);
@@ -3481,11 +3489,10 @@ undeclared_variable (location_t loc, tree id)
{
gcc_rich_location richloc (loc);
richloc.add_fixit_replace (guessed_id);
- error_at_rich_loc
- (&richloc,
- "%qE undeclared (first use in this function);"
- " did you mean %qs?",
- id, guessed_id);
+ error_at (&richloc,
+ "%qE undeclared (first use in this function);"
+ " did you mean %qs?",
+ id, guessed_id);
}
else
error_at (loc, "%qE undeclared (first use in this function)", id);
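
With the _Float<N> cases added to header_for_builtin_fn and the renamed inform overload, an implicit use of a known libc builtin keeps printing the include hint; for example:

/* No #include <math.h> on purpose: -Wimplicit-function-declaration
   warns, and the note suggests
   "include '<math.h>' or provide a declaration of 'sqrt'".  */
double
root (double x)
{
  return sqrt (x);
}
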
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 6b843247911..7bca5f1a2a7 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1785,26 +1785,26 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
{
/* This is not C++ with its implicit typedef. */
richloc.add_fixit_insert_before ("struct ");
- error_at_rich_loc (&richloc,
- "unknown type name %qE;"
- " use %<struct%> keyword to refer to the type",
- name);
+ error_at (&richloc,
+ "unknown type name %qE;"
+ " use %<struct%> keyword to refer to the type",
+ name);
}
else if (tag_exists_p (UNION_TYPE, name))
{
richloc.add_fixit_insert_before ("union ");
- error_at_rich_loc (&richloc,
- "unknown type name %qE;"
- " use %<union%> keyword to refer to the type",
- name);
+ error_at (&richloc,
+ "unknown type name %qE;"
+ " use %<union%> keyword to refer to the type",
+ name);
}
else if (tag_exists_p (ENUMERAL_TYPE, name))
{
richloc.add_fixit_insert_before ("enum ");
- error_at_rich_loc (&richloc,
- "unknown type name %qE;"
- " use %<enum%> keyword to refer to the type",
- name);
+ error_at (&richloc,
+ "unknown type name %qE;"
+ " use %<enum%> keyword to refer to the type",
+ name);
}
else
{
@@ -1812,9 +1812,9 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
if (hint)
{
richloc.add_fixit_replace (hint);
- error_at_rich_loc (&richloc,
- "unknown type name %qE; did you mean %qs?",
- name, hint);
+ error_at (&richloc,
+ "unknown type name %qE; did you mean %qs?",
+ name, hint);
}
else
error_at (here, "unknown type name %qE", name);
@@ -2241,11 +2241,37 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
}
if (!start_function (specs, declarator, all_prefix_attrs))
{
- /* This can appear in many cases looking nothing like a
- function definition, so we don't give a more specific
- error suggesting there was one. */
- c_parser_error (parser, "expected %<=%>, %<,%>, %<;%>, %<asm%> "
- "or %<__attribute__%>");
+ /* At this point we've consumed:
+ declaration-specifiers declarator
+ and the next token isn't CPP_EQ, CPP_COMMA, CPP_SEMICOLON,
+ RID_ASM, RID_ATTRIBUTE, or RID_IN,
+ but the
+ declaration-specifiers declarator
+ aren't grokkable as a function definition, so we have
+ an error. */
+ gcc_assert (!c_parser_next_token_is (parser, CPP_SEMICOLON));
+ if (c_parser_next_token_starts_declspecs (parser))
+ {
+ /* If we have
+ declaration-specifiers declarator decl-specs
+ then assume we have a missing semicolon, which would
+ give us:
+ declaration-specifiers declarator decl-specs
+ ^
+ ;
+ <~~~~~~~~~ declaration ~~~~~~~~~~>
+ Use c_parser_require to get an error with a fix-it hint. */
+ c_parser_require (parser, CPP_SEMICOLON, "expected %<;%>");
+ parser->error = false;
+ }
+ else
+ {
+ /* This can appear in many cases looking nothing like a
+ function definition, so we don't give a more specific
+ error suggesting there was one. */
+ c_parser_error (parser, "expected %<=%>, %<,%>, %<;%>, %<asm%> "
+ "or %<__attribute__%>");
+ }
if (nested)
c_pop_function_context ();
break;
@@ -3142,9 +3168,8 @@ c_parser_struct_or_union_specifier (c_parser *parser)
= c_parser_peek_token (parser)->location;
gcc_rich_location richloc (semicolon_loc);
richloc.add_fixit_remove ();
- pedwarn_at_rich_loc
- (&richloc, OPT_Wpedantic,
- "extra semicolon in struct or union specified");
+ pedwarn (&richloc, OPT_Wpedantic,
+ "extra semicolon in struct or union specified");
c_parser_consume_token (parser);
continue;
}
@@ -4047,9 +4072,9 @@ c_parser_parameter_declaration (c_parser *parser, tree attrs)
{
gcc_rich_location richloc (token->location);
richloc.add_fixit_replace (hint);
- error_at_rich_loc (&richloc,
- "unknown type name %qE; did you mean %qs?",
- token->value, hint);
+ error_at (&richloc,
+ "unknown type name %qE; did you mean %qs?",
+ token->value, hint);
}
else
error_at (token->location, "unknown type name %qE", token->value);
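
A sketch of the PR c/7356 case the new branch handles: a declaration followed directly by more declaration specifiers now gets a targeted "expected ';'" error with a fix-it at the end of the first declaration, instead of the generic message (the code is deliberately invalid):

int first_var        /* <- missing ';' is reported and suggested here */
int second_var;
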
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index ad980548a74..e28dfc2884e 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -2406,10 +2406,9 @@ build_component_ref (location_t loc, tree datum, tree component,
gcc_rich_location rich_loc (reported_loc);
if (component_loc != UNKNOWN_LOCATION)
rich_loc.add_fixit_misspelled_id (component_loc, guessed_id);
- error_at_rich_loc
- (&rich_loc,
- "%qT has no member named %qE; did you mean %qE?",
- type, component, guessed_id);
+ error_at (&rich_loc,
+ "%qT has no member named %qE; did you mean %qE?",
+ type, component, guessed_id);
}
else
error_at (loc, "%qT has no member named %qE", type, component);
@@ -2483,9 +2482,9 @@ build_component_ref (location_t loc, tree datum, tree component,
rich_location richloc (line_table, loc);
/* "loc" should be the "." token. */
richloc.add_fixit_replace ("->");
- error_at_rich_loc (&richloc,
- "%qE is a pointer; did you mean to use %<->%>?",
- datum);
+ error_at (&richloc,
+ "%qE is a pointer; did you mean to use %<->%>?",
+ datum);
return error_mark_node;
}
else if (code != ERROR_MARK)
@@ -4276,8 +4275,7 @@ build_unary_op (location_t location, enum tree_code code, tree xarg,
{
gcc_rich_location richloc (location);
richloc.add_fixit_insert_before (location, "!");
- inform_at_rich_loc (&richloc, "did you mean to use logical "
- "not?");
+ inform (&richloc, "did you mean to use logical not?");
}
if (!noconvert)
arg = default_conversion (arg);
@@ -8256,9 +8254,9 @@ pop_init_level (location_t loc, int implicit,
&& !constructor_zeroinit)
{
gcc_assert (initializer_stack->missing_brace_richloc);
- warning_at_rich_loc (initializer_stack->missing_brace_richloc,
- OPT_Wmissing_braces,
- "missing braces around initializer");
+ warning_at (initializer_stack->missing_brace_richloc,
+ OPT_Wmissing_braces,
+ "missing braces around initializer");
}
/* Warn when some struct elements are implicitly initialized to zero. */
@@ -8580,10 +8578,9 @@ set_init_label (location_t loc, tree fieldname, location_t fieldname_loc,
{
gcc_rich_location rich_loc (fieldname_loc);
rich_loc.add_fixit_misspelled_id (fieldname_loc, guessed_id);
- error_at_rich_loc
- (&rich_loc,
- "%qT has no member named %qE; did you mean %qE?",
- constructor_type, fieldname, guessed_id);
+ error_at (&rich_loc,
+ "%qT has no member named %qE; did you mean %qE?",
+ constructor_type, fieldname, guessed_id);
}
else
error_at (fieldname_loc, "%qT has no member named %qE",
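
The error_at calls above correspond to user errors like these (deliberately invalid code, shown for the fix-it hints they produce through the renamed overload):

struct color { int red; int green; int blue; };

int
wrong_access (struct color *c)
{
  return c.red;     /* 'c' is a pointer; did you mean to use '->'? */
}

int
misspelled (struct color *c)
{
  return c->grren;  /* no member named 'grren'; did you mean 'green'? */
}
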
diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
index ab335b6e78c..aea675ffabb 100644
--- a/gcc/c/gimple-parser.c
+++ b/gcc/c/gimple-parser.c
@@ -276,7 +276,7 @@ c_parser_gimple_statement (c_parser *parser, gimple_seq *seq)
&& TREE_CODE (lhs.value) == CALL_EXPR)
{
gimple *call;
- call = gimple_build_call_from_tree (lhs.value);
+ call = gimple_build_call_from_tree (lhs.value, NULL);
gimple_seq_add_stmt (seq, call);
gimple_set_location (call, loc);
return;
@@ -407,7 +407,7 @@ c_parser_gimple_statement (c_parser *parser, gimple_seq *seq)
rhs = c_parser_gimple_unary_expression (parser);
if (rhs.value != error_mark_node)
{
- gimple *call = gimple_build_call_from_tree (rhs.value);
+ gimple *call = gimple_build_call_from_tree (rhs.value, NULL);
gimple_call_set_lhs (call, lhs.value);
gimple_seq_add_stmt (seq, call);
gimple_set_location (call, loc);
@@ -419,6 +419,23 @@ c_parser_gimple_statement (c_parser *parser, gimple_seq *seq)
if (lhs.value != error_mark_node
&& rhs.value != error_mark_node)
{
+ /* If we parsed a comparison and the next token is a '?' then
+ parse a conditional expression. */
+ if (COMPARISON_CLASS_P (rhs.value)
+ && c_parser_next_token_is (parser, CPP_QUERY))
+ {
+ struct c_expr trueval, falseval;
+ c_parser_consume_token (parser);
+ trueval = c_parser_gimple_postfix_expression (parser);
+ falseval.set_error ();
+ if (c_parser_require (parser, CPP_COLON, "expected %<:%>"))
+ falseval = c_parser_gimple_postfix_expression (parser);
+ if (trueval.value == error_mark_node
+ || falseval.value == error_mark_node)
+ return;
+ rhs.value = build3_loc (loc, COND_EXPR, TREE_TYPE (trueval.value),
+ rhs.value, trueval.value, falseval.value);
+ }
assign = gimple_build_assign (lhs.value, rhs.value);
gimple_seq_add_stmt (seq, assign);
gimple_set_location (assign, loc);
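
A rough sketch of what the new '?:' branch accepts in the GIMPLE front end, assuming the usual -fgimple/__GIMPLE testcase conventions (the exact surface syntax is an assumption here):

/* Compile with -fgimple.  The right-hand side below is parsed as a
   comparison followed by '?', yielding a COND_EXPR assignment.  */
int __GIMPLE ()
gimple_min (int a, int b)
{
  int res;
  res = a < b ? a : b;
  return res;
}
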
diff --git a/gcc/calls.c b/gcc/calls.c
index ff9724358c5..477bc369036 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -444,7 +444,7 @@ emit_call_1 (rtx funexp, tree fntree ATTRIBUTE_UNUSED, tree fndecl ATTRIBUTE_UNU
if no arguments are actually popped. If the target does not have
"call" or "call_value" insns, then we must use the popping versions
even if the call has no arguments to pop. */
- else if (maybe_nonzero (n_popped)
+ else if (may_ne (n_popped, 0)
|| !(valreg
? targetm.have_call_value ()
: targetm.have_call ()))
@@ -523,7 +523,7 @@ emit_call_1 (rtx funexp, tree fntree ATTRIBUTE_UNUSED, tree fndecl ATTRIBUTE_UNU
if the context of the call as a whole permits. */
inhibit_defer_pop = old_inhibit_defer_pop;
- if (maybe_nonzero (n_popped))
+ if (may_ne (n_popped, 0))
{
if (!already_popped)
CALL_INSN_FUNCTION_USAGE (call_insn)
@@ -555,7 +555,7 @@ emit_call_1 (rtx funexp, tree fntree ATTRIBUTE_UNUSED, tree fndecl ATTRIBUTE_UNU
If returning from the subroutine does pop the args, indicate that the
stack pointer will be changed. */
- if (maybe_nonzero (rounded_stack_size))
+ if (may_ne (rounded_stack_size, 0))
{
if (ecf_flags & ECF_NORETURN)
/* Just pretend we did the pop. */
@@ -578,7 +578,7 @@ emit_call_1 (rtx funexp, tree fntree ATTRIBUTE_UNUSED, tree fndecl ATTRIBUTE_UNU
??? It will be worthwhile to enable combine_stack_adjustments even for
such machines. */
- else if (maybe_nonzero (n_popped))
+ else if (may_ne (n_popped, 0))
anti_adjust_stack (gen_int_mode (n_popped, Pmode));
}
@@ -644,16 +644,9 @@ special_function_p (const_tree fndecl, int flags)
flags |= ECF_RETURNS_TWICE;
}
- if (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
- switch (DECL_FUNCTION_CODE (fndecl))
- {
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
- flags |= ECF_MAY_BE_ALLOCA;
- break;
- default:
- break;
- }
+ if (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
+ && ALLOCA_FUNCTION_CODE_P (DECL_FUNCTION_CODE (fndecl)))
+ flags |= ECF_MAY_BE_ALLOCA;
return flags;
}
@@ -735,8 +728,7 @@ gimple_alloca_call_p (const gimple *stmt)
if (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
switch (DECL_FUNCTION_CODE (fndecl))
{
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
return true;
default:
break;
@@ -756,8 +748,7 @@ alloca_call_p (const_tree exp)
&& DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
switch (DECL_FUNCTION_CODE (fndecl))
{
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
return true;
default:
break;
@@ -1857,6 +1848,8 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
copy = allocate_dynamic_stack_space (size_rtx,
TYPE_ALIGN (type),
TYPE_ALIGN (type),
+ max_int_size_in_bytes
+ (type),
true);
copy = gen_rtx_MEM (BLKmode, copy);
set_mem_attributes (copy, type, 1);
@@ -2186,7 +2179,7 @@ finalize_must_preallocate (int must_preallocate, int num_actuals,
+= int_size_in_bytes (TREE_TYPE (args[i].tree_value));
}
- if (maybe_nonzero (args_size->constant)
+ if (may_ne (args_size->constant, 0)
&& may_ge (copy_to_evaluate_size * 2, args_size->constant))
must_preallocate = 1;
}
@@ -2465,7 +2458,7 @@ mem_might_overlap_already_clobbered_arg_p (rtx addr, poly_uint64 size)
else if (!poly_int_rtx_p (val, &i))
return true;
- if (known_zero (size))
+ if (must_eq (size, 0U))
return false;
if (STACK_GROWS_DOWNWARD)
@@ -2839,7 +2832,7 @@ shift_return_value (machine_mode mode, bool left_p, rtx value)
machine_mode value_mode = GET_MODE (value);
poly_int64 shift = GET_MODE_BITSIZE (value_mode) - GET_MODE_BITSIZE (mode);
- if (known_zero (shift))
+ if (must_eq (shift, 0))
return false;
/* Use ashr rather than lshr for right shifts. This is for the benefit
@@ -3386,7 +3379,7 @@ expand_call (tree exp, rtx target, int ignore)
structure_value_addr))
&& (args_size.var
|| (!ACCUMULATE_OUTGOING_ARGS
- && maybe_nonzero (args_size.constant))))
+ && may_ne (args_size.constant, 0))))
structure_value_addr = copy_to_reg (structure_value_addr);
/* Tail calls can make things harder to debug, and we've traditionally
@@ -3503,9 +3496,9 @@ expand_call (tree exp, rtx target, int ignore)
Also do the adjustments before a throwing call, otherwise
exception handling can fail; PR 19225. */
if (may_ge (pending_stack_adjust, 32)
- || (maybe_nonzero (pending_stack_adjust)
+ || (may_ne (pending_stack_adjust, 0)
&& (flags & ECF_MAY_BE_ALLOCA))
- || (maybe_nonzero (pending_stack_adjust)
+ || (may_ne (pending_stack_adjust, 0)
&& flag_exceptions && !(flags & ECF_NOTHROW))
|| pass == 0)
do_pending_stack_adjust ();
@@ -3686,7 +3679,7 @@ expand_call (tree exp, rtx target, int ignore)
/* Special case this because overhead of `push_block' in
this case is non-trivial. */
- if (known_zero (needed))
+ if (must_eq (needed, 0))
argblock = virtual_outgoing_args_rtx;
else
{
@@ -3745,8 +3738,8 @@ expand_call (tree exp, rtx target, int ignore)
/* We can pass TRUE as the 4th argument because we just
saved the stack pointer and will restore it right after
the call. */
- allocate_dynamic_stack_space (push_size, 0,
- BIGGEST_ALIGNMENT, true);
+ allocate_dynamic_stack_space (push_size, 0, BIGGEST_ALIGNMENT,
+ -1, true);
}
/* If argument evaluation might modify the stack pointer,
@@ -3781,7 +3774,7 @@ expand_call (tree exp, rtx target, int ignore)
{
/* When the stack adjustment is pending, we get better code
by combining the adjustments. */
- if (maybe_nonzero (pending_stack_adjust)
+ if (may_ne (pending_stack_adjust, 0)
&& ! inhibit_defer_pop
&& (combine_pending_stack_adjustment_and_call
(&pending_stack_adjust,
diff --git a/gcc/cfg.c b/gcc/cfg.c
index 01e68aeda51..062788afdc0 100644
--- a/gcc/cfg.c
+++ b/gcc/cfg.c
@@ -68,6 +68,7 @@ init_flow (struct function *the_fun)
if (!the_fun->cfg)
the_fun->cfg = ggc_cleared_alloc<control_flow_graph> ();
n_edges_for_fn (the_fun) = 0;
+ the_fun->cfg->count_max = profile_count::uninitialized ();
ENTRY_BLOCK_PTR_FOR_FN (the_fun)
= alloc_block ();
ENTRY_BLOCK_PTR_FOR_FN (the_fun)->index = ENTRY_BLOCK;
@@ -263,7 +264,6 @@ unchecked_make_edge (basic_block src, basic_block dst, int flags)
e = ggc_cleared_alloc<edge_def> ();
n_edges_for_fn (cfun)++;
- e->count = profile_count::uninitialized ();
e->probability = profile_probability::uninitialized ();
e->src = src;
e->dest = dst;
@@ -334,7 +334,6 @@ make_single_succ_edge (basic_block src, basic_block dest, int flags)
edge e = make_edge (src, dest, flags);
e->probability = profile_probability::always ();
- e->count = src->count;
return e;
}
@@ -445,37 +444,18 @@ check_bb_profile (basic_block bb, FILE * file, int indent)
";; %sInvalid sum of outgoing probabilities %.1f%%\n",
s_indent, isum * 100.0 / REG_BR_PROB_BASE);
}
- profile_count lsum = profile_count::zero ();
- FOR_EACH_EDGE (e, ei, bb->succs)
- lsum += e->count;
- if (EDGE_COUNT (bb->succs) && lsum.differs_from_p (bb->count))
- {
- fprintf (file, ";; %sInvalid sum of outgoing counts ",
- s_indent);
- lsum.dump (file);
- fprintf (file, ", should be ");
- bb->count.dump (file);
- fprintf (file, "\n");
- }
}
}
if (bb != ENTRY_BLOCK_PTR_FOR_FN (fun))
{
- int sum = 0;
- FOR_EACH_EDGE (e, ei, bb->preds)
- sum += EDGE_FREQUENCY (e);
- if (abs (sum - bb->frequency) > 100)
- fprintf (file,
- ";; %sInvalid sum of incoming frequencies %i, should be %i\n",
- s_indent, sum, bb->frequency);
- profile_count lsum = profile_count::zero ();
+ profile_count sum = profile_count::zero ();
FOR_EACH_EDGE (e, ei, bb->preds)
- lsum += e->count;
- if (lsum.differs_from_p (bb->count))
+ sum += e->count ();
+ if (sum.differs_from_p (bb->count))
{
fprintf (file, ";; %sInvalid sum of incoming counts ",
s_indent);
- lsum.dump (file);
+ sum.dump (file);
fprintf (file, ", should be ");
bb->count.dump (file);
fprintf (file, "\n");
@@ -522,10 +502,10 @@ dump_edge_info (FILE *file, edge e, dump_flags_t flags, int do_succ)
fprintf (file, "] ");
}
- if (e->count.initialized_p () && do_details)
+ if (e->count ().initialized_p () && do_details)
{
fputs (" count:", file);
- e->count.dump (file);
+ e->count ().dump (file);
}
if (e->flags && do_details)
@@ -777,7 +757,6 @@ dump_bb_info (FILE *outf, basic_block bb, int indent, dump_flags_t flags,
fputs (", count ", outf);
bb->count.dump (outf);
}
- fprintf (outf, ", freq %i", bb->frequency);
if (maybe_hot_bb_p (fun, bb))
fputs (", maybe hot", outf);
if (probably_never_executed_bb_p (fun, bb))
@@ -869,15 +848,15 @@ brief_dump_cfg (FILE *file, dump_flags_t flags)
}
}
-/* An edge originally destinating BB of FREQUENCY and COUNT has been proved to
+/* An edge originally destinating BB of COUNT has been proved to
leave the block by TAKEN_EDGE. Update profile of BB such that edge E can be
redirected to destination of TAKEN_EDGE.
This function may leave the profile inconsistent in the case TAKEN_EDGE
- frequency or count is believed to be lower than FREQUENCY or COUNT
+ frequency or count is believed to be lower than COUNT
respectively. */
void
-update_bb_profile_for_threading (basic_block bb, int edge_frequency,
+update_bb_profile_for_threading (basic_block bb,
profile_count count, edge taken_edge)
{
edge c;
@@ -892,16 +871,10 @@ update_bb_profile_for_threading (basic_block bb, int edge_frequency,
}
bb->count -= count;
- bb->frequency -= edge_frequency;
- if (bb->frequency < 0)
- bb->frequency = 0;
-
/* Compute the probability of TAKEN_EDGE being reached via threaded edge.
Watch for overflows. */
- if (bb->frequency)
- /* FIXME: We should get edge frequency as count. */
- prob = profile_probability::probability_in_gcov_type
- (edge_frequency, bb->frequency);
+ if (bb->count.nonzero_p ())
+ prob = count.probability_in (bb->count);
else
prob = profile_probability::never ();
if (prob > taken_edge->probability)
@@ -925,9 +898,9 @@ update_bb_profile_for_threading (basic_block bb, int edge_frequency,
if (prob == profile_probability::never ())
{
if (dump_file)
- fprintf (dump_file, "Edge frequencies of bb %i has been reset, "
- "frequency of block should end up being 0, it is %i\n",
- bb->index, bb->frequency);
+ fprintf (dump_file, "Edge probabilities of bb %i has been reset, "
+ "count of block should end up being 0, it is non-zero\n",
+ bb->index);
EDGE_SUCC (bb, 0)->probability = profile_probability::guessed_always ();
ei = ei_start (bb->succs);
ei_next (&ei);
@@ -941,10 +914,6 @@ update_bb_profile_for_threading (basic_block bb, int edge_frequency,
}
gcc_assert (bb == taken_edge->src);
- if (dump_file && taken_edge->count < count)
- fprintf (dump_file, "edge %i->%i count became negative after threading",
- taken_edge->src->index, taken_edge->dest->index);
- taken_edge->count -= count;
}
/* Multiply all frequencies of basic blocks in array BBS of length NBBS
@@ -953,7 +922,6 @@ void
scale_bbs_frequencies_int (basic_block *bbs, int nbbs, int num, int den)
{
int i;
- edge e;
if (num < 0)
num = 0;
@@ -973,21 +941,10 @@ scale_bbs_frequencies_int (basic_block *bbs, int nbbs, int num, int den)
for (i = 0; i < nbbs; i++)
{
- edge_iterator ei;
- bbs[i]->frequency = RDIV (bbs[i]->frequency * num, den);
- /* Make sure the frequencies do not grow over BB_FREQ_MAX. */
- if (bbs[i]->frequency > BB_FREQ_MAX)
- bbs[i]->frequency = BB_FREQ_MAX;
bbs[i]->count = bbs[i]->count.apply_scale (num, den);
- FOR_EACH_EDGE (e, ei, bbs[i]->succs)
- e->count = e->count.apply_scale (num, den);
}
}
-/* numbers smaller than this value are safe to multiply without getting
- 64bit overflow. */
-#define MAX_SAFE_MULTIPLIER (1 << (sizeof (int64_t) * 4 - 1))
-
/* Multiply all frequencies of basic blocks in array BBS of length NBBS
by NUM/DEN, in gcov_type arithmetic. More accurate than previous
function but considerably slower. */
@@ -996,38 +953,9 @@ scale_bbs_frequencies_gcov_type (basic_block *bbs, int nbbs, gcov_type num,
gcov_type den)
{
int i;
- edge e;
- gcov_type fraction = RDIV (num * 65536, den);
-
- gcc_assert (fraction >= 0);
- if (num < MAX_SAFE_MULTIPLIER)
- for (i = 0; i < nbbs; i++)
- {
- edge_iterator ei;
- bbs[i]->frequency = RDIV (bbs[i]->frequency * num, den);
- if (bbs[i]->count <= MAX_SAFE_MULTIPLIER)
- bbs[i]->count = bbs[i]->count.apply_scale (num, den);
- else
- bbs[i]->count = bbs[i]->count.apply_scale (fraction, 65536);
- FOR_EACH_EDGE (e, ei, bbs[i]->succs)
- if (bbs[i]->count <= MAX_SAFE_MULTIPLIER)
- e->count = e->count.apply_scale (num, den);
- else
- e->count = e->count.apply_scale (fraction, 65536);
- }
- else
- for (i = 0; i < nbbs; i++)
- {
- edge_iterator ei;
- if (sizeof (gcov_type) > sizeof (int))
- bbs[i]->frequency = RDIV (bbs[i]->frequency * num, den);
- else
- bbs[i]->frequency = RDIV (bbs[i]->frequency * fraction, 65536);
- bbs[i]->count = bbs[i]->count.apply_scale (fraction, 65536);
- FOR_EACH_EDGE (e, ei, bbs[i]->succs)
- e->count = e->count.apply_scale (fraction, 65536);
- }
+ for (i = 0; i < nbbs; i++)
+ bbs[i]->count = bbs[i]->count.apply_scale (num, den);
}
/* Multiply all frequencies of basic blocks in array BBS of length NBBS
@@ -1038,17 +966,9 @@ scale_bbs_frequencies_profile_count (basic_block *bbs, int nbbs,
profile_count num, profile_count den)
{
int i;
- edge e;
-
- for (i = 0; i < nbbs; i++)
- {
- edge_iterator ei;
- bbs[i]->frequency = RDIV (bbs[i]->frequency * num.to_gcov_type (),
- den.to_gcov_type ());
+ if (num == profile_count::zero () || den.nonzero_p ())
+ for (i = 0; i < nbbs; i++)
bbs[i]->count = bbs[i]->count.apply_scale (num, den);
- FOR_EACH_EDGE (e, ei, bbs[i]->succs)
- e->count = e->count.apply_scale (num, den);
- }
}
/* Multiply all frequencies of basic blocks in array BBS of length NBBS
@@ -1059,16 +979,9 @@ scale_bbs_frequencies (basic_block *bbs, int nbbs,
profile_probability p)
{
int i;
- edge e;
for (i = 0; i < nbbs; i++)
- {
- edge_iterator ei;
- bbs[i]->frequency = p.apply (bbs[i]->frequency);
- bbs[i]->count = bbs[i]->count.apply_probability (p);
- FOR_EACH_EDGE (e, ei, bbs[i]->succs)
- e->count = e->count.apply_probability (p);
- }
+ bbs[i]->count = bbs[i]->count.apply_probability (p);
}
/* Helper types for hash tables. */
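
A recurring theme in this and the following CFG files: edges no longer carry a stored execution count, and e->count () is derived from the source block's count and the edge probability. A toy model of that invariant in plain C (not GCC's real profile_count/profile_probability types):

struct block { double count; };                     /* blocks keep a count */
struct edge  { struct block *src; double prob; };   /* edges keep only a probability */

/* The edge count is recomputed on demand, mirroring the new edge count API.  */
static double
edge_count (const struct edge *e)
{
  return e->src->count * e->prob;
}
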
diff --git a/gcc/cfg.h b/gcc/cfg.h
index 81b243a1a9e..e8129ddb190 100644
--- a/gcc/cfg.h
+++ b/gcc/cfg.h
@@ -71,6 +71,9 @@ struct GTY(()) control_flow_graph {
/* Maximal number of entities in the single jumptable. Used to estimate
final flowgraph size. */
int max_jumptable_ents;
+
+ /* Maximal count of BB in function. */
+ profile_count count_max;
};
@@ -103,7 +106,7 @@ extern void debug_bb (basic_block);
extern basic_block debug_bb_n (int);
extern void dump_bb_info (FILE *, basic_block, int, dump_flags_t, bool, bool);
extern void brief_dump_cfg (FILE *, dump_flags_t);
-extern void update_bb_profile_for_threading (basic_block, int, profile_count, edge);
+extern void update_bb_profile_for_threading (basic_block, profile_count, edge);
extern void scale_bbs_frequencies_int (basic_block *, int, int, int);
extern void scale_bbs_frequencies_gcov_type (basic_block *, int, gcov_type,
gcov_type);
diff --git a/gcc/cfganal.c b/gcc/cfganal.c
index 394d986c945..8bf8a53fa58 100644
--- a/gcc/cfganal.c
+++ b/gcc/cfganal.c
@@ -612,7 +612,6 @@ connect_infinite_loops_to_exit (void)
basic_block deadend_block = dfs_find_deadend (unvisited_block);
edge e = make_edge (deadend_block, EXIT_BLOCK_PTR_FOR_FN (cfun),
EDGE_FAKE);
- e->count = profile_count::zero ();
e->probability = profile_probability::never ();
dfs.add_bb (deadend_block);
}
@@ -1555,3 +1554,42 @@ single_pred_before_succ_order (void)
#undef MARK_VISITED
#undef VISITED_P
}
+
+/* Ignoring loop backedges, if BB has precisely one incoming edge then
+ return that edge. Otherwise return NULL.
+
+ When IGNORE_NOT_EXECUTABLE is true, also ignore edges that are not marked
+ as executable. */
+
+edge
+single_pred_edge_ignoring_loop_edges (basic_block bb,
+ bool ignore_not_executable)
+{
+ edge retval = NULL;
+ edge e;
+ edge_iterator ei;
+
+ FOR_EACH_EDGE (e, ei, bb->preds)
+ {
+ /* A loop back edge can be identified by the destination of
+ the edge dominating the source of the edge. */
+ if (dominated_by_p (CDI_DOMINATORS, e->src, e->dest))
+ continue;
+
+ /* We can safely ignore edges that are not executable. */
+ if (ignore_not_executable
+ && (e->flags & EDGE_EXECUTABLE) == 0)
+ continue;
+
+ /* If we have already seen a non-loop edge, then we must have
+ multiple incoming non-loop edges and thus we return NULL. */
+ if (retval)
+ return NULL;
+
+ /* This is the first non-loop incoming edge we have found. Record
+ it. */
+ retval = e;
+ }
+
+ return retval;
+}
diff --git a/gcc/cfganal.h b/gcc/cfganal.h
index 39bb5e547a5..c5cb51d9cf8 100644
--- a/gcc/cfganal.h
+++ b/gcc/cfganal.h
@@ -77,5 +77,8 @@ extern void bitmap_intersection_of_preds (sbitmap, sbitmap *, basic_block);
extern void bitmap_union_of_succs (sbitmap, sbitmap *, basic_block);
extern void bitmap_union_of_preds (sbitmap, sbitmap *, basic_block);
extern basic_block * single_pred_before_succ_order (void);
+extern edge single_incoming_edge_ignoring_loop_edges (basic_block, bool);
+extern edge single_pred_edge_ignoring_loop_edges (basic_block, bool);
+
#endif /* GCC_CFGANAL_H */
diff --git a/gcc/cfgbuild.c b/gcc/cfgbuild.c
index 62956b2a6a2..a0926752143 100644
--- a/gcc/cfgbuild.c
+++ b/gcc/cfgbuild.c
@@ -499,7 +499,6 @@ find_bb_boundaries (basic_block bb)
remove_edge (fallthru);
/* BB is unreachable at this point - we need to determine its profile
once edges are built. */
- bb->frequency = 0;
bb->count = profile_count::uninitialized ();
flow_transfer_insn = NULL;
debug_insn = NULL;
@@ -576,10 +575,8 @@ compute_outgoing_frequencies (basic_block b)
e = BRANCH_EDGE (b);
e->probability
= profile_probability::from_reg_br_prob_note (probability);
- e->count = b->count.apply_probability (e->probability);
f = FALLTHRU_EDGE (b);
f->probability = e->probability.invert ();
- f->count = b->count - e->count;
return;
}
else
@@ -591,7 +588,6 @@ compute_outgoing_frequencies (basic_block b)
{
e = single_succ_edge (b);
e->probability = profile_probability::always ();
- e->count = b->count;
return;
}
else
@@ -610,10 +606,6 @@ compute_outgoing_frequencies (basic_block b)
if (complex_edge)
guess_outgoing_edge_probabilities (b);
}
-
- if (b->count.initialized_p ())
- FOR_EACH_EDGE (e, ei, b->succs)
- e->count = b->count.apply_probability (e->probability);
}
/* Assume that some pass has inserted labels or control flow
@@ -676,18 +668,15 @@ find_many_sub_basic_blocks (sbitmap blocks)
{
bool initialized_src = false, uninitialized_src = false;
bb->count = profile_count::zero ();
- bb->frequency = 0;
FOR_EACH_EDGE (e, ei, bb->preds)
{
- if (e->count.initialized_p ())
+ if (e->count ().initialized_p ())
{
- bb->count += e->count;
+ bb->count += e->count ();
initialized_src = true;
}
else
uninitialized_src = true;
- if (e->probability.initialized_p ())
- bb->frequency += EDGE_FREQUENCY (e);
}
/* When some edges are missing with read profile, this is
most likely because RTL expansion introduced loop.
@@ -699,7 +688,7 @@ find_many_sub_basic_blocks (sbitmap blocks)
precisely once. */
if (!initialized_src
|| (uninitialized_src
- && profile_status_for_fn (cfun) != PROFILE_READ))
+ && profile_status_for_fn (cfun) < PROFILE_GUESSED))
bb->count = profile_count::uninitialized ();
}
/* If nothing changed, there is no need to create new BBs. */
diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c
index 25327d545ba..3a32938b05d 100644
--- a/gcc/cfgcleanup.c
+++ b/gcc/cfgcleanup.c
@@ -558,9 +558,7 @@ try_forward_edges (int mode, basic_block b)
else
{
/* Save the values now, as the edge may get removed. */
- profile_count edge_count = e->count;
- profile_probability edge_probability = e->probability;
- int edge_frequency;
+ profile_count edge_count = e->count ();
int n = 0;
e->goto_locus = goto_locus;
@@ -585,8 +583,6 @@ try_forward_edges (int mode, basic_block b)
/* We successfully forwarded the edge. Now update profile
data: for each edge we traversed in the chain, remove
the original edge's execution count. */
- edge_frequency = edge_probability.apply (b->frequency);
-
do
{
edge t;
@@ -596,16 +592,12 @@ try_forward_edges (int mode, basic_block b)
gcc_assert (n < nthreaded_edges);
t = threaded_edges [n++];
gcc_assert (t->src == first);
- update_bb_profile_for_threading (first, edge_frequency,
- edge_count, t);
+ update_bb_profile_for_threading (first, edge_count, t);
update_br_prob_note (first);
}
else
{
first->count -= edge_count;
- first->frequency -= edge_frequency;
- if (first->frequency < 0)
- first->frequency = 0;
/* It is possible that as the result of
threading we've removed edge as it is
threaded to the fallthru edge. Avoid
@@ -616,7 +608,6 @@ try_forward_edges (int mode, basic_block b)
t = single_succ_edge (first);
}
- t->count -= edge_count;
first = t->dest;
}
while (first != target);
@@ -2111,7 +2102,7 @@ try_crossjump_to_edge (int mode, edge e1, edge e2,
else
redirect_edges_to = osrc2;
- /* Recompute the frequencies and counts of outgoing edges. */
+ /* Recompute the counts of destinations of outgoing edges. */
FOR_EACH_EDGE (s, ei, redirect_edges_to->succs)
{
edge s2;
@@ -2130,34 +2121,23 @@ try_crossjump_to_edge (int mode, edge e1, edge e2,
break;
}
- s->count += s2->count;
-
/* Take care to update possible forwarder blocks. We verified
that there is no more than one in the chain, so we can't run
into infinite loop. */
if (FORWARDER_BLOCK_P (s->dest))
- {
- single_succ_edge (s->dest)->count += s2->count;
- s->dest->count += s2->count;
- s->dest->frequency += EDGE_FREQUENCY (s);
- }
+ s->dest->count += s->count ();
if (FORWARDER_BLOCK_P (s2->dest))
- {
- single_succ_edge (s2->dest)->count -= s2->count;
- s2->dest->count -= s2->count;
- s2->dest->frequency -= EDGE_FREQUENCY (s);
- if (s2->dest->frequency < 0)
- s2->dest->frequency = 0;
- }
+ s2->dest->count -= s->count ();
- if (!redirect_edges_to->frequency && !src1->frequency)
+ /* FIXME: Is this correct? Should be rewritten to count API. */
+ if (redirect_edges_to->count.nonzero_p () && src1->count.nonzero_p ())
s->probability = s->probability.combine_with_freq
- (redirect_edges_to->frequency,
- s2->probability, src1->frequency);
+ (redirect_edges_to->count.to_frequency (cfun),
+ s2->probability, src1->count.to_frequency (cfun));
}
- /* Adjust count and frequency for the block. An earlier jump
+ /* Adjust count for the block. An earlier jump
threading pass may have left the profile in an inconsistent
state (see update_bb_profile_for_threading) so we must be
prepared for overflows. */
@@ -2165,9 +2145,6 @@ try_crossjump_to_edge (int mode, edge e1, edge e2,
do
{
tmp->count += src1->count;
- tmp->frequency += src1->frequency;
- if (tmp->frequency > BB_FREQ_MAX)
- tmp->frequency = BB_FREQ_MAX;
if (tmp == redirect_edges_to)
break;
tmp = find_fallthru_edge (tmp->succs)->dest;
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 07b5c9df4c6..06a8af8a166 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -446,7 +446,7 @@ add_stack_var (tree decl)
v->size = tree_to_poly_uint64 (size);
/* Ensure that all variables have size, so that &a != &b for any two
variables that are simultaneously live. */
- if (known_zero (v->size))
+ if (must_eq (v->size, 0U))
v->size = 1;
v->alignb = align_local_variable (decl);
/* An alignment of zero can mightily confuse us later. */
@@ -1183,7 +1183,7 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
/* If there were any variables requiring "large" alignment, allocate
space. */
- if (maybe_nonzero (large_size) && ! large_allocation_done)
+ if (may_ne (large_size, 0U) && ! large_allocation_done)
{
poly_int64 loffset;
rtx large_allocsize;
@@ -1397,10 +1397,18 @@ expand_one_ssa_partition (tree var)
}
machine_mode reg_mode = promote_ssa_mode (var, NULL);
-
rtx x = gen_reg_rtx (reg_mode);
set_rtl (var, x);
+
+ /* For a promoted variable, X will not be used directly but wrapped in a
+ SUBREG with SUBREG_PROMOTED_VAR_P set, which means that the RTL land
+ will assume that its upper bits can be inferred from its lower bits.
+ Therefore, if X isn't initialized on every path from the entry, then
+ we must do it manually in order to fulfill the above assumption. */
+ if (reg_mode != TYPE_MODE (TREE_TYPE (var))
+ && bitmap_bit_p (SA.partitions_for_undefined_values, part))
+ emit_move_insn (x, CONST0_RTX (reg_mode));
}
/* Record the association between the RTL generated for partition PART
@@ -2521,8 +2529,7 @@ expand_gimple_cond (basic_block bb, gcond *stmt)
dest = false_edge->dest;
redirect_edge_succ (false_edge, new_bb);
false_edge->flags |= EDGE_FALLTHRU;
- new_bb->count = false_edge->count;
- new_bb->frequency = EDGE_FREQUENCY (false_edge);
+ new_bb->count = false_edge->count ();
loop_p loop = find_common_loop (bb->loop_father, dest->loop_father);
add_bb_to_loop (new_bb, loop);
if (loop->latch == bb
@@ -2648,8 +2655,7 @@ expand_call_stmt (gcall *stmt)
CALL_EXPR_RETURN_SLOT_OPT (exp) = gimple_call_return_slot_opt_p (stmt);
if (decl
&& DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL
- && (DECL_FUNCTION_CODE (decl) == BUILT_IN_ALLOCA
- || DECL_FUNCTION_CODE (decl) == BUILT_IN_ALLOCA_WITH_ALIGN))
+ && ALLOCA_FUNCTION_CODE_P (DECL_FUNCTION_CODE (decl)))
CALL_ALLOCA_FOR_VAR_P (exp) = gimple_call_alloca_for_var_p (stmt);
else
CALL_FROM_THUNK_P (exp) = gimple_call_from_thunk_p (stmt);
@@ -2673,12 +2679,28 @@ expand_call_stmt (gcall *stmt)
}
}
+ rtx_insn *before_call = get_last_insn ();
lhs = gimple_call_lhs (stmt);
if (lhs)
expand_assignment (lhs, exp, false);
else
expand_expr (exp, const0_rtx, VOIDmode, EXPAND_NORMAL);
+ /* If the gimple call is an indirect call and has 'nocf_check'
+ attribute find a generated CALL insn to mark it as no
+ control-flow verification is needed. */
+ if (gimple_call_nocf_check_p (stmt)
+ && !gimple_call_fndecl (stmt))
+ {
+ rtx_insn *last = get_last_insn ();
+ while (!CALL_P (last)
+ && last != before_call)
+ last = PREV_INSN (last);
+
+ if (last != before_call)
+ add_reg_note (last, REG_CALL_NOCF_CHECK, const0_rtx);
+ }
+
mark_transaction_restart_calls (stmt);
}
@@ -3832,20 +3854,13 @@ expand_gimple_tailcall (basic_block bb, gcall *stmt, bool *can_fallthru)
the exit block. */
probability = profile_probability::never ();
- profile_count count = profile_count::zero ();
for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
{
if (!(e->flags & (EDGE_ABNORMAL | EDGE_EH)))
{
if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
- {
- e->dest->count -= e->count;
- e->dest->frequency -= EDGE_FREQUENCY (e);
- if (e->dest->frequency < 0)
- e->dest->frequency = 0;
- }
- count += e->count;
+ e->dest->count -= e->count ();
probability += e->probability;
remove_edge (e);
}
@@ -3875,7 +3890,6 @@ expand_gimple_tailcall (basic_block bb, gcall *stmt, bool *can_fallthru)
e = make_edge (bb, EXIT_BLOCK_PTR_FOR_FN (cfun), EDGE_ABNORMAL
| EDGE_SIBCALL);
e->probability = probability;
- e->count = count;
BB_END (bb) = last;
update_bb_for_insn (bb);
@@ -4472,7 +4486,7 @@ expand_debug_expr (tree exp)
&unsignedp, &reversep, &volatilep);
rtx orig_op0;
- if (known_zero (bitsize))
+ if (must_eq (bitsize, 0))
return NULL;
orig_op0 = op0 = expand_debug_expr (tem);
@@ -4516,12 +4530,12 @@ expand_debug_expr (tree exp)
/* Bitfield. */
mode1 = smallest_int_mode_for_size (bitsize);
poly_int64 bytepos = bits_to_bytes_round_down (bitpos);
- if (maybe_nonzero (bytepos))
+ if (may_ne (bytepos, 0))
{
op0 = adjust_address_nv (op0, mode1, bytepos);
bitpos = num_trailing_bits (bitpos);
}
- else if (known_zero (bitpos)
+ else if (must_eq (bitpos, 0)
&& must_eq (bitsize, GET_MODE_BITSIZE (mode)))
op0 = adjust_address_nv (op0, mode, 0);
else if (GET_MODE (op0) != mode1)
@@ -4533,7 +4547,7 @@ expand_debug_expr (tree exp)
set_mem_attributes (op0, exp, 0);
}
- if (known_zero (bitpos) && mode == GET_MODE (op0))
+ if (must_eq (bitpos, 0) && mode == GET_MODE (op0))
return op0;
if (may_lt (bitpos, 0))
@@ -5863,7 +5877,6 @@ construct_init_block (void)
init_block = create_basic_block (NEXT_INSN (get_insns ()),
get_last_insn (),
ENTRY_BLOCK_PTR_FOR_FN (cfun));
- init_block->frequency = ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency;
init_block->count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
add_bb_to_loop (init_block, ENTRY_BLOCK_PTR_FOR_FN (cfun)->loop_father);
if (e)
@@ -5927,7 +5940,7 @@ construct_exit_block (void)
while (NEXT_INSN (head) && NOTE_P (NEXT_INSN (head)))
head = NEXT_INSN (head);
/* But make sure exit_block starts with RETURN_LABEL, otherwise the
- bb frequency counting will be confused. Any instructions before that
+ bb count counting will be confused. Any instructions before that
label are emitted for the case where PREV_BB falls through into the
exit block, so append those instructions to prev_bb in that case. */
if (NEXT_INSN (head) != return_label)
@@ -5940,7 +5953,6 @@ construct_exit_block (void)
}
}
exit_block = create_basic_block (NEXT_INSN (head), end, prev_bb);
- exit_block->frequency = EXIT_BLOCK_PTR_FOR_FN (cfun)->frequency;
exit_block->count = EXIT_BLOCK_PTR_FOR_FN (cfun)->count;
add_bb_to_loop (exit_block, EXIT_BLOCK_PTR_FOR_FN (cfun)->loop_father);
@@ -5959,12 +5971,8 @@ construct_exit_block (void)
FOR_EACH_EDGE (e2, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
if (e2 != e)
{
- e->count -= e2->count;
- exit_block->count -= e2->count;
- exit_block->frequency -= EDGE_FREQUENCY (e2);
+ exit_block->count -= e2->count ();
}
- if (exit_block->frequency < 0)
- exit_block->frequency = 0;
update_bb_for_insn (exit_block);
}
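
The expand_one_ssa_partition hunk above zero-initializes promoted variables whose SSA partition is undefined on some path, so the upper bits implied by SUBREG_PROMOTED_VAR_P are never garbage. A small example of source that creates such a partition (the uninitialized path is deliberate):

/* 'v' is assigned on only one path; on targets that promote 'short'
   to a wider register mode it now receives an explicit zero move at
   expansion time.  */
short
pick (int flag, short a)
{
  short v;
  if (flag)
    v = a;
  return v;
}
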
diff --git a/gcc/cfghooks.c b/gcc/cfghooks.c
index 18dc49a035e..4a224243e32 100644
--- a/gcc/cfghooks.c
+++ b/gcc/cfghooks.c
@@ -146,12 +146,15 @@ verify_flow_info (void)
error ("verify_flow_info: Wrong count of block %i", bb->index);
err = 1;
}
- if (bb->frequency < 0)
+ /* FIXME: Graphite and SJLJ and target code still tend to produce
+ edges with no probability. */
+ if (profile_status_for_fn (cfun) >= PROFILE_GUESSED
+ && !bb->count.initialized_p () && !flag_graphite && 0)
{
- error ("verify_flow_info: Wrong frequency of block %i %i",
- bb->index, bb->frequency);
+ error ("verify_flow_info: Missing count of block %i", bb->index);
err = 1;
}
+
FOR_EACH_EDGE (e, ei, bb->succs)
{
if (last_visited [e->dest->index] == bb)
@@ -160,15 +163,18 @@ verify_flow_info (void)
e->src->index, e->dest->index);
err = 1;
}
- if (!e->probability.verify ())
+ /* FIXME: Graphite and SJLJ and target code still tend to produce
+ edges with no probability. */
+ if (profile_status_for_fn (cfun) >= PROFILE_GUESSED
+ && !e->probability.initialized_p () && !flag_graphite && 0)
{
- error ("verify_flow_info: Wrong probability of edge %i->%i",
- e->src->index, e->dest->index);
+ error ("Uninitialized probability of edge %i->%i", e->src->index,
+ e->dest->index);
err = 1;
}
- if (!e->count.verify ())
+ if (!e->probability.verify ())
{
- error ("verify_flow_info: Wrong count of edge %i->%i",
+ error ("verify_flow_info: Wrong probability of edge %i->%i",
e->src->index, e->dest->index);
err = 1;
}
@@ -311,7 +317,6 @@ dump_bb_for_graph (pretty_printer *pp, basic_block bb)
/* TODO: Add pretty printer for counter. */
if (bb->count.initialized_p ())
pp_printf (pp, "COUNT:" "%" PRId64, bb->count.to_gcov_type ());
- pp_printf (pp, " FREQ:%i |", bb->frequency);
pp_write_text_to_stream (pp);
if (!(dump_flags & TDF_SLIM))
cfg_hooks->dump_bb_for_graph (pp, bb);
@@ -443,7 +448,6 @@ redirect_edge_succ_nodup (edge e, basic_block new_succ)
{
s->flags |= e->flags;
s->probability += e->probability;
- s->count += e->count;
/* FIXME: This should be called via a hook and only for IR_GIMPLE. */
redirect_edge_var_map_dup (s, e);
remove_edge (e);
@@ -510,7 +514,6 @@ split_block_1 (basic_block bb, void *i)
return NULL;
new_bb->count = bb->count;
- new_bb->frequency = bb->frequency;
new_bb->discriminator = bb->discriminator;
if (dom_info_available_p (CDI_DOMINATORS))
@@ -622,8 +625,7 @@ basic_block
split_edge (edge e)
{
basic_block ret;
- profile_count count = e->count;
- int freq = EDGE_FREQUENCY (e);
+ profile_count count = e->count ();
edge f;
bool irr = (e->flags & EDGE_IRREDUCIBLE_LOOP) != 0;
struct loop *loop;
@@ -637,9 +639,7 @@ split_edge (edge e)
ret = cfg_hooks->split_edge (e);
ret->count = count;
- ret->frequency = freq;
single_succ_edge (ret)->probability = profile_probability::always ();
- single_succ_edge (ret)->count = count;
if (irr)
{
@@ -867,8 +867,6 @@ make_forwarder_block (basic_block bb, bool (*redirect_edge_p) (edge),
fallthru = split_block_after_labels (bb);
dummy = fallthru->src;
dummy->count = profile_count::zero ();
- dummy->frequency = 0;
- fallthru->count = profile_count::zero ();
bb = fallthru->dest;
/* Redirect back edges we want to keep. */
@@ -878,12 +876,7 @@ make_forwarder_block (basic_block bb, bool (*redirect_edge_p) (edge),
if (redirect_edge_p (e))
{
- dummy->frequency += EDGE_FREQUENCY (e);
- if (dummy->frequency > BB_FREQ_MAX)
- dummy->frequency = BB_FREQ_MAX;
-
- dummy->count += e->count;
- fallthru->count += e->count;
+ dummy->count += e->count ();
ei_next (&ei);
continue;
}
@@ -1069,7 +1062,7 @@ duplicate_block (basic_block bb, edge e, basic_block after)
{
edge s, n;
basic_block new_bb;
- profile_count new_count = e ? e->count : profile_count::uninitialized ();
+ profile_count new_count = e ? e->count (): profile_count::uninitialized ();
edge_iterator ei;
if (!cfg_hooks->duplicate_block)
@@ -1093,13 +1086,6 @@ duplicate_block (basic_block bb, edge e, basic_block after)
is no need to actually check for duplicated edges. */
n = unchecked_make_edge (new_bb, s->dest, s->flags);
n->probability = s->probability;
- if (e && bb->count > profile_count::zero ())
- {
- n->count = s->count.apply_scale (new_count, bb->count);
- s->count -= n->count;
- }
- else
- n->count = s->count;
n->aux = s->aux;
}
@@ -1108,19 +1094,10 @@ duplicate_block (basic_block bb, edge e, basic_block after)
new_bb->count = new_count;
bb->count -= new_count;
- new_bb->frequency = EDGE_FREQUENCY (e);
- bb->frequency -= EDGE_FREQUENCY (e);
-
redirect_edge_and_branch_force (e, new_bb);
-
- if (bb->frequency < 0)
- bb->frequency = 0;
}
else
- {
- new_bb->count = bb->count;
- new_bb->frequency = bb->frequency;
- }
+ new_bb->count = bb->count;
set_bb_original (new_bb, bb);
set_bb_copy (bb, new_bb);
@@ -1463,23 +1440,16 @@ account_profile_record (struct profile_record *record, int after_pass)
record->num_mismatched_freq_out[after_pass]++;
profile_count lsum = profile_count::zero ();
FOR_EACH_EDGE (e, ei, bb->succs)
- lsum += e->count;
+ lsum += e->count ();
if (EDGE_COUNT (bb->succs) && (lsum.differs_from_p (bb->count)))
record->num_mismatched_count_out[after_pass]++;
}
if (bb != ENTRY_BLOCK_PTR_FOR_FN (cfun)
&& profile_status_for_fn (cfun) != PROFILE_ABSENT)
{
- int sum = 0;
- FOR_EACH_EDGE (e, ei, bb->preds)
- sum += EDGE_FREQUENCY (e);
- if (abs (sum - bb->frequency) > 100
- || (MAX (sum, bb->frequency) > 10
- && abs ((sum - bb->frequency) * 100 / (MAX (sum, bb->frequency) + 1)) > 10))
- record->num_mismatched_freq_in[after_pass]++;
profile_count lsum = profile_count::zero ();
FOR_EACH_EDGE (e, ei, bb->preds)
- lsum += e->count;
+ lsum += e->count ();
if (lsum.differs_from_p (bb->count))
record->num_mismatched_count_in[after_pass]++;
}
diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index c3bd9c05013..d82da97d7af 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -599,15 +599,15 @@ find_subloop_latch_edge_by_profile (vec<edge> latches)
FOR_EACH_VEC_ELT (latches, i, e)
{
- if (e->count > mcount)
+ if (e->count () > mcount)
{
me = e;
- mcount = e->count;
+ mcount = e->count ();
}
- tcount += e->count;
+ tcount += e->count ();
}
- if (!tcount.initialized_p () || tcount < HEAVY_EDGE_MIN_SAMPLES
+ if (!tcount.initialized_p () || !(tcount.ipa () > HEAVY_EDGE_MIN_SAMPLES)
|| (tcount - mcount).apply_scale (HEAVY_EDGE_RATIO, 1) > tcount)
return NULL;
diff --git a/gcc/cfgloopanal.c b/gcc/cfgloopanal.c
index 73710abac6e..78a3c9387aa 100644
--- a/gcc/cfgloopanal.c
+++ b/gcc/cfgloopanal.c
@@ -213,9 +213,10 @@ average_num_loop_insns (const struct loop *loop)
if (NONDEBUG_INSN_P (insn))
binsns++;
- ratio = loop->header->frequency == 0
+ ratio = loop->header->count.to_frequency (cfun) == 0
? BB_FREQ_MAX
- : (bb->frequency * BB_FREQ_MAX) / loop->header->frequency;
+ : (bb->count.to_frequency (cfun) * BB_FREQ_MAX)
+ / loop->header->count.to_frequency (cfun);
ninsns += binsns * ratio;
}
free (bbs);
@@ -245,58 +246,38 @@ expected_loop_iterations_unbounded (const struct loop *loop,
/* If we have no profile at all, use AVG_LOOP_NITER. */
if (profile_status_for_fn (cfun) == PROFILE_ABSENT)
expected = PARAM_VALUE (PARAM_AVG_LOOP_NITER);
- else if (loop->latch && (loop->latch->count.reliable_p ()
- || loop->header->count.reliable_p ()))
+ else if (loop->latch && (loop->latch->count.initialized_p ()
+ || loop->header->count.initialized_p ()))
{
profile_count count_in = profile_count::zero (),
count_latch = profile_count::zero ();
FOR_EACH_EDGE (e, ei, loop->header->preds)
if (e->src == loop->latch)
- count_latch = e->count;
+ count_latch = e->count ();
else
- count_in += e->count;
+ count_in += e->count ();
if (!count_latch.initialized_p ())
- ;
- else if (!(count_in > profile_count::zero ()))
+ expected = PARAM_VALUE (PARAM_AVG_LOOP_NITER);
+ else if (!count_in.nonzero_p ())
expected = count_latch.to_gcov_type () * 2;
else
{
expected = (count_latch.to_gcov_type () + count_in.to_gcov_type ()
- 1) / count_in.to_gcov_type ();
- if (read_profile_p)
+ if (read_profile_p
+ && count_latch.reliable_p () && count_in.reliable_p ())
*read_profile_p = true;
}
}
- if (expected == -1)
- {
- int freq_in, freq_latch;
-
- freq_in = 0;
- freq_latch = 0;
-
- FOR_EACH_EDGE (e, ei, loop->header->preds)
- if (flow_bb_inside_loop_p (loop, e->src))
- freq_latch += EDGE_FREQUENCY (e);
- else
- freq_in += EDGE_FREQUENCY (e);
-
- if (freq_in == 0)
- {
- /* If we have no profile at all, use AVG_LOOP_NITER iterations. */
- if (!freq_latch)
- expected = PARAM_VALUE (PARAM_AVG_LOOP_NITER);
- else
- expected = freq_latch * 2;
- }
- else
- expected = (freq_latch + freq_in - 1) / freq_in;
- }
+ else
+ expected = PARAM_VALUE (PARAM_AVG_LOOP_NITER);
HOST_WIDE_INT max = get_max_loop_iterations_int (loop);
if (max != -1 && max < expected)
return max;
+
return expected;
}
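As a rough illustration of the arithmetic now used above (the expected
iteration count is the ceiling of the latch count over the entry count),
here is a minimal standalone sketch; plain integers stand in for
profile_count values and the sample numbers are made up:

  /* Illustration only: mirrors the ceiling division in
     expected_loop_iterations_unbounded, with plain integers standing in
     for profile_count values.  */
  #include <stdio.h>

  static long
  expected_iterations (long count_latch, long count_in)
  {
    if (count_in == 0)			/* header only reached from the latch */
      return count_latch * 2;		/* the !count_in.nonzero_p () case */
    return (count_latch + count_in - 1) / count_in;	/* ceiling division */
  }

  int
  main (void)
  {
    /* E.g. the latch edge ran 990 times and the loop was entered 10 times.  */
    printf ("%ld\n", expected_iterations (990, 10));	/* prints 99 */
    return 0;
  }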
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index fd335c3fe1d..1f55137ed97 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -536,7 +536,6 @@ scale_loop_profile (struct loop *loop, profile_probability p,
if (e)
{
edge other_e;
- int freq_delta;
profile_count count_delta;
FOR_EACH_EDGE (other_e, ei, e->src->succs)
@@ -545,27 +544,18 @@ scale_loop_profile (struct loop *loop, profile_probability p,
break;
/* Probability of exit must be 1/iterations. */
- freq_delta = EDGE_FREQUENCY (e);
+ count_delta = e->count ();
e->probability = profile_probability::always ()
.apply_scale (1, iteration_bound);
other_e->probability = e->probability.invert ();
- freq_delta -= EDGE_FREQUENCY (e);
+ count_delta -= e->count ();
- /* Adjust counts accordingly. */
- count_delta = e->count;
- e->count = e->src->count.apply_probability (e->probability);
- other_e->count = e->src->count.apply_probability (other_e->probability);
- count_delta -= e->count;
-
- /* If latch exists, change its frequency and count, since we changed
+ /* If latch exists, change its count, since we changed
probability of exit. Theoretically we should update everything from
source of exit edge to latch, but for vectorizer this is enough. */
if (loop->latch
&& loop->latch != e->src)
{
- loop->latch->frequency += freq_delta;
- if (loop->latch->frequency < 0)
- loop->latch->frequency = 0;
loop->latch->count += count_delta;
}
}
@@ -575,34 +565,20 @@ scale_loop_profile (struct loop *loop, profile_probability p,
we look at the actual profile, if it is available. */
p = p.apply_scale (iteration_bound, iterations);
- bool determined = false;
if (loop->header->count.initialized_p ())
{
profile_count count_in = profile_count::zero ();
FOR_EACH_EDGE (e, ei, loop->header->preds)
if (e->src != loop->latch)
- count_in += e->count;
+ count_in += e->count ();
if (count_in > profile_count::zero () )
{
p = count_in.probability_in (loop->header->count.apply_scale
(iteration_bound, 1));
- determined = true;
}
}
- if (!determined && loop->header->frequency)
- {
- int freq_in = 0;
-
- FOR_EACH_EDGE (e, ei, loop->header->preds)
- if (e->src != loop->latch)
- freq_in += EDGE_FREQUENCY (e);
-
- if (freq_in != 0)
- p = profile_probability::probability_in_gcov_type
- (freq_in * iteration_bound, loop->header->frequency);
- }
if (!(p > profile_probability::never ()))
p = profile_probability::very_unlikely ();
}
@@ -804,7 +780,7 @@ create_empty_loop_on_edge (edge entry_edge,
loop->latch = loop_latch;
add_loop (loop, outer);
- /* TODO: Fix frequencies and counts. */
+ /* TODO: Fix counts. */
scale_loop_frequencies (loop, profile_probability::even ());
/* Update dominators. */
@@ -870,16 +846,12 @@ loopify (edge latch_edge, edge header_edge,
basic_block pred_bb = header_edge->src;
struct loop *loop = alloc_loop ();
struct loop *outer = loop_outer (succ_bb->loop_father);
- int freq;
profile_count cnt;
- edge e;
- edge_iterator ei;
loop->header = header_edge->dest;
loop->latch = latch_edge->src;
- freq = EDGE_FREQUENCY (header_edge);
- cnt = header_edge->count;
+ cnt = header_edge->count ();
/* Redirect edges. */
loop_redirect_edge (latch_edge, loop->header);
@@ -907,15 +879,10 @@ loopify (edge latch_edge, edge header_edge,
remove_bb_from_loops (switch_bb);
add_bb_to_loop (switch_bb, outer);
- /* Fix frequencies. */
+ /* Fix counts. */
if (redirect_all_edges)
{
- switch_bb->frequency = freq;
switch_bb->count = cnt;
- FOR_EACH_EDGE (e, ei, switch_bb->succs)
- {
- e->count = switch_bb->count.apply_probability (e->probability);
- }
}
scale_loop_frequencies (loop, false_scale);
scale_loop_frequencies (succ_bb->loop_father, true_scale);
@@ -1177,7 +1144,7 @@ duplicate_loop_to_header_edge (struct loop *loop, edge e,
{
/* Calculate coefficients by which we have to scale frequencies
of duplicated loop bodies. */
- freq_in = header->frequency;
+ freq_in = header->count.to_frequency (cfun);
freq_le = EDGE_FREQUENCY (latch_edge);
if (freq_in == 0)
freq_in = 1;
@@ -1650,8 +1617,6 @@ lv_adjust_loop_entry_edge (basic_block first_head, basic_block second_head,
current_ir_type () == IR_GIMPLE ? EDGE_TRUE_VALUE : 0);
e1->probability = then_prob;
e->probability = else_prob;
- e1->count = e->count.apply_probability (e1->probability);
- e->count = e->count.apply_probability (e->probability);
set_immediate_dominator (CDI_DOMINATORS, first_head, new_head);
set_immediate_dominator (CDI_DOMINATORS, second_head, new_head);
diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index 739d1bb9490..ae469088eec 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -1156,7 +1156,6 @@ try_redirect_by_replacing_jump (edge e, basic_block target, bool in_cfglayout)
e->flags = 0;
e->probability = profile_probability::always ();
- e->count = src->count;
if (e->dest != target)
redirect_edge_succ (e, target);
@@ -1505,9 +1504,7 @@ force_nonfallthru_and_redirect (edge e, basic_block target, rtx jump_label)
int prob = XINT (note, 0);
b->probability = profile_probability::from_reg_br_prob_note (prob);
- b->count = e->count.apply_probability (b->probability);
e->probability -= e->probability;
- e->count -= b->count;
}
}
@@ -1536,6 +1533,7 @@ force_nonfallthru_and_redirect (edge e, basic_block target, rtx jump_label)
basic_block bb = create_basic_block (BB_HEAD (e->dest), NULL,
ENTRY_BLOCK_PTR_FOR_FN (cfun));
+ bb->count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
/* Change the existing edge's source to be the new block, and add
a new edge from the entry block to the new block. */
@@ -1615,7 +1613,7 @@ force_nonfallthru_and_redirect (edge e, basic_block target, rtx jump_label)
if (EDGE_COUNT (e->src->succs) >= 2 || abnormal_edge_flags || asm_goto_edge)
{
rtx_insn *new_head;
- profile_count count = e->count;
+ profile_count count = e->count ();
profile_probability probability = e->probability;
/* Create the new structures. */
@@ -1631,7 +1629,6 @@ force_nonfallthru_and_redirect (edge e, basic_block target, rtx jump_label)
jump_block = create_basic_block (new_head, NULL, e->src);
jump_block->count = count;
- jump_block->frequency = EDGE_FREQUENCY (e);
/* Make sure new block ends up in correct hot/cold section. */
@@ -1640,7 +1637,6 @@ force_nonfallthru_and_redirect (edge e, basic_block target, rtx jump_label)
/* Wire edge in. */
new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU);
new_edge->probability = probability;
- new_edge->count = count;
/* Redirect old edge. */
redirect_edge_pred (e, jump_block);
@@ -1655,13 +1651,10 @@ force_nonfallthru_and_redirect (edge e, basic_block target, rtx jump_label)
if (asm_goto_edge)
{
new_edge->probability = new_edge->probability.apply_scale (1, 2);
- new_edge->count = new_edge->count.apply_scale (1, 2);
jump_block->count = jump_block->count.apply_scale (1, 2);
- jump_block->frequency /= 2;
edge new_edge2 = make_edge (new_edge->src, target,
e->flags & ~EDGE_FALLTHRU);
new_edge2->probability = probability - new_edge->probability;
- new_edge2->count = count - new_edge->count;
}
new_bb = jump_block;
@@ -2251,9 +2244,23 @@ void
update_br_prob_note (basic_block bb)
{
rtx note;
- if (!JUMP_P (BB_END (bb)) || !BRANCH_EDGE (bb)->probability.initialized_p ())
- return;
note = find_reg_note (BB_END (bb), REG_BR_PROB, NULL_RTX);
+ if (!JUMP_P (BB_END (bb)) || !BRANCH_EDGE (bb)->probability.initialized_p ())
+ {
+ if (note)
+ {
+ rtx *note_link, this_rtx;
+
+ note_link = &REG_NOTES (BB_END (bb));
+ for (this_rtx = *note_link; this_rtx; this_rtx = XEXP (this_rtx, 1))
+ if (this_rtx == note)
+ {
+ *note_link = XEXP (this_rtx, 1);
+ break;
+ }
+ }
+ return;
+ }
if (!note
|| XINT (note, 0) == BRANCH_EDGE (bb)->probability.to_reg_br_prob_note ())
return;
@@ -3155,7 +3162,6 @@ purge_dead_edges (basic_block bb)
if (single_succ_p (bb))
{
single_succ_edge (bb)->probability = profile_probability::always ();
- single_succ_edge (bb)->count = bb->count;
}
else
{
@@ -3168,8 +3174,6 @@ purge_dead_edges (basic_block bb)
b->probability = profile_probability::from_reg_br_prob_note
(XINT (note, 0));
f->probability = b->probability.invert ();
- b->count = bb->count.apply_probability (b->probability);
- f->count = bb->count.apply_probability (f->probability);
}
return purged;
@@ -3221,7 +3225,6 @@ purge_dead_edges (basic_block bb)
gcc_assert (single_succ_p (bb));
single_succ_edge (bb)->probability = profile_probability::always ();
- single_succ_edge (bb)->count = bb->count;
if (dump_file)
fprintf (dump_file, "Purged non-fallthru edges from bb %i\n",
@@ -3633,7 +3636,6 @@ relink_block_chain (bool stay_in_cfglayout_mode)
fprintf (dump_file, "compensation ");
else
fprintf (dump_file, "bb %i ", bb->index);
- fprintf (dump_file, " [%i]\n", bb->frequency);
}
}
@@ -4906,7 +4908,6 @@ rtl_flow_call_edges_add (sbitmap blocks)
edge ne = make_edge (bb, EXIT_BLOCK_PTR_FOR_FN (cfun), EDGE_FAKE);
ne->probability = profile_probability::guessed_never ();
- ne->count = profile_count::guessed_zero ();
}
if (insn == BB_HEAD (bb))
@@ -5045,7 +5046,7 @@ rtl_account_profile_record (basic_block bb, int after_pass,
+= insn_cost (insn, true) * bb->count.to_gcov_type ();
else if (profile_status_for_fn (cfun) == PROFILE_GUESSED)
record->time[after_pass]
- += insn_cost (insn, true) * bb->frequency;
+ += insn_cost (insn, true) * bb->count.to_frequency (cfun);
}
}
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index d8da3dd76cd..7c3507c6ece 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -862,7 +862,7 @@ symbol_table::create_edge (cgraph_node *caller, cgraph_node *callee,
edge->next_callee = NULL;
edge->lto_stmt_uid = 0;
- edge->count = count;
+ edge->count = count.ipa ();
edge->frequency = freq;
gcc_checking_assert (freq >= 0);
gcc_checking_assert (freq <= CGRAPH_FREQ_MAX);
@@ -1308,7 +1308,7 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
/* We are producing the final function body and will throw away the
callgraph edges really soon. Reset the counts/frequencies to
keep verifier happy in the case of roundoff errors. */
- e->count = gimple_bb (e->call_stmt)->count;
+ e->count = gimple_bb (e->call_stmt)->count.ipa ();
e->frequency = compute_call_stmt_bb_frequency
(e->caller->decl, gimple_bb (e->call_stmt));
}
@@ -1338,7 +1338,7 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
prob = profile_probability::even ();
new_stmt = gimple_ic (e->call_stmt,
dyn_cast<cgraph_node *> (ref->referred),
- prob, e->count, e->count + e2->count);
+ prob);
e->speculative = false;
e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt,
false);
@@ -1644,7 +1644,7 @@ cgraph_update_edges_for_call_stmt_node (cgraph_node *node,
/* Otherwise remove edge and create new one; we can't simply redirect
since function has changed, so inline plan and other information
attached to edge is invalid. */
- count = e->count;
+ count = e->count.ipa ();
frequency = e->frequency;
if (e->indirect_unknown_callee || e->inline_failed)
e->remove ();
@@ -1655,7 +1655,7 @@ cgraph_update_edges_for_call_stmt_node (cgraph_node *node,
{
/* We are seeing new direct call; compute profile info based on BB. */
basic_block bb = gimple_bb (new_stmt);
- count = bb->count;
+ count = bb->count.ipa ();
frequency = compute_call_stmt_bb_frequency (current_function_decl,
bb);
}
@@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
return changed;
}
+/* Worker to set malloc flag. */
+static void
+set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
+{
+ if (malloc_p && !DECL_IS_MALLOC (node->decl))
+ {
+ DECL_IS_MALLOC (node->decl) = true;
+ *changed = true;
+ }
+
+ ipa_ref *ref;
+ FOR_EACH_ALIAS (node, ref)
+ {
+ cgraph_node *alias = dyn_cast<cgraph_node *> (ref->referring);
+ if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
+ set_malloc_flag_1 (alias, malloc_p, changed);
+ }
+
+ for (cgraph_edge *e = node->callers; e; e = e->next_caller)
+ if (e->caller->thunk.thunk_p
+ && (!malloc_p || e->caller->get_availability () > AVAIL_INTERPOSABLE))
+ set_malloc_flag_1 (e->caller, malloc_p, changed);
+}
+
+/* Set DECL_IS_MALLOC on NODE's decl and on NODE's aliases if any. */
+
+bool
+cgraph_node::set_malloc_flag (bool malloc_p)
+{
+ bool changed = false;
+
+ if (!malloc_p || get_availability () > AVAIL_INTERPOSABLE)
+ set_malloc_flag_1 (this, malloc_p, &changed);
+ else
+ {
+ ipa_ref *ref;
+
+ FOR_EACH_ALIAS (this, ref)
+ {
+ cgraph_node *alias = dyn_cast<cgraph_node *> (ref->referring);
+ if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
+ set_malloc_flag_1 (alias, malloc_p, &changed);
+ }
+ }
+ return changed;
+}
+
/* Worker to set_const_flag. */
static void
@@ -3035,9 +3082,14 @@ bool
cgraph_edge::verify_count_and_frequency ()
{
bool error_found = false;
- if (count < 0)
+ if (!count.verify ())
{
- error ("caller edge count is negative");
+ error ("caller edge count invalid");
+ error_found = true;
+ }
+ if (count.initialized_p () && !(count.ipa () == count))
+ {
+ error ("caller edge count is local");
error_found = true;
}
if (frequency < 0)
@@ -3136,9 +3188,14 @@ cgraph_node::verify_node (void)
identifier_to_locale (e->callee->name ()));
error_found = true;
}
- if (count < 0)
+ if (!count.verify ())
+ {
+ error ("cgraph count invalid");
+ error_found = true;
+ }
+ if (count.initialized_p () && !(count.ipa () == count))
{
- error ("execution count is negative");
+ error ("cgraph count is local");
error_found = true;
}
if (global.inlined_to && same_comdat_group)
@@ -3222,7 +3279,9 @@ cgraph_node::verify_node (void)
{
if (e->verify_count_and_frequency ())
error_found = true;
+ /* FIXME: re-enable once cgraph is converted to counts. */
if (gimple_has_body_p (e->caller->decl)
+ && 0
&& !e->caller->global.inlined_to
&& !e->speculative
/* Optimized out calls are redirected to __builtin_unreachable. */
@@ -3245,9 +3304,11 @@ cgraph_node::verify_node (void)
{
if (e->verify_count_and_frequency ())
error_found = true;
+ /* FIXME: re-enable once cgraph is converted to counts. */
if (gimple_has_body_p (e->caller->decl)
&& !e->caller->global.inlined_to
&& !e->speculative
+ && 0
&& (e->frequency
!= compute_call_stmt_bb_frequency (e->caller->decl,
gimple_bb (e->call_stmt))))
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 1758e8b08c1..84824e9f814 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1151,6 +1151,10 @@ public:
if any to NOTHROW. */
bool set_nothrow_flag (bool nothrow);
+ /* Set DECL_IS_MALLOC on cgraph_node's decl and on aliases of the node
+ if any. */
+ bool set_malloc_flag (bool malloc_p);
+
/* If SET_CONST is true, mark function, aliases and thunks to be ECF_CONST.
If SET_CONST is false, clear the flag.
diff --git a/gcc/cgraphbuild.c b/gcc/cgraphbuild.c
index d853acd883d..dd4bf9a7fa3 100644
--- a/gcc/cgraphbuild.c
+++ b/gcc/cgraphbuild.c
@@ -190,21 +190,8 @@ record_eh_tables (cgraph_node *node, function *fun)
int
compute_call_stmt_bb_frequency (tree decl, basic_block bb)
{
- int entry_freq = ENTRY_BLOCK_PTR_FOR_FN
- (DECL_STRUCT_FUNCTION (decl))->frequency;
- int freq = bb->frequency;
-
- if (profile_status_for_fn (DECL_STRUCT_FUNCTION (decl)) == PROFILE_ABSENT)
- return CGRAPH_FREQ_BASE;
-
- if (!entry_freq)
- entry_freq = 1, freq++;
-
- freq = freq * CGRAPH_FREQ_BASE / entry_freq;
- if (freq > CGRAPH_FREQ_MAX)
- freq = CGRAPH_FREQ_MAX;
-
- return freq;
+ return bb->count.to_cgraph_frequency
+ (ENTRY_BLOCK_PTR_FOR_FN (DECL_STRUCT_FUNCTION (decl))->count);
}
/* Mark address taken in STMT. */
@@ -415,7 +402,7 @@ cgraph_edge::rebuild_edges (void)
node->remove_callees ();
node->remove_all_references ();
- node->count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
+ node->count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa ();
FOR_EACH_BB_FN (bb, cfun)
{
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 9385dc825ab..c5183a02058 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1601,17 +1601,12 @@ init_lowered_empty_function (tree decl, bool in_ssa, profile_count count)
/* Create BB for body of the function and connect it properly. */
ENTRY_BLOCK_PTR_FOR_FN (cfun)->count = count;
- ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency = BB_FREQ_MAX;
EXIT_BLOCK_PTR_FOR_FN (cfun)->count = count;
- EXIT_BLOCK_PTR_FOR_FN (cfun)->frequency = BB_FREQ_MAX;
bb = create_basic_block (NULL, ENTRY_BLOCK_PTR_FOR_FN (cfun));
bb->count = count;
- bb->frequency = BB_FREQ_MAX;
e = make_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun), bb, EDGE_FALLTHRU);
- e->count = count;
e->probability = profile_probability::always ();
e = make_edge (bb, EXIT_BLOCK_PTR_FOR_FN (cfun), 0);
- e->count = count;
e->probability = profile_probability::always ();
add_bb_to_loop (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun)->loop_father);
@@ -1854,8 +1849,12 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk)
else
resdecl = DECL_RESULT (thunk_fndecl);
+ profile_count cfg_count = count;
+ if (!cfg_count.initialized_p ())
+ cfg_count = profile_count::from_gcov_type (BB_FREQ_MAX).guessed_local ();
+
bb = then_bb = else_bb = return_bb
- = init_lowered_empty_function (thunk_fndecl, true, count);
+ = init_lowered_empty_function (thunk_fndecl, true, cfg_count);
bsi = gsi_start_bb (bb);
@@ -1968,14 +1967,11 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk)
adjustment, because that's why we're emitting a
thunk. */
then_bb = create_basic_block (NULL, bb);
- then_bb->count = count - count.apply_scale (1, 16);
- then_bb->frequency = BB_FREQ_MAX - BB_FREQ_MAX / 16;
+ then_bb->count = cfg_count - cfg_count.apply_scale (1, 16);
return_bb = create_basic_block (NULL, then_bb);
- return_bb->count = count;
- return_bb->frequency = BB_FREQ_MAX;
+ return_bb->count = cfg_count;
else_bb = create_basic_block (NULL, else_bb);
- then_bb->count = count.apply_scale (1, 16);
- then_bb->frequency = BB_FREQ_MAX / 16;
+ else_bb->count = cfg_count.apply_scale (1, 16);
add_bb_to_loop (then_bb, bb->loop_father);
add_bb_to_loop (return_bb, bb->loop_father);
add_bb_to_loop (else_bb, bb->loop_father);
@@ -1988,17 +1984,14 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk)
e = make_edge (bb, then_bb, EDGE_TRUE_VALUE);
e->probability = profile_probability::guessed_always ()
.apply_scale (1, 16);
- e->count = count - count.apply_scale (1, 16);
e = make_edge (bb, else_bb, EDGE_FALSE_VALUE);
e->probability = profile_probability::guessed_always ()
.apply_scale (1, 16);
- e->count = count.apply_scale (1, 16);
make_single_succ_edge (return_bb,
EXIT_BLOCK_PTR_FOR_FN (cfun), 0);
make_single_succ_edge (then_bb, return_bb, EDGE_FALLTHRU);
e = make_edge (else_bb, return_bb, EDGE_FALLTHRU);
e->probability = profile_probability::always ();
- e->count = count.apply_scale (1, 16);
bsi = gsi_last_bb (then_bb);
}
@@ -2033,8 +2026,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk)
}
cfun->gimple_df->in_ssa_p = true;
+ counts_to_freqs ();
profile_status_for_fn (cfun)
- = count.initialized_p () ? PROFILE_READ : PROFILE_GUESSED;
+ = cfg_count.initialized_p () && cfg_count.ipa_p ()
+ ? PROFILE_READ : PROFILE_GUESSED;
/* FIXME: C++ FE should stop setting TREE_ASM_WRITTEN on thunks. */
TREE_ASM_WRITTEN (thunk_fndecl) = false;
delete_unreachable_blocks ();
diff --git a/gcc/color-macros.h b/gcc/color-macros.h
new file mode 100644
index 00000000000..37ed4d197cf
--- /dev/null
+++ b/gcc/color-macros.h
@@ -0,0 +1,108 @@
+/* Terminal color manipulation macros.
+ Copyright (C) 2005-2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_COLOR_MACROS_H
+#define GCC_COLOR_MACROS_H
+
+/* Select Graphic Rendition (SGR, "\33[...m") strings. */
+/* Also Erase in Line (EL) to Right ("\33[K") by default. */
+/* Why have EL to Right after SGR?
+ -- The behavior of line-wrapping when at the bottom of the
+ terminal screen and at the end of the current line is often
+ such that a new line is introduced, entirely cleared with
+ the current background color which may be different from the
+ default one (see the boolean back_color_erase terminfo(5)
+ capability), thus scrolling the display by one line.
+ The end of this new line will stay in this background color
+ even after reverting to the default background color with
+ "\33[m', unless it is explicitly cleared again with "\33[K"
+ (which is the behavior the user would instinctively expect
+ from the whole thing). There may be some unavoidable
+ background-color flicker at the end of this new line because
+ of this (when timing with the monitor's redraw is just right).
+ -- The behavior of HT (tab, "\t") is usually the same as that of
+ Cursor Forward Tabulation (CHT) with a default parameter
+ of 1 ("\33[I"), i.e., it performs pure movement to the next
+ tab stop, without any clearing of either content or screen
+ attributes (including background color); try
+ printf 'asdfqwerzxcv\rASDF\tZXCV\n'
+ in a bash(1) shell to demonstrate this. This is not what the
+ user would instinctively expect of HT (but is ok for CHT).
+ The instinctive behavior would include clearing the terminal
+ cells that are skipped over by HT with blank cells in the
+ current screen attributes, including background color;
+ the boolean dest_tabs_magic_smso terminfo(5) capability
+ indicates this saner behavior for HT, but only some rare
+ terminals have it (although it also indicates a special
+ glitch with standout mode in the Teleray terminal for which
+ it was initially introduced). The remedy is to add "\33[K"
+ after each SGR sequence, be it START (to fix the behavior
+ of any HT after that before another SGR) or END (to fix the
+ behavior of an HT in default background color that would
+ follow a line-wrapping at the bottom of the screen in another
+ background color, and to complement doing it after START).
+ Piping GCC's output through a pager such as less(1) avoids
+ any HT problems since the pager performs tab expansion.
+
+ Generic disadvantages of this remedy are:
+ -- Some very rare terminals might support SGR but not EL (nobody
+ will use "gcc -fdiagnostics-color" on a terminal that does not
+ support SGR in the first place).
+ -- Having these extra control sequences might somewhat complicate
+ the task of any program trying to parse "gcc -fdiagnostics-color"
+ output in order to extract structuring information from it.
+ A specific disadvantage to doing it after SGR START is:
+ -- Even more possible background color flicker (when timing
+ with the monitor's redraw is just right), even when not at the
+ bottom of the screen.
+ There are no additional disadvantages specific to doing it after
+ SGR END.
+
+ It would be impractical for GCC to become a full-fledged
+ terminal program linked against ncurses or the like, so it will
+ not detect terminfo(5) capabilities. */
+
+#define COLOR_SEPARATOR ";"
+#define COLOR_NONE "00"
+#define COLOR_BOLD "01"
+#define COLOR_UNDERSCORE "04"
+#define COLOR_BLINK "05"
+#define COLOR_REVERSE "07"
+#define COLOR_FG_BLACK "30"
+#define COLOR_FG_RED "31"
+#define COLOR_FG_GREEN "32"
+#define COLOR_FG_YELLOW "33"
+#define COLOR_FG_BLUE "34"
+#define COLOR_FG_MAGENTA "35"
+#define COLOR_FG_CYAN "36"
+#define COLOR_FG_WHITE "37"
+#define COLOR_BG_BLACK "40"
+#define COLOR_BG_RED "41"
+#define COLOR_BG_GREEN "42"
+#define COLOR_BG_YELLOW "43"
+#define COLOR_BG_BLUE "44"
+#define COLOR_BG_MAGENTA "45"
+#define COLOR_BG_CYAN "46"
+#define COLOR_BG_WHITE "47"
+#define SGR_START "\33["
+#define SGR_END "m\33[K"
+#define SGR_SEQ(str) SGR_START str SGR_END
+#define SGR_RESET SGR_SEQ("")
+
+#endif /* GCC_COLOR_MACROS_H */
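A quick sketch of how these macros compose (illustration only; it assumes
color-macros.h is on the include path, and the chosen colors are arbitrary):

  /* Illustration only: SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_RED)
     concatenates to "\33[" "01" ";" "31" "m\33[K", i.e. bold red followed
     by Erase-in-Line to the right, and SGR_RESET restores the defaults.  */
  #include <stdio.h>
  #include "color-macros.h"

  int
  main (void)
  {
    printf (SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_RED)
            "error:" SGR_RESET " something went wrong\n");
    return 0;
  }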
diff --git a/gcc/combine.c b/gcc/combine.c
index ff0cb2a7c62..99cc343192e 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -3028,6 +3028,13 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
else
fprintf (dump_file, "\nTrying %d -> %d:\n",
INSN_UID (i2), INSN_UID (i3));
+
+ if (i0)
+ dump_insn_slim (dump_file, i0);
+ if (i1)
+ dump_insn_slim (dump_file, i1);
+ dump_insn_slim (dump_file, i2);
+ dump_insn_slim (dump_file, i3);
}
/* If multiple insns feed into one of I2 or I3, they can be in any
@@ -12112,6 +12119,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
const_op -= 1;
code = LEU;
/* ... fall through ... */
+ gcc_fallthrough ();
}
/* (unsigned) < 0x80000000 is equivalent to >= 0. */
else if (is_a <scalar_int_mode> (mode, &int_mode)
@@ -12149,6 +12157,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
const_op -= 1;
code = GTU;
/* ... fall through ... */
+ gcc_fallthrough ();
}
/* (unsigned) >= 0x80000000 is equivalent to < 0. */
@@ -14504,6 +14513,7 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2,
case REG_SETJMP:
case REG_TM:
case REG_CALL_DECL:
+ case REG_CALL_NOCF_CHECK:
/* These notes must remain with the call. It should not be
possible for both I2 and I3 to be a call. */
if (CALL_P (i3))
@@ -14686,6 +14696,17 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2,
&& CALL_P (from_insn)
&& find_reg_fusage (from_insn, USE, XEXP (note, 0)))
place = from_insn;
+ else if (i2 && reg_set_p (XEXP (note, 0), PATTERN (i2)))
+ {
+ /* If the new I2 sets the same register that is marked
+ dead in the note, we do not in general know where to
+ put the note. One important case we _can_ handle is
+ when the note comes from I3. */
+ if (from_insn == i3)
+ place = i3;
+ else
+ break;
+ }
else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)))
place = i3;
else if (i2 != 0 && next_nonnote_nondebug_insn (i2) == i3
@@ -14699,11 +14720,6 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2,
|| rtx_equal_p (XEXP (note, 0), elim_i0))
break;
tem_insn = i3;
- /* If the new I2 sets the same register that is marked dead
- in the note, we do not know where to put the note.
- Give up. */
- if (i2 != 0 && reg_set_p (XEXP (note, 0), PATTERN (i2)))
- break;
}
if (place == 0)
diff --git a/gcc/common.opt b/gcc/common.opt
index c95da640174..f8f2ed3db8a 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -741,6 +741,10 @@ Wsuggest-attribute=noreturn
Common Var(warn_suggest_attribute_noreturn) Warning
Warn about functions which might be candidates for __attribute__((noreturn)).
+Wsuggest-attribute=malloc
+Common Var(warn_suggest_attribute_malloc) Warning
+Warn about functions which might be candidates for __attribute__((malloc)).
+
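As an illustration of what the new warning looks for (not part of the patch;
the function name is made up), a candidate for __attribute__((malloc)) is a
function whose every non-NULL return value points to freshly allocated,
unaliased storage:

  /* Illustration only: a function -Wsuggest-attribute=malloc could flag,
     since it always returns a fresh pointer that aliases nothing else.  */
  #include <stdlib.h>

  void *
  make_buffer (size_t n)
  {
    void *p = malloc (n);
    if (p == NULL)
      abort ();
    return p;
  }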
Wsuggest-final-types
Common Var(warn_suggest_final_types) Warning
Warn about C++ polymorphic types where adding final keyword would improve code quality.
@@ -1620,6 +1624,29 @@ finline-atomics
Common Report Var(flag_inline_atomics) Init(1) Optimization
Inline __atomic operations when a lock free instruction sequence is available.
+fcf-protection
+Common RejectNegative Alias(fcf-protection=,full)
+
+fcf-protection=
+Common Report Joined RejectNegative Enum(cf_protection_level) Var(flag_cf_protection) Init(CF_NONE)
+-fcf-protection=[full|branch|return|none] Instrument functions with checks to verify jump/call/return control-flow transfer
+instructions have valid targets.
+
+Enum
+Name(cf_protection_level) Type(enum cf_protection_level) UnknownError(unknown Control-Flow Protection Level %qs)
+
+EnumValue
+Enum(cf_protection_level) String(full) Value(CF_FULL)
+
+EnumValue
+Enum(cf_protection_level) String(branch) Value(CF_BRANCH)
+
+EnumValue
+Enum(cf_protection_level) String(return) Value(CF_RETURN)
+
+EnumValue
+Enum(cf_protection_level) String(none) Value(CF_NONE)
+
finstrument-functions
Common Report Var(flag_instrument_function_entry_exit)
Instrument function entry and exit with profiling calls.
@@ -2846,11 +2873,23 @@ Common Driver RejectNegative JoinedOrMissing
Generate debug information in default format.
gcoff
-Common Driver JoinedOrMissing Negative(gdwarf)
-Generate debug information in COFF format.
+Common Driver Ignore Warn(switch %qs no longer supported)
+Does nothing. Preserved for backward compatibility.
+
+gcoff1
+Common Driver Ignore Warn(switch %qs no longer supported)
+Does nothing. Preserved for backward compatibility.
+
+gcoff2
+Common Driver Ignore Warn(switch %qs no longer supported)
+Does nothing. Preserved for backward compatibility.
+
+gcoff3
+Common Driver Ignore Warn(switch %qs no longer supported)
+Does nothing. Preserved for backward compatibility.
gcolumn-info
-Common Driver Var(debug_column_info,1) Init(0)
+Common Driver Var(debug_column_info,1) Init(1)
Record DW_AT_decl_column and DW_AT_call_column in DWARF.
gdwarf
@@ -2914,7 +2953,7 @@ Common Driver JoinedOrMissing Negative(gxcoff+)
Generate debug information in XCOFF format.
gxcoff+
-Common Driver JoinedOrMissing Negative(gcoff)
+Common Driver JoinedOrMissing Negative(gdwarf)
Generate debug information in extended XCOFF format.
Enum
diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 4185176495a..ada918e6f2a 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -137,6 +137,9 @@ along with GCC; see the file COPYING3. If not see
#define OPTION_MASK_ISA_CLZERO_SET OPTION_MASK_ISA_CLZERO
#define OPTION_MASK_ISA_PKU_SET OPTION_MASK_ISA_PKU
#define OPTION_MASK_ISA_RDPID_SET OPTION_MASK_ISA_RDPID
+#define OPTION_MASK_ISA_GFNI_SET OPTION_MASK_ISA_GFNI
+#define OPTION_MASK_ISA_IBT_SET OPTION_MASK_ISA_IBT
+#define OPTION_MASK_ISA_SHSTK_SET OPTION_MASK_ISA_SHSTK
/* Define a set of ISAs which aren't available when a given ISA is
disabled. MMX and SSE ISAs are handled separately. */
@@ -202,6 +205,9 @@ along with GCC; see the file COPYING3. If not see
#define OPTION_MASK_ISA_CLZERO_UNSET OPTION_MASK_ISA_CLZERO
#define OPTION_MASK_ISA_PKU_UNSET OPTION_MASK_ISA_PKU
#define OPTION_MASK_ISA_RDPID_UNSET OPTION_MASK_ISA_RDPID
+#define OPTION_MASK_ISA_GFNI_UNSET OPTION_MASK_ISA_GFNI
+#define OPTION_MASK_ISA_IBT_UNSET OPTION_MASK_ISA_IBT
+#define OPTION_MASK_ISA_SHSTK_UNSET OPTION_MASK_ISA_SHSTK
/* SSE4 includes both SSE4.1 and SSE4.2. -mno-sse4 should be the same
as -mno-sse4.1. */
@@ -484,6 +490,48 @@ ix86_handle_option (struct gcc_options *opts,
}
return true;
+ case OPT_mgfni:
+ if (value)
+ {
+ opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_GFNI_SET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_GFNI_SET;
+ }
+ else
+ {
+ opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_GFNI_UNSET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_GFNI_UNSET;
+ }
+ return true;
+
+ case OPT_mcet:
+ case OPT_mibt:
+ if (value)
+ {
+ opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_IBT_SET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_IBT_SET;
+ }
+ else
+ {
+ opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_IBT_UNSET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_IBT_UNSET;
+ }
+ if (code != OPT_mcet)
+ return true;
+ /* fall through. */
+
+ case OPT_mshstk:
+ if (value)
+ {
+ opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_SHSTK_SET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_SHSTK_SET;
+ }
+ else
+ {
+ opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_SHSTK_UNSET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_SHSTK_UNSET;
+ }
+ return true;
+
case OPT_mavx5124fmaps:
if (value)
{
diff --git a/gcc/compare-elim.c b/gcc/compare-elim.c
index 7e557a245b5..17d08842d15 100644
--- a/gcc/compare-elim.c
+++ b/gcc/compare-elim.c
@@ -65,6 +65,7 @@ along with GCC; see the file COPYING3. If not see
#include "tm_p.h"
#include "insn-config.h"
#include "recog.h"
+#include "emit-rtl.h"
#include "cfgrtl.h"
#include "tree-pass.h"
#include "domwalk.h"
@@ -96,6 +97,9 @@ struct comparison
/* The insn prior to the comparison insn that clobbers the flags. */
rtx_insn *prev_clobber;
+ /* The insn prior to the comparison insn that sets in_a REG. */
+ rtx_insn *in_a_setter;
+
/* The two values being compared. These will be either REGs or
constants. */
rtx in_a, in_b;
@@ -308,26 +312,22 @@ can_eliminate_compare (rtx compare, rtx eh_note, struct comparison *cmp)
edge
find_comparison_dom_walker::before_dom_children (basic_block bb)
{
- struct comparison *last_cmp;
- rtx_insn *insn, *next, *last_clobber;
- bool last_cmp_valid;
+ rtx_insn *insn, *next;
bool need_purge = false;
- bitmap killed;
-
- killed = BITMAP_ALLOC (NULL);
+ rtx_insn *last_setter[FIRST_PSEUDO_REGISTER];
/* The last comparison that was made. Will be reset to NULL
once the flags are clobbered. */
- last_cmp = NULL;
+ struct comparison *last_cmp = NULL;
/* True iff the last comparison has not been clobbered, nor
have its inputs. Used to eliminate duplicate compares. */
- last_cmp_valid = false;
+ bool last_cmp_valid = false;
/* The last insn that clobbered the flags, if that insn is of
a form that may be valid for eliminating a following compare.
To be reset to NULL once the flags are set otherwise. */
- last_clobber = NULL;
+ rtx_insn *last_clobber = NULL;
/* Propagate the last live comparison throughout the extended basic block. */
if (single_pred_p (bb))
@@ -337,6 +337,7 @@ find_comparison_dom_walker::before_dom_children (basic_block bb)
last_cmp_valid = last_cmp->inputs_valid;
}
+ memset (last_setter, 0, sizeof (last_setter));
for (insn = BB_HEAD (bb); insn; insn = next)
{
rtx src;
@@ -345,10 +346,6 @@ find_comparison_dom_walker::before_dom_children (basic_block bb)
if (!NONDEBUG_INSN_P (insn))
continue;
- /* Compute the set of registers modified by this instruction. */
- bitmap_clear (killed);
- df_simulate_find_defs (insn, killed);
-
src = conforming_compare (insn);
if (src)
{
@@ -372,6 +369,13 @@ find_comparison_dom_walker::before_dom_children (basic_block bb)
last_cmp->in_b = XEXP (src, 1);
last_cmp->eh_note = eh_note;
last_cmp->orig_mode = GET_MODE (src);
+ if (last_cmp->in_b == const0_rtx
+ && last_setter[REGNO (last_cmp->in_a)])
+ {
+ rtx set = single_set (last_setter[REGNO (last_cmp->in_a)]);
+ if (set && rtx_equal_p (SET_DEST (set), last_cmp->in_a))
+ last_cmp->in_a_setter = last_setter[REGNO (last_cmp->in_a)];
+ }
all_compares.safe_push (last_cmp);
/* It's unusual, but be prepared for comparison patterns that
@@ -387,28 +391,36 @@ find_comparison_dom_walker::before_dom_children (basic_block bb)
find_flags_uses_in_insn (last_cmp, insn);
/* Notice if this instruction kills the flags register. */
- if (bitmap_bit_p (killed, targetm.flags_regnum))
- {
- /* See if this insn could be the "clobber" that eliminates
- a future comparison. */
- last_clobber = (arithmetic_flags_clobber_p (insn) ? insn : NULL);
-
- /* In either case, the previous compare is no longer valid. */
- last_cmp = NULL;
- last_cmp_valid = false;
- }
+ df_ref def;
+ FOR_EACH_INSN_DEF (def, insn)
+ if (DF_REF_REGNO (def) == targetm.flags_regnum)
+ {
+ /* See if this insn could be the "clobber" that eliminates
+ a future comparison. */
+ last_clobber = (arithmetic_flags_clobber_p (insn)
+ ? insn : NULL);
+
+ /* In either case, the previous compare is no longer valid. */
+ last_cmp = NULL;
+ last_cmp_valid = false;
+ break;
+ }
}
- /* Notice if any of the inputs to the comparison have changed. */
- if (last_cmp_valid
- && (bitmap_bit_p (killed, REGNO (last_cmp->in_a))
- || (REG_P (last_cmp->in_b)
- && bitmap_bit_p (killed, REGNO (last_cmp->in_b)))))
- last_cmp_valid = false;
+ /* Notice if any of the inputs to the comparison have changed
+ and remember last insn that sets each register. */
+ df_ref def;
+ FOR_EACH_INSN_DEF (def, insn)
+ {
+ if (last_cmp_valid
+ && (DF_REF_REGNO (def) == REGNO (last_cmp->in_a)
+ || (REG_P (last_cmp->in_b)
+ && DF_REF_REGNO (def) == REGNO (last_cmp->in_b))))
+ last_cmp_valid = false;
+ last_setter[DF_REF_REGNO (def)] = insn;
+ }
}
- BITMAP_FREE (killed);
-
/* Remember the live comparison for subsequent members of
the extended basic block. */
if (last_cmp)
@@ -579,6 +591,133 @@ equivalent_reg_at_start (rtx reg, rtx_insn *end, rtx_insn *start)
return reg;
}
+/* Return true if it is okay to merge the comparison CMP_INSN with
+ the instruction ARITH_INSN. Both instructions are assumed to be in the
+ same basic block with ARITH_INSN appearing before CMP_INSN. This checks
+ that there are no uses or defs of the condition flags or control flow
+ changes between the two instructions. */
+
+static bool
+can_merge_compare_into_arith (rtx_insn *cmp_insn, rtx_insn *arith_insn)
+{
+ for (rtx_insn *insn = PREV_INSN (cmp_insn);
+ insn && insn != arith_insn;
+ insn = PREV_INSN (insn))
+ {
+ if (!NONDEBUG_INSN_P (insn))
+ continue;
+ /* Bail if there are jumps or calls in between. */
+ if (!NONJUMP_INSN_P (insn))
+ return false;
+
+ /* Bail on old-style asm statements because they lack
+ data flow information. */
+ if (GET_CODE (PATTERN (insn)) == ASM_INPUT)
+ return false;
+
+ df_ref ref;
+ /* Find a USE of the flags register. */
+ FOR_EACH_INSN_USE (ref, insn)
+ if (DF_REF_REGNO (ref) == targetm.flags_regnum)
+ return false;
+
+ /* Find a DEF of the flags register. */
+ FOR_EACH_INSN_DEF (ref, insn)
+ if (DF_REF_REGNO (ref) == targetm.flags_regnum)
+ return false;
+ }
+ return true;
+}
+
+/* Given two SET expressions, SET_A and SET_B, determine whether they form
+ a recognizable pattern when emitted in parallel. Return that parallel
+ if so. Otherwise return NULL. */
+
+static rtx
+try_validate_parallel (rtx set_a, rtx set_b)
+{
+ rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set_a, set_b));
+ rtx_insn *insn = make_insn_raw (par);
+
+ if (insn_invalid_p (insn, false))
+ {
+ crtl->emit.x_cur_insn_uid--;
+ return NULL_RTX;
+ }
+
+ SET_PREV_INSN (insn) = NULL_RTX;
+ SET_NEXT_INSN (insn) = NULL_RTX;
+ INSN_LOCATION (insn) = 0;
+ return insn;
+}
+
+/* For a comparison instruction described by CMP check if it compares a
+ register with zero i.e. it is of the form CC := CMP R1, 0.
+ If it is, find the instruction defining R1 (say I1) and try to create a
+ PARALLEL consisting of I1 and the comparison, representing a flag-setting
+ arithmetic instruction. Example:
+ I1: R1 := R2 + R3
+ <instructions that don't read the condition register>
+ I2: CC := CMP R1 0
+ I2 can be merged with I1 into:
+ I1: { CC := CMP (R2 + R3) 0 ; R1 := R2 + R3 }
+ This catches cases where R1 is used between I1 and I2 and therefore
+ combine and other RTL optimisations will not try to propagate it into
+ I2. Return true if we succeeded in merging CMP. */
+
+static bool
+try_merge_compare (struct comparison *cmp)
+{
+ rtx_insn *cmp_insn = cmp->insn;
+
+ if (cmp->in_b != const0_rtx || cmp->in_a_setter == NULL)
+ return false;
+ rtx in_a = cmp->in_a;
+ df_ref use;
+
+ FOR_EACH_INSN_USE (use, cmp_insn)
+ if (DF_REF_REGNO (use) == REGNO (in_a))
+ break;
+ if (!use)
+ return false;
+
+ rtx_insn *def_insn = cmp->in_a_setter;
+ rtx set = single_set (def_insn);
+
+ if (!can_merge_compare_into_arith (cmp_insn, def_insn))
+ return false;
+
+ rtx src = SET_SRC (set);
+ rtx flags = maybe_select_cc_mode (cmp, src, CONST0_RTX (GET_MODE (src)));
+ if (!flags)
+ {
+ /* We may already have a change group going through maybe_select_cc_mode.
+ Discard it properly. */
+ cancel_changes (0);
+ return false;
+ }
+
+ rtx flag_set
+ = gen_rtx_SET (flags, gen_rtx_COMPARE (GET_MODE (flags),
+ copy_rtx (src),
+ CONST0_RTX (GET_MODE (src))));
+ rtx arith_set = copy_rtx (PATTERN (def_insn));
+ rtx par = try_validate_parallel (flag_set, arith_set);
+ if (!par)
+ {
+ /* We may already have a change group going through maybe_select_cc_mode.
+ Discard it properly. */
+ cancel_changes (0);
+ return false;
+ }
+ if (!apply_change_group ())
+ return false;
+ emit_insn_after (par, def_insn);
+ delete_insn (def_insn);
+ delete_insn (cmp->insn);
+ return true;
+}
+
/* Attempt to replace a comparison with a prior arithmetic insn that can
compute the same flags value as the comparison itself. Return true if
successful, having made all rtl modifications necessary. */
@@ -588,6 +727,9 @@ try_eliminate_compare (struct comparison *cmp)
{
rtx flags, in_a, in_b, cmp_src;
+ if (try_merge_compare (cmp))
+ return true;
+
/* We must have found an interesting "clobber" preceding the compare. */
if (cmp->prev_clobber == NULL)
return false;
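A C-level shape that the new try_merge_compare path targets (illustration
only; the function is made up): the result of the addition stays live past
the comparison, so combine will not fold the compare into the arithmetic,
but the compare against zero can still be merged into a flag-setting form
of the add, as described in the I1/I2 comment above try_merge_compare.

  /* Illustration only: R1 := a + b is still used after the comparison
     against zero, the pattern described above try_merge_compare.  */
  int
  sum_is_zero (int a, int b, int *out)
  {
    int r = a + b;	/* I1: R1 := R2 + R3 */
    *out = r;		/* R1 stays live across the compare */
    return r == 0;	/* I2: CC := CMP R1, 0 */
  }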
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 22702396a9f..3dace854c95 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -378,7 +378,8 @@ i[34567]86-*-*)
avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
- clzerointrin.h pkuintrin.h sgxintrin.h"
+ clzerointrin.h pkuintrin.h sgxintrin.h cetintrin.h
+ gfniintrin.h"
;;
x86_64-*-*)
cpu_type=i386
@@ -402,7 +403,8 @@ x86_64-*-*)
avx512ifmaintrin.h avx512ifmavlintrin.h avx512vbmiintrin.h
avx512vbmivlintrin.h avx5124fmapsintrin.h avx5124vnniwintrin.h
avx512vpopcntdqintrin.h clwbintrin.h mwaitxintrin.h
- clzerointrin.h pkuintrin.h sgxintrin.h"
+ clzerointrin.h pkuintrin.h sgxintrin.h cetintrin.h
+ gfniintrin.h"
;;
ia64-*-*)
extra_headers=ia64intrin.h
@@ -459,7 +461,7 @@ powerpc*-*-*)
extra_objs="rs6000-string.o rs6000-p8swap.o"
extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h"
- extra_headers="${extra_headers} xmmintrin.h mm_malloc.h"
+ extra_headers="${extra_headers} xmmintrin.h mm_malloc.h emmintrin.h"
extra_headers="${extra_headers} mmintrin.h x86intrin.h"
extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h si2vmx.h"
extra_headers="${extra_headers} paired.h"
@@ -874,7 +876,7 @@ case ${target} in
tmake_file="${tmake_file} t-sol2 t-slibgcc"
c_target_objs="${c_target_objs} sol2-c.o"
cxx_target_objs="${cxx_target_objs} sol2-c.o sol2-cxx.o"
- extra_objs="sol2.o sol2-stubs.o"
+ extra_objs="${extra_objs} sol2.o sol2-stubs.o"
extra_options="${extra_options} sol2.opt"
case ${enable_threads}:${have_pthread_h}:${have_thread_h} in
"":yes:* | yes:yes:* )
@@ -1692,7 +1694,7 @@ i[34567]86-*-cygwin*)
tmake_file="${tmake_file} i386/t-cygming t-slibgcc"
target_gtfiles="\$(srcdir)/config/i386/winnt.c"
extra_options="${extra_options} i386/cygming.opt i386/cygwin.opt"
- extra_objs="winnt.o winnt-stubs.o"
+ extra_objs="${extra_objs} winnt.o winnt-stubs.o"
c_target_objs="${c_target_objs} msformat-c.o"
cxx_target_objs="${cxx_target_objs} winnt-cxx.o msformat-c.o"
if test x$enable_threads = xyes; then
@@ -1708,7 +1710,7 @@ x86_64-*-cygwin*)
tmake_file="${tmake_file} i386/t-cygming t-slibgcc i386/t-cygwin-w64"
target_gtfiles="\$(srcdir)/config/i386/winnt.c"
extra_options="${extra_options} i386/cygming.opt i386/cygwin.opt"
- extra_objs="winnt.o winnt-stubs.o"
+ extra_objs="${extra_objs} winnt.o winnt-stubs.o"
c_target_objs="${c_target_objs} msformat-c.o"
cxx_target_objs="${cxx_target_objs} winnt-cxx.o msformat-c.o"
if test x$enable_threads = xyes; then
@@ -1783,7 +1785,7 @@ i[34567]86-*-mingw* | x86_64-*-mingw*)
*)
;;
esac
- extra_objs="winnt.o winnt-stubs.o"
+ extra_objs="${extra_objs} winnt.o winnt-stubs.o"
c_target_objs="${c_target_objs} msformat-c.o"
cxx_target_objs="${cxx_target_objs} winnt-cxx.o msformat-c.o"
gas=yes
@@ -3437,11 +3439,18 @@ if test x$with_cpu = x ; then
esac
;;
powerpc*-*-*spe*)
+ # For SPE, start with 8540, then upgrade to 8548 if
+ # --enable-e500-double was requested explicitly or if we were
+ # configured for e500v2.
+ with_cpu=8540
if test x$enable_e500_double = xyes; then
- with_cpu=8548
- else
- with_cpu=8540
- fi
+ with_cpu=8548
+ fi
+ case ${target_noncanonical} in
+ e500v2*)
+ with_cpu=8548
+ ;;
+ esac
;;
sparc*-*-*)
case ${target} in
@@ -4544,7 +4553,8 @@ case ${target} in
i[34567]86-*-darwin* | x86_64-*-darwin*)
;;
i[34567]86-*-linux* | x86_64-*-linux*)
- tmake_file="$tmake_file i386/t-linux"
+ extra_objs="${extra_objs} cet.o"
+ tmake_file="$tmake_file i386/t-linux i386/t-cet"
;;
i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu)
tmake_file="$tmake_file i386/t-kfreebsd"
diff --git a/gcc/config.in b/gcc/config.in
index 89d7108e8db..5651bcba431 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -717,6 +717,12 @@
#endif
+/* Define if your assembler supports -xbrace_comment option. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_XBRACE_COMMENT_OPTION
+#endif
+
+
/* Define to 1 if you have the `atoq' function. */
#ifndef USED_FOR_TARGET
#undef HAVE_ATOQ
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 8ca4cfc299f..8479b6a1f2c 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -170,6 +170,11 @@ aarch64_types_quadop_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
= { qualifier_none, qualifier_none, qualifier_none,
qualifier_none, qualifier_lane_index };
#define TYPES_QUADOP_LANE (aarch64_types_quadop_lane_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_quadopu_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+ = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
+ qualifier_unsigned, qualifier_lane_index };
+#define TYPES_QUADOPU_LANE (aarch64_types_quadopu_lane_qualifiers)
static enum aarch64_type_qualifiers
aarch64_types_binop_imm_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
@@ -1064,7 +1069,8 @@ aarch64_simd_expand_args (rtx target, int icode, int have_retval,
= GET_MODE_NUNITS (builtin_mode).to_constant ();
aarch64_simd_lane_bounds (op[opc], 0, nunits, exp);
/* Keep to GCC-vector-extension lane indices in the RTL. */
- op[opc] = endian_lane_rtx (builtin_mode, INTVAL (op[opc]));
+ op[opc] = aarch64_endian_lane_rtx (builtin_mode,
+ INTVAL (op[opc]));
}
goto constant_arg;
@@ -1078,7 +1084,7 @@ aarch64_simd_expand_args (rtx target, int icode, int have_retval,
= GET_MODE_NUNITS (vmode).to_constant ();
aarch64_simd_lane_bounds (op[opc], 0, nunits, exp);
/* Keep to GCC-vector-extension lane indices in the RTL. */
- op[opc] = endian_lane_rtx (vmode, INTVAL (op[opc]));
+ op[opc] = aarch64_endian_lane_rtx (vmode, INTVAL (op[opc]));
}
/* Fall through - if the lane index isn't a constant then
the next case will error. */
diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index d7f42b3d5ab..80fe1838eb6 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -108,6 +108,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
aarch64_def_or_undef (TARGET_CRC32, "__ARM_FEATURE_CRC32", pfile);
+ aarch64_def_or_undef (TARGET_DOTPROD, "__ARM_FEATURE_DOTPROD", pfile);
cpp_undef (pfile, "__AARCH64_CMODEL_TINY__");
cpp_undef (pfile, "__AARCH64_CMODEL_SMALL__");
@@ -176,7 +177,7 @@ aarch64_pragma_target_parse (tree args, tree pop_target)
information that it specifies. */
if (args)
{
- if (!aarch64_process_target_attr (args, "pragma"))
+ if (!aarch64_process_target_attr (args))
return false;
aarch64_override_options_internal (&global_options);
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 10893324d3f..cdf047c0fa2 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -83,8 +83,13 @@ AARCH64_CORE("thunderx2t99", thunderx2t99, thunderx2t99, 8_1A, AARCH64_FL_FOR
/* ARMv8.2-A Architecture Processors. */
/* ARM ('A') cores. */
-AARCH64_CORE("cortex-a55", cortexa55, cortexa53, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa53, 0x41, 0xd05, -1)
-AARCH64_CORE("cortex-a75", cortexa75, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, 0xd0a, -1)
+AARCH64_CORE("cortex-a55", cortexa55, cortexa53, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa53, 0x41, 0xd05, -1)
+AARCH64_CORE("cortex-a75", cortexa75, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa73, 0x41, 0xd0a, -1)
+
+/* ARMv8.3-A Architecture Processors. */
+
+/* Qualcomm ('Q') cores. */
+AARCH64_CORE("saphira", saphira, falkor, 8_3A, AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira, 0x51, 0xC01, -1)
/* ARMv8-A big.LITTLE implementations. */
@@ -95,6 +100,6 @@ AARCH64_CORE("cortex-a73.cortex-a53", cortexa73cortexa53, cortexa53, 8A, AARCH
/* ARM DynamIQ big.LITTLE configurations. */
-AARCH64_CORE("cortex-a75.cortex-a55", cortexa75cortexa55, cortexa53, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd0a, 0xd05), -1)
+AARCH64_CORE("cortex-a75.cortex-a55", cortexa75cortexa55, cortexa53, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd0a, 0xd05), -1)
#undef AARCH64_CORE
diff --git a/gcc/config/aarch64/aarch64-modes.def b/gcc/config/aarch64/aarch64-modes.def
index 6d519e08fa2..11bbdfcb55e 100644
--- a/gcc/config/aarch64/aarch64-modes.def
+++ b/gcc/config/aarch64/aarch64-modes.def
@@ -31,15 +31,10 @@ ADJUST_FLOAT_FORMAT (HF, &ieee_half_format);
/* Vector modes. */
-VECTOR_BOOL_MODE (32);
-VECTOR_BOOL_MODE (16);
-VECTOR_BOOL_MODE (8);
-VECTOR_BOOL_MODE (4);
-
-ADJUST_BYTESIZE (V32BI, aarch64_sve_vg);
-ADJUST_BYTESIZE (V16BI, aarch64_sve_vg);
-ADJUST_BYTESIZE (V8BI, aarch64_sve_vg);
-ADJUST_BYTESIZE (V4BI, aarch64_sve_vg);
+VECTOR_BOOL_MODE (32, 4);
+VECTOR_BOOL_MODE (16, 4);
+VECTOR_BOOL_MODE (8, 4);
+VECTOR_BOOL_MODE (4, 4);
ADJUST_NUNITS (V32BI, aarch64_sve_vg * 8);
ADJUST_NUNITS (V16BI, aarch64_sve_vg * 4);
@@ -65,6 +60,10 @@ INT_MODE (OI, 32);
INT_MODE (CI, 48);
INT_MODE (XI, 64);
+/* Define SVE modes for NVECS vectors. VB, VH, VS and VD are the prefixes
+ for 8-bit, 16-bit, 32-bit and 64-bit elements respectively. It isn't
+ strictly necessary to set the alignment here, since the default would
+ be clamped to BIGGEST_ALIGNMENT anyhow, but it seems clearer. */
#define SVE_MODES(NVECS, VB, VH, VS, VD) \
VECTOR_MODES (INT, 32 * NVECS); \
VECTOR_MODES (FLOAT, 32 * NVECS); \
@@ -73,9 +72,20 @@ INT_MODE (XI, 64);
ADJUST_NUNITS (VH##HI, aarch64_sve_vg * NVECS * 4); \
ADJUST_NUNITS (VS##SI, aarch64_sve_vg * NVECS * 2); \
ADJUST_NUNITS (VD##DI, aarch64_sve_vg * NVECS); \
+ ADJUST_NUNITS (VH##HF, aarch64_sve_vg * NVECS * 4); \
ADJUST_NUNITS (VS##SF, aarch64_sve_vg * NVECS * 2); \
- ADJUST_NUNITS (VD##DF, aarch64_sve_vg * NVECS);
-
+ ADJUST_NUNITS (VD##DF, aarch64_sve_vg * NVECS); \
+ \
+ ADJUST_ALIGNMENT (VB##QI, 16); \
+ ADJUST_ALIGNMENT (VH##HI, 16); \
+ ADJUST_ALIGNMENT (VS##SI, 16); \
+ ADJUST_ALIGNMENT (VD##DI, 16); \
+ ADJUST_ALIGNMENT (VH##HF, 16); \
+ ADJUST_ALIGNMENT (VS##SF, 16); \
+ ADJUST_ALIGNMENT (VD##DF, 16);
+
+/* Give SVE vectors the names normally used for 256-bit vectors.
+ The actual number depends on command-line flags. */
SVE_MODES (1, V32, V16, V8, V4)
SVE_MODES (2, V64, V32, V16, V8)
SVE_MODES (3, V96, V48, V24, V12)
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index e348dcd3752..a0bc50ca576 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -43,8 +43,8 @@
AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_SVE, "fp")
/* Enabling "simd" also enables "fp".
- Disabling "simd" also disables "crypto" and "sve". */
-AARCH64_OPT_EXTENSION("simd", AARCH64_FL_SIMD, AARCH64_FL_FP, AARCH64_FL_CRYPTO | AARCH64_FL_SVE, "asimd")
+ Disabling "simd" also disables "crypto", "dotprod" and "sve". */
+AARCH64_OPT_EXTENSION("simd", AARCH64_FL_SIMD, AARCH64_FL_FP, AARCH64_FL_CRYPTO | AARCH64_FL_DOTPROD | AARCH64_FL_SVE, "asimd")
/* Enabling "crypto" also enables "fp", "simd".
Disabling "crypto" just disables "crypto". */
@@ -57,8 +57,8 @@ AARCH64_OPT_EXTENSION("crc", AARCH64_FL_CRC, 0, 0, "crc32")
AARCH64_OPT_EXTENSION("lse", AARCH64_FL_LSE, 0, 0, "atomics")
/* Enabling "fp16" also enables "fp".
- Disabling "fp16" just disables "fp16". */
-AARCH64_OPT_EXTENSION("fp16", AARCH64_FL_F16, AARCH64_FL_FP, 0, "fphp asimdhp")
+ Disabling "fp16" disables "sve" and "fp16". */
+AARCH64_OPT_EXTENSION("fp16", AARCH64_FL_F16, AARCH64_FL_FP, AARCH64_FL_SVE, "fphp asimdhp")
/* Enabling or disabling "rcpc" only changes "rcpc". */
AARCH64_OPT_EXTENSION("rcpc", AARCH64_FL_RCPC, 0, 0, "lrcpc")
@@ -67,7 +67,12 @@ AARCH64_OPT_EXTENSION("rcpc", AARCH64_FL_RCPC, 0, 0, "lrcpc")
Disabling "rdma" just disables "rdma". */
AARCH64_OPT_EXTENSION("rdma", AARCH64_FL_RDMA, AARCH64_FL_FP | AARCH64_FL_SIMD, 0, "asimdrdm")
-/* Enabling "sve" also enables "fp" and "simd".
+/* Enabling "dotprod" also enables "simd".
+ Disabling "dotprod" only disables "dotprod". */
+AARCH64_OPT_EXTENSION("dotprod", AARCH64_FL_DOTPROD, AARCH64_FL_SIMD, 0, "asimddp")
+
+/* Enabling "sve" also enables "fp16", "fp" and "simd".
Disabling "sve" just disables "sve". */
-AARCH64_OPT_EXTENSION("sve", AARCH64_FL_SVE, AARCH64_FL_FP | AARCH64_FL_SIMD, 0, "sve")
+AARCH64_OPT_EXTENSION("sve", AARCH64_FL_SVE, AARCH64_FL_FP | AARCH64_FL_SIMD | AARCH64_FL_F16, 0, "sve")
+
#undef AARCH64_OPT_EXTENSION
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 705675f6d91..39c5b0a7965 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -373,6 +373,7 @@ bool aarch64_legitimate_pic_operand_p (rtx);
bool aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode, rtx, rtx);
bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx);
bool aarch64_move_imm (HOST_WIDE_INT, machine_mode);
+opt_machine_mode aarch64_sve_pred_mode (unsigned int);
bool aarch64_sve_cnt_immediate_p (rtx);
bool aarch64_sve_addvl_addpl_immediate_p (rtx);
bool aarch64_sve_inc_dec_immediate_p (rtx);
@@ -445,8 +446,8 @@ const char * aarch64_output_probe_stack_range (rtx, rtx);
void aarch64_err_no_fpadvsimd (machine_mode, const char *);
void aarch64_expand_epilogue (bool);
void aarch64_expand_mov_immediate (rtx, rtx, rtx (*) (rtx, rtx) = 0);
-void aarch64_expand_sve_mem_move (rtx, rtx, machine_mode,
- rtx (*) (rtx, rtx, rtx));
+void aarch64_emit_sve_pred_move (rtx, rtx, rtx);
+void aarch64_expand_sve_mem_move (rtx, rtx, machine_mode);
void aarch64_expand_prologue (void);
void aarch64_expand_vector_init (rtx, rtx);
void aarch64_init_cumulative_args (CUMULATIVE_ARGS *, const_tree, rtx,
@@ -469,7 +470,7 @@ void aarch64_simd_emit_reg_reg_move (rtx *, machine_mode, unsigned int);
rtx aarch64_simd_expand_builtin (int, tree, rtx);
void aarch64_simd_lane_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT, const_tree);
-rtx endian_lane_rtx (machine_mode, unsigned int);
+rtx aarch64_endian_lane_rtx (machine_mode, unsigned int);
void aarch64_split_128bit_move (rtx, rtx);
@@ -508,7 +509,7 @@ void aarch64_expand_sve_vcond (machine_mode, machine_mode, rtx *);
void aarch64_init_builtins (void);
-bool aarch64_process_target_attr (tree, const char*);
+bool aarch64_process_target_attr (tree);
void aarch64_override_options_internal (struct gcc_options *);
rtx aarch64_expand_builtin (tree exp,
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index d713d5d8b88..52d01342372 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -205,6 +205,14 @@
BUILTIN_VSDQ_I_DI (BINOP, srshl, 0)
BUILTIN_VSDQ_I_DI (BINOP_UUS, urshl, 0)
+ /* Implemented by aarch64_<sur><dotprod>{_lane}{q}<dot_mode>. */
+ BUILTIN_VB (TERNOP, sdot, 0)
+ BUILTIN_VB (TERNOPU, udot, 0)
+ BUILTIN_VB (QUADOP_LANE, sdot_lane, 0)
+ BUILTIN_VB (QUADOPU_LANE, udot_lane, 0)
+ BUILTIN_VB (QUADOP_LANE, sdot_laneq, 0)
+ BUILTIN_VB (QUADOPU_LANE, udot_laneq, 0)
+
BUILTIN_VDQ_I (SHIFTIMM, ashr, 3)
VAR1 (SHIFTIMM, ashr_simd, 0, di)
BUILTIN_VDQ_I (SHIFTIMM, lshr, 3)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 60635aa418c..fcc49e3a2f8 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -80,7 +80,7 @@
)))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "dup\\t%0.<Vtype>, %1.<Vetype>[%2]";
}
[(set_attr "type" "neon_dup<q>")]
@@ -95,13 +95,13 @@
)))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
return "dup\\t%0.<Vtype>, %1.<Vetype>[%2]";
}
[(set_attr "type" "neon_dup<q>")]
)
-(define_insn "*aarch64_simd_mov<mode>"
+(define_insn "*aarch64_simd_mov<VD:mode>"
[(set (match_operand:VD 0 "nonimmediate_operand"
"=w, m, m, w, ?r, ?w, ?r, w")
(match_operand:VD 1 "general_operand"
@@ -124,12 +124,12 @@
default: gcc_unreachable ();
}
}
- [(set_attr "type" "neon_load1_1reg<q>, neon_stp, neon_store1_1reg<q>,\
+ [(set_attr "type" "neon_load1_1reg<q>, store_8, neon_store1_1reg<q>,\
neon_logic<q>, neon_to_gp<q>, f_mcr,\
mov_reg, neon_move<q>")]
)
-(define_insn "*aarch64_simd_mov<mode>"
+(define_insn "*aarch64_simd_mov<VQ:mode>"
[(set (match_operand:VQ 0 "nonimmediate_operand"
"=w, Umq, m, w, ?r, ?w, ?r, w")
(match_operand:VQ 1 "general_operand"
@@ -158,8 +158,8 @@
gcc_unreachable ();
}
}
- [(set_attr "type" "neon_load1_1reg<q>, neon_store1_1reg<q>,\
- neon_stp, neon_logic<q>, multiple, multiple,\
+ [(set_attr "type" "neon_load1_1reg<q>, store_16, neon_store1_1reg<q>,\
+ neon_logic<q>, multiple, multiple,\
multiple, neon_move<q>")
(set_attr "length" "4,4,4,4,8,8,8,4")]
)
@@ -391,6 +391,87 @@
}
)
+;; These instructions map to the __builtins for the Dot Product operations.
+(define_insn "aarch64_<sur>dot<vsi2qi>"
+ [(set (match_operand:VS 0 "register_operand" "=w")
+ (plus:VS (match_operand:VS 1 "register_operand" "0")
+ (unspec:VS [(match_operand:<VSI2QI> 2 "register_operand" "w")
+ (match_operand:<VSI2QI> 3 "register_operand" "w")]
+ DOTPROD)))]
+ "TARGET_DOTPROD"
+ "<sur>dot\\t%0.<Vtype>, %2.<Vdottype>, %3.<Vdottype>"
+ [(set_attr "type" "neon_dot")]
+)
+
+;; These expands map to the Dot Product optab the vectorizer checks for.
+;; The auto-vectorizer expects a dot product builtin that also does an
+;; accumulation into the provided register.
+;; Given the following pattern
+;;
+;; for (i=0; i<len; i++) {
+;; c = a[i] * b[i];
+;; r += c;
+;; }
+;; return r;
+;;
+;; This can be auto-vectorized to
+;; r = a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3];
+;;
+;; given enough iterations. However the vectorizer can keep unrolling the loop
+;; r += a[4]*b[4] + a[5]*b[5] + a[6]*b[6] + a[7]*b[7];
+;; r += a[8]*b[8] + a[9]*b[9] + a[10]*b[10] + a[11]*b[11];
+;; ...
+;;
+;; and so the vectorizer provides r, in which the result has to be accumulated.
+(define_expand "<sur>dot_prod<vsi2qi>"
+ [(set (match_operand:VS 0 "register_operand")
+ (plus:VS (unspec:VS [(match_operand:<VSI2QI> 1 "register_operand")
+ (match_operand:<VSI2QI> 2 "register_operand")]
+ DOTPROD)
+ (match_operand:VS 3 "register_operand")))]
+ "TARGET_DOTPROD"
+{
+ emit_insn (
+ gen_aarch64_<sur>dot<vsi2qi> (operands[3], operands[3], operands[1],
+ operands[2]));
+ emit_insn (gen_rtx_SET (operands[0], operands[3]));
+ DONE;
+})
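For reference, a source loop of the shape described in the comment above, one that the
vectorizer may map onto this dot-product optab when the "dotprod" extension is enabled
(for example an -march value including +dotprod), could look like the sketch below;
whether it is actually vectorized this way depends on the usual cost model and options:

/* Widening multiply-accumulate over bytes: a candidate for the
   <sur>dot_prod expander above (UDOT in the unsigned case).
   Illustration only.  */
unsigned int
dot_u8 (const unsigned char *a, const unsigned char *b, int len)
{
  unsigned int r = 0;
  for (int i = 0; i < len; i++)
    r += a[i] * b[i];
  return r;
}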
+
+;; These instructions map to the __builtins for the Dot Product
+;; indexed operations.
+(define_insn "aarch64_<sur>dot_lane<vsi2qi>"
+ [(set (match_operand:VS 0 "register_operand" "=w")
+ (plus:VS (match_operand:VS 1 "register_operand" "0")
+ (unspec:VS [(match_operand:<VSI2QI> 2 "register_operand" "w")
+ (match_operand:V8QI 3 "register_operand" "<h_con>")
+ (match_operand:SI 4 "immediate_operand" "i")]
+ DOTPROD)))]
+ "TARGET_DOTPROD"
+ {
+ operands[4]
+ = GEN_INT (ENDIAN_LANE_N (V8QImode, INTVAL (operands[4])));
+ return "<sur>dot\\t%0.<Vtype>, %2.<Vdottype>, %3.4b[%4]";
+ }
+ [(set_attr "type" "neon_dot")]
+)
+
+(define_insn "aarch64_<sur>dot_laneq<vsi2qi>"
+ [(set (match_operand:VS 0 "register_operand" "=w")
+ (plus:VS (match_operand:VS 1 "register_operand" "0")
+ (unspec:VS [(match_operand:<VSI2QI> 2 "register_operand" "w")
+ (match_operand:V16QI 3 "register_operand" "<h_con>")
+ (match_operand:SI 4 "immediate_operand" "i")]
+ DOTPROD)))]
+ "TARGET_DOTPROD"
+ {
+ operands[4]
+ = GEN_INT (ENDIAN_LANE_N (V16QImode, INTVAL (operands[4])));
+ return "<sur>dot\\t%0.<Vtype>, %2.<Vdottype>, %3.4b[%4]";
+ }
+ [(set_attr "type" "neon_dot")]
+)
+
(define_expand "copysign<mode>3"
[(match_operand:VHSDF 0 "register_operand")
(match_operand:VHSDF 1 "register_operand")
@@ -419,7 +500,7 @@
(match_operand:VMUL 3 "register_operand" "w")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "<f>mul\\t%0.<Vtype>, %3.<Vtype>, %1.<Vetype>[%2]";
}
[(set_attr "type" "neon<fp>_mul_<stype>_scalar<q>")]
@@ -435,7 +516,7 @@
(match_operand:VMUL_CHANGE_NLANES 3 "register_operand" "w")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
return "<f>mul\\t%0.<Vtype>, %3.<Vtype>, %1.<Vetype>[%2]";
}
[(set_attr "type" "neon<fp>_mul_<Vetype>_scalar<q>")]
@@ -488,7 +569,7 @@
(match_operand:DF 3 "register_operand" "w")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (V2DFmode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (V2DFmode, INTVAL (operands[2]));
return "fmul\\t%0.2d, %3.2d, %1.d[%2]";
}
[(set_attr "type" "neon_fp_mul_d_scalar_q")]
@@ -567,9 +648,8 @@
case 0:
return "and\t%0.<Vbtype>, %1.<Vbtype>, %2.<Vbtype>";
case 1:
- return (aarch64_output_simd_mov_immediate
- (operands[2], GET_MODE_BITSIZE (<MODE>mode).to_constant (),
- AARCH64_CHECK_BIC));
+ return aarch64_output_simd_mov_immediate (operands[2], <bitsize>,
+ AARCH64_CHECK_BIC);
default:
gcc_unreachable ();
}
@@ -589,9 +669,8 @@
case 0:
return "orr\t%0.<Vbtype>, %1.<Vbtype>, %2.<Vbtype>";
case 1:
- return (aarch64_output_simd_mov_immediate
- (operands[2], GET_MODE_BITSIZE (<MODE>mode).to_constant (),
- AARCH64_CHECK_ORR));
+ return aarch64_output_simd_mov_immediate (operands[2], <bitsize>,
+ AARCH64_CHECK_ORR);
default:
gcc_unreachable ();
}
@@ -1073,7 +1152,7 @@
(match_operand:VDQHS 4 "register_operand" "0")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "mla\t%0.<Vtype>, %3.<Vtype>, %1.<Vtype>[%2]";
}
[(set_attr "type" "neon_mla_<Vetype>_scalar<q>")]
@@ -1091,7 +1170,7 @@
(match_operand:VDQHS 4 "register_operand" "0")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
return "mla\t%0.<Vtype>, %3.<Vtype>, %1.<Vtype>[%2]";
}
[(set_attr "type" "neon_mla_<Vetype>_scalar<q>")]
@@ -1131,7 +1210,7 @@
(match_operand:VDQHS 3 "register_operand" "w"))))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "mls\t%0.<Vtype>, %3.<Vtype>, %1.<Vtype>[%2]";
}
[(set_attr "type" "neon_mla_<Vetype>_scalar<q>")]
@@ -1149,7 +1228,7 @@
(match_operand:VDQHS 3 "register_operand" "w"))))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
return "mls\t%0.<Vtype>, %3.<Vtype>, %1.<Vtype>[%2]";
}
[(set_attr "type" "neon_mla_<Vetype>_scalar<q>")]
@@ -1719,7 +1798,7 @@
(match_operand:VDQF 4 "register_operand" "0")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "fmla\\t%0.<Vtype>, %3.<Vtype>, %1.<Vtype>[%2]";
}
[(set_attr "type" "neon_fp_mla_<Vetype>_scalar<q>")]
@@ -1736,7 +1815,7 @@
(match_operand:VDQSF 4 "register_operand" "0")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
return "fmla\\t%0.<Vtype>, %3.<Vtype>, %1.<Vtype>[%2]";
}
[(set_attr "type" "neon_fp_mla_<Vetype>_scalar<q>")]
@@ -1764,7 +1843,7 @@
(match_operand:DF 4 "register_operand" "0")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (V2DFmode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (V2DFmode, INTVAL (operands[2]));
return "fmla\\t%0.2d, %3.2d, %1.2d[%2]";
}
[(set_attr "type" "neon_fp_mla_d_scalar_q")]
@@ -1794,7 +1873,7 @@
(match_operand:VDQF 4 "register_operand" "0")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "fmls\\t%0.<Vtype>, %3.<Vtype>, %1.<Vtype>[%2]";
}
[(set_attr "type" "neon_fp_mla_<Vetype>_scalar<q>")]
@@ -1812,7 +1891,7 @@
(match_operand:VDQSF 4 "register_operand" "0")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
return "fmls\\t%0.<Vtype>, %3.<Vtype>, %1.<Vtype>[%2]";
}
[(set_attr "type" "neon_fp_mla_<Vetype>_scalar<q>")]
@@ -1842,7 +1921,7 @@
(match_operand:DF 4 "register_operand" "0")))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (V2DFmode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (V2DFmode, INTVAL (operands[2]));
return "fmls\\t%0.2d, %3.2d, %1.2d[%2]";
}
[(set_attr "type" "neon_fp_mla_d_scalar_q")]
@@ -2175,7 +2254,7 @@
UNSPEC_ADDV)]
"TARGET_SIMD"
{
- rtx elt = endian_lane_rtx (<MODE>mode, 0);
+ rtx elt = aarch64_endian_lane_rtx (<MODE>mode, 0);
rtx scratch = gen_reg_rtx (<MODE>mode);
emit_insn (gen_aarch64_reduc_plus_internal<mode> (scratch, operands[1]));
emit_insn (gen_aarch64_get_lane<mode> (operands[0], scratch, elt));
@@ -2226,7 +2305,7 @@
UNSPEC_FADDV))]
"TARGET_SIMD"
{
- rtx elt = endian_lane_rtx (V4SFmode, 0);
+ rtx elt = aarch64_endian_lane_rtx (V4SFmode, 0);
rtx scratch = gen_reg_rtx (V4SFmode);
emit_insn (gen_aarch64_faddpv4sf (scratch, operands[1], operands[1]));
emit_insn (gen_aarch64_faddpv4sf (scratch, scratch, scratch));
@@ -2268,7 +2347,7 @@
FMAXMINV)]
"TARGET_SIMD"
{
- rtx elt = endian_lane_rtx (<MODE>mode, 0);
+ rtx elt = aarch64_endian_lane_rtx (<MODE>mode, 0);
rtx scratch = gen_reg_rtx (<MODE>mode);
emit_insn (gen_aarch64_reduc_<maxmin_uns>_internal<mode> (scratch,
operands[1]));
@@ -2284,7 +2363,7 @@
MAXMINV)]
"TARGET_SIMD"
{
- rtx elt = endian_lane_rtx (<MODE>mode, 0);
+ rtx elt = aarch64_endian_lane_rtx (<MODE>mode, 0);
rtx scratch = gen_reg_rtx (<MODE>mode);
emit_insn (gen_aarch64_reduc_<maxmin_uns>_internal<mode> (scratch,
operands[1]));
@@ -2809,7 +2888,7 @@
(parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "smov\\t%<GPI:w>0, %1.<VDQQH:Vetype>[%2]";
}
[(set_attr "type" "neon_to_gp<q>")]
@@ -2823,7 +2902,7 @@
(parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "umov\\t%w0, %1.<Vetype>[%2]";
}
[(set_attr "type" "neon_to_gp<q>")]
@@ -2839,7 +2918,7 @@
(parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
switch (which_alternative)
{
case 0:
@@ -3215,7 +3294,7 @@
UNSPEC_FMULX))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[3]));
return "fmulx\t%<v>0<Vmtype>, %<v>1<Vmtype>, %2.<Vetype>[%3]";
}
[(set_attr "type" "neon_fp_mul_<Vetype>_scalar<q>")]
@@ -3234,7 +3313,7 @@
UNSPEC_FMULX))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
return "fmulx\t%<v>0<Vmtype>, %<v>1<Vmtype>, %2.<Vetype>[%3]";
}
[(set_attr "type" "neon_fp_mul_<Vetype><q>")]
@@ -3268,7 +3347,7 @@
UNSPEC_FMULX))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
return "fmulx\t%<Vetype>0, %<Vetype>1, %2.<Vetype>[%3]";
}
[(set_attr "type" "fmul<Vetype>")]
@@ -3354,7 +3433,7 @@
VQDMULH))]
"TARGET_SIMD"
"*
- operands[3] = endian_lane_rtx (<VCOND>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VCOND>mode, INTVAL (operands[3]));
return \"sq<r>dmulh\\t%0.<Vtype>, %1.<Vtype>, %2.<Vetype>[%3]\";"
[(set_attr "type" "neon_sat_mul_<Vetype>_scalar<q>")]
)
@@ -3369,7 +3448,7 @@
VQDMULH))]
"TARGET_SIMD"
"*
- operands[3] = endian_lane_rtx (<VCONQ>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VCONQ>mode, INTVAL (operands[3]));
return \"sq<r>dmulh\\t%0.<Vtype>, %1.<Vtype>, %2.<Vetype>[%3]\";"
[(set_attr "type" "neon_sat_mul_<Vetype>_scalar<q>")]
)
@@ -3384,7 +3463,7 @@
VQDMULH))]
"TARGET_SIMD"
"*
- operands[3] = endian_lane_rtx (<VCOND>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VCOND>mode, INTVAL (operands[3]));
return \"sq<r>dmulh\\t%<v>0, %<v>1, %2.<v>[%3]\";"
[(set_attr "type" "neon_sat_mul_<Vetype>_scalar<q>")]
)
@@ -3399,7 +3478,7 @@
VQDMULH))]
"TARGET_SIMD"
"*
- operands[3] = endian_lane_rtx (<VCONQ>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VCONQ>mode, INTVAL (operands[3]));
return \"sq<r>dmulh\\t%<v>0, %<v>1, %2.<v>[%3]\";"
[(set_attr "type" "neon_sat_mul_<Vetype>_scalar<q>")]
)
@@ -3431,7 +3510,7 @@
SQRDMLH_AS))]
"TARGET_SIMD_RDMA"
{
- operands[4] = endian_lane_rtx (<VCOND>mode, INTVAL (operands[4]));
+ operands[4] = aarch64_endian_lane_rtx (<VCOND>mode, INTVAL (operands[4]));
return
"sqrdml<SQRDMLH_AS:rdma_as>h\\t%0.<Vtype>, %2.<Vtype>, %3.<Vetype>[%4]";
}
@@ -3449,7 +3528,7 @@
SQRDMLH_AS))]
"TARGET_SIMD_RDMA"
{
- operands[4] = endian_lane_rtx (<VCOND>mode, INTVAL (operands[4]));
+ operands[4] = aarch64_endian_lane_rtx (<VCOND>mode, INTVAL (operands[4]));
return
"sqrdml<SQRDMLH_AS:rdma_as>h\\t%<v>0, %<v>2, %3.<Vetype>[%4]";
}
@@ -3469,7 +3548,7 @@
SQRDMLH_AS))]
"TARGET_SIMD_RDMA"
{
- operands[4] = endian_lane_rtx (<VCONQ>mode, INTVAL (operands[4]));
+ operands[4] = aarch64_endian_lane_rtx (<VCONQ>mode, INTVAL (operands[4]));
return
"sqrdml<SQRDMLH_AS:rdma_as>h\\t%0.<Vtype>, %2.<Vtype>, %3.<Vetype>[%4]";
}
@@ -3487,7 +3566,7 @@
SQRDMLH_AS))]
"TARGET_SIMD_RDMA"
{
- operands[4] = endian_lane_rtx (<VCONQ>mode, INTVAL (operands[4]));
+ operands[4] = aarch64_endian_lane_rtx (<VCONQ>mode, INTVAL (operands[4]));
return
"sqrdml<SQRDMLH_AS:rdma_as>h\\t%<v>0, %<v>2, %3.<v>[%4]";
}
@@ -3531,7 +3610,7 @@
(const_int 1))))]
"TARGET_SIMD"
{
- operands[4] = endian_lane_rtx (<VCOND>mode, INTVAL (operands[4]));
+ operands[4] = aarch64_endian_lane_rtx (<VCOND>mode, INTVAL (operands[4]));
return
"sqdml<SBINQOPS:as>l\\t%<vw2>0<Vmwtype>, %<v>2<Vmtype>, %3.<Vetype>[%4]";
}
@@ -3555,7 +3634,7 @@
(const_int 1))))]
"TARGET_SIMD"
{
- operands[4] = endian_lane_rtx (<VCONQ>mode, INTVAL (operands[4]));
+ operands[4] = aarch64_endian_lane_rtx (<VCONQ>mode, INTVAL (operands[4]));
return
"sqdml<SBINQOPS:as>l\\t%<vw2>0<Vmwtype>, %<v>2<Vmtype>, %3.<Vetype>[%4]";
}
@@ -3578,7 +3657,7 @@
(const_int 1))))]
"TARGET_SIMD"
{
- operands[4] = endian_lane_rtx (<VCOND>mode, INTVAL (operands[4]));
+ operands[4] = aarch64_endian_lane_rtx (<VCOND>mode, INTVAL (operands[4]));
return
"sqdml<SBINQOPS:as>l\\t%<vw2>0<Vmwtype>, %<v>2<Vmtype>, %3.<Vetype>[%4]";
}
@@ -3601,7 +3680,7 @@
(const_int 1))))]
"TARGET_SIMD"
{
- operands[4] = endian_lane_rtx (<VCONQ>mode, INTVAL (operands[4]));
+ operands[4] = aarch64_endian_lane_rtx (<VCONQ>mode, INTVAL (operands[4]));
return
"sqdml<SBINQOPS:as>l\\t%<vw2>0<Vmwtype>, %<v>2<Vmtype>, %3.<Vetype>[%4]";
}
@@ -3696,7 +3775,7 @@
(const_int 1))))]
"TARGET_SIMD"
{
- operands[4] = endian_lane_rtx (<VCOND>mode, INTVAL (operands[4]));
+ operands[4] = aarch64_endian_lane_rtx (<VCOND>mode, INTVAL (operands[4]));
return
"sqdml<SBINQOPS:as>l2\\t%<vw2>0<Vmwtype>, %<v>2<Vmtype>, %3.<Vetype>[%4]";
}
@@ -3722,7 +3801,7 @@
(const_int 1))))]
"TARGET_SIMD"
{
- operands[4] = endian_lane_rtx (<VCONQ>mode, INTVAL (operands[4]));
+ operands[4] = aarch64_endian_lane_rtx (<VCONQ>mode, INTVAL (operands[4]));
return
"sqdml<SBINQOPS:as>l2\\t%<vw2>0<Vmwtype>, %<v>2<Vmtype>, %3.<Vetype>[%4]";
}
@@ -3869,7 +3948,7 @@
(const_int 1)))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<VCOND>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VCOND>mode, INTVAL (operands[3]));
return "sqdmull\\t%<vw2>0<Vmwtype>, %<v>1<Vmtype>, %2.<Vetype>[%3]";
}
[(set_attr "type" "neon_sat_mul_<Vetype>_scalar_long")]
@@ -3890,7 +3969,7 @@
(const_int 1)))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<VCONQ>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VCONQ>mode, INTVAL (operands[3]));
return "sqdmull\\t%<vw2>0<Vmwtype>, %<v>1<Vmtype>, %2.<Vetype>[%3]";
}
[(set_attr "type" "neon_sat_mul_<Vetype>_scalar_long")]
@@ -3910,7 +3989,7 @@
(const_int 1)))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<VCOND>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VCOND>mode, INTVAL (operands[3]));
return "sqdmull\\t%<vw2>0<Vmwtype>, %<v>1<Vmtype>, %2.<Vetype>[%3]";
}
[(set_attr "type" "neon_sat_mul_<Vetype>_scalar_long")]
@@ -3930,7 +4009,7 @@
(const_int 1)))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<VCONQ>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VCONQ>mode, INTVAL (operands[3]));
return "sqdmull\\t%<vw2>0<Vmwtype>, %<v>1<Vmtype>, %2.<Vetype>[%3]";
}
[(set_attr "type" "neon_sat_mul_<Vetype>_scalar_long")]
@@ -4008,7 +4087,7 @@
(const_int 1)))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<VCOND>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VCOND>mode, INTVAL (operands[3]));
return "sqdmull2\\t%<vw2>0<Vmwtype>, %<v>1<Vmtype>, %2.<Vetype>[%3]";
}
[(set_attr "type" "neon_sat_mul_<Vetype>_scalar_long")]
@@ -4031,7 +4110,7 @@
(const_int 1)))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<VCONQ>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<VCONQ>mode, INTVAL (operands[3]));
return "sqdmull2\\t%<vw2>0<Vmwtype>, %<v>1<Vmtype>, %2.<Vetype>[%3]";
}
[(set_attr "type" "neon_sat_mul_<Vetype>_scalar_long")]
@@ -4537,7 +4616,7 @@
UNSPEC_LD2_LANE))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
return "ld2\\t{%S0.<Vetype> - %T0.<Vetype>}[%3], %1";
}
[(set_attr "type" "neon_load2_one_lane")]
@@ -4581,7 +4660,7 @@
UNSPEC_ST2_LANE))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "st2\\t{%S1.<Vetype> - %T1.<Vetype>}[%2], %0";
}
[(set_attr "type" "neon_store2_one_lane<q>")]
@@ -4635,7 +4714,7 @@
UNSPEC_LD3_LANE))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
return "ld3\\t{%S0.<Vetype> - %U0.<Vetype>}[%3], %1";
}
[(set_attr "type" "neon_load3_one_lane")]
@@ -4679,7 +4758,7 @@
UNSPEC_ST3_LANE))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "st3\\t{%S1.<Vetype> - %U1.<Vetype>}[%2], %0";
}
[(set_attr "type" "neon_store3_one_lane<q>")]
@@ -4733,7 +4812,7 @@
UNSPEC_LD4_LANE))]
"TARGET_SIMD"
{
- operands[3] = endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
+ operands[3] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
return "ld4\\t{%S0.<Vetype> - %V0.<Vetype>}[%3], %1";
}
[(set_attr "type" "neon_load4_one_lane")]
@@ -4777,7 +4856,7 @@
UNSPEC_ST4_LANE))]
"TARGET_SIMD"
{
- operands[2] = endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+ operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
return "st4\\t{%S1.<Vetype> - %V1.<Vetype>}[%2], %0";
}
[(set_attr "type" "neon_store4_one_lane<q>")]
@@ -5280,6 +5359,9 @@
[(set_attr "type" "multiple")]
)
+;; This instruction's pattern is generated directly by
+;; aarch64_expand_vec_perm_const, so any changes to the pattern would
+;; need corresponding changes there.
(define_insn "aarch64_<PERMUTE:perm_insn><PERMUTE:perm_hilo><mode>"
[(set (match_operand:VALL_F16 0 "register_operand" "=w")
(unspec:VALL_F16 [(match_operand:VALL_F16 1 "register_operand" "w")
@@ -5290,7 +5372,10 @@
[(set_attr "type" "neon_permute<q>")]
)
-;; Note immediate (third) operand is lane index not byte index.
+;; This instruction's pattern is generated directly by
+;; aarch64_expand_vec_perm_const, so any changes to the pattern would
+;; need corresponding changes there. Note that the immediate (third)
+;; operand is a lane index not a byte index.
(define_insn "aarch64_ext<mode>"
[(set (match_operand:VALL_F16 0 "register_operand" "=w")
(unspec:VALL_F16 [(match_operand:VALL_F16 1 "register_operand" "w")
@@ -5306,6 +5391,9 @@
[(set_attr "type" "neon_ext<q>")]
)
+;; This instruction's pattern is generated directly by
+;; aarch64_expand_vec_perm_const, so any changes to the pattern would
+;; need corresponding changes there.
(define_insn "aarch64_rev<REVERSE:rev_op><mode>"
[(set (match_operand:VALL_F16 0 "register_operand" "=w")
(unspec:VALL_F16 [(match_operand:VALL_F16 1 "register_operand" "w")]
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index b6f839f5643..7052063bb23 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -1,4 +1,4 @@
-;; Machine description for AArch64 AdvSIMD architecture.
+;; Machine description for AArch64 SVE.
;; Copyright (C) 2009-2016 Free Software Foundation, Inc.
;; Contributed by ARM Ltd.
;;
@@ -18,17 +18,66 @@
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
+;; Note on the handling of big-endian SVE
+;; --------------------------------------
+;;
+;; On big-endian systems, Advanced SIMD mov<mode> patterns act in the
+;; same way as movdi or movti would: the first byte of memory goes
+;; into the most significant byte of the register and the last byte
+;; of memory goes into the least significant byte of the register.
+;; This is the most natural ordering for Advanced SIMD and matches
+;; the ABI layout for 64-bit and 128-bit vector types.
+;;
+;; As a result, the order of bytes within the register is what GCC
+;; expects for a big-endian target, and subreg offsets therefore work
+;; as expected, with the first element in memory having subreg offset 0
+;; and the last element in memory having the subreg offset associated
+;; with a big-endian lowpart. However, this ordering also means that
+;; GCC's lane numbering does not match the architecture's numbering:
+;; GCC always treats the element at the lowest address in memory
+;; (subreg offset 0) as element 0, while the architecture treats
+;; the least significant end of the register as element 0.
+;;
+;; The situation for SVE is different. We want the layout of the
+;; SVE register to be same for mov<mode> as it is for maskload<mode>:
+;; logically, a mov<mode> load should be indistinguishable from a
+;; maskload<mode> whose mask is all true. We therefore need the
+;; register layout to match LD1 rather than LDR. The ABI layout of
+;; SVE types also matches LD1 byte ordering rather than LDR byte ordering.
+;;
+;; As a result, the architecture lane numbering matches GCC's lane
+;; numbering, with element 0 always being the first in memory.
+;; However:
+;;
+;; - Applying a subreg offset to a register does not give the element
+;; that GCC expects: the first element in memory has the subreg offset
+;; associated with a big-endian lowpart while the last element in memory
+;; has subreg offset 0. We handle this via TARGET_CAN_CHANGE_MODE_CLASS.
+;;
+;; - We cannot use LDR and STR for spill slots that might be accessed
+;; via subregs, since although the elements have the order GCC expects,
+;; the order of the bytes within the elements is different. We instead
+;; access spill slots via LD1 and ST1, using secondary reloads to
+;; reserve a predicate register.
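As a concrete illustration of the two layouts described above, the sketch below maps
byte B of element E to a register byte position (0 being the least significant byte of
the register) on a big-endian target.  It only models the orderings as described in
this comment and is not part of the sources:

/* LD1-style layout: lanes follow memory order, and the bytes within a
   lane hold a big-endian value, so the first memory byte of element E
   (its most significant byte) lands in the highest byte of lane E.  */
static unsigned int
ld1_reg_byte (unsigned int e, unsigned int b, unsigned int size)
{
  return e * size + (size - 1 - b);
}

/* LDR-style layout: the whole vector is treated like one big-endian
   integer, so the first memory byte lands in the most significant
   register byte.  VL is the vector length in bytes.  */
static unsigned int
ldr_reg_byte (unsigned int e, unsigned int b, unsigned int size,
              unsigned int vl)
{
  return vl - 1 - (e * size + b);
}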
+
+
+;; SVE data moves.
(define_expand "mov<mode>"
[(set (match_operand:SVE_ALL 0 "nonimmediate_operand")
(match_operand:SVE_ALL 1 "general_operand"))]
"TARGET_SVE"
{
- if (MEM_P (operands[0]) || MEM_P (operands[1]))
+ /* Use the predicated load and store patterns where possible.
+ This is required for big-endian targets (see the comment at the
+ head of the file) and increases the addressing choices for
+ little-endian. */
+ if ((MEM_P (operands[0]) || MEM_P (operands[1]))
+ && can_create_pseudo_p ())
{
- aarch64_expand_sve_mem_move (operands[0], operands[1], <VPRED>mode,
- gen_pred_mov<mode>);
+ aarch64_expand_sve_mem_move (operands[0], operands[1], <VPRED>mode);
DONE;
}
+
if (CONSTANT_P (operands[1]))
{
aarch64_expand_mov_immediate (operands[0], operands[1],
@@ -38,26 +87,14 @@
}
)
-;; This will always load and store the elements in little-endian order.
-;: We therefore try to restrict its use to spill slots and make sure that
-;; all loads and stores to spill slots go through this pattern, so that
-;; everything agrees on the local endianness. In particular:
-;;
-;; 1) The pattern doesn't accept moves involving memory operands before
-;; register allocation. The moves should use the richer pred_mov<mode>
-;; pattern instead.
-;;
-;; 2) Big-endian targets use TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P to disallow
-;; memory equivalences during register allocation. This should ensure
-;; that any memory replacements of registers are fresh spill slots.
-;;
-;; ??? There is still the hole that this is effectively a REVB operation
-;; on big-endian targets, rather than the simple move that the RTL pattern
-;; claims.
-(define_insn "aarch64_sve_mov<mode>"
+;; Unpredicated moves (little-endian). Only allow memory operations
+;; during and after RA; before RA we want the predicated load and
+;; store patterns to be used instead.
+(define_insn "*aarch64_sve_mov<mode>_le"
[(set (match_operand:SVE_ALL 0 "aarch64_sve_nonimmediate_operand" "=w, Utr, w, w")
(match_operand:SVE_ALL 1 "aarch64_sve_general_operand" "Utr, w, w, Dn"))]
"TARGET_SVE
+ && !BYTES_BIG_ENDIAN
&& ((lra_in_progress || reload_completed)
|| (register_operand (operands[0], <MODE>mode)
&& nonmemory_operand (operands[1], <MODE>mode)))"
@@ -68,12 +105,51 @@
* return aarch64_output_sve_mov_immediate (operands[1]);"
)
-(define_insn "pred_mov<mode>"
+;; Unpredicated moves (big-endian). Memory accesses require secondary
+;; reloads.
+(define_insn "*aarch64_sve_mov<mode>_be"
+ [(set (match_operand:SVE_ALL 0 "register_operand" "=w, w")
+ (match_operand:SVE_ALL 1 "aarch64_nonmemory_operand" "w, Dn"))]
+ "TARGET_SVE && BYTES_BIG_ENDIAN"
+ "@
+ mov\t%0.d, %1.d
+ * return aarch64_output_sve_mov_immediate (operands[1]);"
+)
+
+;; Handle big-endian memory reloads. We use byte PTRUE for all modes
+;; to try to encourage reuse.
+(define_expand "aarch64_sve_reload_be"
+ [(parallel
+ [(set (match_operand 0)
+ (match_operand 1))
+ (clobber (match_operand:V32BI 2 "register_operand" "=Upl"))])]
+ "TARGET_SVE && BYTES_BIG_ENDIAN"
+ {
+ /* Create a PTRUE. */
+ emit_move_insn (operands[2], CONSTM1_RTX (V32BImode));
+
+ /* Refer to the PTRUE in the appropriate mode for this move. */
+ machine_mode mode = GET_MODE (operands[0]);
+ machine_mode pred_mode
+ = aarch64_sve_pred_mode (GET_MODE_UNIT_SIZE (mode)).require ();
+ rtx pred = gen_lowpart (pred_mode, operands[2]);
+
+ /* Emit a predicated load or store. */
+ aarch64_emit_sve_pred_move (operands[0], pred, operands[1]);
+ DONE;
+ }
+)
+
+;; A predicated load or store for which the predicate is known to be
+;; all-true. Note that this pattern is generated directly by
+;; aarch64_emit_sve_pred_move, so changes to this pattern will
+;; need changes there as well.
+(define_insn "*pred_mov<mode>"
[(set (match_operand:SVE_ALL 0 "nonimmediate_operand" "=w, m")
(unspec:SVE_ALL
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SVE_ALL 2 "nonimmediate_operand" "m, w")]
- UNSPEC_PRED_MOVE))]
+ UNSPEC_MERGE_PTRUE))]
"TARGET_SVE
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[2], <MODE>mode))"
@@ -82,6 +158,62 @@
st1<Vesize>\t%2.<Vetype>, %1, %0"
)
+(define_expand "movmisalign<mode>"
+ [(set (match_operand:SVE_ALL 0 "nonimmediate_operand")
+ (match_operand:SVE_ALL 1 "general_operand"))]
+ "TARGET_SVE"
+ {
+    /* Equivalent to a normal move for our purposes. */
+ emit_move_insn (operands[0], operands[1]);
+ DONE;
+ }
+)
+
+(define_insn "maskload<mode><vpred>"
+ [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
+ (unspec:SVE_ALL
+ [(match_operand:<VPRED> 2 "register_operand" "Upl")
+ (match_operand:SVE_ALL 1 "memory_operand" "m")]
+ UNSPEC_LD1_SVE))]
+ "TARGET_SVE"
+ "ld1<Vesize>\t%0.<Vetype>, %2/z, %1"
+)
+
+(define_insn "maskstore<mode><vpred>"
+ [(set (match_operand:SVE_ALL 0 "memory_operand" "+m")
+ (unspec:SVE_ALL [(match_operand:<VPRED> 2 "register_operand" "Upl")
+ (match_operand:SVE_ALL 1 "register_operand" "w")
+ (match_dup 0)]
+ UNSPEC_ST1_SVE))]
+ "TARGET_SVE"
+ "st1<Vesize>\t%1.<Vetype>, %2, %0"
+)
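These optabs are what the middle end uses for predicated (masked) vector memory
accesses, for example when vectorizing a loop whose stores are guarded by a condition.
A small example of the kind of source they serve (a sketch only; whether it is
vectorized this way depends on the target options and cost model):

/* The stores to dst[] (and, depending on the strategy, the loads) must
   only happen where cond[i] is nonzero, which is what maskload and
   maskstore with a governing predicate express.  */
void
cond_add (int *restrict dst, const int *restrict src,
          const int *restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      dst[i] = src[i] + 1;
}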
+
+(define_expand "firstfault_load<mode>"
+ [(set (match_operand:SVE_ALL 0 "register_operand")
+ (unspec:SVE_ALL
+ [(match_operand:SVE_ALL 1 "aarch64_sve_ldff1_operand")
+ (match_dup 2)
+ (reg:SI FFRT_REGNUM)]
+ UNSPEC_LDFF1))]
+ "TARGET_SVE"
+ {
+ operands[2] = force_reg (<VPRED>mode, CONSTM1_RTX (<VPRED>mode));
+ }
+)
+
+(define_insn "*firstfault_load<mode>"
+ [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
+ (unspec:SVE_ALL
+ [(match_operand:SVE_ALL 1 "aarch64_sve_ldff1_operand" "Utf")
+ (match_operand:<VPRED> 2 "register_operand" "Upl")
+ (reg:SI FFRT_REGNUM)]
+ UNSPEC_LDFF1))]
+ "TARGET_SVE"
+ "ldff1<Vesize>\t%0.<Vetype>, %2/z, %j1";
+)
+
+;; SVE structure moves.
(define_expand "mov<mode>"
[(set (match_operand:SVE_STRUCT 0 "nonimmediate_operand")
(match_operand:SVE_STRUCT 1 "general_operand"))]
@@ -89,8 +221,7 @@
{
if (MEM_P (operands[0]) || MEM_P (operands[1]))
{
- aarch64_expand_sve_mem_move (operands[0], operands[1], <VPRED>mode,
- gen_pred_mov<mode>);
+ aarch64_expand_sve_mem_move (operands[0], operands[1], <VPRED>mode);
DONE;
}
if (CONSTANT_P (operands[1]))
@@ -101,17 +232,38 @@
}
)
-;; See the comments above the SVE_ALL aarch64_sve_mov<mode> for details
-;; of the memory handling.
-(define_insn_and_split "aarch64_sve_mov<mode>"
+;; Unpredicated structure moves (little-endian). Only allow memory operations
+;; during and after RA; before RA we want the predicated load and store
+;; patterns to be used instead.
+(define_insn "*aarch64_sve_mov<mode>_le"
[(set (match_operand:SVE_STRUCT 0 "aarch64_sve_nonimmediate_operand" "=w, Utr, w, w")
(match_operand:SVE_STRUCT 1 "aarch64_sve_general_operand" "Utr, w, w, Dn"))]
"TARGET_SVE
+ && !BYTES_BIG_ENDIAN
&& ((lra_in_progress || reload_completed)
|| (register_operand (operands[0], <MODE>mode)
&& nonmemory_operand (operands[1], <MODE>mode)))"
"#"
- "&& reload_completed"
+ [(set_attr "length" "<insn_length>")]
+)
+
+;; Unpredicated structure moves (big-endian). Memory accesses require
+;; secondary reloads.
+(define_insn "*aarch64_sve_mov<mode>_le"
+ [(set (match_operand:SVE_STRUCT 0 "register_operand" "=w, w")
+ (match_operand:SVE_STRUCT 1 "aarch64_nonmemory_operand" "w, Dn"))]
+ "TARGET_SVE && BYTES_BIG_ENDIAN"
+ "#"
+ [(set_attr "length" "<insn_length>")]
+)
+
+;; Split unpredicated structure moves into pieces. This is the same
+;; for both big-endian and little-endian code, although it only needs
+;; to handle memory operands for little-endian code.
+(define_split
+ [(set (match_operand:SVE_STRUCT 0 "aarch64_sve_nonimmediate_operand")
+ (match_operand:SVE_STRUCT 1 "aarch64_sve_general_operand"))]
+ "TARGET_SVE && reload_completed"
[(const_int 0)]
{
rtx dest = operands[0];
@@ -125,11 +277,10 @@
i * BYTES_PER_SVE_VECTOR);
rtx subsrc = simplify_gen_subreg (<VSINGLE>mode, src, <MODE>mode,
i * BYTES_PER_SVE_VECTOR);
- emit_insn (gen_aarch64_sve_mov<mode> (subdest, subsrc));
+ emit_insn (gen_rtx_SET (subdest, subsrc));
}
DONE;
}
- [(set_attr "length" "<insn_length>")]
)
(define_insn_and_split "pred_mov<mode>"
@@ -137,7 +288,7 @@
(unspec:SVE_STRUCT
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SVE_STRUCT 2 "aarch64_sve_struct_nonimmediate_operand" "Utx, w")]
- UNSPEC_PRED_MOVE))]
+ UNSPEC_MERGE_PTRUE))]
"TARGET_SVE
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[2], <MODE>mode))"
@@ -153,7 +304,7 @@
rtx subsrc = simplify_gen_subreg (<VSINGLE>mode, operands[2],
<MODE>mode,
i * BYTES_PER_SVE_VECTOR);
- emit_insn (gen_pred_mov<vsingle> (subdest, operands[1], subsrc));
+ aarch64_emit_sve_pred_move (subdest, operands[1], subsrc);
}
DONE;
}
@@ -170,73 +321,147 @@
}
)
-(define_expand "movmisalign<mode>"
- [(set (match_operand:SVE_ALL 0 "nonimmediate_operand")
- (match_operand:SVE_ALL 1 "general_operand"))]
+(define_insn "*aarch64_sve_mov<mode>"
+ [(set (match_operand:PRED_ALL 0 "nonimmediate_operand" "=Upa, m, Upa, Upa, Upa")
+ (match_operand:PRED_ALL 1 "general_operand" "Upa, Upa, m, Dz, Dm"))]
+ "TARGET_SVE
+ && (register_operand (operands[0], <MODE>mode)
+ || register_operand (operands[1], <MODE>mode))"
+ "@
+ mov\t%0.b, %1.b
+ str\t%1, %0
+ ldr\t%0, %1
+ pfalse\t%0.b
+ * return aarch64_output_ptrue (<MODE>mode, '<Vetype>');"
+)
+
+;; Handle extractions from a predicate by converting to an integer vector
+;; and extracting from there.
+(define_expand "vec_extract<vpred><Vel>"
+ [(match_operand:<VEL> 0 "register_operand")
+ (match_operand:<VPRED> 1 "register_operand")
+ (match_operand:SI 2 "nonmemory_operand")
+ ;; Dummy operand to which we can attach the iterator.
+ (reg:SVE_I V0_REGNUM)]
"TARGET_SVE"
{
- /* Equivalent to a normal move for our purpooses. */
- emit_move_insn (operands[0], operands[1]);
+ rtx tmp = gen_reg_rtx (<MODE>mode);
+ emit_insn (gen_aarch64_sve_dup<mode>_const (tmp, operands[1],
+ CONST1_RTX (<MODE>mode),
+ CONST0_RTX (<MODE>mode)));
+ emit_insn (gen_vec_extract<mode><Vel> (operands[0], tmp, operands[2]));
DONE;
}
)
-(define_insn "maskload<mode><vpred>"
- [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
- (unspec:SVE_ALL
- [(match_operand:SVE_ALL 1 "memory_operand" "m")
- (match_operand:<VPRED> 2 "register_operand" "Upl")]
- UNSPEC_LD1_SVE))]
+(define_expand "vec_extract<mode><Vel>"
+ [(set (match_operand:<VEL> 0 "register_operand")
+ (vec_select:<VEL>
+ (match_operand:SVE_ALL 1 "register_operand")
+ (parallel [(match_operand:SI 2 "nonmemory_operand")])))]
"TARGET_SVE"
- "ld1<Vesize>\t%0.<Vetype>, %2/z, %1"
+ {
+ poly_int64 val;
+ if (poly_int_rtx_p (operands[2], &val)
+ && must_eq (val, GET_MODE_NUNITS (<MODE>mode) - 1))
+ {
+ /* The last element can be extracted with a LASTB and a false
+ predicate. */
+ rtx sel = force_reg (<VPRED>mode, CONST0_RTX (<VPRED>mode));
+ emit_insn (gen_extract_last_<mode> (operands[0], operands[1], sel));
+ DONE;
+ }
+ if (!CONST_INT_P (operands[2]))
+ {
+ /* Create an index with operand[2] as the base and -1 as the step.
+ It will then be zero for the element we care about. */
+ rtx index = gen_lowpart (<VEL_INT>mode, operands[2]);
+ index = force_reg (<VEL_INT>mode, index);
+ rtx series = gen_reg_rtx (<V_INT_EQUIV>mode);
+ emit_insn (gen_vec_series<v_int_equiv> (series, index, constm1_rtx));
+
+ /* Get a predicate that is true for only that element. */
+ rtx zero = CONST0_RTX (<V_INT_EQUIV>mode);
+ rtx cmp = gen_rtx_EQ (<V_INT_EQUIV>mode, series, zero);
+ rtx sel = gen_reg_rtx (<VPRED>mode);
+ emit_insn (gen_vec_cmp<v_int_equiv><vpred> (sel, cmp, series, zero));
+
+ /* Select the element using LASTB. */
+ emit_insn (gen_extract_last_<mode> (operands[0], operands[1], sel));
+ DONE;
+ }
+ }
)
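A scalar model of the variable-index path above may make the trick clearer: INDEX
builds the series idx, idx-1, idx-2, ..., the compare against zero gives a predicate
with at most one active lane, and LASTB then returns the last (here the only) active
element.  This is a sketch for an in-range index, not part of the sources:

/* Scalar reference for extracting element IDX from VEC[0..NELTS-1]
   using the series/compare/LASTB sequence emitted above.  */
static int
extract_var_index (const int *vec, unsigned int nelts, unsigned int idx)
{
  int last_active = 0;
  for (unsigned int i = 0; i < nelts; i++)
    {
      int series = (int) idx - (int) i;   /* INDEX idx, -1 */
      if (series == 0)                    /* compare against zero */
        last_active = vec[i];             /* LASTB keeps the last active lane */
    }
  return last_active;
}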
-(define_insn "maskstore<mode><vpred>"
- [(set (match_operand:SVE_ALL 0 "memory_operand" "+m")
- (unspec:SVE_ALL [(match_operand:SVE_ALL 1 "register_operand" "w")
- (match_operand:<VPRED> 2 "register_operand" "Upl")
- (match_dup 0)]
- UNSPEC_ST1_SVE))]
- "TARGET_SVE"
- "st1<Vesize>\t%1.<Vetype>, %2, %0"
+;; Extract an element from the Advanced SIMD portion of the register.
+;; We don't just reuse the aarch64-simd.md pattern because we don't
+;; want any change in lane numbering on big-endian targets.
+(define_insn "*vec_extract<mode><Vel>_v128"
+ [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=r, w, Utv")
+ (vec_select:<VEL>
+ (match_operand:SVE_ALL 1 "register_operand" "w, w, w")
+ (parallel [(match_operand:SI 2 "const_int_operand")])))]
+ "TARGET_SVE
+ && IN_RANGE (INTVAL (operands[2]) * GET_MODE_SIZE (<VEL>mode), 0, 15)"
+ {
+ operands[1] = gen_lowpart (<V128>mode, operands[1]);
+ switch (which_alternative)
+ {
+ case 0:
+ return "umov\\t%<vwcore>0, %1.<Vetype>[%2]";
+ case 1:
+ return "dup\\t%<Vetype>0, %1.<Vetype>[%2]";
+ case 2:
+ return "st1\\t{%1.<Vetype>}[%2], %0";
+ default:
+ gcc_unreachable ();
+ }
+ }
+ [(set_attr "type" "neon_to_gp_q, neon_dup_q, neon_store1_one_lane_q")]
)
-(define_expand "firstfault_load<mode>"
- [(set (match_operand:SVE_ALL 0 "register_operand")
- (unspec:SVE_ALL
- [(match_operand:SVE_ALL 1 "aarch64_sve_ldff1_operand")
- (match_dup 2)
- (reg:SI FFRT_REGNUM)]
- UNSPEC_LDFF1))]
- "TARGET_SVE"
+;; Extract an element in the range of DUP. This pattern allows the
+;; source and destination to be different.
+(define_insn "*vec_extract<mode><Vel>_dup"
+ [(set (match_operand:<VEL> 0 "register_operand" "=w")
+ (vec_select:<VEL>
+ (match_operand:SVE_ALL 1 "register_operand" "w")
+ (parallel [(match_operand:SI 2 "const_int_operand")])))]
+ "TARGET_SVE
+ && IN_RANGE (INTVAL (operands[2]) * GET_MODE_SIZE (<VEL>mode), 16, 63)"
{
- operands[2] = force_reg (<VPRED>mode, CONSTM1_RTX (<VPRED>mode));
+ operands[0] = gen_rtx_REG (<MODE>mode, REGNO (operands[0]));
+ return "dup\t%0.<Vetype>, %1.<Vetype>[%2]";
}
)
-(define_insn "*firstfault_load<mode>"
- [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
- (unspec:SVE_ALL
- [(match_operand:SVE_ALL 1 "aarch64_sve_ldff1_operand" "Utf")
- (match_operand:<VPRED> 2 "register_operand" "Upl")
- (reg:SI FFRT_REGNUM)]
- UNSPEC_LDFF1))]
- "TARGET_SVE"
- "ldff1<Vesize>\t%0.<Vetype>, %2/z, %j1";
+;; Extract an element outside the range of DUP. This pattern requires the
+;; source and destination to be the same.
+(define_insn "*vec_extract<mode><Vel>_ext"
+ [(set (match_operand:<VEL> 0 "register_operand" "=w")
+ (vec_select:<VEL>
+ (match_operand:SVE_ALL 1 "register_operand" "0")
+ (parallel [(match_operand:SI 2 "const_int_operand")])))]
+ "TARGET_SVE && INTVAL (operands[2]) * GET_MODE_SIZE (<VEL>mode) >= 64"
+ {
+ operands[0] = gen_rtx_REG (<MODE>mode, REGNO (operands[0]));
+ operands[2] = GEN_INT (INTVAL (operands[2]) * GET_MODE_SIZE (<VEL>mode));
+ return "ext\t%0.b, %0.b, %0.b, #%2";
+ }
)
-(define_insn "*aarch64_sve_mov<mode>"
- [(set (match_operand:PRED_ALL 0 "nonimmediate_operand" "=Upa, m, Upa, Upa, Upa")
- (match_operand:PRED_ALL 1 "general_operand" "Upa, Upa, m, Dz, Dm"))]
- "TARGET_SVE
- && (register_operand (operands[0], <MODE>mode)
- || register_operand (operands[1], <MODE>mode))"
+;; Extract the last active element of operand 1 into operand 0.
+;; If no elements are active, extract the last inactive element instead.
+(define_insn "extract_last_<mode>"
+ [(set (match_operand:<VEL> 0 "register_operand" "=r, w")
+ (unspec:<VEL>
+ [(match_operand:SVE_ALL 1 "register_operand" "w, w")
+ (match_operand:<VPRED> 2 "register_operand" "Upl, Upl")]
+ UNSPEC_LASTB))]
+ "TARGET_SVE"
"@
- mov\t%0.b, %1.b
- str\t%1, %0
- ldr\t%0, %1
- pfalse\t%0.b
- * return aarch64_output_ptrue (<MODE>mode, '<Vetype>');"
+ lastb\t%<vwcore>0, %2, %1.<Vetype>
+ lastb\t%<Vetype>0, %2, %1.<Vetype>"
)
(define_expand "vec_duplicate<mode>"
@@ -318,13 +543,13 @@
(define_insn "vec_series<mode>"
[(set (match_operand:SVE_I 0 "register_operand" "=w, w, w")
(vec_series:SVE_I
- (match_operand:<VEL> 1 "aarch64_sve_index_operand" "r, Di, r")
- (match_operand:<VEL> 2 "aarch64_sve_index_operand" "r, r, Di")))]
+ (match_operand:<VEL> 1 "aarch64_sve_index_operand" "Di, r, r")
+ (match_operand:<VEL> 2 "aarch64_sve_index_operand" "r, Di, r")))]
"TARGET_SVE"
"@
- index\t%0.<Vetype>, %<vw>1, %<vw>2
index\t%0.<Vetype>, #%1, %<vw>2
- index\t%0.<Vetype>, %<vw>1, #%2"
+ index\t%0.<Vetype>, %<vw>1, #%2
+ index\t%0.<Vetype>, %<vw>1, %<vw>2"
)
(define_expand "vec_gather_loads<mode>"
@@ -451,6 +676,8 @@
st1<Vesize>\t%3.<Vetype>, %4, [%0, %1.<Vetype><gather_scaled_modu>]"
)
+;; Optimize {x, x, x, x, ...} + {0, n, 2*n, 3*n, ...} if n is in range
+;; of an INDEX instruction.
(define_insn "*vec_series<mode>_plus"
[(set (match_operand:SVE_I 0 "register_operand" "=w")
(plus:SVE_I
@@ -626,15 +853,16 @@
(match_operand:SVE_ALL 2 "register_operand")
(match_operand:<V_INT_EQUIV> 3)]
"TARGET_SIMD"
-{
- unsigned int nunits;
- if (GET_MODE_NUNITS (<MODE>mode).is_constant (&nunits)
- && aarch64_expand_vec_perm_const (operands[0], operands[1],
- operands[2], operands[3], nunits))
- DONE;
- else
- FAIL;
-})
+ {
+ unsigned int nunits;
+ if (GET_MODE_NUNITS (<MODE>mode).is_constant (&nunits)
+ && aarch64_expand_vec_perm_const (operands[0], operands[1],
+ operands[2], operands[3], nunits))
+ DONE;
+ else
+ FAIL;
+ }
+)
(define_expand "vec_perm<mode>"
[(match_operand:SVE_ALL 0 "register_operand")
@@ -659,18 +887,18 @@
(define_expand "<perm_optab>_<mode>"
[(set (match_operand:SVE_ALL 0 "register_operand")
(unspec:SVE_ALL [(match_operand:SVE_ALL 1 "register_operand")
- (match_operand:SVE_ALL 2 "register_operand")]
- OPTAB_PERMUTE))]
+ (match_operand:SVE_ALL 2 "register_operand")]
+ OPTAB_PERMUTE))]
"TARGET_SVE")
(define_insn "vec_reverse_<mode>"
[(set (match_operand:SVE_ALL 0 "register_operand" "=w")
(unspec:SVE_ALL [(match_operand:SVE_ALL 1 "register_operand" "w")]
- UNSPEC_REV))]
+ UNSPEC_REV))]
"TARGET_SVE"
"rev\t%0.<Vetype>, %1.<Vetype>")
-(define_insn "sve_tbl1<mode>"
+(define_insn "*aarch64_sve_tbl<mode>"
[(set (match_operand:SVE_ALL 0 "register_operand" "=w")
(unspec:SVE_ALL
[(match_operand:SVE_ALL 1 "register_operand" "w")
@@ -680,7 +908,7 @@
"tbl\t%0.<Vetype>, %1.<Vetype>, %2.<Vetype>"
)
-(define_insn "sve_<perm_insn><perm_hilo><mode>"
+(define_insn "*aarch64_sve_<perm_insn><perm_hilo><mode>"
[(set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
(unspec:PRED_ALL [(match_operand:PRED_ALL 1 "register_operand" "Upa")
(match_operand:PRED_ALL 2 "register_operand" "Upa")]
@@ -689,7 +917,7 @@
"<perm_insn><perm_hilo>\t%0.<Vetype>, %1.<Vetype>, %2.<Vetype>"
)
-(define_insn "sve_<perm_insn><perm_hilo><mode>"
+(define_insn "*aarch64_sve_<perm_insn><perm_hilo><mode>"
[(set (match_operand:SVE_ALL 0 "register_operand" "=w")
(unspec:SVE_ALL [(match_operand:SVE_ALL 1 "register_operand" "w")
(match_operand:SVE_ALL 2 "register_operand" "w")]
@@ -698,7 +926,7 @@
"<perm_insn><perm_hilo>\t%0.<Vetype>, %1.<Vetype>, %2.<Vetype>"
)
-(define_insn "sve_rev64<mode>"
+(define_insn "*aarch64_sve_rev64<mode>"
[(set (match_operand:SVE_BHS 0 "register_operand" "=w")
(unspec:SVE_BHS
[(match_operand:V4BI 1 "register_operand" "Upl")
@@ -709,7 +937,7 @@
"rev<Vesize>\t%0.d, %1/m, %2.d"
)
-(define_insn "sve_rev32<mode>"
+(define_insn "*aarch64_sve_rev32<mode>"
[(set (match_operand:SVE_BH 0 "register_operand" "=w")
(unspec:SVE_BH
[(match_operand:V8BI 1 "register_operand" "Upl")
@@ -720,7 +948,7 @@
"rev<Vesize>\t%0.s, %1/m, %2.s"
)
-(define_insn "sve_rev16v32qi"
+(define_insn "*aarch64_sve_rev16v32qi"
[(set (match_operand:V32QI 0 "register_operand" "=w")
(unspec:V32QI
[(match_operand:V16BI 1 "register_operand" "Upl")
@@ -731,12 +959,12 @@
"revb\t%0.h, %1/m, %2.h"
)
-(define_insn "sve_dup_lane<mode>"
+(define_insn "*aarch64_sve_dup_lane<mode>"
[(set (match_operand:SVE_ALL 0 "register_operand" "=w")
(vec_duplicate:SVE_ALL
(vec_select:<VEL>
(match_operand:SVE_ALL 1 "register_operand" "w")
- (parallel [(match_operand:SI 2 "const_int_operand" "i")]))))]
+ (parallel [(match_operand:SI 2 "const_int_operand")]))))]
"TARGET_SVE
&& IN_RANGE (INTVAL (operands[2]) * GET_MODE_SIZE (<VEL>mode), 0, 63)"
"dup\t%0.<Vetype>, %1.<Vetype>[%2]"
@@ -744,17 +972,17 @@
;; Note that the immediate (third) operand is the lane index not
;; the byte index.
-(define_insn "aarch64_ext<mode>"
+(define_insn "*aarch64_sve_ext<mode>"
[(set (match_operand:SVE_ALL 0 "register_operand" "=w")
(unspec:SVE_ALL [(match_operand:SVE_ALL 1 "register_operand" "0")
(match_operand:SVE_ALL 2 "register_operand" "w")
- (match_operand:SI 3 "const_int_operand" "i")]
+ (match_operand:SI 3 "const_int_operand")]
UNSPEC_EXT))]
"TARGET_SVE
&& IN_RANGE (INTVAL (operands[3]) * GET_MODE_SIZE (<VEL>mode), 0, 255)"
{
operands[3] = GEN_INT (INTVAL (operands[3]) * GET_MODE_SIZE (<VEL>mode));
- return "ext\\t%0.b, %1.b, %2.b, #%3";
+ return "ext\\t%0.b, %0.b, %2.b, #%3";
}
)
@@ -765,7 +993,7 @@
(match_operand:SVE_I 2 "aarch64_sve_add_operand" "vsa, vsn, vsi, w")))]
"TARGET_SVE"
"@
- add\t%0.<Vetype>, %0.<Vetype>, #%2
+ add\t%0.<Vetype>, %0.<Vetype>, #%D2
sub\t%0.<Vetype>, %0.<Vetype>, #%N2
* return aarch64_output_sve_inc_dec_immediate (\"%0.<Vetype>\", operands[2]);
add\t%0.<Vetype>, %1.<Vetype>, %2.<Vetype>"
@@ -779,9 +1007,10 @@
"TARGET_SVE"
"@
sub\t%0.<Vetype>, %1.<Vetype>, %2.<Vetype>
- subr\t%0.<Vetype>, %0.<Vetype>, #%1"
+ subr\t%0.<Vetype>, %0.<Vetype>, #%D1"
)
+;; Unpredicated multiplication.
(define_expand "mul<mode>3"
[(set (match_operand:SVE_I 0 "register_operand")
(unspec:SVE_I
@@ -796,9 +1025,10 @@
}
)
-;; We don't actually need the predicate for the first operand, but using Upa
-;; or X isn't likely to gain much and would make the instruction seem less
-;; uniform to the register allocator.
+;; Multiplication predicated with a PTRUE. We don't actually need the
+;; predicate for the first alternative, but using Upa or X isn't likely
+;; to gain much and would make the instruction seem less uniform to the
+;; register allocator.
(define_insn "*mul<mode>3"
[(set (match_operand:SVE_I 0 "register_operand" "=w, w")
(unspec:SVE_I
@@ -843,6 +1073,7 @@
mls\t%0.<Vetype>, %1/m, %2.<Vetype>, %3.<Vetype>"
)
+;; Unpredicated NEG, NOT and POPCOUNT.
(define_expand "<optab><mode>2"
[(set (match_operand:SVE_I 0 "register_operand")
(unspec:SVE_I
@@ -855,6 +1086,7 @@
}
)
+;; NEG, NOT and POPCOUNT predicated with a PTRUE.
(define_insn "*<optab><mode>2"
[(set (match_operand:SVE_I 0 "register_operand" "=w")
(unspec:SVE_I
@@ -866,6 +1098,7 @@
"<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
)
+;; Vector AND, ORR and XOR.
(define_insn "<optab><mode>3"
[(set (match_operand:SVE_I 0 "register_operand" "=w, w")
(LOGICAL:SVE_I
@@ -877,6 +1110,9 @@
<logical>\t%0.d, %1.d, %2.d"
)
+;; Vector AND, ORR and XOR on floating-point modes. We avoid subregs
+;; by providing this, but we need to use UNSPECs since rtx logical ops
+;; aren't defined for floating-point modes.
(define_insn "*<optab><mode>3"
[(set (match_operand:SVE_F 0 "register_operand" "=w")
(unspec:SVE_F [(match_operand:SVE_F 1 "register_operand" "w")
@@ -897,6 +1133,7 @@
"bic\t%0.d, %2.d, %1.d"
)
+;; Predicate AND. We can reuse one of the inputs as the GP.
(define_insn "and<mode>3"
[(set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
(and:PRED_ALL (match_operand:PRED_ALL 1 "register_operand" "Upa")
@@ -905,6 +1142,7 @@
"and\t%0.b, %1/z, %1.b, %2.b"
)
+;; Unpredicated predicate ORR and XOR.
(define_expand "<optab><mode>3"
[(set (match_operand:PRED_ALL 0 "register_operand")
(and:PRED_ALL
@@ -918,6 +1156,7 @@
}
)
+;; Predicated predicate ORR and XOR.
(define_insn "pred_<optab><mode>3"
[(set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
(and:PRED_ALL
@@ -929,12 +1168,12 @@
"<logical>\t%0.b, %1/z, %2.b, %3.b"
)
-;; Perform a logical operation on the active bits of operands 2 and 3,
-;; using operand 1 as the GP. Store the result in operand 0 and set the flags
-;; in the same way as for PTEST. The (and ...) in the UNSPEC_PTEST_PTRUE is
-;; logically redundant, but means that the tested value is structurally
-;; equivalent to rhs of the second set.
-(define_insn "*pred_<optab><mode>3_cc"
+;; Perform a logical operation on operands 2 and 3, using operand 1 as
+;; the GP (which is known to be a PTRUE). Store the result in operand 0
+;; and set the flags in the same way as for PTEST. The (and ...) in the
+;; UNSPEC_PTEST_PTRUE is logically redundant, but means that the tested
+;; value is structurally equivalent to the rhs of the second set.
+(define_insn "*<optab><mode>3_cc"
[(set (reg:CC CC_REGNUM)
(compare:CC
(unspec:SI [(match_operand:PRED_ALL 1 "register_operand" "Upa")
@@ -952,6 +1191,7 @@
"<logical>s\t%0.b, %1/z, %2.b, %3.b"
)
+;; Unpredicated predicate inverse.
(define_expand "one_cmpl<mode>2"
[(set (match_operand:PRED_ALL 0 "register_operand")
(and:PRED_ALL
@@ -963,6 +1203,7 @@
}
)
+;; Predicated predicate inverse.
(define_insn "*one_cmpl<mode>3"
[(set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
(and:PRED_ALL
@@ -972,6 +1213,7 @@
"not\t%0.b, %1/z, %2.b"
)
+;; Predicated predicate BIC and ORN.
(define_insn "*<nlogical><mode>3"
[(set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
(and:PRED_ALL
@@ -983,6 +1225,7 @@
"<nlogical>\t%0.b, %1/z, %3.b, %2.b"
)
+;; Predicated predicate NAND and NOR.
(define_insn "*<logical_nn><mode>3"
[(set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
(and:PRED_ALL
@@ -1004,6 +1247,7 @@
"brka\t%0.b, %1/z, %2.b"
)
+;; Unpredicated LSL, LSR and ASR by a vector.
(define_expand "v<optab><mode>3"
[(set (match_operand:SVE_I 0 "register_operand")
(unspec:SVE_I
@@ -1018,6 +1262,10 @@
}
)
+;; LSL, LSR and ASR by a vector, predicated with a PTRUE. We don't
+;; actually need the predicate for the first alternative, but using Upa
+;; or X isn't likely to gain much and would make the instruction seem
+;; less uniform to the register allocator.
(define_insn "*v<optab><mode>3"
[(set (match_operand:SVE_I 0 "register_operand" "=w, w")
(unspec:SVE_I
@@ -1032,6 +1280,8 @@
<shift>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
)
+;; LSL, LSR and ASR by a scalar, which expands into one of the vector
+;; shifts above.
(define_expand "<ASHIFT:optab><mode>3"
[(set (match_operand:SVE_I 0 "register_operand")
(ASHIFT:SVE_I (match_operand:SVE_I 1 "register_operand")
@@ -1057,7 +1307,7 @@
}
)
-;; Test all bits of operand 1. Operand 0 is a PTRUE GP.
+;; Test all bits of operand 1. Operand 0 is a GP that is known to hold PTRUE.
;;
;; Using UNSPEC_PTEST_PTRUE allows combine patterns to assume that the GP
;; is a PTRUE even if the optimizers haven't yet been able to propagate
@@ -1074,6 +1324,8 @@
"ptest\t%0, %1.b"
)
+;; Set element I of the result if operand1 + J < operand2 for all J in [0, I],
+;; with the comparison being unsigned.
(define_insn "while_ult<GPI:mode><PRED_ALL:mode>"
[(set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
(unspec:PRED_ALL [(match_operand:GPI 1 "aarch64_reg_or_zero" "rZ")
@@ -1084,6 +1336,9 @@
"whilelo\t%0.<PRED_ALL:Vetype>, %<w>1, %<w>2"
)
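A scalar model of the comparison, matching the comment above (WHILELO is typically used
to build the predicate for the final, partial iteration of a vector loop).  A sketch
only; it ignores the flag-setting behaviour handled by the _cc pattern below:

/* Lane I of the result is active when operand1 + I is unsigned-less-than
   operand2.  */
static void
whilelo_model (unsigned char *pred, unsigned int nelts,
               unsigned long long base, unsigned long long bound)
{
  for (unsigned int i = 0; i < nelts; i++)
    pred[i] = (base + i < bound) ? 1 : 0;
}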
+;; WHILELO sets the flags in the same way as a PTEST with a PTRUE GP.
+;; Handle the case in which both results are useful. The GP operand
+;; to the PTEST isn't needed, so we allow it to be anything.
(define_insn_and_split "while_ult<GPI:mode><PRED_ALL:mode>_cc"
[(set (reg:CC CC_REGNUM)
(compare:CC
@@ -1112,39 +1367,39 @@
}
)
-;; Predicated comparison.
+;; Predicated integer comparison.
(define_insn "*vec_cmp<cmp_op>_<mode>"
[(set (match_operand:<VPRED> 0 "register_operand" "=Upa, Upa")
(unspec:<VPRED>
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SVE_I 2 "register_operand" "w, w")
- (match_operand:SVE_I 3 "aarch64_sve_cmp_<imm_con>_operand" "w, <imm_con>")]
+ (match_operand:SVE_I 3 "aarch64_sve_cmp_<imm_con>_operand" "<imm_con>, w")]
SVE_COND_INT_CMP))
(clobber (reg:CC CC_REGNUM))]
"TARGET_SVE"
"@
- cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>
- cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, #%3"
+ cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, #%3
+ cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>"
)
-;; Predicated comparison in which only the flags result is interesting.
+;; Predicated integer comparison in which only the flags result is interesting.
(define_insn "*vec_cmp<cmp_op>_<mode>_ptest"
[(set (reg:CC CC_REGNUM)
(compare:CC
(unspec:SI
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(unspec:<VPRED>
- [(match_operand:SVE_I 2 "register_operand" "w, w")
- (match_operand:SVE_I 3 "aarch64_sve_cmp_<imm_con>_operand" "w, <imm_con>")
- (match_dup 3)]
+ [(match_dup 1)
+ (match_operand:SVE_I 2 "register_operand" "w, w")
+ (match_operand:SVE_I 3 "aarch64_sve_cmp_<imm_con>_operand" "<imm_con>, w")]
SVE_COND_INT_CMP)]
UNSPEC_PTEST_PTRUE)
(const_int 0)))
(clobber (match_scratch:<VPRED> 0 "=Upa, Upa"))]
"TARGET_SVE"
"@
- cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>
- cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, #%3"
+ cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, #%3
+ cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>"
)
;; Predicated comparison in which both the flag and predicate results
@@ -1157,7 +1412,7 @@
(unspec:<VPRED>
[(match_dup 1)
(match_operand:SVE_I 2 "register_operand" "w, w")
- (match_operand:SVE_I 3 "aarch64_sve_cmp_<imm_con>_operand" "w, <imm_con>")]
+ (match_operand:SVE_I 3 "aarch64_sve_cmp_<imm_con>_operand" "<imm_con>, w")]
SVE_COND_INT_CMP)]
UNSPEC_PTEST_PTRUE)
(const_int 0)))
@@ -1169,8 +1424,8 @@
SVE_COND_INT_CMP))]
"TARGET_SVE"
"@
- cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>
- cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, #%3"
+ cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, #%3
+ cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>"
)
(define_insn "*fold_cond_<cmp_op><mode>"
@@ -1179,29 +1434,32 @@
(unspec:<VPRED>
[(match_operand:<VPRED> 1 "aarch64_simd_imm_minus_one")
(match_operand:SVE_I 2 "register_operand" "w, w")
- (match_operand:SVE_I 3 "aarch64_sve_cmp_<imm_con>_operand" "w, <imm_con>")]
+ (match_operand:SVE_I 3 "aarch64_sve_cmp_<imm_con>_operand" "<imm_con>, w")]
SVE_COND_INT_CMP)
(match_operand:<VPRED> 4 "register_operand" "Upl, Upl")))
(clobber (reg:CC CC_REGNUM))]
"TARGET_SVE"
"@
- cmp<cmp_op>\t%0.<Vetype>, %4/z, %2.<Vetype>, %3.<Vetype>
- cmp<cmp_op>\t%0.<Vetype>, %4/z, %2.<Vetype>, #%3"
+ cmp<cmp_op>\t%0.<Vetype>, %4/z, %2.<Vetype>, #%3
+ cmp<cmp_op>\t%0.<Vetype>, %4/z, %2.<Vetype>, %3.<Vetype>"
)
+;; Predicated floating-point comparison (excluding FCMUO, which doesn't
+;; allow #0.0 as an operand).
(define_insn "*vec_fcm<cmp_op><mode>"
[(set (match_operand:<VPRED> 0 "register_operand" "=Upa, Upa")
(unspec:<VPRED>
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SVE_F 2 "register_operand" "w, w")
- (match_operand:SVE_F 3 "aarch64_simd_reg_or_zero" "w, Dz")]
+ (match_operand:SVE_F 3 "aarch64_simd_reg_or_zero" "Dz, w")]
SVE_COND_FP_CMP))]
"TARGET_SVE"
"@
- fcm<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>
- fcm<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, #0.0"
+ fcm<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, #0.0
+ fcm<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>"
)
+;; Predicated FCMUO.
(define_insn "*vec_fcmuo<mode>"
[(set (match_operand:<VPRED> 0 "register_operand" "=Upa")
(unspec:<VPRED>
@@ -1256,7 +1514,8 @@
"sel\t%0.<Vetype>, %3, %1.<Vetype>, %2.<Vetype>"
)
-(define_insn "*sel_dup<mode>_const"
+;; Selects between a duplicated immediate and zero.
+(define_insn "aarch64_sve_dup<mode>_const"
[(set (match_operand:SVE_I 0 "register_operand" "=w")
(unspec:SVE_I
[(match_operand:<VPRED> 1 "register_operand" "Upl")
@@ -1267,6 +1526,8 @@
"mov\t%0.<Vetype>, %1/z, #%2"
)
+;; Integer (signed) vcond. Don't enforce an immediate range here, since it
+;; depends on the comparison; leave it to aarch64_expand_sve_vcond instead.
(define_expand "vcond<mode><v_int_equiv>"
[(set (match_operand:SVE_ALL 0 "register_operand")
(if_then_else:SVE_ALL
@@ -1282,21 +1543,8 @@
}
)
-(define_expand "vcond<mode><v_fp_equiv>"
- [(set (match_operand:SVE_SD 0 "register_operand")
- (if_then_else:SVE_SD
- (match_operator 3 "comparison_operator"
- [(match_operand:<V_FP_EQUIV> 4 "register_operand")
- (match_operand:<V_FP_EQUIV> 5 "aarch64_simd_reg_or_zero")])
- (match_operand:SVE_SD 1 "register_operand")
- (match_operand:SVE_SD 2 "register_operand")))]
- "TARGET_SVE"
- {
- aarch64_expand_sve_vcond (<MODE>mode, <V_FP_EQUIV>mode, operands);
- DONE;
- }
-)
-
+;; Integer vcondu. Don't enforce an immediate range here, since it
+;; depends on the comparison; leave it to aarch64_expand_sve_vcond instead.
(define_expand "vcondu<mode><v_int_equiv>"
[(set (match_operand:SVE_ALL 0 "register_operand")
(if_then_else:SVE_ALL
@@ -1312,6 +1560,27 @@
}
)
+;; Floating-point vcond. All comparisons except FCMUO allow a zero
+;; operand; aarch64_expand_sve_vcond handles the case of an FCMUO
+;; with zero.
+(define_expand "vcond<mode><v_fp_equiv>"
+ [(set (match_operand:SVE_SD 0 "register_operand")
+ (if_then_else:SVE_SD
+ (match_operator 3 "comparison_operator"
+ [(match_operand:<V_FP_EQUIV> 4 "register_operand")
+ (match_operand:<V_FP_EQUIV> 5 "aarch64_simd_reg_or_zero")])
+ (match_operand:SVE_SD 1 "register_operand")
+ (match_operand:SVE_SD 2 "register_operand")))]
+ "TARGET_SVE"
+ {
+ aarch64_expand_sve_vcond (<MODE>mode, <V_FP_EQUIV>mode, operands);
+ DONE;
+ }
+)
+
+;; Signed integer comparisons. Don't enforce an immediate range here, since
+;; it depends on the comparison; leave it to aarch64_expand_sve_vec_cmp_int
+;; instead.
(define_expand "vec_cmp<mode><vpred>"
[(parallel
[(set (match_operand:<VPRED> 0 "register_operand")
@@ -1327,6 +1596,9 @@
}
)
+;; Unsigned integer comparisons. Don't enforce an immediate range here, since
+;; it depends on the comparison; leave it to aarch64_expand_sve_vec_cmp_int
+;; instead.
(define_expand "vec_cmpu<mode><vpred>"
[(parallel
[(set (match_operand:<VPRED> 0 "register_operand")
@@ -1342,6 +1614,9 @@
}
)
+;; Floating-point comparisons. All comparisons except FCMUO allow a zero
+;; operand; aarch64_expand_sve_vec_cmp_float handles the case of an FCMUO
+;; with zero.
(define_expand "vec_cmp<mode><vpred>"
[(set (match_operand:<VPRED> 0 "register_operand")
(match_operator:<VPRED> 1 "comparison_operator"
@@ -1355,6 +1630,7 @@
}
)
+;; Branch based on predicate equality or inequality.
(define_expand "cbranch<mode>4"
[(set (pc)
(if_then_else
@@ -1381,7 +1657,7 @@
}
)
-;; Max/Min operations.
+;; Unpredicated integer MIN/MAX.
(define_expand "<su><maxmin><mode>3"
[(set (match_operand:SVE_I 0 "register_operand")
(unspec:SVE_I
@@ -1395,6 +1671,7 @@
}
)
+;; Integer MIN/MAX predicated with a PTRUE.
(define_insn "*<su><maxmin><mode>3"
[(set (match_operand:SVE_I 0 "register_operand" "=w")
(unspec:SVE_I
@@ -1406,6 +1683,7 @@
"<su><maxmin>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
)
+;; Unpredicated floating-point MIN/MAX.
(define_expand "<su><maxmin><mode>3"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1419,6 +1697,7 @@
}
)
+;; Floating-point MIN/MAX predicated with a PTRUE.
(define_insn "*<su><maxmin><mode>3"
[(set (match_operand:SVE_F 0 "register_operand" "=w")
(unspec:SVE_F
@@ -1430,6 +1709,7 @@
"f<maxmin>nm\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
)
+;; Unpredicated fmin/fmax.
(define_expand "<maxmin_uns><mode>3"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1444,6 +1724,7 @@
}
)
+;; fmin/fmax predicated with a PTRUE.
(define_insn "*<maxmin_uns><mode>3"
[(set (match_operand:SVE_F 0 "register_operand" "=w")
(unspec:SVE_F
@@ -1505,6 +1786,7 @@
clastb\t%<vw>0, %1, %<vw>0, %3.<Vetype>"
)
+;; Unpredicated integer add reduction.
(define_expand "reduc_plus_scal_<mode>"
[(set (match_operand:<VEL> 0 "register_operand")
(unspec:<VEL> [(match_dup 2)
@@ -1516,6 +1798,7 @@
}
)
+;; Predicated integer add reduction. The result is always 64-bits.
(define_insn "*reduc_plus_scal_<mode>"
[(set (match_operand:<VEL> 0 "register_operand" "=w")
(unspec:<VEL> [(match_operand:<VPRED> 1 "register_operand" "Upl")
@@ -1525,6 +1808,7 @@
"uaddv\t%d0, %1, %2.<Vetype>"
)
+;; Unpredicated floating-point add reduction.
(define_expand "reduc_plus_scal_<mode>"
[(set (match_operand:<VEL> 0 "register_operand")
(unspec:<VEL> [(match_dup 2)
@@ -1536,6 +1820,7 @@
}
)
+;; Predicated floating-point add reduction.
(define_insn "*reduc_plus_scal_<mode>"
[(set (match_operand:<VEL> 0 "register_operand" "=w")
(unspec:<VEL> [(match_operand:<VPRED> 1 "register_operand" "Upl")
@@ -1545,6 +1830,7 @@
"faddv\t%<Vetype>0, %1, %2.<Vetype>"
)
+;; Unpredicated integer MIN/MAX reduction.
(define_expand "reduc_<maxmin_uns>_scal_<mode>"
[(set (match_operand:<VEL> 0 "register_operand")
(unspec:<VEL> [(match_dup 2)
@@ -1556,6 +1842,7 @@
}
)
+;; Predicated integer MIN/MAX reduction.
(define_insn "*reduc_<maxmin_uns>_scal_<mode>"
[(set (match_operand:<VEL> 0 "register_operand" "=w")
(unspec:<VEL> [(match_operand:<VPRED> 1 "register_operand" "Upl")
@@ -1565,6 +1852,7 @@
"<maxmin_uns_op>v\t%<Vetype>0, %1, %2.<Vetype>"
)
+;; Unpredicated floating-point MIN/MAX reduction.
(define_expand "reduc_<maxmin_uns>_scal_<mode>"
[(set (match_operand:<VEL> 0 "register_operand")
(unspec:<VEL> [(match_dup 2)
@@ -1576,6 +1864,7 @@
}
)
+;; Predicated floating-point MIN/MAX reduction.
(define_insn "*reduc_<maxmin_uns>_scal_<mode>"
[(set (match_operand:<VEL> 0 "register_operand" "=w")
(unspec:<VEL> [(match_operand:<VPRED> 1 "register_operand" "Upl")
@@ -1644,6 +1933,7 @@
"fadda\t%<Vetype>0, %2, %<Vetype>0, %3.<Vetype>"
)
+;; Unpredicated floating-point addition.
(define_expand "add<mode>3"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1658,6 +1948,7 @@
}
)
+;; Floating-point addition predicated with a PTRUE.
(define_insn "*add<mode>3"
[(set (match_operand:SVE_F 0 "register_operand" "=w, w, w")
(unspec:SVE_F
@@ -1673,6 +1964,7 @@
fadd\t%0.<Vetype>, %2.<Vetype>, %3.<Vetype>"
)
+;; Unpredicated floating-point subtraction.
(define_expand "sub<mode>3"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1687,24 +1979,26 @@
}
)
+;; Floating-point subtraction predicated with a PTRUE.
(define_insn "*sub<mode>3"
[(set (match_operand:SVE_F 0 "register_operand" "=w, w, w, w")
(unspec:SVE_F
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl, Upl")
(minus:SVE_F
- (match_operand:SVE_F 2 "aarch64_sve_float_arith_operand" "w, 0, 0, vfa")
- (match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_operand" "w, vfa, vfn, 0"))]
+ (match_operand:SVE_F 2 "aarch64_sve_float_arith_operand" "0, 0, vfa, w")
+ (match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_operand" "vfa, vfn, 0, w"))]
UNSPEC_MERGE_PTRUE))]
"TARGET_SVE
&& (register_operand (operands[2], <MODE>mode)
|| register_operand (operands[3], <MODE>mode))"
"@
- fsub\t%0.<Vetype>, %2.<Vetype>, %3.<Vetype>
fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3
- fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2"
+ fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2
+ fsub\t%0.<Vetype>, %2.<Vetype>, %3.<Vetype>"
)
+;; Unpredicated floating-point multiplication.
(define_expand "mul<mode>3"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1719,6 +2013,7 @@
}
)
+;; Floating-point multiplication predicated with a PTRUE.
(define_insn "*mul<mode>3"
[(set (match_operand:SVE_F 0 "register_operand" "=w, w")
(unspec:SVE_F
@@ -1733,8 +2028,7 @@
fmul\t%0.<Vetype>, %2.<Vetype>, %3.<Vetype>"
)
-;; Note: fma is %0 = (%1 * %2) + %3
-
+;; Unpredicated fma (%0 = (%1 * %2) + %3).
(define_expand "fma<mode>4"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1749,6 +2043,7 @@
}
)
+;; fma predicated with a PTRUE.
(define_insn "*fma<mode>4"
[(set (match_operand:SVE_F 0 "register_operand" "=w, w")
(unspec:SVE_F
@@ -1763,6 +2058,7 @@
fmla\t%0.<Vetype>, %1/m, %2.<Vetype>, %3.<Vetype>"
)
+;; Unpredicated fnma (%0 = (-%1 * %2) + %3).
(define_expand "fnma<mode>4"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1778,6 +2074,7 @@
}
)
+;; fnma predicated with a PTRUE.
(define_insn "*fnma<mode>4"
[(set (match_operand:SVE_F 0 "register_operand" "=w, w")
(unspec:SVE_F
@@ -1793,12 +2090,12 @@
fmls\t%0.<Vetype>, %1/m, %2.<Vetype>, %3.<Vetype>"
)
-(define_expand "fnms<mode>4"
+;; Unpredicated fms (%0 = (%1 * %2) - %3).
+(define_expand "fms<mode>4"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
[(match_dup 4)
- (fma:SVE_F (neg:SVE_F
- (match_operand:SVE_F 1 "register_operand"))
+ (fma:SVE_F (match_operand:SVE_F 1 "register_operand")
(match_operand:SVE_F 2 "register_operand")
(neg:SVE_F
(match_operand:SVE_F 3 "register_operand")))]
@@ -1809,27 +2106,29 @@
}
)
-(define_insn "*fnms<mode>4"
+;; fms predicated with a PTRUE.
+(define_insn "*fms<mode>4"
[(set (match_operand:SVE_F 0 "register_operand" "=w, w")
(unspec:SVE_F
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
- (fma:SVE_F (neg:SVE_F
- (match_operand:SVE_F 2 "register_operand" "%0, w"))
+ (fma:SVE_F (match_operand:SVE_F 2 "register_operand" "%0, w")
(match_operand:SVE_F 3 "register_operand" "w, w")
(neg:SVE_F
(match_operand:SVE_F 4 "register_operand" "w, 0")))]
UNSPEC_MERGE_PTRUE))]
"TARGET_SVE"
"@
- fnmad\t%0.<Vetype>, %1/m, %3.<Vetype>, %4.<Vetype>
- fnmla\t%0.<Vetype>, %1/m, %2.<Vetype>, %3.<Vetype>"
+ fnmsb\t%0.<Vetype>, %1/m, %3.<Vetype>, %4.<Vetype>
+ fnmls\t%0.<Vetype>, %1/m, %2.<Vetype>, %3.<Vetype>"
)
-(define_expand "fms<mode>4"
+;; Unpredicated fnms (%0 = (-%1 * %2) - %3).
+(define_expand "fnms<mode>4"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
[(match_dup 4)
- (fma:SVE_F (match_operand:SVE_F 1 "register_operand")
+ (fma:SVE_F (neg:SVE_F
+ (match_operand:SVE_F 1 "register_operand"))
(match_operand:SVE_F 2 "register_operand")
(neg:SVE_F
(match_operand:SVE_F 3 "register_operand")))]
@@ -1840,21 +2139,24 @@
}
)
-(define_insn "*fms<mode>4"
+;; fnms predicated with a PTRUE.
+(define_insn "*fnms<mode>4"
[(set (match_operand:SVE_F 0 "register_operand" "=w, w")
(unspec:SVE_F
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
- (fma:SVE_F (match_operand:SVE_F 2 "register_operand" "%0, w")
+ (fma:SVE_F (neg:SVE_F
+ (match_operand:SVE_F 2 "register_operand" "%0, w"))
(match_operand:SVE_F 3 "register_operand" "w, w")
(neg:SVE_F
(match_operand:SVE_F 4 "register_operand" "w, 0")))]
UNSPEC_MERGE_PTRUE))]
"TARGET_SVE"
"@
- fnmsb\t%0.<Vetype>, %1/m, %3.<Vetype>, %4.<Vetype>
- fnmls\t%0.<Vetype>, %1/m, %2.<Vetype>, %3.<Vetype>"
+ fnmad\t%0.<Vetype>, %1/m, %3.<Vetype>, %4.<Vetype>
+ fnmla\t%0.<Vetype>, %1/m, %2.<Vetype>, %3.<Vetype>"
)
+;; Unpredicated floating-point division.
(define_expand "div<mode>3"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1868,6 +2170,7 @@
}
)
+;; Floating-point division predicated with a PTRUE.
(define_insn "*div<mode>3"
[(set (match_operand:SVE_F 0 "register_operand" "=w, w")
(unspec:SVE_F
@@ -1881,6 +2184,7 @@
fdivr\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
)
+;; Unpredicated FNEG, FABS and FSQRT.
(define_expand "<optab><mode>2"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1893,6 +2197,7 @@
}
)
+;; FNEG, FABS and FSQRT predicated with a PTRUE.
(define_insn "*<optab><mode>2"
[(set (match_operand:SVE_F 0 "register_operand" "=w")
(unspec:SVE_F
@@ -1903,6 +2208,7 @@
"<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
)
+;; Unpredicated FRINTy.
(define_expand "<frint_pattern><mode>2"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1916,6 +2222,7 @@
}
)
+;; FRINTy predicated with a PTRUE.
(define_insn "*<frint_pattern><mode>2"
[(set (match_operand:SVE_F 0 "register_operand" "=w")
(unspec:SVE_F
@@ -1927,7 +2234,8 @@
"frint<frint_suffix>\t%0.<Vetype>, %1/m, %2.<Vetype>"
)
-;; Convert float to integer of the same size (sf to si or df to di).
+;; Unpredicated conversion of floats to integers of the same size (HF to HI,
+;; SF to SI or DF to DI).
(define_expand "<fix_trunc_optab><mode><v_int_equiv>2"
[(set (match_operand:<V_INT_EQUIV> 0 "register_operand")
(unspec:<V_INT_EQUIV>
@@ -1941,6 +2249,19 @@
}
)
+;; Conversion of HF to DI, SI or HI, predicated with a PTRUE.
+(define_insn "*<fix_trunc_optab>v16hsf<mode>2"
+ [(set (match_operand:SVE_HSDI 0 "register_operand" "=w")
+ (unspec:SVE_HSDI
+ [(match_operand:<VPRED> 1 "register_operand" "Upl")
+ (FIXUORS:SVE_HSDI
+ (match_operand:V16HF 2 "register_operand" "w"))]
+ UNSPEC_MERGE_PTRUE))]
+ "TARGET_SVE"
+ "fcvtz<su>\t%0.<Vetype>, %1/m, %2.h"
+)
+
+;; Conversion of SF to DI or SI, predicated with a PTRUE.
(define_insn "*<fix_trunc_optab>v8sf<mode>2"
[(set (match_operand:SVE_SDI 0 "register_operand" "=w")
(unspec:SVE_SDI
@@ -1952,6 +2273,7 @@
"fcvtz<su>\t%0.<Vetype>, %1/m, %2.s"
)
+;; Conversion of DF to DI or SI, predicated with a PTRUE.
(define_insn "*<fix_trunc_optab>v4df<mode>2"
[(set (match_operand:SVE_SDI 0 "register_operand" "=w")
(unspec:SVE_SDI
@@ -1963,7 +2285,8 @@
"fcvtz<su>\t%0.<Vetype>, %1/m, %2.d"
)
-;; Convert integer to float of the same size (si to sf or di to df).
+;; Unpredicated conversion of integers to floats of the same size
+;; (HI to HF, SI to SF or DI to DF).
(define_expand "<optab><v_int_equiv><mode>2"
[(set (match_operand:SVE_F 0 "register_operand")
(unspec:SVE_F
@@ -1977,6 +2300,20 @@
}
)
+;; Conversion of DI, SI or HI to the same number of HFs, predicated
+;; with a PTRUE.
+(define_insn "*<optab><mode>v16hf2"
+ [(set (match_operand:V16HF 0 "register_operand" "=w")
+ (unspec:V16HF
+ [(match_operand:<VPRED> 1 "register_operand" "Upl")
+ (FLOATUORS:V16HF
+ (match_operand:SVE_HSDI 2 "register_operand" "w"))]
+ UNSPEC_MERGE_PTRUE))]
+ "TARGET_SVE"
+ "<su_optab>cvtf\t%0.h, %1/m, %2.<Vetype>"
+)
+
+;; Conversion of DI or SI to the same number of SFs, predicated with a PTRUE.
(define_insn "*<optab><mode>v8sf2"
[(set (match_operand:V8SF 0 "register_operand" "=w")
(unspec:V8SF
@@ -1988,6 +2325,7 @@
"<su_optab>cvtf\t%0.s, %1/m, %2.<Vetype>"
)
+;; Conversion of DI or SI to DF, predicated with a PTRUE.
(define_insn "*<optab><mode>v4df2"
[(set (match_operand:V4DF 0 "register_operand" "=w")
(unspec:V4DF
@@ -1999,30 +2337,35 @@
"<su_optab>cvtf\t%0.d, %1/m, %2.<Vetype>"
)
-(define_insn "*truncv4dfv8sf2"
- [(set (match_operand:V8SF 0 "register_operand" "=w")
- (unspec:V8SF
- [(match_operand:V4BI 1 "register_operand" "Upl")
- (unspec:V8SF
- [(match_operand:V4DF 2 "register_operand" "w")]
+;; Conversion of DFs to the same number of SFs, or SFs to the same number
+;; of HFs.
+(define_insn "*trunc<Vwide><mode>2"
+ [(set (match_operand:SVE_HSF 0 "register_operand" "=w")
+ (unspec:SVE_HSF
+ [(match_operand:<VWIDE_PRED> 1 "register_operand" "Upl")
+ (unspec:SVE_HSF
+ [(match_operand:<VWIDE> 2 "register_operand" "w")]
UNSPEC_FLOAT_CONVERT)]
UNSPEC_MERGE_PTRUE))]
"TARGET_SVE"
- "fcvt\t%0.s, %1/m, %2.d"
+ "fcvt\t%0.<Vetype>, %1/m, %2.<Vewtype>"
)
-(define_insn "*extendv8sfv4df2"
- [(set (match_operand:V4DF 0 "register_operand" "=w")
- (unspec:V4DF
- [(match_operand:V4BI 1 "register_operand" "Upl")
- (unspec:V4DF
- [(match_operand:V8SF 2 "register_operand" "w")]
+;; Conversion of SFs to the same number of DFs, or HFs to the same number
+;; of SFs.
+(define_insn "*extend<mode><Vwide>2"
+ [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+ (unspec:<VWIDE>
+ [(match_operand:<VWIDE_PRED> 1 "register_operand" "Upl")
+ (unspec:<VWIDE>
+ [(match_operand:SVE_HSF 2 "register_operand" "w")]
UNSPEC_FLOAT_CONVERT)]
UNSPEC_MERGE_PTRUE))]
"TARGET_SVE"
- "fcvt\t%0.d, %1/m, %2.s"
+ "fcvt\t%0.<Vewtype>, %1/m, %2.<Vetype>"
)
+;; PUNPKHI and PUNPKLO.
(define_insn "vec_unpack<su>_<perm_hilo>_<mode>"
[(set (match_operand:<VWIDE> 0 "register_operand" "=Upa")
(unspec:<VWIDE> [(match_operand:PRED_BHS 1 "register_operand" "Upa")]
@@ -2031,6 +2374,7 @@
"punpk<perm_hilo>\t%0.h, %1.b"
)
+;; SUNPKHI, UUNPKHI, SUNPKLO and UUNPKLO.
(define_insn "vec_unpack<su>_<perm_hilo>_<SVE_BHSI:mode>"
[(set (match_operand:<VWIDE> 0 "register_operand" "=w")
(unspec:<VWIDE> [(match_operand:SVE_BHSI 1 "register_operand" "w")]
@@ -2039,38 +2383,37 @@
"<su>unpk<perm_hilo>\t%0.<Vewtype>, %1.<Vetype>"
)
-;; Used by the vec_unpacks_<perm_hilo>_v8sf expander to unpack the bit
-;; representation of a V8SF without conversion. The choice between signed
-;; and unsigned isn't significant.
-(define_insn "*vec_unpacku_<perm_hilo>_v8sf_no_convert"
- [(set (match_operand:V8SF 0 "register_operand" "=w")
- (unspec:V8SF [(match_operand:V8SF 1 "register_operand" "w")]
- UNPACK_UNSIGNED))]
+;; Used by the vec_unpacks_<perm_hilo>_<mode> expander to unpack the bit
+;; representation of a V8SF or V16HF without conversion. The choice between
+;; signed and unsigned isn't significant.
+(define_insn "*vec_unpacku_<perm_hilo>_<mode>_no_convert"
+ [(set (match_operand:SVE_HSF 0 "register_operand" "=w")
+ (unspec:SVE_HSF [(match_operand:SVE_HSF 1 "register_operand" "w")]
+ UNPACK_UNSIGNED))]
"TARGET_SVE"
- "uunpk<perm_hilo>\t%0.d, %1.s"
+ "uunpk<perm_hilo>\t%0.<Vewtype>, %1.<Vetype>"
)
-;; Float to double
-;; unpack from v8sf with no conversion
-;; then float convert the unpacked v8sf to v4df
-(define_expand "vec_unpacks_<perm_hilo>_v8sf"
+;; Unpack one half of a V8SF to V4DF, or one half of a V16HF to V8SF.
+;; First unpack the source without conversion, then float-convert the
+;; unpacked source.
+(define_expand "vec_unpacks_<perm_hilo>_<mode>"
[(set (match_dup 2)
- (unspec:V8SF [(match_operand:V8SF 1 "register_operand")]
- UNPACK_UNSIGNED))
- (set (match_operand:V4DF 0 "register_operand")
- (unspec:V4DF [(match_dup 3)
- (unspec:V4DF [(match_dup 2)] UNSPEC_FLOAT_CONVERT)]
- UNSPEC_MERGE_PTRUE))]
+ (unspec:SVE_HSF [(match_operand:SVE_HSF 1 "register_operand")]
+ UNPACK_UNSIGNED))
+ (set (match_operand:<VWIDE> 0 "register_operand")
+ (unspec:<VWIDE> [(match_dup 3)
+ (unspec:<VWIDE> [(match_dup 2)] UNSPEC_FLOAT_CONVERT)]
+ UNSPEC_MERGE_PTRUE))]
"TARGET_SVE"
{
- operands[2] = gen_reg_rtx (V8SFmode);
- operands[3] = force_reg (V4BImode, CONSTM1_RTX (V4BImode));
+ operands[2] = gen_reg_rtx (<MODE>mode);
+ operands[3] = force_reg (<VWIDE_PRED>mode, CONSTM1_RTX (<VWIDE_PRED>mode));
}
)
-;; Int to double
-;; unpack from v8si to v4di
-;; then convert v8si to v4df
+;; Unpack one half of a V8SI to V4DF. First unpack from V8SI to V4DI,
+;; reinterpret the V4DI as a V8SI, then convert the unpacked V8SI to V4DF.
(define_expand "vec_unpack<su_optab>_float_<perm_hilo>_v8si"
[(set (match_dup 2)
(unspec:V4DI [(match_operand:V8SI 1 "register_operand")]
@@ -2087,6 +2430,8 @@
}
)
+;; Predicate pack. Use UZP1 on the narrower type, which discards
+;; the high part of each wide element.
(define_insn "vec_pack_trunc_<Vwide>"
[(set (match_operand:PRED_BHS 0 "register_operand" "=Upa")
(unspec:PRED_BHS
@@ -2097,6 +2442,8 @@
"uzp1\t%0.<Vetype>, %1.<Vetype>, %2.<Vetype>"
)
+;; Integer pack. Use UZP1 on the narrower type, which discards
+;; the high part of each wide element.
(define_insn "vec_pack_trunc_<Vwide>"
[(set (match_operand:SVE_BHSI 0 "register_operand" "=w")
(unspec:SVE_BHSI
@@ -2107,44 +2454,32 @@
"uzp1\t%0.<Vetype>, %1.<Vetype>, %2.<Vetype>"
)
-(define_insn "*vec_pack_trunc_v4df_no_convert"
- [(set (match_operand:V8SF 0 "register_operand" "=w")
- (unspec:V8SF [(match_operand:V8SF 1 "register_operand" "w")
- (match_operand:V8SF 2 "register_operand" "w")]
- UNSPEC_PACK))]
- "TARGET_SVE"
- "uzp1\t%0.s, %1.s, %2.s"
-)
-
-;; Double to float
-;; float convert both inputs from v4df to v8sf
-;; then pack them together with no conversion
-(define_expand "vec_pack_trunc_v4df"
+;; Convert two vectors of DF to SF, or two vectors of SF to HF, and pack
+;; the results into a single vector.
+(define_expand "vec_pack_trunc_<Vwide>"
[(set (match_dup 4)
- (unspec:V8SF
+ (unspec:SVE_HSF
[(match_dup 3)
- (unspec:V8SF [(match_operand:V4DF 1 "register_operand")]
- UNSPEC_FLOAT_CONVERT)]
+ (unspec:SVE_HSF [(match_operand:<VWIDE> 1 "register_operand")]
+ UNSPEC_FLOAT_CONVERT)]
UNSPEC_MERGE_PTRUE))
(set (match_dup 5)
- (unspec:V8SF
+ (unspec:SVE_HSF
[(match_dup 3)
- (unspec:V8SF [(match_operand:V4DF 2 "register_operand")]
- UNSPEC_FLOAT_CONVERT)]
+ (unspec:SVE_HSF [(match_operand:<VWIDE> 2 "register_operand")]
+ UNSPEC_FLOAT_CONVERT)]
UNSPEC_MERGE_PTRUE))
- (set (match_operand:V8SF 0 "register_operand")
- (unspec:V8SF [(match_dup 4) (match_dup 5)] UNSPEC_PACK))]
+ (set (match_operand:SVE_HSF 0 "register_operand")
+ (unspec:SVE_HSF [(match_dup 4) (match_dup 5)] UNSPEC_UZP1))]
"TARGET_SVE"
{
- operands[3] = force_reg (V4BImode, CONSTM1_RTX (V4BImode));
- operands[4] = gen_reg_rtx (V8SFmode);
- operands[5] = gen_reg_rtx (V8SFmode);
+ operands[3] = force_reg (<VWIDE_PRED>mode, CONSTM1_RTX (<VWIDE_PRED>mode));
+ operands[4] = gen_reg_rtx (<MODE>mode);
+ operands[5] = gen_reg_rtx (<MODE>mode);
}
)
-;; Double to int
-;; float convert both inputs from v4df to v8si
-;; then pack v4di to v8si
+;; Convert two vectors of DF to SI and pack the results into a single vector.
(define_expand "vec_pack_<su>fix_trunc_v4df"
[(set (match_dup 4)
(unspec:V8SI
@@ -2157,14 +2492,12 @@
(FIXUORS:V8SI (match_operand:V4DF 2 "register_operand"))]
UNSPEC_MERGE_PTRUE))
(set (match_operand:V8SI 0 "register_operand")
- (unspec:V8SI [(match_dup 6) (match_dup 7)] UNSPEC_PACK))]
+ (unspec:V8SI [(match_dup 4) (match_dup 5)] UNSPEC_UZP1))]
"TARGET_SVE"
{
operands[3] = force_reg (V4BImode, CONSTM1_RTX (V4BImode));
operands[4] = gen_reg_rtx (V8SImode);
operands[5] = gen_reg_rtx (V8SImode);
- operands[6] = gen_rtx_SUBREG (V4DImode, operands[4], 0);
- operands[7] = gen_rtx_SUBREG (V4DImode, operands[5], 0);
}
)
@@ -2205,18 +2538,6 @@
"<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
)
-(define_insn "extract_last_<mode>"
- [(set (match_operand:<VEL> 0 "register_operand" "=r, w")
- (unspec:<VEL>
- [(match_operand:SVE_ALL 1 "register_operand" "w, w")
- (match_operand:<VPRED> 2 "register_operand" "Upl, Upl")]
- UNSPEC_LASTB))]
- "TARGET_SVE"
- "@
- lastb\t%<vwcore>0, %2, %1.<Vetype>
- lastb\t%<Vetype>0, %2, %1.<Vetype>"
-)
-
(define_insn "read_nf<mode>"
[(set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
(unspec:PRED_ALL [(reg:SI FFRT_REGNUM)] UNSPEC_READ_NF))
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index 7fcd6cb2c2e..7b3a7460561 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
;; -*- buffer-read-only: t -*-
;; Generated automatically by gentune.sh from aarch64-cores.def
(define_attr "tune"
- "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
+ "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a1894a79dd7..b5a179784a6 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -880,6 +880,34 @@ static const struct tune_params qdf24xx_tunings =
&qdf24xx_prefetch_tune
};
+/* Tuning structure for the Qualcomm Saphira core. Default to falkor values
+ for now. */
+static const struct tune_params saphira_tunings =
+{
+ &generic_extra_costs,
+ &generic_addrcost_table,
+ &generic_regmove_cost,
+ &generic_vector_cost,
+ &generic_branch_cost,
+ &generic_approx_modes,
+ 4, /* memmov_cost */
+ 4, /* issue_rate */
+ (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
+ | AARCH64_FUSE_MOVK_MOVK), /* fuseable_ops */
+ 16, /* function_align. */
+ 8, /* jump_align. */
+ 16, /* loop_align. */
+ 2, /* int_reassoc_width. */
+ 4, /* fp_reassoc_width. */
+ 1, /* vec_reassoc_width. */
+ 2, /* min_div_recip_mul_sf. */
+ 2, /* min_div_recip_mul_df. */
+ 0, /* max_case_values. */
+ tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */
+ (AARCH64_EXTRA_TUNE_NONE), /* tune_flags. */
+ &generic_prefetch_tune
+};
+
static const struct tune_params thunderx2t99_tunings =
{
&thunderx2t99_extra_costs,
@@ -1092,13 +1120,15 @@ aarch64_dbx_register_number (unsigned regno)
return AARCH64_DWARF_V0 + regno - V0_REGNUM;
else if (PR_REGNUM_P (regno))
return AARCH64_DWARF_P0 + regno - P0_REGNUM;
+ else if (regno == VG_REGNUM)
+ return AARCH64_DWARF_VG;
/* Return values >= DWARF_FRAME_REGISTERS indicate that there is no
equivalent DWARF register. */
return DWARF_FRAME_REGISTERS;
}
-/* Return true if MODE is any of the AdvSIMD structure modes. */
+/* Return true if MODE is any of the Advanced SIMD structure modes. */
static bool
aarch64_advsimd_struct_mode_p (machine_mode mode)
{
@@ -1117,15 +1147,19 @@ aarch64_sve_pred_mode_p (machine_mode mode)
|| mode == V4BImode));
}
-/* Return a set of flags describing the vector properties of mode MODE.
- Ignore modes that are not supported by the current target. */
+/* Three mutually-exclusive flags describing a vector or predicate type. */
const unsigned int VEC_ADVSIMD = 1;
const unsigned int VEC_SVE_DATA = 2;
const unsigned int VEC_SVE_PRED = 4;
+/* Can be used in combination with VEC_ADVSIMD or VEC_SVE_DATA to indicate
+ a structure of 2, 3 or 4 vectors. */
+const unsigned int VEC_STRUCT = 8;
+/* Useful combinations of the above. */
const unsigned int VEC_ANY_SVE = VEC_SVE_DATA | VEC_SVE_PRED;
const unsigned int VEC_ANY_DATA = VEC_ADVSIMD | VEC_SVE_DATA;
-const unsigned int VEC_STRUCT = 8;
+/* Return a set of flags describing the vector properties of mode MODE.
+ Ignore modes that are not supported by the current target. */
static unsigned int
aarch64_classify_vector_mode (machine_mode mode)
{
@@ -1145,7 +1179,7 @@ aarch64_classify_vector_mode (machine_mode mode)
|| inner == DImode
|| inner == DFmode))
{
- if (TARGET_SVE && inner != HFmode)
+ if (TARGET_SVE)
{
if (must_eq (GET_MODE_BITSIZE (mode), BITS_PER_SVE_VECTOR))
return VEC_SVE_DATA;
@@ -1207,14 +1241,14 @@ aarch64_array_mode_supported_p (machine_mode mode,
return false;
}
-/* Implement TARGET_VECTORIZE_GET_MASK_MODE. */
+/* Return the SVE predicate mode to use for elements that have
+ ELEM_NBYTES bytes, if such a mode exists. */
-static opt_machine_mode
-aarch64_get_mask_mode (poly_uint64 nunits, poly_uint64 nbytes)
+opt_machine_mode
+aarch64_sve_pred_mode (unsigned int elem_nbytes)
{
- if (TARGET_SVE && must_eq (nbytes, BYTES_PER_SVE_VECTOR))
+ if (TARGET_SVE)
{
- unsigned int elem_nbytes = vector_element_size (nbytes, nunits);
if (elem_nbytes == 1)
return V32BImode;
if (elem_nbytes == 2)
@@ -1224,6 +1258,21 @@ aarch64_get_mask_mode (poly_uint64 nunits, poly_uint64 nbytes)
if (elem_nbytes == 8)
return V4BImode;
}
+ return opt_machine_mode ();
+}
+
+/* Implement TARGET_VECTORIZE_GET_MASK_MODE. */
+
+static opt_machine_mode
+aarch64_get_mask_mode (poly_uint64 nunits, poly_uint64 nbytes)
+{
+ if (TARGET_SVE && must_eq (nbytes, BYTES_PER_SVE_VECTOR))
+ {
+ unsigned int elem_nbytes = vector_element_size (nbytes, nunits);
+ machine_mode pred_mode;
+ if (aarch64_sve_pred_mode (elem_nbytes).exists (&pred_mode))
+ return pred_mode;
+ }
return default_get_mask_mode (nunits, nbytes);
}
@@ -1233,7 +1282,11 @@ aarch64_get_mask_mode (poly_uint64 nunits, poly_uint64 nbytes)
static unsigned int
aarch64_hard_regno_nregs (unsigned regno, machine_mode mode)
{
- HOST_WIDE_INT size;
+ /* ??? Logically we should only need to provide a value when
+ HARD_REGNO_MODE_OK says that the combination is valid,
+ but at the moment we need to handle all modes. Just ignore
+ any runtime parts for registers that can't store them. */
+ HOST_WIDE_INT lowest_size = constant_lower_bound (GET_MODE_SIZE (mode));
switch (aarch64_regno_regclass (regno))
{
case FP_REGS:
@@ -1241,19 +1294,13 @@ aarch64_hard_regno_nregs (unsigned regno, machine_mode mode)
if (aarch64_sve_data_mode_p (mode))
return exact_div (GET_MODE_SIZE (mode),
BYTES_PER_SVE_VECTOR).to_constant ();
- /* ??? In the rest of this function it would probably make
- sense to assert for invalid modes, but it's likely that there
- are still callers to this function that don't check
- HARD_REGNO_MODE_OK first. */
- size = constant_lower_bound (GET_MODE_SIZE (mode));
- return CEIL (size, UNITS_PER_VREG);
+ return CEIL (lowest_size, UNITS_PER_VREG);
case PR_REGS:
case PR_LO_REGS:
case PR_HI_REGS:
return 1;
default:
- size = constant_lower_bound (GET_MODE_SIZE (mode));
- return CEIL (size, UNITS_PER_WORD);
+ return CEIL (lowest_size, UNITS_PER_WORD);
}
gcc_unreachable ();
}
@@ -1266,6 +1313,10 @@ aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode)
if (GET_MODE_CLASS (mode) == MODE_CC)
return regno == CC_REGNUM;
+ if (regno == VG_REGNUM)
+ /* This is expected to have the same size as _Unwind_Word. */
+ return mode == DImode;
+
unsigned int vec_flags = aarch64_classify_vector_mode (mode);
if (vec_flags & VEC_SVE_PRED)
return PR_REGNUM_P (regno);
@@ -1325,26 +1376,19 @@ aarch64_regmode_natural_size (machine_mode mode)
/* Implement HARD_REGNO_CALLER_SAVE_MODE. */
machine_mode
-aarch64_hard_regno_caller_save_mode (unsigned regno, unsigned nregs,
+aarch64_hard_regno_caller_save_mode (unsigned regno, unsigned,
machine_mode mode)
{
- /* Handle modes that fit within single registers. */
- if (GP_REGNUM_P (regno) || FP_REGNUM_P (regno))
- {
- if (must_ge (GET_MODE_SIZE (mode), 4))
- return mode;
- else
- return SImode;
- }
/* The predicate mode determines which bits are significant and
which are "don't care". Decreasing the number of lanes would
lose data while increasing the number of lanes would make bits
unnecessarily significant. */
- else if (PR_REGNUM_P (regno))
+ if (PR_REGNUM_P (regno))
+ return mode;
+ if (must_ge (GET_MODE_SIZE (mode), 4))
return mode;
- /* Fall back to generic for multi-reg and very large modes. */
else
- return choose_hard_reg_mode (regno, nregs, false);
+ return SImode;
}
/* Implement TARGET_CONSTANT_ALIGNMENT. Make strings word-aligned so
@@ -2053,9 +2097,9 @@ aarch64_sve_cnt_immediate_p (rtx x)
operand (a vector pattern followed by a multiplier in the range [1, 16]).
PREFIX is the mnemonic without the size suffix and OPERANDS is the
first part of the operands template (the part that comes before the
- vector size itself). FACTOR is the multiple of VQ that is needed.
- NELTS_PER_VQ, if nonzero, is the number of elements in each vector
- quadword. If it is zero, we can use any element size. */
+ vector size itself). FACTOR is the number of quadwords.
+ NELTS_PER_VQ, if nonzero, is the number of elements in each quadword.
+ If it is zero, we can use any element size. */
static char *
aarch64_output_sve_cnt_immediate (const char *prefix, const char *operands,
@@ -2070,7 +2114,7 @@ aarch64_output_sve_cnt_immediate (const char *prefix, const char *operands,
multiplier is 1 wherever possible. */
nelts_per_vq = factor & -factor;
int shift = std::min (exact_log2 (nelts_per_vq), 4);
- gcc_assert (shift > 0);
+ gcc_assert (IN_RANGE (shift, 1, 4));
char suffix = "dwhb"[shift - 1];
factor >>= shift;
@@ -2133,15 +2177,20 @@ aarch64_sve_addvl_addpl_immediate_p (rtx x)
char *
aarch64_output_sve_addvl_addpl (rtx dest, rtx base, rtx offset)
{
- static char buffer[sizeof ("addpl\t%x0, %x1, #-32")];
+ static char buffer[sizeof ("addpl\t%x0, %x1, #-") + 3 * sizeof (int)];
poly_int64 offset_value = rtx_to_poly_int64 (offset);
gcc_assert (aarch64_sve_addvl_addpl_immediate_p (offset_value));
- /* Use INC if possible. */
- if (rtx_equal_p (dest, base)
- && GP_REGNUM_P (REGNO (dest))
- && aarch64_sve_cnt_immediate_p (offset_value))
- return aarch64_output_sve_cnt_immediate ("inc", "%x0", offset);
+ /* Use INC or DEC if possible. */
+ if (rtx_equal_p (dest, base) && GP_REGNUM_P (REGNO (dest)))
+ {
+ if (aarch64_sve_cnt_immediate_p (offset_value))
+ return aarch64_output_sve_cnt_immediate ("inc", "%x0",
+ offset_value.coeffs[1], 0);
+ if (aarch64_sve_cnt_immediate_p (-offset_value))
+ return aarch64_output_sve_cnt_immediate ("dec", "%x0",
+ -offset_value.coeffs[1], 0);
+ }
int factor = offset_value.coeffs[1];
if ((factor & 15) == 0)
@@ -2340,39 +2389,30 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool generate,
return num_insns;
}
-/* Return the number of temporary registers that aarch64_add_constant
+/* Return the number of temporary registers that aarch64_add_offset_1
would need to add OFFSET to a register. */
static unsigned int
-aarch64_add_constant_temporaries (HOST_WIDE_INT offset)
+aarch64_add_offset_1_temporaries (HOST_WIDE_INT offset)
{
return abs_hwi (offset) < 0x1000000 ? 0 : 1;
}
-/* Set DEST to REG + DELTA. MODE is the mode of the addition.
- FRAME_RELATED_P should be true if the RTX_FRAME_RELATED flag should
- be set and CFA adjustments added to the generated instructions.
-
- TEMP1, if nonnull, is a register of mode MODE that can be used as a
- temporary if register allocation is already complete. This temporary
- register may overlap DEST but must not overlap REG. If TEMP1 is known
- to hold abs (DELTA), EMIT_MOVE_IMM can be set to false to avoid emitting
- the immediate again.
-
- Since this function may be used to adjust the stack pointer, we must
- ensure that it cannot cause transient stack deallocation (for example
- by first incrementing SP and then decrementing when adjusting by a
- large immediate). */
+/* A subroutine of aarch64_add_offset that handles the case in which
+ OFFSET is known at compile time. The arguments are otherwise the same. */
static void
-aarch64_add_constant_internal (scalar_int_mode mode, rtx dest, rtx temp1,
- rtx src, HOST_WIDE_INT delta,
- bool frame_related_p, bool emit_move_imm)
+aarch64_add_offset_1 (scalar_int_mode mode, rtx dest,
+ rtx src, HOST_WIDE_INT offset, rtx temp1,
+ bool frame_related_p, bool emit_move_imm)
{
- HOST_WIDE_INT mdelta = abs_hwi (delta);
+ gcc_assert (emit_move_imm || temp1 != NULL_RTX);
+ gcc_assert (temp1 == NULL_RTX || !reg_overlap_mentioned_p (temp1, src));
+
+ HOST_WIDE_INT moffset = abs_hwi (offset);
rtx_insn *insn;
- if (!mdelta)
+ if (!moffset)
{
if (!rtx_equal_p (dest, src))
{
@@ -2383,48 +2423,49 @@ aarch64_add_constant_internal (scalar_int_mode mode, rtx dest, rtx temp1,
}
/* Single instruction adjustment. */
- if (aarch64_uimm12_shift (mdelta))
+ if (aarch64_uimm12_shift (moffset))
{
- insn = emit_insn (gen_add3_insn (dest, src, GEN_INT (delta)));
+ insn = emit_insn (gen_add3_insn (dest, src, GEN_INT (offset)));
RTX_FRAME_RELATED_P (insn) = frame_related_p;
return;
}
- /* Emit 2 additions/subtractions if the adjustment is less than 24 bits.
- Only do this if mdelta is not a 16-bit move as adjusting using a move
- is better. */
- if (mdelta < 0x1000000 && (!temp1 || !aarch64_move_imm (mdelta, mode)))
+ /* Emit 2 additions/subtractions if the adjustment is less than 24 bits
+ and either:
+
+ a) the offset cannot be loaded by a 16-bit move or
+ b) there is no spare register into which we can move it. */
+ if (moffset < 0x1000000
+ && ((!temp1 && !can_create_pseudo_p ())
+ || !aarch64_move_imm (moffset, mode)))
{
- HOST_WIDE_INT low_off = mdelta & 0xfff;
+ HOST_WIDE_INT low_off = moffset & 0xfff;
- low_off = delta < 0 ? -low_off : low_off;
+ low_off = offset < 0 ? -low_off : low_off;
insn = emit_insn (gen_add3_insn (dest, src, GEN_INT (low_off)));
RTX_FRAME_RELATED_P (insn) = frame_related_p;
- insn = emit_insn (gen_add2_insn (dest, GEN_INT (delta - low_off)));
+ insn = emit_insn (gen_add2_insn (dest, GEN_INT (offset - low_off)));
RTX_FRAME_RELATED_P (insn) = frame_related_p;
return;
}
/* Emit a move immediate if required and an addition/subtraction. */
- if (!temp1 || emit_move_imm)
- temp1 = aarch64_force_temporary (mode, temp1, GEN_INT (mdelta));
- insn = emit_insn (delta < 0 ? gen_sub3_insn (dest, src, temp1)
- : gen_add3_insn (dest, src, temp1));
+ if (emit_move_imm)
+ {
+ gcc_assert (temp1 != NULL_RTX || can_create_pseudo_p ());
+ temp1 = aarch64_force_temporary (mode, temp1, GEN_INT (moffset));
+ }
+ insn = emit_insn (offset < 0
+ ? gen_sub3_insn (dest, src, temp1)
+ : gen_add3_insn (dest, src, temp1));
if (frame_related_p)
{
RTX_FRAME_RELATED_P (insn) = frame_related_p;
- rtx adj = plus_constant (mode, src, delta);
+ rtx adj = plus_constant (mode, src, offset);
add_reg_note (insn, REG_CFA_ADJUST_CFA, gen_rtx_SET (dest, adj));
}
}
-static inline void
-aarch64_add_constant (scalar_int_mode mode, rtx reg, rtx temp1,
- HOST_WIDE_INT delta)
-{
- aarch64_add_constant_internal (mode, reg, temp1, reg, delta, false, true);
-}
-
/* Return the number of temporary registers that aarch64_add_offset
would need to move OFFSET into a register or add OFFSET to a register;
ADD_P is true if we want the latter rather than the former. */
@@ -2455,16 +2496,29 @@ aarch64_offset_temporaries (bool add_p, poly_int64 offset)
be shifted). */
count += 1;
}
- return count + aarch64_add_constant_temporaries (constant);
+ return count + aarch64_add_offset_1_temporaries (constant);
}
-/* Set DEST to REG + OFFSET. MODE is the mode of the addition.
- FRAME_RELATED_P should be true if the RTX_FRAME_RELATED flag should
+/* If X can be represented as a poly_int64, return the number
+ of temporaries that are required to add it to a register.
+ Return -1 otherwise. */
+
+int
+aarch64_add_offset_temporaries (rtx x)
+{
+ poly_int64 offset;
+ if (!poly_int_rtx_p (x, &offset))
+ return -1;
+ return aarch64_offset_temporaries (true, offset);
+}
+
+/* Set DEST to SRC + OFFSET. MODE is the mode of the addition.
+ FRAME_RELATED_P is true if the RTX_FRAME_RELATED flag should
be set and CFA adjustments added to the generated instructions.
TEMP1, if nonnull, is a register of mode MODE that can be used as a
temporary if register allocation is already complete. This temporary
- register may overlap DEST if !FRAME_RELATED_P but must not overlap REG.
+ register may overlap DEST if !FRAME_RELATED_P but must not overlap SRC.
If TEMP1 is known to hold abs (OFFSET), EMIT_MOVE_IMM can be set to
false to avoid emitting the immediate again.
@@ -2477,20 +2531,22 @@ aarch64_offset_temporaries (bool add_p, poly_int64 offset)
large immediate). */
static void
-aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx temp1, rtx temp2,
- rtx reg, poly_int64 offset, bool frame_related_p,
- bool emit_move_imm = true)
+aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src,
+ poly_int64 offset, rtx temp1, rtx temp2,
+ bool frame_related_p, bool emit_move_imm = true)
{
- if (temp1 && frame_related_p)
- gcc_checking_assert (!reg_overlap_mentioned_p (dest, temp1));
- if (temp2)
- gcc_checking_assert (!reg_overlap_mentioned_p (dest, temp2));
+ gcc_assert (emit_move_imm || temp1 != NULL_RTX);
+ gcc_assert (temp1 == NULL_RTX || !reg_overlap_mentioned_p (temp1, src));
+ gcc_assert (temp1 == NULL_RTX
+ || !frame_related_p
+ || !reg_overlap_mentioned_p (temp1, dest));
+ gcc_assert (temp2 == NULL_RTX || !reg_overlap_mentioned_p (dest, temp2));
/* Try using ADDVL or ADDPL to add the whole value. */
- if (reg != const0_rtx && aarch64_sve_addvl_addpl_immediate_p (offset))
+ if (src != const0_rtx && aarch64_sve_addvl_addpl_immediate_p (offset))
{
rtx offset_rtx = gen_int_mode (offset, mode);
- rtx_insn *insn = emit_insn (gen_add3_insn (dest, reg, offset_rtx));
+ rtx_insn *insn = emit_insn (gen_add3_insn (dest, src, offset_rtx));
RTX_FRAME_RELATED_P (insn) = frame_related_p;
return;
}
@@ -2504,22 +2560,22 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx temp1, rtx temp2,
/* Try using ADDVL or ADDPL to add the VG-based part. */
poly_int64 poly_offset (factor, factor);
- if (reg != const0_rtx
+ if (src != const0_rtx
&& aarch64_sve_addvl_addpl_immediate_p (poly_offset))
{
rtx offset_rtx = gen_int_mode (poly_offset, mode);
- rtx addr = gen_rtx_PLUS (mode, reg, offset_rtx);
if (frame_related_p)
{
- rtx_insn *insn = emit_insn (gen_rtx_SET (dest, addr));
+ rtx_insn *insn = emit_insn (gen_add3_insn (dest, src, offset_rtx));
RTX_FRAME_RELATED_P (insn) = true;
- reg = dest;
+ src = dest;
}
else
{
- reg = aarch64_force_temporary (mode, temp1, addr);
+ rtx addr = gen_rtx_PLUS (mode, src, offset_rtx);
+ src = aarch64_force_temporary (mode, temp1, addr);
temp1 = temp2;
- temp2 = 0;
+ temp2 = NULL_RTX;
}
}
/* Otherwise use a CNT-based sequence. */
@@ -2533,10 +2589,9 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx temp1, rtx temp2,
code = MINUS;
}
- /* Calculate CNTD * FACTOR / 2. */
+ /* Calculate CNTD * FACTOR / 2. First try to fold the division
+ into the multiplication. */
rtx val;
-
- /* First try to fold the division into the multiplication. */
int shift = 0;
if (factor & 1)
/* Use a right shift by 1. */
@@ -2548,9 +2603,10 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx temp1, rtx temp2,
{
if (factor > 16 * 8)
{
- /* Use "CNTB Xn, ALL, MUL #NEW_FACTOR", then shift the
- result into position. */
- int extra_shift = exact_log2 (low_bit) - 3;
+ /* "CNTB Xn, ALL, MUL #FACTOR" is out of range, so calculate
+ the value with the minimum multiplier and shift it into
+ position. */
+ int extra_shift = exact_log2 (low_bit);
shift += extra_shift;
factor >>= extra_shift;
}
@@ -2564,7 +2620,7 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx temp1, rtx temp2,
/* Go back to using a negative multiplication factor if we have
no register from which to subtract. */
- if (code == MINUS && reg == const0_rtx)
+ if (code == MINUS && src == const0_rtx)
{
factor = -factor;
code = PLUS;
@@ -2587,11 +2643,11 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx temp1, rtx temp2,
val = gen_rtx_ASHIFTRT (mode, val, const1_rtx);
}
- /* Calculate REG +/- VL*FACTOR. */
- if (reg != const0_rtx)
+ /* Calculate SRC +/- CNTD * FACTOR / 2. */
+ if (src != const0_rtx)
{
val = aarch64_force_temporary (mode, temp1, val);
- val = gen_rtx_fmt_ee (code, mode, reg, val);
+ val = gen_rtx_fmt_ee (code, mode, src, val);
}
else if (code == MINUS)
{
@@ -2606,62 +2662,58 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx temp1, rtx temp2,
{
RTX_FRAME_RELATED_P (insn) = true;
add_reg_note (insn, REG_CFA_ADJUST_CFA,
- gen_rtx_SET (dest, plus_constant (Pmode, reg,
+ gen_rtx_SET (dest, plus_constant (Pmode, src,
poly_offset)));
}
- reg = dest;
+ src = dest;
if (constant == 0)
return;
}
else
{
- reg = aarch64_force_temporary (mode, temp1, val);
+ src = aarch64_force_temporary (mode, temp1, val);
temp1 = temp2;
- temp2 = 0;
+ temp2 = NULL_RTX;
}
emit_move_imm = true;
}
- aarch64_add_constant_internal (mode, dest, temp1, reg, constant,
- frame_related_p, emit_move_imm);
+ aarch64_add_offset_1 (mode, dest, src, constant, temp1,
+ frame_related_p, emit_move_imm);
}
-static inline void
-aarch64_add_sp (rtx temp1, rtx temp2, poly_int64 delta, bool emit_move_imm)
-{
- aarch64_add_offset (Pmode, stack_pointer_rtx, temp1, temp2,
- stack_pointer_rtx, delta, true, emit_move_imm);
-}
+/* Like aarch64_add_offset, but the offset is given as an rtx rather
+ than a poly_int64. */
-static inline void
-aarch64_sub_sp (rtx temp1, rtx temp2, poly_int64 delta, bool frame_related_p)
+void
+aarch64_split_add_offset (scalar_int_mode mode, rtx dest, rtx src,
+ rtx offset_rtx, rtx temp1, rtx temp2)
{
- aarch64_add_offset (Pmode, stack_pointer_rtx, temp1, temp2,
- stack_pointer_rtx, -delta, frame_related_p);
+ aarch64_add_offset (mode, dest, src, rtx_to_poly_int64 (offset_rtx),
+ temp1, temp2, false);
}
-/* If X is a polynomial constant, return the number of temporaries that
- are required to add it to a register. Return -1 otherwise. */
+/* Add DELTA to the stack pointer, marking the instructions frame-related.
+ TEMP1 is available as a temporary if nonnull. EMIT_MOVE_IMM is false
+ if TEMP1 already contains abs (DELTA). */
-int
-aarch64_add_offset_temporaries (rtx x)
+static inline void
+aarch64_add_sp (rtx temp1, rtx temp2, poly_int64 delta, bool emit_move_imm)
{
- poly_int64 offset;
- if (!poly_int_rtx_p (x, &offset))
- return -1;
- return aarch64_offset_temporaries (true, offset);
+ aarch64_add_offset (Pmode, stack_pointer_rtx, stack_pointer_rtx, delta,
+ temp1, temp2, true, emit_move_imm);
}
-/* Like aarch64_add_offset, but the offset is given as an rtx rather
- than a poly_int64. */
+/* Subtract DELTA from the stack pointer, marking the instructions
+ frame-related if FRAME_RELATED_P. TEMP1 is available as a temporary
+ if nonnull. */
-void
-aarch64_split_add_offset (scalar_int_mode mode, rtx dest, rtx temp1,
- rtx temp2, rtx reg, rtx offset_rtx)
+static inline void
+aarch64_sub_sp (rtx temp1, rtx temp2, poly_int64 delta, bool frame_related_p)
{
- aarch64_add_offset (mode, dest, temp1, temp2, reg,
- rtx_to_poly_int64 (offset_rtx), false);
+ aarch64_add_offset (Pmode, stack_pointer_rtx, stack_pointer_rtx, -delta,
+ temp1, temp2, frame_related_p);
}
/* Set DEST to (vec_series BASE STEP). */
@@ -2672,11 +2724,7 @@ aarch64_expand_vec_series (rtx dest, rtx base, rtx step)
machine_mode mode = GET_MODE (dest);
scalar_mode inner = GET_MODE_INNER (mode);
- /* At this point we have to decide which variant of the index insn to use:
- 1. index imm, reg
- 2. index reg, imm
- 3. index reg, reg. */
-
+ /* Each operand can be a register or an immediate in the range [-16, 15]. */
if (!aarch64_sve_index_immediate_p (base))
base = force_reg (inner, base);
if (!aarch64_sve_index_immediate_p (step))
@@ -2724,32 +2772,25 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm,
than that. */
if (partial_subreg_p (int_mode, SImode))
{
+ /* We shouldn't be doing symbol calculations in modes
+ narrower than SImode. */
+ gcc_assert (base == const0_rtx);
dest = gen_lowpart (SImode, dest);
int_mode = SImode;
}
if (base != const0_rtx)
{
base = aarch64_force_temporary (int_mode, dest, base);
- aarch64_add_offset (int_mode, dest, NULL_RTX, NULL_RTX,
- base, offset, false);
+ aarch64_add_offset (int_mode, dest, base, offset,
+ NULL_RTX, NULL_RTX, false);
}
else
- aarch64_add_offset (int_mode, dest, dest, NULL_RTX,
- base, offset, false);
+ aarch64_add_offset (int_mode, dest, base, offset,
+ dest, NULL_RTX, false);
}
return;
}
- /* Cope with complex (const ...) expressions involving VL. */
- if (GET_CODE (base) != SYMBOL_REF
- && GET_CODE (base) != LABEL_REF)
- {
- base = force_reg (int_mode, base);
- aarch64_add_offset (int_mode, dest, dest, NULL_RTX,
- base, offset, false);
- return;
- }
-
sty = aarch64_classify_symbol (base, const_offset);
switch (sty)
{
@@ -2759,8 +2800,8 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm,
{
gcc_assert (can_create_pseudo_p ());
base = aarch64_force_temporary (int_mode, dest, base);
- aarch64_add_offset (int_mode, dest, NULL_RTX, NULL_RTX,
- base, const_offset, false);
+ aarch64_add_offset (int_mode, dest, base, const_offset,
+ NULL_RTX, NULL_RTX, false);
return;
}
@@ -2799,8 +2840,8 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm,
{
gcc_assert(can_create_pseudo_p ());
base = aarch64_force_temporary (int_mode, dest, base);
- aarch64_add_offset (int_mode, dest, NULL_RTX, NULL_RTX,
- base, const_offset, false);
+ aarch64_add_offset (int_mode, dest, base, const_offset,
+ NULL_RTX, NULL_RTX, false);
return;
}
/* FALLTHRU */
@@ -2847,7 +2888,7 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm,
{
rtx mem = force_const_mem (mode, imm);
gcc_assert (mem);
- emit_insn (gen_rtx_SET (dest, mem));
+ emit_move_insn (dest, mem);
}
return;
@@ -2857,31 +2898,39 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm,
as_a <scalar_int_mode> (mode));
}
+/* Emit an SVE predicated move from SRC to DEST. PRED is a predicate
+ that is known to contain PTRUE. */
+
+void
+aarch64_emit_sve_pred_move (rtx dest, rtx pred, rtx src)
+{
+ emit_insn (gen_rtx_SET (dest, gen_rtx_UNSPEC (GET_MODE (dest),
+ gen_rtvec (2, pred, src),
+ UNSPEC_MERGE_PTRUE)));
+}
+
/* Expand a pre-RA SVE data move from SRC to DEST in which at least one
operand is in memory. In this case we need to use the predicated LD1
and ST1 instead of LDR and STR, both for correctness on big-endian
targets and because LD1 and ST1 support a wider range of addressing modes.
- PRED_MODE is the mode of the predicate and GEN_PRED_MOVE is the
- generator for the predicated move pattern. */
+ PRED_MODE is the mode of the predicate that should be used. */
void
-aarch64_expand_sve_mem_move (rtx dest, rtx src, machine_mode pred_mode,
- rtx (*gen_pred_move) (rtx, rtx, rtx))
+aarch64_expand_sve_mem_move (rtx dest, rtx src, machine_mode pred_mode)
{
machine_mode mode = GET_MODE (dest);
rtx ptrue = force_reg (pred_mode, CONSTM1_RTX (pred_mode));
- if (register_operand (src, mode)
- || register_operand (dest, mode))
- emit_insn (gen_pred_move (dest, ptrue, src));
- else
+ if (!register_operand (src, mode)
+ && !register_operand (dest, mode))
{
rtx tmp = gen_reg_rtx (mode);
if (MEM_P (src))
- emit_insn (gen_pred_move (tmp, ptrue, src));
+ aarch64_emit_sve_pred_move (tmp, ptrue, src);
else
emit_move_insn (tmp, src);
- emit_insn (gen_pred_move (dest, ptrue, tmp));
+ src = tmp;
}
+ aarch64_emit_sve_pred_move (dest, ptrue, src);
}
static bool
@@ -2908,7 +2957,8 @@ aarch64_pass_by_reference (cumulative_args_t pcum ATTRIBUTE_UNUSED,
if (mode == BLKmode && type)
size = int_size_in_bytes (type);
else
- /* We don't support passing and returning SVE types. */
+ /* No frontends can create types with variable-sized modes, so we
+ shouldn't be asked to pass or return them. */
size = GET_MODE_SIZE (mode).to_constant ();
/* Aggregates are passed by reference based on their size. */
@@ -3137,7 +3187,8 @@ aarch64_layout_arg (cumulative_args_t pcum_v, machine_mode mode,
if (type)
size = int_size_in_bytes (type);
else
- /* We don't support passing and returning SVE types. */
+ /* No frontends can create types with variable-sized modes, so we
+ shouldn't be asked to pass or return them. */
size = GET_MODE_SIZE (mode).to_constant ();
size = ROUND_UP (size, UNITS_PER_WORD);
@@ -3422,7 +3473,8 @@ aarch64_pad_reg_upward (machine_mode mode, const_tree type,
if (type)
size = int_size_in_bytes (type);
else
- /* We don't support passing and returning SVE types. */
+ /* No frontends can create types with variable-sized modes, so we
+ shouldn't be asked to pass or return them. */
size = GET_MODE_SIZE (mode).to_constant ();
if (size < 2 * UNITS_PER_WORD)
return true;
@@ -3452,12 +3504,19 @@ aarch64_libgcc_cmp_return_mode (void)
#define PROBE_STACK_FIRST_REG 9
#define PROBE_STACK_SECOND_REG 10
-/* Emit code to probe a range of stack addresses from FIRST to FIRST+SIZE,
+/* Emit code to probe a range of stack addresses from FIRST to FIRST+POLY_SIZE,
inclusive. These are offsets from the current stack pointer. */
static void
-aarch64_emit_probe_stack_range (HOST_WIDE_INT first, poly_int64 size)
+aarch64_emit_probe_stack_range (HOST_WIDE_INT first, poly_int64 poly_size)
{
+ HOST_WIDE_INT size;
+ if (!poly_size.is_constant (&size))
+ {
+ sorry ("stack probes for SVE frames");
+ return;
+ }
+
rtx reg1 = gen_rtx_REG (Pmode, PROBE_STACK_FIRST_REG);
/* See the same assertion on PROBE_INTERVAL above. */
@@ -3465,20 +3524,19 @@ aarch64_emit_probe_stack_range (HOST_WIDE_INT first, poly_int64 size)
/* See if we have a constant small number of probes to generate. If so,
that's the easy case. */
- HOST_WIDE_INT const_size;
- if (size.is_constant (&const_size) && const_size <= PROBE_INTERVAL)
+ if (size <= PROBE_INTERVAL)
{
- const HOST_WIDE_INT base = ROUND_UP (const_size, ARITH_FACTOR);
+ const HOST_WIDE_INT base = ROUND_UP (size, ARITH_FACTOR);
emit_set_insn (reg1,
plus_constant (Pmode,
stack_pointer_rtx, -(first + base)));
- emit_stack_probe (plus_constant (Pmode, reg1, base - const_size));
+ emit_stack_probe (plus_constant (Pmode, reg1, base - size));
}
/* The run-time loop is made up of 8 insns in the generic case while the
compile-time loop is made up of 4+2*(n-2) insns for n # of intervals. */
- else if (size.is_constant (&const_size) && const_size <= 4 * PROBE_INTERVAL)
+ else if (size <= 4 * PROBE_INTERVAL)
{
HOST_WIDE_INT i, rem;
@@ -3491,14 +3549,14 @@ aarch64_emit_probe_stack_range (HOST_WIDE_INT first, poly_int64 size)
/* Probe at FIRST + N * PROBE_INTERVAL for values of N from 2 until
it exceeds SIZE. If only two probes are needed, this will not
generate any code. Then probe at FIRST + SIZE. */
- for (i = 2 * PROBE_INTERVAL; i < const_size; i += PROBE_INTERVAL)
+ for (i = 2 * PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
{
emit_set_insn (reg1,
plus_constant (Pmode, reg1, -PROBE_INTERVAL));
emit_stack_probe (reg1);
}
- rem = const_size - (i - PROBE_INTERVAL);
+ rem = size - (i - PROBE_INTERVAL);
if (rem > 256)
{
const HOST_WIDE_INT base = ROUND_UP (rem, ARITH_FACTOR);
@@ -3521,12 +3579,8 @@ aarch64_emit_probe_stack_range (HOST_WIDE_INT first, poly_int64 size)
/* Step 1: round SIZE to the previous multiple of the interval. */
- rtx rounded_size;
- if (size.is_constant (&const_size))
- rounded_size = GEN_INT (const_size & -PROBE_INTERVAL);
- else
- /* Pending SVE support. */
- gcc_unreachable ();
+ HOST_WIDE_INT rounded_size = size & -PROBE_INTERVAL;
+
/* Step 2: compute initial and final value of the loop counter. */
@@ -3535,24 +3589,18 @@ aarch64_emit_probe_stack_range (HOST_WIDE_INT first, poly_int64 size)
plus_constant (Pmode, stack_pointer_rtx, -first));
/* LAST_ADDR = SP + FIRST + ROUNDED_SIZE. */
- HOST_WIDE_INT adjustment = -first;
- if (CONST_INT_P (rounded_size))
- adjustment -= INTVAL (rounded_size);
- if (!aarch64_uimm12_shift (adjustment))
+ HOST_WIDE_INT adjustment = - (first + rounded_size);
+ if (! aarch64_uimm12_shift (adjustment))
{
- aarch64_internal_mov_immediate (reg2,
- gen_int_mode (adjustment, Pmode),
+ aarch64_internal_mov_immediate (reg2, GEN_INT (adjustment),
true, Pmode);
emit_set_insn (reg2, gen_rtx_PLUS (Pmode, stack_pointer_rtx, reg2));
}
else
{
emit_set_insn (reg2,
- plus_constant (Pmode, stack_pointer_rtx,
- adjustment));
+ plus_constant (Pmode, stack_pointer_rtx, adjustment));
}
- if (!CONST_INT_P (rounded_size))
- emit_set_insn (reg2, gen_rtx_MINUS (Pmode, reg2, rounded_size));
/* Step 3: the loop
@@ -3572,11 +3620,9 @@ aarch64_emit_probe_stack_range (HOST_WIDE_INT first, poly_int64 size)
/* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time
that SIZE is equal to ROUNDED_SIZE. */
- if (!size.is_constant (&const_size))
- gcc_unreachable ();
- else if (const_size != INTVAL (rounded_size))
+ if (size != rounded_size)
{
- HOST_WIDE_INT rem = const_size - INTVAL (rounded_size);
+ HOST_WIDE_INT rem = size - rounded_size;
if (rem > 256)
{
@@ -3615,7 +3661,14 @@ aarch64_output_probe_stack_range (rtx reg1, rtx reg2)
output_asm_insn ("sub\t%0, %0, %1", xops);
/* Probe at TEST_ADDR. */
- output_asm_insn ("str\txzr, [%0]", xops);
+ if (flag_stack_clash_protection)
+ {
+ gcc_assert (xops[0] == stack_pointer_rtx);
+ xops[1] = GEN_INT (PROBE_INTERVAL - 8);
+ output_asm_insn ("str\txzr, [%0, %1]", xops);
+ }
+ else
+ output_asm_insn ("str\txzr, [%0]", xops);
/* Test if TEST_ADDR == LAST_ADDR. */
xops[1] = reg2;
@@ -3632,16 +3685,13 @@ aarch64_output_probe_stack_range (rtx reg1, rtx reg2)
static bool
aarch64_frame_pointer_required (void)
{
- /* In aarch64_override_options_after_change
- flag_omit_leaf_frame_pointer turns off the frame pointer by
- default. Turn it back on now if we've not got a leaf
- function. */
- if (flag_omit_leaf_frame_pointer
- && (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM)))
- return true;
-
- /* Force a frame pointer for EH returns so the return address is at FP+8. */
- if (crtl->calls_eh_return)
+ /* Use the frame pointer if enabled and it is not a leaf function, unless
+ leaf frame pointer omission is disabled. If the frame pointer is enabled,
+ force the frame pointer in leaf functions which use LR. */
+ if (flag_omit_frame_pointer == 2
+ && !(flag_omit_leaf_frame_pointer
+ && crtl->is_leaf
+ && !df_regs_ever_live_p (LR_REGNUM)))
return true;
return false;
@@ -3659,6 +3709,10 @@ aarch64_layout_frame (void)
if (reload_completed && cfun->machine->frame.laid_out)
return;
+ /* Force a frame chain for EH returns so the return address is at FP+8. */
+ cfun->machine->frame.emit_frame_chain
+ = frame_pointer_needed || crtl->calls_eh_return;
+
#define SLOT_NOT_REQUIRED (-2)
#define SLOT_REQUIRED (-1)
@@ -3693,14 +3747,14 @@ aarch64_layout_frame (void)
last_fp_reg = regno;
}
- if (frame_pointer_needed)
+ if (cfun->machine->frame.emit_frame_chain)
{
/* FP and LR are placed in the linkage record. */
cfun->machine->frame.reg_offset[R29_REGNUM] = 0;
cfun->machine->frame.wb_candidate1 = R29_REGNUM;
cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD;
cfun->machine->frame.wb_candidate2 = R30_REGNUM;
- offset += 2 * UNITS_PER_WORD;
+ offset = 2 * UNITS_PER_WORD;
}
/* Now assign stack slots for them. */
@@ -3752,6 +3806,8 @@ aarch64_layout_frame (void)
STACK_BOUNDARY / BITS_PER_UNIT);
/* Both these values are already aligned. */
+ gcc_assert (multiple_p (crtl->outgoing_args_size,
+ STACK_BOUNDARY / BITS_PER_UNIT));
cfun->machine->frame.frame_size
= (cfun->machine->frame.hard_fp_offset
+ crtl->outgoing_args_size);
@@ -3804,20 +3860,6 @@ aarch64_layout_frame (void)
cfun->machine->frame.final_adjust
= cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
}
- else if (!frame_pointer_needed
- && varargs_and_saved_regs_size < max_push_offset)
- {
- /* Frame with large local area and outgoing arguments (this pushes the
- callee-saves first, followed by the locals and outgoing area):
- stp reg1, reg2, [sp, -varargs_and_saved_regs_size]!
- stp reg3, reg4, [sp, 16]
- sub sp, sp, frame_size - varargs_and_saved_regs_size */
- cfun->machine->frame.callee_adjust = varargs_and_saved_regs_size;
- cfun->machine->frame.final_adjust
- = cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
- cfun->machine->frame.hard_fp_offset = cfun->machine->frame.callee_adjust;
- cfun->machine->frame.locals_offset = cfun->machine->frame.hard_fp_offset;
- }
else
{
/* Frame with large local area and outgoing arguments using frame pointer:
@@ -4141,6 +4183,9 @@ aarch64_restore_callee_saves (machine_mode mode,
}
}
+/* Return true if OFFSET is a signed 4-bit value multiplied by the size
+ of MODE. */
+
static inline bool
offset_4bit_signed_scaled_p (machine_mode mode, poly_int64 offset)
{
@@ -4149,6 +4194,9 @@ offset_4bit_signed_scaled_p (machine_mode mode, poly_int64 offset)
&& IN_RANGE (multiple, -8, 7));
}
+/* Return true if OFFSET is an unsigned 6-bit value multiplied by the size
+ of MODE. */
+
static inline bool
offset_6bit_unsigned_scaled_p (machine_mode mode, poly_int64 offset)
{
@@ -4157,6 +4205,9 @@ offset_6bit_unsigned_scaled_p (machine_mode mode, poly_int64 offset)
&& IN_RANGE (multiple, 0, 63));
}
+/* Return true if OFFSET is a signed 7-bit value multiplied by the size
+ of MODE. */
+
bool
aarch64_offset_7bit_signed_scaled_p (machine_mode mode, poly_int64 offset)
{
@@ -4165,6 +4216,8 @@ aarch64_offset_7bit_signed_scaled_p (machine_mode mode, poly_int64 offset)
&& IN_RANGE (multiple, -64, 63));
}
+/* Return true if OFFSET is a signed 9-bit value. */
+
static inline bool
offset_9bit_signed_unscaled_p (machine_mode mode ATTRIBUTE_UNUSED,
poly_int64 offset)
@@ -4174,6 +4227,9 @@ offset_9bit_signed_unscaled_p (machine_mode mode ATTRIBUTE_UNUSED,
&& IN_RANGE (const_offset, -256, 255));
}
+/* Return true if OFFSET is a signed 9-bit value multiplied by the size
+ of MODE. */
+
static inline bool
offset_9bit_signed_scaled_p (machine_mode mode, poly_int64 offset)
{
@@ -4182,6 +4238,9 @@ offset_9bit_signed_scaled_p (machine_mode mode, poly_int64 offset)
&& IN_RANGE (multiple, -256, 255));
}
+/* Return true if OFFSET is an unsigned 12-bit value multiplied by the size
+ of MODE. */
+
static inline bool
offset_12bit_unsigned_scaled_p (machine_mode mode, poly_int64 offset)
{
@@ -4406,6 +4465,144 @@ aarch64_set_handled_components (sbitmap components)
cfun->machine->reg_is_wrapped_separately[regno] = true;
}
+/* Allocate POLY_SIZE bytes of stack space using TEMP1 and TEMP2 as scratch
+ registers. */
+
+static void
+aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2,
+ poly_int64 poly_size)
+{
+ HOST_WIDE_INT size;
+ if (!poly_size.is_constant (&size))
+ {
+ sorry ("stack probes for SVE frames");
+ return;
+ }
+
+ HOST_WIDE_INT probe_interval
+ = 1 << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL);
+ HOST_WIDE_INT guard_size
+ = 1 << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE);
+ HOST_WIDE_INT guard_used_by_caller = 1024;
+
+ /* SIZE should be large enough to require probing here, i.e. it
+ must be larger than GUARD_SIZE - GUARD_USED_BY_CALLER.
+
+ We can allocate GUARD_SIZE - GUARD_USED_BY_CALLER as a single chunk
+ without any probing. */
+ gcc_assert (size >= guard_size - guard_used_by_caller);
+ aarch64_sub_sp (temp1, temp2, guard_size - guard_used_by_caller, true);
+ HOST_WIDE_INT orig_size = size;
+ size -= (guard_size - guard_used_by_caller);
+
+ HOST_WIDE_INT rounded_size = size & -probe_interval;
+ HOST_WIDE_INT residual = size - rounded_size;
+
+ /* We can handle a small number of allocations/probes inline. Otherwise
+ punt to a loop. */
+ if (rounded_size && rounded_size <= 4 * probe_interval)
+ {
+ /* We don't use aarch64_sub_sp here because we don't want to
+ repeatedly load TEMP1. */
+ if (probe_interval > ARITH_FACTOR)
+ emit_move_insn (temp1, GEN_INT (-probe_interval));
+ else
+ temp1 = GEN_INT (-probe_interval);
+
+ for (HOST_WIDE_INT i = 0; i < rounded_size; i += probe_interval)
+ {
+ rtx_insn *insn = emit_insn (gen_add2_insn (stack_pointer_rtx,
+ temp1));
+ add_reg_note (insn, REG_STACK_CHECK, const0_rtx);
+
+ if (probe_interval > ARITH_FACTOR)
+ {
+ RTX_FRAME_RELATED_P (insn) = 1;
+ rtx adj = plus_constant (Pmode, stack_pointer_rtx, rounded_size);
+ add_reg_note (insn, REG_CFA_ADJUST_CFA,
+ gen_rtx_SET (stack_pointer_rtx, adj));
+ }
+
+ emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
+ (probe_interval
+ - GET_MODE_SIZE (word_mode))));
+ emit_insn (gen_blockage ());
+ }
+ dump_stack_clash_frame_info (PROBE_INLINE, size != rounded_size);
+ }
+ else if (rounded_size)
+ {
+ /* Compute the ending address. */
+ unsigned int scratchreg = REGNO (temp1);
+ emit_move_insn (temp1, GEN_INT (-rounded_size));
+ rtx_insn *insn
+ = emit_insn (gen_add3_insn (temp1, stack_pointer_rtx, temp1));
+
+ /* For the initial allocation, we don't have a frame pointer
+ set up, so we always need CFI notes. If we're doing the
+ final allocation, then we may have a frame pointer, in which
+ case it is the CFA, otherwise we need CFI notes.
+
+ We can determine which allocation we are doing by looking at
+ the temporary register. IP0 is the initial allocation, IP1
+ is the final allocation. */
+ if (scratchreg == IP0_REGNUM || !frame_pointer_needed)
+ {
+ /* We want the CFA independent of the stack pointer for the
+ duration of the loop. */
+ add_reg_note (insn, REG_CFA_DEF_CFA,
+ plus_constant (Pmode, temp1,
+ (rounded_size + (orig_size - size))));
+ RTX_FRAME_RELATED_P (insn) = 1;
+ }
+
+ /* This allocates and probes the stack.
+
+ It also probes at a 4k interval regardless of the value of
+ PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL. */
+ insn = emit_insn (gen_probe_stack_range (stack_pointer_rtx,
+ stack_pointer_rtx, temp1));
+
+ /* Now reset the CFA register if needed. */
+ if (scratchreg == IP0_REGNUM || !frame_pointer_needed)
+ {
+ add_reg_note (insn, REG_CFA_DEF_CFA,
+ plus_constant (Pmode, stack_pointer_rtx,
+ (rounded_size + (orig_size - size))));
+ RTX_FRAME_RELATED_P (insn) = 1;
+ }
+
+ emit_insn (gen_blockage ());
+ dump_stack_clash_frame_info (PROBE_LOOP, size != rounded_size);
+ }
+ else
+ dump_stack_clash_frame_info (PROBE_INLINE, size != rounded_size);
+
+ /* Handle any residuals.
+ Note that any residual must be probed. */
+ if (residual)
+ {
+ aarch64_sub_sp (temp1, temp2, residual, true);
+ add_reg_note (get_last_insn (), REG_STACK_CHECK, const0_rtx);
+ emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
+ (residual - GET_MODE_SIZE (word_mode))));
+ emit_insn (gen_blockage ());
+ }
+ return;
+}
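/* Editorial sketch, not part of this patch: a standalone illustration of
   how the allocation above is split, using the values this port assumes
   by default (64 KiB guard, 1 KiB of it usable by the caller, 4 KiB probe
   interval).  The names and numbers below are illustrative only.  */
static void
stack_clash_split_sketch (long long size)
{
  const long long guard_size = 64 * 1024;
  const long long guard_used_by_caller = 1024;
  const long long probe_interval = 4096;

  /* The first guard_size - guard_used_by_caller bytes (63 KiB) can be
     allocated without probing, matching the initial aarch64_sub_sp.  */
  long long remaining = size - (guard_size - guard_used_by_caller);

  /* The rest is split into whole probe intervals plus a residual,
     mirroring rounded_size and residual above; each interval gets one
     probe, and a non-zero residual gets one final probe.  */
  long long rounded_size = remaining & -probe_interval;
  long long residual = remaining - rounded_size;
  long long probes = rounded_size / probe_interval + (residual != 0);
  (void) probes;
}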
+
+/* Add a REG_CFA_EXPRESSION note to INSN to say that register REG
+ is saved at BASE + OFFSET. */
+
+static void
+aarch64_add_cfa_expression (rtx_insn *insn, unsigned int reg,
+ rtx base, poly_int64 offset)
+{
+ rtx mem = gen_frame_mem (DImode, plus_constant (Pmode, base, offset));
+ add_reg_note (insn, REG_CFA_EXPRESSION,
+ gen_rtx_SET (mem, regno_reg_rtx[reg]));
+}
+
/* AArch64 stack frames generated by this compiler look like:
+-------------------------------+
@@ -4460,6 +4657,7 @@ aarch64_expand_prologue (void)
poly_int64 callee_offset = cfun->machine->frame.callee_offset;
unsigned reg1 = cfun->machine->frame.wb_candidate1;
unsigned reg2 = cfun->machine->frame.wb_candidate2;
+ bool emit_frame_chain = cfun->machine->frame.emit_frame_chain;
rtx_insn *insn;
/* Sign return address for functions. */
@@ -4489,26 +4687,133 @@ aarch64_expand_prologue (void)
rtx ip0_rtx = gen_rtx_REG (Pmode, IP0_REGNUM);
rtx ip1_rtx = gen_rtx_REG (Pmode, IP1_REGNUM);
- aarch64_sub_sp (ip0_rtx, ip1_rtx, initial_adjust, true);
+
+ /* We do not fully protect aarch64 against stack clash style attacks
+ as doing so would be prohibitively expensive, with diminishing utility
+ over time as newer compilers are deployed.
+
+ We assume the guard is at least 64k. Furthermore, we assume that
+ the caller has not pushed the stack pointer more than 1k into
+ the guard. A caller that pushes the stack pointer more than 1k into
+ the guard is considered invalid.
+
+ Note that the caller's ability to push the stack pointer into the
+ guard is a function of the number and size of outgoing arguments and/or
+ dynamic stack allocations due to the mandatory save of the link register
+ in the caller's frame.
+
+ With those assumptions the callee can allocate up to 63k of stack
+ space without probing.
+
+ When probing is needed, we emit a probe at the start of the prologue
+ and every PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL bytes thereafter.
+
+ We have to track how much space has been allocated, but we do not
+ track stores into the stack as implicit probes except for the
+ fp/lr store. */
+ HOST_WIDE_INT guard_size
+ = 1 << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE);
+ HOST_WIDE_INT guard_used_by_caller = 1024;
+ if (flag_stack_clash_protection)
+ {
+ if (must_eq (frame_size, 0))
+ dump_stack_clash_frame_info (NO_PROBE_NO_FRAME, false);
+ else if (must_lt (initial_adjust, guard_size - guard_used_by_caller)
+ && must_lt (final_adjust, guard_size - guard_used_by_caller))
+ dump_stack_clash_frame_info (NO_PROBE_SMALL_FRAME, true);
+ }
+
+ /* In theory we should never have both an initial adjustment
+ and a callee save adjustment. Verify that is the case since the
+ code below does not handle it for -fstack-clash-protection. */
+ gcc_assert (must_eq (initial_adjust, 0) || callee_adjust == 0);
+
+ /* Only probe if the initial adjustment is larger than the guard
+ less the amount of the guard reserved for use by the caller's
+ outgoing args. */
+ if (flag_stack_clash_protection
+ && may_ge (initial_adjust, guard_size - guard_used_by_caller))
+ aarch64_allocate_and_probe_stack_space (ip0_rtx, ip1_rtx, initial_adjust);
+ else
+ aarch64_sub_sp (ip0_rtx, ip1_rtx, initial_adjust, true);
if (callee_adjust != 0)
aarch64_push_regs (reg1, reg2, callee_adjust);
- if (frame_pointer_needed)
+ if (emit_frame_chain)
{
+ poly_int64 reg_offset = callee_adjust;
if (callee_adjust == 0)
- aarch64_save_callee_saves (DImode, callee_offset, R29_REGNUM,
- R30_REGNUM, false);
- aarch64_add_offset (Pmode, hard_frame_pointer_rtx, ip1_rtx, ip0_rtx,
- stack_pointer_rtx, callee_offset, true);
+ {
+ reg1 = R29_REGNUM;
+ reg2 = R30_REGNUM;
+ reg_offset = callee_offset;
+ aarch64_save_callee_saves (DImode, reg_offset, reg1, reg2, false);
+ }
+ aarch64_add_offset (Pmode, hard_frame_pointer_rtx,
+ stack_pointer_rtx, callee_offset,
+ ip1_rtx, ip0_rtx, frame_pointer_needed);
+ if (!frame_size.is_constant ())
+ {
+ /* Variable-sized frames need to describe the save slot address
+ using DW_CFA_expression rather than DW_CFA_offset. This means
+ that the locations of the registers that we've already saved
+ do not automatically change as the CFA definition changes.
+ We instead need to re-express the save slots with addresses
+ based on the frame pointer rather than the stack pointer. */
+ rtx_insn *insn = get_last_insn ();
+ gcc_assert (RTX_FRAME_RELATED_P (insn));
+
+ /* Add an explicit CFA definition if this was previously
+ implicit. */
+ if (!find_reg_note (insn, REG_CFA_ADJUST_CFA, NULL_RTX))
+ {
+ rtx src = plus_constant (Pmode, stack_pointer_rtx,
+ callee_offset);
+ add_reg_note (insn, REG_CFA_ADJUST_CFA,
+ gen_rtx_SET (hard_frame_pointer_rtx, src));
+ }
+
+ /* Change the save slot expressions for the registers that
+ we've already saved. */
+ reg_offset -= callee_offset;
+ aarch64_add_cfa_expression (insn, reg2, hard_frame_pointer_rtx,
+ reg_offset + UNITS_PER_WORD);
+ aarch64_add_cfa_expression (insn, reg1, hard_frame_pointer_rtx,
+ reg_offset);
+ }
emit_insn (gen_stack_tie (stack_pointer_rtx, hard_frame_pointer_rtx));
}
aarch64_save_callee_saves (DImode, callee_offset, R0_REGNUM, R30_REGNUM,
- callee_adjust != 0 || frame_pointer_needed);
+ callee_adjust != 0 || emit_frame_chain);
aarch64_save_callee_saves (DFmode, callee_offset, V0_REGNUM, V31_REGNUM,
- callee_adjust != 0 || frame_pointer_needed);
- aarch64_sub_sp (ip1_rtx, ip0_rtx, final_adjust, !frame_pointer_needed);
+ callee_adjust != 0 || emit_frame_chain);
+
+ /* We may need to probe the final adjustment as well. */
+ if (flag_stack_clash_protection && may_ne (final_adjust, 0))
+ {
+ /* First probe if the final adjustment is larger than the guard size
+ less the amount of the guard reserved for use by the caller's
+ outgoing args. */
+ if (may_ge (final_adjust, guard_size - guard_used_by_caller))
+ aarch64_allocate_and_probe_stack_space (ip1_rtx, ip0_rtx,
+ final_adjust);
+ else
+ aarch64_sub_sp (ip1_rtx, ip0_rtx, final_adjust, !frame_pointer_needed);
+
+ /* We must also probe if the final adjustment is larger than the guard
+ that is assumed to be used by the caller. This may be sub-optimal. */
+ if (may_ge (final_adjust, guard_used_by_caller))
+ {
+ if (dump_file)
+ fprintf (dump_file,
+ "Stack clash aarch64 large outgoing arg, probing\n");
+ emit_stack_probe (stack_pointer_rtx);
+ }
+ }
+ else
+ aarch64_sub_sp (ip1_rtx, ip0_rtx, final_adjust, !frame_pointer_needed);
}
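/* Editorial sketch, not part of this patch: with the assumptions stated in
   the comment above (a guard of at least 64 KiB, of which at most 1 KiB is
   already used by the caller), the largest single adjustment that can be
   made without a probe is 64 KiB - 1 KiB = 63 KiB; that difference is
   exactly the guard_size - guard_used_by_caller threshold tested against
   both initial_adjust and final_adjust.  */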
/* Return TRUE if we can use a simple_return insn.
@@ -4549,8 +4854,13 @@ aarch64_expand_epilogue (bool for_sibcall)
unsigned reg2 = cfun->machine->frame.wb_candidate2;
rtx cfi_ops = NULL;
rtx_insn *insn;
+ /* A stack clash protection prologue may not have left IP0_REGNUM or
+ IP1_REGNUM in a usable state. The same is true for allocations
+ with an SVE component, since we then need both temporary registers
+ for each allocation. */
bool can_inherit_p = (initial_adjust.is_constant ()
- && final_adjust.is_constant ());
+ && final_adjust.is_constant ()
+ && !flag_stack_clash_protection);
/* We need to add memory barrier to prevent read from deallocated stack. */
bool need_barrier_p = may_ne (get_frame_size ()
@@ -4572,9 +4882,9 @@ aarch64_expand_epilogue (bool for_sibcall)
if (frame_pointer_needed && (may_ne (final_adjust, 0) || cfun->calls_alloca))
/* If writeback is used when restoring callee-saves, the CFA
is restored on the instruction doing the writeback. */
- aarch64_add_offset (Pmode, stack_pointer_rtx, ip1_rtx, ip0_rtx,
+ aarch64_add_offset (Pmode, stack_pointer_rtx,
hard_frame_pointer_rtx, -callee_offset,
- callee_adjust == 0);
+ ip1_rtx, ip0_rtx, callee_adjust == 0);
else
aarch64_add_sp (ip1_rtx, ip0_rtx, final_adjust,
!can_inherit_p || df_regs_ever_live_p (IP1_REGNUM));
@@ -4711,7 +5021,7 @@ aarch64_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED,
temp1 = gen_rtx_REG (Pmode, IP1_REGNUM);
if (vcall_offset == 0)
- aarch64_add_constant (Pmode, this_rtx, temp1, delta);
+ aarch64_add_offset (Pmode, this_rtx, this_rtx, delta, temp1, temp0, false);
else
{
gcc_assert ((vcall_offset & (POINTER_BYTES - 1)) == 0);
@@ -4723,7 +5033,8 @@ aarch64_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED,
addr = gen_rtx_PRE_MODIFY (Pmode, this_rtx,
plus_constant (Pmode, this_rtx, delta));
else
- aarch64_add_constant (Pmode, this_rtx, temp1, delta);
+ aarch64_add_offset (Pmode, this_rtx, this_rtx, delta,
+ temp1, temp0, false);
}
if (Pmode == ptr_mode)
@@ -5996,8 +6307,8 @@ aarch64_const_vec_all_same_int_p (rtx x, HOST_WIDE_INT val)
return aarch64_const_vec_all_same_in_range_p (x, val, val);
}
-/* Return true if VEC is a constant in which every element is in the
- range [MINVAL, MAXVAL]. */
+/* Return true if VEC is a constant in which every element is in the range
+ [MINVAL, MAXVAL]. The elements do not need to have the same value. */
static bool
aarch64_const_vec_all_in_range_p (rtx vec,
@@ -6125,6 +6436,10 @@ aarch64_print_vector_float_operand (FILE *f, rtx x, bool negate)
The acceptable formatting commands given by CODE are:
'c': An integer or symbol address without a preceding #
sign.
+ 'C': Take the duplicated element in a vector constant
+ and print it in hex.
+ 'D': Take the duplicated element in a vector constant
+ and print it as an unsigned integer, in decimal.
'e': Print the sign/zero-extend size as a character 8->b,
16->h, 32->w.
'p': Prints N such that 2^N == X (X must be power of 2 and
@@ -6134,6 +6449,8 @@ aarch64_print_vector_float_operand (FILE *f, rtx x, bool negate)
of regs.
'm': Print a condition (eq, ne, etc).
'M': Same as 'm', but invert condition.
+ 'N': Take the duplicated element in a vector constant
+ and print the negative of it in decimal.
'b/h/s/d/q': Print a scalar FP/SIMD register name.
'S/T/U/V': Print a FP/SIMD register name for a register list.
The register printed is the FP/SIMD register name
@@ -6350,6 +6667,20 @@ aarch64_print_operand (FILE *f, rtx x, int code)
}
break;
+ case 'D':
+ {
+ /* Print a replicated constant in decimal, treating it as
+ unsigned. */
+ if (!const_vec_duplicate_p (x, &elt) || !CONST_INT_P (elt))
+ {
+ output_operand_lossage ("invalid operand for '%%%c'", code);
+ return;
+ }
+ scalar_mode inner_mode = GET_MODE_INNER (GET_MODE (x));
+ asm_fprintf (f, "%wd", UINTVAL (elt) & GET_MODE_MASK (inner_mode));
+ }
+ break;
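/* Editorial sketch, not part of this patch: for a V4HImode constant that
   duplicates the value -1, the mask with GET_MODE_MASK (HImode) above
   makes %D print 65535 rather than -1.  */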
+
case 'w':
case 'x':
if (x == const0_rtx
@@ -6881,13 +7212,15 @@ aarch64_legitimize_address_displacement (rtx *offset1, rtx *offset2,
/* Split an out-of-range address displacement into a base and
offset. Use 4KB range for 1- and 2-byte accesses and a 16KB
range otherwise to increase opportunities for sharing the base
- address of different sizes. For unaligned accesses and TI/TF
- mode use the signed 9-bit range. */
- second_offset = const_offset & (size < 4 ? 0xfff : 0x3ffc);
- if (mode == TImode
- || mode == TFmode
- || (const_offset & (size - 1)) != 0)
- second_offset = (const_offset + 0x100) & ~0x1ff;
+ address of different sizes. Unaligned accesses use the signed
+ 9-bit range, TImode/TFmode use the intersection of signed
+ scaled 7-bit and signed 9-bit offset. */
+ if (mode == TImode || mode == TFmode)
+ second_offset = ((const_offset + 0x100) & 0x1f8) - 0x100;
+ else if ((const_offset & (size - 1)) != 0)
+ second_offset = ((const_offset + 0x100) & 0x1ff) - 0x100;
+ else
+ second_offset = const_offset & (size < 4 ? 0xfff : 0x3ffc);
if (second_offset == 0 || must_eq (orig_offset, second_offset))
return false;
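/* Editorial sketch, not part of this patch: worked examples of the split
   above (offsets chosen for illustration only).
     TImode, const_offset = 0x212:
       ((0x212 + 0x100) & 0x1f8) - 0x100 = 0x10, a multiple of 8 in the
       signed 9-bit range.
     Unaligned 4-byte access, const_offset = 0x1ffe:
       ((0x1ffe + 0x100) & 0x1ff) - 0x100 = -2, in the signed 9-bit range.
     Aligned 4-byte access, const_offset = 0x12345:
       0x12345 & 0x3ffc = 0x2344.  */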
@@ -6985,6 +7318,14 @@ aarch64_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x,
machine_mode mode,
secondary_reload_info *sri)
{
+ if (BYTES_BIG_ENDIAN
+ && reg_class_subset_p (rclass, FP_REGS)
+ && (MEM_P (x) || (REG_P (x) && !HARD_REGISTER_P (x)))
+ && aarch64_sve_data_mode_p (mode))
+ {
+ sri->icode = CODE_FOR_aarch64_sve_reload_be;
+ return NO_REGS;
+ }
/* If we have to disable direct literal pool loads and stores because the
function is too big, then we need a scratch register. */
@@ -7023,15 +7364,6 @@ aarch64_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x,
return NO_REGS;
}
-/* Implement TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P. */
-
-static bool
-aarch64_cannot_substitute_mem_equiv_p (rtx mem)
-{
- /* See the comments above aarch64_sve_mov<mode> for details. */
- return BYTES_BIG_ENDIAN && aarch64_sve_data_mode_p (GET_MODE (mem));
-}
-
static bool
aarch64_can_eliminate (const int from, const int to)
{
@@ -7058,6 +7390,7 @@ aarch64_can_eliminate (const int from, const int to)
LR in the function, then we'll want a frame pointer after all, so
prevent this elimination to ensure a frame pointer is used. */
if (to == STACK_POINTER_REGNUM
+ && flag_omit_frame_pointer == 2
&& flag_omit_leaf_frame_pointer
&& df_regs_ever_live_p (LR_REGNUM))
return false;
@@ -7152,8 +7485,12 @@ aarch64_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
static unsigned char
aarch64_class_max_nregs (reg_class_t regclass, machine_mode mode)
{
+ /* ??? Logically we should only need to provide a value when
+ HARD_REGNO_MODE_OK says that at least one register in REGCLASS
+ can hold MODE, but at the moment we need to handle all modes.
+ Just ignore any runtime parts for registers that can't store them. */
+ HOST_WIDE_INT lowest_size = constant_lower_bound (GET_MODE_SIZE (mode));
unsigned int nregs;
- HOST_WIDE_INT size;
switch (regclass)
{
case CALLER_SAVE_REGS:
@@ -7167,10 +7504,9 @@ aarch64_class_max_nregs (reg_class_t regclass, machine_mode mode)
&& constant_multiple_p (GET_MODE_SIZE (mode),
BYTES_PER_SVE_VECTOR, &nregs))
return nregs;
- size = constant_lower_bound (GET_MODE_SIZE (mode));
return (aarch64_vector_data_mode_p (mode)
- ? CEIL (size, UNITS_PER_VREG)
- : CEIL (size, UNITS_PER_WORD));
+ ? CEIL (lowest_size, UNITS_PER_VREG)
+ : CEIL (lowest_size, UNITS_PER_WORD));
case STACK_REG:
case PR_REGS:
case PR_LO_REGS:
@@ -8070,20 +8406,16 @@ aarch64_rtx_costs (rtx x, machine_mode mode, int outer ATTRIBUTE_UNUSED,
/* The cost is one per vector-register copied. */
if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1))
{
- int size = constant_lower_bound (GET_MODE_SIZE (GET_MODE (op0)));
- int n_minus_1 = (size - 1) / UNITS_PER_VREG;
- *cost = COSTS_N_INSNS (n_minus_1 + 1);
+ int nregs = aarch64_hard_regno_nregs (V0_REGNUM, GET_MODE (op0));
+ *cost = COSTS_N_INSNS (nregs);
}
/* const0_rtx is in general free, but we will use an
instruction to set a register to 0. */
else if (REG_P (op1) || op1 == const0_rtx)
{
- /* The cost is 1 per register copied. The size in this
- case must be constant, since all variable-size modes
- are vectors. */
- int size = GET_MODE_SIZE (GET_MODE (op0)).to_constant ();
- int n_minus_1 = (size - 1) / UNITS_PER_WORD;
- *cost = COSTS_N_INSNS (n_minus_1 + 1);
+ /* The cost is 1 per register copied. */
+ int nregs = aarch64_hard_regno_nregs (R0_REGNUM, GET_MODE (op0));
+ *cost = COSTS_N_INSNS (nregs);
}
else
/* Cost is just the cost of the RHS of the set. */
@@ -9694,9 +10026,11 @@ aarch64_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
return costs->scalar_to_vec_cost;
case unaligned_load:
+ case vector_gather_load:
return costs->vec_unalign_load_cost;
case unaligned_store:
+ case vector_scatter_store:
return costs->vec_unalign_store_cost;
case cond_branch_taken:
@@ -10098,24 +10432,16 @@ aarch64_parse_override_string (const char* input_string,
static void
aarch64_override_options_after_change_1 (struct gcc_options *opts)
{
- /* The logic here is that if we are disabling all frame pointer generation
- then we do not need to disable leaf frame pointer generation as a
- separate operation. But if we are *only* disabling leaf frame pointer
- generation then we set flag_omit_frame_pointer to true, but in
- aarch64_frame_pointer_required we return false only for leaf functions.
-
- PR 70044: We have to be careful about being called multiple times for the
- same function. Once we have decided to set flag_omit_frame_pointer just
- so that we can omit leaf frame pointers, we must then not interpret a
- second call as meaning that all frame pointer generation should be
- omitted. We do this by setting flag_omit_frame_pointer to a special,
- non-zero value. */
- if (opts->x_flag_omit_frame_pointer == 2)
- opts->x_flag_omit_frame_pointer = 0;
-
- if (opts->x_flag_omit_frame_pointer)
- opts->x_flag_omit_leaf_frame_pointer = false;
- else if (opts->x_flag_omit_leaf_frame_pointer)
+ /* PR 70044: We have to be careful about being called multiple times for the
+ same function. This means all changes should be repeatable. */
+
+ /* If the frame pointer is enabled, set it to a special value that behaves
+ similar to frame pointer omission. If we don't do this all leaf functions
+ will get a frame pointer even if flag_omit_leaf_frame_pointer is set.
+ If flag_omit_frame_pointer has this special value, we must force the
+ frame pointer if not in a leaf function. We also need to force it in a
+ leaf function if flag_omit_frame_pointer is not set or if LR is used. */
+ if (opts->x_flag_omit_frame_pointer == 0)
opts->x_flag_omit_frame_pointer = 2;
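/* Editorial sketch, not part of this patch: the resulting encoding of
   flag_omit_frame_pointer as used by the code in this patch:
     1 -> -fomit-frame-pointer was given; frame pointers may be omitted,
     2 -> frame pointers are enabled; aarch64_frame_pointer_required then
          requires one except in leaf functions that do not use LR when
          -momit-leaf-frame-pointer is in effect,
     0 -> only seen before this rewrite; it is turned into 2 above.  */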
/* If not optimizing for size, set the default
@@ -10225,6 +10551,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
opts->x_param_values,
global_options_set.x_param_values);
+ /* Use the alternative scheduling-pressure algorithm by default. */
+ maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
+ opts->x_param_values,
+ global_options_set.x_param_values);
+
/* Enable sw prefetching at specified optimization level for
CPUS that have prefetch. Lower optimization level threshold by 1
when profiling is enabled. */
@@ -10234,6 +10565,12 @@ aarch64_override_options_internal (struct gcc_options *opts)
&& opts->x_optimize >= aarch64_tune_params.prefetch->default_opt_level)
opts->x_flag_prefetch_loop_arrays = 1;
+ /* We assume the guard page is 64k. */
+ maybe_set_param_value (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE,
+ 16,
+ opts->x_param_values,
+ global_options_set.x_param_values);
+
aarch64_override_options_after_change_1 (opts);
}
@@ -10409,7 +10746,7 @@ static poly_uint16
aarch64_convert_sve_vector_bits (aarch64_sve_vector_bits_enum value)
{
/* For now generate vector-length agnostic code for -msve-vector-bits=128.
- This ensures we can clearly distinguish SVE and AdvSIMD modes when
+ This ensures we can clearly distinguish SVE and Advanced SIMD modes when
deciding which .md file patterns to use and when deciding whether
something is a legitimate address or constant. */
if (value == SVE_SCALABLE || value == SVE_128)
@@ -10712,9 +11049,8 @@ enum aarch64_attr_opt_type
ATTR_TYPE specifies the type of behavior of the attribute as described
in the definition of enum aarch64_attr_opt_type.
ALLOW_NEG is true if the attribute supports a "no-" form.
- HANDLER is the function that takes the attribute string and whether
- it is a pragma or attribute and handles the option. It is needed only
- when the ATTR_TYPE is aarch64_attr_custom.
+ HANDLER is the function that takes the attribute string as an argument.
+ It is needed only when the ATTR_TYPE is aarch64_attr_custom.
OPT_NUM is the enum specifying the option that the attribute modifies.
This is needed for attributes that mirror the behavior of a command-line
option, that is it has ATTR_TYPE aarch64_attr_mask, aarch64_attr_bool or
@@ -10725,15 +11061,14 @@ struct aarch64_attribute_info
const char *name;
enum aarch64_attr_opt_type attr_type;
bool allow_neg;
- bool (*handler) (const char *, const char *);
+ bool (*handler) (const char *);
enum opt_code opt_num;
};
-/* Handle the ARCH_STR argument to the arch= target attribute.
- PRAGMA_OR_ATTR is used in potential error messages. */
+/* Handle the ARCH_STR argument to the arch= target attribute. */
static bool
-aarch64_handle_attr_arch (const char *str, const char *pragma_or_attr)
+aarch64_handle_attr_arch (const char *str)
{
const struct processor *tmp_arch = NULL;
enum aarch64_parse_opt_result parse_res
@@ -10750,15 +11085,14 @@ aarch64_handle_attr_arch (const char *str, const char *pragma_or_attr)
switch (parse_res)
{
case AARCH64_PARSE_MISSING_ARG:
- error ("missing architecture name in 'arch' target %s", pragma_or_attr);
+ error ("missing name in %<target(\"arch=\")%> pragma or attribute");
break;
case AARCH64_PARSE_INVALID_ARG:
- error ("unknown value %qs for 'arch' target %s", str, pragma_or_attr);
+ error ("invalid name (\"%s\") in %<target(\"arch=\")%> pragma or attribute", str);
aarch64_print_hint_for_arch (str);
break;
case AARCH64_PARSE_INVALID_FEATURE:
- error ("invalid feature modifier %qs for 'arch' target %s",
- str, pragma_or_attr);
+ error ("invalid value (\"%s\") in %<target()%> pragma or attribute", str);
break;
default:
gcc_unreachable ();
@@ -10767,11 +11101,10 @@ aarch64_handle_attr_arch (const char *str, const char *pragma_or_attr)
return false;
}
-/* Handle the argument CPU_STR to the cpu= target attribute.
- PRAGMA_OR_ATTR is used in potential error messages. */
+/* Handle the argument CPU_STR to the cpu= target attribute. */
static bool
-aarch64_handle_attr_cpu (const char *str, const char *pragma_or_attr)
+aarch64_handle_attr_cpu (const char *str)
{
const struct processor *tmp_cpu = NULL;
enum aarch64_parse_opt_result parse_res
@@ -10791,15 +11124,14 @@ aarch64_handle_attr_cpu (const char *str, const char *pragma_or_attr)
switch (parse_res)
{
case AARCH64_PARSE_MISSING_ARG:
- error ("missing cpu name in 'cpu' target %s", pragma_or_attr);
+ error ("missing name in %<target(\"cpu=\")%> pragma or attribute");
break;
case AARCH64_PARSE_INVALID_ARG:
- error ("unknown value %qs for 'cpu' target %s", str, pragma_or_attr);
+ error ("invalid name (\"%s\") in %<target(\"cpu=\")%> pragma or attribute", str);
aarch64_print_hint_for_core (str);
break;
case AARCH64_PARSE_INVALID_FEATURE:
- error ("invalid feature modifier %qs for 'cpu' target %s",
- str, pragma_or_attr);
+ error ("invalid value (\"%s\") in %<target()%> pragma or attribute", str);
break;
default:
gcc_unreachable ();
@@ -10808,11 +11140,10 @@ aarch64_handle_attr_cpu (const char *str, const char *pragma_or_attr)
return false;
}
-/* Handle the argument STR to the tune= target attribute.
- PRAGMA_OR_ATTR is used in potential error messages. */
+/* Handle the argument STR to the tune= target attribute. */
static bool
-aarch64_handle_attr_tune (const char *str, const char *pragma_or_attr)
+aarch64_handle_attr_tune (const char *str)
{
const struct processor *tmp_tune = NULL;
enum aarch64_parse_opt_result parse_res
@@ -10829,7 +11160,7 @@ aarch64_handle_attr_tune (const char *str, const char *pragma_or_attr)
switch (parse_res)
{
case AARCH64_PARSE_INVALID_ARG:
- error ("unknown value %qs for 'tune' target %s", str, pragma_or_attr);
+ error ("invalid name (\"%s\") in %<target(\"tune=\")%> pragma or attribute", str);
aarch64_print_hint_for_core (str);
break;
default:
@@ -10842,11 +11173,10 @@ aarch64_handle_attr_tune (const char *str, const char *pragma_or_attr)
/* Parse an architecture extensions target attribute string specified in STR.
For example "+fp+nosimd". Show any errors if needed. Return TRUE
if successful. Update aarch64_isa_flags to reflect the ISA features
- modified.
- PRAGMA_OR_ATTR is used in potential error messages. */
+ modified. */
static bool
-aarch64_handle_attr_isa_flags (char *str, const char *pragma_or_attr)
+aarch64_handle_attr_isa_flags (char *str)
{
enum aarch64_parse_opt_result parse_res;
unsigned long isa_flags = aarch64_isa_flags;
@@ -10870,13 +11200,11 @@ aarch64_handle_attr_isa_flags (char *str, const char *pragma_or_attr)
switch (parse_res)
{
case AARCH64_PARSE_MISSING_ARG:
- error ("missing feature modifier in target %s %qs",
- pragma_or_attr, str);
+ error ("missing value in %<target()%> pragma or attribute");
break;
case AARCH64_PARSE_INVALID_FEATURE:
- error ("invalid feature modifier in target %s %qs",
- pragma_or_attr, str);
+ error ("invalid value (\"%s\") in %<target()%> pragma or attribute", str);
break;
default:
@@ -10914,12 +11242,10 @@ static const struct aarch64_attribute_info aarch64_attributes[] =
};
/* Parse ARG_STR which contains the definition of one target attribute.
- Show appropriate errors if any or return true if the attribute is valid.
- PRAGMA_OR_ATTR holds the string to use in error messages about whether
- we're processing a target attribute or pragma. */
+ Show appropriate errors if any or return true if the attribute is valid. */
static bool
-aarch64_process_one_target_attr (char *arg_str, const char* pragma_or_attr)
+aarch64_process_one_target_attr (char *arg_str)
{
bool invert = false;
@@ -10927,7 +11253,7 @@ aarch64_process_one_target_attr (char *arg_str, const char* pragma_or_attr)
if (len == 0)
{
- error ("malformed target %s", pragma_or_attr);
+ error ("malformed %<target()%> pragma or attribute");
return false;
}
@@ -10943,7 +11269,7 @@ aarch64_process_one_target_attr (char *arg_str, const char* pragma_or_attr)
through the machinery for the rest of the target attributes in this
function. */
if (*str_to_check == '+')
- return aarch64_handle_attr_isa_flags (str_to_check, pragma_or_attr);
+ return aarch64_handle_attr_isa_flags (str_to_check);
if (len > 3 && strncmp (str_to_check, "no-", 3) == 0)
{
@@ -10975,8 +11301,7 @@ aarch64_process_one_target_attr (char *arg_str, const char* pragma_or_attr)
if (attr_need_arg_p ^ (arg != NULL))
{
- error ("target %s %qs does not accept an argument",
- pragma_or_attr, str_to_check);
+ error ("pragma or attribute %<target(\"%s\")%> does not accept an argument", str_to_check);
return false;
}
@@ -10984,8 +11309,7 @@ aarch64_process_one_target_attr (char *arg_str, const char* pragma_or_attr)
then we can't match. */
if (invert && !p_attr->allow_neg)
{
- error ("target %s %qs does not allow a negated form",
- pragma_or_attr, str_to_check);
+ error ("pragma or attribute %<target(\"%s\")%> does not allow a negated form", str_to_check);
return false;
}
@@ -10995,7 +11319,7 @@ aarch64_process_one_target_attr (char *arg_str, const char* pragma_or_attr)
For example, cpu=, arch=, tune=. */
case aarch64_attr_custom:
gcc_assert (p_attr->handler);
- if (!p_attr->handler (arg, pragma_or_attr))
+ if (!p_attr->handler (arg))
return false;
break;
@@ -11039,8 +11363,7 @@ aarch64_process_one_target_attr (char *arg_str, const char* pragma_or_attr)
}
else
{
- error ("target %s %s=%s is not valid",
- pragma_or_attr, str_to_check, arg);
+ error ("pragma or attribute %<target(\"%s=%s\")%> is not valid", str_to_check, arg);
}
break;
}
@@ -11074,12 +11397,10 @@ num_occurences_in_str (char c, char *str)
}
/* Parse the tree in ARGS that contains the target attribute information
- and update the global target options space. PRAGMA_OR_ATTR is a string
- to be used in error messages, specifying whether this is processing
- a target attribute or a target pragma. */
+ and update the global target options space. */
bool
-aarch64_process_target_attr (tree args, const char* pragma_or_attr)
+aarch64_process_target_attr (tree args)
{
if (TREE_CODE (args) == TREE_LIST)
{
@@ -11088,7 +11409,7 @@ aarch64_process_target_attr (tree args, const char* pragma_or_attr)
tree head = TREE_VALUE (args);
if (head)
{
- if (!aarch64_process_target_attr (head, pragma_or_attr))
+ if (!aarch64_process_target_attr (head))
return false;
}
args = TREE_CHAIN (args);
@@ -11109,7 +11430,7 @@ aarch64_process_target_attr (tree args, const char* pragma_or_attr)
if (len == 0)
{
- error ("malformed target %s value", pragma_or_attr);
+ error ("malformed %<target()%> pragma or attribute");
return false;
}
@@ -11124,9 +11445,9 @@ aarch64_process_target_attr (tree args, const char* pragma_or_attr)
while (token)
{
num_attrs++;
- if (!aarch64_process_one_target_attr (token, pragma_or_attr))
+ if (!aarch64_process_one_target_attr (token))
{
- error ("target %s %qs is invalid", pragma_or_attr, token);
+ error ("pragma or attribute %<target(\"%s\")%> is not valid", token);
return false;
}
@@ -11135,8 +11456,7 @@ aarch64_process_target_attr (tree args, const char* pragma_or_attr)
if (num_attrs != num_commas + 1)
{
- error ("malformed target %s list %qs",
- pragma_or_attr, TREE_STRING_POINTER (args));
+ error ("malformed %<target(\"%s\")%> pragma or attribute", TREE_STRING_POINTER (args));
return false;
}
@@ -11195,8 +11515,7 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
cl_target_option_restore (&global_options,
TREE_TARGET_OPTION (target_option_current_node));
-
- ret = aarch64_process_target_attr (args, "attribute");
+ ret = aarch64_process_target_attr (args);
/* Set up any additional state. */
if (ret)
@@ -11525,64 +11844,63 @@ aarch64_legitimate_pic_operand_p (rtx x)
return true;
}
-/* Return true if X holds either a quarter-precision or
- floating-point +0.0 constant. */
-static bool
-aarch64_valid_floating_const (rtx x)
-{
- if (!CONST_DOUBLE_P (x))
- return false;
-
- /* This call determines which constants can be used in mov<mode>
- as integer moves instead of constant loads. */
- if (aarch64_float_const_rtx_p (x))
- return true;
-
- return aarch64_float_const_representable_p (x);
-}
+/* Implement TARGET_LEGITIMATE_CONSTANT_P hook. Return true for constants
+ that should be rematerialized rather than spilled. */
static bool
aarch64_legitimate_constant_p (machine_mode mode, rtx x)
{
- /* Do not allow vector struct mode constants. We could support
- 0 and -1 easily, but they need support in aarch64-simd.md. */
+ /* Support CSE and rematerialization of common constants. */
+ if (CONST_SCALAR_INT_P (x)
+ || CONST_DOUBLE_P (x)
+ || GET_CODE (x) == CONST_VECTOR)
+ return true;
+
+ /* Do not allow vector struct mode constants for Advanced SIMD.
+ We could support 0 and -1 easily, but they need support in
+ aarch64-simd.md. */
unsigned int vec_flags = aarch64_classify_vector_mode (mode);
if (vec_flags == (VEC_ADVSIMD | VEC_STRUCT))
return false;
- /* For these cases we never want to use a literal load.
- As such we have to prevent the compiler from forcing these
- to memory. */
- rtx base, step;
- if (aarch64_simd_valid_immediate (x, NULL)
- || (vec_flags == VEC_SVE_DATA
- && (const_vec_series_p (x, &base, &step)
- || const_vec_duplicate_p (x)))
- || CONST_INT_P (x)
- || aarch64_valid_floating_const (x)
- || aarch64_can_const_movi_rtx_p (x, mode)
- || aarch64_float_const_rtx_p (x))
- return !targetm.cannot_force_const_mem (mode, x);
+ /* Only accept variable-length vector constants if they can be
+ handled directly.
- if (GET_CODE (x) == HIGH
- && aarch64_valid_symref (XEXP (x, 0), GET_MODE (XEXP (x, 0))))
- return true;
+ ??? It would be possible to handle rematerialization of other
+ constants via secondary reloads. */
+ if (vec_flags & VEC_ANY_SVE)
+ return aarch64_simd_valid_immediate (x, NULL);
+
+ if (GET_CODE (x) == HIGH)
+ x = XEXP (x, 0);
- /* Accept polynomials that can be calculated by using the destination
- of a move as the sole temporary. Constants that require a second
- temporary cannot be rematerialized (they can't be forced to memory
- and also aren't legitimate constants). */
+ /* Accept polynomial constants that can be calculated by using the
+ destination of a move as the sole temporary. Constants that
+ require a second temporary cannot be rematerialized (they can't be
+ forced to memory and also aren't legitimate constants). */
poly_int64 offset;
if (poly_int_rtx_p (x, &offset))
return aarch64_offset_temporaries (false, offset) <= 1;
+ /* If an offset is being added to something else, we need to allow the
+ base to be moved into the destination register, meaning that there
+ are no free temporaries for the offset. */
+ x = strip_offset (x, &offset);
+ if (!offset.is_constant () && aarch64_offset_temporaries (true, offset) > 0)
+ return false;
+
+ /* Do not allow const (plus (anchor_symbol, const_int)). */
+ if (may_ne (offset, 0) && SYMBOL_REF_P (x) && SYMBOL_REF_ANCHOR_P (x))
+ return false;
+
/* Treat symbols as constants. Avoid TLS symbols as they are complex,
so spilling them is better than rematerialization. */
if (SYMBOL_REF_P (x) && !SYMBOL_REF_TLS_MODEL (x))
return true;
- if (SCALAR_INT_MODE_P (GET_MODE (x)))
- return aarch64_constant_address_p (x);
+ /* Label references are always constant. */
+ if (GET_CODE (x) == LABEL_REF)
+ return true;
return false;
}
@@ -11809,7 +12127,8 @@ aarch64_gimplify_va_arg_expr (tree valist, tree type, gimple_seq *pre_p,
&nregs,
&is_ha))
{
- /* We don't support passing and returning SVE types. */
+ /* No frontends can create types with variable-sized modes, so we
+ shouldn't be asked to pass or return them. */
unsigned int ag_size = GET_MODE_SIZE (ag_mode).to_constant ();
/* TYPE passed in fp/simd registers. */
@@ -12430,12 +12749,14 @@ aarch64_simd_container_mode (scalar_mode mode, poly_int64 width)
if (TARGET_SVE && must_eq (width, BITS_PER_SVE_VECTOR))
switch (mode)
{
- case E_DImode:
- return V4DImode;
case E_DFmode:
return V4DFmode;
case E_SFmode:
return V8SFmode;
+ case E_HFmode:
+ return V16HFmode;
+ case E_DImode:
+ return V4DImode;
case E_SImode:
return V8SImode;
case E_HImode:
@@ -12690,6 +13011,7 @@ aarch64_sve_arith_immediate_p (rtx x, bool negate_p)
HOST_WIDE_INT val = INTVAL (elt);
if (negate_p)
val = -val;
+ val &= GET_MODE_MASK (GET_MODE_INNER (GET_MODE (x)));
if (val & 0xff)
return IN_RANGE (val, 0, 0xff);
@@ -12783,6 +13105,9 @@ aarch64_sve_float_mul_immediate_p (rtx x)
&& real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconsthalf));
}
+/* Return true if replicating VAL32 is a valid 2-byte or 4-byte immediate
+ for the Advanced SIMD operation described by WHICH and INSN. If INFO
+ is nonnull, use it to describe valid immediates. */
static bool
aarch64_advsimd_valid_immediate_hs (unsigned int val32,
simd_immediate_info *info,
@@ -12829,9 +13154,9 @@ aarch64_advsimd_valid_immediate_hs (unsigned int val32,
return false;
}
-/* Return true if replicating VAL64 is a valid immediate for an AdvSIMD
- MOVI or MVNI instruction. If INFO is nonnull, use it to describe valid
- immediates. */
+/* Return true if replicating VAL64 is a valid immediate for the
+ Advanced SIMD operation described by WHICH. If INFO is nonnull,
+ use it to describe valid immediates. */
static bool
aarch64_advsimd_valid_immediate (unsigned HOST_WIDE_INT val64,
simd_immediate_info *info,
@@ -12863,6 +13188,7 @@ aarch64_advsimd_valid_immediate (unsigned HOST_WIDE_INT val64,
return true;
}
}
+
/* Try using a bit-to-bytemask. */
if (which == AARCH64_CHECK_MOV)
{
@@ -12885,6 +13211,7 @@ aarch64_advsimd_valid_immediate (unsigned HOST_WIDE_INT val64,
/* Return true if replicating VAL64 gives a valid immediate for an SVE MOV
instruction. If INFO is nonnull, use it to describe valid immediates. */
+
static bool
aarch64_sve_valid_immediate (unsigned HOST_WIDE_INT val64,
simd_immediate_info *info)
@@ -12928,8 +13255,9 @@ aarch64_sve_valid_immediate (unsigned HOST_WIDE_INT val64,
return false;
}
-/* Return true if OP is a valid SIMD immediate. If INFO is nonnull,
- use it to describe valid immediates. */
+/* Return true if OP is a valid SIMD immediate for the operation
+ described by WHICH. If INFO is nonnull, use it to describe valid
+ immediates. */
bool
aarch64_simd_valid_immediate (rtx op, simd_immediate_info *info,
enum simd_immediate_check which)
@@ -12983,7 +13311,8 @@ aarch64_simd_valid_immediate (rtx op, simd_immediate_info *info,
scalar_int_mode elt_int_mode = int_mode_for_mode (elt_mode).require ();
- /* Splat vector constant out into a byte vector. */
+ /* Expand the vector constant out into a byte vector, with the least
+ significant byte of the register first. */
auto_vec<unsigned char, 16> bytes;
bytes.reserve (n_elts * elt_size);
for (unsigned int i = 0; i < n_elts; i++)
@@ -13014,7 +13343,8 @@ aarch64_simd_valid_immediate (rtx op, simd_immediate_info *info,
if (bytes[i] != bytes[i - 8])
return false;
- /* Get the repeating 8-byte value as an integer. */
+ /* Get the repeating 8-byte value as an integer. No endian correction
+ is needed here because bytes is already in lsb-first order. */
unsigned HOST_WIDE_INT val64 = 0;
for (unsigned int i = 0; i < 8; i++)
val64 |= ((unsigned HOST_WIDE_INT) bytes[i % nbytes]
@@ -13126,7 +13456,8 @@ Architecture 3 2 1 0 3 2 1 0
Low Mask: { 2, 3 } { 0, 1 }
High Mask: { 0, 1 } { 2, 3 }
-*/
+
+ MODE is the mode of the vector and NUNITS is the number of units in it. */
rtx
aarch64_simd_vect_par_cnst_half (machine_mode mode, int nunits, bool high)
@@ -13206,12 +13537,13 @@ aarch64_simd_lane_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high,
of mode MODE, and return the result as an SImode rtx. */
rtx
-endian_lane_rtx (machine_mode mode, unsigned int n)
+aarch64_endian_lane_rtx (machine_mode mode, unsigned int n)
{
return gen_int_mode (ENDIAN_LANE_N (GET_MODE_NUNITS (mode), n), SImode);
}
/* Return TRUE if OP is a valid vector addressing mode. */
+
bool
aarch64_simd_mem_operand_p (rtx op)
{
@@ -13220,6 +13552,7 @@ aarch64_simd_mem_operand_p (rtx op)
}
/* Return true if OP is a valid MEM operand for an SVE LD1R instruction. */
+
bool
aarch64_sve_ld1r_operand_p (rtx op)
{
@@ -13317,7 +13650,7 @@ aarch64_simd_emit_reg_reg_move (rtx *operands, machine_mode mode,
int
aarch64_simd_attr_length_rglist (machine_mode mode)
{
- /* This is only used (and only meaningful) for AdvSIMD, not SVE. */
+ /* This is only used (and only meaningful) for Advanced SIMD, not SVE. */
return (GET_MODE_SIZE (mode).to_constant () / UNITS_PER_VREG) * 4;
}
@@ -13360,8 +13693,8 @@ aarch64_simd_vector_alignment_reachable (const_tree type, bool is_packed)
return false;
/* We guarantee alignment for vectors up to 128-bits. */
- if (tree_int_cst_compare (TYPE_SIZE (type),
- bitsize_int (BIGGEST_ALIGNMENT)) > 0)
+ unsigned int align = aarch64_vectorize_preferred_vector_alignment (type);
+ if (align > BIGGEST_ALIGNMENT)
return false;
/* Vectors whose size is <= BIGGEST_ALIGNMENT are naturally aligned. */
@@ -14586,7 +14919,7 @@ aarch64_output_scalar_simd_mov_immediate (rtx immediate, scalar_int_mode mode)
}
/* Return the output string to use for moving immediate CONST_VECTOR
- into an SVE register of mode MODE. */
+ into an SVE register. */
char *
aarch64_output_sve_mov_immediate (rtx const_vector)
@@ -14615,7 +14948,7 @@ aarch64_output_sve_mov_immediate (rtx const_vector)
else
{
const int buf_size = 20;
- char float_buf[buf_size] = { 0 };
+ char float_buf[buf_size] = {};
real_to_decimal_for_mode (float_buf,
CONST_DOUBLE_REAL_VALUE (info.value),
buf_size, buf_size, 1, info.elt_mode);
@@ -14638,7 +14971,7 @@ char *
aarch64_output_ptrue (machine_mode mode, char suffix)
{
unsigned int nunits;
- static char buf[sizeof ("ptrue\t%0.N, vlNNN")];
+ static char buf[sizeof ("ptrue\t%0.N, vlNNNNN")];
if (GET_MODE_NUNITS (mode).is_constant (&nunits))
snprintf (buf, sizeof (buf), "ptrue\t%%0.%c, vl%d", suffix, nunits);
else
@@ -14757,6 +15090,9 @@ aarch64_expand_vec_perm_1 (rtx target, rtx op0, rtx op1, rtx sel)
}
}
+/* Expand a vec_perm with the operands given by TARGET, OP0, OP1 and SEL.
+ NELT is the number of elements in the vector. */
+
void
aarch64_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel,
unsigned int nelt)
@@ -14804,11 +15140,11 @@ aarch64_expand_sve_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
/* Enforced by the pattern condition. */
int nunits = GET_MODE_NUNITS (sel_mode).to_constant ();
- /* Note: All sel indexes are wrapped when they are beyond the size of the
- two value vectors. SVE TBL will produce 0 for any out of range
- sel indexes. Therefore we need to modulo all the sel indexes to
- ensure they are all in range. */
-
+ /* Note: vec_perm indices are supposed to wrap when they go beyond the
+ size of the two value vectors, i.e. the upper bits of the indices
+ are effectively ignored. SVE TBL instead produces 0 for any
+ out-of-range indices, so we need to modulo all the vec_perm indices
+ to ensure they are all in range. */
rtx sel_reg = force_reg (sel_mode, sel);
/* Check if the sel only references the first values vector. */
@@ -14834,12 +15170,16 @@ aarch64_expand_sve_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
rtx res0 = gen_reg_rtx (data_mode);
rtx res1 = gen_reg_rtx (data_mode);
rtx neg_num_elems = aarch64_simd_gen_const_vector_dup (sel_mode, -nunits);
- rtx max_sel = aarch64_simd_gen_const_vector_dup (sel_mode, (2 * nunits) - 1);
-
- rtx sel_mod = expand_simple_binop (sel_mode, AND, sel_reg, max_sel,
+ if (!const_vec_p (sel)
+ || !aarch64_const_vec_all_in_range_p (sel, 0, 2 * nunits - 1))
+ {
+ rtx max_sel = aarch64_simd_gen_const_vector_dup (sel_mode,
+ 2 * nunits - 1);
+ sel_reg = expand_simple_binop (sel_mode, AND, sel_reg, max_sel,
NULL, 0, OPTAB_DIRECT);
- emit_unspec2 (res0, UNSPEC_TBL, op0, sel_mod);
- rtx sel_sub = expand_simple_binop (sel_mode, PLUS, sel_mod, neg_num_elems,
+ }
+ emit_unspec2 (res0, UNSPEC_TBL, op0, sel_reg);
+ rtx sel_sub = expand_simple_binop (sel_mode, PLUS, sel_reg, neg_num_elems,
NULL, 0, OPTAB_DIRECT);
emit_unspec2 (res1, UNSPEC_TBL, op1, sel_sub);
if (GET_MODE_CLASS (data_mode) == MODE_VECTOR_INT)
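/* Editorial sketch, not part of this patch: a worked example of the
   wrap-around described above, assuming 4-element vectors (nunits == 4).
   A vec_perm index of 9 must act like 9 & (2 * nunits - 1) == 1, so the
   indices are masked with 7 first; the second TBL then sees 1 - 4 == -3,
   which is out of range and yields 0, so the combined result takes
   element 1 from the first operand, as required.  */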
@@ -14883,6 +15223,8 @@ aarch64_evpc_trn (struct expand_vec_perm_d *d)
in0 = d->op0;
in1 = d->op1;
+ /* We don't need a big-endian lane correction for SVE; see the comment
+ at the head of aarch64-sve.md for details. */
if (BYTES_BIG_ENDIAN && d->vec_flags == VEC_ADVSIMD)
{
x = in0, in0 = in1, in1 = x;
@@ -14929,6 +15271,8 @@ aarch64_evpc_uzp (struct expand_vec_perm_d *d)
in0 = d->op0;
in1 = d->op1;
+ /* We don't need a big-endian lane correction for SVE; see the comment
+ at the head of aarch64-sve.md for details. */
if (BYTES_BIG_ENDIAN && d->vec_flags == VEC_ADVSIMD)
{
x = in0, in0 = in1, in1 = x;
@@ -14980,6 +15324,8 @@ aarch64_evpc_zip (struct expand_vec_perm_d *d)
in0 = d->op0;
in1 = d->op1;
+ /* We don't need a big-endian lane correction for SVE; see the comment
+ at the head of aarch64-sve.md for details. */
if (BYTES_BIG_ENDIAN && d->vec_flags == VEC_ADVSIMD)
{
x = in0, in0 = in1, in1 = x;
@@ -15020,8 +15366,10 @@ aarch64_evpc_ext (struct expand_vec_perm_d *d)
return true;
/* The case where (location == 0) is a no-op for both big- and little-endian,
- and is removed by the mid-end at optimization levels -O1 and higher. */
+ and is removed by the mid-end at optimization levels -O1 and higher.
+ We don't need a big-endian lane correction for SVE; see the comment
+ at the head of aarch64-sve.md for details. */
if (BYTES_BIG_ENDIAN && location != 0 && d->vec_flags == VEC_ADVSIMD)
{
/* After setup, we want the high elements of the first vector (stored
@@ -15163,6 +15511,8 @@ aarch64_evpc_tbl (struct expand_vec_perm_d *d)
return true;
}
+/* Try to implement D using an SVE TBL instruction. */
+
static bool
aarch64_evpc_sve_tbl (struct expand_vec_perm_d *d)
{
@@ -15178,9 +15528,6 @@ aarch64_evpc_sve_tbl (struct expand_vec_perm_d *d)
for (unsigned int i = 0; i < nelt; ++i)
rperm[i] = GEN_INT (d->perm[i]);
rtx sel = gen_rtx_CONST_VECTOR (sel_mode, gen_rtvec_v (nelt, rperm));
- if (!aarch64_sve_vec_perm_operand (sel, sel_mode))
- sel = force_reg (sel_mode, sel);
-
aarch64_expand_sve_vec_perm (d->target, d->op0, d->op1, sel);
return true;
}
@@ -15224,7 +15571,8 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
return false;
}
-/* Expand a vec_perm_const pattern. */
+/* Expand a vec_perm_const pattern with the operands given by TARGET,
+ OP0, OP1 and SEL. NELT is the number of elements in the vector. */
bool
aarch64_expand_vec_perm_const (rtx target, rtx op0, rtx op1, rtx sel,
@@ -15324,6 +15672,9 @@ aarch64_vectorize_vec_perm_const_ok (machine_mode vmode, vec_perm_indices sel)
return ret;
}
+/* Generate a byte permute mask for a register of mode MODE,
+ which has NUNITS units. */
+
rtx
aarch64_reverse_mask (machine_mode mode, unsigned int nunits)
{
@@ -17024,6 +17375,28 @@ aarch64_sched_can_speculate_insn (rtx_insn *insn)
}
}
+/* It has been decided to allow up to 1kb of outgoing argument
+ space to be allocated w/o probing. If more than 1kb of outgoing
+ argument space is allocated, then it must be probed and the last
+ probe must occur no more than 1kbyte away from the end of the
+ allocated space.
+
+ This implies that the residual part of an alloca allocation may
+ need probing in cases where the generic code might not otherwise
+ think a probe is needed.
+
+ This target hook returns TRUE when allocating RESIDUAL bytes of
+ alloca space requires an additional probe, otherwise FALSE is
+ returned. */
+
+static bool
+aarch64_stack_clash_protection_final_dynamic_probe (rtx residual)
+{
+ return (residual == CONST0_RTX (Pmode)
+ || GET_CODE (residual) != CONST_INT
+ || INTVAL (residual) >= 1024);
+}
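/* Editorial sketch, not part of this patch: the hook above in table form,
   where RESIDUAL is the leftover of a dynamic (alloca) allocation:
     residual == 0 (known constant)       -> true  (probe)
     residual not a compile-time constant -> true  (probe)
     0 < residual < 1024                  -> false (no extra probe)
     residual >= 1024                     -> true  (probe)
   Only a known, non-zero residual under 1 KiB skips the extra probe.  */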
+
/* Implement TARGET_COMPUTE_PRESSURE_CLASSES. */
static int
@@ -17043,6 +17416,19 @@ aarch64_compute_pressure_classes (reg_class *classes)
return i;
}
+/* Implement TARGET_CAN_CHANGE_MODE_CLASS. */
+
+static bool
+aarch64_can_change_mode_class (machine_mode from,
+ machine_mode to, reg_class_t)
+{
+ /* See the comment at the head of aarch64-sve.md for details. */
+ if (BYTES_BIG_ENDIAN
+ && (aarch64_sve_data_mode_p (from) != aarch64_sve_data_mode_p (to)))
+ return false;
+ return true;
+}
+
/* Target-specific selftests. */
#if CHECKING_P
@@ -17292,9 +17678,6 @@ aarch64_libgcc_floating_mode_supported_p
#undef TARGET_SECONDARY_RELOAD
#define TARGET_SECONDARY_RELOAD aarch64_secondary_reload
-#undef TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P
-#define TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P \
- aarch64_cannot_substitute_mem_equiv_p
#undef TARGET_SHIFT_TRUNCATION_MASK
#define TARGET_SHIFT_TRUNCATION_MASK aarch64_shift_truncation_mask
@@ -17517,9 +17900,16 @@ aarch64_libgcc_floating_mode_supported_p
#undef TARGET_CONSTANT_ALIGNMENT
#define TARGET_CONSTANT_ALIGNMENT aarch64_constant_alignment
+#undef TARGET_STACK_CLASH_PROTECTION_FINAL_DYNAMIC_PROBE
+#define TARGET_STACK_CLASH_PROTECTION_FINAL_DYNAMIC_PROBE \
+ aarch64_stack_clash_protection_final_dynamic_probe
+
#undef TARGET_COMPUTE_PRESSURE_CLASSES
#define TARGET_COMPUTE_PRESSURE_CLASSES aarch64_compute_pressure_classes
+#undef TARGET_CAN_CHANGE_MODE_CLASS
+#define TARGET_CAN_CHANGE_MODE_CLASS aarch64_can_change_mode_class
+
#if CHECKING_P
#undef TARGET_RUN_TARGET_SELFTESTS
#define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 6b9ffae6823..5816bc6c1e4 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -111,6 +111,9 @@
#define STRUCTURE_SIZE_BOUNDARY 8
+/* Heap alignment (same as BIGGEST_ALIGNMENT and STACK_BOUNDARY). */
+#define MALLOC_ABI_ALIGNMENT 128
+
/* Defined by the ABI */
#define WCHAR_TYPE "unsigned int"
#define WCHAR_TYPE_SIZE 32
@@ -136,15 +139,16 @@ extern unsigned aarch64_architecture_version;
#define AARCH64_FL_CRC (1 << 3) /* Has CRC. */
/* ARMv8.1-A architecture extensions. */
#define AARCH64_FL_LSE (1 << 4) /* Has Large System Extensions. */
-#define AARCH64_FL_RDMA (1 << 5) /* Has Round Double Multiply Add. */
-#define AARCH64_FL_V8_1 (1 << 6) /* Has ARMv8.1-A extensions. */
+#define AARCH64_FL_RDMA (1 << 5) /* Has Round Double Multiply Add. */
+#define AARCH64_FL_V8_1 (1 << 6) /* Has ARMv8.1-A extensions. */
/* ARMv8.2-A architecture extensions. */
-#define AARCH64_FL_V8_2 (1 << 8) /* Has ARMv8.2-A features. */
+#define AARCH64_FL_V8_2 (1 << 8) /* Has ARMv8.2-A features. */
#define AARCH64_FL_F16 (1 << 9) /* Has ARMv8.2-A FP16 extensions. */
#define AARCH64_FL_SVE (1 << 10) /* Has Scalable Vector Extensions. */
/* ARMv8.3-A architecture extensions. */
-#define AARCH64_FL_V8_3 (1 << 11) /* Has ARMv8.3-A features. */
-#define AARCH64_FL_RCPC (1 << 12) /* Has support for RCpc model. */
+#define AARCH64_FL_V8_3 (1 << 11) /* Has ARMv8.3-A features. */
+#define AARCH64_FL_RCPC (1 << 12) /* Has support for RCpc model. */
+#define AARCH64_FL_DOTPROD (1 << 13) /* Has ARMv8.2-A Dot Product ins. */
/* Has FP and SIMD. */
#define AARCH64_FL_FPSIMD (AARCH64_FL_FP | AARCH64_FL_SIMD)
@@ -174,6 +178,7 @@ extern unsigned aarch64_architecture_version;
#define AARCH64_ISA_F16 (aarch64_isa_flags & AARCH64_FL_F16)
#define AARCH64_ISA_SVE (aarch64_isa_flags & AARCH64_FL_SVE)
#define AARCH64_ISA_V8_3 (aarch64_isa_flags & AARCH64_FL_V8_3)
+#define AARCH64_ISA_DOTPROD (aarch64_isa_flags & AARCH64_FL_DOTPROD)
/* Crypto is an optional extension to AdvSIMD. */
#define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
@@ -188,6 +193,9 @@ extern unsigned aarch64_architecture_version;
#define TARGET_FP_F16INST (TARGET_FLOAT && AARCH64_ISA_F16)
#define TARGET_SIMD_F16INST (TARGET_SIMD && AARCH64_ISA_F16)
+/* Dot Product is an optional extension to AdvSIMD enabled through +dotprod. */
+#define TARGET_DOTPROD (TARGET_SIMD && AARCH64_ISA_DOTPROD)
+
/* SVE instructions, enabled through +sve. */
#define TARGET_SVE (AARCH64_ISA_SVE)
@@ -273,7 +281,7 @@ extern unsigned aarch64_architecture_version;
0, 0, 0, 0, 0, 0, 0, 0, /* V8 - V15 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* V16 - V23 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* V24 - V31 */ \
- 1, 1, 1, /* SFP, AP, CC */ \
+ 1, 1, 1, 1, /* SFP, AP, CC, VG */ \
0, 0, 0, 0, 0, 0, 0, 0, /* P0 - P7 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* P8 - P15 */ \
1, /* FFRT */ \
@@ -289,7 +297,7 @@ extern unsigned aarch64_architecture_version;
0, 0, 0, 0, 0, 0, 0, 0, /* V8 - V15 */ \
1, 1, 1, 1, 1, 1, 1, 1, /* V16 - V23 */ \
1, 1, 1, 1, 1, 1, 1, 1, /* V24 - V31 */ \
- 1, 1, 1, /* SFP, AP, CC */ \
+ 1, 1, 1, 1, /* SFP, AP, CC, VG */ \
1, 1, 1, 1, 1, 1, 1, 1, /* P0 - P7 */ \
1, 1, 1, 1, 1, 1, 1, 1, /* P8 - P15 */ \
1, /* FFRT */ \
@@ -305,7 +313,7 @@ extern unsigned aarch64_architecture_version;
"v8", "v9", "v10", "v11", "v12", "v13", "v14", "v15", \
"v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23", \
"v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31", \
- "sfp", "ap", "cc", \
+ "sfp", "ap", "cc", "vg", \
"p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", \
"p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15", \
"ffrt", \
@@ -353,9 +361,9 @@ extern unsigned aarch64_architecture_version;
(epilogue_completed && (REGNO) == LR_REGNUM)
/* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
- the stack pointer does not matter. The value is tested only in
- functions that have frame pointers. */
-#define EXIT_IGNORE_STACK 1
+ the stack pointer does not matter. This is only true if the function
+ uses alloca. */
+#define EXIT_IGNORE_STACK (cfun->calls_alloca)
#define STATIC_CHAIN_REGNUM R18_REGNUM
#define HARD_FRAME_POINTER_REGNUM R29_REGNUM
@@ -503,10 +511,10 @@ enum reg_class
{ 0x00000000, 0x0000ffff, 0x00000000 }, /* FP_LO_REGS */ \
{ 0x00000000, 0xffffffff, 0x00000000 }, /* FP_REGS */ \
{ 0xffffffff, 0xffffffff, 0x00000003 }, /* POINTER_AND_FP_REGS */\
- { 0x00000000, 0x00000000, 0x000007f8 }, /* PR_LO_REGS */ \
- { 0x00000000, 0x00000000, 0x0007f800 }, /* PR_HI_REGS */ \
- { 0x00000000, 0x00000000, 0x0007fff8 }, /* PR_REGS */ \
- { 0xffffffff, 0xffffffff, 0x000fffff } /* ALL_REGS */ \
+ { 0x00000000, 0x00000000, 0x00000ff0 }, /* PR_LO_REGS */ \
+ { 0x00000000, 0x00000000, 0x000ff000 }, /* PR_HI_REGS */ \
+ { 0x00000000, 0x00000000, 0x000ffff0 }, /* PR_REGS */ \
+ { 0xffffffff, 0xffffffff, 0x001fffff } /* ALL_REGS */ \
}
#define REGNO_REG_CLASS(REGNO) aarch64_regno_regclass (REGNO)
@@ -622,6 +630,9 @@ struct GTY (()) aarch64_frame
/* The size of the stack adjustment after saving callee-saves. */
poly_int64 final_adjust;
+ /* Store FP,LR and setup a frame pointer. */
+ bool emit_frame_chain;
+
unsigned wb_candidate1;
unsigned wb_candidate2;
@@ -989,12 +1000,18 @@ extern tree aarch64_fp16_ptr_type_node;
#ifndef USED_FOR_TARGET
extern poly_uint16 aarch64_sve_vg;
+
+/* The number of bits and bytes in an SVE vector. */
#define BITS_PER_SVE_VECTOR (poly_uint16 (aarch64_sve_vg * 64))
#define BYTES_PER_SVE_VECTOR (poly_uint16 (aarch64_sve_vg * 8))
+
+/* The number of bytes in an SVE predicate. */
#define BYTES_PER_SVE_PRED aarch64_sve_vg
+
+/* The SVE mode for a vector of bytes. */
#define SVE_BYTE_MODE V32QImode
-/* Maximum number of bytes in a fixed-size vector. This is 256 bytes
+/* The maximum number of bytes in a fixed-size vector. This is 256 bytes
(for -msve-vector-bits=2048) multiplied by the maximum number of
vectors in a structure mode (4).
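
The new comments above spell out the VG-based size relationships: an SVE vector is VG 64-bit granules, and a predicate carries one bit per vector byte, hence VG bytes. A minimal standalone sketch of that arithmetic, assuming a hypothetical 256-bit implementation (vg = 4); this is an illustration only, not compiler code:

    /* Illustration only: mirrors BITS_PER_SVE_VECTOR, BYTES_PER_SVE_VECTOR
       and BYTES_PER_SVE_PRED for an assumed vg of 4 (a 256-bit SVE part).  */
    #include <stdio.h>

    int
    main (void)
    {
      unsigned int vg = 4;                 /* assumed number of 64-bit granules */
      unsigned int vector_bits = vg * 64;  /* BITS_PER_SVE_VECTOR  -> 256 */
      unsigned int vector_bytes = vg * 8;  /* BYTES_PER_SVE_VECTOR -> 32  */
      unsigned int pred_bytes = vg;        /* BYTES_PER_SVE_PRED   -> 4   */
      printf ("%u-bit vectors: %u data bytes, %u predicate bytes\n",
              vector_bits, vector_bytes, pred_bytes);
      return 0;
    }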
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5be96aaa92d..6a15ff0b61d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -63,18 +63,15 @@
(SFP_REGNUM 64)
(AP_REGNUM 65)
(CC_REGNUM 66)
- (P0_REGNUM 67)
- (P7_REGNUM 74)
- (P15_REGNUM 82)
- (FFRT_REGNUM 83)
+ ;; Defined only to make the DWARF description simpler.
+ (VG_REGNUM 67)
+ (P0_REGNUM 68)
+ (P7_REGNUM 75)
+ (P15_REGNUM 83)
+ (FFRT_REGNUM 84)
]
)
-(define_c_enum "param" [
- ; Parameter used for poly_ints. Must be first.
- CPARAM_VQ
-])
-
(define_c_enum "unspec" [
UNSPEC_AUTI1716
UNSPEC_AUTISP
@@ -170,7 +167,6 @@
UNSPEC_PACK
UNSPEC_FLOAT_CONVERT
UNSPEC_WHILE_LO
- UNSPEC_PRED_MOVE
UNSPEC_CLASTB
UNSPEC_LDFF1
UNSPEC_READ_NF
@@ -901,8 +897,7 @@
if (GET_CODE (operands[0]) == MEM && operands[1] != const0_rtx)
operands[1] = force_reg (<MODE>mode, operands[1]);
- if (CONSTANT_P (operands[1])
- && !CONST_INT_P (operands[1]))
+ if (GET_CODE (operands[1]) == CONST_POLY_INT)
{
aarch64_expand_mov_immediate (operands[0], operands[1]);
DONE;
@@ -1616,8 +1611,8 @@
else if (operands[0] == stack_pointer_rtx
&& aarch64_split_add_offset_immediate (operands[2], <MODE>mode))
{
- aarch64_split_add_offset (<MODE>mode, operands[0], NULL_RTX, NULL_RTX,
- operands[1], operands[2]);
+ aarch64_split_add_offset (<MODE>mode, operands[0], operands[1],
+ operands[2], NULL_RTX, NULL_RTX);
DONE;
}
})
@@ -1734,8 +1729,8 @@
&& aarch64_split_add_offset_immediate (operands[2], <MODE>mode)"
[(const_int 0)]
{
- aarch64_split_add_offset (<MODE>mode, operands[0], operands[0], NULL_RTX,
- operands[1], operands[2]);
+ aarch64_split_add_offset (<MODE>mode, operands[0], operands[1],
+ operands[2], operands[0], NULL_RTX);
DONE;
}
;; The "alu_imm" type for ADDVL/ADDPL is just a placeholder.
@@ -4123,7 +4118,7 @@
(define_expand "<optab><mode>3"
[(set (match_operand:GPI 0 "register_operand")
(ASHIFT:GPI (match_operand:GPI 1 "register_operand")
- (match_operand:QI 2 "nonmemory_operand")))]
+ (match_operand:QI 2 "aarch64_reg_or_imm")))]
""
{
if (CONST_INT_P (operands[2]))
@@ -4159,7 +4154,7 @@
(define_expand "rotr<mode>3"
[(set (match_operand:GPI 0 "register_operand")
(rotatert:GPI (match_operand:GPI 1 "register_operand")
- (match_operand:QI 2 "nonmemory_operand")))]
+ (match_operand:QI 2 "aarch64_reg_or_imm")))]
""
{
if (CONST_INT_P (operands[2]))
@@ -4179,7 +4174,7 @@
(define_expand "rotl<mode>3"
[(set (match_operand:GPI 0 "register_operand")
(rotatert:GPI (match_operand:GPI 1 "register_operand")
- (match_operand:QI 2 "nonmemory_operand")))]
+ (match_operand:QI 2 "aarch64_reg_or_imm")))]
""
{
/* (SZ - cnt) % SZ == -cnt % SZ */
@@ -5014,11 +5009,37 @@
[(set_attr "type" "f_cvt")]
)
-(define_insn "<optab>_trunc<GPF_F16:mode><GPI:mode>2"
+;; Convert SF -> SI or DF -> DI while preferring w = w register constraints
+;; and making r = w more expensive
+
+(define_insn "<optab>_trunc<fcvt_target><GPI:mode>2"
+ [(set (match_operand:GPI 0 "register_operand" "=?r,w")
+ (FIXUORS:GPI (match_operand:<FCVT_TARGET> 1 "register_operand" "w,w")))]
+ "TARGET_FLOAT"
+ "@
+ fcvtz<su>\t%<w>0, %<s>1
+ fcvtz<su>\t%<s>0, %<s>1"
+ [(set_attr "type" "f_cvtf2i,neon_fp_to_int_s")]
+)
+
+;; Convert HF -> SI or DI
+
+(define_insn "<optab>_trunchf<GPI:mode>2"
+ [(set (match_operand:GPI 0 "register_operand" "=r")
+ (FIXUORS:GPI (match_operand:HF 1 "register_operand" "w")))]
+ "TARGET_FP_F16INST"
+ "fcvtz<su>\t%<w>0, %h1"
+ [(set_attr "type" "f_cvtf2i")]
+)
+
+;; Convert DF -> SI or SF -> DI, which can only be accomplished with the
+;; input in an FP register and the output in an integer register.
+
+(define_insn "<optab>_trunc<fcvt_change_mode><GPI:mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
- (FIXUORS:GPI (match_operand:GPF_F16 1 "register_operand" "w")))]
+ (FIXUORS:GPI (match_operand:<FCVT_CHANGE_MODE> 1 "register_operand" "w")))]
"TARGET_FLOAT"
- "fcvtz<su>\t%<GPI:w>0, %<GPF_F16:s>1"
+ "fcvtz<su>\t%<w>0, %<fpw>1"
[(set_attr "type" "f_cvtf2i")]
)
@@ -5320,7 +5341,9 @@
(define_expand "lrint<GPF:mode><GPI:mode>2"
[(match_operand:GPI 0 "register_operand")
(match_operand:GPF 1 "register_operand")]
- "TARGET_FLOAT"
+ "TARGET_FLOAT
+ && ((GET_MODE_SIZE (<GPF:MODE>mode) <= GET_MODE_SIZE (<GPI:MODE>mode))
+ || !flag_trapping_math || flag_fp_int_builtin_inexact)"
{
rtx cvt = gen_reg_rtx (<GPF:MODE>mode);
emit_insn (gen_rint<GPF:mode>2 (cvt, operands[1]));
@@ -5783,7 +5806,7 @@
)
(define_insn "probe_stack_range"
- [(set (match_operand:DI 0 "register_operand" "=r")
+ [(set (match_operand:DI 0 "register_operand" "=rk")
(unspec_volatile:DI [(match_operand:DI 1 "register_operand" "0")
(match_operand:DI 2 "register_operand" "r")]
UNSPECV_PROBE_STACK_RANGE))]
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index d7b30b0e5ee..96e740f91a7 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -31541,6 +31541,99 @@ vminnmvq_f16 (float16x8_t __a)
#pragma GCC pop_options
+/* AdvSIMD Dot Product intrinsics. */
+
+#pragma GCC push_options
+#pragma GCC target ("arch=armv8.2-a+dotprod")
+
+__extension__ extern __inline uint32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_u32 (uint32x2_t __r, uint8x8_t __a, uint8x8_t __b)
+{
+ return __builtin_aarch64_udotv8qi_uuuu (__r, __a, __b);
+}
+
+__extension__ extern __inline uint32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b)
+{
+ return __builtin_aarch64_udotv16qi_uuuu (__r, __a, __b);
+}
+
+__extension__ extern __inline int32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_s32 (int32x2_t __r, int8x8_t __a, int8x8_t __b)
+{
+ return __builtin_aarch64_sdotv8qi (__r, __a, __b);
+}
+
+__extension__ extern __inline int32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
+{
+ return __builtin_aarch64_sdotv16qi (__r, __a, __b);
+}
+
+__extension__ extern __inline uint32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_lane_u32 (uint32x2_t __r, uint8x8_t __a, uint8x8_t __b, const int __index)
+{
+ return __builtin_aarch64_udot_lanev8qi_uuuus (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline uint32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_laneq_u32 (uint32x2_t __r, uint8x8_t __a, uint8x16_t __b,
+ const int __index)
+{
+ return __builtin_aarch64_udot_laneqv8qi_uuuus (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline uint32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_lane_u32 (uint32x4_t __r, uint8x16_t __a, uint8x8_t __b,
+ const int __index)
+{
+ return __builtin_aarch64_udot_lanev16qi_uuuus (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline uint32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_laneq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b,
+ const int __index)
+{
+ return __builtin_aarch64_udot_laneqv16qi_uuuus (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline int32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_lane_s32 (int32x2_t __r, int8x8_t __a, int8x8_t __b, const int __index)
+{
+ return __builtin_aarch64_sdot_lanev8qi (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline int32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_laneq_s32 (int32x2_t __r, int8x8_t __a, int8x16_t __b, const int __index)
+{
+ return __builtin_aarch64_sdot_laneqv8qi (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline int32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_lane_s32 (int32x4_t __r, int8x16_t __a, int8x8_t __b, const int __index)
+{
+ return __builtin_aarch64_sdot_lanev16qi (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline int32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_laneq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b, const int __index)
+{
+ return __builtin_aarch64_sdot_laneqv16qi (__r, __a, __b, __index);
+}
+#pragma GCC pop_options
+
#undef __aarch64_vget_lane_any
#undef __aarch64_vdup_lane_any
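
The hunk above only declares the new Dot Product intrinsics, so a short usage sketch (not part of the patch) may help; it assumes a compiler built with this support and -march=armv8.2-a+dotprod, and uses only the vdot_u32 and vdotq_laneq_s32 signatures introduced above:

    /* Usage sketch, not part of the patch.  Requires -march=armv8.2-a+dotprod.  */
    #include <arm_neon.h>

    uint32x2_t
    accumulate_u8_dots (uint32x2_t acc, uint8x8_t a, uint8x8_t b)
    {
      /* Each 32-bit lane of ACC gains the dot product of the corresponding
         four byte lanes of A and B (one UDOT instruction).  */
      return vdot_u32 (acc, a, b);
    }

    int32x4_t
    accumulate_s8_dots_lane0 (int32x4_t acc, int8x16_t a, int8x16_t b)
    {
      /* Lane form: every four-byte group of A is multiplied with the four
         bytes selected from B by the lane index (0 here).  */
      return vdotq_laneq_s32 (acc, a, b, 0);
    }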
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index 0f7e3099a4c..2a8722c4c86 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -160,14 +160,14 @@
;; "ad" for "(A)DDVL/ADDPL (D)irect".
(define_constraint "Uad"
"@internal
- A constraint that matches a VG-based constant that can be added by
- a single ADDVL or ADDPL."
+ A constraint that matches a VG-based constant that can be added by
+ a single ADDVL or ADDPL."
(match_operand 0 "aarch64_sve_addvl_addpl_immediate"))
(define_constraint "Ua1"
"@internal
- A constraint that matches a VG-based constant that can be added by
- using multiple instructions, with one temporary register."
+ A constraint that matches a VG-based constant that can be added by
+ using multiple instructions, with one temporary register."
(match_operand 0 "aarch64_split_add_offset_immediate"))
(define_memory_constraint "Q"
@@ -297,7 +297,7 @@
(define_constraint "Dm"
"@internal
- A constraint that matches a vector of immediate minus one."
+ A constraint that matches a vector of immediate minus one."
(and (match_code "const,const_vector")
(match_test "op == CONST1_RTX (GET_MODE (op))")))
@@ -322,65 +322,65 @@
(define_constraint "Dv"
"@internal
- A constraint that matches a VG-based constant that can be loaded by
- a single CNT[BHWD]."
+ A constraint that matches a VG-based constant that can be loaded by
+ a single CNT[BHWD]."
(match_operand 0 "aarch64_sve_cnt_immediate"))
(define_constraint "vsa"
"@internal
- A constraint that matches a signed immediate operand valid for SVE
- arithmetic instructions."
+ A constraint that matches an immediate operand valid for SVE
+ arithmetic instructions."
(match_operand 0 "aarch64_sve_arith_immediate"))
(define_constraint "vsc"
"@internal
- A constraint that matches a signed immediate operand valid for SVE
- CMP instructions."
+ A constraint that matches a signed immediate operand valid for SVE
+ CMP instructions."
(match_operand 0 "aarch64_sve_cmp_vsc_immediate"))
(define_constraint "vsd"
"@internal
- A constraint that matches an unsigned immediate operand valid for SVE
- CMP instructions."
+ A constraint that matches an unsigned immediate operand valid for SVE
+ CMP instructions."
(match_operand 0 "aarch64_sve_cmp_vsd_immediate"))
(define_constraint "vsi"
"@internal
- A constraint that matches a vector count operand valid for SVE INC and
- DEC instructions."
+ A constraint that matches a vector count operand valid for SVE INC and
+ DEC instructions."
(match_operand 0 "aarch64_sve_inc_dec_immediate"))
(define_constraint "vsn"
"@internal
- A constraint that matches a signed immediate operand whose negative
- is valid for SVE SUB instructions."
+ A constraint that matches an immediate operand whose negative
+ is valid for SVE SUB instructions."
(match_operand 0 "aarch64_sve_sub_arith_immediate"))
(define_constraint "vsl"
"@internal
- A constraint that matches an immediate operand valid for SVE logical
- operations."
+ A constraint that matches an immediate operand valid for SVE logical
+ operations."
(match_operand 0 "aarch64_sve_logical_immediate"))
(define_constraint "vsm"
"@internal
- A constraint that matches an immediate operand valid for SVE MUL
- operations."
+ A constraint that matches an immediate operand valid for SVE MUL
+ operations."
(match_operand 0 "aarch64_sve_mul_immediate"))
(define_constraint "vfa"
"@internal
- A constraint that matches an immediate operand valid for SVE FADD
- and FSUB operations."
+ A constraint that matches an immediate operand valid for SVE FADD
+ and FSUB operations."
(match_operand 0 "aarch64_sve_float_arith_immediate"))
(define_constraint "vfm"
"@internal
- A constraint that matches an imediate operand valid for SVE FMUL
- operations."
+ A constraint that matches an immediate operand valid for SVE FMUL
+ operations."
(match_operand 0 "aarch64_sve_float_mul_immediate"))
(define_constraint "vfn"
"@internal
- A constraint that matches the negative of vfa"
+ A constraint that matches the negative of vfa"
(match_operand 0 "aarch64_sve_float_arith_with_sub_immediate"))
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 4cb4babe024..2a4e26fb940 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -56,20 +56,20 @@
;; Iterator for all scalar floating point modes (SF, DF and TF)
(define_mode_iterator GPF_TF [SF DF TF])
-;; Integer AdvSIMD modes.
+;; Integer Advanced SIMD modes.
(define_mode_iterator VDQ_I [V8QI V16QI V4HI V8HI V2SI V4SI V2DI])
-;; AdvSIMD and scalar, 64 & 128-bit container, all integer modes
+;; Advanced SIMD and scalar, 64 & 128-bit container, all integer modes.
(define_mode_iterator VSDQ_I [V8QI V16QI V4HI V8HI V2SI V4SI V2DI QI HI SI DI])
-;; AdvSIMD and scalar, 64 & 128-bit container: all AdvSIMD integer modes;
-;; 64-bit scalar integer mode
+;; Advanced SIMD and scalar, 64 & 128-bit container: all Advanced SIMD
+;; integer modes; 64-bit scalar integer mode.
(define_mode_iterator VSDQ_I_DI [V8QI V16QI V4HI V8HI V2SI V4SI V2DI DI])
;; Double vector modes.
(define_mode_iterator VD [V8QI V4HI V4HF V2SI V2SF])
-;; AdvSIMD, 64-bit container, all integer modes
+;; Advanced SIMD, 64-bit container, all integer modes.
(define_mode_iterator VD_BHSI [V8QI V4HI V2SI])
;; 128 and 64-bit container; 8, 16, 32-bit vector integer modes
@@ -94,16 +94,16 @@
;; pointer-sized quantities. Exactly one of the two alternatives will match.
(define_mode_iterator PTR [(SI "ptr_mode == SImode") (DI "ptr_mode == DImode")])
-;; AdvSIMD Float modes suitable for moving, loading and storing.
+;; Advanced SIMD Float modes suitable for moving, loading and storing.
(define_mode_iterator VDQF_F16 [V4HF V8HF V2SF V4SF V2DF])
-;; AdvSIMD Float modes.
+;; Advanced SIMD Float modes.
(define_mode_iterator VDQF [V2SF V4SF V2DF])
(define_mode_iterator VHSDF [(V4HF "TARGET_SIMD_F16INST")
(V8HF "TARGET_SIMD_F16INST")
V2SF V4SF V2DF])
-;; AdvSIMD Float modes, and DF.
+;; Advanced SIMD Float modes, and DF.
(define_mode_iterator VHSDF_DF [(V4HF "TARGET_SIMD_F16INST")
(V8HF "TARGET_SIMD_F16INST")
V2SF V4SF V2DF DF])
@@ -113,7 +113,7 @@
(HF "TARGET_SIMD_F16INST")
SF DF])
-;; AdvSIMD single Float modes.
+;; Advanced SIMD single Float modes.
(define_mode_iterator VDQSF [V2SF V4SF])
;; Quad vector Float modes with half/single elements.
@@ -122,16 +122,16 @@
;; Modes suitable to use as the return type of a vcond expression.
(define_mode_iterator VDQF_COND [V2SF V2SI V4SF V4SI V2DF V2DI])
-;; All scalar and AdvSIMD Float modes.
+;; All scalar and Advanced SIMD Float modes.
(define_mode_iterator VALLF [V2SF V4SF V2DF SF DF])
-;; AdvSIMD Float modes with 2 elements.
+;; Advanced SIMD Float modes with 2 elements.
(define_mode_iterator V2F [V2SF V2DF])
-;; All AdvSIMD modes on which we support any arithmetic operations.
+;; All Advanced SIMD modes on which we support any arithmetic operations.
(define_mode_iterator VALL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V2SF V4SF V2DF])
-;; All AdvSIMD modes suitable for moving, loading, and storing.
+;; All Advanced SIMD modes suitable for moving, loading, and storing.
(define_mode_iterator VALL_F16 [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
V4HF V8HF V2SF V4SF V2DF])
@@ -139,21 +139,21 @@
(define_mode_iterator VALL_F16_NO_V2Q [V8QI V16QI V4HI V8HI V2SI V4SI
V4HF V8HF V2SF V4SF])
-;; All AdvSIMD modes barring HF modes, plus DI.
+;; All Advanced SIMD modes barring HF modes, plus DI.
(define_mode_iterator VALLDI [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V2SF V4SF V2DF DI])
-;; All AdvSIMD modes and DI.
+;; All Advanced SIMD modes and DI.
(define_mode_iterator VALLDI_F16 [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
V4HF V8HF V2SF V4SF V2DF DI])
-;; All AdvSIMD modes, plus DI and DF.
+;; All Advanced SIMD modes, plus DI and DF.
(define_mode_iterator VALLDIF [V8QI V16QI V4HI V8HI V2SI V4SI
V2DI V4HF V8HF V2SF V4SF V2DF DI DF])
-;; AdvSIMD modes for Integer reduction across lanes.
+;; Advanced SIMD modes for Integer reduction across lanes.
(define_mode_iterator VDQV [V8QI V16QI V4HI V8HI V4SI V2DI])
-;; AdvSIMD modes (except V2DI) for Integer reduction across lanes.
+;; Advanced SIMD modes (except V2DI) for Integer reduction across lanes.
(define_mode_iterator VDQV_S [V8QI V16QI V4HI V8HI V4SI])
;; All double integer narrow-able modes.
@@ -162,8 +162,8 @@
;; All quad integer narrow-able modes.
(define_mode_iterator VQN [V8HI V4SI V2DI])
-;; AdvSIMD and scalar 128-bit container: narrowable 16, 32, 64-bit integer
-;; modes
+;; Advanced SIMD and scalar 128-bit container: narrowable 16, 32, 64-bit
+;; integer modes
(define_mode_iterator VSQN_HSDI [V8HI V4SI V2DI HI SI DI])
;; All quad integer widen-able modes.
@@ -172,54 +172,54 @@
;; Double vector modes for combines.
(define_mode_iterator VDC [V8QI V4HI V4HF V2SI V2SF DI DF])
-;; AdvSIMD modes except double int.
+;; Advanced SIMD modes except double int.
(define_mode_iterator VDQIF [V8QI V16QI V4HI V8HI V2SI V4SI V2SF V4SF V2DF])
(define_mode_iterator VDQIF_F16 [V8QI V16QI V4HI V8HI V2SI V4SI
V4HF V8HF V2SF V4SF V2DF])
-;; AdvSIMD modes for S type.
+;; Advanced SIMD modes for S type.
(define_mode_iterator VDQ_SI [V2SI V4SI])
-;; AdvSIMD modes for S and D
+;; Advanced SIMD modes for S and D.
(define_mode_iterator VDQ_SDI [V2SI V4SI V2DI])
-;; AdvSIMD modes for H, S and D
+;; Advanced SIMD modes for H, S and D.
(define_mode_iterator VDQ_HSDI [(V4HI "TARGET_SIMD_F16INST")
(V8HI "TARGET_SIMD_F16INST")
V2SI V4SI V2DI])
-;; Scalar and AdvSIMD modes for S and D
+;; Scalar and Advanced SIMD modes for S and D.
(define_mode_iterator VSDQ_SDI [V2SI V4SI V2DI SI DI])
-;; Scalar and AdvSIMD modes for S and D, AdvSIMD modes for H.
+;; Scalar and Advanced SIMD modes for S and D, Advanced SIMD modes for H.
(define_mode_iterator VSDQ_HSDI [(V4HI "TARGET_SIMD_F16INST")
(V8HI "TARGET_SIMD_F16INST")
V2SI V4SI V2DI
(HI "TARGET_SIMD_F16INST")
SI DI])
-;; AdvSIMD modes for Q and H types.
+;; Advanced SIMD modes for Q and H types.
(define_mode_iterator VDQQH [V8QI V16QI V4HI V8HI])
-;; AdvSIMD modes for H and S types.
+;; Advanced SIMD modes for H and S types.
(define_mode_iterator VDQHS [V4HI V8HI V2SI V4SI])
-;; AdvSIMD modes for H, S and D types.
+;; Advanced SIMD modes for H, S and D types.
(define_mode_iterator VDQHSD [V4HI V8HI V2SI V4SI V2DI])
-;; AdvSIMD and scalar integer modes for H and S
+;; Advanced SIMD and scalar integer modes for H and S.
(define_mode_iterator VSDQ_HSI [V4HI V8HI V2SI V4SI HI SI])
-;; AdvSIMD and scalar 64-bit container: 16, 32-bit integer modes
+;; Advanced SIMD and scalar 64-bit container: 16, 32-bit integer modes.
(define_mode_iterator VSD_HSI [V4HI V2SI HI SI])
-;; AdvSIMD 64-bit container: 16, 32-bit integer modes
+;; Advanced SIMD 64-bit container: 16, 32-bit integer modes.
(define_mode_iterator VD_HSI [V4HI V2SI])
;; Scalar 64-bit container: 16, 32-bit integer modes
(define_mode_iterator SD_HSI [HI SI])
-;; AdvSIMD 64-bit container: 16, 32-bit integer modes
+;; Advanced SIMD 64-bit container: 16, 32-bit integer modes.
(define_mode_iterator VQ_HSI [V8HI V4SI])
;; All byte modes.
@@ -230,38 +230,45 @@
(define_mode_iterator TX [TI TF])
-;; AdvSIMD opaque structure modes.
+;; Advanced SIMD opaque structure modes.
(define_mode_iterator VSTRUCT [OI CI XI])
;; Double scalar modes
(define_mode_iterator DX [DI DF])
-;; Modes available for AdvSIMD <f>mul lane operations.
+;; Modes available for Advanced SIMD <f>mul lane operations.
(define_mode_iterator VMUL [V4HI V8HI V2SI V4SI
(V4HF "TARGET_SIMD_F16INST")
(V8HF "TARGET_SIMD_F16INST")
V2SF V4SF V2DF])
-;; Modes available for AdvSIMD <f>mul lane operations changing lane count.
+;; Modes available for Advanced SIMD <f>mul lane operations changing lane
+;; count.
(define_mode_iterator VMUL_CHANGE_NLANES [V4HI V8HI V2SI V4SI V2SF V4SF])
;; All SVE vector modes.
-(define_mode_iterator SVE_ALL [V32QI V16HI V8SI V4DI V8SF V4DF])
+(define_mode_iterator SVE_ALL [V32QI V16HI V8SI V4DI V16HF V8SF V4DF])
;; All SVE vector structure modes.
-(define_mode_iterator SVE_STRUCT [V64QI V32HI V16SI V8DI V16SF V8DF
- V96QI V48HI V24SI V12DI V24SF V12DF
- V128QI V64HI V32SI V16DI V32SF V16DF])
+(define_mode_iterator SVE_STRUCT [V64QI V32HI V16SI V8DI V32HF V16SF V8DF
+ V96QI V48HI V24SI V12DI V48HF V24SF V12DF
+ V128QI V64HI V32SI V16DI V64HF V32SF V16DF])
;; All SVE vector modes that have 8-bit or 16-bit elements.
-(define_mode_iterator SVE_BH [V32QI V16HI])
+(define_mode_iterator SVE_BH [V32QI V16HI V16HF])
;; All SVE vector modes that have 8-bit, 16-bit or 32-bit elements.
-(define_mode_iterator SVE_BHS [V32QI V16HI V8SI V8SF])
+(define_mode_iterator SVE_BHS [V32QI V16HI V8SI V16HF V8SF])
;; All SVE integer vector modes that have 8-bit, 16-bit or 32-bit elements.
(define_mode_iterator SVE_BHSI [V32QI V16HI V8SI])
+;; All SVE integer vector modes that have 16-bit, 32-bit or 64-bit elements.
+(define_mode_iterator SVE_HSDI [V16HI V8SI V4DI])
+
+;; All SVE floating-point vector modes that have 16-bit or 32-bit elements.
+(define_mode_iterator SVE_HSF [V16HF V8SF])
+
;; All SVE vector modes that have 32-bit or 64-bit elements.
(define_mode_iterator SVE_SD [V8SI V4DI V8SF V4DF])
@@ -272,7 +279,7 @@
(define_mode_iterator SVE_I [V32QI V16HI V8SI V4DI])
;; All SVE floating-point vector modes.
-(define_mode_iterator SVE_F [V8SF V4DF])
+(define_mode_iterator SVE_F [V16HF V8SF V4DF])
;; All SVE predicate modes.
(define_mode_iterator PRED_ALL [V32BI V16BI V8BI V4BI])
@@ -358,16 +365,21 @@
UNSPEC_TBL ; Used in vector permute patterns.
UNSPEC_TBX ; Used in vector permute patterns.
UNSPEC_CONCAT ; Used in vector permute patterns.
+
+ ;; The following permute unspecs are generated directly by
+ ;; aarch64_expand_vec_perm_const, so any changes to the underlying
+ ;; instructions would need a corresponding change there.
UNSPEC_ZIP1 ; Used in vector permute patterns.
UNSPEC_ZIP2 ; Used in vector permute patterns.
UNSPEC_UZP1 ; Used in vector permute patterns.
UNSPEC_UZP2 ; Used in vector permute patterns.
UNSPEC_TRN1 ; Used in vector permute patterns.
UNSPEC_TRN2 ; Used in vector permute patterns.
- UNSPEC_EXT ; Used in aarch64-simd.md.
+ UNSPEC_EXT ; Used in vector permute patterns.
UNSPEC_REV64 ; Used in vector reverse patterns (permute).
UNSPEC_REV32 ; Used in vector reverse patterns (permute).
UNSPEC_REV16 ; Used in vector reverse patterns (permute).
+
UNSPEC_AESE ; Used in aarch64-simd.md.
UNSPEC_AESD ; Used in aarch64-simd.md.
UNSPEC_AESMC ; Used in aarch64-simd.md.
@@ -390,6 +402,8 @@
UNSPEC_SQRDMLSH ; Used in aarch64-simd.md.
UNSPEC_FMAXNM ; Used in aarch64-simd.md.
UNSPEC_FMINNM ; Used in aarch64-simd.md.
+ UNSPEC_SDOT ; Used in aarch64-simd.md.
+ UNSPEC_UDOT ; Used in aarch64-simd.md.
UNSPEC_SEL ; Used in aarch64-sve.md.
UNSPEC_ANDV ; Used in aarch64-sve.md.
UNSPEC_IORV ; Used in aarch64-sve.md.
@@ -467,6 +481,9 @@
(define_mode_attr w1 [(HF "w") (SF "w") (DF "x")])
(define_mode_attr w2 [(HF "x") (SF "x") (DF "w")])
+;; For width of fp registers in fcvt instruction
+(define_mode_attr fpw [(DI "s") (SI "d")])
+
(define_mode_attr short_mask [(HI "65535") (QI "255")])
;; For constraints used in scalar immediate vector moves
@@ -475,6 +492,10 @@
;; For doubling width of an integer mode
(define_mode_attr DWI [(QI "HI") (HI "SI") (SI "DI") (DI "TI")])
+(define_mode_attr fcvt_change_mode [(SI "df") (DI "sf")])
+
+(define_mode_attr FCVT_CHANGE_MODE [(SI "DF") (DI "SF")])
+
;; For scalar usage of vector/FP registers
(define_mode_attr v [(QI "b") (HI "h") (SI "s") (DI "d")
(HF "h") (SF "s") (DF "d")
@@ -507,7 +528,8 @@
(define_mode_attr rtn [(DI "d") (SI "")])
(define_mode_attr vas [(DI "") (SI ".2s")])
-;; Map a vector to the number of units.
+;; Map a vector to the number of units in it, if the size of the mode
+;; is constant.
(define_mode_attr nunits [(V8QI "8") (V16QI "16")
(V4HI "4") (V8HI "8")
(V2SI "2") (V4SI "4")
@@ -517,8 +539,15 @@
(V1DF "1") (V2DF "2")
(DI "1") (DF "1")])
-;; Map a floating point mode to the appropriate register name prefix
-(define_mode_attr s [(HF "h") (SF "s") (DF "d")])
+;; Map a mode to the number of bits in it, if the size of the mode
+;; is constant.
+(define_mode_attr bitsize [(V8QI "64") (V16QI "128")
+ (V4HI "64") (V8HI "128")
+ (V2SI "64") (V4SI "128")
+ (V2DI "128")])
+
+;; Map a floating point or integer mode to the appropriate register name prefix
+(define_mode_attr s [(HF "h") (SF "s") (DF "d") (SI "s") (DI "d")])
;; Give the length suffix letter for a sign- or zero-extension.
(define_mode_attr size [(QI "b") (HI "h") (SI "w")])
@@ -571,7 +600,7 @@
(V4HI "h") (V8HI "h") (V16HI "h") (V16BI "h")
(V2SI "s") (V4SI "s") (V8SI "s") (V8BI "s")
(V2DI "d") (V4DI "d") (V4BI "d")
- (V4HF "h") (V8HF "h")
+ (V4HF "h") (V8HF "h") (V16HF "h")
(V2SF "s") (V4SF "s") (V8SF "s")
(V2DF "d") (V4DF "d")
(HF "h")
@@ -580,7 +609,8 @@
(SI "s") (DI "d")])
;; Equivalent of "size" for a vector element.
-(define_mode_attr Vesize [(V32QI "b") (V16HI "h")
+(define_mode_attr Vesize [(V32QI "b")
+ (V16HI "h") (V16HF "h")
(V8SI "w") (V8SF "w")
(V4DI "d") (V4DF "d")])
@@ -609,23 +639,41 @@
(V4HI "HI") (V8HI "HI") (V16HI "HI")
(V2SI "SI") (V4SI "SI") (V8SI "SI")
(DI "DI") (V2DI "DI") (V4DI "DI")
- (V4HF "HF") (V8HF "HF")
+ (V4HF "HF") (V8HF "HF") (V16HF "HF")
(V2SF "SF") (V4SF "SF") (V8SF "SF")
(DF "DF") (V2DF "DF") (V4DF "DF")
(SI "SI") (HI "HI")
(QI "QI")])
;; Define element mode for each vector mode (lower case).
-(define_mode_attr Vel [(V8QI "qi") (V16QI "qi")
- (V4HI "hi") (V8HI "hi")
- (V2SI "si") (V4SI "si")
- (DI "di") (V2DI "di")
- (V4HF "hf") (V8HF "hf")
- (V2SF "sf") (V4SF "sf")
- (V2DF "df") (DF "df")
+(define_mode_attr Vel [(V8QI "qi") (V16QI "qi") (V32QI "qi")
+ (V4HI "hi") (V8HI "hi") (V16HI "hi")
+ (V2SI "si") (V4SI "si") (V8SI "si")
+ (DI "di") (V2DI "di") (V4DI "di")
+ (V4HF "hf") (V8HF "hf") (V16HF "hf")
+ (V2SF "sf") (V4SF "sf") (V8SF "sf")
+ (V2DF "df") (DF "df") (V4DF "df")
(SI "si") (HI "hi")
(QI "qi")])
+;; Element mode with floating-point values replaced by like-sized integers.
+(define_mode_attr VEL_INT [(V32QI "QI")
+ (V16HI "HI") (V16HF "HI")
+ (V8SI "SI") (V8SF "SI")
+ (V4DI "DI") (V4DF "DI")])
+
+;; Gives the mode of the 128-bit lowpart of an SVE vector.
+(define_mode_attr V128 [(V32QI "V16QI")
+ (V16HI "V8HI") (V16HF "V8HF")
+ (V8SI "V4SI") (V8SF "V4SF")
+ (V4DI "V2DI") (V4DF "V2DF")])
+
+;; ...and again in lower case.
+(define_mode_attr v128 [(V32QI "v16qi")
+ (V16HI "v8hi") (V16HF "v8hf")
+ (V8SI "v4si") (V8SF "v4sf")
+ (V4DI "v2di") (V4DF "v2df")])
+
;; 64-bit container modes the inner or scalar source mode.
(define_mode_attr VCOND [(HI "V4HI") (SI "V2SI")
(V4HI "V4HI") (V8HI "V4HI")
@@ -710,15 +758,20 @@
(HI "SI") (SI "DI")
(V8HF "V4SF") (V4SF "V2DF")
(V4HF "V4SF") (V2SF "V2DF")
+ (V16HF "V8SF") (V8SF "V4DF")
(V32QI "V16HI") (V16HI "V8SI")
(V8SI "V4DI")
(V32BI "V16BI") (V16BI "V8BI")
(V8BI "V4BI")])
+;; Predicate mode associated with VWIDE.
+(define_mode_attr VWIDE_PRED [(V16HF "V8BI") (V8SF "V4BI")])
+
;; Widened modes of vector modes, lowercase
(define_mode_attr Vwide [(V2SF "v2df") (V4HF "v4sf")
(V32QI "v16hi") (V16HI "v8si")
(V8SI "v4di")
+ (V16HF "v8sf") (V8SF "v4df")
(V32BI "v16bi") (V16BI "v8bi")
(V8BI "v4bi")])
@@ -729,7 +782,9 @@
(V8HF "4s") (V4SF "2d")])
;; SVE vector after widening
-(define_mode_attr Vewtype [(V32QI "h") (V16HI "s") (V8SI "d")])
+(define_mode_attr Vewtype [(V32QI "h")
+ (V16HI "s") (V16HF "s")
+ (V8SI "d") (V8SF "d")])
;; Widened mode register suffixes for VDW/VQW.
(define_mode_attr Vmwtype [(V8QI ".8h") (V4HI ".4s")
@@ -748,6 +803,7 @@
(V4HI "w") (V8HI "w") (V16HI "w")
(V2SI "w") (V4SI "w") (V8SI "w")
(DI "x") (V2DI "x") (V4DI "x")
+ (V16HF "h")
(V2SF "s") (V4SF "s") (V8SF "s")
(V2DF "d") (V4DF "d")])
@@ -757,7 +813,7 @@
(V4HI "w") (V8HI "w") (V16HI "w")
(V2SI "w") (V4SI "w") (V8SI "w")
(DI "x") (V2DI "x") (V4DI "x")
- (V4HF "w") (V8HF "w")
+ (V4HF "w") (V8HF "w") (V16HF "w")
(V2SF "w") (V4SF "w") (V8SF "w")
(V2DF "x") (V4DF "x")])
@@ -769,7 +825,7 @@
(V4HI "V4HI") (V8HI "V8HI") (V16HI "V16HI")
(V2SI "V2SI") (V4SI "V4SI") (V8SI "V8SI")
(DI "DI") (V2DI "V2DI") (V4DI "V4DI")
- (V4HF "V4HI") (V8HF "V8HI")
+ (V4HF "V4HI") (V8HF "V8HI") (V16HF "V16HI")
(V2SF "V2SI") (V4SF "V4SI") (V8SF "V8SI")
(DF "DI") (V2DF "V2DI") (V4DF "V4DI")
(SF "SI") (HF "HI")])
@@ -779,7 +835,7 @@
(V4HI "v4hi") (V8HI "v8hi") (V16HI "v16hi")
(V2SI "v2si") (V4SI "v4si") (V8SI "v8si")
(DI "di") (V2DI "v2di") (V4DI "v4di")
- (V4HF "v4hi") (V8HF "v8hi")
+ (V4HF "v4hi") (V8HF "v8hi") (V16HF "v16hi")
(V2SF "v2si") (V4SF "v4si") (V8SF "v8si")
(DF "di") (V2DF "v2di") (V4DF "v4di")
(SF "si")])
@@ -820,29 +876,29 @@
(define_mode_attr nregs [(OI "2") (CI "3") (XI "4")])
;; Map the mode of a single vector to a list of two vectors.
-(define_mode_attr VRL2 [(V32QI "V64QI") (V16HI "V32HI")
+(define_mode_attr VRL2 [(V32QI "V64QI") (V16HI "V32HI") (V16HF "V32HF")
(V8SI "V16SI") (V8SF "V16SF")
(V4DI "V8DI") (V4DF "V8DF")])
-(define_mode_attr vrl2 [(V32QI "v64qi") (V16HI "v32hi")
+(define_mode_attr vrl2 [(V32QI "v64qi") (V16HI "v32hi") (V16HF "v32hf")
(V8SI "v16si") (V8SF "v16sf")
(V4DI "v8di") (V4DF "v8df")])
;; Map the mode of a single vector to a list of three vectors.
-(define_mode_attr VRL3 [(V32QI "V96QI") (V16HI "V48HI")
+(define_mode_attr VRL3 [(V32QI "V96QI") (V16HI "V48HI") (V16HF "V48HF")
(V8SI "V24SI") (V8SF "V24SF")
(V4DI "V12DI") (V4DF "V12DF")])
-(define_mode_attr vrl3 [(V32QI "v96qi") (V16HI "v48hi")
+(define_mode_attr vrl3 [(V32QI "v96qi") (V16HI "v48hi") (V16HF "v48hf")
(V8SI "v24si") (V8SF "v24sf")
(V4DI "v12di") (V4DF "v12df")])
;; Map the mode of a single vector to a list of four vectors.
-(define_mode_attr VRL4 [(V32QI "V128QI") (V16HI "V64HI")
+(define_mode_attr VRL4 [(V32QI "V128QI") (V16HI "V64HI") (V16HF "V64HF")
(V8SI "V32SI") (V8SF "V32SF")
(V4DI "V16DI") (V4DF "V16DF")])
-(define_mode_attr vrl4 [(V32QI "v128qi") (V16HI "v64hi")
+(define_mode_attr vrl4 [(V32QI "v128qi") (V16HI "v64hi") (V16HF "v64hf")
(V8SI "v32si") (V8SF "v32sf")
(V4DI "v16di") (V4DF "v16df")])
@@ -940,6 +996,10 @@
(define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi")])
(define_mode_attr VSI2QI [(V2SI "V8QI") (V4SI "V16QI")])
+
+;; Register suffix for DOTPROD input types from the return type.
+(define_mode_attr Vdottype [(V2SI "8b") (V4SI "16b")])
+
;; Sum of lengths of instructions needed to move vector registers of a mode.
(define_mode_attr insn_count [(OI "8") (CI "12") (XI "16")])
@@ -948,72 +1008,84 @@
(define_mode_attr got_modifier [(SI "gotpage_lo14") (DI "gotpage_lo15")])
;; The number of subvectors in an SVE_STRUCT.
-(define_mode_attr vector_count [(V64QI "2") (V32HI "2") (V16SI "2")
- (V8DI "2") (V16SF "2") (V8DF "2")
- (V96QI "3") (V48HI "3") (V24SI "3")
- (V12DI "3") (V24SF "3") (V12DF "3")
- (V128QI "4") (V64HI "4") (V32SI "4")
- (V16DI "4") (V32SF "4") (V16DF "4")])
+(define_mode_attr vector_count [(V64QI "2") (V32HI "2")
+ (V16SI "2") (V8DI "2")
+ (V32HF "2") (V16SF "2") (V8DF "2")
+ (V96QI "3") (V48HI "3")
+ (V24SI "3") (V12DI "3")
+ (V48HF "3") (V24SF "3") (V12DF "3")
+ (V128QI "4") (V64HI "4")
+ (V32SI "4") (V16DI "4")
+ (V64HF "4") (V32SF "4") (V16DF "4")])
;; The number of instruction bytes needed for an SVE_STRUCT move. This is
;; equal to vector_count * 4.
-(define_mode_attr insn_length [(V64QI "8") (V32HI "8") (V16SI "8")
- (V8DI "8") (V16SF "8") (V8DF "8")
- (V96QI "12") (V48HI "12") (V24SI "12")
- (V12DI "12") (V24SF "12") (V12DF "12")
- (V128QI "16") (V64HI "16") (V32SI "16")
- (V16DI "16") (V32SF "16") (V16DF "16")])
+(define_mode_attr insn_length [(V64QI "8") (V32HI "8")
+ (V16SI "8") (V8DI "8")
+ (V32HF "8") (V16SF "8") (V8DF "8")
+ (V96QI "12") (V48HI "12")
+ (V24SI "12") (V12DI "12")
+ (V48HF "12") (V24SF "12") (V12DF "12")
+ (V128QI "16") (V64HI "16")
+ (V32SI "16") (V16DI "16")
+ (V64HF "16") (V32SF "16") (V16DF "16")])
;; The type of a subvector in an SVE_STRUCT.
-(define_mode_attr VSINGLE [(V64QI "V32QI") (V32HI "V16HI") (V16SI "V8SI")
- (V8DI "V4DI") (V16SF "V8SF") (V8DF "V4DF")
- (V96QI "V32QI") (V48HI "V16HI") (V24SI "V8SI")
- (V12DI "V4DI") (V24SF "V8SF") (V12DF "V4DF")
- (V128QI "V32QI") (V64HI "V16HI") (V32SI "V8SI")
- (V16DI "V4DI") (V32SF "V8SF") (V16DF "V4DF")])
+(define_mode_attr VSINGLE [(V64QI "V32QI") (V32HI "V16HI")
+ (V16SI "V8SI") (V8DI "V4DI")
+ (V32HF "V16HF") (V16SF "V8SF") (V8DF "V4DF")
+ (V96QI "V32QI") (V48HI "V16HI")
+ (V24SI "V8SI") (V12DI "V4DI")
+ (V48HF "V16HF") (V24SF "V8SF") (V12DF "V4DF")
+ (V128QI "V32QI") (V64HI "V16HI")
+ (V32SI "V8SI") (V16DI "V4DI")
+ (V64HF "V16HF") (V32SF "V8SF") (V16DF "V4DF")])
;; ...and again in lower case.
-(define_mode_attr vsingle [(V64QI "v32qi") (V32HI "v16hi") (V16SI "v8si")
- (V8DI "v4di") (V16SF "v8sf") (V8DF "v4df")
- (V96QI "v32qi") (V48HI "v16hi") (V24SI "v8si")
- (V12DI "v4di") (V24SF "v8sf") (V12DF "v4df")
- (V128QI "v32qi") (V64HI "v16hi") (V32SI "v8si")
- (V16DI "v4di") (V32SF "v8sf") (V16DF "v4df")])
+(define_mode_attr vsingle [(V64QI "v32qi") (V32HI "v16hi")
+ (V16SI "v8si") (V8DI "v4di")
+ (V32HF "v16hf") (V16SF "v8sf") (V8DF "v4df")
+ (V96QI "v32qi") (V48HI "v16hi")
+ (V24SI "v8si") (V12DI "v4di")
+ (V48HF "v16hf") (V24SF "v8sf") (V12DF "v4df")
+ (V128QI "v32qi") (V64HI "v16hi")
+ (V32SI "v8si") (V16DI "v4di")
+ (V64HF "v16hf") (V32SF "v8sf") (V16DF "v4df")])
;; The predicate mode associated with an SVE data mode. For structure modes
;; this is equivalent to the <VPRED> of the subvector mode.
(define_mode_attr VPRED [(V32QI "V32BI")
- (V16HI "V16BI")
+ (V16HI "V16BI") (V16HF "V16BI")
(V8SI "V8BI") (V8SF "V8BI")
(V4DI "V4BI") (V4DF "V4BI")
(V64QI "V32BI")
- (V32HI "V16BI")
+ (V32HI "V16BI") (V32HF "V16BI")
(V16SI "V8BI") (V16SF "V8BI")
(V8DI "V4BI") (V8DF "V4BI")
(V96QI "V32BI")
- (V48HI "V16BI")
+ (V48HI "V16BI") (V48HF "V16BI")
(V24SI "V8BI") (V24SF "V8BI")
(V12DI "V4BI") (V12DF "V4BI")
(V128QI "V32BI")
- (V64HI "V16BI")
+ (V64HI "V16BI") (V64HF "V16BI")
(V32SI "V8BI") (V32SF "V8BI")
(V16DI "V4BI") (V16DF "V4BI")])
;; ...and again in lower case.
(define_mode_attr vpred [(V32QI "v32bi")
- (V16HI "v16bi")
+ (V16HI "v16bi") (V16HF "v16bi")
(V8SI "v8bi") (V8SF "v8bi")
(V4DI "v4bi") (V4DF "v4bi")
(V64QI "v32bi")
- (V32HI "v16bi")
+ (V32HI "v16bi") (V32HF "v16bi")
(V16SI "v8bi") (V16SF "v8bi")
(V8DI "v4bi") (V8DF "v4bi")
(V96QI "v32bi")
- (V48HI "v16bi")
+ (V48HI "v16bi") (V48HF "v16bi")
(V24SI "v8bi") (V24SF "v8bi")
(V12DI "v4bi") (V12DF "v4bi")
(V128QI "v32bi")
- (V64HI "v16bi")
+ (V64HI "v16bi") (V64HF "v8bi")
(V32SI "v8bi") (V32SF "v8bi")
(V16DI "v4bi") (V16DF "v4bi")])
@@ -1099,6 +1171,7 @@
;; SVE integer unary operations.
(define_code_iterator SVE_INT_UNARY [neg not popcount])
+;; SVE floating-point unary operations.
(define_code_iterator SVE_FP_UNARY [neg abs sqrt])
;; -------------------------------------------------------------------
@@ -1251,6 +1324,7 @@
;; Attribute to describe constants acceptable in atomic logical operations
(define_mode_attr lconst_atomic [(QI "K") (HI "K") (SI "K") (DI "L")])
+;; The integer SVE instruction that implements an rtx code.
(define_code_attr sve_int_op [(plus "add")
(neg "neg")
(smin "smin")
@@ -1263,6 +1337,7 @@
(not "not")
(popcount "cnt")])
+;; The floating-point SVE instruction that implements an rtx code.
(define_code_attr sve_fp_op [(plus "fadd")
(neg "fneg")
(abs "fabs")
@@ -1286,6 +1361,7 @@
UNSPEC_SHSUB UNSPEC_UHSUB
UNSPEC_SRHSUB UNSPEC_URHSUB])
+(define_int_iterator DOTPROD [UNSPEC_SDOT UNSPEC_UDOT])
(define_int_iterator ADDSUBHN [UNSPEC_ADDHN UNSPEC_RADDHN
UNSPEC_SUBHN UNSPEC_RSUBHN])
@@ -1407,6 +1483,9 @@
;; Int Iterators Attributes.
;; -------------------------------------------------------------------
+;; The optab associated with an operation. Note that for ANDF, IORF
+;; and XORF, the optab should not actually be defined; we just use this
+;; name for consistency with the integer patterns.
(define_int_attr optab [(UNSPEC_ANDF "and")
(UNSPEC_IORF "ior")
(UNSPEC_XORF "xor")
@@ -1459,10 +1538,12 @@
(UNSPEC_IORV "orv")
(UNSPEC_XORV "eorv")])
+;; The SVE logical instruction that implements an unspec.
(define_int_attr logicalf_op [(UNSPEC_ANDF "and")
(UNSPEC_IORF "orr")
(UNSPEC_XORF "eor")])
+;; "s" for signed operations and "u" for unsigned ones.
(define_int_attr su [(UNSPEC_UNPACKSHI "s")
(UNSPEC_UNPACKUHI "u")
(UNSPEC_UNPACKSLO "s")
@@ -1492,6 +1573,7 @@
(UNSPEC_USHLL "u") (UNSPEC_SSHLL "s")
(UNSPEC_URSHL "ur") (UNSPEC_SRSHL "sr")
(UNSPEC_UQRSHL "u") (UNSPEC_SQRSHL "s")
+ (UNSPEC_SDOT "s") (UNSPEC_UDOT "u")
])
(define_int_attr r [(UNSPEC_SQDMULH "") (UNSPEC_SQRDMULH "r")
@@ -1608,6 +1690,7 @@
(define_int_attr rdma_as [(UNSPEC_SQRDMLAH "a") (UNSPEC_SQRDMLSH "s")])
+;; The condition associated with an UNSPEC_COND_<xx>.
(define_int_attr cmp_op [(UNSPEC_COND_LT "lt")
(UNSPEC_COND_LE "le")
(UNSPEC_COND_EQ "eq")
@@ -1619,6 +1702,7 @@
(UNSPEC_COND_HS "hs")
(UNSPEC_COND_HI "hi")])
+;; The constraint to use for an UNSPEC_COND_<xx>.
(define_int_attr imm_con [(UNSPEC_COND_EQ "vsc")
(UNSPEC_COND_NE "vsc")
(UNSPEC_COND_LT "vsc")
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 4bb473b80c4..972ab2182d5 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -267,16 +267,20 @@
(ior (match_operand 0 "memory_operand")
(match_test "aarch64_mov_operand_p (op, mode)")))))
-(define_predicate "aarch64_movti_operand"
- (and (match_code "reg,subreg,mem,const_int")
+(define_predicate "aarch64_nonmemory_operand"
+ (and (match_code "reg,subreg,const,const_int,symbol_ref,label_ref,high,
+ const_poly_int,const_vector")
(ior (match_operand 0 "register_operand")
- (ior (match_operand 0 "memory_operand")
- (match_operand 0 "const_int_operand")))))
+ (match_test "aarch64_mov_operand_p (op, mode)"))))
+
+(define_predicate "aarch64_movti_operand"
+ (ior (match_operand 0 "register_operand")
+ (match_operand 0 "memory_operand")
+ (match_operand 0 "const_scalar_int_operand")))
(define_predicate "aarch64_reg_or_imm"
- (and (match_code "reg,subreg,const_int")
- (ior (match_operand 0 "register_operand")
- (match_operand 0 "const_int_operand"))))
+ (ior (match_operand 0 "register_operand")
+ (match_operand 0 "const_scalar_int_operand")))
;; True for integer comparisons and for FP comparisons other than LTGT or UNEQ.
(define_special_predicate "aarch64_comparison_operator"
@@ -366,8 +370,8 @@
(define_predicate "aarch64_simd_reg_or_zero"
(and (match_code "reg,subreg,const_int,const_double,const,const_vector")
(ior (match_operand 0 "register_operand")
- (match_operand 0 "aarch64_simd_imm_zero")
- (match_test "op == const0_rtx"))))
+ (match_test "op == const0_rtx")
+ (match_operand 0 "aarch64_simd_imm_zero"))))
(define_predicate "aarch64_simd_struct_operand"
(and (match_code "mem")
diff --git a/gcc/config/alpha/sync.md b/gcc/config/alpha/sync.md
index 69c9d249b97..a0e67a99e88 100644
--- a/gcc/config/alpha/sync.md
+++ b/gcc/config/alpha/sync.md
@@ -24,7 +24,7 @@
[(plus "add_operand") (minus "reg_or_8bit_operand")
(ior "or_operand") (xor "or_operand") (and "and_operand")])
(define_code_attr fetchop_constr
- [(plus "rKL") (minus "rI") (ior "rIN") (xor "rIN") (and "riNM")])
+ [(plus "rKL") (minus "rI") (ior "rIN") (xor "rIN") (and "rINM")])
(define_expand "memory_barrier"
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 0a68db8e889..90a9a56e016 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -2829,12 +2829,23 @@ arc_save_restore (rtx base_reg,
else
{
insn = frame_insn (insn);
- if (epilogue_p)
- for (r = start_call; r <= end_call; r++)
- {
- rtx reg = gen_rtx_REG (SImode, r);
- add_reg_note (insn, REG_CFA_RESTORE, reg);
- }
+ for (r = start_call, off = 0;
+ r <= end_call;
+ r++, off += UNITS_PER_WORD)
+ {
+ rtx reg = gen_rtx_REG (SImode, r);
+ if (epilogue_p)
+ add_reg_note (insn, REG_CFA_RESTORE, reg);
+ else
+ {
+ rtx mem = gen_rtx_MEM (SImode, plus_constant (Pmode,
+ base_reg,
+ off));
+
+ add_reg_note (insn, REG_CFA_OFFSET,
+ gen_rtx_SET (mem, reg));
+ }
+ }
}
offset += off;
}
@@ -3076,6 +3087,19 @@ arc_expand_prologue (void)
frame_size_to_allocate -= cfun->machine->frame_info.reg_size;
}
+ /* In the case of a millicode thunk, we need to restore the clobbered
+ blink register. */
+ if (cfun->machine->frame_info.millicode_end_reg > 0
+ && arc_must_save_return_addr (cfun))
+ {
+ HOST_WIDE_INT tmp = cfun->machine->frame_info.reg_size;
+ emit_insn (gen_rtx_SET (gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM),
+ gen_rtx_MEM (Pmode,
+ plus_constant (Pmode,
+ stack_pointer_rtx,
+ tmp))));
+ }
+
/* Save frame pointer if needed. First save the FP on stack, if not
autosaved. */
if (arc_frame_pointer_needed ()
@@ -7189,6 +7213,12 @@ hwloop_optimize (hwloop_info loop)
fprintf (dump_file, ";; loop %d too long\n", loop->loop_no);
return false;
}
+ else if (!loop->length)
+ {
+ if (dump_file)
+ fprintf (dump_file, ";; loop %d is empty\n", loop->loop_no);
+ return false;
+ }
/* Check if we use a register or not. */
if (!REG_P (loop->iter_reg))
@@ -7260,8 +7290,11 @@ hwloop_optimize (hwloop_info loop)
&& INSN_P (last_insn)
&& (JUMP_P (last_insn) || CALL_P (last_insn)
|| GET_CODE (PATTERN (last_insn)) == SEQUENCE
- || get_attr_type (last_insn) == TYPE_BRCC
- || get_attr_type (last_insn) == TYPE_BRCC_NO_DELAY_SLOT))
+ /* At this stage we can have (insn (clobber (mem:BLK
+ (reg)))) instructions; ignore them. */
+ || (GET_CODE (PATTERN (last_insn)) != CLOBBER
+ && (get_attr_type (last_insn) == TYPE_BRCC
+ || get_attr_type (last_insn) == TYPE_BRCC_NO_DELAY_SLOT))))
{
if (loop->length + 2 > ARC_MAX_LOOP_LENGTH)
{
diff --git a/gcc/config/arc/linux.h b/gcc/config/arc/linux.h
index d8e006307fc..707347183ca 100644
--- a/gcc/config/arc/linux.h
+++ b/gcc/config/arc/linux.h
@@ -91,3 +91,11 @@ along with GCC; see the file COPYING3. If not see
/* Pre/post modify with register displacement are default off. */
#undef TARGET_AUTO_MODIFY_REG_DEFAULT
#define TARGET_AUTO_MODIFY_REG_DEFAULT 0
+
+#if DEFAULT_LIBC == LIBC_GLIBC
+/* Override linux.h LINK_EH_SPEC definition.
+ Signal that, because we have fde-glibc, we don't need all C shared libs
+ linked against -lgcc_s. */
+#undef LINK_EH_SPEC
+#define LINK_EH_SPEC "--eh-frame-hdr"
+#endif
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 692496d49d5..d09c6e371de 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -107,6 +107,13 @@ arm_ternop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
= { qualifier_none, qualifier_none, qualifier_none, qualifier_none };
#define TERNOP_QUALIFIERS (arm_ternop_qualifiers)
+/* unsigned T (unsigned T, unsigned T, unsigned T). */
+static enum arm_type_qualifiers
+arm_unsigned_uternop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+ = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
+ qualifier_unsigned };
+#define UTERNOP_QUALIFIERS (arm_unsigned_uternop_qualifiers)
+
/* T (T, immediate). */
static enum arm_type_qualifiers
arm_binop_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
@@ -133,6 +140,13 @@ arm_mac_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
qualifier_none, qualifier_lane_index };
#define MAC_LANE_QUALIFIERS (arm_mac_lane_qualifiers)
+/* unsigned T (unsigned T, unsigned T, unsigned T, lane index). */
+static enum arm_type_qualifiers
+arm_umac_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+ = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
+ qualifier_unsigned, qualifier_lane_index };
+#define UMAC_LANE_QUALIFIERS (arm_umac_lane_qualifiers)
+
/* T (T, T, immediate). */
static enum arm_type_qualifiers
arm_ternop_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index b2e9af6c45d..635bc3c1c38 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -74,11 +74,11 @@ arm_cpu_builtins (struct cpp_reader* pfile)
def_or_undef_macro (pfile, "__ARM_FEATURE_QRDMX", TARGET_NEON_RDMA);
- if (TARGET_CRC32)
- builtin_define ("__ARM_FEATURE_CRC32");
-
+ def_or_undef_macro (pfile, "__ARM_FEATURE_CRC32", TARGET_CRC32);
+ def_or_undef_macro (pfile, "__ARM_FEATURE_DOTPROD", TARGET_DOTPROD);
def_or_undef_macro (pfile, "__ARM_32BIT_STATE", TARGET_32BIT);
+ cpp_undef (pfile, "__ARM_FEATURE_CMSE");
if (arm_arch8 && !arm_arch_notm)
{
if (arm_arch_cmse && use_cmse)
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 07de4c9375b..0820ad74c2e 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -156,6 +156,8 @@ define feature crypto
# FP16 data processing (half-precision float).
define feature fp16
+# Dot Product instructions extension to ARMv8.2-a.
+define feature dotprod
# ISA Quirks (errata?). Don't forget to add this to the fgroup
# ALL_QUIRKS below.
@@ -173,6 +175,17 @@ define feature quirk_cm3_ldrd
define feature smallmul
# Feature groups. Conventionally all (or mostly) upper case.
+# ALL_FPU lists all the feature bits associated with the floating-point
+# unit; these will all be removed if the floating-point unit is disabled
+# (eg -mfloat-abi=soft). ALL_FPU_INTERNAL must ONLY contain features that
+# form part of a named -mfpu option; it is used to map the capabilities
+# back to a named FPU for the benefit of the assembler.
+#
+# ALL_SIMD_INTERNAL and ALL_SIMD are similarly defined to help with the
+# construction of ALL_FPU and ALL_FPU_INTERNAL; they describe the SIMD
+# extensions that are either part of a named FPU or optional extensions
+# respectively.
+
# List of all cryptographic extensions to stripout if crypto is
# disabled. Currently, that's trivial, but we define it anyway for
@@ -182,11 +195,12 @@ define fgroup ALL_CRYPTO crypto
# List of all SIMD bits to strip out if SIMD is disabled. This does
# strip off 32 D-registers, but does not remove support for
# double-precision FP.
-define fgroup ALL_SIMD fp_d32 neon ALL_CRYPTO
+define fgroup ALL_SIMD_INTERNAL fp_d32 neon ALL_CRYPTO
+define fgroup ALL_SIMD ALL_SIMD_INTERNAL dotprod
# List of all FPU bits to strip out if -mfpu is used to override the
# default. fp16 is deliberately missing from this list.
-define fgroup ALL_FPU_INTERNAL vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD
+define fgroup ALL_FPU_INTERNAL vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
# Similarly, but including fp16 and other extensions that aren't part of
# -mfpu support.
@@ -239,6 +253,7 @@ define fgroup FP_D32 FP_DBL fp_d32
define fgroup FP_ARMv8 FPv5 FP_D32
define fgroup NEON FP_D32 neon
define fgroup CRYPTO NEON crypto
+define fgroup DOTPROD NEON dotprod
# List of all quirk bits to strip out when comparing CPU features with
# architectures.
@@ -561,6 +576,7 @@ begin arch armv8.2-a
option crypto add FP_ARMv8 CRYPTO
option nocrypto remove ALL_CRYPTO
option nofp remove ALL_FP
+ option dotprod add FP_ARMv8 DOTPROD
end arch armv8.2-a
begin arch armv8-m.base
@@ -1473,7 +1489,7 @@ begin cpu cortex-a55
cname cortexa55
tune for cortex-a53
tune flags LDSCHED
- architecture armv8.2-a+fp16
+ architecture armv8.2-a+fp16+dotprod
fpu neon-fp-armv8
option crypto add FP_ARMv8 CRYPTO
option nofp remove ALL_FP
@@ -1484,7 +1500,7 @@ begin cpu cortex-a75
cname cortexa75
tune for cortex-a57
tune flags LDSCHED
- architecture armv8.2-a+fp16
+ architecture armv8.2-a+fp16+dotprod
fpu neon-fp-armv8
option crypto add FP_ARMv8 CRYPTO
costs cortex_a73
@@ -1496,7 +1512,7 @@ begin cpu cortex-a75.cortex-a55
cname cortexa75cortexa55
tune for cortex-a53
tune flags LDSCHED
- architecture armv8.2-a+fp16
+ architecture armv8.2-a+fp16+dotprod
fpu neon-fp-armv8
option crypto add FP_ARMv8 CRYPTO
costs cortex_a73
@@ -1516,6 +1532,7 @@ begin cpu cortex-m33
architecture armv8-m.main+dsp
fpu fpv5-sp-d16
option nofp remove ALL_FP
+ option nodsp remove armv7em
costs v7m
end cpu cortex-m33
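
Together with the __ARM_FEATURE_DOTPROD macro added in arm-c.c above, the new +dotprod option gives user code a way to test for the extension. A hedged sketch of how that plumbing is consumed (the function name is made up for illustration):

    /* Sketch, not part of the patch.  Compiling with -march=armv8.2-a+dotprod,
       or with a CPU that implies it such as -mcpu=cortex-a75, sets
       TARGET_DOTPROD and therefore defines the ACLE macro tested here.  */
    int
    compiled_with_dotprod (void)
    {
    #ifdef __ARM_FEATURE_DOTPROD
      return 1;   /* the Dot Product builtins and UDOT/SDOT patterns are usable */
    #else
      return 0;
    #endif
    }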
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index ce3aaeb04e0..47ba0dd09e3 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -973,6 +973,9 @@ int arm_condexec_masklen = 0;
/* Nonzero if chip supports the ARMv8 CRC instructions. */
int arm_arch_crc = 0;
+/* Nonzero if chip supports the AdvSIMD Dot Product instructions. */
+int arm_arch_dotprod = 0;
+
/* Nonzero if chip supports the ARMv8-M security extensions. */
int arm_arch_cmse = 0;
@@ -9429,6 +9432,9 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
+ rtx_cost (XEXP (x, 0), mode, code, 0, speed_p));
if (speed_p)
*cost += 2 * extra_cost->alu.shift;
+ /* Slightly disparage left shift by 1 so that we prefer adddi3. */
+ if (code == ASHIFT && XEXP (x, 1) == CONST1_RTX (SImode))
+ *cost += 1;
return true;
}
else if (mode == SImode)
@@ -11252,9 +11258,11 @@ arm_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
return current_tune->vec_costs->scalar_to_vec_cost;
case unaligned_load:
+ case vector_gather_load:
return current_tune->vec_costs->vec_unalign_load_cost;
case unaligned_store:
+ case vector_scatter_store:
return current_tune->vec_costs->vec_unalign_store_cost;
case cond_branch_taken:
@@ -15293,12 +15301,23 @@ operands_ok_ldrd_strd (rtx rt, rtx rt2, rtx rn, HOST_WIDE_INT offset,
return true;
}
+/* Return true if a 64-bit access with alignment ALIGN and with a
+ constant offset OFFSET from the base pointer is permitted on this
+ architecture. */
+static bool
+align_ok_ldrd_strd (HOST_WIDE_INT align, HOST_WIDE_INT offset)
+{
+ return (unaligned_access
+ ? (align >= BITS_PER_WORD && (offset & 3) == 0)
+ : (align >= 2 * BITS_PER_WORD && (offset & 7) == 0));
+}
+
/* Helper for gen_operands_ldrd_strd. Returns true iff the memory
operand MEM's address contains an immediate offset from the base
- register and has no side effects, in which case it sets BASE and
- OFFSET accordingly. */
+ register and has no side effects, in which case it sets BASE,
+ OFFSET and ALIGN accordingly. */
static bool
-mem_ok_for_ldrd_strd (rtx mem, rtx *base, rtx *offset)
+mem_ok_for_ldrd_strd (rtx mem, rtx *base, rtx *offset, HOST_WIDE_INT *align)
{
rtx addr;
@@ -15317,6 +15336,7 @@ mem_ok_for_ldrd_strd (rtx mem, rtx *base, rtx *offset)
gcc_assert (MEM_P (mem));
*offset = const0_rtx;
+ *align = MEM_ALIGN (mem);
addr = XEXP (mem, 0);
@@ -15357,7 +15377,7 @@ gen_operands_ldrd_strd (rtx *operands, bool load,
bool const_store, bool commute)
{
int nops = 2;
- HOST_WIDE_INT offsets[2], offset;
+ HOST_WIDE_INT offsets[2], offset, align[2];
rtx base = NULL_RTX;
rtx cur_base, cur_offset, tmp;
int i, gap;
@@ -15369,7 +15389,8 @@ gen_operands_ldrd_strd (rtx *operands, bool load,
registers, and the corresponding memory offsets. */
for (i = 0; i < nops; i++)
{
- if (!mem_ok_for_ldrd_strd (operands[nops+i], &cur_base, &cur_offset))
+ if (!mem_ok_for_ldrd_strd (operands[nops+i], &cur_base, &cur_offset,
+ &align[i]))
return false;
if (i == 0)
@@ -15483,6 +15504,7 @@ gen_operands_ldrd_strd (rtx *operands, bool load,
/* Swap the instructions such that lower memory is accessed first. */
std::swap (operands[0], operands[1]);
std::swap (operands[2], operands[3]);
+ std::swap (align[0], align[1]);
if (const_store)
std::swap (operands[4], operands[5]);
}
@@ -15496,6 +15518,9 @@ gen_operands_ldrd_strd (rtx *operands, bool load,
if (gap != 4)
return false;
+ if (!align_ok_ldrd_strd (align[0], offset))
+ return false;
+
/* Make sure we generate legal instructions. */
if (operands_ok_ldrd_strd (operands[0], operands[1], base, offset,
false, load))
@@ -30365,6 +30390,8 @@ arm_const_not_ok_for_debug_p (rtx p)
tree decl_op0 = NULL;
tree decl_op1 = NULL;
+ if (GET_CODE (p) == UNSPEC)
+ return true;
if (GET_CODE (p) == MINUS)
{
if (GET_CODE (XEXP (p, 1)) == SYMBOL_REF)
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 336db4b042d..65d6db4d086 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -210,6 +210,11 @@ extern tree arm_fp16_type_node;
/* FPU supports ARMv8.1 Adv.SIMD extensions. */
#define TARGET_NEON_RDMA (TARGET_NEON && arm_arch8_1)
+/* Support for Dot Product AdvSIMD extensions. */
+#define TARGET_DOTPROD (TARGET_NEON \
+ && bitmap_bit_p (arm_active_target.isa, \
+ isa_bit_dotprod))
+
/* FPU supports the floating point FP16 instructions for ARMv8.2 and later. */
#define TARGET_VFP_FP16INST \
(TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP5 && arm_fp16_inst)
@@ -1248,7 +1253,7 @@ enum reg_class
couldn't convert a direct call into an indirect one. */
#define CALLER_INTERWORKING_SLOT_SIZE \
(TARGET_CALLER_INTERWORKING \
- && maybe_nonzero (crtl->outgoing_args_size) \
+ && may_ne (crtl->outgoing_args_size, 0) \
? UNITS_PER_WORD : 0)
/* If we generate an insn to push BYTES bytes,
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index f241f9d0b7d..ddb9d8f3590 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4059,12 +4059,6 @@
{
rtx scratch1, scratch2;
- if (operands[2] == CONST1_RTX (SImode))
- {
- emit_insn (gen_arm_ashldi3_1bit (operands[0], operands[1]));
- DONE;
- }
-
/* Ideally we should use iwmmxt here if we could know that operands[1]
ends up already living in an iwmmxt register. Otherwise it's
cheaper to have the alternate code being generated than moving
@@ -4081,18 +4075,6 @@
"
)
-(define_insn "arm_ashldi3_1bit"
- [(set (match_operand:DI 0 "s_register_operand" "=r,&r")
- (ashift:DI (match_operand:DI 1 "s_register_operand" "0,r")
- (const_int 1)))
- (clobber (reg:CC CC_REGNUM))]
- "TARGET_32BIT"
- "movs\\t%Q0, %Q1, asl #1\;adc\\t%R0, %R1, %R1"
- [(set_attr "conds" "clob")
- (set_attr "length" "8")
- (set_attr "type" "multiple")]
-)
-
(define_expand "ashlsi3"
[(set (match_operand:SI 0 "s_register_operand" "")
(ashift:SI (match_operand:SI 1 "s_register_operand" "")
@@ -4128,12 +4110,6 @@
{
rtx scratch1, scratch2;
- if (operands[2] == CONST1_RTX (SImode))
- {
- emit_insn (gen_arm_ashrdi3_1bit (operands[0], operands[1]));
- DONE;
- }
-
/* Ideally we should use iwmmxt here if we could know that operands[1]
ends up already living in an iwmmxt register. Otherwise it's
cheaper to have the alternate code being generated than moving
@@ -4150,18 +4126,6 @@
"
)
-(define_insn "arm_ashrdi3_1bit"
- [(set (match_operand:DI 0 "s_register_operand" "=r,&r")
- (ashiftrt:DI (match_operand:DI 1 "s_register_operand" "0,r")
- (const_int 1)))
- (clobber (reg:CC CC_REGNUM))]
- "TARGET_32BIT"
- "movs\\t%R0, %R1, asr #1\;mov\\t%Q0, %Q1, rrx"
- [(set_attr "conds" "clob")
- (set_attr "length" "8")
- (set_attr "type" "multiple")]
-)
-
(define_expand "ashrsi3"
[(set (match_operand:SI 0 "s_register_operand" "")
(ashiftrt:SI (match_operand:SI 1 "s_register_operand" "")
@@ -4194,12 +4158,6 @@
{
rtx scratch1, scratch2;
- if (operands[2] == CONST1_RTX (SImode))
- {
- emit_insn (gen_arm_lshrdi3_1bit (operands[0], operands[1]));
- DONE;
- }
-
/* Ideally we should use iwmmxt here if we could know that operands[1]
ends up already living in an iwmmxt register. Otherwise it's
cheaper to have the alternate code being generated than moving
@@ -4216,18 +4174,6 @@
"
)
-(define_insn "arm_lshrdi3_1bit"
- [(set (match_operand:DI 0 "s_register_operand" "=r,&r")
- (lshiftrt:DI (match_operand:DI 1 "s_register_operand" "0,r")
- (const_int 1)))
- (clobber (reg:CC CC_REGNUM))]
- "TARGET_32BIT"
- "movs\\t%R0, %R1, lsr #1\;mov\\t%Q0, %Q1, rrx"
- [(set_attr "conds" "clob")
- (set_attr "length" "8")
- (set_attr "type" "multiple")]
-)
-
(define_expand "lshrsi3"
[(set (match_operand:SI 0 "s_register_operand" "")
(lshiftrt:SI (match_operand:SI 1 "s_register_operand" "")
diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def
index 07f0368343a..982eec810da 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -331,3 +331,7 @@ VAR11 (STORE1, vst4,
v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf)
VAR9 (STORE1LANE, vst4_lane,
v8qi, v4hi, v4hf, v2si, v2sf, v8hi, v8hf, v4si, v4sf)
+VAR2 (TERNOP, sdot, v8qi, v16qi)
+VAR2 (UTERNOP, udot, v8qi, v16qi)
+VAR2 (MAC_LANE, sdot_lane, v8qi, v16qi)
+VAR2 (UMAC_LANE, udot_lane, v8qi, v16qi)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 7acbaf1bb40..a4fb234a846 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -410,6 +410,8 @@
(define_int_iterator VFM_LANE_AS [UNSPEC_VFMA_LANE UNSPEC_VFMS_LANE])
+(define_int_iterator DOTPROD [UNSPEC_DOT_S UNSPEC_DOT_U])
+
;;----------------------------------------------------------------------------
;; Mode attributes
;;----------------------------------------------------------------------------
@@ -720,6 +722,9 @@
(define_mode_attr pf [(V8QI "p") (V16QI "p") (V2SF "f") (V4SF "f")])
+(define_mode_attr VSI2QI [(V2SI "V8QI") (V4SI "V16QI")])
+(define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi")])
+
;;----------------------------------------------------------------------------
;; Code attributes
;;----------------------------------------------------------------------------
@@ -816,6 +821,7 @@
(UNSPEC_VSRA_S_N "s") (UNSPEC_VSRA_U_N "u")
(UNSPEC_VRSRA_S_N "s") (UNSPEC_VRSRA_U_N "u")
(UNSPEC_VCVTH_S "s") (UNSPEC_VCVTH_U "u")
+ (UNSPEC_DOT_S "s") (UNSPEC_DOT_U "u")
])
(define_int_attr vcvth_op
@@ -1003,3 +1009,6 @@
(define_int_attr mrrc [(VUNSPEC_MRRC "mrrc") (VUNSPEC_MRRC2 "mrrc2")])
(define_int_attr MRRC [(VUNSPEC_MRRC "MRRC") (VUNSPEC_MRRC2 "MRRC2")])
+
+(define_int_attr opsuffix [(UNSPEC_DOT_S "s8")
+ (UNSPEC_DOT_U "u8")])
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 6590e8cd894..073c26580dd 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1221,12 +1221,8 @@
gcc_assert (!reg_overlap_mentioned_p (operands[0], operands[1])
|| REGNO (operands[0]) == REGNO (operands[1]));
- if (operands[2] == CONST1_RTX (SImode))
- /* This clobbers CC. */
- emit_insn (gen_arm_ashldi3_1bit (operands[0], operands[1]));
- else
- arm_emit_coreregs_64bit_shift (ASHIFT, operands[0], operands[1],
- operands[2], operands[3], operands[4]);
+ arm_emit_coreregs_64bit_shift (ASHIFT, operands[0], operands[1],
+ operands[2], operands[3], operands[4]);
}
DONE;
}"
@@ -1325,13 +1321,9 @@
gcc_assert (!reg_overlap_mentioned_p (operands[0], operands[1])
|| REGNO (operands[0]) == REGNO (operands[1]));
- if (operands[2] == CONST1_RTX (SImode))
- /* This clobbers CC. */
- emit_insn (gen_arm_<shift>di3_1bit (operands[0], operands[1]));
- else
- /* This clobbers CC (ASHIFTRT by register only). */
- arm_emit_coreregs_64bit_shift (<CODE>, operands[0], operands[1],
- operands[2], operands[3], operands[4]);
+ /* This clobbers CC (ASHIFTRT by register only). */
+ arm_emit_coreregs_64bit_shift (<CODE>, operands[0], operands[1],
+ operands[2], operands[3], operands[4]);
}
DONE;
@@ -3044,6 +3036,76 @@
DONE;
})
+;; These instructions map to the __builtins for the Dot Product operations.
+(define_insn "neon_<sup>dot<vsi2qi>"
+ [(set (match_operand:VCVTI 0 "register_operand" "=w")
+ (plus:VCVTI (match_operand:VCVTI 1 "register_operand" "0")
+ (unspec:VCVTI [(match_operand:<VSI2QI> 2
+ "register_operand" "w")
+ (match_operand:<VSI2QI> 3
+ "register_operand" "w")]
+ DOTPROD)))]
+ "TARGET_DOTPROD"
+ "v<sup>dot.<opsuffix>\\t%<V_reg>0, %<V_reg>2, %<V_reg>3"
+ [(set_attr "type" "neon_dot")]
+)
+
+;; These instructions map to the __builtins for the Dot Product
+;; indexed operations.
+(define_insn "neon_<sup>dot_lane<vsi2qi>"
+ [(set (match_operand:VCVTI 0 "register_operand" "=w")
+ (plus:VCVTI (match_operand:VCVTI 1 "register_operand" "0")
+ (unspec:VCVTI [(match_operand:<VSI2QI> 2
+ "register_operand" "w")
+ (match_operand:V8QI 3 "register_operand" "t")
+ (match_operand:SI 4 "immediate_operand" "i")]
+ DOTPROD)))]
+ "TARGET_DOTPROD"
+ {
+ operands[4]
+ = GEN_INT (NEON_ENDIAN_LANE_N (V8QImode, INTVAL (operands[4])));
+ return "v<sup>dot.<opsuffix>\\t%<V_reg>0, %<V_reg>2, %P3[%c4]";
+ }
+ [(set_attr "type" "neon_dot")]
+)
+
+;; These expands map to the Dot Product optab the vectorizer checks for.
+;; The auto-vectorizer expects a dot product builtin that also does an
+;; accumulation into the provided register.
+;; Given the following pattern
+;;
+;; for (i=0; i<len; i++) {
+;; c = a[i] * b[i];
+;; r += c;
+;; }
+;; return r;
+;;
+;; This can be auto-vectorized to
+;; r = a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3];
+;;
+;; given enough iterations.  However, the vectorizer can keep unrolling the loop
+;; r += a[4]*b[4] + a[5]*b[5] + a[6]*b[6] + a[7]*b[7];
+;; r += a[8]*b[8] + a[9]*b[9] + a[10]*b[10] + a[11]*b[11];
+;; ...
+;;
+;; and so the vectorizer provides r, in which the result has to be accumulated.
+(define_expand "<sup>dot_prod<vsi2qi>"
+ [(set (match_operand:VCVTI 0 "register_operand")
+ (plus:VCVTI (unspec:VCVTI [(match_operand:<VSI2QI> 1
+ "register_operand")
+ (match_operand:<VSI2QI> 2
+ "register_operand")]
+ DOTPROD)
+ (match_operand:VCVTI 3 "register_operand")))]
+ "TARGET_DOTPROD"
+{
+ emit_insn (
+ gen_neon_<sup>dot<vsi2qi> (operands[3], operands[3], operands[1],
+ operands[2]));
+ emit_insn (gen_rtx_SET (operands[0], operands[3]));
+ DONE;
+})
+
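As a usage illustration (not part of the patch), the kind of C reduction this optab lets the vectorizer map onto the dot-product instructions looks like the loop below. The exact command-line flags needed to enable TARGET_DOTPROD are an assumption here, something along the lines of -O3 -march=armv8.2-a+dotprod with a hard-float ABI.

    #include <stdint.h>

    /* Unsigned 8-bit dot product accumulated into 32 bits; with dot-product
       support enabled the vectorizer can map this reduction onto the
       <sup>dot_prod<vsi2qi> expand above (VUDOT for the unsigned case).  */
    uint32_t
    dot_u8 (const uint8_t *a, const uint8_t *b, int len)
    {
      uint32_t r = 0;
      for (int i = 0; i < len; i++)
        r += (uint32_t) a[i] * b[i];
      return r;
    }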
(define_expand "neon_copysignf<mode>"
[(match_operand:VCVTF 0 "register_operand")
(match_operand:VCVTF 1 "register_operand")
diff --git a/gcc/config/arm/t-multilib b/gcc/config/arm/t-multilib
index ec4b76dbc8f..47f3673160a 100644
--- a/gcc/config/arm/t-multilib
+++ b/gcc/config/arm/t-multilib
@@ -68,7 +68,7 @@ v7ve_vfpv4_simd_variants := +simd
v8_a_nosimd_variants := +crc
v8_a_simd_variants := $(call all_feat_combs, simd crypto)
v8_1_a_simd_variants := $(call all_feat_combs, simd crypto)
-v8_2_a_simd_variants := $(call all_feat_combs, simd fp16 crypto)
+v8_2_a_simd_variants := $(call all_feat_combs, simd fp16 crypto dotprod)
ifneq (,$(HAS_APROFILE))
diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
index 22d993d46a3..03e9cdebb75 100644
--- a/gcc/config/arm/types.md
+++ b/gcc/config/arm/types.md
@@ -316,6 +316,8 @@
; neon_cls_q
; neon_cnt
; neon_cnt_q
+; neon_dot
+; neon_dot_q
; neon_ext
; neon_ext_q
; neon_rbit
@@ -764,6 +766,8 @@
\
neon_abs,\
neon_abs_q,\
+ neon_dot,\
+ neon_dot_q,\
neon_neg,\
neon_neg_q,\
neon_qneg,\
@@ -1110,8 +1114,8 @@
neon_sub, neon_sub_q, neon_sub_widen, neon_sub_long, neon_qsub,\
neon_qsub_q, neon_sub_halve, neon_sub_halve_q,\
neon_sub_halve_narrow_q,\
- neon_abs, neon_abs_q, neon_neg, neon_neg_q, neon_qneg,\
- neon_qneg_q, neon_qabs, neon_qabs_q, neon_abd, neon_abd_q,\
+ neon_abs, neon_abs_q, neon_dot, neon_dot_q, neon_neg, neon_neg_q,\
+ neon_qneg, neon_qneg_q, neon_qabs, neon_qabs_q, neon_abd, neon_abd_q,\
neon_abd_long, neon_minmax, neon_minmax_q, neon_compare,\
neon_compare_q, neon_compare_zero, neon_compare_zero_q,\
neon_arith_acc, neon_arith_acc_q, neon_reduc_add,\
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 99cfa41b08d..c474f4bb5db 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -410,4 +410,6 @@
UNSPEC_VRNDN
UNSPEC_VRNDP
UNSPEC_VRNDX
+ UNSPEC_DOT_S
+ UNSPEC_DOT_U
])
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 9521e904d21..a541413c263 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -304,9 +304,9 @@
;; DImode moves
(define_insn "*movdi_vfp"
- [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,r,w,w, Uv")
+ [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,!r,w,w, Uv")
(match_operand:DI 1 "di_operand" "r,rDa,Db,Dc,mi,mi,q,r,w,w,Uvi,w"))]
- "TARGET_32BIT && TARGET_HARD_FLOAT && arm_tune != TARGET_CPU_cortexa8
+ "TARGET_32BIT && TARGET_HARD_FLOAT
&& ( register_operand (operands[0], DImode)
|| register_operand (operands[1], DImode))
&& !(TARGET_NEON && CONST_INT_P (operands[1])
@@ -339,71 +339,25 @@
}
"
[(set_attr "type" "multiple,multiple,multiple,multiple,load_8,load_8,store_8,f_mcrr,f_mrrc,ffarithd,f_loadd,f_stored")
- (set (attr "length") (cond [(eq_attr "alternative" "1,4,5,6") (const_int 8)
+ (set (attr "length") (cond [(eq_attr "alternative" "1") (const_int 8)
(eq_attr "alternative" "2") (const_int 12)
(eq_attr "alternative" "3") (const_int 16)
+ (eq_attr "alternative" "4,5,6")
+ (symbol_ref "arm_count_output_move_double_insns (operands) * 4")
(eq_attr "alternative" "9")
(if_then_else
(match_test "TARGET_VFP_SINGLE")
(const_int 8)
(const_int 4))]
(const_int 4)))
+ (set_attr "predicable" "yes")
(set_attr "arm_pool_range" "*,*,*,*,1020,4096,*,*,*,*,1020,*")
(set_attr "thumb2_pool_range" "*,*,*,*,1018,4094,*,*,*,*,1018,*")
(set_attr "neg_pool_range" "*,*,*,*,1004,0,*,*,*,*,1004,*")
+ (set (attr "ce_count") (symbol_ref "get_attr_length (insn) / 4"))
(set_attr "arch" "t2,any,any,any,a,t2,any,any,any,any,any,any")]
)
-(define_insn "*movdi_vfp_cortexa8"
- [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,!r,w,w, Uv")
- (match_operand:DI 1 "di_operand" "r,rDa,Db,Dc,mi,mi,q,r,w,w,Uvi,w"))]
- "TARGET_32BIT && TARGET_HARD_FLOAT && arm_tune == TARGET_CPU_cortexa8
- && ( register_operand (operands[0], DImode)
- || register_operand (operands[1], DImode))
- && !(TARGET_NEON && CONST_INT_P (operands[1])
- && neon_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
- "*
- switch (which_alternative)
- {
- case 0:
- case 1:
- case 2:
- case 3:
- return \"#\";
- case 4:
- case 5:
- case 6:
- return output_move_double (operands, true, NULL);
- case 7:
- return \"vmov%?\\t%P0, %Q1, %R1\\t%@ int\";
- case 8:
- return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\";
- case 9:
- return \"vmov%?.f64\\t%P0, %P1\\t%@ int\";
- case 10: case 11:
- return output_move_vfp (operands);
- default:
- gcc_unreachable ();
- }
- "
- [(set_attr "type" "multiple,multiple,multiple,multiple,load_8,load_8,store_8,f_mcrr,f_mrrc,ffarithd,f_loadd,f_stored")
- (set (attr "length") (cond [(eq_attr "alternative" "1") (const_int 8)
- (eq_attr "alternative" "2") (const_int 12)
- (eq_attr "alternative" "3") (const_int 16)
- (eq_attr "alternative" "4,5,6")
- (symbol_ref
- "arm_count_output_move_double_insns (operands) \
- * 4")]
- (const_int 4)))
- (set_attr "predicable" "yes")
- (set_attr "arm_pool_range" "*,*,*,*,1018,4094,*,*,*,*,1018,*")
- (set_attr "thumb2_pool_range" "*,*,*,*,1018,4094,*,*,*,*,1018,*")
- (set_attr "neg_pool_range" "*,*,*,*,1004,0,*,*,*,*,1004,*")
- (set (attr "ce_count")
- (symbol_ref "get_attr_length (insn) / 4"))
- (set_attr "arch" "t2,any,any,any,a,t2,any,any,any,any,any,any")]
- )
-
;; HFmode moves
(define_insn "*movhf_vfp_fp16"
diff --git a/gcc/config/cris/cris.h b/gcc/config/cris/cris.h
index f9149c717a7..892a3724393 100644
--- a/gcc/config/cris/cris.h
+++ b/gcc/config/cris/cris.h
@@ -998,7 +998,7 @@ enum cris_symbol_type
/* (no definitions) */
-/* Node: SDB and DWARF */
+/* Node: DWARF */
/* (no definitions) */
/* Node: Misc */
diff --git a/gcc/config/dbxcoff.h b/gcc/config/dbxcoff.h
index e5eef64f60e..c20b4fe77b1 100644
--- a/gcc/config/dbxcoff.h
+++ b/gcc/config/dbxcoff.h
@@ -25,10 +25,10 @@ along with GCC; see the file COPYING3. If not see
#define DBX_DEBUGGING_INFO 1
-/* Generate SDB debugging information by default. */
+/* Generate DBX debugging information by default. */
#ifndef PREFERRED_DEBUGGING_TYPE
-#define PREFERRED_DEBUGGING_TYPE SDB_DEBUG
+#define PREFERRED_DEBUGGING_TYPE DBX_DEBUG
#endif
/* Be function-relative for block and source line stab directives. */
diff --git a/gcc/config/ft32/ft32.c b/gcc/config/ft32/ft32.c
index 99e93821b3a..8a041f167a8 100644
--- a/gcc/config/ft32/ft32.c
+++ b/gcc/config/ft32/ft32.c
@@ -869,6 +869,8 @@ static bool
ft32_addr_space_legitimate_address_p (machine_mode mode, rtx x, bool strict,
addr_space_t as ATTRIBUTE_UNUSED)
{
+ int max_offset = TARGET_FT32B ? 16384 : 128;
+
if (mode != BLKmode)
{
if (GET_CODE (x) == PLUS)
@@ -878,8 +880,9 @@ ft32_addr_space_legitimate_address_p (machine_mode mode, rtx x, bool strict,
op2 = XEXP (x, 1);
if (GET_CODE (op1) == REG
&& CONST_INT_P (op2)
- && INTVAL (op2) >= -128
- && INTVAL (op2) < 128 && reg_ok_for_base_p (op1, strict))
+ && (-max_offset <= INTVAL (op2))
+ && (INTVAL (op2) < max_offset)
+ && reg_ok_for_base_p (op1, strict))
goto yes;
if (GET_CODE (op1) == SYMBOL_REF && CONST_INT_P (op2))
goto yes;
diff --git a/gcc/config/ft32/ft32.h b/gcc/config/ft32/ft32.h
index d52bb9af17c..8bb0d399a0c 100644
--- a/gcc/config/ft32/ft32.h
+++ b/gcc/config/ft32/ft32.h
@@ -39,6 +39,7 @@
#undef LIB_SPEC
#define LIB_SPEC "%{!shared:%{!symbolic:-lc}} \
+ %{mcompress:--relax} \
%{msim:-Tsim.ld}"
#undef LINK_SPEC
@@ -199,12 +200,12 @@ enum reg_class
#define GLOBAL_ASM_OP "\t.global\t"
-#define JUMP_TABLES_IN_TEXT_SECTION 1
+#define JUMP_TABLES_IN_TEXT_SECTION (TARGET_NOPM ? 0 : 1)
/* This is how to output an element of a case-vector that is absolute. */
#define ASM_OUTPUT_ADDR_VEC_ELT(FILE, VALUE) \
- fprintf (FILE, "\tjmp\t.L%d\n", VALUE); \
+ fprintf (FILE, "\t.long\t.L%d\n", VALUE); \
/* Passing Arguments in Registers */
@@ -469,7 +470,7 @@ do { \
#define ADDR_SPACE_PM 1
#define REGISTER_TARGET_PRAGMAS() do { \
- c_register_addr_space ("__flash__", ADDR_SPACE_PM); \
+ c_register_addr_space ("__flash__", TARGET_NOPM ? 0 : ADDR_SPACE_PM); \
} while (0);
extern int ft32_is_mem_pm(rtx o);
diff --git a/gcc/config/ft32/ft32.md b/gcc/config/ft32/ft32.md
index 984c3b67e32..2e772faf72f 100644
--- a/gcc/config/ft32/ft32.md
+++ b/gcc/config/ft32/ft32.md
@@ -777,8 +777,12 @@
(clobber (match_scratch:SI 2 "=&r"))
]
""
- "ldk.l\t$cc,%l1\;ashl.l\t%2,%0,2\;add.l\t%2,%2,$cc\;jmpi\t%2"
- )
+ {
+ if (TARGET_NOPM)
+ return \"ldk.l\t$cc,%l1\;ashl.l\t%2,%0,2\;add.l\t%2,%2,$cc\;ldi.l\t%2,%2,0\;jmpi\t%2\";
+ else
+ return \"ldk.l\t$cc,%l1\;ashl.l\t%2,%0,2\;add.l\t%2,%2,$cc\;lpmi.l\t%2,%2,0\;jmpi\t%2\";
+ })
;; -------------------------------------------------------------------------
;; Atomic exchange instruction
diff --git a/gcc/config/ft32/ft32.opt b/gcc/config/ft32/ft32.opt
index ba01c81ecf1..9e75f340335 100644
--- a/gcc/config/ft32/ft32.opt
+++ b/gcc/config/ft32/ft32.opt
@@ -29,3 +29,15 @@ Use LRA instead of reload.
mnodiv
Target Report Mask(NODIV)
Avoid use of the DIV and MOD instructions
+
+mft32b
+Target Report Mask(FT32B)
Target the FT32B architecture.
+
+mcompress
+Target Report Mask(COMPRESS)
Enable FT32B code compression.
+
+mnopm
+Target Report Mask(NOPM)
Avoid placing any readable data in program memory.
diff --git a/gcc/config/gnu-user.h b/gcc/config/gnu-user.h
index a967b69a350..df17b180906 100644
--- a/gcc/config/gnu-user.h
+++ b/gcc/config/gnu-user.h
@@ -162,11 +162,13 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
LD_STATIC_OPTION " --whole-archive -lasan --no-whole-archive " \
LD_DYNAMIC_OPTION "}}%{!static-libasan:-lasan}"
#undef LIBTSAN_EARLY_SPEC
-#define LIBTSAN_EARLY_SPEC "%{static-libtsan:%{!shared:" \
+#define LIBTSAN_EARLY_SPEC "%{!shared:libtsan_preinit%O%s} " \
+ "%{static-libtsan:%{!shared:" \
LD_STATIC_OPTION " --whole-archive -ltsan --no-whole-archive " \
LD_DYNAMIC_OPTION "}}%{!static-libtsan:-ltsan}"
#undef LIBLSAN_EARLY_SPEC
-#define LIBLSAN_EARLY_SPEC "%{static-liblsan:%{!shared:" \
+#define LIBLSAN_EARLY_SPEC "%{!shared:liblsan_preinit%O%s} " \
+ "%{static-liblsan:%{!shared:" \
LD_STATIC_OPTION " --whole-archive -llsan --no-whole-archive " \
LD_DYNAMIC_OPTION "}}%{!static-liblsan:-llsan}"
#endif
diff --git a/gcc/config/i386/avx512dqintrin.h b/gcc/config/i386/avx512dqintrin.h
index 88e8adb18c5..8e887d8c5ba 100644
--- a/gcc/config/i386/avx512dqintrin.h
+++ b/gcc/config/i386/avx512dqintrin.h
@@ -1160,16 +1160,63 @@ extern __inline __m128d
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm_reduce_sd (__m128d __A, __m128d __B, int __C)
{
- return (__m128d) __builtin_ia32_reducesd ((__v2df) __A,
- (__v2df) __B, __C);
+ return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A,
+ (__v2df) __B, __C,
+ (__v2df) _mm_setzero_pd (),
+ (__mmask8) -1);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_reduce_sd (__m128d __W, __mmask8 __U, __m128d __A,
+ __m128d __B, int __C)
+{
+ return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A,
+ (__v2df) __B, __C,
+ (__v2df) __W,
+ (__mmask8) __U);
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_reduce_sd (__mmask8 __U, __m128d __A, __m128d __B, int __C)
+{
+ return (__m128d) __builtin_ia32_reducesd_mask ((__v2df) __A,
+ (__v2df) __B, __C,
+ (__v2df) _mm_setzero_pd (),
+ (__mmask8) __U);
}
extern __inline __m128
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm_reduce_ss (__m128 __A, __m128 __B, int __C)
{
- return (__m128) __builtin_ia32_reducess ((__v4sf) __A,
- (__v4sf) __B, __C);
+ return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A,
+ (__v4sf) __B, __C,
+ (__v4sf) _mm_setzero_ps (),
+ (__mmask8) -1);
+}
+
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_reduce_ss (__m128 __W, __mmask8 __U, __m128 __A,
+ __m128 __B, int __C)
+{
+ return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A,
+ (__v4sf) __B, __C,
+ (__v4sf) __W,
+ (__mmask8) __U);
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_reduce_ss (__mmask8 __U, __m128 __A, __m128 __B, int __C)
+{
+ return (__m128) __builtin_ia32_reducess_mask ((__v4sf) __A,
+ (__v4sf) __B, __C,
+ (__v4sf) _mm_setzero_ps (),
+ (__mmask8) __U);
}
extern __inline __m128d
@@ -2449,12 +2496,34 @@ _mm512_fpclass_ps_mask (__m512 __A, const int __imm)
(int) (c),(__mmask8)-1))
#define _mm_reduce_sd(A, B, C) \
- ((__m128d) __builtin_ia32_reducesd ((__v2df)(__m128d)(A), \
- (__v2df)(__m128d)(B), (int)(C))) \
+ ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A), \
+ (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \
+ (__mmask8)-1))
+
+#define _mm_mask_reduce_sd(W, U, A, B, C) \
+ ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A), \
+ (__v2df)(__m128d)(B), (int)(C), (__v2df)(__m128d)(W), (__mmask8)(U)))
+
+#define _mm_maskz_reduce_sd(U, A, B, C) \
+ ((__m128d) __builtin_ia32_reducesd_mask ((__v2df)(__m128d)(A), \
+ (__v2df)(__m128d)(B), (int)(C), (__v2df) _mm_setzero_pd (), \
+ (__mmask8)(U)))
#define _mm_reduce_ss(A, B, C) \
- ((__m128) __builtin_ia32_reducess ((__v4sf)(__m128)(A), \
- (__v4sf)(__m128)(A), (int)(C))) \
+ ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A), \
+ (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \
+ (__mmask8)-1))
+
+#define _mm_mask_reduce_ss(W, U, A, B, C) \
+ ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A), \
+ (__v4sf)(__m128)(B), (int)(C), (__v4sf)(__m128)(W), (__mmask8)(U)))
+
+#define _mm_maskz_reduce_ss(U, A, B, C) \
+ ((__m128) __builtin_ia32_reducess_mask ((__v4sf)(__m128)(A), \
+ (__v4sf)(__m128)(B), (int)(C), (__v4sf) _mm_setzero_ps (), \
+ (__mmask8)(U)))
+
+
#endif
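A short, hedged usage sketch of the new masked scalar reduce forms. Compiling with -mavx512dq is assumed; the immediate operand selects the reduction parameters and must be a compile-time constant, and 0 is used purely as a placeholder.

    #include <immintrin.h>

    /* Illustrative only: zero-masked and merge-masked reduce on the low
       double, with immediate 0 as a placeholder reduction control.  */
    __m128d
    reduce_demo (__m128d a, __m128d b, __m128d w, __mmask8 m)
    {
      __m128d z = _mm_maskz_reduce_sd (m, a, b, 0); /* zero where the mask bit is clear */
      return _mm_mask_reduce_sd (w, m, z, b, 0);    /* take from W where the mask bit is clear */
    }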
diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index 72f57f7b6c9..5dc5fae1081 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -14005,6 +14005,326 @@ _mm512_mask_cmp_pd_mask (__mmask8 __U, __m512d __X, __m512d __Y, const int __P)
extern __inline __mmask8
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpeq_pd_mask (__m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_EQ_OQ,
+ (__mmask8) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpeq_pd_mask (__mmask8 __U, __m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_EQ_OQ,
+ (__mmask8) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmplt_pd_mask (__m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_LT_OS,
+ (__mmask8) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmplt_pd_mask (__mmask8 __U, __m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_LT_OS,
+ (__mmask8) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmple_pd_mask (__m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_LE_OS,
+ (__mmask8) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmple_pd_mask (__mmask8 __U, __m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_LE_OS,
+ (__mmask8) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpunord_pd_mask (__m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_UNORD_Q,
+ (__mmask8) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpunord_pd_mask (__mmask8 __U, __m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_UNORD_Q,
+ (__mmask8) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpneq_pd_mask (__m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_NEQ_UQ,
+ (__mmask8) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpneq_pd_mask (__mmask8 __U, __m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_NEQ_UQ,
+ (__mmask8) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpnlt_pd_mask (__m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_NLT_US,
+ (__mmask8) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpnlt_pd_mask (__mmask8 __U, __m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_NLT_US,
+ (__mmask8) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpnle_pd_mask (__m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_NLE_US,
+ (__mmask8) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpnle_pd_mask (__mmask8 __U, __m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_NLE_US,
+ (__mmask8) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpord_pd_mask (__m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_ORD_Q,
+ (__mmask8) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpord_pd_mask (__mmask8 __U, __m512d __X, __m512d __Y)
+{
+ return (__mmask8) __builtin_ia32_cmppd512_mask ((__v8df) __X,
+ (__v8df) __Y, _CMP_ORD_Q,
+ (__mmask8) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpeq_ps_mask (__m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_EQ_OQ,
+ (__mmask16) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpeq_ps_mask (__mmask16 __U, __m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_EQ_OQ,
+ (__mmask16) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmplt_ps_mask (__m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_LT_OS,
+ (__mmask16) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmplt_ps_mask (__mmask16 __U, __m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_LT_OS,
+ (__mmask16) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmple_ps_mask (__m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_LE_OS,
+ (__mmask16) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmple_ps_mask (__mmask16 __U, __m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_LE_OS,
+ (__mmask16) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpunord_ps_mask (__m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_UNORD_Q,
+ (__mmask16) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpunord_ps_mask (__mmask16 __U, __m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_UNORD_Q,
+ (__mmask16) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpneq_ps_mask (__m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_NEQ_UQ,
+ (__mmask16) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpneq_ps_mask (__mmask16 __U, __m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_NEQ_UQ,
+ (__mmask16) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpnlt_ps_mask (__m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_NLT_US,
+ (__mmask16) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpnlt_ps_mask (__mmask16 __U, __m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_NLT_US,
+ (__mmask16) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpnle_ps_mask (__m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_NLE_US,
+ (__mmask16) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpnle_ps_mask (__mmask16 __U, __m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_NLE_US,
+ (__mmask16) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cmpord_ps_mask (__m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_ORD_Q,
+ (__mmask16) -1,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cmpord_ps_mask (__mmask16 __U, __m512 __X, __m512 __Y)
+{
+ return (__mmask16) __builtin_ia32_cmpps512_mask ((__v16sf) __X,
+ (__v16sf) __Y, _CMP_ORD_Q,
+ (__mmask16) __U,
+ _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm_cmp_sd_mask (__m128d __X, __m128d __Y, const int __P)
{
return (__mmask8) __builtin_ia32_cmpsd_mask ((__v2df) __X,
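The named comparisons added above are thin wrappers over the generic cmp builtins with a fixed predicate; a minimal usage sketch follows (compiling with -mavx512f is assumed).

    #include <immintrin.h>

    /* Count how many of the 8 doubles in X compare strictly less than the
       corresponding element of Y, using the new named comparison.  */
    int
    count_lt (__m512d x, __m512d y)
    {
      __mmask8 m = _mm512_cmplt_pd_mask (x, y);  /* ordered less-than per element */
      return __builtin_popcount ((unsigned int) m);
    }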
diff --git a/gcc/config/i386/cet.c b/gcc/config/i386/cet.c
new file mode 100644
index 00000000000..a53c499fd92
--- /dev/null
+++ b/gcc/config/i386/cet.c
@@ -0,0 +1,76 @@
+/* Functions for CET/x86.
+ Copyright (C) 2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "output.h"
+#include "linux-common.h"
+
+void
+file_end_indicate_exec_stack_and_cet (void)
+{
+ file_end_indicate_exec_stack ();
+
+ if (flag_cf_protection == CF_NONE)
+ return;
+
+ unsigned int feature_1 = 0;
+
+ if (TARGET_IBT)
+ /* GNU_PROPERTY_X86_FEATURE_1_IBT. */
+ feature_1 |= 0x1;
+
+ if (TARGET_SHSTK)
+ /* GNU_PROPERTY_X86_FEATURE_1_SHSTK. */
+ feature_1 |= 0x2;
+
+ if (feature_1)
+ {
+ int p2align = ptr_mode == SImode ? 2 : 3;
+
+ /* Generate GNU_PROPERTY_X86_FEATURE_1_XXX. */
+ switch_to_section (get_section (".note.gnu.property",
+ SECTION_NOTYPE, NULL));
+
+ ASM_OUTPUT_ALIGN (asm_out_file, p2align);
+ /* name length. */
+ fprintf (asm_out_file, ASM_LONG " 1f - 0f\n");
+ /* data length. */
+ fprintf (asm_out_file, ASM_LONG " 4f - 1f\n");
+ /* note type: NT_GNU_PROPERTY_TYPE_0. */
+ fprintf (asm_out_file, ASM_LONG " 5\n");
+ ASM_OUTPUT_LABEL (asm_out_file, "0");
+ /* vendor name: "GNU". */
+ fprintf (asm_out_file, STRING_ASM_OP " \"GNU\"\n");
+ ASM_OUTPUT_LABEL (asm_out_file, "1");
+ ASM_OUTPUT_ALIGN (asm_out_file, p2align);
+ /* pr_type: GNU_PROPERTY_X86_FEATURE_1_AND. */
+ fprintf (asm_out_file, ASM_LONG " 0xc0000002\n");
+ /* pr_datasz. */
+ fprintf (asm_out_file, ASM_LONG " 3f - 2f\n");
+ ASM_OUTPUT_LABEL (asm_out_file, "2");
+ /* GNU_PROPERTY_X86_FEATURE_1_XXX. */
+ fprintf (asm_out_file, ASM_LONG " 0x%x\n", feature_1);
+ ASM_OUTPUT_LABEL (asm_out_file, "3");
+ ASM_OUTPUT_ALIGN (asm_out_file, p2align);
+ ASM_OUTPUT_LABEL (asm_out_file, "4");
+ }
+}
diff --git a/gcc/config/i386/cetintrin.h b/gcc/config/i386/cetintrin.h
new file mode 100644
index 00000000000..b15a776d7f8
--- /dev/null
+++ b/gcc/config/i386/cetintrin.h
@@ -0,0 +1,134 @@
+/* Copyright (C) 2015-2017 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ Under Section 7 of GPL version 3, you are granted additional
+ permissions described in the GCC Runtime Library Exception, version
+ 3.1, as published by the Free Software Foundation.
+
+ You should have received a copy of the GNU General Public License and
+ a copy of the GCC Runtime Library Exception along with this program;
+ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+ <http://www.gnu.org/licenses/>. */
+
+#if !defined _IMMINTRIN_H_INCLUDED
+# error "Never use <cetintrin.h> directly; include <x86intrin.h> instead."
+#endif
+
+#ifndef _CETINTRIN_H_INCLUDED
+#define _CETINTRIN_H_INCLUDED
+
+#ifndef __SHSTK__
+#pragma GCC push_options
+#pragma GCC target ("shstk")
+#define __DISABLE_SHSTK__
+#endif /* __SHSTK__ */
+
+extern __inline unsigned int
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_rdsspd (unsigned int __B)
+{
+ return __builtin_ia32_rdsspd (__B);
+}
+
+#ifdef __x86_64__
+extern __inline unsigned long long
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_rdsspq (unsigned long long __B)
+{
+ return __builtin_ia32_rdsspq (__B);
+}
+#endif
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_incsspd (unsigned int __B)
+{
+ __builtin_ia32_incsspd (__B);
+}
+
+#ifdef __x86_64__
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_incsspq (unsigned long long __B)
+{
+ __builtin_ia32_incsspq (__B);
+}
+#endif
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_saveprevssp (void)
+{
+ __builtin_ia32_saveprevssp ();
+}
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_rstorssp (void *__B)
+{
+ __builtin_ia32_rstorssp (__B);
+}
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_wrssd (unsigned int __B, void *__C)
+{
+ __builtin_ia32_wrssd (__B, __C);
+}
+
+#ifdef __x86_64__
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_wrssq (unsigned long long __B, void *__C)
+{
+ __builtin_ia32_wrssq (__B, __C);
+}
+#endif
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_wrussd (unsigned int __B, void *__C)
+{
+ __builtin_ia32_wrussd (__B, __C);
+}
+
+#ifdef __x86_64__
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_wrussq (unsigned long long __B, void *__C)
+{
+ __builtin_ia32_wrussq (__B, __C);
+}
+#endif
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_setssbsy (void)
+{
+ __builtin_ia32_setssbsy ();
+}
+
+extern __inline void
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_clrssbsy (void *__B)
+{
+ __builtin_ia32_clrssbsy (__B);
+}
+
+#ifdef __DISABLE_SHSTK__
+#undef __DISABLE_SHSTK__
+#pragma GCC pop_options
+#endif /* __DISABLE_SHSTK__ */
+
+#endif /* _CETINTRIN_H_INCLUDED. */
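A hedged usage sketch for the shadow-stack intrinsics declared above. Compiling with -mshstk (the option name matching the driver output below) is assumed; the behaviour of RDSSP as a no-op when shadow stacks are inactive is also an assumption to verify on the target.

    #include <x86intrin.h>

    /* Read the current shadow-stack pointer.  Passing 0 means a zero result
       can be read as "shadow stacks not active", assuming RDSSP behaves as a
       NOP when CET is not enabled.  */
    unsigned long long
    current_ssp (void)
    {
    #ifdef __x86_64__
      return _rdsspq (0);
    #else
      return _rdsspd (0);
    #endif
    }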
diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 98c05c9ebab..619b465f059 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -332,6 +332,11 @@
of it satisfies the e constraint."
(match_operand 0 "x86_64_hilo_int_operand"))
+(define_constraint "Wf"
+ "32-bit signed integer constant zero extended from word size
+ to double word size."
+ (match_operand 0 "x86_64_dwzext_immediate_operand"))
+
(define_constraint "Z"
"32-bit unsigned integer constant, or a symbolic reference known
to fit that range (for immediate operands in zero-extending x86-64
diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index b3b0f912c98..8cb1848dff5 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -97,12 +97,15 @@
#define bit_AVX512VBMI (1 << 1)
#define bit_PKU (1 << 3)
#define bit_OSPKE (1 << 4)
+#define bit_SHSTK (1 << 7)
+#define bit_GFNI (1 << 8)
#define bit_AVX512VPOPCNTDQ (1 << 14)
#define bit_RDPID (1 << 22)
/* %edx */
#define bit_AVX5124VNNIW (1 << 2)
#define bit_AVX5124FMAPS (1 << 3)
+#define bit_IBT (1 << 20)
/* XFEATURE_ENABLED_MASK register bits (%eax == 13, %ecx == 0) */
#define bit_BNDREGS (1 << 3)
diff --git a/gcc/config/i386/cygming.h b/gcc/config/i386/cygming.h
index a731e2f6c6a..1ed9b170d43 100644
--- a/gcc/config/i386/cygming.h
+++ b/gcc/config/i386/cygming.h
@@ -19,7 +19,6 @@ along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#define DBX_DEBUGGING_INFO 1
-#define SDB_DEBUGGING_INFO 1
#if TARGET_64BIT_DEFAULT || defined (HAVE_GAS_PE_SECREL32_RELOC)
#define DWARF2_DEBUGGING_INFO 1
#endif
@@ -308,8 +307,7 @@ do { \
#define TARGET_SECTION_TYPE_FLAGS i386_pe_section_type_flags
/* Write the extra assembler code needed to declare a function
- properly. If we are generating SDB debugging information, this
- will happen automatically, so we only need to handle other cases. */
+ properly. */
#undef ASM_DECLARE_FUNCTION_NAME
#define ASM_DECLARE_FUNCTION_NAME(FILE, NAME, DECL) \
i386_pe_start_function (FILE, NAME, DECL)
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index 4e7fda68281..973abddcc67 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -417,6 +417,8 @@ const char *host_detect_local_cpu (int argc, const char **argv)
unsigned int has_avx512vbmi = 0, has_avx512ifma = 0, has_clwb = 0;
unsigned int has_mwaitx = 0, has_clzero = 0, has_pku = 0, has_rdpid = 0;
unsigned int has_avx5124fmaps = 0, has_avx5124vnniw = 0;
+ unsigned int has_gfni = 0;
+ unsigned int has_ibt = 0, has_shstk = 0;
bool arch;
@@ -506,9 +508,13 @@ const char *host_detect_local_cpu (int argc, const char **argv)
has_avx512vbmi = ecx & bit_AVX512VBMI;
has_pku = ecx & bit_OSPKE;
has_rdpid = ecx & bit_RDPID;
+ has_gfni = ecx & bit_GFNI;
has_avx5124vnniw = edx & bit_AVX5124VNNIW;
has_avx5124fmaps = edx & bit_AVX5124FMAPS;
+
+ has_shstk = ecx & bit_SHSTK;
+ has_ibt = edx & bit_IBT;
}
if (max_level >= 13)
@@ -1050,6 +1056,9 @@ const char *host_detect_local_cpu (int argc, const char **argv)
const char *clzero = has_clzero ? " -mclzero" : " -mno-clzero";
const char *pku = has_pku ? " -mpku" : " -mno-pku";
const char *rdpid = has_rdpid ? " -mrdpid" : " -mno-rdpid";
+ const char *gfni = has_gfni ? " -mgfni" : " -mno-gfni";
+ const char *ibt = has_ibt ? " -mibt" : " -mno-ibt";
+ const char *shstk = has_shstk ? " -mshstk" : " -mno-shstk";
options = concat (options, mmx, mmx3dnow, sse, sse2, sse3, ssse3,
sse4a, cx16, sahf, movbe, aes, sha, pclmul,
popcnt, abm, lwp, fma, fma4, xop, bmi, sgx, bmi2,
@@ -1059,7 +1068,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
avx512cd, avx512pf, prefetchwt1, clflushopt,
xsavec, xsaves, avx512dq, avx512bw, avx512vl,
avx512ifma, avx512vbmi, avx5124fmaps, avx5124vnniw,
- clwb, mwaitx, clzero, pku, rdpid, NULL);
+ clwb, mwaitx, clzero, pku, rdpid, gfni, ibt, shstk, NULL);
}
done:
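For completeness, a hedged sketch of how the same CPUID leaf-7 bits can be probed from user code with the helpers in <cpuid.h>; the bit_* names are the ones added above, and the availability of __get_cpuid_count in this cpuid.h is assumed.

    #include <cpuid.h>
    #include <stdio.h>

    int
    main (void)
    {
      unsigned int eax, ebx, ecx, edx;

      /* CPUID leaf 7, subleaf 0: the feature bits tested by driver-i386.c.  */
      if (!__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
        return 1;

      printf ("gfni:  %d\n", (ecx & bit_GFNI) != 0);
      printf ("shstk: %d\n", (ecx & bit_SHSTK) != 0);
      printf ("ibt:   %d\n", (edx & bit_IBT) != 0);
      return 0;
    }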
diff --git a/gcc/config/i386/gas.h b/gcc/config/i386/gas.h
index 9b42787f6b1..862c1c2cb83 100644
--- a/gcc/config/i386/gas.h
+++ b/gcc/config/i386/gas.h
@@ -40,10 +40,6 @@ along with GCC; see the file COPYING3. If not see
#undef DBX_NO_XREFS
#undef DBX_CONTIN_LENGTH
-/* Ask for COFF symbols. */
-
-#define SDB_DEBUGGING_INFO 1
-
/* Output #ident as a .ident. */
#undef TARGET_ASM_OUTPUT_IDENT
diff --git a/gcc/config/i386/gfniintrin.h b/gcc/config/i386/gfniintrin.h
new file mode 100644
index 00000000000..f4ca01c5b11
--- /dev/null
+++ b/gcc/config/i386/gfniintrin.h
@@ -0,0 +1,229 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ Under Section 7 of GPL version 3, you are granted additional
+ permissions described in the GCC Runtime Library Exception, version
+ 3.1, as published by the Free Software Foundation.
+
+ You should have received a copy of the GNU General Public License and
+ a copy of the GCC Runtime Library Exception along with this program;
+ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef _IMMINTRIN_H_INCLUDED
+#error "Never use <gfniintrin.h> directly; include <immintrin.h> instead."
+#endif
+
+#ifndef _GFNIINTRIN_H_INCLUDED
+#define _GFNIINTRIN_H_INCLUDED
+
+#ifndef __GFNI__
+#pragma GCC push_options
+#pragma GCC target("gfni")
+#define __DISABLE_GFNI__
+#endif /* __GFNI__ */
+
+#ifdef __OPTIMIZE__
+extern __inline __m128i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_gf2p8affineinv_epi64_epi8 (__m128i __A, __m128i __B, const int __C)
+{
+ return (__m128i) __builtin_ia32_vgf2p8affineinvqb_v16qi ((__v16qi) __A,
+ (__v16qi) __B,
+ __C);
+}
+#else
+#define _mm_gf2p8affineinv_epi64_epi8(A, B, C) \
+ ((__m128i) __builtin_ia32_vgf2p8affineinvqb_v16qi((__v16qi)(__m128i)(A), \
+ (__v16qi)(__m128i)(B), (int)(C)))
+#endif
+
+#ifdef __DISABLE_GFNI__
+#undef __DISABLE_GFNI__
+#pragma GCC pop_options
+#endif /* __DISABLE_GFNI__ */
+
+#if !defined(__GFNI__) || !defined(__AVX__)
+#pragma GCC push_options
+#pragma GCC target("gfni,avx")
+#define __DISABLE_GFNIAVX__
+#endif /* __GFNIAVX__ */
+
+#ifdef __OPTIMIZE__
+extern __inline __m256i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_gf2p8affineinv_epi64_epi8 (__m256i __A, __m256i __B, const int __C)
+{
+ return (__m256i) __builtin_ia32_vgf2p8affineinvqb_v32qi ((__v32qi) __A,
+ (__v32qi) __B,
+ __C);
+}
+#else
+#define _mm256_gf2p8affineinv_epi64_epi8(A, B, C) \
+ ((__m256i) __builtin_ia32_vgf2p8affineinvqb_v32qi((__v32qi)(__m256i)(A), \
+ (__v32qi)(__m256i)(B), \
+ (int)(C)))
+#endif
+
+#ifdef __DISABLE_GFNIAVX__
+#undef __DISABLE_GFNIAVX__
+#pragma GCC pop_options
+#endif /* __GFNIAVX__ */
+
+#if !defined(__GFNI__) || !defined(__AVX512VL__)
+#pragma GCC push_options
+#pragma GCC target("gfni,avx512vl")
+#define __DISABLE_GFNIAVX512VL__
+#endif /* __GFNIAVX512VL__ */
+
+#ifdef __OPTIMIZE__
+extern __inline __m128i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_gf2p8affineinv_epi64_epi8 (__m128i __A, __mmask16 __B, __m128i __C,
+ __m128i __D, const int __E)
+{
+ return (__m128i) __builtin_ia32_vgf2p8affineinvqb_v16qi_mask ((__v16qi) __C,
+ (__v16qi) __D,
+ __E,
+ (__v16qi)__A,
+ __B);
+}
+
+extern __inline __m128i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_gf2p8affineinv_epi64_epi8 (__mmask16 __A, __m128i __B, __m128i __C,
+ const int __D)
+{
+ return (__m128i) __builtin_ia32_vgf2p8affineinvqb_v16qi_mask ((__v16qi) __B,
+ (__v16qi) __C, __D,
+ (__v16qi) _mm_setzero_si128 (),
+ __A);
+}
+#else
+#define _mm_mask_gf2p8affineinv_epi64_epi8(A, B, C, D, E) \
+ ((__m128i) __builtin_ia32_vgf2p8affineinvqb_v16qi_mask( \
+ (__v16qi)(__m128i)(C), (__v16qi)(__m128i)(D), \
+ (int)(E), (__v16qi)(__m128i)(A), (__mmask16)(B)))
+#define _mm_maskz_gf2p8affineinv_epi64_epi8(A, B, C, D) \
+ ((__m128i) __builtin_ia32_vgf2p8affineinvqb_v16qi_mask( \
+ (__v16qi)(__m128i)(B), (__v16qi)(__m128i)(C), \
+ (int)(D), (__v16qi)(__m128i) _mm_setzero_si128 (), \
+ (__mmask16)(A)))
+#endif
+
+#ifdef __DISABLE_GFNIAVX512VL__
+#undef __DISABLE_GFNIAVX512VL__
+#pragma GCC pop_options
+#endif /* __GFNIAVX512VL__ */
+
+#if !defined(__GFNI__) || !defined(__AVX512VL__) || !defined(__AVX512BW__)
+#pragma GCC push_options
+#pragma GCC target("gfni,avx512vl,avx512bw")
+#define __DISABLE_GFNIAVX512VLBW__
+#endif /* __GFNIAVX512VLBW__ */
+
+#ifdef __OPTIMIZE__
+extern __inline __m256i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask_gf2p8affineinv_epi64_epi8 (__m256i __A, __mmask32 __B,
+ __m256i __C, __m256i __D, const int __E)
+{
+ return (__m256i) __builtin_ia32_vgf2p8affineinvqb_v32qi_mask ((__v32qi) __C,
+ (__v32qi) __D,
+ __E,
+ (__v32qi)__A,
+ __B);
+}
+
+extern __inline __m256i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_maskz_gf2p8affineinv_epi64_epi8 (__mmask32 __A, __m256i __B,
+ __m256i __C, const int __D)
+{
+ return (__m256i) __builtin_ia32_vgf2p8affineinvqb_v32qi_mask ((__v32qi) __B,
+ (__v32qi) __C, __D,
+ (__v32qi) _mm256_setzero_si256 (), __A);
+}
+#else
+#define _mm256_mask_gf2p8affineinv_epi64_epi8(A, B, C, D, E) \
+ ((__m256i) __builtin_ia32_vgf2p8affineinvqb_v32qi_mask( \
+ (__v32qi)(__m256i)(C), (__v32qi)(__m256i)(D), (int)(E), \
+ (__v32qi)(__m256i)(A), (__mmask32)(B)))
+#define _mm256_maskz_gf2p8affineinv_epi64_epi8(A, B, C, D) \
+ ((__m256i) __builtin_ia32_vgf2p8affineinvqb_v32qi_mask( \
+ (__v32qi)(__m256i)(B), (__v32qi)(__m256i)(C), (int)(D), \
+ (__v32qi)(__m256i) _mm256_setzero_si256 (), (__mmask32)(A)))
+#endif
+
+#ifdef __DISABLE_GFNIAVX512VLBW__
+#undef __DISABLE_GFNIAVX512VLBW__
+#pragma GCC pop_options
+#endif /* __GFNIAVX512VLBW__ */
+
+#if !defined(__GFNI__) || !defined(__AVX512F__) || !defined(__AVX512BW__)
+#pragma GCC push_options
+#pragma GCC target("gfni,avx512f,avx512bw")
+#define __DISABLE_GFNIAVX512FBW__
+#endif /* __GFNIAVX512FBW__ */
+
+#ifdef __OPTIMIZE__
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_gf2p8affineinv_epi64_epi8 (__m512i __A, __mmask64 __B, __m512i __C,
+ __m512i __D, const int __E)
+{
+ return (__m512i) __builtin_ia32_vgf2p8affineinvqb_v64qi_mask ((__v64qi) __C,
+ (__v64qi) __D,
+ __E,
+ (__v64qi)__A,
+ __B);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_gf2p8affineinv_epi64_epi8 (__mmask64 __A, __m512i __B,
+ __m512i __C, const int __D)
+{
+ return (__m512i) __builtin_ia32_vgf2p8affineinvqb_v64qi_mask ((__v64qi) __B,
+ (__v64qi) __C, __D,
+ (__v64qi) _mm512_setzero_si512 (), __A);
+}
+
+extern __inline __m512i
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_gf2p8affineinv_epi64_epi8 (__m512i __A, __m512i __B, const int __C)
+{
+ return (__m512i) __builtin_ia32_vgf2p8affineinvqb_v64qi ((__v64qi) __A,
+ (__v64qi) __B, __C);
+}
+#else
+#define _mm512_mask_gf2p8affineinv_epi64_epi8(A, B, C, D, E) \
+ ((__m512i) __builtin_ia32_vgf2p8affineinvqb_v64qi_mask( \
+ (__v64qi)(__m512i)(C), (__v64qi)(__m512i)(D), (int)(E), \
+ (__v64qi)(__m512i)(A), (__mmask64)(B)))
+#define _mm512_maskz_gf2p8affineinv_epi64_epi8(A, B, C, D) \
+ ((__m512i) __builtin_ia32_vgf2p8affineinvqb_v64qi_mask( \
+ (__v64qi)(__m512i)(B), (__v64qi)(__m512i)(C), (int)(D), \
+ (__v64qi)(__m512i) _mm512_setzero_si512 (), (__mmask64)(A)))
+#define _mm512_gf2p8affineinv_epi64_epi8(A, B, C) \
+ ((__m512i) __builtin_ia32_vgf2p8affineinvqb_v64qi ( \
+ (__v64qi)(__m512i)(A), (__v64qi)(__m512i)(B), (int)(C)))
+#endif
+
+#ifdef __DISABLE_GFNIAVX512FBW__
+#undef __DISABLE_GFNIAVX512FBW__
+#pragma GCC pop_options
+#endif /* __GFNIAVX512FBW__ */
+
+#endif /* _GFNIINTRIN_H_INCLUDED */
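A minimal usage sketch for the 128-bit affine-inverse form declared above. Compiling with -mgfni is assumed; the matrix operand and the immediate 0 are placeholders for illustration, not meaningful GF(2^8) parameters.

    #include <immintrin.h>

    /* Apply the GF(2^8) affine inverse transform to each byte of X, using B
       as the 8x8 bit matrix and 0 as the constant term.  The matrix value
       here is a placeholder.  */
    __m128i
    gf_inv_bytes (__m128i x)
    {
      __m128i b = _mm_set1_epi64x (0x0102040810204080ULL); /* placeholder matrix */
      return _mm_gf2p8affineinv_epi64_epi8 (x, b, 0);
    }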
diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
index 8d584dbe940..5b3b96ea2d0 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -286,7 +286,9 @@ DEF_FUNCTION_TYPE (V8SI, V8SI)
DEF_FUNCTION_TYPE (VOID, PCVOID)
DEF_FUNCTION_TYPE (VOID, PVOID)
DEF_FUNCTION_TYPE (VOID, UINT64)
+DEF_FUNCTION_TYPE (VOID, UINT64, PVOID)
DEF_FUNCTION_TYPE (VOID, UNSIGNED)
+DEF_FUNCTION_TYPE (VOID, UNSIGNED, PVOID)
DEF_FUNCTION_TYPE (INT, PUSHORT)
DEF_FUNCTION_TYPE (INT, PUNSIGNED)
DEF_FUNCTION_TYPE (INT, PULONGLONG)
@@ -1210,3 +1212,9 @@ DEF_FUNCTION_TYPE (BND, BND, BND)
DEF_FUNCTION_TYPE (PVOID, PCVOID, BND, ULONG)
DEF_FUNCTION_TYPE (ULONG, VOID)
DEF_FUNCTION_TYPE (PVOID, BND)
+
+# GFNI builtins
+DEF_FUNCTION_TYPE (V64QI, V64QI, V64QI, INT)
+DEF_FUNCTION_TYPE (V64QI, V64QI, V64QI, INT, V64QI, UDI)
+DEF_FUNCTION_TYPE (V32QI, V32QI, V32QI, INT, V32QI, USI)
+DEF_FUNCTION_TYPE (V16QI, V16QI, V16QI, INT, V16QI, UHI)
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 0d5d5b74675..76e5f0fafdd 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -1666,8 +1666,8 @@ BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, CODE_FOR_reducepv4df
BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, CODE_FOR_reducepv2df_mask, "__builtin_ia32_reducepd128_mask", IX86_BUILTIN_REDUCEPD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_INT_V2DF_UQI)
BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, CODE_FOR_reducepv8sf_mask, "__builtin_ia32_reduceps256_mask", IX86_BUILTIN_REDUCEPS256_MASK, UNKNOWN, (int) V8SF_FTYPE_V8SF_INT_V8SF_UQI)
BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, CODE_FOR_reducepv4sf_mask, "__builtin_ia32_reduceps128_mask", IX86_BUILTIN_REDUCEPS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_INT_V4SF_UQI)
-BDESC (OPTION_MASK_ISA_AVX512DQ, CODE_FOR_reducesv2df, "__builtin_ia32_reducesd", IX86_BUILTIN_REDUCESD_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT)
-BDESC (OPTION_MASK_ISA_AVX512DQ, CODE_FOR_reducesv4sf, "__builtin_ia32_reducess", IX86_BUILTIN_REDUCESS_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT)
+BDESC (OPTION_MASK_ISA_AVX512DQ, CODE_FOR_reducesv2df_mask, "__builtin_ia32_reducesd_mask", IX86_BUILTIN_REDUCESD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512DQ, CODE_FOR_reducesv4sf_mask, "__builtin_ia32_reducess_mask", IX86_BUILTIN_REDUCESS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI)
BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, CODE_FOR_avx512vl_permvarv16hi_mask, "__builtin_ia32_permvarhi256_mask", IX86_BUILTIN_VPERMVARHI256_MASK, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI)
BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, CODE_FOR_avx512vl_permvarv8hi_mask, "__builtin_ia32_permvarhi128_mask", IX86_BUILTIN_VPERMVARHI128_MASK, UNKNOWN, (int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI)
BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, CODE_FOR_avx512vl_vpermt2varv16hi3_mask, "__builtin_ia32_vpermt2varhi256_mask", IX86_BUILTIN_VPERMT2VARHI256, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI)
@@ -2589,6 +2589,13 @@ BDESC (OPTION_MASK_ISA_AVX512VPOPCNTDQ, CODE_FOR_vpopcountv8di_mask, "__builtin_
/* RDPID */
BDESC (OPTION_MASK_ISA_RDPID, CODE_FOR_rdpid, "__builtin_ia32_rdpid", IX86_BUILTIN_RDPID, UNKNOWN, (int) UNSIGNED_FTYPE_VOID)
+/* GFNI */
+BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v64qi, "__builtin_ia32_vgf2p8affineinvqb_v64qi", IX86_BUILTIN_VGF2P8AFFINEINVQB512, UNKNOWN, (int) V64QI_FTYPE_V64QI_V64QI_INT)
+BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW, CODE_FOR_vgf2p8affineinvqb_v64qi_mask, "__builtin_ia32_vgf2p8affineinvqb_v64qi_mask", IX86_BUILTIN_VGF2P8AFFINEINVQB512MASK, UNKNOWN, (int) V64QI_FTYPE_V64QI_V64QI_INT_V64QI_UDI)
+BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v32qi, "__builtin_ia32_vgf2p8affineinvqb_v32qi", IX86_BUILTIN_VGF2P8AFFINEINVQB256, UNKNOWN, (int) V32QI_FTYPE_V32QI_V32QI_INT)
+BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW, CODE_FOR_vgf2p8affineinvqb_v32qi_mask, "__builtin_ia32_vgf2p8affineinvqb_v32qi_mask", IX86_BUILTIN_VGF2P8AFFINEINVQB256MASK, UNKNOWN, (int) V32QI_FTYPE_V32QI_V32QI_INT_V32QI_USI)
+BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v16qi, "__builtin_ia32_vgf2p8affineinvqb_v16qi", IX86_BUILTIN_VGF2P8AFFINEINVQB128, UNKNOWN, (int) V16QI_FTYPE_V16QI_V16QI_INT)
+BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW, CODE_FOR_vgf2p8affineinvqb_v16qi_mask, "__builtin_ia32_vgf2p8affineinvqb_v16qi_mask", IX86_BUILTIN_VGF2P8AFFINEINVQB128MASK, UNKNOWN, (int) V16QI_FTYPE_V16QI_V16QI_INT_V16QI_UHI)
BDESC_END (ARGS2, MPX)
/* Builtins for MPX. */
@@ -2779,4 +2786,25 @@ BDESC (OPTION_MASK_ISA_XOP, CODE_FOR_xop_vpermil2v4sf3, "__builtin_ia32_vper
BDESC (OPTION_MASK_ISA_XOP, CODE_FOR_xop_vpermil2v4df3, "__builtin_ia32_vpermil2pd256", IX86_BUILTIN_VPERMIL2PD256, UNKNOWN, (int)MULTI_ARG_4_DF2_DI_I1)
BDESC (OPTION_MASK_ISA_XOP, CODE_FOR_xop_vpermil2v8sf3, "__builtin_ia32_vpermil2ps256", IX86_BUILTIN_VPERMIL2PS256, UNKNOWN, (int)MULTI_ARG_4_SF2_SI_I1)
-BDESC_END (MULTI_ARG, MAX)
+BDESC_END (MULTI_ARG, CET)
+
+/* CET. */
+BDESC_FIRST (cet, CET,
+ OPTION_MASK_ISA_SHSTK, CODE_FOR_incsspsi, "__builtin_ia32_incsspd", IX86_BUILTIN_INCSSPD, UNKNOWN, (int) VOID_FTYPE_UNSIGNED)
+BDESC (OPTION_MASK_ISA_SHSTK | OPTION_MASK_ISA_64BIT, CODE_FOR_incsspdi, "__builtin_ia32_incsspq", IX86_BUILTIN_INCSSPQ, UNKNOWN, (int) VOID_FTYPE_UINT64)
+BDESC (OPTION_MASK_ISA_SHSTK, CODE_FOR_saveprevssp, "__builtin_ia32_saveprevssp", IX86_BUILTIN_SAVEPREVSSP, UNKNOWN, (int) VOID_FTYPE_VOID)
+BDESC (OPTION_MASK_ISA_SHSTK, CODE_FOR_rstorssp, "__builtin_ia32_rstorssp", IX86_BUILTIN_RSTORSSP, UNKNOWN, (int) VOID_FTYPE_PVOID)
+BDESC (OPTION_MASK_ISA_SHSTK, CODE_FOR_wrsssi, "__builtin_ia32_wrssd", IX86_BUILTIN_WRSSD, UNKNOWN, (int) VOID_FTYPE_UNSIGNED_PVOID)
+BDESC (OPTION_MASK_ISA_SHSTK | OPTION_MASK_ISA_64BIT, CODE_FOR_wrssdi, "__builtin_ia32_wrssq", IX86_BUILTIN_WRSSQ, UNKNOWN, (int) VOID_FTYPE_UINT64_PVOID)
+BDESC (OPTION_MASK_ISA_SHSTK, CODE_FOR_wrusssi, "__builtin_ia32_wrussd", IX86_BUILTIN_WRUSSD, UNKNOWN, (int) VOID_FTYPE_UNSIGNED_PVOID)
+BDESC (OPTION_MASK_ISA_SHSTK | OPTION_MASK_ISA_64BIT, CODE_FOR_wrussdi, "__builtin_ia32_wrussq", IX86_BUILTIN_WRUSSQ, UNKNOWN, (int) VOID_FTYPE_UINT64_PVOID)
+BDESC (OPTION_MASK_ISA_SHSTK, CODE_FOR_setssbsy, "__builtin_ia32_setssbsy", IX86_BUILTIN_SETSSBSY, UNKNOWN, (int) VOID_FTYPE_VOID)
+BDESC (OPTION_MASK_ISA_SHSTK, CODE_FOR_clrssbsy, "__builtin_ia32_clrssbsy", IX86_BUILTIN_CLRSSBSY, UNKNOWN, (int) VOID_FTYPE_PVOID)
+
+BDESC_END (CET, CET_NORMAL)
+
+BDESC_FIRST (cet_rdssp, CET_NORMAL,
+ OPTION_MASK_ISA_SHSTK, CODE_FOR_rdsspsi, "__builtin_ia32_rdsspd", IX86_BUILTIN_RDSSPD, UNKNOWN, (int) UINT_FTYPE_UINT)
+BDESC (OPTION_MASK_ISA_SHSTK | OPTION_MASK_ISA_64BIT, CODE_FOR_rdsspdi, "__builtin_ia32_rdsspq", IX86_BUILTIN_RDSSPQ, UNKNOWN, (int) UINT64_FTYPE_UINT64)
+
+BDESC_END (CET_NORMAL, MAX)
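
The table entries above only declare the builtins.  As a minimal usage sketch (not part of this patch, and assuming a compiler built from it with -mshstk enabled), the wrappers below show how the new shadow-stack builtins could be called directly; the wrapper names are invented for the example, only __builtin_ia32_rdsspd and __builtin_ia32_incsspd come from the table above.

/* Hypothetical user code, not a GCC header.  Compile with -mshstk.  */

unsigned int
read_ssp_lo (void)
{
  /* RDSSP leaves its source operand unchanged when shadow stacks are
     inactive, so seeding with 0 lets the caller detect that case.  */
  return __builtin_ia32_rdsspd (0);
}

void
skip_ssp_slot (void)
{
  /* INCSSPD advances the shadow-stack pointer by 4 bytes per count.  */
  __builtin_ia32_incsspd (1);
}
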
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 9bed360c43b..be99d01f110 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -459,6 +459,20 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
def_or_undef (parse_in, "__PKU__");
if (isa_flag2 & OPTION_MASK_ISA_RDPID)
def_or_undef (parse_in, "__RDPID__");
+ if (isa_flag2 & OPTION_MASK_ISA_GFNI)
+ def_or_undef (parse_in, "__GFNI__");
+ if (isa_flag2 & OPTION_MASK_ISA_IBT)
+ {
+ def_or_undef (parse_in, "__IBT__");
+ if (flag_cf_protection != CF_NONE)
+ def_or_undef (parse_in, "__CET__");
+ }
+ if (isa_flag2 & OPTION_MASK_ISA_SHSTK)
+ {
+ def_or_undef (parse_in, "__SHSTK__");
+ if (flag_cf_protection != CF_NONE)
+ def_or_undef (parse_in, "__CET__");
+ }
if (TARGET_IAMCU)
{
def_or_undef (parse_in, "__iamcu");
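
As a minimal sketch of how the predefines added above might be consumed (the helper below is hypothetical and not part of this patch; only the macro names __GFNI__, __IBT__, __SHSTK__ and __CET__ come from the hunk above):

/* Hypothetical consumer of the new predefined macros.  */
#include <stdio.h>

void
report_cet_features (void)
{
#ifdef __SHSTK__
  puts ("built with -mshstk: shadow-stack builtins are available");
#endif
#ifdef __IBT__
  puts ("built with -mibt: ENDBRANCH insertion is possible");
#endif
#ifdef __CET__
  puts ("-fcf-protection is active together with IBT and/or SHSTK");
#endif
}
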
diff --git a/gcc/config/i386/i386-modes.def b/gcc/config/i386/i386-modes.def
index 83216e38758..dcf6854b57d 100644
--- a/gcc/config/i386/i386-modes.def
+++ b/gcc/config/i386/i386-modes.def
@@ -39,19 +39,22 @@ ADJUST_ALIGNMENT (XF, TARGET_128BIT_LONG_DOUBLE ? 16 : 4);
For the i386, we need separate modes when floating-point
equality comparisons are being done.
- Add CCNO to indicate comparisons against zero that requires
+ Add CCNO to indicate comparisons against zero that require
Overflow flag to be unset. Sign bit test is used instead and
thus can be used to form "a&b>0" type of tests.
- Add CCGC to indicate comparisons against zero that allows
+ Add CCGC to indicate comparisons against zero that allow
unspecified garbage in the Carry flag. This mode is used
by inc/dec instructions.
- Add CCGOC to indicate comparisons against zero that allows
+ Add CCGOC to indicate comparisons against zero that allow
unspecified garbage in the Carry and Overflow flag. This
mode is used to simulate comparisons of (a-b) and (a+b)
against zero using sub/cmp/add operations.
+ Add CCGZ to indicate comparisons that allow unspecified garbage
+ in the Zero flag. This mode is used in double-word comparisons.
+
Add CCA to indicate that only the Above flag is valid.
Add CCC to indicate that only the Carry flag is valid.
Add CCO to indicate that only the Overflow flag is valid.
@@ -62,14 +65,15 @@ ADJUST_ALIGNMENT (XF, TARGET_128BIT_LONG_DOUBLE ? 16 : 4);
CC_MODE (CCGC);
CC_MODE (CCGOC);
CC_MODE (CCNO);
+CC_MODE (CCGZ);
CC_MODE (CCA);
CC_MODE (CCC);
CC_MODE (CCO);
CC_MODE (CCP);
CC_MODE (CCS);
CC_MODE (CCZ);
+
CC_MODE (CCFP);
-CC_MODE (CCFPU);
/* Vector modes. Note that VEC_CONCAT patterns require vector
sizes twice as big as implemented in hardware. */
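
To make the new CCGZ mode concrete: a double-word signed comparison can be lowered to a cmp/sbb pair whose Zero flag is meaningless afterwards, which is exactly the situation CCGZ models.  The inline-asm sketch below is illustrative only (the function name and operand split are invented, not part of this patch) and assumes an x86 target with GNU extended asm.

/* Hypothetical sketch: signed (a_hi:a_lo) < (b_hi:b_lo) via cmp + sbb.
   After the sbb only SF, OF and CF are meaningful; ZF holds garbage.  */
int
di_less_than (unsigned int a_lo, int a_hi, unsigned int b_lo, int b_hi)
{
  int hi = a_hi;
  unsigned char lt;
  __asm__ ("cmpl %3, %2\n\t"  /* Compute a_lo - b_lo, setting CF on borrow.  */
	   "sbbl %4, %1\n\t"  /* hi = a_hi - b_hi - CF.  */
	   "setl %0"          /* Signed less-than from SF != OF.  */
	   : "=q" (lt), "+r" (hi)
	   : "r" (a_lo), "g" (b_lo), "g" (b_hi)
	   : "cc");
  return lt;
}
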
diff --git a/gcc/config/i386/i386-passes.def b/gcc/config/i386/i386-passes.def
index 49534619221..5c6e9c3494e 100644
--- a/gcc/config/i386/i386-passes.def
+++ b/gcc/config/i386/i386-passes.def
@@ -29,3 +29,5 @@ along with GCC; see the file COPYING3. If not see
/* Run the 64-bit STV pass before the CSE pass so that CONST0_RTX and
CONSTM1_RTX generated by the STV pass can be CSEed. */
INSERT_PASS_BEFORE (pass_cse2, 1, pass_stv, true /* timode_p */);
+
+ INSERT_PASS_BEFORE (pass_shorten_branches, 1, pass_insert_endbranch);
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index a5d7a6c75bb..eca6d5cf196 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -158,8 +158,6 @@ extern int ix86_attr_length_immediate_default (rtx_insn *, bool);
extern int ix86_attr_length_address_default (rtx_insn *);
extern int ix86_attr_length_vex_default (rtx_insn *, bool, bool);
-extern machine_mode ix86_fp_compare_mode (enum rtx_code);
-
extern rtx ix86_libcall_value (machine_mode);
extern bool ix86_function_arg_regno_p (int);
extern void ix86_asm_output_function_label (FILE *, const char *, tree);
@@ -277,8 +275,6 @@ extern bool i386_pe_type_dllexport_p (tree);
extern int i386_pe_reloc_rw_mask (void);
-extern rtx maybe_get_pool_constant (rtx);
-
extern char internal_label_prefix[16];
extern int internal_label_prefix_len;
@@ -356,3 +352,4 @@ class rtl_opt_pass;
extern rtl_opt_pass *make_pass_insert_vzeroupper (gcc::context *);
extern rtl_opt_pass *make_pass_stv (gcc::context *);
+extern rtl_opt_pass *make_pass_insert_endbranch (gcc::context *);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 619b13b3d09..4b684522082 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -102,6 +102,9 @@ static rtx legitimize_pe_coff_symbol (rtx, bool);
static void ix86_print_operand_address_as (FILE *, rtx, addr_space_t, bool);
static bool ix86_save_reg (unsigned int, bool, bool);
static bool ix86_function_naked (const_tree);
+static bool ix86_notrack_prefixed_insn_p (rtx);
+static void ix86_emit_restore_reg_using_pop (rtx);
+
#ifndef CHECK_STACK_LIMIT
#define CHECK_STACK_LIMIT (-1)
@@ -302,7 +305,7 @@ int const dbx64_register_map[FIRST_PSEUDO_REGISTER] =
7 for %edi (gcc regno = 5)
The following three DWARF register numbers are never generated by
the SVR4 C compiler or by the GNU compilers, but SDB on x86/svr4
- believes these numbers have these meanings.
+ believed these numbers have these meanings.
8 for %eip (no gcc equivalent)
9 for %eflags (gcc regno = 17)
10 for %trapno (no gcc equivalent)
@@ -310,20 +313,20 @@ int const dbx64_register_map[FIRST_PSEUDO_REGISTER] =
for the x86 architecture. If the version of SDB on x86/svr4 were
a bit less brain dead with respect to floating-point then we would
have a precedent to follow with respect to DWARF register numbers
- for x86 FP registers, but the SDB on x86/svr4 is so completely
+ for x86 FP registers, but the SDB on x86/svr4 was so completely
broken with respect to FP registers that it is hardly worth thinking
of it as something to strive for compatibility with.
- The version of x86/svr4 SDB I have at the moment does (partially)
+   The version of x86/svr4 SDB I had did (partially)
seem to believe that DWARF register number 11 is associated with
the x86 register %st(0), but that's about all. Higher DWARF
register numbers don't seem to be associated with anything in
- particular, and even for DWARF regno 11, SDB only seems to under-
+ particular, and even for DWARF regno 11, SDB only seemed to under-
stand that it should say that a variable lives in %st(0) (when
asked via an `=' command) if we said it was in DWARF regno 11,
- but SDB still prints garbage when asked for the value of the
+ but SDB still printed garbage when asked for the value of the
variable in question (via a `/' command).
- (Also note that the labels SDB prints for various FP stack regs
- when doing an `x' command are all wrong.)
+ (Also note that the labels SDB printed for various FP stack regs
+ when doing an `x' command were all wrong.)
Note that these problems generally don't affect the native SVR4
C compiler because it doesn't allow the use of -O with -g and
because when it is *not* optimizing, it allocates a memory
@@ -1602,7 +1605,7 @@ dimode_scalar_chain::compute_convert_gain ()
rtx dst = SET_DEST (def_set);
if (REG_P (src) && REG_P (dst))
- gain += COSTS_N_INSNS (2) - ix86_cost->sse_move;
+ gain += COSTS_N_INSNS (2) - ix86_cost->xmm_move;
else if (REG_P (src) && MEM_P (dst))
gain += 2 * ix86_cost->int_store[2] - ix86_cost->sse_store[1];
else if (MEM_P (src) && REG_P (dst))
@@ -2570,6 +2573,151 @@ make_pass_stv (gcc::context *ctxt)
return new pass_stv (ctxt);
}
+/* Inserting ENDBRANCH instructions. */
+
+static unsigned int
+rest_of_insert_endbranch (void)
+{
+ timevar_push (TV_MACH_DEP);
+
+ rtx cet_eb;
+ rtx_insn *insn;
+ basic_block bb;
+
+  /* Currently emit an ENDBRANCH if control-flow tracking is required for
+     the function, i.e. the 'nocf_check' attribute is absent from the
+     function attributes.  Later an optimization will be introduced to
+     analyze whether the address of a static function is taken.  A static
+     function whose address is not taken will get a nocf_check attribute,
+     which will reduce the number of ENDBRANCH instructions.  */
+
+ if (!lookup_attribute ("nocf_check",
+ TYPE_ATTRIBUTES (TREE_TYPE (cfun->decl)))
+ && !cgraph_node::get (cfun->decl)->only_called_directly_p ())
+ {
+ cet_eb = gen_nop_endbr ();
+
+ bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb;
+ insn = BB_HEAD (bb);
+ emit_insn_before (cet_eb, insn);
+ }
+
+ bb = 0;
+ FOR_EACH_BB_FN (bb, cfun)
+ {
+ for (insn = BB_HEAD (bb); insn != NEXT_INSN (BB_END (bb));
+ insn = NEXT_INSN (insn))
+ {
+ if (INSN_P (insn) && GET_CODE (insn) == CALL_INSN)
+ {
+ rtx_insn *next_insn = insn;
+
+ while ((next_insn != BB_END (bb))
+ && (DEBUG_INSN_P (NEXT_INSN (next_insn))
+ || NOTE_P (NEXT_INSN (next_insn))
+ || BARRIER_P (NEXT_INSN (next_insn))))
+ next_insn = NEXT_INSN (next_insn);
+
+	      /* Generate ENDBRANCH after CALLs that can return more than
+		 once (setjmp-like functions).  */
+ if (find_reg_note (insn, REG_SETJMP, NULL) != NULL)
+ {
+ cet_eb = gen_nop_endbr ();
+ emit_insn_after (cet_eb, next_insn);
+ }
+ continue;
+ }
+
+ if (INSN_P (insn) && JUMP_P (insn) && flag_cet_switch)
+ {
+ rtx target = JUMP_LABEL (insn);
+ if (target == NULL_RTX || ANY_RETURN_P (target))
+ continue;
+
+	      /* Check that the jump goes through a switch table.  */
+ rtx_insn *label = as_a<rtx_insn *> (target);
+ rtx_insn *table = next_insn (label);
+ if (table == NULL_RTX || !JUMP_TABLE_DATA_P (table))
+ continue;
+
+	      /* For the indirect jump, find all the places it can jump to and
+		 insert ENDBRANCH there.  This is done under a special flag to
+		 control ENDBRANCH generation for switch statements.  */
+ edge_iterator ei;
+ edge e;
+ basic_block dest_blk;
+
+ FOR_EACH_EDGE (e, ei, bb->succs)
+ {
+ rtx_insn *insn;
+
+ dest_blk = e->dest;
+ insn = BB_HEAD (dest_blk);
+ gcc_assert (LABEL_P (insn));
+ cet_eb = gen_nop_endbr ();
+ emit_insn_after (cet_eb, insn);
+ }
+ continue;
+ }
+
+ if ((LABEL_P (insn) && LABEL_PRESERVE_P (insn))
+ || (NOTE_P (insn)
+ && NOTE_KIND (insn) == NOTE_INSN_DELETED_LABEL))
+/* TODO. Check /s bit also. */
+ {
+ cet_eb = gen_nop_endbr ();
+ emit_insn_after (cet_eb, insn);
+ continue;
+ }
+ }
+ }
+
+ timevar_pop (TV_MACH_DEP);
+ return 0;
+}
+
+namespace {
+
+const pass_data pass_data_insert_endbranch =
+{
+ RTL_PASS, /* type. */
+ "cet", /* name. */
+ OPTGROUP_NONE, /* optinfo_flags. */
+ TV_MACH_DEP, /* tv_id. */
+ 0, /* properties_required. */
+ 0, /* properties_provided. */
+ 0, /* properties_destroyed. */
+ 0, /* todo_flags_start. */
+ 0, /* todo_flags_finish. */
+};
+
+class pass_insert_endbranch : public rtl_opt_pass
+{
+public:
+ pass_insert_endbranch (gcc::context *ctxt)
+ : rtl_opt_pass (pass_data_insert_endbranch, ctxt)
+ {}
+
+ /* opt_pass methods: */
+ virtual bool gate (function *)
+ {
+ return ((flag_cf_protection & CF_BRANCH) && TARGET_IBT);
+ }
+
+ virtual unsigned int execute (function *)
+ {
+ return rest_of_insert_endbranch ();
+ }
+
+}; // class pass_insert_endbranch
+
+} // anon namespace
+
+rtl_opt_pass *
+make_pass_insert_endbranch (gcc::context *ctxt)
+{
+ return new pass_insert_endbranch (ctxt);
+}
+
/* Return true if a red-zone is in use. */
bool
@@ -2597,11 +2745,14 @@ ix86_target_string (HOST_WIDE_INT isa, HOST_WIDE_INT isa2,
ISAs come first. Target string will be displayed in the same order. */
static struct ix86_target_opts isa2_opts[] =
{
+ { "-mgfni", OPTION_MASK_ISA_GFNI },
{ "-mrdpid", OPTION_MASK_ISA_RDPID },
{ "-msgx", OPTION_MASK_ISA_SGX },
{ "-mavx5124vnniw", OPTION_MASK_ISA_AVX5124VNNIW },
{ "-mavx5124fmaps", OPTION_MASK_ISA_AVX5124FMAPS },
- { "-mavx512vpopcntdq", OPTION_MASK_ISA_AVX512VPOPCNTDQ }
+ { "-mavx512vpopcntdq", OPTION_MASK_ISA_AVX512VPOPCNTDQ },
+ { "-mibt", OPTION_MASK_ISA_IBT },
+ { "-mshstk", OPTION_MASK_ISA_SHSTK }
};
static struct ix86_target_opts isa_opts[] =
{
@@ -4694,6 +4845,37 @@ ix86_option_override_internal (bool main_args_p,
target_option_default_node = target_option_current_node
= build_target_option_node (opts);
+ /* Do not support control flow instrumentation if CET is not enabled. */
+ if (opts->x_flag_cf_protection != CF_NONE)
+ {
+ if (!(TARGET_IBT_P (opts->x_ix86_isa_flags2)
+ || TARGET_SHSTK_P (opts->x_ix86_isa_flags2)))
+ {
+ if (flag_cf_protection == CF_FULL)
+ {
+ error ("%<-fcf-protection=full%> requires CET support "
+ "on this target. Use -mcet or one of -mibt, "
+ "-mshstk options to enable CET");
+ }
+ else if (flag_cf_protection == CF_BRANCH)
+ {
+ error ("%<-fcf-protection=branch%> requires CET support "
+ "on this target. Use -mcet or one of -mibt, "
+ "-mshstk options to enable CET");
+ }
+ else if (flag_cf_protection == CF_RETURN)
+ {
+ error ("%<-fcf-protection=return%> requires CET support "
+ "on this target. Use -mcet or one of -mibt, "
+ "-mshstk options to enable CET");
+ }
+ flag_cf_protection = CF_NONE;
+ return false;
+ }
+ opts->x_flag_cf_protection =
+ (cf_protection_level) (opts->x_flag_cf_protection | CF_SET);
+ }
+
return true;
}
@@ -5123,6 +5305,9 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
IX86_ATTR_ISA ("mpx", OPT_mmpx),
IX86_ATTR_ISA ("clwb", OPT_mclwb),
IX86_ATTR_ISA ("rdpid", OPT_mrdpid),
+ IX86_ATTR_ISA ("gfni", OPT_mgfni),
+ IX86_ATTR_ISA ("ibt", OPT_mibt),
+ IX86_ATTR_ISA ("shstk", OPT_mshstk),
/* enum options */
IX86_ATTR_ENUM ("fpmath=", OPT_mfpmath_),
@@ -11943,8 +12128,14 @@ ix86_adjust_stack_and_probe_stack_clash (const HOST_WIDE_INT size)
we just probe when we cross PROBE_INTERVAL. */
if (TREE_THIS_VOLATILE (cfun->decl))
{
- emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
- -GET_MODE_SIZE (word_mode)));
+	 /* We can safely use any register here since we're just going to push
+	    its value and immediately pop it back.  But we do try to avoid
+	    argument-passing registers so as not to introduce dependencies in
+	    the pipeline.  For 32-bit we use %esi and for 64-bit we use %rax. */
+ rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG);
+ rtx_insn *insn = emit_insn (gen_push (dummy_reg));
+ RTX_FRAME_RELATED_P (insn) = 1;
+ ix86_emit_restore_reg_using_pop (dummy_reg);
emit_insn (gen_blockage ());
}
@@ -12512,10 +12703,13 @@ ix86_finalize_stack_frame_flags (void)
for (ref = DF_REG_USE_CHAIN (HARD_FRAME_POINTER_REGNUM);
ref; ref = next)
{
- rtx_insn *insn = DF_REF_INSN (ref);
+ next = DF_REF_NEXT_REG (ref);
+ if (!DF_REF_INSN_INFO (ref))
+ continue;
+
/* Make sure the next ref is for a different instruction,
so that we're not affected by the rescan. */
- next = DF_REF_NEXT_REG (ref);
+ rtx_insn *insn = DF_REF_INSN (ref);
while (next && DF_REF_INSN (next) == insn)
next = DF_REF_NEXT_REG (next);
@@ -12836,7 +13030,7 @@ ix86_expand_prologue (void)
if (frame_pointer_needed && !m->fs.fp_valid)
{
/* Note: AT&T enter does NOT have reversed args. Enter is probably
- slower on all targets. Also sdb doesn't like it. */
+ slower on all targets. Also sdb didn't like it. */
insn = emit_insn (gen_push (hard_frame_pointer_rtx));
RTX_FRAME_RELATED_P (insn) = 1;
@@ -12983,8 +13177,12 @@ ix86_expand_prologue (void)
&& (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
|| flag_stack_clash_protection))
{
- /* We expect the GP registers to be saved when probes are used. */
- gcc_assert (int_registers_saved);
+ /* This assert wants to verify that integer registers were saved
+ prior to probing. This is necessary when probing may be implemented
+ as a function call (Windows). It is not necessary for stack clash
+ protection probing. */
+ if (!flag_stack_clash_protection)
+ gcc_assert (int_registers_saved);
if (flag_stack_clash_protection)
{
@@ -13628,7 +13826,7 @@ ix86_expand_epilogue (int style)
the stack pointer, if we will restore SSE regs via sp. */
if (TARGET_64BIT
&& m->fs.sp_offset > 0x7fffffff
- && sp_valid_at (frame.stack_realign_offset)
+ && sp_valid_at (frame.stack_realign_offset + 1)
&& (frame.nsseregs + frame.nregs) != 0)
{
pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
@@ -14895,10 +15093,16 @@ legitimate_pic_address_disp_p (rtx disp)
break;
op0 = XEXP (XEXP (disp, 0), 0);
op1 = XEXP (XEXP (disp, 0), 1);
- if (!CONST_INT_P (op1)
- || INTVAL (op1) >= 16*1024*1024
+ if (!CONST_INT_P (op1))
+ break;
+ if (GET_CODE (op0) == UNSPEC
+ && (XINT (op0, 1) == UNSPEC_DTPOFF
+ || XINT (op0, 1) == UNSPEC_NTPOFF)
+ && trunc_int_for_mode (INTVAL (op1), SImode) == INTVAL (op1))
+ return true;
+ if (INTVAL (op1) >= 16*1024*1024
|| INTVAL (op1) < -16*1024*1024)
- break;
+ break;
if (GET_CODE (op0) == LABEL_REF)
return true;
if (GET_CODE (op0) == CONST
@@ -16657,13 +16861,17 @@ ix86_delegitimize_address_1 (rtx x, bool base_term_p)
movl foo@GOTOFF(%ecx), %edx
in which case we return (%ecx - %ebx) + foo
or (%ecx - _GLOBAL_OFFSET_TABLE_) + foo if pseudo_pic_reg
- and reload has completed. */
+ and reload has completed. Don't do the latter for debug,
+ as _GLOBAL_OFFSET_TABLE_ can't be expressed in the assembly. */
if (pic_offset_table_rtx
&& (!reload_completed || !ix86_use_pseudo_pic_reg ()))
result = gen_rtx_PLUS (Pmode, gen_rtx_MINUS (Pmode, copy_rtx (addend),
pic_offset_table_rtx),
result);
- else if (pic_offset_table_rtx && !TARGET_MACHO && !TARGET_VXWORKS_RTP)
+ else if (base_term_p
+ && pic_offset_table_rtx
+ && !TARGET_MACHO
+ && !TARGET_VXWORKS_RTP)
{
rtx tmp = gen_rtx_SYMBOL_REF (Pmode, GOT_SYMBOL_NAME);
tmp = gen_rtx_MINUS (Pmode, copy_rtx (addend), tmp);
@@ -16716,6 +16924,25 @@ ix86_find_base_term (rtx x)
return ix86_delegitimize_address_1 (x, true);
}
+
+/* Return true if X shouldn't be emitted into the debug info.
+ Disallow UNSPECs other than @gotoff - we can't emit _GLOBAL_OFFSET_TABLE_
+   symbol easily into the .debug_info section, so we do not
+   delegitimize it, but instead assemble it as @gotoff.
+ Disallow _GLOBAL_OFFSET_TABLE_ SYMBOL_REF - the assembler magically
+ assembles that as _GLOBAL_OFFSET_TABLE_-. expression. */
+
+static bool
+ix86_const_not_ok_for_debug_p (rtx x)
+{
+ if (GET_CODE (x) == UNSPEC && XINT (x, 1) != UNSPEC_GOTOFF)
+ return true;
+
+ if (SYMBOL_REF_P (x) && strcmp (XSTR (x, 0), GOT_SYMBOL_NAME) == 0)
+ return true;
+
+ return false;
+}
static void
put_condition_code (enum rtx_code code, machine_mode mode, bool reverse,
@@ -16723,7 +16950,7 @@ put_condition_code (enum rtx_code code, machine_mode mode, bool reverse,
{
const char *suffix;
- if (mode == CCFPmode || mode == CCFPUmode)
+ if (mode == CCFPmode)
{
code = ix86_fp_compare_code_to_integer (code);
mode = CCmode;
@@ -16734,6 +16961,7 @@ put_condition_code (enum rtx_code code, machine_mode mode, bool reverse,
switch (code)
{
case EQ:
+ gcc_assert (mode != CCGZmode);
switch (mode)
{
case E_CCAmode:
@@ -16757,6 +16985,7 @@ put_condition_code (enum rtx_code code, machine_mode mode, bool reverse,
}
break;
case NE:
+ gcc_assert (mode != CCGZmode);
switch (mode)
{
case E_CCAmode:
@@ -16801,6 +17030,7 @@ put_condition_code (enum rtx_code code, machine_mode mode, bool reverse,
case E_CCmode:
case E_CCGCmode:
+ case E_CCGZmode:
suffix = "l";
break;
@@ -16809,7 +17039,7 @@ put_condition_code (enum rtx_code code, machine_mode mode, bool reverse,
}
break;
case LTU:
- if (mode == CCmode)
+ if (mode == CCmode || mode == CCGZmode)
suffix = "b";
else if (mode == CCCmode)
suffix = fp ? "b" : "c";
@@ -16826,6 +17056,7 @@ put_condition_code (enum rtx_code code, machine_mode mode, bool reverse,
case E_CCmode:
case E_CCGCmode:
+ case E_CCGZmode:
suffix = "ge";
break;
@@ -16834,7 +17065,7 @@ put_condition_code (enum rtx_code code, machine_mode mode, bool reverse,
}
break;
case GEU:
- if (mode == CCmode)
+ if (mode == CCmode || mode == CCGZmode)
suffix = "nb";
else if (mode == CCCmode)
suffix = fp ? "nb" : "nc";
@@ -17613,6 +17844,8 @@ ix86_print_operand (FILE *file, rtx x, int code)
case '!':
if (ix86_bnd_prefixed_insn_p (current_output_insn))
fputs ("bnd ", file);
+ if (ix86_notrack_prefixed_insn_p (current_output_insn))
+ fputs ("notrack ", file);
return;
default:
@@ -18028,6 +18261,10 @@ i386_asm_output_addr_const_extra (FILE *file, rtx x)
op = XVECEXP (x, 0, 0);
switch (XINT (x, 1))
{
+ case UNSPEC_GOTOFF:
+ output_addr_const (file, op);
+ fputs ("@gotoff", file);
+ break;
case UNSPEC_GOTTPOFF:
output_addr_const (file, op);
/* FIXME: This might be @TPOFF in Sun ld. */
@@ -18147,89 +18384,66 @@ output_387_binary_op (rtx_insn *insn, rtx *operands)
{
static char buf[40];
const char *p;
- const char *ssep;
- int is_sse = SSE_REG_P (operands[0]) || SSE_REG_P (operands[1]) || SSE_REG_P (operands[2]);
+ bool is_sse
+ = (SSE_REG_P (operands[0])
+ || SSE_REG_P (operands[1]) || SSE_REG_P (operands[2]));
- /* Even if we do not want to check the inputs, this documents input
- constraints. Which helps in understanding the following code. */
- if (flag_checking)
- {
- if (STACK_REG_P (operands[0])
- && ((REG_P (operands[1])
- && REGNO (operands[0]) == REGNO (operands[1])
- && (STACK_REG_P (operands[2]) || MEM_P (operands[2])))
- || (REG_P (operands[2])
- && REGNO (operands[0]) == REGNO (operands[2])
- && (STACK_REG_P (operands[1]) || MEM_P (operands[1]))))
- && (STACK_TOP_P (operands[1]) || STACK_TOP_P (operands[2])))
- ; /* ok */
- else
- gcc_assert (is_sse);
- }
+ if (is_sse)
+ p = "%v";
+ else if (GET_MODE_CLASS (GET_MODE (operands[1])) == MODE_INT
+ || GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT)
+ p = "fi";
+ else
+ p = "f";
+
+ strcpy (buf, p);
switch (GET_CODE (operands[3]))
{
case PLUS:
- if (GET_MODE_CLASS (GET_MODE (operands[1])) == MODE_INT
- || GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT)
- p = "fiadd";
- else
- p = "fadd";
- ssep = "vadd";
- break;
-
+ p = "add"; break;
case MINUS:
- if (GET_MODE_CLASS (GET_MODE (operands[1])) == MODE_INT
- || GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT)
- p = "fisub";
- else
- p = "fsub";
- ssep = "vsub";
- break;
-
+ p = "sub"; break;
case MULT:
- if (GET_MODE_CLASS (GET_MODE (operands[1])) == MODE_INT
- || GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT)
- p = "fimul";
- else
- p = "fmul";
- ssep = "vmul";
- break;
-
+ p = "mul"; break;
case DIV:
- if (GET_MODE_CLASS (GET_MODE (operands[1])) == MODE_INT
- || GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT)
- p = "fidiv";
- else
- p = "fdiv";
- ssep = "vdiv";
- break;
-
+ p = "div"; break;
default:
gcc_unreachable ();
}
+ strcat (buf, p);
+
if (is_sse)
{
+ p = (GET_MODE (operands[0]) == SFmode) ? "ss" : "sd";
+ strcat (buf, p);
+
if (TARGET_AVX)
- {
- strcpy (buf, ssep);
- if (GET_MODE (operands[0]) == SFmode)
- strcat (buf, "ss\t{%2, %1, %0|%0, %1, %2}");
- else
- strcat (buf, "sd\t{%2, %1, %0|%0, %1, %2}");
- }
+ p = "\t{%2, %1, %0|%0, %1, %2}";
else
- {
- strcpy (buf, ssep + 1);
- if (GET_MODE (operands[0]) == SFmode)
- strcat (buf, "ss\t{%2, %0|%0, %2}");
- else
- strcat (buf, "sd\t{%2, %0|%0, %2}");
- }
- return buf;
+ p = "\t{%2, %0|%0, %2}";
+
+ strcat (buf, p);
+ return buf;
}
- strcpy (buf, p);
+
+  /* Even if we do not want to check the inputs, this documents input
+     constraints, which helps in understanding the following code.  */
+ if (flag_checking)
+ {
+ if (STACK_REG_P (operands[0])
+ && ((REG_P (operands[1])
+ && REGNO (operands[0]) == REGNO (operands[1])
+ && (STACK_REG_P (operands[2]) || MEM_P (operands[2])))
+ || (REG_P (operands[2])
+ && REGNO (operands[0]) == REGNO (operands[2])
+ && (STACK_REG_P (operands[1]) || MEM_P (operands[1]))))
+ && (STACK_TOP_P (operands[1]) || STACK_TOP_P (operands[2])))
+ ; /* ok */
+ else
+ gcc_unreachable ();
+ }
switch (GET_CODE (operands[3]))
{
@@ -18818,10 +19032,13 @@ ix86_emit_mode_set (int entity, int mode, int prev_mode ATTRIBUTE_UNUSED,
const char *
output_fix_trunc (rtx_insn *insn, rtx *operands, bool fisttp)
{
- int stack_top_dies = find_regno_note (insn, REG_DEAD, FIRST_STACK_REG) != 0;
- int dimode_p = GET_MODE (operands[0]) == DImode;
+ bool stack_top_dies = find_regno_note (insn, REG_DEAD, FIRST_STACK_REG);
+ bool dimode_p = GET_MODE (operands[0]) == DImode;
int round_mode = get_attr_i387_cw (insn);
+ static char buf[40];
+ const char *p;
+
/* Jump through a hoop or two for DImode, since the hardware has no
non-popping instruction. We used to do this a different way, but
that was somewhat fragile and broke with post-reload splitters. */
@@ -18833,18 +19050,20 @@ output_fix_trunc (rtx_insn *insn, rtx *operands, bool fisttp)
gcc_assert (GET_MODE (operands[1]) != TFmode);
if (fisttp)
- output_asm_insn ("fisttp%Z0\t%0", operands);
- else
- {
- if (round_mode != I387_CW_ANY)
- output_asm_insn ("fldcw\t%3", operands);
- if (stack_top_dies || dimode_p)
- output_asm_insn ("fistp%Z0\t%0", operands);
- else
- output_asm_insn ("fist%Z0\t%0", operands);
- if (round_mode != I387_CW_ANY)
- output_asm_insn ("fldcw\t%2", operands);
- }
+ return "fisttp%Z0\t%0";
+
+ strcpy (buf, "fist");
+
+ if (round_mode != I387_CW_ANY)
+ output_asm_insn ("fldcw\t%3", operands);
+
+ p = "p%Z0\t%0";
+ strcat (buf, p + !(stack_top_dies || dimode_p));
+
+ output_asm_insn (buf, operands);
+
+ if (round_mode != I387_CW_ANY)
+ output_asm_insn ("fldcw\t%2", operands);
return "";
}
@@ -18881,120 +19100,65 @@ output_387_ffreep (rtx *operands ATTRIBUTE_UNUSED, int opno)
should be used. UNORDERED_P is true when fucom should be used. */
const char *
-output_fp_compare (rtx_insn *insn, rtx *operands, bool eflags_p, bool unordered_p)
+output_fp_compare (rtx_insn *insn, rtx *operands,
+ bool eflags_p, bool unordered_p)
{
- int stack_top_dies;
- rtx cmp_op0, cmp_op1;
- int is_sse = SSE_REG_P (operands[0]) || SSE_REG_P (operands[1]);
-
- if (eflags_p)
- {
- cmp_op0 = operands[0];
- cmp_op1 = operands[1];
- }
- else
- {
- cmp_op0 = operands[1];
- cmp_op1 = operands[2];
- }
+ rtx *xops = eflags_p ? &operands[0] : &operands[1];
+ bool stack_top_dies;
- if (is_sse)
- {
- if (GET_MODE (operands[0]) == SFmode)
- if (unordered_p)
- return "%vucomiss\t{%1, %0|%0, %1}";
- else
- return "%vcomiss\t{%1, %0|%0, %1}";
- else
- if (unordered_p)
- return "%vucomisd\t{%1, %0|%0, %1}";
- else
- return "%vcomisd\t{%1, %0|%0, %1}";
- }
+ static char buf[40];
+ const char *p;
- gcc_assert (STACK_TOP_P (cmp_op0));
+ gcc_assert (STACK_TOP_P (xops[0]));
- stack_top_dies = find_regno_note (insn, REG_DEAD, FIRST_STACK_REG) != 0;
+ stack_top_dies = find_regno_note (insn, REG_DEAD, FIRST_STACK_REG);
- if (cmp_op1 == CONST0_RTX (GET_MODE (cmp_op1)))
+ if (eflags_p)
{
- if (stack_top_dies)
- {
- output_asm_insn ("ftst\n\tfnstsw\t%0", operands);
- return output_387_ffreep (operands, 1);
- }
- else
- return "ftst\n\tfnstsw\t%0";
+ p = unordered_p ? "fucomi" : "fcomi";
+ strcpy (buf, p);
+
+ p = "p\t{%y1, %0|%0, %y1}";
+ strcat (buf, p + !stack_top_dies);
+
+ return buf;
}
- if (STACK_REG_P (cmp_op1)
+ if (STACK_REG_P (xops[1])
&& stack_top_dies
- && find_regno_note (insn, REG_DEAD, REGNO (cmp_op1))
- && REGNO (cmp_op1) != FIRST_STACK_REG)
+ && find_regno_note (insn, REG_DEAD, FIRST_STACK_REG + 1))
{
- /* If both the top of the 387 stack dies, and the other operand
- is also a stack register that dies, then this must be a
- `fcompp' float compare */
+ gcc_assert (REGNO (xops[1]) == FIRST_STACK_REG + 1);
- if (eflags_p)
- {
- /* There is no double popping fcomi variant. Fortunately,
- eflags is immune from the fstp's cc clobbering. */
- if (unordered_p)
- output_asm_insn ("fucomip\t{%y1, %0|%0, %y1}", operands);
- else
- output_asm_insn ("fcomip\t{%y1, %0|%0, %y1}", operands);
- return output_387_ffreep (operands, 0);
- }
- else
- {
- if (unordered_p)
- return "fucompp\n\tfnstsw\t%0";
- else
- return "fcompp\n\tfnstsw\t%0";
- }
+	  /* If the top of the 387 stack dies and the other operand is also
+	     a stack register that dies, then this must be a `fcompp' float
+	     compare.  */
+ p = unordered_p ? "fucompp" : "fcompp";
+ strcpy (buf, p);
+ }
+ else if (const0_operand (xops[1], VOIDmode))
+ {
+ gcc_assert (!unordered_p);
+ strcpy (buf, "ftst");
}
else
{
- /* Encoded here as eflags_p | intmode | unordered_p | stack_top_dies. */
-
- static const char * const alt[16] =
- {
- "fcom%Z2\t%y2\n\tfnstsw\t%0",
- "fcomp%Z2\t%y2\n\tfnstsw\t%0",
- "fucom%Z2\t%y2\n\tfnstsw\t%0",
- "fucomp%Z2\t%y2\n\tfnstsw\t%0",
-
- "ficom%Z2\t%y2\n\tfnstsw\t%0",
- "ficomp%Z2\t%y2\n\tfnstsw\t%0",
- NULL,
- NULL,
-
- "fcomi\t{%y1, %0|%0, %y1}",
- "fcomip\t{%y1, %0|%0, %y1}",
- "fucomi\t{%y1, %0|%0, %y1}",
- "fucomip\t{%y1, %0|%0, %y1}",
-
- NULL,
- NULL,
- NULL,
- NULL
- };
-
- int mask;
- const char *ret;
-
- mask = eflags_p << 3;
- mask |= (GET_MODE_CLASS (GET_MODE (cmp_op1)) == MODE_INT) << 2;
- mask |= unordered_p << 1;
- mask |= stack_top_dies;
+ if (GET_MODE_CLASS (GET_MODE (xops[1])) == MODE_INT)
+ {
+ gcc_assert (!unordered_p);
+ p = "ficom";
+ }
+ else
+ p = unordered_p ? "fucom" : "fcom";
- gcc_assert (mask < 16);
- ret = alt[mask];
- gcc_assert (ret);
+ strcpy (buf, p);
- return ret;
+ p = "p%Z2\t%y2";
+ strcat (buf, p + !stack_top_dies);
}
+
+ output_asm_insn (buf, operands);
+ return "fnstsw\t%0";
}
void
@@ -19067,20 +19231,6 @@ ix86_expand_clear (rtx dest)
emit_insn (tmp);
}
-/* X is an unchanging MEM. If it is a constant pool reference, return
- the constant pool rtx, else NULL. */
-
-rtx
-maybe_get_pool_constant (rtx x)
-{
- x = ix86_delegitimize_address (XEXP (x, 0));
-
- if (GET_CODE (x) == SYMBOL_REF && CONSTANT_POOL_ADDRESS_P (x))
- return get_pool_constant (x);
-
- return NULL_RTX;
-}
-
void
ix86_expand_move (machine_mode mode, rtx operands[])
{
@@ -21526,6 +21676,8 @@ ix86_match_ccmode (rtx insn, machine_mode req_mode)
case E_CCZmode:
break;
+ case E_CCGZmode:
+
case E_CCAmode:
case E_CCCmode:
case E_CCOmode:
@@ -21563,18 +21715,38 @@ ix86_expand_int_compare (enum rtx_code code, rtx op0, rtx op1)
return gen_rtx_fmt_ee (code, VOIDmode, flags, const0_rtx);
}
-/* Figure out whether to use ordered or unordered fp comparisons.
- Return the appropriate mode to use. */
+/* Figure out whether to use unordered fp comparisons. */
-machine_mode
-ix86_fp_compare_mode (enum rtx_code)
+static bool
+ix86_unordered_fp_compare (enum rtx_code code)
{
- /* ??? In order to make all comparisons reversible, we do all comparisons
- non-trapping when compiling for IEEE. Once gcc is able to distinguish
- all forms trapping and nontrapping comparisons, we can make inequality
- comparisons trapping again, since it results in better code when using
- FCOM based compares. */
- return TARGET_IEEE_FP ? CCFPUmode : CCFPmode;
+ if (!TARGET_IEEE_FP)
+ return false;
+
+ switch (code)
+ {
+ case GT:
+ case GE:
+ case LT:
+ case LE:
+ return false;
+
+ case EQ:
+ case NE:
+
+ case LTGT:
+ case UNORDERED:
+ case ORDERED:
+ case UNLT:
+ case UNLE:
+ case UNGT:
+ case UNGE:
+ case UNEQ:
+ return true;
+
+ default:
+ gcc_unreachable ();
+ }
}
machine_mode
@@ -21585,7 +21757,7 @@ ix86_cc_mode (enum rtx_code code, rtx op0, rtx op1)
if (SCALAR_FLOAT_MODE_P (mode))
{
gcc_assert (!DECIMAL_FLOAT_MODE_P (mode));
- return ix86_fp_compare_mode (code);
+ return CCFPmode;
}
switch (code)
@@ -21707,7 +21879,6 @@ ix86_cc_modes_compatible (machine_mode m1, machine_mode m2)
}
case E_CCFPmode:
- case E_CCFPUmode:
/* These are only compatible with themselves, which we already
checked above. */
return VOIDmode;
@@ -21811,10 +21982,10 @@ ix86_fp_comparison_strategy (enum rtx_code)
static enum rtx_code
ix86_prepare_fp_compare_args (enum rtx_code code, rtx *pop0, rtx *pop1)
{
- machine_mode fpcmp_mode = ix86_fp_compare_mode (code);
+ bool unordered_compare = ix86_unordered_fp_compare (code);
rtx op0 = *pop0, op1 = *pop1;
machine_mode op_mode = GET_MODE (op0);
- int is_sse = TARGET_SSE_MATH && SSE_FLOAT_MODE_P (op_mode);
+ bool is_sse = TARGET_SSE_MATH && SSE_FLOAT_MODE_P (op_mode);
/* All of the unordered compare instructions only work on registers.
The same is true of the fcomi compare instructions. The XFmode
@@ -21823,7 +21994,7 @@ ix86_prepare_fp_compare_args (enum rtx_code code, rtx *pop0, rtx *pop1)
floating point. */
if (!is_sse
- && (fpcmp_mode == CCFPUmode
+ && (unordered_compare
|| (op_mode == XFmode
&& ! (standard_80387_constant_p (op0) == 1
|| standard_80387_constant_p (op1) == 1)
@@ -21920,27 +22091,29 @@ ix86_fp_compare_code_to_integer (enum rtx_code code)
static rtx
ix86_expand_fp_compare (enum rtx_code code, rtx op0, rtx op1, rtx scratch)
{
- machine_mode fpcmp_mode, intcmp_mode;
+ bool unordered_compare = ix86_unordered_fp_compare (code);
+ machine_mode intcmp_mode;
rtx tmp, tmp2;
- fpcmp_mode = ix86_fp_compare_mode (code);
code = ix86_prepare_fp_compare_args (code, &op0, &op1);
/* Do fcomi/sahf based test when profitable. */
switch (ix86_fp_comparison_strategy (code))
{
case IX86_FPCMP_COMI:
- intcmp_mode = fpcmp_mode;
- tmp = gen_rtx_COMPARE (fpcmp_mode, op0, op1);
- tmp = gen_rtx_SET (gen_rtx_REG (fpcmp_mode, FLAGS_REG), tmp);
- emit_insn (tmp);
+ intcmp_mode = CCFPmode;
+ tmp = gen_rtx_COMPARE (CCFPmode, op0, op1);
+ if (unordered_compare)
+ tmp = gen_rtx_UNSPEC (CCFPmode, gen_rtvec (1, tmp), UNSPEC_NOTRAP);
+ emit_insn (gen_rtx_SET (gen_rtx_REG (CCFPmode, FLAGS_REG), tmp));
break;
case IX86_FPCMP_SAHF:
- intcmp_mode = fpcmp_mode;
- tmp = gen_rtx_COMPARE (fpcmp_mode, op0, op1);
- tmp = gen_rtx_SET (gen_rtx_REG (fpcmp_mode, FLAGS_REG), tmp);
-
+ intcmp_mode = CCFPmode;
+ tmp = gen_rtx_COMPARE (CCFPmode, op0, op1);
+ if (unordered_compare)
+ tmp = gen_rtx_UNSPEC (CCFPmode, gen_rtvec (1, tmp), UNSPEC_NOTRAP);
+ tmp = gen_rtx_SET (gen_rtx_REG (CCFPmode, FLAGS_REG), tmp);
if (!scratch)
scratch = gen_reg_rtx (HImode);
tmp2 = gen_rtx_CLOBBER (VOIDmode, scratch);
@@ -21949,11 +22122,13 @@ ix86_expand_fp_compare (enum rtx_code code, rtx op0, rtx op1, rtx scratch)
case IX86_FPCMP_ARITH:
/* Sadness wrt reg-stack pops killing fpsr -- gotta get fnstsw first. */
- tmp = gen_rtx_COMPARE (fpcmp_mode, op0, op1);
- tmp2 = gen_rtx_UNSPEC (HImode, gen_rtvec (1, tmp), UNSPEC_FNSTSW);
+ tmp = gen_rtx_COMPARE (CCFPmode, op0, op1);
+ if (unordered_compare)
+ tmp = gen_rtx_UNSPEC (CCFPmode, gen_rtvec (1, tmp), UNSPEC_NOTRAP);
+ tmp = gen_rtx_UNSPEC (HImode, gen_rtvec (1, tmp), UNSPEC_FNSTSW);
if (!scratch)
scratch = gen_reg_rtx (HImode);
- emit_insn (gen_rtx_SET (scratch, tmp2));
+ emit_insn (gen_rtx_SET (scratch, tmp));
/* In the unordered case, we have to check C2 for NaN's, which
doesn't happen to work out to anything nice combination-wise.
@@ -22234,6 +22409,62 @@ ix86_expand_branch (enum rtx_code code, rtx op0, rtx op1, rtx label)
break;
}
+  /* Emulate comparisons that do not depend on the Zero flag with
+     double-word subtraction.  Note that only the Overflow, Sign
+     and Carry flags are valid, so swap the arguments and condition
+     of comparisons that would otherwise test the Zero flag.  */
+
+ switch (code)
+ {
+ case LE: case LEU: case GT: case GTU:
+ std::swap (lo[0], lo[1]);
+ std::swap (hi[0], hi[1]);
+ code = swap_condition (code);
+ /* FALLTHRU */
+
+ case LT: case LTU: case GE: case GEU:
+ {
+ rtx (*cmp_insn) (rtx, rtx);
+ rtx (*sbb_insn) (rtx, rtx, rtx);
+ bool uns = (code == LTU || code == GEU);
+
+ if (TARGET_64BIT)
+ {
+ cmp_insn = gen_cmpdi_1;
+ sbb_insn
+ = uns ? gen_subdi3_carry_ccc : gen_subdi3_carry_ccgz;
+ }
+ else
+ {
+ cmp_insn = gen_cmpsi_1;
+ sbb_insn
+ = uns ? gen_subsi3_carry_ccc : gen_subsi3_carry_ccgz;
+ }
+
+ if (!nonimmediate_operand (lo[0], submode))
+ lo[0] = force_reg (submode, lo[0]);
+ if (!x86_64_general_operand (lo[1], submode))
+ lo[1] = force_reg (submode, lo[1]);
+
+ if (!register_operand (hi[0], submode))
+ hi[0] = force_reg (submode, hi[0]);
+ if ((uns && !nonimmediate_operand (hi[1], submode))
+ || (!uns && !x86_64_general_operand (hi[1], submode)))
+ hi[1] = force_reg (submode, hi[1]);
+
+ emit_insn (cmp_insn (lo[0], lo[1]));
+ emit_insn (sbb_insn (gen_rtx_SCRATCH (submode), hi[0], hi[1]));
+
+ tmp = gen_rtx_REG (uns ? CCCmode : CCGZmode, FLAGS_REG);
+
+ ix86_expand_branch (code, tmp, const0_rtx, label);
+ return;
+ }
+
+ default:
+ break;
+ }
+
/* Otherwise, we need two or three jumps. */
label2 = gen_label_rtx ();
@@ -22339,8 +22570,7 @@ ix86_expand_carry_flag_compare (enum rtx_code code, rtx op0, rtx op1, rtx *pop)
compare_seq = get_insns ();
end_sequence ();
- if (GET_MODE (XEXP (compare_op, 0)) == CCFPmode
- || GET_MODE (XEXP (compare_op, 0)) == CCFPUmode)
+ if (GET_MODE (XEXP (compare_op, 0)) == CCFPmode)
code = ix86_fp_compare_code_to_integer (GET_CODE (compare_op));
else
code = GET_CODE (compare_op);
@@ -22480,8 +22710,7 @@ ix86_expand_int_movcc (rtx operands[])
flags = XEXP (compare_op, 0);
- if (GET_MODE (flags) == CCFPmode
- || GET_MODE (flags) == CCFPUmode)
+ if (GET_MODE (flags) == CCFPmode)
{
fpcmp = true;
compare_code
@@ -23826,10 +24055,10 @@ struct expand_vec_perm_d
};
static bool
-ix86_expand_vec_perm_vpermi2 (rtx target, rtx op0, rtx mask, rtx op1,
+ix86_expand_vec_perm_vpermt2 (rtx target, rtx mask, rtx op0, rtx op1,
struct expand_vec_perm_d *d)
{
- /* ix86_expand_vec_perm_vpermi2 is called from both const and non-const
+ /* ix86_expand_vec_perm_vpermt2 is called from both const and non-const
expander, so args are either in d, or in op0, op1 etc. */
machine_mode mode = GET_MODE (d ? d->op0 : op0);
machine_mode maskmode = mode;
@@ -23839,83 +24068,83 @@ ix86_expand_vec_perm_vpermi2 (rtx target, rtx op0, rtx mask, rtx op1,
{
case E_V8HImode:
if (TARGET_AVX512VL && TARGET_AVX512BW)
- gen = gen_avx512vl_vpermi2varv8hi3;
+ gen = gen_avx512vl_vpermt2varv8hi3;
break;
case E_V16HImode:
if (TARGET_AVX512VL && TARGET_AVX512BW)
- gen = gen_avx512vl_vpermi2varv16hi3;
+ gen = gen_avx512vl_vpermt2varv16hi3;
break;
case E_V64QImode:
if (TARGET_AVX512VBMI)
- gen = gen_avx512bw_vpermi2varv64qi3;
+ gen = gen_avx512bw_vpermt2varv64qi3;
break;
case E_V32HImode:
if (TARGET_AVX512BW)
- gen = gen_avx512bw_vpermi2varv32hi3;
+ gen = gen_avx512bw_vpermt2varv32hi3;
break;
case E_V4SImode:
if (TARGET_AVX512VL)
- gen = gen_avx512vl_vpermi2varv4si3;
+ gen = gen_avx512vl_vpermt2varv4si3;
break;
case E_V8SImode:
if (TARGET_AVX512VL)
- gen = gen_avx512vl_vpermi2varv8si3;
+ gen = gen_avx512vl_vpermt2varv8si3;
break;
case E_V16SImode:
if (TARGET_AVX512F)
- gen = gen_avx512f_vpermi2varv16si3;
+ gen = gen_avx512f_vpermt2varv16si3;
break;
case E_V4SFmode:
if (TARGET_AVX512VL)
{
- gen = gen_avx512vl_vpermi2varv4sf3;
+ gen = gen_avx512vl_vpermt2varv4sf3;
maskmode = V4SImode;
}
break;
case E_V8SFmode:
if (TARGET_AVX512VL)
{
- gen = gen_avx512vl_vpermi2varv8sf3;
+ gen = gen_avx512vl_vpermt2varv8sf3;
maskmode = V8SImode;
}
break;
case E_V16SFmode:
if (TARGET_AVX512F)
{
- gen = gen_avx512f_vpermi2varv16sf3;
+ gen = gen_avx512f_vpermt2varv16sf3;
maskmode = V16SImode;
}
break;
case E_V2DImode:
if (TARGET_AVX512VL)
- gen = gen_avx512vl_vpermi2varv2di3;
+ gen = gen_avx512vl_vpermt2varv2di3;
break;
case E_V4DImode:
if (TARGET_AVX512VL)
- gen = gen_avx512vl_vpermi2varv4di3;
+ gen = gen_avx512vl_vpermt2varv4di3;
break;
case E_V8DImode:
if (TARGET_AVX512F)
- gen = gen_avx512f_vpermi2varv8di3;
+ gen = gen_avx512f_vpermt2varv8di3;
break;
case E_V2DFmode:
if (TARGET_AVX512VL)
{
- gen = gen_avx512vl_vpermi2varv2df3;
+ gen = gen_avx512vl_vpermt2varv2df3;
maskmode = V2DImode;
}
break;
case E_V4DFmode:
if (TARGET_AVX512VL)
{
- gen = gen_avx512vl_vpermi2varv4df3;
+ gen = gen_avx512vl_vpermt2varv4df3;
maskmode = V4DImode;
}
break;
case E_V8DFmode:
if (TARGET_AVX512F)
{
- gen = gen_avx512f_vpermi2varv8df3;
+ gen = gen_avx512f_vpermt2varv8df3;
maskmode = V8DImode;
}
break;
@@ -23926,7 +24155,7 @@ ix86_expand_vec_perm_vpermi2 (rtx target, rtx op0, rtx mask, rtx op1,
if (gen == NULL)
return false;
- /* ix86_expand_vec_perm_vpermi2 is called from both const and non-const
+ /* ix86_expand_vec_perm_vpermt2 is called from both const and non-const
expander, so args are either in d, or in op0, op1 etc. */
if (d)
{
@@ -23939,7 +24168,7 @@ ix86_expand_vec_perm_vpermi2 (rtx target, rtx op0, rtx mask, rtx op1,
mask = gen_rtx_CONST_VECTOR (maskmode, gen_rtvec_v (d->nelt, vec));
}
- emit_insn (gen (target, op0, force_reg (maskmode, mask), op1));
+ emit_insn (gen (target, force_reg (maskmode, mask), op0, op1));
return true;
}
@@ -23990,7 +24219,7 @@ ix86_expand_vec_perm (rtx operands[])
}
}
- if (ix86_expand_vec_perm_vpermi2 (target, op0, mask, op1, NULL))
+ if (ix86_expand_vec_perm_vpermt2 (target, mask, op0, op1, NULL))
return;
if (TARGET_AVX2)
@@ -24515,8 +24744,7 @@ ix86_expand_int_addcc (rtx operands[])
flags = XEXP (compare_op, 0);
- if (GET_MODE (flags) == CCFPmode
- || GET_MODE (flags) == CCFPUmode)
+ if (GET_MODE (flags) == CCFPmode)
{
fpcmp = true;
code = ix86_fp_compare_code_to_integer (code);
@@ -24603,11 +24831,7 @@ ix86_split_to_parts (rtx operand, rtx *parts, machine_mode mode)
/* Optimize constant pool reference to immediates. This is used by fp
moves, that force all constants to memory to allow combining. */
if (MEM_P (operand) && MEM_READONLY_P (operand))
- {
- rtx tmp = maybe_get_pool_constant (operand);
- if (tmp)
- operand = tmp;
- }
+ operand = avoid_constant_pool_reference (operand);
if (MEM_P (operand) && !offsettable_memref_p (operand))
{
@@ -29804,8 +30028,12 @@ BDESC_VERIFYS (IX86_BUILTIN__BDESC_MPX_CONST_FIRST,
IX86_BUILTIN__BDESC_MPX_LAST, 1);
BDESC_VERIFYS (IX86_BUILTIN__BDESC_MULTI_ARG_FIRST,
IX86_BUILTIN__BDESC_MPX_CONST_LAST, 1);
-BDESC_VERIFYS (IX86_BUILTIN_MAX,
+BDESC_VERIFYS (IX86_BUILTIN__BDESC_CET_FIRST,
IX86_BUILTIN__BDESC_MULTI_ARG_LAST, 1);
+BDESC_VERIFYS (IX86_BUILTIN__BDESC_CET_NORMAL_FIRST,
+ IX86_BUILTIN__BDESC_CET_LAST, 1);
+BDESC_VERIFYS (IX86_BUILTIN_MAX,
+ IX86_BUILTIN__BDESC_CET_NORMAL_LAST, 1);
/* Set up all the MMX/SSE builtins, even builtins for instructions that are not
in the current target ISA to allow the user to compile particular modules
@@ -30472,6 +30700,35 @@ ix86_init_mmx_sse_builtins (void)
BDESC_VERIFYS (IX86_BUILTIN__BDESC_MULTI_ARG_LAST,
IX86_BUILTIN__BDESC_MULTI_ARG_FIRST,
ARRAY_SIZE (bdesc_multi_arg) - 1);
+
+  /* Add CET intrinsics.  */
+ for (i = 0, d = bdesc_cet; i < ARRAY_SIZE (bdesc_cet); i++, d++)
+ {
+ BDESC_VERIFY (d->code, IX86_BUILTIN__BDESC_CET_FIRST, i);
+ if (d->name == 0)
+ continue;
+
+ ftype = (enum ix86_builtin_func_type) d->flag;
+ def_builtin2 (d->mask, d->name, ftype, d->code);
+ }
+ BDESC_VERIFYS (IX86_BUILTIN__BDESC_CET_LAST,
+ IX86_BUILTIN__BDESC_CET_FIRST,
+ ARRAY_SIZE (bdesc_cet) - 1);
+
+ for (i = 0, d = bdesc_cet_rdssp;
+ i < ARRAY_SIZE (bdesc_cet_rdssp);
+ i++, d++)
+ {
+ BDESC_VERIFY (d->code, IX86_BUILTIN__BDESC_CET_NORMAL_FIRST, i);
+ if (d->name == 0)
+ continue;
+
+ ftype = (enum ix86_builtin_func_type) d->flag;
+ def_builtin2 (d->mask, d->name, ftype, d->code);
+ }
+ BDESC_VERIFYS (IX86_BUILTIN__BDESC_CET_NORMAL_LAST,
+ IX86_BUILTIN__BDESC_CET_NORMAL_FIRST,
+ ARRAY_SIZE (bdesc_cet_rdssp) - 1);
}
static void
@@ -33425,6 +33682,7 @@ ix86_expand_args_builtin (const struct builtin_description *d,
case UQI_FTYPE_V4SF_V4SF_INT:
case UHI_FTYPE_V16SI_V16SI_INT:
case UHI_FTYPE_V16SF_V16SF_INT:
+ case V64QI_FTYPE_V64QI_V64QI_INT:
nargs = 3;
nargs_constant = 1;
break;
@@ -33652,6 +33910,13 @@ ix86_expand_args_builtin (const struct builtin_description *d,
mask_pos = 1;
nargs_constant = 1;
break;
+ case V64QI_FTYPE_V64QI_V64QI_INT_V64QI_UDI:
+ case V32QI_FTYPE_V32QI_V32QI_INT_V32QI_USI:
+ case V16QI_FTYPE_V16QI_V16QI_INT_V16QI_UHI:
+ nargs = 5;
+ mask_pos = 1;
+ nargs_constant = 2;
+ break;
default:
gcc_unreachable ();
@@ -34830,10 +35095,10 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
machine_mode mode, int ignore)
{
size_t i;
- enum insn_code icode;
+ enum insn_code icode, icode2;
tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
tree arg0, arg1, arg2, arg3, arg4;
- rtx op0, op1, op2, op3, op4, pat, insn;
+ rtx op0, op1, op2, op3, op4, pat, pat2, insn;
machine_mode mode0, mode1, mode2, mode3, mode4;
unsigned int fcode = DECL_FUNCTION_CODE (fndecl);
@@ -35808,22 +36073,34 @@ rdseed_step:
case IX86_BUILTIN_SBB32:
icode = CODE_FOR_subborrowsi;
+ icode2 = CODE_FOR_subborrowsi_0;
mode0 = SImode;
+ mode1 = DImode;
+ mode2 = CCmode;
goto handlecarry;
case IX86_BUILTIN_SBB64:
icode = CODE_FOR_subborrowdi;
+ icode2 = CODE_FOR_subborrowdi_0;
mode0 = DImode;
+ mode1 = TImode;
+ mode2 = CCmode;
goto handlecarry;
case IX86_BUILTIN_ADDCARRYX32:
icode = CODE_FOR_addcarrysi;
+ icode2 = CODE_FOR_addcarrysi_0;
mode0 = SImode;
+ mode1 = DImode;
+ mode2 = CCCmode;
goto handlecarry;
case IX86_BUILTIN_ADDCARRYX64:
icode = CODE_FOR_addcarrydi;
+ icode2 = CODE_FOR_addcarrydi_0;
mode0 = DImode;
+ mode1 = TImode;
+ mode2 = CCCmode;
handlecarry:
arg0 = CALL_EXPR_ARG (exp, 0); /* unsigned char c_in. */
@@ -35832,7 +36109,8 @@ rdseed_step:
arg3 = CALL_EXPR_ARG (exp, 3); /* unsigned int *sum_out. */
op1 = expand_normal (arg0);
- op1 = copy_to_mode_reg (QImode, convert_to_mode (QImode, op1, 1));
+ if (!integer_zerop (arg0))
+ op1 = copy_to_mode_reg (QImode, convert_to_mode (QImode, op1, 1));
op2 = expand_normal (arg1);
if (!register_operand (op2, mode0))
@@ -35849,21 +36127,31 @@ rdseed_step:
op4 = copy_addr_to_reg (op4);
}
- /* Generate CF from input operand. */
- emit_insn (gen_addqi3_cconly_overflow (op1, constm1_rtx));
-
- /* Generate instruction that consumes CF. */
op0 = gen_reg_rtx (mode0);
+ if (integer_zerop (arg0))
+ {
+	  /* If arg0 is 0, optimize right away into an add or sub
+	     instruction that sets CCCmode flags.  */
+ op1 = gen_rtx_REG (mode2, FLAGS_REG);
+ emit_insn (GEN_FCN (icode2) (op0, op2, op3));
+ }
+ else
+ {
+ /* Generate CF from input operand. */
+ emit_insn (gen_addqi3_cconly_overflow (op1, constm1_rtx));
- op1 = gen_rtx_REG (CCCmode, FLAGS_REG);
- pat = gen_rtx_LTU (mode0, op1, const0_rtx);
- emit_insn (GEN_FCN (icode) (op0, op2, op3, op1, pat));
+ /* Generate instruction that consumes CF. */
+ op1 = gen_rtx_REG (CCCmode, FLAGS_REG);
+ pat = gen_rtx_LTU (mode1, op1, const0_rtx);
+ pat2 = gen_rtx_LTU (mode0, op1, const0_rtx);
+ emit_insn (GEN_FCN (icode) (op0, op2, op3, op1, pat, pat2));
+ }
/* Return current CF value. */
if (target == 0)
target = gen_reg_rtx (QImode);
- PUT_MODE (pat, QImode);
+ pat = gen_rtx_LTU (QImode, op1, const0_rtx);
emit_insn (gen_rtx_SET (target, pat));
/* Store the result. */
@@ -36656,6 +36944,57 @@ rdseed_step:
emit_insn (gen_xabort (op0));
return 0;
+ case IX86_BUILTIN_RSTORSSP:
+ case IX86_BUILTIN_CLRSSBSY:
+ arg0 = CALL_EXPR_ARG (exp, 0);
+ op0 = expand_normal (arg0);
+ icode = (fcode == IX86_BUILTIN_RSTORSSP
+ ? CODE_FOR_rstorssp
+ : CODE_FOR_clrssbsy);
+ if (!address_operand (op0, VOIDmode))
+ {
+ op1 = convert_memory_address (Pmode, op0);
+ op0 = copy_addr_to_reg (op1);
+ }
+ emit_insn (GEN_FCN (icode) (gen_rtx_MEM (Pmode, op0)));
+ return 0;
+
+ case IX86_BUILTIN_WRSSD:
+ case IX86_BUILTIN_WRSSQ:
+ case IX86_BUILTIN_WRUSSD:
+ case IX86_BUILTIN_WRUSSQ:
+ arg0 = CALL_EXPR_ARG (exp, 0);
+ op0 = expand_normal (arg0);
+ arg1 = CALL_EXPR_ARG (exp, 1);
+ op1 = expand_normal (arg1);
+ switch (fcode)
+ {
+ case IX86_BUILTIN_WRSSD:
+ icode = CODE_FOR_wrsssi;
+ mode = SImode;
+ break;
+ case IX86_BUILTIN_WRSSQ:
+ icode = CODE_FOR_wrssdi;
+ mode = DImode;
+ break;
+ case IX86_BUILTIN_WRUSSD:
+ icode = CODE_FOR_wrusssi;
+ mode = SImode;
+ break;
+ case IX86_BUILTIN_WRUSSQ:
+ icode = CODE_FOR_wrussdi;
+ mode = DImode;
+ break;
+ }
+ op0 = force_reg (mode, op0);
+ if (!address_operand (op1, VOIDmode))
+ {
+ op2 = convert_memory_address (Pmode, op1);
+ op1 = copy_addr_to_reg (op2);
+ }
+ emit_insn (GEN_FCN (icode) (op0, gen_rtx_MEM (mode, op1)));
+ return 0;
+
default:
break;
}
@@ -36958,6 +37297,22 @@ s4fma_expand:
d->flag, d->comparison);
}
+ if (fcode >= IX86_BUILTIN__BDESC_CET_FIRST
+ && fcode <= IX86_BUILTIN__BDESC_CET_LAST)
+ {
+ i = fcode - IX86_BUILTIN__BDESC_CET_FIRST;
+ return ix86_expand_special_args_builtin (bdesc_cet + i, exp,
+ target);
+ }
+
+ if (fcode >= IX86_BUILTIN__BDESC_CET_NORMAL_FIRST
+ && fcode <= IX86_BUILTIN__BDESC_CET_NORMAL_LAST)
+ {
+ i = fcode - IX86_BUILTIN__BDESC_CET_NORMAL_FIRST;
+ return ix86_expand_args_builtin (bdesc_cet_rdssp + i, exp,
+ target);
+ }
+
gcc_unreachable ();
}
@@ -38347,6 +38702,28 @@ ix86_can_change_mode_class (machine_mode from, machine_mode to,
return true;
}
+/* Return index of MODE in the SSE load/store tables.  */
+
+static inline int
+sse_store_index (machine_mode mode)
+{
+ switch (GET_MODE_SIZE (mode))
+ {
+ case 4:
+ return 0;
+ case 8:
+ return 1;
+ case 16:
+ return 2;
+ case 32:
+ return 3;
+ case 64:
+ return 4;
+ default:
+ return -1;
+ }
+}
+
/* Return the cost of moving data of mode M between a
register and memory. A value of 2 is the default; this cost is
relative to those in `REGISTER_MOVE_COST'.
@@ -38390,21 +38767,9 @@ inline_memory_move_cost (machine_mode mode, enum reg_class regclass,
}
if (SSE_CLASS_P (regclass))
{
- int index;
- switch (GET_MODE_SIZE (mode))
- {
- case 4:
- index = 0;
- break;
- case 8:
- index = 1;
- break;
- case 16:
- index = 2;
- break;
- default:
- return 100;
- }
+ int index = sse_store_index (mode);
+ if (index == -1)
+ return 100;
if (in == 2)
return MAX (ix86_cost->sse_load [index], ix86_cost->sse_store [index]);
return in ? ix86_cost->sse_load [index] : ix86_cost->sse_store [index];
@@ -38507,8 +38872,10 @@ ix86_register_move_cost (machine_mode mode, reg_class_t class1_i,
/* In case of copying from general_purpose_register we may emit multiple
stores followed by single load causing memory size mismatch stall.
Count this as arbitrarily high cost of 20. */
- if (targetm.class_max_nregs (class1, mode)
- > targetm.class_max_nregs (class2, mode))
+ if (GET_MODE_BITSIZE (mode) > BITS_PER_WORD
+ && TARGET_MEMORY_MISMATCH_STALL
+ && targetm.class_max_nregs (class1, mode)
+ > targetm.class_max_nregs (class2, mode))
cost += 20;
/* In the case of FP/MMX moves, the registers actually overlap, and we
@@ -38530,12 +38897,19 @@ ix86_register_move_cost (machine_mode mode, reg_class_t class1_i,
where integer modes in MMX/SSE registers are not tieable
because of missing QImode and HImode moves to, from or between
MMX/SSE registers. */
- return MAX (8, ix86_cost->mmxsse_to_integer);
+ return MAX (8, MMX_CLASS_P (class1) || MMX_CLASS_P (class2)
+ ? ix86_cost->mmxsse_to_integer : ix86_cost->ssemmx_to_integer);
if (MAYBE_FLOAT_CLASS_P (class1))
return ix86_cost->fp_move;
if (MAYBE_SSE_CLASS_P (class1))
- return ix86_cost->sse_move;
+ {
+ if (GET_MODE_BITSIZE (mode) <= 128)
+ return ix86_cost->xmm_move;
+ if (GET_MODE_BITSIZE (mode) <= 256)
+ return ix86_cost->ymm_move;
+ return ix86_cost->zmm_move;
+ }
if (MAYBE_MMX_CLASS_P (class1))
return ix86_cost->mmx_move;
return 2;
@@ -38806,6 +39180,27 @@ ix86_set_reg_reg_cost (machine_mode mode)
return COSTS_N_INSNS (CEIL (GET_MODE_SIZE (mode), units));
}
+/* Return the cost of a vector operation in MODE, given that the scalar
+   version has cost COST.  If PARALLEL is true, assume that the CPU has
+   more than one unit performing the operation.  */
+
+static int
+ix86_vec_cost (machine_mode mode, int cost, bool parallel)
+{
+ if (!VECTOR_MODE_P (mode))
+ return cost;
+
+ if (!parallel)
+ return cost * GET_MODE_NUNITS (mode);
+ if (GET_MODE_BITSIZE (mode) == 128
+ && TARGET_SSE_SPLIT_REGS)
+ return cost * 2;
+ if (GET_MODE_BITSIZE (mode) > 128
+ && TARGET_AVX128_OPTIMAL)
+ return cost * GET_MODE_BITSIZE (mode) / 128;
+ return cost;
+}
+
/* Compute a (partial) cost for rtx X. Return true if the complete
cost has been computed, and false if subexpressions should be
scanned. In either case, *TOTAL contains the cost result. */
@@ -38819,6 +39214,9 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
enum rtx_code outer_code = (enum rtx_code) outer_code_i;
const struct processor_costs *cost = speed ? ix86_cost : &ix86_size_cost;
int src_cost;
+ machine_mode inner_mode = mode;
+ if (VECTOR_MODE_P (mode))
+ inner_mode = GET_MODE_INNER (mode);
switch (code)
{
@@ -38963,19 +39361,20 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
shift with one insn set the cost to prefer paddb. */
if (CONSTANT_P (XEXP (x, 1)))
{
- *total = (cost->fabs
+ *total = ix86_vec_cost (mode,
+ cost->sse_op
+ rtx_cost (XEXP (x, 0), mode, code, 0, speed)
- + (speed ? 2 : COSTS_N_BYTES (16)));
+ + (speed ? 2 : COSTS_N_BYTES (16)), true);
return true;
}
count = 3;
}
else if (TARGET_SSSE3)
count = 7;
- *total = cost->fabs * count;
+ *total = ix86_vec_cost (mode, cost->sse_op * count, true);
}
else
- *total = cost->fabs;
+ *total = ix86_vec_cost (mode, cost->sse_op, true);
}
else if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
{
@@ -39017,9 +39416,9 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
gcc_assert (FLOAT_MODE_P (mode));
gcc_assert (TARGET_FMA || TARGET_FMA4 || TARGET_AVX512F);
- /* ??? SSE scalar/vector cost should be used here. */
- /* ??? Bald assumption that fma has the same cost as fmul. */
- *total = cost->fmul;
+ *total = ix86_vec_cost (mode,
+ mode == SFmode ? cost->fmass : cost->fmasd,
+ true);
*total += rtx_cost (XEXP (x, 1), mode, FMA, 1, speed);
/* Negate in op0 or op2 is free: FMS, FNMA, FNMS. */
@@ -39038,8 +39437,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
case MULT:
if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
{
- /* ??? SSE scalar cost should be used here. */
- *total = cost->fmul;
+ *total = inner_mode == DFmode ? cost->mulsd : cost->mulss;
return false;
}
else if (X87_FLOAT_MODE_P (mode))
@@ -39049,8 +39447,9 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
}
else if (FLOAT_MODE_P (mode))
{
- /* ??? SSE vector cost should be used here. */
- *total = cost->fmul;
+ *total = ix86_vec_cost (mode,
+ inner_mode == DFmode
+ ? cost->mulsd : cost->mulss, true);
return false;
}
else if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
@@ -39063,22 +39462,29 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
extra = 5;
else if (TARGET_SSSE3)
extra = 6;
- *total = cost->fmul * 2 + cost->fabs * extra;
+ *total = ix86_vec_cost (mode,
+ cost->mulss * 2 + cost->sse_op * extra,
+ true);
}
/* V*DImode is emulated with 5-8 insns. */
else if (mode == V2DImode || mode == V4DImode)
{
if (TARGET_XOP && mode == V2DImode)
- *total = cost->fmul * 2 + cost->fabs * 3;
+ *total = ix86_vec_cost (mode,
+ cost->mulss * 2 + cost->sse_op * 3,
+ true);
else
- *total = cost->fmul * 3 + cost->fabs * 5;
+ *total = ix86_vec_cost (mode,
+ cost->mulss * 3 + cost->sse_op * 5,
+ true);
}
/* Without sse4.1, we don't have PMULLD; it's emulated with 7
insns, including two PMULUDQ. */
else if (mode == V4SImode && !(TARGET_SSE4_1 || TARGET_AVX))
- *total = cost->fmul * 2 + cost->fabs * 5;
+ *total = ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 5,
+ true);
else
- *total = cost->fmul;
+ *total = ix86_vec_cost (mode, cost->mulss, true);
return false;
}
else
@@ -39132,13 +39538,13 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
case MOD:
case UMOD:
if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
- /* ??? SSE cost should be used here. */
- *total = cost->fdiv;
+ *total = inner_mode == DFmode ? cost->divsd : cost->divss;
else if (X87_FLOAT_MODE_P (mode))
*total = cost->fdiv;
else if (FLOAT_MODE_P (mode))
- /* ??? SSE vector cost should be used here. */
- *total = cost->fdiv;
+ *total = ix86_vec_cost (mode,
+ inner_mode == DFmode ? cost->divsd : cost->divss,
+ true);
else
*total = cost->divide[MODE_INDEX (mode)];
return false;
@@ -39217,8 +39623,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
{
- /* ??? SSE cost should be used here. */
- *total = cost->fadd;
+ *total = cost->addss;
return false;
}
else if (X87_FLOAT_MODE_P (mode))
@@ -39228,8 +39633,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
}
else if (FLOAT_MODE_P (mode))
{
- /* ??? SSE vector cost should be used here. */
- *total = cost->fadd;
+ *total = ix86_vec_cost (mode, cost->addss, true);
return false;
}
/* FALLTHRU */
@@ -39252,8 +39656,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
case NEG:
if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
{
- /* ??? SSE cost should be used here. */
- *total = cost->fchs;
+ *total = cost->sse_op;
return false;
}
else if (X87_FLOAT_MODE_P (mode))
@@ -39263,20 +39666,14 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
}
else if (FLOAT_MODE_P (mode))
{
- /* ??? SSE vector cost should be used here. */
- *total = cost->fchs;
+ *total = ix86_vec_cost (mode, cost->sse_op, true);
return false;
}
/* FALLTHRU */
case NOT:
if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
- {
- /* ??? Should be SSE vector operation cost. */
- /* At least for published AMD latencies, this really is the same
- as the latency for a simple fpu operation like fabs. */
- *total = cost->fabs;
- }
+ *total = ix86_vec_cost (mode, cost->sse_op, true);
else if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
*total = cost->add * 2;
else
@@ -39309,28 +39706,38 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
case FLOAT_EXTEND:
if (!(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH))
*total = 0;
+ else
+ *total = ix86_vec_cost (mode, cost->addss, true);
+ return false;
+
+ case FLOAT_TRUNCATE:
+ if (!(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH))
+ *total = cost->fadd;
+ else
+ *total = ix86_vec_cost (mode, cost->addss, true);
return false;
case ABS:
+ /* SSE requires memory load for the constant operand. It may make
+ sense to account for this. Of course the constant operand may or
+ may not be reused. */
if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
- /* ??? SSE cost should be used here. */
- *total = cost->fabs;
+ *total = cost->sse_op;
else if (X87_FLOAT_MODE_P (mode))
*total = cost->fabs;
else if (FLOAT_MODE_P (mode))
- /* ??? SSE vector cost should be used here. */
- *total = cost->fabs;
+ *total = ix86_vec_cost (mode, cost->sse_op, true);
return false;
case SQRT:
if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
- /* ??? SSE cost should be used here. */
- *total = cost->fsqrt;
+ *total = mode == SFmode ? cost->sqrtss : cost->sqrtsd;
else if (X87_FLOAT_MODE_P (mode))
*total = cost->fsqrt;
else if (FLOAT_MODE_P (mode))
- /* ??? SSE vector cost should be used here. */
- *total = cost->fsqrt;
+ *total = ix86_vec_cost (mode,
+ mode == SFmode ? cost->sqrtss : cost->sqrtsd,
+ true);
return false;
case UNSPEC:
@@ -39344,7 +39751,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
/* ??? Assume all of these vector manipulation patterns are
recognizable. In which case they all pretty much have the
same cost. */
- *total = cost->fabs;
+ *total = cost->sse_op;
return true;
case VEC_MERGE:
mask = XEXP (x, 2);
@@ -39353,7 +39760,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
if (TARGET_AVX512F && register_operand (mask, GET_MODE (mask)))
*total = rtx_cost (XEXP (x, 0), mode, outer_code, opno, speed);
else
- *total = cost->fabs;
+ *total = cost->sse_op;
return true;
default:
@@ -39818,6 +40225,10 @@ x86_output_mi_thunk (FILE *file, tree, HOST_WIDE_INT delta,
emit_note (NOTE_INSN_PROLOGUE_END);
+  /* CET is enabled; insert an ENDBR instruction.  */
+ if ((flag_cf_protection & CF_BRANCH) && TARGET_IBT)
+ emit_insn (gen_nop_endbr ());
+
/* If VCALL_OFFSET, we'll need THIS in a register. Might as well
pull it in now and let DELTA benefit. */
if (REG_P (this_param))
@@ -40835,7 +41246,7 @@ ix86_vector_duplicate_value (machine_mode mode, rtx target, rtx val)
reg = force_reg (innermode, val);
if (GET_MODE (reg) != innermode)
reg = gen_lowpart (innermode, reg);
- XEXP (dup, 0) = reg;
+ SET_SRC (PATTERN (insn)) = gen_vec_duplicate (mode, reg);
seq = get_insns ();
end_sequence ();
if (seq)
@@ -42800,9 +43211,9 @@ ix86_encode_section_info (tree decl, rtx rtl, int first)
enum rtx_code
ix86_reverse_condition (enum rtx_code code, machine_mode mode)
{
- return (mode != CCFPmode && mode != CCFPUmode
- ? reverse_condition (code)
- : reverse_condition_maybe_unordered (code));
+ return (mode == CCFPmode
+ ? reverse_condition_maybe_unordered (code)
+ : reverse_condition (code));
}
/* Output code to perform an x87 FP register move, from OPERANDS[1]
@@ -43415,17 +43826,20 @@ static rtx_code_label *
ix86_expand_sse_compare_and_jump (enum rtx_code code, rtx op0, rtx op1,
bool swap_operands)
{
- machine_mode fpcmp_mode = ix86_fp_compare_mode (code);
+ bool unordered_compare = ix86_unordered_fp_compare (code);
rtx_code_label *label;
- rtx tmp;
+ rtx tmp, reg;
if (swap_operands)
std::swap (op0, op1);
label = gen_label_rtx ();
- tmp = gen_rtx_REG (fpcmp_mode, FLAGS_REG);
- emit_insn (gen_rtx_SET (tmp, gen_rtx_COMPARE (fpcmp_mode, op0, op1)));
- tmp = gen_rtx_fmt_ee (code, VOIDmode, tmp, const0_rtx);
+ tmp = gen_rtx_COMPARE (CCFPmode, op0, op1);
+ if (unordered_compare)
+ tmp = gen_rtx_UNSPEC (CCFPmode, gen_rtvec (1, tmp), UNSPEC_NOTRAP);
+ reg = gen_rtx_REG (CCFPmode, FLAGS_REG);
+ emit_insn (gen_rtx_SET (reg, tmp));
+ tmp = gen_rtx_fmt_ee (code, VOIDmode, reg, const0_rtx);
tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, tmp,
gen_rtx_LABEL_REF (VOIDmode, label), pc_rtx);
tmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
@@ -44044,35 +44458,83 @@ static int
ix86_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
tree vectype, int)
{
+ bool fp = false;
+ machine_mode mode = TImode;
+ int index;
+ if (vectype != NULL)
+ {
+ fp = FLOAT_TYPE_P (vectype);
+ mode = TYPE_MODE (vectype);
+ }
+
switch (type_of_cost)
{
case scalar_stmt:
- return ix86_cost->scalar_stmt_cost;
+ return fp ? ix86_cost->addss : COSTS_N_INSNS (1);
case scalar_load:
- return ix86_cost->scalar_load_cost;
+      /* Load/store costs are relative to a register move, which is 2.  Recompute
+	 them to COSTS_N_INSNS so everything has the same base.  */
+ return COSTS_N_INSNS (fp ? ix86_cost->sse_load[0]
+ : ix86_cost->int_load [2]) / 2;
case scalar_store:
- return ix86_cost->scalar_store_cost;
+ return COSTS_N_INSNS (fp ? ix86_cost->sse_store[0]
+ : ix86_cost->int_store [2]) / 2;
case vector_stmt:
- return ix86_cost->vec_stmt_cost;
+ return ix86_vec_cost (mode,
+ fp ? ix86_cost->addss : ix86_cost->sse_op,
+ true);
case vector_load:
- return ix86_cost->vec_align_load_cost;
+ index = sse_store_index (mode);
+ gcc_assert (index >= 0);
+ return ix86_vec_cost (mode,
+ COSTS_N_INSNS (ix86_cost->sse_load[index]) / 2,
+ true);
case vector_store:
- return ix86_cost->vec_store_cost;
+ index = sse_store_index (mode);
+ return ix86_vec_cost (mode,
+ COSTS_N_INSNS (ix86_cost->sse_store[index]) / 2,
+ true);
case vec_to_scalar:
- return ix86_cost->vec_to_scalar_cost;
-
case scalar_to_vec:
- return ix86_cost->scalar_to_vec_cost;
+ return ix86_vec_cost (mode, ix86_cost->sse_op, true);
+ /* We should have separate costs for unaligned loads and gather/scatter.
+ Do that incrementally. */
case unaligned_load:
+ index = sse_store_index (mode);
+ return ix86_vec_cost (mode,
+ COSTS_N_INSNS
+ (ix86_cost->sse_unaligned_load[index]) / 2,
+ true);
+
case unaligned_store:
- return ix86_cost->vec_unalign_load_cost;
+ index = sse_store_index (mode);
+ return ix86_vec_cost (mode,
+ COSTS_N_INSNS
+ (ix86_cost->sse_unaligned_store[index]) / 2,
+ true);
+
+ case vector_gather_load:
+ return ix86_vec_cost (mode,
+ COSTS_N_INSNS
+ (ix86_cost->gather_static
+ + ix86_cost->gather_per_elt
+ * TYPE_VECTOR_SUBPARTS (vectype)) / 2,
+ true);
+
+ case vector_scatter_store:
+ return ix86_vec_cost (mode,
+ COSTS_N_INSNS
+ (ix86_cost->scatter_static
+ + ix86_cost->scatter_per_elt
+ * TYPE_VECTOR_SUBPARTS (vectype)) / 2,
+ true);
case cond_branch_taken:
return ix86_cost->cond_taken_branch_cost;
@@ -44082,10 +44544,11 @@ ix86_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
case vec_perm:
case vec_promote_demote:
- return ix86_cost->vec_stmt_cost;
+ return ix86_vec_cost (mode,
+ ix86_cost->sse_op, true);
case vec_construct:
- return ix86_cost->vec_stmt_cost * (TYPE_VECTOR_SUBPARTS (vectype) - 1);
+ return ix86_vec_cost (mode, ix86_cost->sse_op, false);
default:
gcc_unreachable ();
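
The vectorization-cost hook above rescales latency-table entries, which are expressed relative to a register move of cost 2, into COSTS_N_INSNS units, and prices gather loads and scatter stores as a static part plus a per-element part.  A minimal stand-alone C sketch of that arithmetic follows; it is only an illustration with made-up latencies, and the local COSTS_N_INSNS definition assumes GCC's usual scale of 4 cost units per instruction.

    #include <stdio.h>

    /* Illustrative only: mirror the cost normalisation described above.
       COSTS_N_INSNS is redefined locally, assuming one instruction equals
       4 cost units as in GCC's rtl.h.  */
    #define COSTS_N_INSNS(n) ((n) * 4)

    /* Latency tables are relative to a register move of cost 2, so divide
       by 2 after converting to COSTS_N_INSNS to keep a common base.  */
    static int
    normalize_mem_cost (int table_latency)
    {
      return COSTS_N_INSNS (table_latency) / 2;
    }

    /* Gather/scatter cost = static part + per-element part * nelts.  */
    static int
    gather_scatter_cost (int stat, int per_elt, int nelts)
    {
      return COSTS_N_INSNS (stat + per_elt * nelts) / 2;
    }

    int
    main (void)
    {
      printf ("aligned load: %d\n", normalize_mem_cost (12));        /* 24 */
      printf ("8-elt gather: %d\n", gather_scatter_cost (16, 6, 8)); /* 128 */
      return 0;
    }
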
@@ -44963,8 +45426,8 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
if (ix86_expand_vec_one_operand_perm_avx512 (d))
return true;
- /* Try the AVX512F vpermi2 instructions. */
- if (ix86_expand_vec_perm_vpermi2 (NULL_RTX, NULL_RTX, NULL_RTX, NULL_RTX, d))
+ /* Try the AVX512F vpermt2/vpermi2 instructions. */
+ if (ix86_expand_vec_perm_vpermt2 (NULL_RTX, NULL_RTX, NULL_RTX, NULL_RTX, d))
return true;
/* See if we can get the same permutation in different vector integer
@@ -46621,9 +47084,9 @@ expand_vec_perm_broadcast (struct expand_vec_perm_d *d)
}
/* Implement arbitrary permutations of two V64QImode operands
- will 2 vpermi2w, 2 vpshufb and one vpor instruction. */
+ with 2 vperm[it]2w, 2 vpshufb and one vpor instruction. */
static bool
-expand_vec_perm_vpermi2_vpshub2 (struct expand_vec_perm_d *d)
+expand_vec_perm_vpermt2_vpshub2 (struct expand_vec_perm_d *d)
{
if (!TARGET_AVX512BW || !(d->vmode == V64QImode))
return false;
@@ -46868,7 +47331,7 @@ ix86_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
if (expand_vec_perm_vpshufb2_vpermq_even_odd (d))
return true;
- if (expand_vec_perm_vpermi2_vpshub2 (d))
+ if (expand_vec_perm_vpermt2_vpshub2 (d))
return true;
/* ??? Look for narrow permutations whose element orderings would
@@ -47016,17 +47479,17 @@ ix86_vectorize_vec_perm_const_ok (machine_mode vmode, vec_perm_indices sel)
case E_V8DImode:
case E_V8DFmode:
if (TARGET_AVX512F)
- /* All implementable with a single vpermi2 insn. */
+ /* All implementable with a single vperm[it]2 insn. */
return true;
break;
case E_V32HImode:
if (TARGET_AVX512BW)
- /* All implementable with a single vpermi2 insn. */
+ /* All implementable with a single vperm[it]2 insn. */
return true;
break;
case E_V64QImode:
if (TARGET_AVX512BW)
- /* Implementable with 2 vpermi2, 2 vpshufb and 1 or insn. */
+ /* Implementable with 2 vperm[it]2, 2 vpshufb and 1 or insn. */
return true;
break;
case E_V8SImode:
@@ -47034,7 +47497,7 @@ ix86_vectorize_vec_perm_const_ok (machine_mode vmode, vec_perm_indices sel)
case E_V4DFmode:
case E_V4DImode:
if (TARGET_AVX512VL)
- /* All implementable with a single vpermi2 insn. */
+ /* All implementable with a single vperm[it]2 insn. */
return true;
break;
case E_V16HImode:
@@ -47204,7 +47667,6 @@ ix86_expand_vecop_qihi (enum rtx_code code, rtx dest, rtx op1, rtx op2)
op2_h = gen_reg_rtx (qimode);
emit_insn (gen_il (op2_l, op2, op2));
emit_insn (gen_ih (op2_h, op2, op2));
- /* FALLTHRU */
op1_l = gen_reg_rtx (qimode);
op1_h = gen_reg_rtx (qimode);
@@ -47632,6 +48094,46 @@ ix86_bnd_prefixed_insn_p (rtx insn)
return chkp_function_instrumented_p (current_function_decl);
}
+/* Return true if control transfer instruction INSN
+   should be encoded with the notrack prefix.  */
+
+static bool
+ix86_notrack_prefixed_insn_p (rtx insn)
+{
+ if (!insn || !((flag_cf_protection & CF_BRANCH) && TARGET_IBT))
+ return false;
+
+ if (CALL_P (insn))
+ {
+ rtx call = get_call_rtx_from (insn);
+ gcc_assert (call != NULL_RTX);
+ rtx addr = XEXP (call, 0);
+
+ /* Do not emit 'notrack' if it's not an indirect call. */
+ if (MEM_P (addr)
+ && GET_CODE (XEXP (addr, 0)) == SYMBOL_REF)
+ return false;
+ else
+ return find_reg_note (insn, REG_CALL_NOCF_CHECK, 0);
+ }
+
+ if (JUMP_P (insn) && !flag_cet_switch)
+ {
+ rtx target = JUMP_LABEL (insn);
+ if (target == NULL_RTX || ANY_RETURN_P (target))
+ return false;
+
+      /* Check that the jump is a switch-table jump.  */
+ rtx_insn *label = as_a<rtx_insn *> (target);
+ rtx_insn *table = next_insn (label);
+ if (table == NULL_RTX || !JUMP_TABLE_DATA_P (table))
+ return false;
+ else
+ return true;
+ }
+ return false;
+}
+
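
For context on when ix86_notrack_prefixed_insn_p fires for calls: the REG_CALL_NOCF_CHECK note it looks for is what the compiler attaches to indirect calls through nocf_check-qualified function types.  A hedged source-level example, assuming -fcf-protection=branch and -mibt (the type and function names below are invented for illustration):

    /* Hypothetical user code: calls through this pointer type are exempt
       from endbr-based tracking, so they may be emitted with the notrack
       prefix recognised by the predicate above.  */
    typedef void (*untracked_fn) (void) __attribute__ ((nocf_check));

    void
    call_untracked (untracked_fn fp)
    {
      fp ();   /* indirect call carrying REG_CALL_NOCF_CHECK */
    }
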
/* Calculate integer abs() using only SSE2 instructions. */
void
@@ -49420,6 +49922,9 @@ ix86_run_selftests (void)
#undef TARGET_DELEGITIMIZE_ADDRESS
#define TARGET_DELEGITIMIZE_ADDRESS ix86_delegitimize_address
+#undef TARGET_CONST_NOT_OK_FOR_DEBUG_P
+#define TARGET_CONST_NOT_OK_FOR_DEBUG_P ix86_const_not_ok_for_debug_p
+
#undef TARGET_MS_BITFIELD_LAYOUT_P
#define TARGET_MS_BITFIELD_LAYOUT_P ix86_ms_bitfield_layout_p
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 1c796ef392d..4855105c4ac 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -103,6 +103,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
#define TARGET_SGX_P(x) TARGET_ISA_SGX_P(x)
#define TARGET_RDPID TARGET_ISA_RDPID
#define TARGET_RDPID_P(x) TARGET_ISA_RDPID_P(x)
+#define TARGET_GFNI TARGET_ISA_GFNI
+#define TARGET_GFNI_P(x) TARGET_ISA_GFNI_P(x)
#define TARGET_BMI TARGET_ISA_BMI
#define TARGET_BMI_P(x) TARGET_ISA_BMI_P(x)
#define TARGET_BMI2 TARGET_ISA_BMI2
@@ -167,6 +169,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
#define TARGET_MWAITX_P(x) TARGET_ISA_MWAITX_P(x)
#define TARGET_PKU TARGET_ISA_PKU
#define TARGET_PKU_P(x) TARGET_ISA_PKU_P(x)
+#define TARGET_IBT TARGET_ISA_IBT
+#define TARGET_IBT_P(x) TARGET_ISA_IBT_P(x)
+#define TARGET_SHSTK TARGET_ISA_SHSTK
+#define TARGET_SHSTK_P(x) TARGET_ISA_SHSTK_P(x)
#define TARGET_LP64 TARGET_ABI_64
#define TARGET_LP64_P(x) TARGET_ABI_64_P(x)
@@ -236,13 +242,21 @@ struct processor_costs {
in SImode and DImode */
const int mmx_store[2]; /* cost of storing MMX register
in SImode and DImode */
- const int sse_move; /* cost of moving SSE register. */
- const int sse_load[3]; /* cost of loading SSE register
- in SImode, DImode and TImode*/
- const int sse_store[3]; /* cost of storing SSE register
- in SImode, DImode and TImode*/
+ const int xmm_move, ymm_move, /* cost of moving XMM and YMM register. */
+ zmm_move;
+  const int sse_load[5];	/* cost of loading SSE register
+				   in 32bit, 64bit, 128bit, 256bit and 512bit.  */
+  const int sse_unaligned_load[5];/* cost of unaligned load.  */
+  const int sse_store[5];	/* cost of storing SSE register
+				   in 32bit, 64bit, 128bit, 256bit and 512bit.  */
+ const int sse_unaligned_store[5];/* cost of unaligned store. */
const int mmxsse_to_integer; /* cost of moving mmxsse register to
- integer and vice versa. */
+ integer. */
+ const int ssemmx_to_integer; /* cost of moving integer to mmxsse register. */
+ const int gather_static, gather_per_elt; /* Cost of gather load is computed
+ as static + per_item * nelts. */
+  const int scatter_static, scatter_per_elt; /* Cost of scatter store is
+					      computed as static + per_item * nelts.  */
const int l1_cache_size; /* size of l1 cache, in kilobytes. */
const int l2_cache_size; /* size of l2 cache, in kilobytes. */
const int prefetch_block; /* bytes moved to cache for prefetch. */
@@ -257,6 +271,16 @@ struct processor_costs {
const int fsqrt; /* cost of FSQRT instruction. */
/* Specify what algorithm
to use for stringops on unknown size. */
+ const int sse_op; /* cost of cheap SSE instruction. */
+ const int addss; /* cost of ADDSS/SD SUBSS/SD instructions. */
+ const int mulss; /* cost of MULSS instructions. */
+ const int mulsd; /* cost of MULSD instructions. */
+ const int fmass; /* cost of FMASS instructions. */
+ const int fmasd; /* cost of FMASD instructions. */
+ const int divss; /* cost of DIVSS instructions. */
+ const int divsd; /* cost of DIVSD instructions. */
+ const int sqrtss; /* cost of SQRTSS instructions. */
+ const int sqrtsd; /* cost of SQRTSD instructions. */
const int reassoc_int, reassoc_fp, reassoc_vec_int, reassoc_vec_fp;
/* Specify reassociation width for integer,
fp, vector integer and vector fp
@@ -265,18 +289,6 @@ struct processor_costs {
parallel. See also
ix86_reassociation_width. */
struct stringop_algs *memcpy, *memset;
- const int scalar_stmt_cost; /* Cost of any scalar operation, excluding
- load and store. */
- const int scalar_load_cost; /* Cost of scalar load. */
- const int scalar_store_cost; /* Cost of scalar store. */
- const int vec_stmt_cost; /* Cost of any vector operation, excluding
- load, store, vector-to-scalar and
- scalar-to-vector operation. */
- const int vec_to_scalar_cost; /* Cost of vect-to-scalar operation. */
- const int scalar_to_vec_cost; /* Cost of scalar-to-vector operation. */
- const int vec_align_load_cost; /* Cost of aligned vector load. */
- const int vec_unalign_load_cost; /* Cost of unaligned vector load. */
- const int vec_store_cost; /* Cost of vector store. */
const int cond_taken_branch_cost; /* Cost of taken branch for vectorizer
cost model. */
const int cond_not_taken_branch_cost;/* Cost of not taken branch for
@@ -2589,6 +2601,7 @@ struct GTY(()) machine_function {
#define ix86_current_function_calls_tls_descriptor \
(ix86_tls_descriptor_calls_expanded_in_cfun && df_regs_ever_live_p (SP_REG))
#define ix86_static_chain_on_stack (cfun->machine->static_chain_on_stack)
+#define ix86_red_zone_size (cfun->machine->frame.red_zone_size)
/* Control behavior of x86_file_start. */
#define X86_FILE_START_VERSION_DIRECTIVE false
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 3413b90028f..d48decbb7d9 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -62,7 +62,7 @@
;; ; -- print a semicolon (after prefixes due to bug in older gas).
;; ~ -- print "i" if TARGET_AVX2, "f" otherwise.
;; ^ -- print addr32 prefix if TARGET_64BIT and Pmode != word_mode
-;; ! -- print MPX prefix for jxx/call/ret instructions if required.
+;; ! -- print MPX or NOTRACK prefix for jxx/call/ret instructions if required.
(define_c_enum "unspec" [
;; Relocation specifiers
@@ -99,6 +99,7 @@
UNSPEC_SCAS
UNSPEC_FNSTSW
UNSPEC_SAHF
+ UNSPEC_NOTRAP
UNSPEC_PARITY
UNSPEC_FSTCW
UNSPEC_FLDCW
@@ -112,6 +113,7 @@
UNSPEC_STOS
UNSPEC_PEEPSIB
UNSPEC_INSN_FALSE_DEP
+ UNSPEC_SBB
;; For SSE/MMX support:
UNSPEC_FIX_NOTRUNC
@@ -274,6 +276,17 @@
;; For RDPID support
UNSPECV_RDPID
+
+ ;; For CET support
+ UNSPECV_NOP_ENDBR
+ UNSPECV_NOP_RDSSP
+ UNSPECV_INCSSP
+ UNSPECV_SAVEPREVSSP
+ UNSPECV_RSTORSSP
+ UNSPECV_WRSS
+ UNSPECV_WRUSS
+ UNSPECV_SETSSBSY
+ UNSPECV_CLRSSBSY
])
;; Constants to represent rounding modes in the ROUND instruction
@@ -798,7 +811,7 @@
(define_attr "isa" "base,x64,x64_sse4,x64_sse4_noavx,x64_avx,nox64,
sse2,sse2_noavx,sse3,sse4,sse4_noavx,avx,noavx,
avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f,
- fma_avx512f,avx512bw,noavx512bw,avx512dq,noavx512dq,
+ avx512bw,noavx512bw,avx512dq,noavx512dq,
avx512vl,noavx512vl,x64_avx512dq,x64_avx512bw"
(const_string "base"))
@@ -832,8 +845,6 @@
(eq_attr "isa" "fma") (symbol_ref "TARGET_FMA")
(eq_attr "isa" "avx512f") (symbol_ref "TARGET_AVX512F")
(eq_attr "isa" "noavx512f") (symbol_ref "!TARGET_AVX512F")
- (eq_attr "isa" "fma_avx512f")
- (symbol_ref "TARGET_FMA || TARGET_AVX512F")
(eq_attr "isa" "avx512bw") (symbol_ref "TARGET_AVX512BW")
(eq_attr "isa" "noavx512bw") (symbol_ref "!TARGET_AVX512BW")
(eq_attr "isa" "avx512dq") (symbol_ref "TARGET_AVX512DQ")
@@ -1468,9 +1479,6 @@
;; FP compares, step 1:
;; Set the FP condition codes.
-;;
-;; CCFPmode compare with exceptions
-;; CCFPUmode compare with no exceptions
;; We may not use "#" to split and emit these, since the REG_DEAD notes
;; used to manage the reg stack popping would not be preserved.
@@ -1577,9 +1585,11 @@
(define_insn "*cmpu<mode>_i387"
[(set (match_operand:HI 0 "register_operand" "=a")
(unspec:HI
- [(compare:CCFPU
- (match_operand:X87MODEF 1 "register_operand" "f")
- (match_operand:X87MODEF 2 "register_operand" "f"))]
+ [(unspec:CCFP
+ [(compare:CCFP
+ (match_operand:X87MODEF 1 "register_operand" "f")
+ (match_operand:X87MODEF 2 "register_operand" "f"))]
+ UNSPEC_NOTRAP)]
UNSPEC_FNSTSW))]
"TARGET_80387"
"* return output_fp_compare (insn, operands, false, true);"
@@ -1588,18 +1598,22 @@
(set_attr "mode" "<MODE>")])
(define_insn_and_split "*cmpu<mode>_cc_i387"
- [(set (reg:CCFPU FLAGS_REG)
- (compare:CCFPU
- (match_operand:X87MODEF 1 "register_operand" "f")
- (match_operand:X87MODEF 2 "register_operand" "f")))
+ [(set (reg:CCFP FLAGS_REG)
+ (unspec:CCFP
+ [(compare:CCFP
+ (match_operand:X87MODEF 1 "register_operand" "f")
+ (match_operand:X87MODEF 2 "register_operand" "f"))]
+ UNSPEC_NOTRAP))
(clobber (match_operand:HI 0 "register_operand" "=a"))]
"TARGET_80387 && TARGET_SAHF && !TARGET_CMOVE"
"#"
"&& reload_completed"
[(set (match_dup 0)
(unspec:HI
- [(compare:CCFPU (match_dup 1)(match_dup 2))]
- UNSPEC_FNSTSW))
+ [(unspec:CCFP
+ [(compare:CCFP (match_dup 1)(match_dup 2))]
+ UNSPEC_NOTRAP)]
+ UNSPEC_FNSTSW))
(set (reg:CC FLAGS_REG)
(unspec:CC [(match_dup 0)] UNSPEC_SAHF))]
""
@@ -1685,21 +1699,30 @@
(set_attr "mode" "SI")])
;; Pentium Pro can do steps 1 through 3 in one go.
-;; comi*, ucomi*, fcomi*, ficomi*, fucomi*
-;; (these i387 instructions set flags directly)
+;; (these instructions set flags directly)
-(define_mode_iterator FPCMP [CCFP CCFPU])
-(define_mode_attr unord [(CCFP "") (CCFPU "u")])
+(define_subst_attr "unord" "unord_subst" "" "u")
+(define_subst_attr "unordered" "unord_subst" "false" "true")
+
+(define_subst "unord_subst"
+ [(set (match_operand:CCFP 0)
+ (match_operand:CCFP 1))]
+ ""
+ [(set (match_dup 0)
+ (unspec:CCFP
+ [(match_dup 1)]
+ UNSPEC_NOTRAP))])
-(define_insn "*cmpi<FPCMP:unord><MODEF:mode>"
- [(set (reg:FPCMP FLAGS_REG)
- (compare:FPCMP
+(define_insn "*cmpi<unord><MODEF:mode>"
+ [(set (reg:CCFP FLAGS_REG)
+ (compare:CCFP
(match_operand:MODEF 0 "register_operand" "f,v")
(match_operand:MODEF 1 "register_ssemem_operand" "f,vm")))]
"(SSE_FLOAT_MODE_P (<MODEF:MODE>mode) && TARGET_SSE_MATH)
|| (TARGET_80387 && TARGET_CMOVE)"
- "* return output_fp_compare (insn, operands, true,
- <FPCMP:MODE>mode == CCFPUmode);"
+ "@
+ * return output_fp_compare (insn, operands, true, <unordered>);
+ %v<unord>comi<MODEF:ssemodesuffix>\t{%1, %0|%0, %1}"
[(set_attr "type" "fcmp,ssecomi")
(set_attr "prefix" "orig,maybe_vex")
(set_attr "mode" "<MODEF:MODE>")
@@ -1728,13 +1751,12 @@
(symbol_ref "false"))))])
(define_insn "*cmpi<unord>xf_i387"
- [(set (reg:FPCMP FLAGS_REG)
- (compare:FPCMP
+ [(set (reg:CCFP FLAGS_REG)
+ (compare:CCFP
(match_operand:XF 0 "register_operand" "f")
(match_operand:XF 1 "register_operand" "f")))]
"TARGET_80387 && TARGET_CMOVE"
- "* return output_fp_compare (insn, operands, true,
- <MODE>mode == CCFPUmode);"
+ "* return output_fp_compare (insn, operands, true, <unordered>);"
[(set_attr "type" "fcmp")
(set_attr "mode" "XF")
(set_attr "athlon_decode" "vector")
@@ -3564,8 +3586,10 @@
/* movaps is one byte shorter for non-AVX targets. */
(eq_attr "alternative" "13,17")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
+ (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
+ (not (match_test "TARGET_AVX512VL")))
+ (ior (match_operand 0 "ext_sse_reg_operand")
+ (match_operand 1 "ext_sse_reg_operand")))
(const_string "V8DF")
(ior (not (match_test "TARGET_SSE2"))
(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
@@ -3739,8 +3763,10 @@
better to maintain the whole registers in single format
to avoid problems on using packed logical operations. */
(eq_attr "alternative" "6")
- (cond [(ior (match_operand 0 "ext_sse_reg_operand")
- (match_operand 1 "ext_sse_reg_operand"))
+ (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
+ (not (match_test "TARGET_AVX512VL")))
+ (ior (match_operand 0 "ext_sse_reg_operand")
+ (match_operand 1 "ext_sse_reg_operand")))
(const_string "V16SF")
(ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
(match_test "TARGET_SSE_SPLIT_REGS"))
@@ -6765,6 +6791,17 @@
[(set_attr "type" "alu")
(set_attr "mode" "<MODE>")])
+(define_peephole2
+ [(parallel
+ [(set (reg:CC FLAGS_REG)
+ (compare:CC (match_operand:SWI 0 "general_reg_operand")
+ (match_operand:SWI 1 "general_gr_operand")))
+ (set (match_dup 0)
+ (minus:SWI (match_dup 0) (match_dup 1)))])]
+ "find_regno_note (peep2_next_insn (0), REG_UNUSED, REGNO (operands[0])) != 0"
+ [(set (reg:CC FLAGS_REG)
+ (compare:CC (match_dup 0) (match_dup 1)))])
+
(define_insn "*subsi_3_zext"
[(set (reg FLAGS_REG)
(compare (match_operand:SI 1 "register_operand" "0")
@@ -6818,15 +6855,19 @@
(define_insn "addcarry<mode>"
[(set (reg:CCC FLAGS_REG)
(compare:CCC
- (plus:SWI48
+ (zero_extend:<DWI>
(plus:SWI48
- (match_operator:SWI48 4 "ix86_carry_flag_operator"
- [(match_operand 3 "flags_reg_operand") (const_int 0)])
- (match_operand:SWI48 1 "nonimmediate_operand" "%0"))
- (match_operand:SWI48 2 "nonimmediate_operand" "rm"))
- (match_dup 1)))
+ (plus:SWI48
+ (match_operator:SWI48 5 "ix86_carry_flag_operator"
+ [(match_operand 3 "flags_reg_operand") (const_int 0)])
+ (match_operand:SWI48 1 "nonimmediate_operand" "%0"))
+ (match_operand:SWI48 2 "nonimmediate_operand" "rm")))
+ (plus:<DWI>
+ (zero_extend:<DWI> (match_dup 2))
+ (match_operator:<DWI> 4 "ix86_carry_flag_operator"
+ [(match_dup 3) (const_int 0)]))))
(set (match_operand:SWI48 0 "register_operand" "=r")
- (plus:SWI48 (plus:SWI48 (match_op_dup 4
+ (plus:SWI48 (plus:SWI48 (match_op_dup 5
[(match_dup 3) (const_int 0)])
(match_dup 1))
(match_dup 2)))]
@@ -6837,6 +6878,18 @@
(set_attr "pent_pair" "pu")
(set_attr "mode" "<MODE>")])
+(define_expand "addcarry<mode>_0"
+ [(parallel
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (plus:SWI48
+ (match_operand:SWI48 1 "nonimmediate_operand")
+ (match_operand:SWI48 2 "x86_64_general_operand"))
+ (match_dup 1)))
+ (set (match_operand:SWI48 0 "register_operand")
+ (plus:SWI48 (match_dup 1) (match_dup 2)))])]
+ "ix86_binary_operator_ok (PLUS, <MODE>mode, operands)")
+
(define_insn "sub<mode>3_carry"
[(set (match_operand:SWI 0 "nonimmediate_operand" "=<r>m,<r>")
(minus:SWI
@@ -6870,18 +6923,67 @@
(set_attr "pent_pair" "pu")
(set_attr "mode" "SI")])
+(define_insn "sub<mode>3_carry_ccc"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (zero_extend:<DWI> (match_operand:DWIH 1 "register_operand" "0"))
+ (plus:<DWI>
+ (ltu:<DWI> (reg:CC FLAGS_REG) (const_int 0))
+ (zero_extend:<DWI>
+ (match_operand:DWIH 2 "x86_64_sext_operand" "rmWe")))))
+ (clobber (match_scratch:DWIH 0 "=r"))]
+ ""
+ "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "*sub<mode>3_carry_ccc_1"
+ [(set (reg:CCC FLAGS_REG)
+ (compare:CCC
+ (zero_extend:<DWI> (match_operand:DWIH 1 "register_operand" "0"))
+ (plus:<DWI>
+ (ltu:<DWI> (reg:CC FLAGS_REG) (const_int 0))
+ (match_operand:<DWI> 2 "x86_64_dwzext_immediate_operand" "Wf"))))
+ (clobber (match_scratch:DWIH 0 "=r"))]
+ ""
+{
+ operands[3] = simplify_subreg (<MODE>mode, operands[2], <DWI>mode, 0);
+ return "sbb{<imodesuffix>}\t{%3, %0|%0, %3}";
+}
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
+;; The sign flag is set from the
+;; (compare (match_dup 1) (plus:DWIH (ltu:DWIH ...) (match_dup 2)))
+;; result, the overflow flag likewise, but the overflow flag is also
+;; set if the (plus:DWIH (ltu:DWIH ...) (match_dup 2)) overflows.
+(define_insn "sub<mode>3_carry_ccgz"
+ [(set (reg:CCGZ FLAGS_REG)
+ (unspec:CCGZ [(match_operand:DWIH 1 "register_operand" "0")
+ (match_operand:DWIH 2 "x86_64_general_operand" "rme")
+ (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))]
+ UNSPEC_SBB))
+ (clobber (match_scratch:DWIH 0 "=r"))]
+ ""
+ "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "<MODE>")])
+
(define_insn "subborrow<mode>"
[(set (reg:CCC FLAGS_REG)
(compare:CCC
- (match_operand:SWI48 1 "nonimmediate_operand" "0")
- (plus:SWI48
- (match_operator:SWI48 4 "ix86_carry_flag_operator"
- [(match_operand 3 "flags_reg_operand") (const_int 0)])
- (match_operand:SWI48 2 "nonimmediate_operand" "rm"))))
+ (zero_extend:<DWI>
+ (match_operand:SWI48 1 "nonimmediate_operand" "0"))
+ (plus:<DWI>
+ (match_operator:<DWI> 4 "ix86_carry_flag_operator"
+ [(match_operand 3 "flags_reg_operand") (const_int 0)])
+ (zero_extend:<DWI>
+ (match_operand:SWI48 2 "nonimmediate_operand" "rm")))))
(set (match_operand:SWI48 0 "register_operand" "=r")
- (minus:SWI48 (minus:SWI48 (match_dup 1)
- (match_op_dup 4
- [(match_dup 3) (const_int 0)]))
+ (minus:SWI48 (minus:SWI48
+ (match_dup 1)
+ (match_operator:SWI48 5 "ix86_carry_flag_operator"
+ [(match_dup 3) (const_int 0)]))
(match_dup 2)))]
"ix86_binary_operator_ok (MINUS, <MODE>mode, operands)"
"sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
@@ -6889,6 +6991,16 @@
(set_attr "use_carry" "1")
(set_attr "pent_pair" "pu")
(set_attr "mode" "<MODE>")])
+
+(define_expand "subborrow<mode>_0"
+ [(parallel
+ [(set (reg:CC FLAGS_REG)
+ (compare:CC
+ (match_operand:SWI48 1 "nonimmediate_operand")
+ (match_operand:SWI48 2 "<general_operand>")))
+ (set (match_operand:SWI48 0 "register_operand")
+ (minus:SWI48 (match_dup 1) (match_dup 2)))])]
+ "ix86_binary_operator_ok (MINUS, <MODE>mode, operands)")
;; Overflow setting add instructions
@@ -12286,6 +12398,34 @@
ix86_expand_clear (operands[3]);
})
+(define_peephole2
+ [(set (reg FLAGS_REG) (match_operand 0))
+ (parallel [(set (reg FLAGS_REG) (match_operand 1))
+ (match_operand 5)])
+ (set (match_operand:QI 2 "register_operand")
+ (match_operator:QI 3 "ix86_comparison_operator"
+ [(reg FLAGS_REG) (const_int 0)]))
+ (set (match_operand 4 "any_QIreg_operand")
+ (zero_extend (match_dup 2)))]
+ "(peep2_reg_dead_p (4, operands[2])
+ || operands_match_p (operands[2], operands[4]))
+ && ! reg_overlap_mentioned_p (operands[4], operands[0])
+ && ! reg_overlap_mentioned_p (operands[4], operands[1])
+ && ! reg_set_p (operands[4], operands[5])
+ && refers_to_regno_p (FLAGS_REG, operands[1], (rtx *)NULL)
+ && peep2_regno_dead_p (0, FLAGS_REG)"
+ [(set (match_dup 6) (match_dup 0))
+ (parallel [(set (match_dup 7) (match_dup 1))
+ (match_dup 5)])
+ (set (strict_low_part (match_dup 8))
+ (match_dup 3))]
+{
+ operands[6] = gen_rtx_REG (GET_MODE (operands[0]), FLAGS_REG);
+ operands[7] = gen_rtx_REG (GET_MODE (operands[1]), FLAGS_REG);
+ operands[8] = gen_lowpart (QImode, operands[4]);
+ ix86_expand_clear (operands[4]);
+})
+
;; Similar, but match zero extend with andsi3.
(define_peephole2
@@ -12331,6 +12471,35 @@
operands[6] = gen_lowpart (QImode, operands[3]);
ix86_expand_clear (operands[3]);
})
+
+(define_peephole2
+ [(set (reg FLAGS_REG) (match_operand 0))
+ (parallel [(set (reg FLAGS_REG) (match_operand 1))
+ (match_operand 5)])
+ (set (match_operand:QI 2 "register_operand")
+ (match_operator:QI 3 "ix86_comparison_operator"
+ [(reg FLAGS_REG) (const_int 0)]))
+ (parallel [(set (match_operand 4 "any_QIreg_operand")
+ (zero_extend (match_dup 2)))
+ (clobber (reg:CC FLAGS_REG))])]
+ "(peep2_reg_dead_p (4, operands[2])
+ || operands_match_p (operands[2], operands[4]))
+ && ! reg_overlap_mentioned_p (operands[4], operands[0])
+ && ! reg_overlap_mentioned_p (operands[4], operands[1])
+ && ! reg_set_p (operands[4], operands[5])
+ && refers_to_regno_p (FLAGS_REG, operands[1], (rtx *)NULL)
+ && peep2_regno_dead_p (0, FLAGS_REG)"
+ [(set (match_dup 6) (match_dup 0))
+ (parallel [(set (match_dup 7) (match_dup 1))
+ (match_dup 5)])
+ (set (strict_low_part (match_dup 8))
+ (match_dup 3))]
+{
+ operands[6] = gen_rtx_REG (GET_MODE (operands[0]), FLAGS_REG);
+ operands[7] = gen_rtx_REG (GET_MODE (operands[1]), FLAGS_REG);
+ operands[8] = gen_lowpart (QImode, operands[4]);
+ ix86_expand_clear (operands[4]);
+})
;; Call instructions.
@@ -18135,6 +18304,28 @@
"* return output_probe_stack_range (operands[0], operands[2]);"
[(set_attr "type" "multi")])
+/* Additional processing for builtin_setjmp.  Store the shadow stack pointer
+   as the fourth element in the jmp_buf.  */
+(define_expand "builtin_setjmp_setup"
+ [(match_operand 0 "address_operand")]
+ "TARGET_SHSTK"
+{
+ if (flag_cf_protection & CF_RETURN)
+ {
+ rtx mem, reg_ssp;
+
+ mem = gen_rtx_MEM (Pmode, plus_constant (Pmode, operands[0],
+ 3 * GET_MODE_SIZE (Pmode)));
+ reg_ssp = gen_reg_rtx (Pmode);
+ emit_insn (gen_rtx_SET (reg_ssp, const0_rtx));
+ emit_insn ((Pmode == SImode)
+ ? gen_rdsspsi (reg_ssp, reg_ssp)
+ : gen_rdsspdi (reg_ssp, reg_ssp));
+ emit_move_insn (mem, reg_ssp);
+ }
+ DONE;
+})
+
(define_expand "builtin_setjmp_receiver"
[(label_ref (match_operand 0))]
"!TARGET_64BIT && flag_pic"
@@ -18155,6 +18346,83 @@
DONE;
})
+(define_expand "builtin_longjmp"
+ [(match_operand 0 "address_operand")]
+ "TARGET_SHSTK"
+{
+ rtx fp, lab, stack;
+ rtx jump, label, reg_adj, reg_ssp, reg_minus, mem_buf, tmp, clob;
+ machine_mode sa_mode = STACK_SAVEAREA_MODE (SAVE_NONLOCAL);
+
+  /* Adjust the shadow stack pointer (ssp) to the value saved in the
+     jmp_buf.  The save was done in builtin_setjmp_setup.  */
+ if (flag_cf_protection & CF_RETURN)
+ {
+      /* Get the current shadow stack pointer.  The code below will check
+	 whether the SHSTK feature is enabled.  If it is not enabled, the
+	 RDSSP instruction is a NOP.  */
+ reg_ssp = gen_reg_rtx (Pmode);
+ emit_insn (gen_rtx_SET (reg_ssp, const0_rtx));
+ emit_insn ((Pmode == SImode)
+ ? gen_rdsspsi (reg_ssp, reg_ssp)
+ : gen_rdsspdi (reg_ssp, reg_ssp));
+      mem_buf = gen_rtx_MEM (Pmode, plus_constant (Pmode, operands[0],
+					     3 * GET_MODE_SIZE (Pmode)));
+
+      /* Compare, via subtraction, the saved and the current ssp to decide
+	 whether the ssp has to be adjusted.  */
+ reg_minus = gen_reg_rtx (Pmode);
+ tmp = gen_rtx_SET (reg_minus, gen_rtx_MINUS (Pmode, reg_ssp, mem_buf));
+ clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
+ tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, tmp, clob));
+ emit_insn (tmp);
+
+ /* Jump over adjustment code. */
+ label = gen_label_rtx ();
+ tmp = gen_rtx_REG (CCmode, FLAGS_REG);
+ tmp = gen_rtx_EQ (VOIDmode, tmp, const0_rtx);
+ tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, tmp,
+ gen_rtx_LABEL_REF (VOIDmode, label),
+ pc_rtx);
+ jump = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
+ JUMP_LABEL (jump) = label;
+
+ /* Adjust the ssp. */
+ reg_adj = gen_reg_rtx (Pmode);
+ tmp = gen_rtx_SET (reg_adj,
+ gen_rtx_LSHIFTRT (Pmode, negate_rtx (Pmode, reg_minus),
+ GEN_INT (3)));
+ clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
+ tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, tmp, clob));
+ emit_insn (tmp);
+ emit_insn ((Pmode == SImode)
+ ? gen_incsspsi (reg_adj)
+ : gen_incsspdi (reg_adj));
+
+ emit_label (label);
+ LABEL_NUSES (label) = 1;
+ }
+
+  /* This code is the same as in expand_builtin_longjmp.  */
+ fp = gen_rtx_MEM (Pmode, operands[0]);
+ lab = gen_rtx_MEM (Pmode, plus_constant (Pmode, operands[0],
+ GET_MODE_SIZE (Pmode)));
+ stack = gen_rtx_MEM (sa_mode, plus_constant (Pmode, operands[0],
+ 2 * GET_MODE_SIZE (Pmode)));
+ lab = copy_to_reg (lab);
+
+ emit_clobber (gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode)));
+ emit_clobber (gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx));
+
+ emit_move_insn (hard_frame_pointer_rtx, fp);
+ emit_stack_restore (SAVE_NONLOCAL, stack);
+
+ emit_use (hard_frame_pointer_rtx);
+ emit_use (stack_pointer_rtx);
+ emit_indirect_jump (lab);
+})
+
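
In the builtin_longjmp expansion above, the shadow-stack adjustment amounts to popping (saved_ssp - current_ssp) / 8 entries with INCSSP; the GEN_INT (3) shift assumes 8-byte shadow-stack slots.  A stand-alone C sketch of that arithmetic, with illustrative addresses only:

    #include <stdint.h>
    #include <stdio.h>

    /* How many INCSSP steps the expander requests when unwinding from
       current_ssp back to the ssp value saved in the jmp_buf.  */
    static uintptr_t
    ssp_pop_count (uintptr_t current_ssp, uintptr_t saved_ssp)
    {
      uintptr_t diff = current_ssp - saved_ssp;   /* reg_minus in the expander */
      if (diff == 0)
        return 0;                                 /* branch over the adjustment */
      return (0 - diff) >> 3;                     /* (-reg_minus) >> 3: 8-byte entries */
    }

    int
    main (void)
    {
      /* Unwinding across 5 call frames: the saved ssp is 40 bytes higher.  */
      printf ("%ju\n", (uintmax_t) ssp_pop_count (0x7fff0000, 0x7fff0028));
      return 0;
    }
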
+
;; Avoid redundant prefixes by splitting HImode arithmetic to SImode.
;; Do not split instructions with mask registers.
(define_split
@@ -18814,7 +19082,7 @@
(clobber (mem:BLK (scratch)))])]
"(TARGET_SINGLE_PUSH || optimize_insn_for_size_p ())
&& INTVAL (operands[0]) == -GET_MODE_SIZE (word_mode)
- && !ix86_using_red_zone ()"
+ && ix86_red_zone_size == 0"
[(clobber (match_dup 1))
(parallel [(set (mem:W (pre_dec:P (reg:P SP_REG))) (match_dup 1))
(clobber (mem:BLK (scratch)))])])
@@ -18828,7 +19096,7 @@
(clobber (mem:BLK (scratch)))])]
"(TARGET_DOUBLE_PUSH || optimize_insn_for_size_p ())
&& INTVAL (operands[0]) == -2*GET_MODE_SIZE (word_mode)
- && !ix86_using_red_zone ()"
+ && ix86_red_zone_size == 0"
[(clobber (match_dup 1))
(set (mem:W (pre_dec:P (reg:P SP_REG))) (match_dup 1))
(parallel [(set (mem:W (pre_dec:P (reg:P SP_REG))) (match_dup 1))
@@ -18843,7 +19111,7 @@
(clobber (reg:CC FLAGS_REG))])]
"(TARGET_SINGLE_PUSH || optimize_insn_for_size_p ())
&& INTVAL (operands[0]) == -GET_MODE_SIZE (word_mode)
- && !ix86_using_red_zone ()"
+ && ix86_red_zone_size == 0"
[(clobber (match_dup 1))
(set (mem:W (pre_dec:P (reg:P SP_REG))) (match_dup 1))])
@@ -18855,7 +19123,7 @@
(clobber (reg:CC FLAGS_REG))])]
"(TARGET_DOUBLE_PUSH || optimize_insn_for_size_p ())
&& INTVAL (operands[0]) == -2*GET_MODE_SIZE (word_mode)
- && !ix86_using_red_zone ()"
+ && ix86_red_zone_size == 0"
[(clobber (match_dup 1))
(set (mem:W (pre_dec:P (reg:P SP_REG))) (match_dup 1))
(set (mem:W (pre_dec:P (reg:P SP_REG))) (match_dup 1))])
@@ -19775,6 +20043,83 @@
[(set_attr "length" "2")
(set_attr "memory" "unknown")])
+;; CET instructions
+(define_insn "rdssp<mode>"
+ [(set (match_operand:SWI48x 0 "register_operand" "=r")
+ (unspec_volatile:SWI48x
+ [(match_operand:SWI48x 1 "register_operand" "0")]
+ UNSPECV_NOP_RDSSP))]
+ "TARGET_SHSTK"
+ "rdssp<mskmodesuffix>\t%0"
+ [(set_attr "length" "4")
+ (set_attr "type" "other")])
+
+(define_insn "incssp<mode>"
+ [(unspec_volatile [(match_operand:SWI48x 0 "register_operand" "r")]
+ UNSPECV_INCSSP)]
+ "TARGET_SHSTK"
+ "incssp<mskmodesuffix>\t%0"
+ [(set_attr "length" "4")
+ (set_attr "type" "other")])
+
+(define_insn "saveprevssp"
+ [(unspec_volatile [(const_int 0)] UNSPECV_SAVEPREVSSP)]
+ "TARGET_SHSTK"
+ "saveprevssp"
+ [(set_attr "length" "5")
+ (set_attr "type" "other")])
+
+(define_insn "rstorssp"
+ [(unspec_volatile [(match_operand 0 "memory_operand" "m")]
+ UNSPECV_RSTORSSP)]
+ "TARGET_SHSTK"
+ "rstorssp\t%0"
+ [(set_attr "length" "5")
+ (set_attr "type" "other")])
+
+(define_insn "wrss<mode>"
+ [(unspec_volatile [(match_operand:SWI48x 0 "register_operand" "r")
+ (match_operand:SWI48x 1 "memory_operand" "m")]
+ UNSPECV_WRSS)]
+ "TARGET_SHSTK"
+ "wrss<mskmodesuffix>\t%0, %1"
+ [(set_attr "length" "3")
+ (set_attr "type" "other")])
+
+(define_insn "wruss<mode>"
+ [(unspec_volatile [(match_operand:SWI48x 0 "register_operand" "r")
+ (match_operand:SWI48x 1 "memory_operand" "m")]
+ UNSPECV_WRUSS)]
+ "TARGET_SHSTK"
+ "wruss<mskmodesuffix>\t%0, %1"
+ [(set_attr "length" "4")
+ (set_attr "type" "other")])
+
+(define_insn "setssbsy"
+ [(unspec_volatile [(const_int 0)] UNSPECV_SETSSBSY)]
+ "TARGET_SHSTK"
+ "setssbsy"
+ [(set_attr "length" "4")
+ (set_attr "type" "other")])
+
+(define_insn "clrssbsy"
+ [(unspec_volatile [(match_operand 0 "memory_operand" "m")]
+ UNSPECV_CLRSSBSY)]
+ "TARGET_SHSTK"
+ "clrssbsy\t%0"
+ [(set_attr "length" "4")
+ (set_attr "type" "other")])
+
+(define_insn "nop_endbr"
+ [(unspec_volatile [(const_int 0)] UNSPECV_NOP_ENDBR)]
+ "TARGET_IBT"
+ "*
+{ return (TARGET_64BIT)? \"endbr64\" : \"endbr32\"; }"
+ [(set_attr "length" "4")
+ (set_attr "length_immediate" "0")
+ (set_attr "modrm" "0")])
+
+;; For RTM support
(define_expand "xbegin"
[(set (match_operand:SI 0 "register_operand")
(unspec_volatile:SI [(const_int 0)] UNSPECV_XBEGIN))]
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 9064bf09eb5..7c9dd471686 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -753,6 +753,10 @@ mrdpid
Target Report Mask(ISA_RDPID) Var(ix86_isa_flags2) Save
Support RDPID built-in functions and code generation.
+mgfni
+Target Report Mask(ISA_GFNI) Var(ix86_isa_flags2) Save
+Support GFNI built-in functions and code generation.
+
mbmi
Target Report Mask(ISA_BMI) Var(ix86_isa_flags) Save
Support BMI built-in functions and code generation.
@@ -953,3 +957,23 @@ Attempt to avoid generating instruction sequences containing ret bytes.
mgeneral-regs-only
Target Report RejectNegative Mask(GENERAL_REGS_ONLY) Var(ix86_target_flags) Save
Generate code which uses only the general registers.
+
+mcet
+Target Report Var(flag_cet) Init(0)
+Support Control-flow Enforcement Technology (CET) built-in functions
+and code generation.
+
+mibt
+Target Report Mask(ISA_IBT) Var(ix86_isa_flags2) Save
+Specifically enable the indirect branch tracking feature of Control-flow
+Enforcement Technology (CET).
+
+mshstk
+Target Report Mask(ISA_SHSTK) Var(ix86_isa_flags2) Save
+Specifically enable the shadow stack support feature of Control-flow
+Enforcement Technology (CET).
+
+mcet-switch
+Target Report Undocumented Var(flag_cet_switch) Init(0)
+Turn on CET instrumentation for switch statements that use a jump table
+and an indirect jump.
diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
index b52f58efa40..365d2db7dd0 100644
--- a/gcc/config/i386/immintrin.h
+++ b/gcc/config/i386/immintrin.h
@@ -90,6 +90,10 @@
#include <xtestintrin.h>
+#include <cetintrin.h>
+
+#include <gfniintrin.h>
+
#ifndef __RDRND__
#pragma GCC push_options
#pragma GCC target("rdrnd")
diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h
index 6380639b204..6613807180e 100644
--- a/gcc/config/i386/linux-common.h
+++ b/gcc/config/i386/linux-common.h
@@ -121,3 +121,8 @@ along with GCC; see the file COPYING3. If not see
#define CHKP_SPEC "\
%{!nostdlib:%{!nodefaultlibs:" LIBMPX_SPEC LIBMPXWRAPPERS_SPEC "}}" MPX_SPEC
#endif
+
+extern void file_end_indicate_exec_stack_and_cet (void);
+
+#undef TARGET_ASM_FILE_END
+#define TARGET_ASM_FILE_END file_end_indicate_exec_stack_and_cet
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 0917fad15d4..c3f442eb8ac 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -366,6 +366,31 @@
}
})
+;; Return true if VALUE is a constant integer whose value is an
+;; x86_64_immediate_operand value zero extended from word mode to double-word mode.
+(define_predicate "x86_64_dwzext_immediate_operand"
+ (match_code "const_int,const_wide_int")
+{
+ switch (GET_CODE (op))
+ {
+ case CONST_INT:
+ if (!TARGET_64BIT)
+ return UINTVAL (op) <= HOST_WIDE_INT_UC (0xffffffff);
+ return UINTVAL (op) <= HOST_WIDE_INT_UC (0x7fffffff);
+
+ case CONST_WIDE_INT:
+ if (!TARGET_64BIT)
+ return false;
+ return (CONST_WIDE_INT_NUNITS (op) == 2
+ && CONST_WIDE_INT_ELT (op, 1) == 0
+ && (trunc_int_for_mode (CONST_WIDE_INT_ELT (op, 0), SImode)
+ == (HOST_WIDE_INT) CONST_WIDE_INT_ELT (op, 0)));
+
+ default:
+ gcc_unreachable ();
+ }
+})
+
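
The new x86_64_dwzext_immediate_operand predicate accepts double-word constants whose high word is zero and whose low word is usable as a sign-extended 32-bit immediate.  A rough stand-alone C model of that check, with the constant given as explicit low/high words (an approximation for illustration, not the GCC code):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* High double-word half must be zero and the low half must be
       representable as a sign-extended 32-bit immediate.  */
    static bool
    dwzext_immediate_ok (uint64_t low, uint64_t high)
    {
      return high == 0 && (uint64_t) (int64_t) (int32_t) low == low;
    }

    int
    main (void)
    {
      printf ("%d\n", dwzext_immediate_ok (42, 0));            /* 1 */
      printf ("%d\n", dwzext_immediate_ok (0x80000000u, 0));   /* 0: would sign-extend */
      return 0;
    }
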
;; Return true if size of VALUE can be stored in a sign
;; extended immediate field.
(define_predicate "x86_64_immediate_size_operand"
@@ -979,9 +1004,9 @@
(match_code "mem")
{
unsigned n_elts;
- op = maybe_get_pool_constant (op);
+ op = avoid_constant_pool_reference (op);
- if (!(op && GET_CODE (op) == CONST_VECTOR))
+ if (GET_CODE (op) != CONST_VECTOR)
return false;
n_elts = CONST_VECTOR_NUNITS (op);
@@ -1276,7 +1301,7 @@
machine_mode inmode = GET_MODE (XEXP (op, 0));
enum rtx_code code = GET_CODE (op);
- if (inmode == CCFPmode || inmode == CCFPUmode)
+ if (inmode == CCFPmode)
{
if (!ix86_trivial_fp_comparison_operator (op, mode))
return false;
@@ -1286,8 +1311,7 @@
switch (code)
{
case LTU: case GTU: case LEU: case GEU:
- if (inmode == CCmode || inmode == CCFPmode || inmode == CCFPUmode
- || inmode == CCCmode)
+ if (inmode == CCmode || inmode == CCFPmode || inmode == CCCmode)
return true;
return false;
case ORDERED: case UNORDERED:
@@ -1323,20 +1347,26 @@
machine_mode inmode = GET_MODE (XEXP (op, 0));
enum rtx_code code = GET_CODE (op);
- if (inmode == CCFPmode || inmode == CCFPUmode)
+ if (inmode == CCFPmode)
return ix86_trivial_fp_comparison_operator (op, mode);
switch (code)
{
case EQ: case NE:
+ if (inmode == CCGZmode)
+ return false;
return true;
- case LT: case GE:
+ case GE: case LT:
if (inmode == CCmode || inmode == CCGCmode
- || inmode == CCGOCmode || inmode == CCNOmode)
+ || inmode == CCGOCmode || inmode == CCNOmode || inmode == CCGZmode)
return true;
return false;
- case LTU: case GTU: case LEU: case GEU:
- if (inmode == CCmode || inmode == CCCmode)
+ case GEU: case LTU:
+ if (inmode == CCGZmode)
+ return true;
+ /* FALLTHRU */
+ case GTU: case LEU:
+ if (inmode == CCmode || inmode == CCCmode || inmode == CCGZmode)
return true;
return false;
case ORDERED: case UNORDERED:
@@ -1360,7 +1390,7 @@
machine_mode inmode = GET_MODE (XEXP (op, 0));
enum rtx_code code = GET_CODE (op);
- if (inmode == CCFPmode || inmode == CCFPUmode)
+ if (inmode == CCFPmode)
{
if (!ix86_trivial_fp_comparison_operator (op, mode))
return false;
diff --git a/gcc/config/i386/sol2.h b/gcc/config/i386/sol2.h
index 61733603fa2..05e5e1a4949 100644
--- a/gcc/config/i386/sol2.h
+++ b/gcc/config/i386/sol2.h
@@ -65,8 +65,16 @@ along with GCC; see the file COPYING3. If not see
#define ASM_CPU64_DEFAULT_SPEC "-xarch=generic64"
#endif
+/* Since Studio 12.6, as needs -xbrace_comment=no so its AVX512 syntax is
+ fully compatible with gas. */
+#ifdef HAVE_AS_XBRACE_COMMENT_OPTION
+#define ASM_XBRACE_COMMENT_SPEC "-xbrace_comment=no"
+#else
+#define ASM_XBRACE_COMMENT_SPEC ""
+#endif
+
#undef ASM_CPU_SPEC
-#define ASM_CPU_SPEC "%(asm_cpu_default)"
+#define ASM_CPU_SPEC "%(asm_cpu_default) " ASM_XBRACE_COMMENT_SPEC
/* Don't include ASM_PIC_SPEC. While the Solaris 10+ assembler accepts -K PIC,
it gives many warnings:
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index d5e2ec00237..4dfb2f8d3b3 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -83,9 +83,7 @@
UNSPEC_VSIBADDR
;; For AVX512F support
- UNSPEC_VPERMI2
UNSPEC_VPERMT2
- UNSPEC_VPERMI2_MASK
UNSPEC_UNSIGNED_FIX_NOTRUNC
UNSPEC_UNSIGNED_PCMP
UNSPEC_TESTM
@@ -157,6 +155,9 @@
UNSPEC_VP4FNMADD
UNSPEC_VP4DPWSSD
UNSPEC_VP4DPWSSDS
+
+ ;; For GFNI support
+ UNSPEC_GF2P8AFFINEINV
])
(define_c_enum "unspecv" [
@@ -325,6 +326,9 @@
(define_mode_iterator VI1_AVX512
[(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX2") V16QI])
+(define_mode_iterator VI1_AVX512F
+ [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI])
+
(define_mode_iterator VI2_AVX2
[(V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI])
@@ -371,10 +375,17 @@
[V16SF V16SI])
;; ??? We should probably use TImode instead.
-(define_mode_iterator VIMAX_AVX2
+(define_mode_iterator VIMAX_AVX2_AVX512BW
[(V4TI "TARGET_AVX512BW") (V2TI "TARGET_AVX2") V1TI])
-;; ??? This should probably be dropped in favor of VIMAX_AVX2.
+;; Assume TARGET_AVX512BW as the baseline.
+(define_mode_iterator VIMAX_AVX512VL
+ [V4TI (V2TI "TARGET_AVX512VL") (V1TI "TARGET_AVX512VL")])
+
+(define_mode_iterator VIMAX_AVX2
+ [(V2TI "TARGET_AVX2") V1TI])
+
+;; ??? This should probably be dropped in favor of VIMAX_AVX2_AVX512BW.
(define_mode_iterator SSESCALARMODE
[(V4TI "TARGET_AVX512BW") (V2TI "TARGET_AVX2") TI])
@@ -403,11 +414,19 @@
[(V8SI "TARGET_AVX2") V4SI
(V4DI "TARGET_AVX2") V2DI])
+(define_mode_iterator VI248_AVX2
+ [(V16HI "TARGET_AVX2") V8HI
+ (V8SI "TARGET_AVX2") V4SI
+ (V4DI "TARGET_AVX2") V2DI])
+
(define_mode_iterator VI248_AVX2_8_AVX512F_24_AVX512BW
[(V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI
(V16SI "TARGET_AVX512BW") (V8SI "TARGET_AVX2") V4SI
(V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX2") V2DI])
+(define_mode_iterator VI248_AVX512BW
+ [(V32HI "TARGET_AVX512BW") V16SI V8DI])
+
(define_mode_iterator VI248_AVX512BW_AVX512VL
[(V32HI "TARGET_AVX512BW")
(V4DI "TARGET_AVX512VL") V16SI V8DI])
@@ -418,6 +437,11 @@
V8SI V4SI
V2DI])
+(define_mode_iterator VI248_AVX512BW_2
+ [(V16HI "TARGET_AVX512BW") (V8HI "TARGET_AVX512BW")
+ V8SI V4SI
+ V4DI V2DI])
+
(define_mode_iterator VI48_AVX512F
[(V16SI "TARGET_AVX512F") V8SI V4SI
(V8DI "TARGET_AVX512F") V4DI V2DI])
@@ -2522,7 +2546,7 @@
(set_attr "prefix" "evex")
(set_attr "mode" "<MODE>")])
-(define_insn "reduces<mode>"
+(define_insn "reduces<mode><mask_scalar_name>"
[(set (match_operand:VF_128 0 "register_operand" "=v")
(vec_merge:VF_128
(unspec:VF_128
@@ -2533,7 +2557,7 @@
(match_dup 1)
(const_int 1)))]
"TARGET_AVX512DQ"
- "vreduce<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+ "vreduce<ssescalarmodesuffix>\t{%3, %2, %1, %0<mask_scalar_operand4>|%0<mask_scalar_operand4>, %1, %2, %3}"
[(set_attr "type" "sse")
(set_attr "prefix" "evex")
(set_attr "mode" "<MODE>")])
@@ -2737,7 +2761,7 @@
(set_attr "prefix" "evex")
(set_attr "mode" "<sseinsnmode>")])
-(define_insn "<sse>_comi<round_saeonly_name>"
+(define_insn "<sse>_<unord>comi<round_saeonly_name>"
[(set (reg:CCFP FLAGS_REG)
(compare:CCFP
(vec_select:MODEF
@@ -2747,27 +2771,7 @@
(match_operand:<ssevecmode> 1 "<round_saeonly_nimm_scalar_predicate>" "<round_saeonly_constraint>")
(parallel [(const_int 0)]))))]
"SSE_FLOAT_MODE_P (<MODE>mode)"
- "%vcomi<ssemodesuffix>\t{<round_saeonly_op2>%1, %0|%0, %<iptr>1<round_saeonly_op2>}"
- [(set_attr "type" "ssecomi")
- (set_attr "prefix" "maybe_vex")
- (set_attr "prefix_rep" "0")
- (set (attr "prefix_data16")
- (if_then_else (eq_attr "mode" "DF")
- (const_string "1")
- (const_string "0")))
- (set_attr "mode" "<MODE>")])
-
-(define_insn "<sse>_ucomi<round_saeonly_name>"
- [(set (reg:CCFPU FLAGS_REG)
- (compare:CCFPU
- (vec_select:MODEF
- (match_operand:<ssevecmode> 0 "register_operand" "v")
- (parallel [(const_int 0)]))
- (vec_select:MODEF
- (match_operand:<ssevecmode> 1 "<round_saeonly_nimm_scalar_predicate>" "<round_saeonly_constraint>")
- (parallel [(const_int 0)]))))]
- "SSE_FLOAT_MODE_P (<MODE>mode)"
- "%vucomi<ssemodesuffix>\t{<round_saeonly_op2>%1, %0|%0, %<iptr>1<round_saeonly_op2>}"
+ "%v<unord>comi<ssemodesuffix>\t{<round_saeonly_op2>%1, %0|%0, %<iptr>1<round_saeonly_op2>}"
[(set_attr "type" "ssecomi")
(set_attr "prefix" "maybe_vex")
(set_attr "prefix_rep" "0")
@@ -3700,8 +3704,7 @@
"@
vfmadd132<ssemodesuffix>\t{<round_op5>%2, %3, %0%{%4%}|%0%{%4%}, %3, %2<round_op5>}
vfmadd213<ssemodesuffix>\t{<round_op5>%3, %2, %0%{%4%}|%0%{%4%}, %2, %3<round_op5>}"
- [(set_attr "isa" "fma_avx512f,fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fmadd_<mode>_mask3<round_name>"
@@ -3715,8 +3718,7 @@
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F"
"vfmadd231<ssemodesuffix>\t{<round_op5>%2, %1, %0%{%4%}|%0%{%4%}, %1, %2<round_op5>}"
- [(set_attr "isa" "fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
(define_insn "*fma_fmsub_<mode>"
@@ -3766,8 +3768,7 @@
"@
vfmsub132<ssemodesuffix>\t{<round_op5>%2, %3, %0%{%4%}|%0%{%4%}, %3, %2<round_op5>}
vfmsub213<ssemodesuffix>\t{<round_op5>%3, %2, %0%{%4%}|%0%{%4%}, %2, %3<round_op5>}"
- [(set_attr "isa" "fma_avx512f,fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fmsub_<mode>_mask3<round_name>"
@@ -3782,8 +3783,7 @@
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F && <round_mode512bit_condition>"
"vfmsub231<ssemodesuffix>\t{<round_op5>%2, %1, %0%{%4%}|%0%{%4%}, %1, %2<round_op5>}"
- [(set_attr "isa" "fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
(define_insn "*fma_fnmadd_<mode>"
@@ -3833,8 +3833,7 @@
"@
vfnmadd132<ssemodesuffix>\t{<round_op5>%2, %3, %0%{%4%}|%0%{%4%}, %3, %2<round_op5>}
vfnmadd213<ssemodesuffix>\t{<round_op5>%3, %2, %0%{%4%}|%0%{%4%}, %2, %3<round_op5>}"
- [(set_attr "isa" "fma_avx512f,fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fnmadd_<mode>_mask3<round_name>"
@@ -3849,8 +3848,7 @@
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F && <round_mode512bit_condition>"
"vfnmadd231<ssemodesuffix>\t{<round_op5>%2, %1, %0%{%4%}|%0%{%4%}, %1, %2<round_op5>}"
- [(set_attr "isa" "fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
(define_insn "*fma_fnmsub_<mode>"
@@ -3903,8 +3901,7 @@
"@
vfnmsub132<ssemodesuffix>\t{<round_op5>%2, %3, %0%{%4%}|%0%{%4%}, %3, %2<round_op5>}
vfnmsub213<ssemodesuffix>\t{<round_op5>%3, %2, %0%{%4%}|%0%{%4%}, %2, %3<round_op5>}"
- [(set_attr "isa" "fma_avx512f,fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fnmsub_<mode>_mask3<round_name>"
@@ -3920,8 +3917,7 @@
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F"
"vfnmsub231<ssemodesuffix>\t{<round_op5>%2, %1, %0%{%4%}|%0%{%4%}, %1, %2<round_op5>}"
- [(set_attr "isa" "fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
;; FMA parallel floating point multiply addsub and subadd operations.
@@ -4005,8 +4001,7 @@
"@
vfmaddsub132<ssemodesuffix>\t{<round_op5>%2, %3, %0%{%4%}|%0%{%4%}, %3, %2<round_op5>}
vfmaddsub213<ssemodesuffix>\t{<round_op5>%3, %2, %0%{%4%}|%0%{%4%}, %2, %3<round_op5>}"
- [(set_attr "isa" "fma_avx512f,fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fmaddsub_<mode>_mask3<round_name>"
@@ -4021,8 +4016,7 @@
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F"
"vfmaddsub231<ssemodesuffix>\t{<round_op5>%2, %1, %0%{%4%}|%0%{%4%}, %1, %2<round_op5>}"
- [(set_attr "isa" "fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
(define_insn "*fma_fmsubadd_<mode>"
@@ -4075,8 +4069,7 @@
"@
vfmsubadd132<ssemodesuffix>\t{<round_op5>%2, %3, %0%{%4%}|%0%{%4%}, %3, %2<round_op5>}
vfmsubadd213<ssemodesuffix>\t{<round_op5>%3, %2, %0%{%4%}|%0%{%4%}, %2, %3<round_op5>}"
- [(set_attr "isa" "fma_avx512f,fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
(define_insn "<avx512>_fmsubadd_<mode>_mask3<round_name>"
@@ -4092,8 +4085,7 @@
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F"
"vfmsubadd231<ssemodesuffix>\t{<round_op5>%2, %1, %0%{%4%}|%0%{%4%}, %1, %2<round_op5>}"
- [(set_attr "isa" "fma_avx512f")
- (set_attr "type" "ssemuladd")
+ [(set_attr "type" "ssemuladd")
(set_attr "mode" "<MODE>")])
;; FMA3 floating point scalar intrinsics. These merge result with
@@ -10168,8 +10160,7 @@
(const_int 12) (const_int 14)])))))]
"TARGET_AVX512F && ix86_binary_operator_ok (MULT, V16SImode, operands)"
"vpmuludq\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
- [(set_attr "isa" "avx512f")
- (set_attr "type" "sseimul")
+ [(set_attr "type" "sseimul")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "evex")
(set_attr "mode" "XI")])
@@ -10285,8 +10276,7 @@
(const_int 12) (const_int 14)])))))]
"TARGET_AVX512F && ix86_binary_operator_ok (MULT, V16SImode, operands)"
"vpmuldq\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
- [(set_attr "isa" "avx512f")
- (set_attr "type" "sseimul")
+ [(set_attr "type" "sseimul")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "evex")
(set_attr "mode" "XI")])
@@ -10731,65 +10721,57 @@
(const_string "0")))
(set_attr "mode" "<sseinsnmode>")])
-(define_insn "<shift_insn><mode>3<mask_name>"
- [(set (match_operand:VI2_AVX2_AVX512BW 0 "register_operand" "=x,v")
- (any_lshift:VI2_AVX2_AVX512BW
- (match_operand:VI2_AVX2_AVX512BW 1 "register_operand" "0,v")
- (match_operand:DI 2 "nonmemory_operand" "xN,vN")))]
- "TARGET_SSE2 && <mask_mode512bit_condition> && <mask_avx512bw_condition>"
- "@
- p<vshift><ssemodesuffix>\t{%2, %0|%0, %2}
- vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
- [(set_attr "isa" "noavx,avx")
- (set_attr "type" "sseishft")
+(define_insn "<mask_codefor><shift_insn><mode>3<mask_name>"
+ [(set (match_operand:VI248_AVX512BW_2 0 "register_operand" "=v,v")
+ (any_lshift:VI248_AVX512BW_2
+ (match_operand:VI248_AVX512BW_2 1 "nonimmediate_operand" "v,vm")
+ (match_operand:DI 2 "nonmemory_operand" "v,N")))]
+ "TARGET_AVX512VL"
+ "vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
+ [(set_attr "type" "sseishft")
(set (attr "length_immediate")
(if_then_else (match_operand 2 "const_int_operand")
(const_string "1")
(const_string "0")))
- (set_attr "prefix_data16" "1,*")
- (set_attr "prefix" "orig,vex")
(set_attr "mode" "<sseinsnmode>")])
-(define_insn "<shift_insn><mode>3<mask_name>"
- [(set (match_operand:VI48_AVX2 0 "register_operand" "=x,x,v")
- (any_lshift:VI48_AVX2
- (match_operand:VI48_AVX2 1 "register_operand" "0,x,v")
- (match_operand:DI 2 "nonmemory_operand" "xN,xN,vN")))]
- "TARGET_SSE2 && <mask_mode512bit_condition>"
+(define_insn "<shift_insn><mode>3"
+ [(set (match_operand:VI248_AVX2 0 "register_operand" "=x,x")
+ (any_lshift:VI248_AVX2
+ (match_operand:VI248_AVX2 1 "register_operand" "0,x")
+ (match_operand:DI 2 "nonmemory_operand" "xN,xN")))]
+ "TARGET_SSE2"
"@
p<vshift><ssemodesuffix>\t{%2, %0|%0, %2}
- vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}
- vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
- [(set_attr "isa" "noavx,avx,avx512bw")
+ vp<vshift><ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
+ [(set_attr "isa" "noavx,avx")
(set_attr "type" "sseishft")
(set (attr "length_immediate")
(if_then_else (match_operand 2 "const_int_operand")
(const_string "1")
(const_string "0")))
- (set_attr "prefix_data16" "1,*,*")
- (set_attr "prefix" "orig,vex,evex")
+ (set_attr "prefix_data16" "1,*")
+ (set_attr "prefix" "orig,vex")
(set_attr "mode" "<sseinsnmode>")])
(define_insn "<shift_insn><mode>3<mask_name>"
- [(set (match_operand:VI48_512 0 "register_operand" "=v,v")
- (any_lshift:VI48_512
- (match_operand:VI48_512 1 "nonimmediate_operand" "v,m")
+ [(set (match_operand:VI248_AVX512BW 0 "register_operand" "=v,v")
+ (any_lshift:VI248_AVX512BW
+ (match_operand:VI248_AVX512BW 1 "nonimmediate_operand" "v,m")
(match_operand:DI 2 "nonmemory_operand" "vN,N")))]
- "TARGET_AVX512F && <mask_mode512bit_condition>"
+ "TARGET_AVX512F"
"vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
- [(set_attr "isa" "avx512f")
- (set_attr "type" "sseishft")
+ [(set_attr "type" "sseishft")
(set (attr "length_immediate")
(if_then_else (match_operand 2 "const_int_operand")
(const_string "1")
(const_string "0")))
- (set_attr "prefix" "evex")
(set_attr "mode" "<sseinsnmode>")])
-(define_expand "vec_shl_<mode>"
+(define_expand "vec_shr_<mode>"
[(set (match_dup 3)
- (ashift:V1TI
+ (lshiftrt:V1TI
(match_operand:VI_128 1 "register_operand")
(match_operand:SI 2 "const_0_to_255_mul_8_operand")))
(set (match_operand:VI_128 0 "register_operand") (match_dup 4))]
@@ -10800,48 +10782,24 @@
operands[4] = gen_lowpart (<MODE>mode, operands[3]);
})
-(define_insn "<sse2_avx2>_ashl<mode>3"
- [(set (match_operand:VIMAX_AVX2 0 "register_operand" "=x,v")
- (ashift:VIMAX_AVX2
- (match_operand:VIMAX_AVX2 1 "register_operand" "0,v")
- (match_operand:SI 2 "const_0_to_255_mul_8_operand" "n,n")))]
- "TARGET_SSE2"
+(define_insn "avx512bw_<shift_insn><mode>3"
+ [(set (match_operand:VIMAX_AVX512VL 0 "register_operand" "=v")
+ (any_lshift:VIMAX_AVX512VL
+ (match_operand:VIMAX_AVX512VL 1 "nonimmediate_operand" "vm")
+ (match_operand:SI 2 "const_0_to_255_mul_8_operand" "n")))]
+ "TARGET_AVX512BW"
{
operands[2] = GEN_INT (INTVAL (operands[2]) / 8);
-
- switch (which_alternative)
- {
- case 0:
- return "pslldq\t{%2, %0|%0, %2}";
- case 1:
- return "vpslldq\t{%2, %1, %0|%0, %1, %2}";
- default:
- gcc_unreachable ();
- }
+ return "vp<vshift>dq\t{%2, %1, %0|%0, %1, %2}";
}
- [(set_attr "isa" "noavx,avx")
- (set_attr "type" "sseishft")
+ [(set_attr "type" "sseishft")
(set_attr "length_immediate" "1")
- (set_attr "prefix_data16" "1,*")
- (set_attr "prefix" "orig,vex")
+ (set_attr "prefix" "maybe_evex")
(set_attr "mode" "<sseinsnmode>")])
-(define_expand "vec_shr_<mode>"
- [(set (match_dup 3)
- (lshiftrt:V1TI
- (match_operand:VI_128 1 "register_operand")
- (match_operand:SI 2 "const_0_to_255_mul_8_operand")))
- (set (match_operand:VI_128 0 "register_operand") (match_dup 4))]
- "TARGET_SSE2"
-{
- operands[1] = gen_lowpart (V1TImode, operands[1]);
- operands[3] = gen_reg_rtx (V1TImode);
- operands[4] = gen_lowpart (<MODE>mode, operands[3]);
-})
-
-(define_insn "<sse2_avx2>_lshr<mode>3"
+(define_insn "<sse2_avx2>_<shift_insn><mode>3"
[(set (match_operand:VIMAX_AVX2 0 "register_operand" "=x,v")
- (lshiftrt:VIMAX_AVX2
+ (any_lshift:VIMAX_AVX2
(match_operand:VIMAX_AVX2 1 "register_operand" "0,v")
(match_operand:SI 2 "const_0_to_255_mul_8_operand" "n,n")))]
"TARGET_SSE2"
@@ -10851,9 +10809,9 @@
switch (which_alternative)
{
case 0:
- return "psrldq\t{%2, %0|%0, %2}";
+ return "p<vshift>dq\t{%2, %0|%0, %2}";
case 1:
- return "vpsrldq\t{%2, %1, %0|%0, %1, %2}";
+ return "vp<vshift>dq\t{%2, %1, %0|%0, %1, %2}";
default:
gcc_unreachable ();
}
@@ -11562,10 +11520,10 @@
"TARGET_AVX512BW")
(define_insn "*andnot<mode>3"
- [(set (match_operand:VI 0 "register_operand" "=x,v")
+ [(set (match_operand:VI 0 "register_operand" "=x,x,v")
(and:VI
- (not:VI (match_operand:VI 1 "register_operand" "0,v"))
- (match_operand:VI 2 "vector_operand" "xBm,vm")))]
+ (not:VI (match_operand:VI 1 "register_operand" "0,x,v"))
+ (match_operand:VI 2 "vector_operand" "xBm,xm,vm")))]
"TARGET_SSE"
{
static char buf[64];
@@ -11600,10 +11558,11 @@
case E_V4DImode:
case E_V4SImode:
case E_V2DImode:
- ssesuffix = TARGET_AVX512VL ? "<ssemodesuffix>" : "";
+ ssesuffix = (TARGET_AVX512VL && which_alternative == 2
+ ? "<ssemodesuffix>" : "");
break;
default:
- ssesuffix = TARGET_AVX512VL ? "q" : "";
+ ssesuffix = TARGET_AVX512VL && which_alternative == 2 ? "q" : "";
}
break;
@@ -11629,6 +11588,7 @@
ops = "%s%s\t{%%2, %%0|%%0, %%2}";
break;
case 1:
+ case 2:
ops = "v%s%s\t{%%2, %%1, %%0|%%0, %%1, %%2}";
break;
default:
@@ -11638,7 +11598,7 @@
snprintf (buf, sizeof (buf), ops, tmp, ssesuffix);
return buf;
}
- [(set_attr "isa" "noavx,avx")
+ [(set_attr "isa" "noavx,avx,avx")
(set_attr "type" "sselog")
(set (attr "prefix_data16")
(if_then_else
@@ -11646,7 +11606,7 @@
(eq_attr "mode" "TI"))
(const_string "1")
(const_string "*")))
- (set_attr "prefix" "orig,vex")
+ (set_attr "prefix" "orig,vex,evex")
(set (attr "mode")
(cond [(and (match_test "<MODE_SIZE> == 16")
(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
@@ -11691,10 +11651,10 @@
})
(define_insn "<mask_codefor><code><mode>3<mask_name>"
- [(set (match_operand:VI48_AVX_AVX512F 0 "register_operand" "=x,v")
+ [(set (match_operand:VI48_AVX_AVX512F 0 "register_operand" "=x,x,v")
(any_logic:VI48_AVX_AVX512F
- (match_operand:VI48_AVX_AVX512F 1 "vector_operand" "%0,v")
- (match_operand:VI48_AVX_AVX512F 2 "vector_operand" "xBm,vm")))]
+ (match_operand:VI48_AVX_AVX512F 1 "vector_operand" "%0,x,v")
+ (match_operand:VI48_AVX_AVX512F 2 "vector_operand" "xBm,xm,vm")))]
"TARGET_SSE && <mask_mode512bit_condition>
&& ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
{
@@ -11724,7 +11684,9 @@
case E_V4DImode:
case E_V4SImode:
case E_V2DImode:
- ssesuffix = TARGET_AVX512VL ? "<ssemodesuffix>" : "";
+ ssesuffix = (TARGET_AVX512VL
+ && (<mask_applied> || which_alternative == 2)
+ ? "<ssemodesuffix>" : "");
break;
default:
gcc_unreachable ();
@@ -11753,6 +11715,7 @@
ops = "%s%s\t{%%2, %%0|%%0, %%2}";
break;
case 1:
+ case 2:
ops = "v%s%s\t{%%2, %%1, %%0<mask_operand3_1>|%%0<mask_operand3_1>, %%1, %%2}";
break;
default:
@@ -11762,7 +11725,7 @@
snprintf (buf, sizeof (buf), ops, tmp, ssesuffix);
return buf;
}
- [(set_attr "isa" "noavx,avx")
+ [(set_attr "isa" "noavx,avx,avx")
(set_attr "type" "sselog")
(set (attr "prefix_data16")
(if_then_else
@@ -11770,7 +11733,7 @@
(eq_attr "mode" "TI"))
(const_string "1")
(const_string "*")))
- (set_attr "prefix" "<mask_prefix3>")
+ (set_attr "prefix" "<mask_prefix3>,evex")
(set (attr "mode")
(cond [(and (match_test "<MODE_SIZE> == 16")
(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
@@ -11789,10 +11752,10 @@
(const_string "<sseinsnmode>")))])
(define_insn "*<code><mode>3"
- [(set (match_operand:VI12_AVX_AVX512F 0 "register_operand" "=x,v")
+ [(set (match_operand:VI12_AVX_AVX512F 0 "register_operand" "=x,x,v")
(any_logic: VI12_AVX_AVX512F
- (match_operand:VI12_AVX_AVX512F 1 "vector_operand" "%0,v")
- (match_operand:VI12_AVX_AVX512F 2 "vector_operand" "xBm,vm")))]
+ (match_operand:VI12_AVX_AVX512F 1 "vector_operand" "%0,x,v")
+ (match_operand:VI12_AVX_AVX512F 2 "vector_operand" "xBm,xm,vm")))]
"TARGET_SSE && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
{
static char buf[64];
@@ -11821,7 +11784,7 @@
case E_V16HImode:
case E_V16QImode:
case E_V8HImode:
- ssesuffix = TARGET_AVX512VL ? "q" : "";
+ ssesuffix = TARGET_AVX512VL && which_alternative == 2 ? "q" : "";
break;
default:
gcc_unreachable ();
@@ -11847,6 +11810,7 @@
ops = "%s%s\t{%%2, %%0|%%0, %%2}";
break;
case 1:
+ case 2:
ops = "v%s%s\t{%%2, %%1, %%0|%%0, %%1, %%2}";
break;
default:
@@ -11856,7 +11820,7 @@
snprintf (buf, sizeof (buf), ops, tmp, ssesuffix);
return buf;
}
- [(set_attr "isa" "noavx,avx")
+ [(set_attr "isa" "noavx,avx,avx")
(set_attr "type" "sselog")
(set (attr "prefix_data16")
(if_then_else
@@ -11864,7 +11828,7 @@
(eq_attr "mode" "TI"))
(const_string "1")
(const_string "*")))
- (set_attr "prefix" "<mask_prefix3>")
+ (set_attr "prefix" "<mask_prefix3>,evex")
(set (attr "mode")
(cond [(and (match_test "<MODE_SIZE> == 16")
(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
@@ -18099,96 +18063,48 @@
(set_attr "prefix" "<mask_prefix>")
(set_attr "mode" "<sseinsnmode>")])
-(define_expand "<avx512>_vpermi2var<mode>3_maskz"
- [(match_operand:VI48F 0 "register_operand")
- (match_operand:VI48F 1 "register_operand")
- (match_operand:<sseintvecmode> 2 "register_operand")
- (match_operand:VI48F 3 "nonimmediate_operand")
- (match_operand:<avx512fmaskmode> 4 "register_operand")]
- "TARGET_AVX512F"
-{
- emit_insn (gen_<avx512>_vpermi2var<mode>3_maskz_1 (
- operands[0], operands[1], operands[2], operands[3],
- CONST0_RTX (<MODE>mode), operands[4]));
- DONE;
-})
-
-(define_expand "<avx512>_vpermi2var<mode>3_maskz"
- [(match_operand:VI1_AVX512VL 0 "register_operand")
- (match_operand:VI1_AVX512VL 1 "register_operand")
- (match_operand:<sseintvecmode> 2 "register_operand")
- (match_operand:VI1_AVX512VL 3 "nonimmediate_operand")
- (match_operand:<avx512fmaskmode> 4 "register_operand")]
- "TARGET_AVX512VBMI"
-{
- emit_insn (gen_<avx512>_vpermi2var<mode>3_maskz_1 (
- operands[0], operands[1], operands[2], operands[3],
- CONST0_RTX (<MODE>mode), operands[4]));
- DONE;
-})
-
-(define_expand "<avx512>_vpermi2var<mode>3_maskz"
- [(match_operand:VI2_AVX512VL 0 "register_operand")
- (match_operand:VI2_AVX512VL 1 "register_operand")
- (match_operand:<sseintvecmode> 2 "register_operand")
- (match_operand:VI2_AVX512VL 3 "nonimmediate_operand")
- (match_operand:<avx512fmaskmode> 4 "register_operand")]
- "TARGET_AVX512BW"
-{
- emit_insn (gen_<avx512>_vpermi2var<mode>3_maskz_1 (
- operands[0], operands[1], operands[2], operands[3],
- CONST0_RTX (<MODE>mode), operands[4]));
- DONE;
-})
-
-(define_insn "<avx512>_vpermi2var<mode>3<sd_maskz_name>"
- [(set (match_operand:VI48F 0 "register_operand" "=v")
- (unspec:VI48F
- [(match_operand:VI48F 1 "register_operand" "v")
- (match_operand:<sseintvecmode> 2 "register_operand" "0")
- (match_operand:VI48F 3 "nonimmediate_operand" "vm")]
- UNSPEC_VPERMI2))]
+(define_mode_iterator VPERMI2
+ [V16SI V16SF V8DI V8DF
+ (V8SI "TARGET_AVX512VL") (V8SF "TARGET_AVX512VL")
+ (V4DI "TARGET_AVX512VL") (V4DF "TARGET_AVX512VL")
+ (V4SI "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL")
+ (V2DI "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")
+ (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX512BW && TARGET_AVX512VL")
+ (V8HI "TARGET_AVX512BW && TARGET_AVX512VL")
+ (V64QI "TARGET_AVX512VBMI") (V32QI "TARGET_AVX512VBMI && TARGET_AVX512VL")
+ (V16QI "TARGET_AVX512VBMI && TARGET_AVX512VL")])
+
+(define_mode_iterator VPERMI2I
+ [V16SI V8DI
+ (V8SI "TARGET_AVX512VL") (V4SI "TARGET_AVX512VL")
+ (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")
+ (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX512BW && TARGET_AVX512VL")
+ (V8HI "TARGET_AVX512BW && TARGET_AVX512VL")
+ (V64QI "TARGET_AVX512VBMI") (V32QI "TARGET_AVX512VBMI && TARGET_AVX512VL")
+ (V16QI "TARGET_AVX512VBMI && TARGET_AVX512VL")])
+
+(define_expand "<avx512>_vpermi2var<mode>3_mask"
+ [(set (match_operand:VPERMI2 0 "register_operand")
+ (vec_merge:VPERMI2
+ (unspec:VPERMI2
+ [(match_operand:<sseintvecmode> 2 "register_operand")
+ (match_operand:VPERMI2 1 "register_operand")
+ (match_operand:VPERMI2 3 "nonimmediate_operand")]
+ UNSPEC_VPERMT2)
+ (match_dup 5)
+ (match_operand:<avx512fmaskmode> 4 "register_operand")))]
"TARGET_AVX512F"
- "vpermi2<ssemodesuffix>\t{%3, %1, %0<sd_mask_op4>|%0<sd_mask_op4>, %1, %3}"
- [(set_attr "type" "sselog")
- (set_attr "prefix" "evex")
- (set_attr "mode" "<sseinsnmode>")])
-
-(define_insn "<avx512>_vpermi2var<mode>3<sd_maskz_name>"
- [(set (match_operand:VI1_AVX512VL 0 "register_operand" "=v")
- (unspec:VI1_AVX512VL
- [(match_operand:VI1_AVX512VL 1 "register_operand" "v")
- (match_operand:<sseintvecmode> 2 "register_operand" "0")
- (match_operand:VI1_AVX512VL 3 "nonimmediate_operand" "vm")]
- UNSPEC_VPERMI2))]
- "TARGET_AVX512VBMI"
- "vpermi2<ssemodesuffix>\t{%3, %1, %0<sd_mask_op4>|%0<sd_mask_op4>, %1, %3}"
- [(set_attr "type" "sselog")
- (set_attr "prefix" "evex")
- (set_attr "mode" "<sseinsnmode>")])
-
-(define_insn "<avx512>_vpermi2var<mode>3<sd_maskz_name>"
- [(set (match_operand:VI2_AVX512VL 0 "register_operand" "=v")
- (unspec:VI2_AVX512VL
- [(match_operand:VI2_AVX512VL 1 "register_operand" "v")
- (match_operand:<sseintvecmode> 2 "register_operand" "0")
- (match_operand:VI2_AVX512VL 3 "nonimmediate_operand" "vm")]
- UNSPEC_VPERMI2))]
- "TARGET_AVX512BW"
- "vpermi2<ssemodesuffix>\t{%3, %1, %0<sd_mask_op4>|%0<sd_mask_op4>, %1, %3}"
- [(set_attr "type" "sselog")
- (set_attr "prefix" "evex")
- (set_attr "mode" "<sseinsnmode>")])
-
-(define_insn "<avx512>_vpermi2var<mode>3_mask"
- [(set (match_operand:VI48F 0 "register_operand" "=v")
- (vec_merge:VI48F
- (unspec:VI48F
- [(match_operand:VI48F 1 "register_operand" "v")
- (match_operand:<sseintvecmode> 2 "register_operand" "0")
- (match_operand:VI48F 3 "nonimmediate_operand" "vm")]
- UNSPEC_VPERMI2_MASK)
- (match_dup 0)
+ "operands[5] = gen_lowpart (<MODE>mode, operands[2]);")
+
+(define_insn "*<avx512>_vpermi2var<mode>3_mask"
+ [(set (match_operand:VPERMI2I 0 "register_operand" "=v")
+ (vec_merge:VPERMI2I
+ (unspec:VPERMI2I
+ [(match_operand:<sseintvecmode> 2 "register_operand" "0")
+ (match_operand:VPERMI2I 1 "register_operand" "v")
+ (match_operand:VPERMI2I 3 "nonimmediate_operand" "vm")]
+ UNSPEC_VPERMT2)
+ (match_dup 2)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
"TARGET_AVX512F"
"vpermi2<ssemodesuffix>\t{%3, %1, %0%{%4%}|%0%{%4%}, %1, %3}"
@@ -18196,43 +18112,27 @@
(set_attr "prefix" "evex")
(set_attr "mode" "<sseinsnmode>")])
-(define_insn "<avx512>_vpermi2var<mode>3_mask"
- [(set (match_operand:VI1_AVX512VL 0 "register_operand" "=v")
- (vec_merge:VI1_AVX512VL
- (unspec:VI1_AVX512VL
- [(match_operand:VI1_AVX512VL 1 "register_operand" "v")
- (match_operand:<sseintvecmode> 2 "register_operand" "0")
- (match_operand:VI1_AVX512VL 3 "nonimmediate_operand" "vm")]
- UNSPEC_VPERMI2_MASK)
- (match_dup 0)
- (match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
- "TARGET_AVX512VBMI"
- "vpermi2<ssemodesuffix>\t{%3, %1, %0%{%4%}|%0%{%4%}, %1, %3}"
- [(set_attr "type" "sselog")
- (set_attr "prefix" "evex")
- (set_attr "mode" "<sseinsnmode>")])
-
-(define_insn "<avx512>_vpermi2var<mode>3_mask"
- [(set (match_operand:VI2_AVX512VL 0 "register_operand" "=v")
- (vec_merge:VI2_AVX512VL
- (unspec:VI2_AVX512VL
- [(match_operand:VI2_AVX512VL 1 "register_operand" "v")
- (match_operand:<sseintvecmode> 2 "register_operand" "0")
- (match_operand:VI2_AVX512VL 3 "nonimmediate_operand" "vm")]
- UNSPEC_VPERMI2_MASK)
- (match_dup 0)
+(define_insn "*<avx512>_vpermi2var<mode>3_mask"
+ [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v")
+ (vec_merge:VF_AVX512VL
+ (unspec:VF_AVX512VL
+ [(match_operand:<sseintvecmode> 2 "register_operand" "0")
+ (match_operand:VF_AVX512VL 1 "register_operand" "v")
+ (match_operand:VF_AVX512VL 3 "nonimmediate_operand" "vm")]
+ UNSPEC_VPERMT2)
+ (subreg:VF_AVX512VL (match_dup 2) 0)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
- "TARGET_AVX512BW"
+ "TARGET_AVX512F"
"vpermi2<ssemodesuffix>\t{%3, %1, %0%{%4%}|%0%{%4%}, %1, %3}"
[(set_attr "type" "sselog")
(set_attr "prefix" "evex")
(set_attr "mode" "<sseinsnmode>")])
(define_expand "<avx512>_vpermt2var<mode>3_maskz"
- [(match_operand:VI48F 0 "register_operand")
+ [(match_operand:VPERMI2 0 "register_operand")
(match_operand:<sseintvecmode> 1 "register_operand")
- (match_operand:VI48F 2 "register_operand")
- (match_operand:VI48F 3 "nonimmediate_operand")
+ (match_operand:VPERMI2 2 "register_operand")
+ (match_operand:VPERMI2 3 "nonimmediate_operand")
(match_operand:<avx512fmaskmode> 4 "register_operand")]
"TARGET_AVX512F"
{
@@ -18242,80 +18142,28 @@
DONE;
})
-(define_expand "<avx512>_vpermt2var<mode>3_maskz"
- [(match_operand:VI1_AVX512VL 0 "register_operand")
- (match_operand:<sseintvecmode> 1 "register_operand")
- (match_operand:VI1_AVX512VL 2 "register_operand")
- (match_operand:VI1_AVX512VL 3 "nonimmediate_operand")
- (match_operand:<avx512fmaskmode> 4 "register_operand")]
- "TARGET_AVX512VBMI"
-{
- emit_insn (gen_<avx512>_vpermt2var<mode>3_maskz_1 (
- operands[0], operands[1], operands[2], operands[3],
- CONST0_RTX (<MODE>mode), operands[4]));
- DONE;
-})
-
-(define_expand "<avx512>_vpermt2var<mode>3_maskz"
- [(match_operand:VI2_AVX512VL 0 "register_operand")
- (match_operand:<sseintvecmode> 1 "register_operand")
- (match_operand:VI2_AVX512VL 2 "register_operand")
- (match_operand:VI2_AVX512VL 3 "nonimmediate_operand")
- (match_operand:<avx512fmaskmode> 4 "register_operand")]
- "TARGET_AVX512BW"
-{
- emit_insn (gen_<avx512>_vpermt2var<mode>3_maskz_1 (
- operands[0], operands[1], operands[2], operands[3],
- CONST0_RTX (<MODE>mode), operands[4]));
- DONE;
-})
-
(define_insn "<avx512>_vpermt2var<mode>3<sd_maskz_name>"
- [(set (match_operand:VI48F 0 "register_operand" "=v")
- (unspec:VI48F
- [(match_operand:<sseintvecmode> 1 "register_operand" "v")
- (match_operand:VI48F 2 "register_operand" "0")
- (match_operand:VI48F 3 "nonimmediate_operand" "vm")]
+ [(set (match_operand:VPERMI2 0 "register_operand" "=v,v")
+ (unspec:VPERMI2
+ [(match_operand:<sseintvecmode> 1 "register_operand" "v,0")
+ (match_operand:VPERMI2 2 "register_operand" "0,v")
+ (match_operand:VPERMI2 3 "nonimmediate_operand" "vm,vm")]
UNSPEC_VPERMT2))]
"TARGET_AVX512F"
- "vpermt2<ssemodesuffix>\t{%3, %1, %0<sd_mask_op4>|%0<sd_mask_op4>, %1, %3}"
- [(set_attr "type" "sselog")
- (set_attr "prefix" "evex")
- (set_attr "mode" "<sseinsnmode>")])
-
-(define_insn "<avx512>_vpermt2var<mode>3<sd_maskz_name>"
- [(set (match_operand:VI1_AVX512VL 0 "register_operand" "=v")
- (unspec:VI1_AVX512VL
- [(match_operand:<sseintvecmode> 1 "register_operand" "v")
- (match_operand:VI1_AVX512VL 2 "register_operand" "0")
- (match_operand:VI1_AVX512VL 3 "nonimmediate_operand" "vm")]
- UNSPEC_VPERMT2))]
- "TARGET_AVX512VBMI"
- "vpermt2<ssemodesuffix>\t{%3, %1, %0<sd_mask_op4>|%0<sd_mask_op4>, %1, %3}"
- [(set_attr "type" "sselog")
- (set_attr "prefix" "evex")
- (set_attr "mode" "<sseinsnmode>")])
-
-(define_insn "<avx512>_vpermt2var<mode>3<sd_maskz_name>"
- [(set (match_operand:VI2_AVX512VL 0 "register_operand" "=v")
- (unspec:VI2_AVX512VL
- [(match_operand:<sseintvecmode> 1 "register_operand" "v")
- (match_operand:VI2_AVX512VL 2 "register_operand" "0")
- (match_operand:VI2_AVX512VL 3 "nonimmediate_operand" "vm")]
- UNSPEC_VPERMT2))]
- "TARGET_AVX512BW"
- "vpermt2<ssemodesuffix>\t{%3, %1, %0<sd_mask_op4>|%0<sd_mask_op4>, %1, %3}"
+ "@
+ vpermt2<ssemodesuffix>\t{%3, %1, %0<sd_mask_op4>|%0<sd_mask_op4>, %1, %3}
+ vpermi2<ssemodesuffix>\t{%3, %2, %0<sd_mask_op4>|%0<sd_mask_op4>, %2, %3}"
[(set_attr "type" "sselog")
(set_attr "prefix" "evex")
(set_attr "mode" "<sseinsnmode>")])
(define_insn "<avx512>_vpermt2var<mode>3_mask"
- [(set (match_operand:VI48F 0 "register_operand" "=v")
- (vec_merge:VI48F
- (unspec:VI48F
+ [(set (match_operand:VPERMI2 0 "register_operand" "=v")
+ (vec_merge:VPERMI2
+ (unspec:VPERMI2
[(match_operand:<sseintvecmode> 1 "register_operand" "v")
- (match_operand:VI48F 2 "register_operand" "0")
- (match_operand:VI48F 3 "nonimmediate_operand" "vm")]
+ (match_operand:VPERMI2 2 "register_operand" "0")
+ (match_operand:VPERMI2 3 "nonimmediate_operand" "vm")]
UNSPEC_VPERMT2)
(match_dup 2)
(match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
@@ -18325,38 +18173,6 @@
(set_attr "prefix" "evex")
(set_attr "mode" "<sseinsnmode>")])
-(define_insn "<avx512>_vpermt2var<mode>3_mask"
- [(set (match_operand:VI1_AVX512VL 0 "register_operand" "=v")
- (vec_merge:VI1_AVX512VL
- (unspec:VI1_AVX512VL
- [(match_operand:<sseintvecmode> 1 "register_operand" "v")
- (match_operand:VI1_AVX512VL 2 "register_operand" "0")
- (match_operand:VI1_AVX512VL 3 "nonimmediate_operand" "vm")]
- UNSPEC_VPERMT2)
- (match_dup 2)
- (match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
- "TARGET_AVX512VBMI"
- "vpermt2<ssemodesuffix>\t{%3, %1, %0%{%4%}|%0%{%4%}, %1, %3}"
- [(set_attr "type" "sselog")
- (set_attr "prefix" "evex")
- (set_attr "mode" "<sseinsnmode>")])
-
-(define_insn "<avx512>_vpermt2var<mode>3_mask"
- [(set (match_operand:VI2_AVX512VL 0 "register_operand" "=v")
- (vec_merge:VI2_AVX512VL
- (unspec:VI2_AVX512VL
- [(match_operand:<sseintvecmode> 1 "register_operand" "v")
- (match_operand:VI2_AVX512VL 2 "register_operand" "0")
- (match_operand:VI2_AVX512VL 3 "nonimmediate_operand" "vm")]
- UNSPEC_VPERMT2)
- (match_dup 2)
- (match_operand:<avx512fmaskmode> 4 "register_operand" "Yk")))]
- "TARGET_AVX512BW"
- "vpermt2<ssemodesuffix>\t{%3, %1, %0%{%4%}|%0%{%4%}, %1, %3}"
- [(set_attr "type" "sselog")
- (set_attr "prefix" "evex")
- (set_attr "mode" "<sseinsnmode>")])
-
(define_expand "avx_vperm2f128<mode>3"
[(set (match_operand:AVX256MODE2P 0 "register_operand")
(unspec:AVX256MODE2P
@@ -19613,8 +19429,7 @@
UNSPEC_DBPSADBW))]
"TARGET_AVX512BW"
"vdbpsadbw\t{%3, %2, %1, %0<mask_operand4>|%0<mask_operand4>, %1, %2, %3}"
- [(set_attr "isa" "avx")
- (set_attr "type" "sselog1")
+ [(set_attr "type" "sselog1")
(set_attr "length_immediate" "1")
(set_attr "prefix" "evex")
(set_attr "mode" "<sseinsnmode>")])
@@ -20159,3 +19974,20 @@
])]
"TARGET_SSE && TARGET_64BIT"
"jmp\t%P1")
+
+(define_insn "vgf2p8affineinvqb_<mode><mask_name>"
+ [(set (match_operand:VI1_AVX512F 0 "register_operand" "=x,x,v")
+ (unspec:VI1_AVX512F [(match_operand:VI1_AVX512F 1 "register_operand" "%0,x,v")
+ (match_operand:VI1_AVX512F 2 "nonimmediate_operand" "xBm,xm,vm")
+ (match_operand:QI 3 "const_0_to_255_operand" "n,n,n")]
+ UNSPEC_GF2P8AFFINEINV))]
+ "TARGET_GFNI"
+ "@
+ gf2p8affineinvqb\t{%3, %2, %0| %0, %2, %3}
+ vgf2p8affineinvqb\t{%3, %2, %1, %0<mask_operand4>| %0<mask_operand4>, %1, %2, %3}
+ vgf2p8affineinvqb\t{%3, %2, %1, %0<mask_operand4>| %0<mask_operand4>, %1, %2, %3}"
+ [(set_attr "isa" "noavx,avx,avx512bw")
+ (set_attr "prefix_data16" "1,*,*")
+ (set_attr "prefix_extra" "1")
+ (set_attr "prefix" "orig,maybe_evex,evex")
+ (set_attr "mode" "<sseinsnmode>")])
diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
index a318a8d4c80..d9100c8d6b0 100644
--- a/gcc/config/i386/subst.md
+++ b/gcc/config/i386/subst.md
@@ -37,8 +37,7 @@
V8DI V4DI V2DI
V16SF V8SF V4SF
V8DF V4DF V2DF
- QI HI SI DI SF DF
- CCFP CCFPU])
+ QI HI SI DI SF DF])
(define_subst_attr "mask_name" "mask" "" "_mask")
(define_subst_attr "mask_applied" "mask" "false" "true")
@@ -62,8 +61,8 @@
(define_subst_attr "store_mask_predicate" "mask" "nonimmediate_operand" "register_operand")
(define_subst_attr "mask_prefix" "mask" "vex" "evex")
(define_subst_attr "mask_prefix2" "mask" "maybe_vex" "evex")
-(define_subst_attr "mask_prefix3" "mask" "orig,vex" "evex")
-(define_subst_attr "mask_prefix4" "mask" "orig,orig,vex" "evex")
+(define_subst_attr "mask_prefix3" "mask" "orig,vex" "evex,evex")
+(define_subst_attr "mask_prefix4" "mask" "orig,orig,vex" "evex,evex,evex")
(define_subst_attr "mask_expand_op3" "mask" "3" "5")
(define_subst "mask"
@@ -183,6 +182,16 @@
UNSPEC_EMBEDDED_ROUNDING))
])
+(define_subst "round_saeonly"
+ [(set (match_operand:CCFP 0)
+ (match_operand:CCFP 1))]
+ "TARGET_AVX512F"
+ [(set (match_dup 0)
+ (unspec:CCFP [(match_dup 1)
+ (match_operand:SI 2 "const48_operand")]
+ UNSPEC_EMBEDDED_ROUNDING))
+])
+
(define_subst_attr "round_expand_name" "round_expand" "" "_round")
(define_subst_attr "round_expand_nimm_predicate" "round_expand" "nonimmediate_operand" "register_operand")
(define_subst_attr "round_expand_operand" "round_expand" "" ", operands[5]")
diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index 29b82f86d43..eceaa73a679 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -219,29 +219,71 @@
(set (match_operand:DI 2 "memory_operand")
(unspec:DI [(match_dup 0)]
UNSPEC_FIST_ATOMIC))
- (set (match_operand:DF 3 "fp_register_operand")
+ (set (match_operand:DF 3 "any_fp_register_operand")
(match_operand:DF 4 "memory_operand"))]
"!TARGET_64BIT
&& peep2_reg_dead_p (2, operands[0])
- && rtx_equal_p (operands[4], adjust_address_nv (operands[2], DFmode, 0))"
+ && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))"
[(set (match_dup 3) (match_dup 5))]
"operands[5] = gen_lowpart (DFmode, operands[1]);")
(define_peephole2
+ [(set (match_operand:DF 0 "fp_register_operand")
+ (unspec:DF [(match_operand:DI 1 "memory_operand")]
+ UNSPEC_FILD_ATOMIC))
+ (set (match_operand:DI 2 "memory_operand")
+ (unspec:DI [(match_dup 0)]
+ UNSPEC_FIST_ATOMIC))
+ (set (mem:BLK (scratch:SI))
+ (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE))
+ (set (match_operand:DF 3 "any_fp_register_operand")
+ (match_operand:DF 4 "memory_operand"))]
+ "!TARGET_64BIT
+ && peep2_reg_dead_p (2, operands[0])
+ && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))"
+ [(const_int 0)]
+{
+ emit_move_insn (operands[3], gen_lowpart (DFmode, operands[1]));
+ emit_insn (gen_memory_blockage ());
+ DONE;
+})
+
+(define_peephole2
[(set (match_operand:DF 0 "sse_reg_operand")
(unspec:DF [(match_operand:DI 1 "memory_operand")]
UNSPEC_LDX_ATOMIC))
(set (match_operand:DI 2 "memory_operand")
(unspec:DI [(match_dup 0)]
UNSPEC_STX_ATOMIC))
- (set (match_operand:DF 3 "fp_register_operand")
+ (set (match_operand:DF 3 "any_fp_register_operand")
(match_operand:DF 4 "memory_operand"))]
"!TARGET_64BIT
&& peep2_reg_dead_p (2, operands[0])
- && rtx_equal_p (operands[4], adjust_address_nv (operands[2], DFmode, 0))"
+ && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))"
[(set (match_dup 3) (match_dup 5))]
"operands[5] = gen_lowpart (DFmode, operands[1]);")
+(define_peephole2
+ [(set (match_operand:DF 0 "sse_reg_operand")
+ (unspec:DF [(match_operand:DI 1 "memory_operand")]
+ UNSPEC_LDX_ATOMIC))
+ (set (match_operand:DI 2 "memory_operand")
+ (unspec:DI [(match_dup 0)]
+ UNSPEC_STX_ATOMIC))
+ (set (mem:BLK (scratch:SI))
+ (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE))
+ (set (match_operand:DF 3 "any_fp_register_operand")
+ (match_operand:DF 4 "memory_operand"))]
+ "!TARGET_64BIT
+ && peep2_reg_dead_p (2, operands[0])
+ && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))"
+ [(const_int 0)]
+{
+ emit_move_insn (operands[3], gen_lowpart (DFmode, operands[1]));
+ emit_insn (gen_memory_blockage ());
+ DONE;
+})
+
(define_expand "atomic_store<mode>"
[(set (match_operand:ATOMIC 0 "memory_operand")
(unspec:ATOMIC [(match_operand:ATOMIC 1 "nonimmediate_operand")
@@ -331,7 +373,7 @@
(define_peephole2
[(set (match_operand:DF 0 "memory_operand")
- (match_operand:DF 1 "fp_register_operand"))
+ (match_operand:DF 1 "any_fp_register_operand"))
(set (match_operand:DF 2 "fp_register_operand")
(unspec:DF [(match_operand:DI 3 "memory_operand")]
UNSPEC_FILD_ATOMIC))
@@ -340,13 +382,34 @@
UNSPEC_FIST_ATOMIC))]
"!TARGET_64BIT
&& peep2_reg_dead_p (3, operands[2])
- && rtx_equal_p (operands[0], adjust_address_nv (operands[3], DFmode, 0))"
+ && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
[(set (match_dup 5) (match_dup 1))]
"operands[5] = gen_lowpart (DFmode, operands[4]);")
(define_peephole2
[(set (match_operand:DF 0 "memory_operand")
- (match_operand:DF 1 "fp_register_operand"))
+ (match_operand:DF 1 "any_fp_register_operand"))
+ (set (mem:BLK (scratch:SI))
+ (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE))
+ (set (match_operand:DF 2 "fp_register_operand")
+ (unspec:DF [(match_operand:DI 3 "memory_operand")]
+ UNSPEC_FILD_ATOMIC))
+ (set (match_operand:DI 4 "memory_operand")
+ (unspec:DI [(match_dup 2)]
+ UNSPEC_FIST_ATOMIC))]
+ "!TARGET_64BIT
+ && peep2_reg_dead_p (4, operands[2])
+ && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
+ [(const_int 0)]
+{
+ emit_insn (gen_memory_blockage ());
+ emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]);
+ DONE;
+})
+
+(define_peephole2
+ [(set (match_operand:DF 0 "memory_operand")
+ (match_operand:DF 1 "any_fp_register_operand"))
(set (match_operand:DF 2 "sse_reg_operand")
(unspec:DF [(match_operand:DI 3 "memory_operand")]
UNSPEC_LDX_ATOMIC))
@@ -355,10 +418,31 @@
UNSPEC_STX_ATOMIC))]
"!TARGET_64BIT
&& peep2_reg_dead_p (3, operands[2])
- && rtx_equal_p (operands[0], adjust_address_nv (operands[3], DFmode, 0))"
+ && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
[(set (match_dup 5) (match_dup 1))]
"operands[5] = gen_lowpart (DFmode, operands[4]);")
+(define_peephole2
+ [(set (match_operand:DF 0 "memory_operand")
+ (match_operand:DF 1 "any_fp_register_operand"))
+ (set (mem:BLK (scratch:SI))
+ (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE))
+ (set (match_operand:DF 2 "sse_reg_operand")
+ (unspec:DF [(match_operand:DI 3 "memory_operand")]
+ UNSPEC_LDX_ATOMIC))
+ (set (match_operand:DI 4 "memory_operand")
+ (unspec:DI [(match_dup 2)]
+ UNSPEC_STX_ATOMIC))]
+ "!TARGET_64BIT
+ && peep2_reg_dead_p (4, operands[2])
+ && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
+ [(const_int 0)]
+{
+ emit_insn (gen_memory_blockage ());
+ emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]);
+ DONE;
+})
+
;; ??? You'd think that we'd be able to perform this via FLOAT + FIX_TRUNC
;; operations. But the fix_trunc patterns want way more setup than we want
;; to provide. Note that the scratch is DFmode instead of XFmode in order
diff --git a/gcc/config/i386/t-cet b/gcc/config/i386/t-cet
new file mode 100644
index 00000000000..317f30dbb98
--- /dev/null
+++ b/gcc/config/i386/t-cet
@@ -0,0 +1,21 @@
+# Copyright (C) 2017 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3. If not see
+# <http://www.gnu.org/licenses/>.
+
+cet.o: $(srcdir)/config/i386/cet.c
+ $(COMPILE) $<
+ $(POSTCOMPILE)
diff --git a/gcc/config/i386/winnt.c b/gcc/config/i386/winnt.c
index e690d2b907d..2f8518e1f1d 100644
--- a/gcc/config/i386/winnt.c
+++ b/gcc/config/i386/winnt.c
@@ -1217,8 +1217,7 @@ void
i386_pe_start_function (FILE *f, const char *name, tree decl)
{
i386_pe_maybe_record_exported_symbol (decl, name, 0);
- if (write_symbols != SDB_DEBUG)
- i386_pe_declare_function_type (f, name, TREE_PUBLIC (decl));
+ i386_pe_declare_function_type (f, name, TREE_PUBLIC (decl));
/* In case section was altered by debugging output. */
if (decl != NULL_TREE)
switch_to_section (function_section (decl));
diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index d27072c0901..c7ac70e8453 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1,4 +1,26 @@
+/* Costs of operations of individual x86 CPUs.
+ Copyright (C) 1988-2017 Free Software Foundation, Inc.
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+<http://www.gnu.org/licenses/>. */
/* Processor costs (relative to an add) */
/* We assume COSTS_N_INSNS is defined as (N)*4 and an addition is 2 bytes. */
#define COSTS_N_BYTES(N) ((N) * 2)
@@ -33,6 +55,8 @@ struct processor_costs ix86_size_cost = {/* costs for tuning for size */
COSTS_N_BYTES (3), /* cost of movzx */
0, /* "large" insn */
2, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2. */
2, /* cost for loading QImode using movzbl */
{2, 2, 2}, /* cost of loading integer registers
in QImode, HImode and SImode.
@@ -48,12 +72,18 @@ struct processor_costs ix86_size_cost = {/* costs for tuning for size */
in SImode and DImode */
{3, 3}, /* cost of storing MMX registers
in SImode and DImode */
- 3, /* cost of moving SSE register */
- {3, 3, 3}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {3, 3, 3}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 3, /* MMX or SSE register to integer */
+ 3, 3, 3, /* cost of moving XMM,YMM,ZMM register */
+ {3, 3, 3, 3, 3}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {3, 3, 3, 3, 3}, /* cost of unaligned SSE load
+ in 128bit, 256bit and 512bit */
+ {3, 3, 3, 3, 3}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {3, 3, 3, 3, 3}, /* cost of unaligned SSE store
+ in 128bit, 256bit and 512bit */
+ 3, 3, /* SSE->integer and integer->SSE moves */
+ 5, 0, /* Gather load static, per_elt. */
+ 5, 0, /* Gather store static, per_elt. */
0, /* size of l1 cache */
0, /* size of l2 cache */
0, /* size of prefetch block */
@@ -65,20 +95,22 @@ struct processor_costs ix86_size_cost = {/* costs for tuning for size */
COSTS_N_BYTES (2), /* cost of FABS instruction. */
COSTS_N_BYTES (2), /* cost of FCHS instruction. */
COSTS_N_BYTES (2), /* cost of FSQRT instruction. */
+
+ COSTS_N_BYTES (2), /* cost of cheap SSE instruction. */
+ COSTS_N_BYTES (2), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_BYTES (2), /* cost of MULSS instruction. */
+ COSTS_N_BYTES (2), /* cost of MULSD instruction. */
+ COSTS_N_BYTES (2), /* cost of FMA SS instruction. */
+ COSTS_N_BYTES (2), /* cost of FMA SD instruction. */
+ COSTS_N_BYTES (2), /* cost of DIVSS instruction. */
+ COSTS_N_BYTES (2), /* cost of DIVSD instruction. */
+ COSTS_N_BYTES (2), /* cost of SQRTSS instruction. */
+ COSTS_N_BYTES (2), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
ix86_size_memcpy,
ix86_size_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 1, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 1, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_BYTES (1), /* cond_taken_branch_cost. */
+ COSTS_N_BYTES (1), /* cond_not_taken_branch_cost. */
};
/* Processor costs (relative to an add) */
@@ -110,6 +142,9 @@ struct processor_costs i386_cost = { /* 386 specific costs */
COSTS_N_INSNS (2), /* cost of movzx */
15, /* "large" insn */
3, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
4, /* cost for loading QImode using movzbl */
{2, 4, 2}, /* cost of loading integer registers
in QImode, HImode and SImode.
@@ -125,12 +160,16 @@ struct processor_costs i386_cost = { /* 386 specific costs */
in SImode and DImode */
{4, 8}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 8, 16}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 8, 16}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 3, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {4, 8, 16, 32, 64}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 8, 16, 32, 64}, /* cost of unaligned loads. */
+ {4, 8, 16, 32, 64}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 8, 16, 32, 64}, /* cost of unaligned stores. */
+ 3, 3, /* SSE->integer and integer->SSE moves */
+ 4, 4, /* Gather load static, per_elt. */
+ 4, 4, /* Gather store static, per_elt. */
0, /* size of l1 cache */
0, /* size of l2 cache */
0, /* size of prefetch block */
@@ -142,20 +181,22 @@ struct processor_costs i386_cost = { /* 386 specific costs */
COSTS_N_INSNS (22), /* cost of FABS instruction. */
COSTS_N_INSNS (24), /* cost of FCHS instruction. */
COSTS_N_INSNS (122), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (23), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (27), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (27), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (27), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (27), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (88), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (88), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (122), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (122), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
i386_memcpy,
i386_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static stringop_algs i486_memcpy[2] = {
@@ -186,6 +227,9 @@ struct processor_costs i486_cost = { /* 486 specific costs */
COSTS_N_INSNS (2), /* cost of movzx */
15, /* "large" insn */
3, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
4, /* cost for loading QImode using movzbl */
{2, 4, 2}, /* cost of loading integer registers
in QImode, HImode and SImode.
@@ -201,12 +245,16 @@ struct processor_costs i486_cost = { /* 486 specific costs */
in SImode and DImode */
{4, 8}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 8, 16}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 8, 16}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 3, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {4, 8, 16, 32, 64}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 8, 16, 32, 64}, /* cost of unaligned loads. */
+ {4, 8, 16, 32, 64}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 8, 16, 32, 64}, /* cost of unaligned stores. */
+ 3, 3, /* SSE->integer and integer->SSE moves */
+ 4, 4, /* Gather load static, per_elt. */
+ 4, 4, /* Gather store static, per_elt. */
4, /* size of l1 cache. 486 has 8kB cache
shared for code and data, so 4kB is
not really precise. */
@@ -220,20 +268,22 @@ struct processor_costs i486_cost = { /* 486 specific costs */
COSTS_N_INSNS (3), /* cost of FABS instruction. */
COSTS_N_INSNS (3), /* cost of FCHS instruction. */
COSTS_N_INSNS (83), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (8), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (16), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (16), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (16), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (16), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (73), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (74), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (83), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (83), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
i486_memcpy,
i486_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static stringop_algs pentium_memcpy[2] = {
@@ -264,6 +314,9 @@ struct processor_costs pentium_cost = {
COSTS_N_INSNS (2), /* cost of movzx */
8, /* "large" insn */
6, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
6, /* cost for loading QImode using movzbl */
{2, 4, 2}, /* cost of loading integer registers
in QImode, HImode and SImode.
@@ -279,12 +332,16 @@ struct processor_costs pentium_cost = {
in SImode and DImode */
{8, 8}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 8, 16}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 8, 16}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 3, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {4, 8, 16, 32, 64}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 8, 16, 32, 64}, /* cost of unaligned loads. */
+ {4, 8, 16, 32, 64}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 8, 16, 32, 64}, /* cost of unaligned stores. */
+ 3, 3, /* SSE->integer and integer->SSE moves */
+ 4, 4, /* Gather load static, per_elt. */
+ 4, 4, /* Gather store static, per_elt. */
8, /* size of l1 cache. */
8, /* size of l2 cache */
0, /* size of prefetch block */
@@ -296,20 +353,22 @@ struct processor_costs pentium_cost = {
COSTS_N_INSNS (1), /* cost of FABS instruction. */
COSTS_N_INSNS (1), /* cost of FCHS instruction. */
COSTS_N_INSNS (70), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (3), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (3), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (3), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (39), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (39), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (70), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (70), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
pentium_memcpy,
pentium_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static const
@@ -333,6 +392,9 @@ struct processor_costs lakemont_cost = {
COSTS_N_INSNS (2), /* cost of movzx */
8, /* "large" insn */
17, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
6, /* cost for loading QImode using movzbl */
{2, 4, 2}, /* cost of loading integer registers
in QImode, HImode and SImode.
@@ -348,12 +410,16 @@ struct processor_costs lakemont_cost = {
in SImode and DImode */
{8, 8}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 8, 16}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 8, 16}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 3, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {4, 8, 16, 32, 64}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 8, 16, 32, 64}, /* cost of unaligned loads. */
+ {4, 8, 16, 32, 64}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 8, 16, 32, 64}, /* cost of unaligned stores. */
+ 3, 3, /* SSE->integer and integer->SSE moves */
+ 4, 4, /* Gather load static, per_elt. */
+ 4, 4, /* Gather store static, per_elt. */
8, /* size of l1 cache. */
8, /* size of l2 cache */
0, /* size of prefetch block */
@@ -365,20 +431,22 @@ struct processor_costs lakemont_cost = {
COSTS_N_INSNS (1), /* cost of FABS instruction. */
COSTS_N_INSNS (1), /* cost of FCHS instruction. */
COSTS_N_INSNS (70), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (5), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (5), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (5), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (10), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (10), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (31), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (60), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (31), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (63), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
pentium_memcpy,
pentium_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
/* PentiumPro has optimized rep instructions for blocks aligned by 8 bytes
@@ -417,6 +485,9 @@ struct processor_costs pentiumpro_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
6, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
2, /* cost for loading QImode using movzbl */
{4, 4, 4}, /* cost of loading integer registers
in QImode, HImode and SImode.
@@ -432,12 +503,16 @@ struct processor_costs pentiumpro_cost = {
in SImode and DImode */
{2, 2}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {2, 2, 8}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {2, 2, 8}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 3, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {4, 8, 16, 32, 64}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 8, 16, 32, 64}, /* cost of unaligned loads. */
+ {4, 8, 16, 32, 64}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 8, 16, 32, 64}, /* cost of unaligned stores. */
+ 3, 3, /* SSE->integer and integer->SSE moves */
+ 4, 4, /* Gather load static, per_elt. */
+ 4, 4, /* Gather store static, per_elt. */
8, /* size of l1 cache. */
256, /* size of l2 cache */
32, /* size of prefetch block */
@@ -449,20 +524,22 @@ struct processor_costs pentiumpro_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (56), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (3), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (4), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (4), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (7), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (7), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (18), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (18), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (31), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (31), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
pentiumpro_memcpy,
pentiumpro_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static stringop_algs geode_memcpy[2] = {
@@ -492,13 +569,16 @@ struct processor_costs geode_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
4, /* MOVE_RATIO */
- 1, /* cost for loading QImode using movzbl */
- {1, 1, 1}, /* cost of loading integer registers
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 2, /* cost for loading QImode using movzbl */
+ {2, 2, 2}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {1, 1, 1}, /* cost of storing integer registers */
- 1, /* cost of reg,reg fld/fst */
- {1, 1, 1}, /* cost of loading fp registers
+ {2, 2, 2}, /* cost of storing integer registers */
+ 2, /* cost of reg,reg fld/fst */
+ {2, 2, 2}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
{4, 6, 6}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
@@ -508,12 +588,16 @@ struct processor_costs geode_cost = {
in SImode and DImode */
{2, 2}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {2, 2, 8}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {2, 2, 8}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 3, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {2, 2, 8, 16, 32}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {2, 2, 8, 16, 32}, /* cost of unaligned loads. */
+ {2, 2, 8, 16, 32}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {2, 2, 8, 16, 32}, /* cost of unaligned stores. */
+ 6, 6, /* SSE->integer and integer->SSE moves */
+ 2, 2, /* Gather load static, per_elt. */
+ 2, 2, /* Gather store static, per_elt. */
64, /* size of l1 cache. */
128, /* size of l2 cache. */
32, /* size of prefetch block */
@@ -525,20 +609,22 @@ struct processor_costs geode_cost = {
COSTS_N_INSNS (1), /* cost of FABS instruction. */
COSTS_N_INSNS (1), /* cost of FCHS instruction. */
COSTS_N_INSNS (54), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (6), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (11), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (11), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (17), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (17), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (47), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (47), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (54), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (54), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
geode_memcpy,
geode_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static stringop_algs k6_memcpy[2] = {
@@ -568,6 +654,9 @@ struct processor_costs k6_cost = {
COSTS_N_INSNS (2), /* cost of movzx */
8, /* "large" insn */
4, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
3, /* cost for loading QImode using movzbl */
{4, 5, 4}, /* cost of loading integer registers
in QImode, HImode and SImode.
@@ -583,12 +672,16 @@ struct processor_costs k6_cost = {
in SImode and DImode */
{2, 2}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {2, 2, 8}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {2, 2, 8}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 6, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {2, 2, 8, 16, 32}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {2, 2, 8, 16, 32}, /* cost of unaligned loads. */
+ {2, 2, 8, 16, 32}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {2, 2, 8, 16, 32}, /* cost of unaligned stores. */
+ 6, 6, /* SSE->integer and integer->SSE moves */
+ 2, 2, /* Gather load static, per_elt. */
+ 2, 2, /* Gather store static, per_elt. */
32, /* size of l1 cache. */
32, /* size of l2 cache. Some models
have integrated l2 cache, but
@@ -603,20 +696,22 @@ struct processor_costs k6_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (56), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (2), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (2), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (2), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (4), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (4), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (56), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (56), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (56), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (56), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
k6_memcpy,
k6_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
/* For some reason, Athlon deals better with REP prefix (relative to loops)
@@ -649,6 +744,9 @@ struct processor_costs athlon_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
9, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
4, /* cost for loading QImode using movzbl */
{3, 4, 3}, /* cost of loading integer registers
in QImode, HImode and SImode.
@@ -664,12 +762,16 @@ struct processor_costs athlon_cost = {
in SImode and DImode */
{4, 4}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 4, 6}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 4, 5}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 5, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {4, 4, 6, 12, 24}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 4, 6, 12, 24}, /* cost of unaligned loads. */
+ {4, 4, 5, 10, 20}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 4, 5, 10, 20}, /* cost of unaligned stores. */
+ 5, 5, /* SSE->integer and integer->SSE moves */
+ 4, 4, /* Gather load static, per_elt. */
+ 4, 4, /* Gather store static, per_elt. */
64, /* size of l1 cache. */
256, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -681,20 +783,23 @@ struct processor_costs athlon_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (35), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (2), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (4), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (4), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (4), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (8), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (8), /* cost of FMA SD instruction. */
+ /* 11-16 */
+ COSTS_N_INSNS (16), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (24), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (19), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (19), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
athlon_memcpy,
athlon_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
/* K8 has optimized REP instruction for medium sized blocks, but for very
@@ -731,6 +836,9 @@ struct processor_costs k8_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
9, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
4, /* cost for loading QImode using movzbl */
{3, 4, 3}, /* cost of loading integer registers
in QImode, HImode and SImode.
@@ -746,12 +854,16 @@ struct processor_costs k8_cost = {
in SImode and DImode */
{4, 4}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 3, 6}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 4, 5}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 5, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {4, 3, 6, 12, 24}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 3, 6, 12, 24}, /* cost of unaligned loads. */
+ {4, 4, 5, 10, 20}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 4, 5, 10, 20}, /* cost of unaligned stores. */
+ 5, 5, /* SSE->integer and integer->SSE moves */
+ 4, 4, /* Gather load static, per_elt. */
+ 4, 4, /* Gather store static, per_elt. */
64, /* size of l1 cache. */
512, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -768,20 +880,23 @@ struct processor_costs k8_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (35), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (2), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (4), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (4), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (4), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (8), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (8), /* cost of FMA SD instruction. */
+ /* 11-16 */
+ COSTS_N_INSNS (16), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (20), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (19), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (27), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
k8_memcpy,
k8_memset,
- 4, /* scalar_stmt_cost. */
- 2, /* scalar load_cost. */
- 2, /* scalar_store_cost. */
- 5, /* vec_stmt_cost. */
- 0, /* vec_to_scalar_cost. */
- 2, /* scalar_to_vec_cost. */
- 2, /* vec_align_load_cost. */
- 3, /* vec_unalign_load_cost. */
- 3, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 2, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (2), /* cond_not_taken_branch_cost. */
};
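
The "relative to integer->integer move times 2" note inside the table above boils down to a simple scaling: every load/store entry is twice its latency in cycles, so the 2-unit reg-reg integer move stays the baseline. A hypothetical helper, purely for illustration, would be:

/* Hypothetical helper (not part of the patch): derive a move-cost entry
   from a latency in cycles under the latency*2 convention, e.g. a
   3-cycle load becomes 6 and a 12-cycle 512-bit load becomes 24.  */
static inline int
move_cost_from_latency (int latency_cycles)
{
  return 2 * latency_cycles;
}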
/* AMDFAM10 has optimized REP instruction for medium sized blocks, but for
@@ -817,6 +932,9 @@ struct processor_costs amdfam10_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
9, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
4, /* cost for loading QImode using movzbl */
{3, 4, 3}, /* cost of loading integer registers
in QImode, HImode and SImode.
@@ -832,12 +950,14 @@ struct processor_costs amdfam10_cost = {
in SImode and DImode */
{4, 4}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 4, 3}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 4, 5}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 3, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {4, 4, 3, 6, 12}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 4, 3, 7, 12}, /* cost of unaligned loads. */
+ {4, 4, 5, 10, 20}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {4, 4, 5, 10, 20}, /* cost of unaligned stores. */
+ 3, 3, /* SSE->integer and integer->SSE moves */
/* On K8:
MOVD reg64, xmmreg Double FSTORE 4
MOVD reg32, xmmreg Double FSTORE 4
@@ -846,6 +966,8 @@ struct processor_costs amdfam10_cost = {
1/1 1/1
MOVD reg32, xmmreg Double FADD 3
1/1 1/1 */
+ 4, 4, /* Gather load static, per_elt. */
+ 4, 4, /* Gather store static, per_elt. */
64, /* size of l1 cache. */
512, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -862,20 +984,23 @@ struct processor_costs amdfam10_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (35), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (2), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (4), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (4), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (4), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (8), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (8), /* cost of FMA SD instruction. */
+ /* 11-16 */
+ COSTS_N_INSNS (16), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (20), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (19), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (27), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
amdfam10_memcpy,
amdfam10_memset,
- 4, /* scalar_stmt_cost. */
- 2, /* scalar load_cost. */
- 2, /* scalar_store_cost. */
- 6, /* vec_stmt_cost. */
- 0, /* vec_to_scalar_cost. */
- 2, /* scalar_to_vec_cost. */
- 2, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 2, /* vec_store_cost. */
- 2, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (2), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
/* BDVER1 has optimized REP instruction for medium sized blocks, but for
@@ -912,35 +1037,34 @@ const struct processor_costs bdver1_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
9, /* MOVE_RATIO */
- 4, /* cost for loading QImode using movzbl */
- {5, 5, 4}, /* cost of loading integer registers
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 8, /* cost for loading QImode using movzbl */
+ {8, 8, 8}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {4, 4, 4}, /* cost of storing integer registers */
- 2, /* cost of reg,reg fld/fst */
- {5, 5, 12}, /* cost of loading fp registers
+ {8, 8, 8}, /* cost of storing integer registers */
+ 4, /* cost of reg,reg fld/fst */
+ {12, 12, 28}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {4, 4, 8}, /* cost of storing fp registers
+ {10, 10, 18}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
- 2, /* cost of moving MMX register */
- {4, 4}, /* cost of loading MMX registers
+ 4, /* cost of moving MMX register */
+ {12, 12}, /* cost of loading MMX registers
in SImode and DImode */
- {4, 4}, /* cost of storing MMX registers
+ {10, 10}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 4, 4}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 4, 4}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 2, /* MMX or SSE register to integer */
- /* On K8:
- MOVD reg64, xmmreg Double FSTORE 4
- MOVD reg32, xmmreg Double FSTORE 4
- On AMDFAM10:
- MOVD reg64, xmmreg Double FADD 3
- 1/1 1/1
- MOVD reg32, xmmreg Double FADD 3
- 1/1 1/1 */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {12, 12, 10, 20, 30}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {12, 12, 10, 20, 30}, /* cost of unaligned loads. */
+ {10, 10, 10, 20, 30}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 10, 20, 30}, /* cost of unaligned stores. */
+ 16, 20, /* SSE->integer and integer->SSE moves */
+ 12, 12, /* Gather load static, per_elt. */
+ 10, 10, /* Gather store static, per_elt. */
16, /* size of l1 cache. */
2048, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -957,20 +1081,24 @@ const struct processor_costs bdver1_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (52), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (2), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (6), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (6), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (6), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SD instruction. */
+ /* 9-24 */
+ COSTS_N_INSNS (24), /* cost of DIVSS instruction. */
+ /* 9-27 */
+ COSTS_N_INSNS (27), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (15), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (26), /* cost of SQRTSD instruction. */
1, 2, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
bdver1_memcpy,
bdver1_memset,
- 6, /* scalar_stmt_cost. */
- 4, /* scalar load_cost. */
- 4, /* scalar_store_cost. */
- 6, /* vec_stmt_cost. */
- 0, /* vec_to_scalar_cost. */
- 2, /* scalar_to_vec_cost. */
- 4, /* vec_align_load_cost. */
- 4, /* vec_unalign_load_cost. */
- 4, /* vec_store_cost. */
- 4, /* cond_taken_branch_cost. */
- 2, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (4), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (2), /* cond_not_taken_branch_cost. */
};
/* BDVER2 has optimized REP instruction for medium sized blocks, but for
@@ -1008,35 +1136,34 @@ const struct processor_costs bdver2_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
9, /* MOVE_RATIO */
- 4, /* cost for loading QImode using movzbl */
- {5, 5, 4}, /* cost of loading integer registers
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 8, /* cost for loading QImode using movzbl */
+ {8, 8, 8}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {4, 4, 4}, /* cost of storing integer registers */
- 2, /* cost of reg,reg fld/fst */
- {5, 5, 12}, /* cost of loading fp registers
+ {8, 8, 8}, /* cost of storing integer registers */
+ 4, /* cost of reg,reg fld/fst */
+ {12, 12, 28}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {4, 4, 8}, /* cost of storing fp registers
+ {10, 10, 18}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
- 2, /* cost of moving MMX register */
- {4, 4}, /* cost of loading MMX registers
+ 4, /* cost of moving MMX register */
+ {12, 12}, /* cost of loading MMX registers
in SImode and DImode */
- {4, 4}, /* cost of storing MMX registers
+ {10, 10}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 4, 4}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 4, 4}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 2, /* MMX or SSE register to integer */
- /* On K8:
- MOVD reg64, xmmreg Double FSTORE 4
- MOVD reg32, xmmreg Double FSTORE 4
- On AMDFAM10:
- MOVD reg64, xmmreg Double FADD 3
- 1/1 1/1
- MOVD reg32, xmmreg Double FADD 3
- 1/1 1/1 */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {12, 12, 10, 20, 30}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {12, 12, 10, 20, 30}, /* cost of unaligned loads. */
+ {10, 10, 10, 20, 30}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 10, 20, 30}, /* cost of unaligned stores. */
+ 16, 20, /* SSE->integer and integer->SSE moves */
+ 12, 12, /* Gather load static, per_elt. */
+ 10, 10, /* Gather store static, per_elt. */
16, /* size of l1 cache. */
2048, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1053,20 +1180,24 @@ const struct processor_costs bdver2_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (52), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (2), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (6), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (6), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (6), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SD instruction. */
+ /* 9-24 */
+ COSTS_N_INSNS (24), /* cost of DIVSS instruction. */
+ /* 9-27 */
+ COSTS_N_INSNS (27), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (15), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (26), /* cost of SQRTSD instruction. */
1, 2, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
bdver2_memcpy,
bdver2_memset,
- 6, /* scalar_stmt_cost. */
- 4, /* scalar load_cost. */
- 4, /* scalar_store_cost. */
- 6, /* vec_stmt_cost. */
- 0, /* vec_to_scalar_cost. */
- 2, /* scalar_to_vec_cost. */
- 4, /* vec_align_load_cost. */
- 4, /* vec_unalign_load_cost. */
- 4, /* vec_store_cost. */
- 4, /* cond_taken_branch_cost. */
- 2, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (4), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (2), /* cond_not_taken_branch_cost. */
};
@@ -1103,27 +1234,34 @@ struct processor_costs bdver3_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
9, /* MOVE_RATIO */
- 4, /* cost for loading QImode using movzbl */
- {5, 5, 4}, /* cost of loading integer registers
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 8, /* cost for loading QImode using movzbl */
+ {8, 8, 8}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {4, 4, 4}, /* cost of storing integer registers */
- 2, /* cost of reg,reg fld/fst */
- {5, 5, 12}, /* cost of loading fp registers
+ {8, 8, 8}, /* cost of storing integer registers */
+ 4, /* cost of reg,reg fld/fst */
+ {12, 12, 28}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {4, 4, 8}, /* cost of storing fp registers
+ {10, 10, 18}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
- 2, /* cost of moving MMX register */
- {4, 4}, /* cost of loading MMX registers
+ 4, /* cost of moving MMX register */
+ {12, 12}, /* cost of loading MMX registers
in SImode and DImode */
- {4, 4}, /* cost of storing MMX registers
+ {10, 10}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 4, 4}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 4, 4}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 2, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {12, 12, 10, 20, 30}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {12, 12, 10, 20, 30}, /* cost of unaligned loads. */
+ {10, 10, 10, 20, 30}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 10, 20, 30}, /* cost of unaligned stores. */
+ 16, 20, /* SSE->integer and integer->SSE moves */
+ 12, 12, /* Gather load static, per_elt. */
+ 10, 10, /* Gather store static, per_elt. */
16, /* size of l1 cache. */
2048, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1140,20 +1278,24 @@ struct processor_costs bdver3_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (52), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (2), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (6), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (6), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (6), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SD instruction. */
+ /* 9-24 */
+ COSTS_N_INSNS (24), /* cost of DIVSS instruction. */
+ /* 9-27 */
+ COSTS_N_INSNS (27), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (15), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (26), /* cost of SQRTSD instruction. */
1, 2, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
bdver3_memcpy,
bdver3_memset,
- 6, /* scalar_stmt_cost. */
- 4, /* scalar load_cost. */
- 4, /* scalar_store_cost. */
- 6, /* vec_stmt_cost. */
- 0, /* vec_to_scalar_cost. */
- 2, /* scalar_to_vec_cost. */
- 4, /* vec_align_load_cost. */
- 4, /* vec_unalign_load_cost. */
- 4, /* vec_store_cost. */
- 4, /* cond_taken_branch_cost. */
- 2, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (4), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (2), /* cond_not_taken_branch_cost. */
};
/* BDVER4 has optimized REP instruction for medium sized blocks, but for
@@ -1189,27 +1331,34 @@ struct processor_costs bdver4_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
9, /* MOVE_RATIO */
- 4, /* cost for loading QImode using movzbl */
- {5, 5, 4}, /* cost of loading integer registers
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 8, /* cost for loading QImode using movzbl */
+ {8, 8, 8}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {4, 4, 4}, /* cost of storing integer registers */
- 2, /* cost of reg,reg fld/fst */
- {5, 5, 12}, /* cost of loading fp registers
+ {8, 8, 8}, /* cost of storing integer registers */
+ 4, /* cost of reg,reg fld/fst */
+ {12, 12, 28}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {4, 4, 8}, /* cost of storing fp registers
+ {10, 10, 18}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
- 2, /* cost of moving MMX register */
- {4, 4}, /* cost of loading MMX registers
+ 4, /* cost of moving MMX register */
+ {12, 12}, /* cost of loading MMX registers
in SImode and DImode */
- {4, 4}, /* cost of storing MMX registers
+ {10, 10}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 4, 4}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 4, 4}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 2, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {12, 12, 10, 20, 30}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {12, 12, 10, 20, 30}, /* cost of unaligned loads. */
+ {10, 10, 10, 20, 30}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 10, 20, 30}, /* cost of unaligned stores. */
+ 16, 20, /* SSE->integer and integer->SSE moves */
+ 12, 12, /* Gather load static, per_elt. */
+ 10, 10, /* Gather store static, per_elt. */
16, /* size of l1 cache. */
2048, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1226,20 +1375,24 @@ struct processor_costs bdver4_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (52), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (2), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (6), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (6), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (6), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SD instruction. */
+ /* 9-24 */
+ COSTS_N_INSNS (24), /* cost of DIVSS instruction. */
+ /* 9-27 */
+ COSTS_N_INSNS (27), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (15), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (26), /* cost of SQRTSD instruction. */
1, 2, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
bdver4_memcpy,
bdver4_memset,
- 6, /* scalar_stmt_cost. */
- 4, /* scalar load_cost. */
- 4, /* scalar_store_cost. */
- 6, /* vec_stmt_cost. */
- 0, /* vec_to_scalar_cost. */
- 2, /* scalar_to_vec_cost. */
- 4, /* vec_align_load_cost. */
- 4, /* vec_unalign_load_cost. */
- 4, /* vec_store_cost. */
- 4, /* cond_taken_branch_cost. */
- 2, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (4), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (2), /* cond_not_taken_branch_cost. */
};
@@ -1264,42 +1417,59 @@ struct processor_costs znver1_cost = {
{COSTS_N_INSNS (3), /* cost of starting multiply for QI. */
COSTS_N_INSNS (3), /* HI. */
COSTS_N_INSNS (3), /* SI. */
- COSTS_N_INSNS (4), /* DI. */
- COSTS_N_INSNS (4)}, /* other. */
+ COSTS_N_INSNS (3), /* DI. */
+ COSTS_N_INSNS (3)}, /* other. */
0, /* cost of multiply per each bit
set. */
- {COSTS_N_INSNS (19), /* cost of a divide/mod for QI. */
- COSTS_N_INSNS (35), /* HI. */
- COSTS_N_INSNS (51), /* SI. */
- COSTS_N_INSNS (83), /* DI. */
- COSTS_N_INSNS (83)}, /* other. */
+ /* Depending on parameters, idiv can get faster on Ryzen. This is the upper
+ bound. */
+ {COSTS_N_INSNS (16), /* cost of a divide/mod for QI. */
+ COSTS_N_INSNS (22), /* HI. */
+ COSTS_N_INSNS (30), /* SI. */
+ COSTS_N_INSNS (45), /* DI. */
+ COSTS_N_INSNS (45)}, /* other. */
COSTS_N_INSNS (1), /* cost of movsx. */
COSTS_N_INSNS (1), /* cost of movzx. */
8, /* "large" insn. */
9, /* MOVE_RATIO. */
- 4, /* cost for loading QImode using
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+
+ /* reg-reg moves are done by renaming and thus they are even cheaper than
+ 1 cycle. Because reg-reg move cost is 2 and the following tables correspond
+ to doubles of latencies, we do not model this correctly. It does not
+ seem to make a practical difference to bump prices up even more. */
+ 6, /* cost for loading QImode using
movzbl. */
- {5, 5, 4}, /* cost of loading integer registers
+ {6, 6, 6}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {4, 4, 4}, /* cost of storing integer
+ {8, 8, 8}, /* cost of storing integer
registers. */
2, /* cost of reg,reg fld/fst. */
- {5, 5, 12}, /* cost of loading fp registers
+ {6, 6, 16}, /* cost of loading fp registers
in SFmode, DFmode and XFmode. */
- {4, 4, 8}, /* cost of storing fp registers
+ {8, 8, 16}, /* cost of storing fp registers
in SFmode, DFmode and XFmode. */
2, /* cost of moving MMX register. */
- {4, 4}, /* cost of loading MMX registers
+ {6, 6}, /* cost of loading MMX registers
in SImode and DImode. */
- {4, 4}, /* cost of storing MMX registers
+ {8, 8}, /* cost of storing MMX registers
in SImode and DImode. */
- 2, /* cost of moving SSE register. */
- {4, 4, 4}, /* cost of loading SSE registers
- in SImode, DImode and TImode. */
- {4, 4, 4}, /* cost of storing SSE registers
- in SImode, DImode and TImode. */
- 2, /* MMX or SSE register to integer. */
+ 2, 3, 6, /* cost of moving XMM,YMM,ZMM register. */
+ {6, 6, 6, 10, 20}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit. */
+ {6, 6, 6, 10, 20}, /* cost of unaligned loads. */
+ {8, 8, 8, 8, 16}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit. */
+ {8, 8, 8, 8, 16}, /* cost of unaligned stores. */
+ 6, 6, /* SSE->integer and integer->SSE moves. */
+ /* VGATHERDPD is 23 uops and throughput is 9, VGATHERDPD is 35 uops,
+ throughput 12. Approx 9 uops do not depend on vector size and every load
+ is 7 uops. */
+ 18, 8, /* Gather load static, per_elt. */
+ 18, 10, /* Gather store static, per_elt. */
32, /* size of l1 cache. */
512, /* size of l2 cache. */
64, /* size of prefetch block. */
@@ -1310,12 +1480,26 @@ struct processor_costs znver1_cost = {
time). */
100, /* number of parallel prefetches. */
3, /* Branch cost. */
- COSTS_N_INSNS (6), /* cost of FADD and FSUB insns. */
- COSTS_N_INSNS (6), /* cost of FMUL instruction. */
- COSTS_N_INSNS (42), /* cost of FDIV instruction. */
- COSTS_N_INSNS (2), /* cost of FABS instruction. */
- COSTS_N_INSNS (2), /* cost of FCHS instruction. */
- COSTS_N_INSNS (52), /* cost of FSQRT instruction. */
+ COSTS_N_INSNS (5), /* cost of FADD and FSUB insns. */
+ COSTS_N_INSNS (5), /* cost of FMUL instruction. */
+ /* Latency of fdiv is 8-15. */
+ COSTS_N_INSNS (15), /* cost of FDIV instruction. */
+ COSTS_N_INSNS (1), /* cost of FABS instruction. */
+ COSTS_N_INSNS (1), /* cost of FCHS instruction. */
+ /* Latency of fsqrt is 4-10. */
+ COSTS_N_INSNS (10), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (3), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (3), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (4), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (5), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (5), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (10), /* cost of DIVSS instruction. */
+ /* 9-13 */
+ COSTS_N_INSNS (13), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (10), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (15), /* cost of SQRTSD instruction. */
/* Zen can execute 4 integer operations per cycle. FP operations take 3 cycles
and it can execute 2 integer additions and 2 multiplications thus
reassociation may make sense up to a width of 6. SPEC2k6 benchmarks suggest
@@ -1327,17 +1511,8 @@ struct processor_costs znver1_cost = {
4, 4, 3, 6, /* reassoc int, fp, vec_int, vec_fp. */
znver1_memcpy,
znver1_memset,
- 6, /* scalar_stmt_cost. */
- 4, /* scalar load_cost. */
- 4, /* scalar_store_cost. */
- 6, /* vec_stmt_cost. */
- 0, /* vec_to_scalar_cost. */
- 2, /* scalar_to_vec_cost. */
- 4, /* vec_align_load_cost. */
- 4, /* vec_unalign_load_cost. */
- 4, /* vec_store_cost. */
- 4, /* cond_taken_branch_cost. */
- 2, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (4), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (2), /* cond_not_taken_branch_cost. */
};
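
The two "Gather load static, per_elt" numbers above are meant to be combined linearly with the element count of the gathered vector; a rough sketch of that combination (function and parameter names are assumptions for illustration, not the exact ones used in i386.c) is:

/* Sketch only: total gather cost = fixed part + per-element part, in the
   same latency*2 units as the other move entries.  For znver1 above a
   4-element gather load would be 18 + 8 * 4 = 50.  */
static inline int
gather_cost (int gather_static, int gather_per_elt, int n_elements)
{
  return gather_static + gather_per_elt * n_elements;
}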
/* BTVER1 has optimized REP instruction for medium sized blocks, but for
@@ -1373,35 +1548,34 @@ const struct processor_costs btver1_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
9, /* MOVE_RATIO */
- 4, /* cost for loading QImode using movzbl */
- {3, 4, 3}, /* cost of loading integer registers
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 8, /* cost for loading QImode using movzbl */
+ {6, 8, 6}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {3, 4, 3}, /* cost of storing integer registers */
+ {6, 8, 6}, /* cost of storing integer registers */
4, /* cost of reg,reg fld/fst */
- {4, 4, 12}, /* cost of loading fp registers
+ {12, 12, 28}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {6, 6, 8}, /* cost of storing fp registers
+ {12, 12, 38}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
- 2, /* cost of moving MMX register */
- {3, 3}, /* cost of loading MMX registers
+ 4, /* cost of moving MMX register */
+ {10, 10}, /* cost of loading MMX registers
in SImode and DImode */
- {4, 4}, /* cost of storing MMX registers
+ {12, 12}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 4, 3}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 4, 5}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 3, /* MMX or SSE register to integer */
- /* On K8:
- MOVD reg64, xmmreg Double FSTORE 4
- MOVD reg32, xmmreg Double FSTORE 4
- On AMDFAM10:
- MOVD reg64, xmmreg Double FADD 3
- 1/1 1/1
- MOVD reg32, xmmreg Double FADD 3
- 1/1 1/1 */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {10, 10, 12, 24, 48}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 12, 24, 48}, /* cost of unaligned loads. */
+ {10, 10, 12, 24, 48}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 12, 24, 48}, /* cost of unaligned stores. */
+ 14, 14, /* SSE->integer and integer->SSE moves */
+ 10, 10, /* Gather load static, per_elt. */
+ 10, 10, /* Gather store static, per_elt. */
32, /* size of l1 cache. */
512, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1413,20 +1587,22 @@ const struct processor_costs btver1_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (35), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (3), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (2), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (4), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (5), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (5), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (13), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (17), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (14), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (48), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
btver1_memcpy,
btver1_memset,
- 4, /* scalar_stmt_cost. */
- 2, /* scalar load_cost. */
- 2, /* scalar_store_cost. */
- 6, /* vec_stmt_cost. */
- 0, /* vec_to_scalar_cost. */
- 2, /* scalar_to_vec_cost. */
- 2, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 2, /* vec_store_cost. */
- 2, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (2), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static stringop_algs btver2_memcpy[2] = {
@@ -1459,35 +1635,34 @@ const struct processor_costs btver2_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
9, /* MOVE_RATIO */
- 4, /* cost for loading QImode using movzbl */
- {3, 4, 3}, /* cost of loading integer registers
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 8, /* cost for loading QImode using movzbl */
+ {8, 8, 6}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {3, 4, 3}, /* cost of storing integer registers */
+ {8, 8, 6}, /* cost of storing integer registers */
4, /* cost of reg,reg fld/fst */
- {4, 4, 12}, /* cost of loading fp registers
+ {12, 12, 28}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {6, 6, 8}, /* cost of storing fp registers
+ {12, 12, 38}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
- 2, /* cost of moving MMX register */
- {3, 3}, /* cost of loading MMX registers
+ 4, /* cost of moving MMX register */
+ {10, 10}, /* cost of loading MMX registers
in SImode and DImode */
- {4, 4}, /* cost of storing MMX registers
+ {12, 12}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {4, 4, 3}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {4, 4, 5}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 3, /* MMX or SSE register to integer */
- /* On K8:
- MOVD reg64, xmmreg Double FSTORE 4
- MOVD reg32, xmmreg Double FSTORE 4
- On AMDFAM10:
- MOVD reg64, xmmreg Double FADD 3
- 1/1 1/1
- MOVD reg32, xmmreg Double FADD 3
- 1/1 1/1 */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {10, 10, 12, 24, 48}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 12, 24, 48}, /* cost of unaligned loads. */
+ {10, 10, 12, 24, 48}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 12, 24, 48}, /* cost of unaligned stores. */
+ 14, 14, /* SSE->integer and integer->SSE moves */
+ 10, 10, /* Gather load static, per_elt. */
+ 10, 10, /* Gather store static, per_elt. */
32, /* size of l1 cache. */
2048, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1499,20 +1674,22 @@ const struct processor_costs btver2_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (35), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (3), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (2), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (4), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (5), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (5), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (13), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (19), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (16), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (21), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
btver2_memcpy,
btver2_memset,
- 4, /* scalar_stmt_cost. */
- 2, /* scalar load_cost. */
- 2, /* scalar_store_cost. */
- 6, /* vec_stmt_cost. */
- 0, /* vec_to_scalar_cost. */
- 2, /* scalar_to_vec_cost. */
- 2, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 2, /* vec_store_cost. */
- 2, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (2), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static stringop_algs pentium4_memcpy[2] = {
@@ -1544,27 +1721,34 @@ struct processor_costs pentium4_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
16, /* "large" insn */
6, /* MOVE_RATIO */
- 2, /* cost for loading QImode using movzbl */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 5, /* cost for loading QImode using movzbl */
{4, 5, 4}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
{2, 3, 2}, /* cost of storing integer registers */
- 2, /* cost of reg,reg fld/fst */
- {2, 2, 6}, /* cost of loading fp registers
+ 12, /* cost of reg,reg fld/fst */
+ {14, 14, 14}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {4, 4, 6}, /* cost of storing fp registers
+ {14, 14, 14}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
- 2, /* cost of moving MMX register */
- {2, 2}, /* cost of loading MMX registers
+ 12, /* cost of moving MMX register */
+ {16, 16}, /* cost of loading MMX registers
in SImode and DImode */
- {2, 2}, /* cost of storing MMX registers
+ {16, 16}, /* cost of storing MMX registers
in SImode and DImode */
- 12, /* cost of moving SSE register */
- {12, 12, 12}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {2, 2, 8}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 10, /* MMX or SSE register to integer */
+ 12, 24, 48, /* cost of moving XMM,YMM,ZMM register */
+ {16, 16, 16, 32, 64}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {32, 32, 32, 64, 128}, /* cost of unaligned loads. */
+ {16, 16, 16, 32, 64}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {32, 32, 32, 64, 128}, /* cost of unaligned stores. */
+ 20, 12, /* SSE->integer and integer->SSE moves */
+ 16, 16, /* Gather load static, per_elt. */
+ 16, 16, /* Gather store static, per_elt. */
8, /* size of l1 cache. */
256, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1576,20 +1760,22 @@ struct processor_costs pentium4_cost = {
COSTS_N_INSNS (2), /* cost of FABS instruction. */
COSTS_N_INSNS (2), /* cost of FCHS instruction. */
COSTS_N_INSNS (43), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (2), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (4), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (6), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (6), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (23), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (38), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (23), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (38), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
pentium4_memcpy,
pentium4_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static stringop_algs nocona_memcpy[2] = {
@@ -1624,27 +1810,34 @@ struct processor_costs nocona_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
16, /* "large" insn */
17, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
4, /* cost for loading QImode using movzbl */
{4, 4, 4}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
{4, 4, 4}, /* cost of storing integer registers */
- 3, /* cost of reg,reg fld/fst */
- {12, 12, 12}, /* cost of loading fp registers
+ 12, /* cost of reg,reg fld/fst */
+ {14, 14, 14}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {4, 4, 4}, /* cost of storing fp registers
+ {14, 14, 14}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
- 6, /* cost of moving MMX register */
+ 14, /* cost of moving MMX register */
{12, 12}, /* cost of loading MMX registers
in SImode and DImode */
{12, 12}, /* cost of storing MMX registers
in SImode and DImode */
- 6, /* cost of moving SSE register */
- {12, 12, 12}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {12, 12, 12}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 8, /* MMX or SSE register to integer */
+ 6, 12, 24, /* cost of moving XMM,YMM,ZMM register */
+ {12, 12, 12, 24, 48}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {24, 24, 24, 48, 96}, /* cost of unaligned loads. */
+ {12, 12, 12, 24, 48}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {24, 24, 24, 48, 96}, /* cost of unaligned stores. */
+ 20, 12, /* SSE->integer and integer->SSE moves */
+ 12, 12, /* Gather load static, per_elt. */
+ 12, 12, /* Gather store static, per_elt. */
8, /* size of l1 cache. */
1024, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1656,20 +1849,22 @@ struct processor_costs nocona_cost = {
COSTS_N_INSNS (3), /* cost of FABS instruction. */
COSTS_N_INSNS (3), /* cost of FCHS instruction. */
COSTS_N_INSNS (44), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (2), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (5), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (7), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (7), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (7), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (7), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (32), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (40), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (32), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (41), /* cost of SQRTSD instruction. */
1, 1, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
nocona_memcpy,
nocona_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static stringop_algs atom_memcpy[2] = {
@@ -1702,27 +1897,34 @@ struct processor_costs atom_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
17, /* MOVE_RATIO */
- 4, /* cost for loading QImode using movzbl */
- {4, 4, 4}, /* cost of loading integer registers
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 6, /* cost for loading QImode using movzbl */
+ {6, 6, 6}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {4, 4, 4}, /* cost of storing integer registers */
+ {6, 6, 6}, /* cost of storing integer registers */
4, /* cost of reg,reg fld/fst */
- {12, 12, 12}, /* cost of loading fp registers
+ {6, 6, 18}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {6, 6, 8}, /* cost of storing fp registers
+ {14, 14, 24}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
2, /* cost of moving MMX register */
{8, 8}, /* cost of loading MMX registers
in SImode and DImode */
- {8, 8}, /* cost of storing MMX registers
+ {10, 10}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {8, 8, 8}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {8, 8, 8}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 5, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {8, 8, 8, 16, 32}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {16, 16, 16, 32, 64}, /* cost of unaligned loads. */
+ {8, 8, 8, 16, 32}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {16, 16, 16, 32, 64}, /* cost of unaligned stores. */
+ 8, 6, /* SSE->integer and integer->SSE moves */
+ 8, 8, /* Gather load static, per_elt. */
+ 8, 8, /* Gather store static, per_elt. */
32, /* size of l1 cache. */
256, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1734,20 +1936,22 @@ struct processor_costs atom_cost = {
COSTS_N_INSNS (8), /* cost of FABS instruction. */
COSTS_N_INSNS (8), /* cost of FCHS instruction. */
COSTS_N_INSNS (40), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (5), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (4), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (5), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (31), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (60), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (31), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (63), /* cost of SQRTSD instruction. */
2, 2, 2, 2, /* reassoc int, fp, vec_int, vec_fp. */
atom_memcpy,
atom_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static stringop_algs slm_memcpy[2] = {
@@ -1780,27 +1984,34 @@ struct processor_costs slm_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
17, /* MOVE_RATIO */
- 4, /* cost for loading QImode using movzbl */
- {4, 4, 4}, /* cost of loading integer registers
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 8, /* cost for loading QImode using movzbl */
+ {8, 8, 8}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {4, 4, 4}, /* cost of storing integer registers */
- 4, /* cost of reg,reg fld/fst */
- {12, 12, 12}, /* cost of loading fp registers
+ {6, 6, 6}, /* cost of storing integer registers */
+ 2, /* cost of reg,reg fld/fst */
+ {8, 8, 18}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {6, 6, 8}, /* cost of storing fp registers
+ {6, 6, 18}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
2, /* cost of moving MMX register */
{8, 8}, /* cost of loading MMX registers
in SImode and DImode */
- {8, 8}, /* cost of storing MMX registers
+ {6, 6}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {8, 8, 8}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {8, 8, 8}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 5, /* MMX or SSE register to integer */
+ 2, 4, 8, /* cost of moving XMM,YMM,ZMM register */
+ {8, 8, 8, 16, 32}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {16, 16, 16, 32, 64}, /* cost of unaligned loads. */
+ {8, 8, 8, 16, 32}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {16, 16, 16, 32, 64}, /* cost of unaligned stores. */
+ 8, 6, /* SSE->integer and integer->SSE moves */
+ 8, 8, /* Gather load static, per_elt. */
+ 8, 8, /* Gather store static, per_elt. */
32, /* size of l1 cache. */
256, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1812,20 +2023,22 @@ struct processor_costs slm_cost = {
COSTS_N_INSNS (8), /* cost of FABS instruction. */
COSTS_N_INSNS (8), /* cost of FCHS instruction. */
COSTS_N_INSNS (40), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (3), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (4), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (5), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (39), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (69), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (20), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (35), /* cost of SQRTSD instruction. */
1, 2, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
slm_memcpy,
slm_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 4, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
static stringop_algs intel_memcpy[2] = {
@@ -1858,27 +2071,34 @@ struct processor_costs intel_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
17, /* MOVE_RATIO */
- 4, /* cost for loading QImode using movzbl */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 6, /* cost for loading QImode using movzbl */
{4, 4, 4}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {4, 4, 4}, /* cost of storing integer registers */
- 4, /* cost of reg,reg fld/fst */
- {12, 12, 12}, /* cost of loading fp registers
+ {6, 6, 6}, /* cost of storing integer registers */
+ 2, /* cost of reg,reg fld/fst */
+ {6, 6, 8}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {6, 6, 8}, /* cost of storing fp registers
+ {6, 6, 10}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
2, /* cost of moving MMX register */
- {8, 8}, /* cost of loading MMX registers
+ {6, 6}, /* cost of loading MMX registers
in SImode and DImode */
- {8, 8}, /* cost of storing MMX registers
+ {6, 6}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {8, 8, 8}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {8, 8, 8}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 5, /* MMX or SSE register to integer */
+ 2, 2, 2, /* cost of moving XMM,YMM,ZMM register */
+ {6, 6, 6, 6, 6}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 10, 10, 10}, /* cost of unaligned loads. */
+ {6, 6, 6, 6, 6}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 10, 10, 10}, /* cost of unaligned stores. */
+ 4, 4, /* SSE->integer and integer->SSE moves */
+ 6, 6, /* Gather load static, per_elt. */
+ 6, 6, /* Gather store static, per_elt. */
32, /* size of l1 cache. */
256, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1890,20 +2110,22 @@ struct processor_costs intel_cost = {
COSTS_N_INSNS (8), /* cost of FABS instruction. */
COSTS_N_INSNS (8), /* cost of FCHS instruction. */
COSTS_N_INSNS (40), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (8), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (8), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (8), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (8), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (6), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (20), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (20), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (40), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (40), /* cost of SQRTSD instruction. */
1, 4, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
intel_memcpy,
intel_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 4, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
/* Generic should produce code tuned for Core-i7 (and newer chips)
@@ -1922,8 +2144,7 @@ static stringop_algs generic_memset[2] = {
static const
struct processor_costs generic_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
- /* On all chips taken into consideration lea is 2 cycles and more. With
- this cost however our current implementation of synth_mult results in
+ /* Setting cost to 2 makes our current implementation of synth_mult result in
use of unnecessary temporary registers causing regression on several
SPECfp benchmarks. */
COSTS_N_INSNS (1) + 1, /* cost of a lea instruction */
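
For scale: COSTS_N_INSNS (1) expands to 4 cost units, so COSTS_N_INSNS (1) + 1 is 5, pricing lea just above a single add (4) but well below two instructions (8), consistent with the note above about avoiding the synth_mult temporary-register regression.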
@@ -1944,27 +2165,34 @@ struct processor_costs generic_cost = {
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
17, /* MOVE_RATIO */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
4, /* cost for loading QImode using movzbl */
{4, 4, 4}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {4, 4, 4}, /* cost of storing integer registers */
+ {6, 6, 6}, /* cost of storing integer registers */
4, /* cost of reg,reg fld/fst */
- {12, 12, 12}, /* cost of loading fp registers
+ {6, 6, 12}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {6, 6, 8}, /* cost of storing fp registers
+ {6, 6, 12}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
2, /* cost of moving MMX register */
- {8, 8}, /* cost of loading MMX registers
+ {6, 6}, /* cost of loading MMX registers
in SImode and DImode */
- {8, 8}, /* cost of storing MMX registers
+ {6, 6}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {8, 8, 8}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {8, 8, 8}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 5, /* MMX or SSE register to integer */
+ 2, 3, 4, /* cost of moving XMM,YMM,ZMM register */
+ {6, 6, 6, 10, 15}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 10, 15, 20}, /* cost of unaligned loads. */
+ {6, 6, 6, 10, 15}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {10, 10, 10, 15, 20}, /* cost of unaligned stores. */
+ 20, 20, /* SSE->integer and integer->SSE moves */
+ 6, 6, /* Gather load static, per_elt. */
+ 6, 6, /* Gather store static, per_elt. */
32, /* size of l1 cache. */
512, /* size of l2 cache. */
64, /* size of prefetch block */
@@ -1972,26 +2200,28 @@ struct processor_costs generic_cost = {
/* Benchmarks show large regressions on the K8 sixtrack benchmark when this
value is increased to the perhaps more appropriate value of 5. */
3, /* Branch cost */
- COSTS_N_INSNS (8), /* cost of FADD and FSUB insns. */
- COSTS_N_INSNS (8), /* cost of FMUL instruction. */
+ COSTS_N_INSNS (3), /* cost of FADD and FSUB insns. */
+ COSTS_N_INSNS (3), /* cost of FMUL instruction. */
COSTS_N_INSNS (20), /* cost of FDIV instruction. */
- COSTS_N_INSNS (8), /* cost of FABS instruction. */
- COSTS_N_INSNS (8), /* cost of FCHS instruction. */
+ COSTS_N_INSNS (1), /* cost of FABS instruction. */
+ COSTS_N_INSNS (1), /* cost of FCHS instruction. */
COSTS_N_INSNS (40), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (3), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (4), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (5), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (5), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (5), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (18), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (32), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (30), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (58), /* cost of SQRTSD instruction. */
1, 2, 1, 1, /* reassoc int, fp, vec_int, vec_fp. */
generic_memcpy,
generic_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
/* core_cost should produce code tuned for the Core family of CPUs. */
@@ -2021,63 +2251,78 @@ struct processor_costs core_cost = {
COSTS_N_INSNS (4), /* HI */
COSTS_N_INSNS (3), /* SI */
COSTS_N_INSNS (4), /* DI */
- COSTS_N_INSNS (2)}, /* other */
+ COSTS_N_INSNS (4)}, /* other */
0, /* cost of multiply per each bit set */
- {COSTS_N_INSNS (18), /* cost of a divide/mod for QI */
- COSTS_N_INSNS (26), /* HI */
- COSTS_N_INSNS (42), /* SI */
- COSTS_N_INSNS (74), /* DI */
- COSTS_N_INSNS (74)}, /* other */
+ {COSTS_N_INSNS (8), /* cost of a divide/mod for QI */
+ COSTS_N_INSNS (8), /* HI */
+ /* 8-11 */
+ COSTS_N_INSNS (11), /* SI */
+ /* 24-81 */
+ COSTS_N_INSNS (81), /* DI */
+ COSTS_N_INSNS (81)}, /* other */
COSTS_N_INSNS (1), /* cost of movsx */
COSTS_N_INSNS (1), /* cost of movzx */
8, /* "large" insn */
17, /* MOVE_RATIO */
- 4, /* cost for loading QImode using movzbl */
+
+ /* All move costs are relative to integer->integer move times 2 and thus
+ they are latency*2. */
+ 6, /* cost for loading QImode using movzbl */
{4, 4, 4}, /* cost of loading integer registers
in QImode, HImode and SImode.
Relative to reg-reg move (2). */
- {4, 4, 4}, /* cost of storing integer registers */
- 4, /* cost of reg,reg fld/fst */
- {12, 12, 12}, /* cost of loading fp registers
+ {6, 6, 6}, /* cost of storing integer registers */
+ 2, /* cost of reg,reg fld/fst */
+ {6, 6, 8}, /* cost of loading fp registers
in SFmode, DFmode and XFmode */
- {6, 6, 8}, /* cost of storing fp registers
+ {6, 6, 10}, /* cost of storing fp registers
in SFmode, DFmode and XFmode */
2, /* cost of moving MMX register */
- {8, 8}, /* cost of loading MMX registers
+ {6, 6}, /* cost of loading MMX registers
in SImode and DImode */
- {8, 8}, /* cost of storing MMX registers
+ {6, 6}, /* cost of storing MMX registers
in SImode and DImode */
- 2, /* cost of moving SSE register */
- {8, 8, 8}, /* cost of loading SSE registers
- in SImode, DImode and TImode */
- {8, 8, 8}, /* cost of storing SSE registers
- in SImode, DImode and TImode */
- 5, /* MMX or SSE register to integer */
+ 2, 2, 4, /* cost of moving XMM,YMM,ZMM register */
+ {6, 6, 6, 6, 12}, /* cost of loading SSE registers
+ in 32,64,128,256 and 512-bit */
+ {6, 6, 6, 6, 12}, /* cost of unaligned loads. */
+ {6, 6, 6, 6, 12}, /* cost of storing SSE registers
+ in 32,64,128,256 and 512-bit */
+ {6, 6, 6, 6, 12}, /* cost of unaligned stores. */
+ 2, 2, /* SSE->integer and integer->SSE moves */
+ /* VGATHERDPD is 7 uops, rec. throughput 5, while VGATHERDPD is 9 uops,
+ rec. throughput 6.
+ So 5 uops statically and one uop per load. */
+ 10, 6, /* Gather load static, per_elt. */
+ 10, 6, /* Gather store static, per_elt. */
64, /* size of l1 cache. */
512, /* size of l2 cache. */
64, /* size of prefetch block */
6, /* number of parallel prefetches */
/* FIXME perhaps more appropriate value is 5. */
3, /* Branch cost */
- COSTS_N_INSNS (8), /* cost of FADD and FSUB insns. */
- COSTS_N_INSNS (8), /* cost of FMUL instruction. */
- COSTS_N_INSNS (20), /* cost of FDIV instruction. */
- COSTS_N_INSNS (8), /* cost of FABS instruction. */
- COSTS_N_INSNS (8), /* cost of FCHS instruction. */
- COSTS_N_INSNS (40), /* cost of FSQRT instruction. */
+ COSTS_N_INSNS (3), /* cost of FADD and FSUB insns. */
+ COSTS_N_INSNS (5), /* cost of FMUL instruction. */
+ /* 10-24 */
+ COSTS_N_INSNS (24), /* cost of FDIV instruction. */
+ COSTS_N_INSNS (1), /* cost of FABS instruction. */
+ COSTS_N_INSNS (1), /* cost of FCHS instruction. */
+ COSTS_N_INSNS (23), /* cost of FSQRT instruction. */
+
+ COSTS_N_INSNS (1), /* cost of cheap SSE instruction. */
+ COSTS_N_INSNS (3), /* cost of ADDSS/SD SUBSS/SD insns. */
+ COSTS_N_INSNS (4), /* cost of MULSS instruction. */
+ COSTS_N_INSNS (5), /* cost of MULSD instruction. */
+ COSTS_N_INSNS (5), /* cost of FMA SS instruction. */
+ COSTS_N_INSNS (5), /* cost of FMA SD instruction. */
+ COSTS_N_INSNS (18), /* cost of DIVSS instruction. */
+ COSTS_N_INSNS (32), /* cost of DIVSD instruction. */
+ COSTS_N_INSNS (30), /* cost of SQRTSS instruction. */
+ COSTS_N_INSNS (58), /* cost of SQRTSD instruction. */
1, 4, 2, 2, /* reassoc int, fp, vec_int, vec_fp. */
core_memcpy,
core_memset,
- 1, /* scalar_stmt_cost. */
- 1, /* scalar load_cost. */
- 1, /* scalar_store_cost. */
- 1, /* vec_stmt_cost. */
- 1, /* vec_to_scalar_cost. */
- 1, /* scalar_to_vec_cost. */
- 1, /* vec_align_load_cost. */
- 2, /* vec_unalign_load_cost. */
- 1, /* vec_store_cost. */
- 3, /* cond_taken_branch_cost. */
- 1, /* cond_not_taken_branch_cost. */
+ COSTS_N_INSNS (3), /* cond_taken_branch_cost. */
+ COSTS_N_INSNS (1), /* cond_not_taken_branch_cost. */
};
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 9d01761eff9..99282c88341 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -48,7 +48,8 @@ DEF_TUNE (X86_TUNE_SCHEDULE, "schedule",
over partial stores. For example prefer MOVZBL or MOVQ to load an 8bit
value over movb. */
DEF_TUNE (X86_TUNE_PARTIAL_REG_DEPENDENCY, "partial_reg_dependency",
- m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_SILVERMONT | m_INTEL
+ m_P4_NOCONA | m_CORE2 | m_NEHALEM | m_SANDYBRIDGE
+ | m_BONNELL | m_SILVERMONT | m_INTEL
| m_KNL | m_KNM | m_AMD_MULTIPLE | m_GENERIC)
/* X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY: This knob promotes all store
@@ -84,8 +85,9 @@ DEF_TUNE (X86_TUNE_PARTIAL_FLAG_REG_STALL, "partial_flag_reg_stall",
/* X86_TUNE_MOVX: Enable to zero extend integer registers to avoid
partial dependencies. */
DEF_TUNE (X86_TUNE_MOVX, "movx",
- m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_SILVERMONT
- | m_KNL | m_KNM | m_INTEL | m_GEODE | m_AMD_MULTIPLE | m_GENERIC)
+ m_PPRO | m_P4_NOCONA | m_CORE2 | m_NEHALEM | m_SANDYBRIDGE
+ | m_BONNELL | m_SILVERMONT | m_KNL | m_KNM | m_INTEL
+ | m_GEODE | m_AMD_MULTIPLE | m_GENERIC)
/* X86_TUNE_MEMORY_MISMATCH_STALL: Avoid partial stores that are followed by
full sized loads. */
@@ -218,10 +220,15 @@ DEF_TUNE (X86_TUNE_LCP_STALL, "lcp_stall", m_CORE_ALL | m_GENERIC)
as "add mem, reg". */
DEF_TUNE (X86_TUNE_READ_MODIFY, "read_modify", ~(m_PENT | m_LAKEMONT | m_PPRO))
-/* X86_TUNE_USE_INCDEC: Enable use of inc/dec instructions. */
+/* X86_TUNE_USE_INCDEC: Enable use of inc/dec instructions.
+
+   Core2 and Nehalem have a 7-cycle stall for partial flag register stalls.
+   Sandy Bridge and Ivy Bridge generate an extra uop.  On Haswell this extra
+   uop is output only when the value actually needs to be merged, which is
+   not the case for GCC-generated code.  */
DEF_TUNE (X86_TUNE_USE_INCDEC, "use_incdec",
- ~(m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_SILVERMONT | m_INTEL
- | m_KNL | m_KNM | m_GENERIC))
+ ~(m_P4_NOCONA | m_CORE2 | m_NEHALEM | m_SANDYBRIDGE
+ | m_BONNELL | m_SILVERMONT | m_INTEL | m_KNL | m_KNM | m_GENERIC))
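A rough illustration of what this tuning flag decides, using a hypothetical C snippet that is not part of the patch: a simple increment may be emitted either as inc or as add $1, and the flag keeps inc disabled only on the cores where the partial-flag merge is costly.

/* Hypothetical example: depending on X86_TUNE_USE_INCDEC for the selected
   -mtune target, the increment below may be emitted as "incl %eax" or as
   "addl $1, %eax".  */
unsigned int
bump (unsigned int x)
{
  return x + 1;
}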
/* X86_TUNE_INTEGER_DFMODE_MOVES: Enable if integer moves are preferred
for DFmode copies */
@@ -364,7 +371,7 @@ DEF_TUNE (X86_TUNE_SSE_LOAD0_BY_PXOR, "sse_load0_by_pxor",
to SSE registers. If disabled, the moves will be done by storing
the value to memory and reloading. */
DEF_TUNE (X86_TUNE_INTER_UNIT_MOVES_TO_VEC, "inter_unit_moves_to_vec",
- ~(m_AMD_MULTIPLE | m_GENERIC))
+ ~(m_ATHLON_K8 | m_AMDFAM10 | m_BDVER | m_BTVER | m_GENERIC))
/* X86_TUNE_INTER_UNIT_MOVES_FROM_VEC: Enable moves from SSE
to integer registers. If disabled, the moves will be done by storing
diff --git a/gcc/config/ia64/ia64.h b/gcc/config/ia64/ia64.h
index e7073d1cf20..eceab5f23b6 100644
--- a/gcc/config/ia64/ia64.h
+++ b/gcc/config/ia64/ia64.h
@@ -1470,7 +1470,7 @@ do { \
/* Likewise. */
-/* Macros for SDB and Dwarf Output. */
+/* Macros for Dwarf Output. */
/* Define this macro if GCC should produce dwarf version 2 format debugging
output in response to the `-g' option. */
diff --git a/gcc/config/m68k/m68kelf.h b/gcc/config/m68k/m68kelf.h
index fb1a0a4d917..159223f64c7 100644
--- a/gcc/config/m68k/m68kelf.h
+++ b/gcc/config/m68k/m68kelf.h
@@ -97,7 +97,7 @@ do { \
/* Define how the m68k registers should be numbered for Dwarf output.
The numbering provided here should be compatible with the native
- SVR4 SDB debugger in the m68k/SVR4 reference port, where d0-d7
+ SVR4 debugger in the m68k/SVR4 reference port, where d0-d7
are 0-7, a0-a8 are 8-15, and fp0-fp7 are 16-23. */
#undef DBX_REGISTER_NUMBER
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 550d283158e..f5c28bf70e3 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1314,9 +1314,7 @@ struct mips_cpu_info {
%{g} %{g0} %{g1} %{g2} %{g3} \
%{ggdb:-g} %{ggdb0:-g0} %{ggdb1:-g1} %{ggdb2:-g2} %{ggdb3:-g3} \
%{gstabs:-g} %{gstabs0:-g0} %{gstabs1:-g1} %{gstabs2:-g2} %{gstabs3:-g3} \
-%{gstabs+:-g} %{gstabs+0:-g0} %{gstabs+1:-g1} %{gstabs+2:-g2} %{gstabs+3:-g3} \
-%{gcoff:-g} %{gcoff0:-g0} %{gcoff1:-g1} %{gcoff2:-g2} %{gcoff3:-g3} \
-%{gcoff*:-mdebug} %{!gcoff*:-no-mdebug}"
+%{gstabs+:-g} %{gstabs+0:-g0} %{gstabs+1:-g1} %{gstabs+2:-g2} %{gstabs+3:-g3}"
#endif
/* FP_ASM_SPEC represents the floating-point options that must be passed
diff --git a/gcc/config/mmix/mmix.h b/gcc/config/mmix/mmix.h
index 5dafe2dbf98..2ee3592f3c8 100644
--- a/gcc/config/mmix/mmix.h
+++ b/gcc/config/mmix/mmix.h
@@ -761,7 +761,7 @@ typedef struct { int regs; int lib; } CUMULATIVE_ARGS;
/* (empty) */
-/* Node: SDB and DWARF */
+/* Node: DWARF */
#define DWARF2_DEBUGGING_INFO 1
#define DWARF2_ASM_LINE_DEBUG_INFO 1
diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index 8b025c755bf..04fe58ed2fc 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -753,6 +753,10 @@ hwmult_name (unsigned int val)
static void
msp430_option_override (void)
{
+ /* The MSP430 architecture can safely dereference a NULL pointer. In fact,
+ there are memory mapped registers there. */
+ flag_delete_null_pointer_checks = 0;
+
init_machine_status = msp430_init_machine_status;
if (target_cpu)
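A minimal sketch of the kind of code this change keeps working, with a hypothetical register address: since MSP430 maps registers at address 0, a dereference of a null pointer must not be treated as unreachable.

/* Hypothetical illustration: reading a memory-mapped register at address 0.
   With flag_delete_null_pointer_checks cleared, GCC does not assume this
   dereference traps or is unreachable.  */
volatile unsigned char *const reg0 = (volatile unsigned char *) 0;

unsigned char
read_reg0 (void)
{
  return *reg0;
}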
diff --git a/gcc/config/nds32/nds32.c b/gcc/config/nds32/nds32.c
index add64ee4a80..6657b354a4b 100644
--- a/gcc/config/nds32/nds32.c
+++ b/gcc/config/nds32/nds32.c
@@ -3765,7 +3765,7 @@ nds32_target_alignment (rtx_insn *label)
/* -- File Names in DBX Format. */
-/* -- Macros for SDB and DWARF Output. */
+/* -- Macros for DWARF Output. */
/* -- Macros for VMS Debug Format. */
diff --git a/gcc/config/nios2/constraints.md b/gcc/config/nios2/constraints.md
index c6c539265ac..51f71cf742e 100644
--- a/gcc/config/nios2/constraints.md
+++ b/gcc/config/nios2/constraints.md
@@ -95,8 +95,8 @@
(match_test "TARGET_ARCH_R2 && ANDCLEAR_INT (ival)")))
(define_constraint "S"
- "An immediate stored in small data, accessible by GP."
- (match_test "gprel_constant_p (op)"))
+ "An immediate stored in small data, accessible by GP, or by offset from r0."
+ (match_test "gprel_constant_p (op) || r0rel_constant_p (op)"))
(define_constraint "T"
"A constant unspec offset representing a relocation."
diff --git a/gcc/config/nios2/nios2-protos.h b/gcc/config/nios2/nios2-protos.h
index 4478334970c..84d450bfe94 100644
--- a/gcc/config/nios2/nios2-protos.h
+++ b/gcc/config/nios2/nios2-protos.h
@@ -30,6 +30,11 @@ extern bool nios2_expand_return (void);
extern void nios2_function_profiler (FILE *, int);
#ifdef RTX_CODE
+extern bool nios2_large_constant_p (rtx);
+extern bool nios2_symbolic_memory_operand_p (rtx);
+
+extern rtx nios2_split_large_constant (rtx, rtx);
+extern rtx nios2_split_symbolic_memory_operand (rtx);
extern bool nios2_emit_move_sequence (rtx *, machine_mode);
extern void nios2_emit_expensive_div (rtx *, machine_mode);
extern void nios2_adjust_call_address (rtx *, rtx);
@@ -47,6 +52,7 @@ extern const char * nios2_add_insn_asm (rtx_insn *, rtx *);
extern bool nios2_legitimate_pic_operand_p (rtx);
extern bool gprel_constant_p (rtx);
+extern bool r0rel_constant_p (rtx);
extern bool nios2_regno_ok_for_base_p (int, bool);
extern bool nios2_unspec_reloc_p (rtx);
diff --git a/gcc/config/nios2/nios2.c b/gcc/config/nios2/nios2.c
index 2a23886da48..f0bc668edd1 100644
--- a/gcc/config/nios2/nios2.c
+++ b/gcc/config/nios2/nios2.c
@@ -50,11 +50,14 @@
#include "langhooks.h"
#include "stor-layout.h"
#include "builtins.h"
+#include "tree-pass.h"
+#include "xregex.h"
/* This file should be included last. */
#include "target-def.h"
/* Forward function declarations. */
+static bool nios2_symbolic_constant_p (rtx);
static bool prologue_saved_reg_p (unsigned);
static void nios2_load_pic_register (void);
static void nios2_register_custom_code (unsigned int, enum nios2_ccs_code, int);
@@ -62,6 +65,7 @@ static const char *nios2_unspec_reloc_name (int);
static void nios2_register_builtin_fndecl (unsigned, tree);
static rtx nios2_ldst_parallel (bool, bool, bool, rtx, int,
unsigned HOST_WIDE_INT, bool);
+static int nios2_address_cost (rtx, machine_mode, addr_space_t, bool);
/* Threshold for data being put into the small data/bss area, instead
of the normal data area (references to the small data/bss area take
@@ -102,6 +106,10 @@ static int custom_code_index[256];
/* Set to true if any conflicts (re-use of a code between 0-255) are found. */
static bool custom_code_conflict = false;
+/* State for command-line options. */
+regex_t nios2_gprel_sec_regex;
+regex_t nios2_r0rel_sec_regex;
+
/* Definition of builtin function types for nios2. */
@@ -1108,7 +1116,9 @@ nios2_initial_elimination_offset (int from, int to)
switch (from)
{
case FRAME_POINTER_REGNUM:
- offset = cfun->machine->args_size;
+ /* This is the high end of the local variable storage, not the
+ hard frame pointer. */
+ offset = cfun->machine->args_size + cfun->machine->var_size;
break;
case ARG_POINTER_REGNUM:
@@ -1370,6 +1380,31 @@ nios2_option_override (void)
nios2_gpopt_option = gpopt_local;
}
+ /* GP-relative and r0-relative addressing don't make sense for PIC. */
+ if (flag_pic)
+ {
+ if (nios2_gpopt_option != gpopt_none)
+ error ("-mgpopt not supported with PIC.");
+ if (nios2_gprel_sec)
+ error ("-mgprel-sec= not supported with PIC.");
+ if (nios2_r0rel_sec)
+ error ("-mr0rel-sec= not supported with PIC.");
+ }
+
+  /* Process -mgprel-sec= and -mr0rel-sec=.  */
+ if (nios2_gprel_sec)
+ {
+ if (regcomp (&nios2_gprel_sec_regex, nios2_gprel_sec,
+ REG_EXTENDED | REG_NOSUB))
+ error ("-mgprel-sec= argument is not a valid regular expression.");
+ }
+ if (nios2_r0rel_sec)
+ {
+ if (regcomp (&nios2_r0rel_sec_regex, nios2_r0rel_sec,
+ REG_EXTENDED | REG_NOSUB))
+ error ("-mr0rel-sec= argument is not a valid regular expression.");
+ }
+
/* If we don't have mul, we don't have mulx either! */
if (!TARGET_HAS_MUL && TARGET_HAS_MULX)
target_flags &= ~MASK_HAS_MULX;
@@ -1430,29 +1465,25 @@ nios2_simple_const_p (const_rtx cst)
cost has been computed, and false if subexpressions should be
scanned. In either case, *TOTAL contains the cost result. */
static bool
-nios2_rtx_costs (rtx x, machine_mode mode ATTRIBUTE_UNUSED,
- int outer_code ATTRIBUTE_UNUSED,
- int opno ATTRIBUTE_UNUSED,
- int *total, bool speed ATTRIBUTE_UNUSED)
+nios2_rtx_costs (rtx x, machine_mode mode,
+ int outer_code,
+ int opno,
+ int *total, bool speed)
{
int code = GET_CODE (x);
switch (code)
{
case CONST_INT:
- if (INTVAL (x) == 0)
+ if (INTVAL (x) == 0 || nios2_simple_const_p (x))
{
*total = COSTS_N_INSNS (0);
return true;
}
- else if (nios2_simple_const_p (x))
- {
- *total = COSTS_N_INSNS (2);
- return true;
- }
else
{
- *total = COSTS_N_INSNS (4);
+ /* High + lo_sum. */
+ *total = COSTS_N_INSNS (1);
return true;
}
@@ -1460,10 +1491,30 @@ nios2_rtx_costs (rtx x, machine_mode mode ATTRIBUTE_UNUSED,
case SYMBOL_REF:
case CONST:
case CONST_DOUBLE:
- {
- *total = COSTS_N_INSNS (4);
- return true;
- }
+ if (gprel_constant_p (x) || r0rel_constant_p (x))
+ {
+ *total = COSTS_N_INSNS (1);
+ return true;
+ }
+ else
+ {
+ /* High + lo_sum. */
+ *total = COSTS_N_INSNS (1);
+ return true;
+ }
+
+ case HIGH:
+ {
+ /* This is essentially a constant. */
+ *total = COSTS_N_INSNS (0);
+ return true;
+ }
+
+ case LO_SUM:
+ {
+ *total = COSTS_N_INSNS (0);
+ return true;
+ }
case AND:
{
@@ -1477,29 +1528,83 @@ nios2_rtx_costs (rtx x, machine_mode mode ATTRIBUTE_UNUSED,
return false;
}
+ /* For insns that have an execution latency (3 cycles), don't
+ penalize by the full amount since we can often schedule
+ to avoid it. */
case MULT:
{
- *total = COSTS_N_INSNS (1);
+ if (!TARGET_HAS_MUL)
+ *total = COSTS_N_INSNS (5); /* Guess? */
+ else if (speed)
+ *total = COSTS_N_INSNS (2); /* Latency adjustment. */
+ else
+ *total = COSTS_N_INSNS (1);
return false;
}
- case SIGN_EXTEND:
+
+ case DIV:
{
- *total = COSTS_N_INSNS (3);
+ if (!TARGET_HAS_DIV)
+ *total = COSTS_N_INSNS (5); /* Guess? */
+ else if (speed)
+ *total = COSTS_N_INSNS (2); /* Latency adjustment. */
+ else
+ *total = COSTS_N_INSNS (1);
return false;
}
- case ZERO_EXTEND:
+
+ case ASHIFT:
+ case ASHIFTRT:
+ case LSHIFTRT:
+ case ROTATE:
{
- *total = COSTS_N_INSNS (1);
+ if (!speed)
+ *total = COSTS_N_INSNS (1);
+ else
+ *total = COSTS_N_INSNS (2); /* Latency adjustment. */
return false;
}
+
+ case ZERO_EXTRACT:
+ if (TARGET_HAS_BMX)
+ {
+ *total = COSTS_N_INSNS (1);
+ return true;
+ }
+ return false;
- case ZERO_EXTRACT:
- if (TARGET_HAS_BMX)
+ case SIGN_EXTEND:
+ {
+ if (MEM_P (XEXP (x, 0)))
+ *total = COSTS_N_INSNS (1);
+ else
+ *total = COSTS_N_INSNS (3);
+ return false;
+ }
+
+ case MEM:
{
- *total = COSTS_N_INSNS (1);
- return true;
+ rtx addr = XEXP (x, 0);
+
+ /* Account for cost of different addressing modes. */
+ *total = nios2_address_cost (addr, mode, ADDR_SPACE_GENERIC, speed);
+
+ if (outer_code == SET && opno == 0)
+ /* Stores execute in 1 cycle accounted for by
+ the outer SET. */
+ ;
+ else if (outer_code == SET || outer_code == SIGN_EXTEND
+ || outer_code == ZERO_EXTEND)
+ /* Latency adjustment. */
+ {
+ if (speed)
+ *total += COSTS_N_INSNS (1);
+ }
+ else
+ /* This is going to have to be split into a load. */
+ *total += COSTS_N_INSNS (speed ? 2 : 1);
+ return true;
}
- return false;
default:
return false;
@@ -1904,7 +2009,53 @@ nios2_validate_compare (machine_mode mode, rtx *cmp, rtx *op1, rtx *op2)
}
-/* Addressing Modes. */
+/* Addressing modes and constants. */
+
+/* Symbolic constants are split into high/lo_sum pairs during the
+ split1 pass. After that, they are not considered legitimate addresses.
+   This function returns true if we are in a pre-split context where
+   these constants are allowed.  */
+static bool
+nios2_symbolic_constant_allowed (void)
+{
+ /* The reload_completed check is for the benefit of
+ nios2_asm_output_mi_thunk and perhaps other places that try to
+ emulate a post-reload pass. */
+ return !(cfun->curr_properties & PROP_rtl_split_insns) && !reload_completed;
+}
+
+/* Return true if X is a constant expression with a reference to an
+ "ordinary" symbol; not GOT-relative, not GP-relative, not TLS. */
+static bool
+nios2_symbolic_constant_p (rtx x)
+{
+ rtx base, offset;
+
+ if (flag_pic)
+ return false;
+ if (GET_CODE (x) == LABEL_REF)
+ return true;
+ else if (CONSTANT_P (x))
+ {
+ split_const (x, &base, &offset);
+ return (SYMBOL_REF_P (base)
+ && !SYMBOL_REF_TLS_MODEL (base)
+ && !gprel_constant_p (base)
+ && !r0rel_constant_p (base)
+ && SMALL_INT (INTVAL (offset)));
+ }
+ return false;
+}
+
+/* Return true if X is an expression of the form
+ (PLUS reg symbolic_constant). */
+static bool
+nios2_plus_symbolic_constant_p (rtx x)
+{
+ return (GET_CODE (x) == PLUS
+ && REG_P (XEXP (x, 0))
+ && nios2_symbolic_constant_p (XEXP (x, 1)));
+}
/* Implement TARGET_LEGITIMATE_CONSTANT_P. */
static bool
@@ -1973,6 +2124,8 @@ nios2_valid_addr_expr_p (rtx base, rtx offset, bool strict_p)
&& nios2_regno_ok_for_base_p (REGNO (base), strict_p)
&& (offset == NULL_RTX
|| nios2_valid_addr_offset_p (offset)
+ || (nios2_symbolic_constant_allowed ()
+ && nios2_symbolic_constant_p (offset))
|| nios2_unspec_reloc_p (offset)));
}
@@ -1990,11 +2143,16 @@ nios2_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
/* Else, fall through. */
case CONST:
- if (gprel_constant_p (operand))
+ if (gprel_constant_p (operand) || r0rel_constant_p (operand))
return true;
/* Else, fall through. */
case LABEL_REF:
+ if (nios2_symbolic_constant_allowed ()
+ && nios2_symbolic_constant_p (operand))
+ return true;
+
+ /* Else, fall through. */
case CONST_INT:
case CONST_DOUBLE:
return false;
@@ -2009,9 +2167,28 @@ nios2_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
rtx op0 = XEXP (operand, 0);
rtx op1 = XEXP (operand, 1);
- return (nios2_valid_addr_expr_p (op0, op1, strict_p)
- || nios2_valid_addr_expr_p (op1, op0, strict_p));
+ if (nios2_valid_addr_expr_p (op0, op1, strict_p)
+ || nios2_valid_addr_expr_p (op1, op0, strict_p))
+ return true;
}
+ break;
+
+ /* %lo(constant)(reg)
+ This requires a 16-bit relocation and isn't valid with R2
+ io-variant load/stores. */
+ case LO_SUM:
+ if (TARGET_ARCH_R2
+ && (TARGET_BYPASS_CACHE || TARGET_BYPASS_CACHE_VOLATILE))
+ return false;
+ else
+ {
+ rtx op0 = XEXP (operand, 0);
+ rtx op1 = XEXP (operand, 1);
+
+ return (REG_P (op0)
+ && nios2_regno_ok_for_base_p (REGNO (op0), strict_p)
+ && nios2_large_constant_p (op1));
+ }
default:
break;
@@ -2019,6 +2196,106 @@ nios2_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
return false;
}
+/* Implement TARGET_ADDRESS_COST.
+   Experimentation has shown that we get better code by penalizing
+   the (plus reg symbolic_constant) and (plus reg (const ...)) forms
+ but giving (plus reg symbol_ref) address modes the same cost as those
+ that don't require splitting. Also, from a theoretical point of view:
+ - This is in line with the recommendation in the GCC internals
+ documentation to make address forms involving multiple
+ registers more expensive than single-register forms.
+ - OTOH it still encourages fwprop1 to propagate constants into
+ address expressions more aggressively.
+ - We should discourage splitting (symbol + offset) into hi/lo pairs
+ to allow CSE'ing the symbol when it's used with more than one offset,
+ but not so heavily as to avoid this addressing mode at all. */
+static int
+nios2_address_cost (rtx address,
+ machine_mode mode ATTRIBUTE_UNUSED,
+ addr_space_t as ATTRIBUTE_UNUSED,
+ bool speed ATTRIBUTE_UNUSED)
+{
+ if (nios2_plus_symbolic_constant_p (address))
+ return COSTS_N_INSNS (1);
+ if (nios2_symbolic_constant_p (address))
+ {
+ if (GET_CODE (address) == CONST)
+ return COSTS_N_INSNS (1);
+ else
+ return COSTS_N_INSNS (0);
+ }
+ return COSTS_N_INSNS (0);
+}
+
+/* Return true if X is a MEM whose address expression involves a symbolic
+ constant. */
+bool
+nios2_symbolic_memory_operand_p (rtx x)
+{
+ rtx addr;
+
+ if (GET_CODE (x) != MEM)
+ return false;
+ addr = XEXP (x, 0);
+
+ return (nios2_symbolic_constant_p (addr)
+ || nios2_plus_symbolic_constant_p (addr));
+}
+
+
+/* Return true if X is something that needs to be split into a
+ high/lo_sum pair. */
+bool
+nios2_large_constant_p (rtx x)
+{
+ return (nios2_symbolic_constant_p (x)
+ || nios2_large_unspec_reloc_p (x));
+}
+
+/* Given an RTX X that satisfies nios2_large_constant_p, split it into
+ high and lo_sum parts using TEMP as a scratch register. Emit the high
+ instruction and return the lo_sum expression. */
+rtx
+nios2_split_large_constant (rtx x, rtx temp)
+{
+ emit_insn (gen_rtx_SET (temp, gen_rtx_HIGH (Pmode, copy_rtx (x))));
+ return gen_rtx_LO_SUM (Pmode, temp, copy_rtx (x));
+}
+
+/* Split an RTX of the form
+ (plus op0 op1)
+ where op1 is a large constant into
+ (set temp (high op1))
+ (set temp (plus op0 temp))
+ (lo_sum temp op1)
+ returning the lo_sum expression as the value. */
+static rtx
+nios2_split_plus_large_constant (rtx op0, rtx op1)
+{
+ rtx temp = gen_reg_rtx (Pmode);
+ op0 = force_reg (Pmode, op0);
+
+ emit_insn (gen_rtx_SET (temp, gen_rtx_HIGH (Pmode, copy_rtx (op1))));
+ emit_insn (gen_rtx_SET (temp, gen_rtx_PLUS (Pmode, op0, temp)));
+ return gen_rtx_LO_SUM (Pmode, temp, copy_rtx (op1));
+}
+
+/* Given a MEM OP with an address that includes a splittable symbol,
+ emit some instructions to do the split and return a new MEM. */
+rtx
+nios2_split_symbolic_memory_operand (rtx op)
+{
+ rtx addr = XEXP (op, 0);
+
+ if (nios2_symbolic_constant_p (addr))
+ addr = nios2_split_large_constant (addr, gen_reg_rtx (Pmode));
+ else if (nios2_plus_symbolic_constant_p (addr))
+ addr = nios2_split_plus_large_constant (XEXP (addr, 0), XEXP (addr, 1));
+ else
+ gcc_unreachable ();
+ return replace_equiv_address (op, addr, false);
+}
+
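For context, a hedged sketch (hypothetical symbol name) of the kind of access these helpers split: an ordinary global that is neither GP-relative nor r0-relative is rewritten after split1 into a (high ...) set plus a (lo_sum ...) memory address.

/* Hypothetical example.  Before split1 the address of big_global may appear
   directly as a symbolic constant; afterwards the access uses the high/lo_sum
   form produced by nios2_split_symbolic_memory_operand above.  */
extern int big_global;

int
load_big_global (void)
{
  return big_global;
}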
/* Return true if SECTION is a small section name. */
static bool
nios2_small_section_name_p (const char *section)
@@ -2026,7 +2303,17 @@ nios2_small_section_name_p (const char *section)
return (strcmp (section, ".sbss") == 0
|| strncmp (section, ".sbss.", 6) == 0
|| strcmp (section, ".sdata") == 0
- || strncmp (section, ".sdata.", 7) == 0);
+ || strncmp (section, ".sdata.", 7) == 0
+ || (nios2_gprel_sec
+ && regexec (&nios2_gprel_sec_regex, section, 0, NULL, 0) == 0));
+}
+
+/* Return true if SECTION is a r0-relative section name. */
+static bool
+nios2_r0rel_section_name_p (const char *section)
+{
+ return (nios2_r0rel_sec
+ && regexec (&nios2_r0rel_sec_regex, section, 0, NULL, 0) == 0);
}
/* Return true if EXP should be placed in the small data section. */
@@ -2135,6 +2422,33 @@ nios2_symbol_ref_in_small_data_p (rtx sym)
}
}
+/* Likewise for r0-relative addressing. */
+static bool
+nios2_symbol_ref_in_r0rel_data_p (rtx sym)
+{
+ tree decl;
+
+ gcc_assert (GET_CODE (sym) == SYMBOL_REF);
+ decl = SYMBOL_REF_DECL (sym);
+
+ /* TLS variables are not accessed through r0. */
+ if (SYMBOL_REF_TLS_MODEL (sym) != 0)
+ return false;
+
+ /* On Nios II R2, there is no r0-relative relocation that can be
+ used with "io" instructions. So, if we are implicitly generating
+ those instructions, we cannot emit r0-relative accesses. */
+ if (TARGET_ARCH_R2
+ && (TARGET_BYPASS_CACHE || TARGET_BYPASS_CACHE_VOLATILE))
+ return false;
+
+ /* If the user has explicitly placed the symbol in a r0rel section
+ via an attribute, generate r0-relative addressing. */
+ if (decl && DECL_SECTION_NAME (decl))
+ return nios2_r0rel_section_name_p (DECL_SECTION_NAME (decl));
+ return false;
+}
+
/* Implement TARGET_SECTION_TYPE_FLAGS. */
static unsigned int
@@ -2221,6 +2535,9 @@ nios2_legitimize_constant_address (rtx addr)
base = nios2_legitimize_tls_address (base);
else if (flag_pic)
base = nios2_load_pic_address (base, UNSPEC_PIC_SYM, NULL_RTX);
+ else if (!nios2_symbolic_constant_allowed ()
+ && nios2_symbolic_constant_p (addr))
+ return nios2_split_large_constant (addr, gen_reg_rtx (Pmode));
else
return addr;
@@ -2241,9 +2558,35 @@ static rtx
nios2_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
machine_mode mode ATTRIBUTE_UNUSED)
{
+ rtx op0, op1;
+
if (CONSTANT_P (x))
return nios2_legitimize_constant_address (x);
+ /* Remaining cases all involve something + a constant. */
+ if (GET_CODE (x) != PLUS)
+ return x;
+
+ op0 = XEXP (x, 0);
+ op1 = XEXP (x, 1);
+
+ /* Target-independent code turns (exp + constant) into plain
+ register indirect. Although subsequent optimization passes will
+ eventually sort that out, ivopts uses the unoptimized form for
+ computing its cost model, so we get better results by generating
+ the correct form from the start. */
+ if (nios2_valid_addr_offset_p (op1))
+ return gen_rtx_PLUS (Pmode, force_reg (Pmode, op0), copy_rtx (op1));
+
+ /* We may need to split symbolic constants now. */
+ else if (nios2_symbolic_constant_p (op1))
+ {
+ if (nios2_symbolic_constant_allowed ())
+ return gen_rtx_PLUS (Pmode, force_reg (Pmode, op0), copy_rtx (op1));
+ else
+ return nios2_split_plus_large_constant (op0, op1);
+ }
+
/* For the TLS LE (Local Exec) model, the compiler may try to
combine constant offsets with unspec relocs, creating address RTXs
looking like this:
@@ -2266,20 +2609,19 @@ nios2_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
(const_int 48 [0x30])))] UNSPEC_ADD_TLS_LE)))
Which will be output as '%tls_le(var+48)(r23)' in assembly. */
- if (GET_CODE (x) == PLUS
- && GET_CODE (XEXP (x, 1)) == CONST)
+ else if (GET_CODE (op1) == CONST)
{
rtx unspec, offset;
- split_const (XEXP (x, 1), &unspec, &offset);
+ split_const (op1, &unspec, &offset);
if (GET_CODE (unspec) == UNSPEC
&& !nios2_large_offset_p (XINT (unspec, 1))
&& offset != const0_rtx)
{
- rtx reg = force_reg (Pmode, XEXP (x, 0));
+ rtx reg = force_reg (Pmode, op0);
unspec = copy_rtx (unspec);
XVECEXP (unspec, 0, 0)
= plus_constant (Pmode, XVECEXP (unspec, 0, 0), INTVAL (offset));
- x = gen_rtx_PLUS (Pmode, reg, gen_rtx_CONST (Pmode, unspec));
+ return gen_rtx_PLUS (Pmode, reg, gen_rtx_CONST (Pmode, unspec));
}
}
@@ -2340,10 +2682,29 @@ nios2_emit_move_sequence (rtx *operands, machine_mode mode)
return true;
}
}
- else if (!gprel_constant_p (from))
+ else if (gprel_constant_p (from) || r0rel_constant_p (from))
+ /* Handled directly by movsi_internal as gp + offset
+ or r0 + offset. */
+ ;
+ else if (nios2_large_constant_p (from))
+ /* This case covers either a regular symbol reference or an UNSPEC
+ representing a 32-bit offset. We split the former
+ only conditionally and the latter always. */
{
- if (!nios2_large_unspec_reloc_p (from))
- from = nios2_legitimize_constant_address (from);
+ if (!nios2_symbolic_constant_allowed ()
+ || nios2_large_unspec_reloc_p (from))
+ {
+ rtx lo = nios2_split_large_constant (from, to);
+ emit_insn (gen_rtx_SET (to, lo));
+ set_unique_reg_note (get_last_insn (), REG_EQUAL,
+ copy_rtx (operands[1]));
+ return true;
+ }
+ }
+ else
+ /* This is a TLS or PIC symbol. */
+ {
+ from = nios2_legitimize_constant_address (from);
if (CONSTANT_P (from))
{
emit_insn (gen_rtx_SET (to,
@@ -2654,6 +3015,7 @@ nios2_print_operand (FILE *file, rtx op, int letter)
break;
}
+ debug_rtx (op);
output_operand_lossage ("Unsupported operand for code '%c'", letter);
gcc_unreachable ();
}
@@ -2672,6 +3034,20 @@ gprel_constant_p (rtx op)
return false;
}
+/* Likewise if this is an r0-relative accessible reference.  */
+bool
+r0rel_constant_p (rtx op)
+{
+ if (GET_CODE (op) == SYMBOL_REF
+ && nios2_symbol_ref_in_r0rel_data_p (op))
+ return true;
+ else if (GET_CODE (op) == CONST
+ && GET_CODE (XEXP (op, 0)) == PLUS)
+ return r0rel_constant_p (XEXP (XEXP (op, 0), 0));
+
+ return false;
+}
+
/* Return the name string for a supported unspec reloc offset. */
static const char *
nios2_unspec_reloc_name (int unspec)
@@ -2736,7 +3112,13 @@ nios2_print_operand_address (FILE *file, machine_mode mode, rtx op)
fprintf (file, ")(%s)", reg_names[GP_REGNO]);
return;
}
-
+ else if (r0rel_constant_p (op))
+ {
+ fprintf (file, "%%lo(");
+ output_addr_const (file, op);
+ fprintf (file, ")(r0)");
+ return;
+ }
break;
case PLUS:
@@ -2759,6 +3141,20 @@ nios2_print_operand_address (FILE *file, machine_mode mode, rtx op)
}
break;
+ case LO_SUM:
+ {
+ rtx op0 = XEXP (op, 0);
+ rtx op1 = XEXP (op, 1);
+
+ if (REG_P (op0) && CONSTANT_P (op1))
+ {
+ nios2_print_operand (file, op1, 'L');
+ fprintf (file, "(%s)", reg_names[REGNO (op0)]);
+ return;
+ }
+ }
+ break;
+
case REG:
fprintf (file, "0(%s)", reg_names[REGNO (op)]);
return;
@@ -4328,9 +4724,12 @@ nios2_cdx_narrow_form_p (rtx_insn *insn)
|| TARGET_BYPASS_CACHE)
return false;
addr = XEXP (mem, 0);
- /* GP-based references are never narrow. */
- if (gprel_constant_p (addr))
+ /* GP-based and R0-based references are never narrow. */
+ if (gprel_constant_p (addr) || r0rel_constant_p (addr))
return false;
+ /* %lo requires a 16-bit relocation and is never narrow. */
+ if (GET_CODE (addr) == LO_SUM)
+ return false;
ret = split_mem_address (addr, &rhs1, &rhs2);
gcc_assert (ret);
}
@@ -4372,8 +4771,11 @@ nios2_cdx_narrow_form_p (rtx_insn *insn)
|| TARGET_BYPASS_CACHE)
return false;
addr = XEXP (mem, 0);
- /* GP-based references are never narrow. */
- if (gprel_constant_p (addr))
+ /* GP-based and r0-based references are never narrow. */
+ if (gprel_constant_p (addr) || r0rel_constant_p (addr))
+ return false;
+ /* %lo requires a 16-bit relocation and is never narrow. */
+ if (GET_CODE (addr) == LO_SUM)
return false;
ret = split_mem_address (addr, &rhs1, &rhs2);
gcc_assert (ret);
@@ -5054,15 +5456,15 @@ nios2_adjust_reg_alloc_order (void)
#undef TARGET_LEGITIMATE_ADDRESS_P
#define TARGET_LEGITIMATE_ADDRESS_P nios2_legitimate_address_p
-#undef TARGET_LRA_P
-#define TARGET_LRA_P hook_bool_void_false
-
#undef TARGET_PREFERRED_RELOAD_CLASS
#define TARGET_PREFERRED_RELOAD_CLASS nios2_preferred_reload_class
#undef TARGET_RTX_COSTS
#define TARGET_RTX_COSTS nios2_rtx_costs
+#undef TARGET_ADDRESS_COST
+#define TARGET_ADDRESS_COST nios2_address_cost
+
#undef TARGET_HAVE_TLS
#define TARGET_HAVE_TLS TARGET_LINUX_ABI
diff --git a/gcc/config/nios2/nios2.h b/gcc/config/nios2/nios2.h
index 420543e4f46..9fdff024cd8 100644
--- a/gcc/config/nios2/nios2.h
+++ b/gcc/config/nios2/nios2.h
@@ -252,6 +252,7 @@ enum reg_class
/* Stack layout. */
#define STACK_GROWS_DOWNWARD 1
+#define FRAME_GROWS_DOWNWARD 1
#define FIRST_PARM_OFFSET(FUNDECL) 0
/* Before the prologue, RA lives in r31. */
diff --git a/gcc/config/nios2/nios2.md b/gcc/config/nios2/nios2.md
index 206ebce1c46..ef2883f2516 100644
--- a/gcc/config/nios2/nios2.md
+++ b/gcc/config/nios2/nios2.md
@@ -201,7 +201,7 @@
"addi\\t%0, %1, %L2"
[(set_attr "type" "alu")])
-(define_insn "movqi_internal"
+(define_insn_and_split "movqi_internal"
[(set (match_operand:QI 0 "nonimmediate_operand" "=m, r,r")
(match_operand:QI 1 "general_operand" "rM,m,rI"))]
"(register_operand (operands[0], QImode)
@@ -224,20 +224,47 @@
gcc_unreachable ();
}
}
+ "(nios2_symbolic_memory_operand_p (operands[0])
+ || nios2_symbolic_memory_operand_p (operands[1]))"
+ [(set (match_dup 0) (match_dup 1))]
+ {
+ if (nios2_symbolic_memory_operand_p (operands[0]))
+ operands[0] = nios2_split_symbolic_memory_operand (operands[0]);
+ else
+ operands[1] = nios2_split_symbolic_memory_operand (operands[1]);
+ }
[(set_attr "type" "st,ld,mov")])
-(define_insn "movhi_internal"
+(define_insn_and_split "movhi_internal"
[(set (match_operand:HI 0 "nonimmediate_operand" "=m, r,r")
(match_operand:HI 1 "general_operand" "rM,m,rI"))]
"(register_operand (operands[0], HImode)
|| reg_or_0_operand (operands[1], HImode))"
- "@
- sth%o0%.\\t%z1, %0
- ldhu%o1%.\\t%0, %1
- mov%i1%.\\t%0, %z1"
+ {
+ switch (which_alternative)
+ {
+ case 0:
+ return "sth%o0%.\\t%z1, %0";
+ case 1:
+ return "ldhu%o1%.\\t%0, %1";
+ case 2:
+ return "mov%i1%.\\t%0, %z1";
+ default:
+ gcc_unreachable ();
+ }
+ }
+ "(nios2_symbolic_memory_operand_p (operands[0])
+ || nios2_symbolic_memory_operand_p (operands[1]))"
+ [(set (match_dup 0) (match_dup 1))]
+ {
+ if (nios2_symbolic_memory_operand_p (operands[0]))
+ operands[0] = nios2_split_symbolic_memory_operand (operands[0]);
+ else
+ operands[1] = nios2_split_symbolic_memory_operand (operands[1]);
+ }
[(set_attr "type" "st,ld,mov")])
-(define_insn "movsi_internal"
+(define_insn_and_split "movsi_internal"
[(set (match_operand:SI 0 "nonimmediate_operand" "=m, r,r, r")
(match_operand:SI 1 "general_operand" "rM,m,rIJK,S"))]
"(register_operand (operands[0], SImode)
@@ -269,6 +296,18 @@
gcc_unreachable ();
}
}
+ "(nios2_symbolic_memory_operand_p (operands[0])
+ || nios2_symbolic_memory_operand_p (operands[1])
+ || nios2_large_constant_p (operands[1]))"
+ [(set (match_dup 0) (match_dup 1))]
+ {
+ if (nios2_symbolic_memory_operand_p (operands[0]))
+ operands[0] = nios2_split_symbolic_memory_operand (operands[0]);
+ else if (nios2_symbolic_memory_operand_p (operands[1]))
+ operands[1] = nios2_split_symbolic_memory_operand (operands[1]);
+ else
+ operands[1] = nios2_split_large_constant (operands[1], operands[0]);
+ }
[(set_attr "type" "st,ld,mov,alu")])
(define_mode_iterator BH [QI HI])
@@ -318,42 +357,62 @@
(define_mode_iterator QX [HI SI])
;; Zero extension patterns
-(define_insn "zero_extendhisi2"
+(define_insn_and_split "zero_extendhisi2"
[(set (match_operand:SI 0 "register_operand" "=r,r")
(zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r,m")))]
""
"@
andi%.\\t%0, %1, 0xffff
ldhu%o1%.\\t%0, %1"
+ "nios2_symbolic_memory_operand_p (operands[1])"
+ [(set (match_dup 0) (zero_extend:SI (match_dup 1)))]
+ {
+ operands[1] = nios2_split_symbolic_memory_operand (operands[1]);
+ }
[(set_attr "type" "and,ld")])
-(define_insn "zero_extendqi<mode>2"
+(define_insn_and_split "zero_extendqi<mode>2"
[(set (match_operand:QX 0 "register_operand" "=r,r")
(zero_extend:QX (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
""
"@
andi%.\\t%0, %1, 0xff
ldbu%o1%.\\t%0, %1"
+ "nios2_symbolic_memory_operand_p (operands[1])"
+ [(set (match_dup 0) (zero_extend:QX (match_dup 1)))]
+ {
+ operands[1] = nios2_split_symbolic_memory_operand (operands[1]);
+ }
[(set_attr "type" "and,ld")])
;; Sign extension patterns
-(define_insn "extendhisi2"
+(define_insn_and_split "extendhisi2"
[(set (match_operand:SI 0 "register_operand" "=r,r")
(sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r,m")))]
""
"@
#
ldh%o1%.\\t%0, %1"
+ "nios2_symbolic_memory_operand_p (operands[1])"
+ [(set (match_dup 0) (sign_extend:SI (match_dup 1)))]
+ {
+ operands[1] = nios2_split_symbolic_memory_operand (operands[1]);
+ }
[(set_attr "type" "alu,ld")])
-(define_insn "extendqi<mode>2"
+(define_insn_and_split "extendqi<mode>2"
[(set (match_operand:QX 0 "register_operand" "=r,r")
(sign_extend:QX (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
""
"@
#
ldb%o1%.\\t%0, %1"
+ "nios2_symbolic_memory_operand_p (operands[1])"
+ [(set (match_dup 0) (sign_extend:QX (match_dup 1)))]
+ {
+ operands[1] = nios2_split_symbolic_memory_operand (operands[1]);
+ }
[(set_attr "type" "alu,ld")])
;; Split patterns for register alternative cases.
diff --git a/gcc/config/nios2/nios2.opt b/gcc/config/nios2/nios2.opt
index 08cb93541ee..a50dbee3fa7 100644
--- a/gcc/config/nios2/nios2.opt
+++ b/gcc/config/nios2/nios2.opt
@@ -586,3 +586,11 @@ Enable generation of R2 BMX instructions.
mcdx
Target Report Mask(HAS_CDX)
Enable generation of R2 CDX instructions.
+
+mgprel-sec=
+Target RejectNegative Joined Var(nios2_gprel_sec) Init(NULL)
+Regular expression matching additional GP-addressable section names.
+
+mr0rel-sec=
+Target RejectNegative Joined Var(nios2_r0rel_sec) Init(NULL)
+Regular expression matching section names for r0-relative addressing.
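A hedged usage sketch for the new options; the section name and variable below are hypothetical. A variable placed in a section whose name matches the -mr0rel-sec= regular expression is then accessed r0-relative, per nios2_symbol_ref_in_r0rel_data_p above.

/* Hypothetical example, compiled with -mr0rel-sec=\.io_regs.* on nios2.  */
volatile int status __attribute__ ((section (".io_regs")));

int
read_status (void)
{
  return status;   /* emitted as %lo(status)(r0) rather than via a full
                      32-bit address computation */
}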
diff --git a/gcc/config/pa/pa.h b/gcc/config/pa/pa.h
index 9ed929a301e..514de811f17 100644
--- a/gcc/config/pa/pa.h
+++ b/gcc/config/pa/pa.h
@@ -691,8 +691,8 @@ void hppa_profile_hook (int label_no);
extern int may_call_alloca;
#define EXIT_IGNORE_STACK \
- (maybe_nonzero (get_frame_size ()) \
- || cfun->calls_alloca || maybe_nonzero (crtl->outgoing_args_size))
+ (may_ne (get_frame_size (), 0) \
+ || cfun->calls_alloca || may_ne (crtl->outgoing_args_size, 0))
/* Length in units of the trampoline for entering a nested function. */
diff --git a/gcc/config/powerpcspe/powerpcspe.c b/gcc/config/powerpcspe/powerpcspe.c
index 6da9f59148d..7a817fa3493 100644
--- a/gcc/config/powerpcspe/powerpcspe.c
+++ b/gcc/config/powerpcspe/powerpcspe.c
@@ -5860,6 +5860,7 @@ rs6000_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
return 3;
case unaligned_load:
+ case vector_gather_load:
if (TARGET_P9_VECTOR)
return 3;
@@ -5901,6 +5902,7 @@ rs6000_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
return 2;
case unaligned_store:
+ case vector_scatter_store:
if (TARGET_EFFICIENT_UNALIGNED_VSX)
return 1;
@@ -9539,6 +9541,8 @@ rs6000_delegitimize_address (rtx orig_x)
static bool
rs6000_const_not_ok_for_debug_p (rtx x)
{
+ if (GET_CODE (x) == UNSPEC)
+ return true;
if (GET_CODE (x) == SYMBOL_REF
&& CONSTANT_POOL_ADDRESS_P (x))
{
diff --git a/gcc/config/riscv/pic.md b/gcc/config/riscv/pic.md
index 6a29ead32d3..03b8f9bc669 100644
--- a/gcc/config/riscv/pic.md
+++ b/gcc/config/riscv/pic.md
@@ -22,13 +22,20 @@
;; Simplify PIC loads to static variables.
;; These should go away once we figure out how to emit auipc discretely.
-(define_insn "*local_pic_load<mode>"
+(define_insn "*local_pic_load_s<mode>"
[(set (match_operand:ANYI 0 "register_operand" "=r")
- (mem:ANYI (match_operand 1 "absolute_symbolic_operand" "")))]
+ (sign_extend:ANYI (mem:ANYI (match_operand 1 "absolute_symbolic_operand" ""))))]
"USE_LOAD_ADDRESS_MACRO (operands[1])"
"<load>\t%0,%1"
[(set (attr "length") (const_int 8))])
+(define_insn "*local_pic_load_u<mode>"
+ [(set (match_operand:ZERO_EXTEND_LOAD 0 "register_operand" "=r")
+ (zero_extend:ZERO_EXTEND_LOAD (mem:ZERO_EXTEND_LOAD (match_operand 1 "absolute_symbolic_operand" ""))))]
+ "USE_LOAD_ADDRESS_MACRO (operands[1])"
+ "<load>u\t%0,%1"
+ [(set (attr "length") (const_int 8))])
+
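A hedged example (hypothetical static variable) of the access the new pattern matches: the plain PIC local load is split into sign- and zero-extending variants, so a zero-extending byte load of a local symbol now matches *local_pic_load_u.

/* Hypothetical example, in a configuration where USE_LOAD_ADDRESS_MACRO
   holds for local symbols: this zero-extending byte load is matched by the
   new *local_pic_load_u<mode> pattern.  */
static unsigned char flag_byte;

unsigned int
get_flag (void)
{
  return flag_byte;
}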
(define_insn "*local_pic_load<mode>"
[(set (match_operand:ANYF 0 "register_operand" "=f")
(mem:ANYF (match_operand 1 "absolute_symbolic_operand" "")))
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 06106f22b8b..8ce93528b99 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -1334,6 +1334,22 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
return true;
}
+  /* RISC-V GCC may generate non-legitimate addresses because we provide
+     patterns to optimize accesses to PIC local symbols, which can make GCC
+     generate unrecognizable instructions during optimization.  */
+
+ if (MEM_P (dest) && !riscv_legitimate_address_p (mode, XEXP (dest, 0),
+ reload_completed))
+ {
+ XEXP (dest, 0) = riscv_force_address (XEXP (dest, 0), mode);
+ }
+
+ if (MEM_P (src) && !riscv_legitimate_address_p (mode, XEXP (src, 0),
+ reload_completed))
+ {
+ XEXP (src, 0) = riscv_force_address (XEXP (src, 0), mode);
+ }
+
return false;
}
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index fd9236c7c17..9f056bbcda4 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -259,6 +259,9 @@
;; Iterator for QImode extension patterns.
(define_mode_iterator SUPERQI [HI SI (DI "TARGET_64BIT")])
+;; Iterator for extending loads.
+(define_mode_iterator ZERO_EXTEND_LOAD [QI HI (SI "TARGET_64BIT")])
+
;; Iterator for hardware integer modes narrower than XLEN.
(define_mode_iterator SUBX [QI HI (SI "TARGET_64BIT")])
diff --git a/gcc/config/rl78/rl78-protos.h b/gcc/config/rl78/rl78-protos.h
index a155df61b99..976bffa61e7 100644
--- a/gcc/config/rl78/rl78-protos.h
+++ b/gcc/config/rl78/rl78-protos.h
@@ -54,3 +54,13 @@ void rl78_output_aligned_common (FILE *, tree, const char *,
int, int, int);
int rl78_one_far_p (rtx *operands, int num_operands);
+
+#ifdef RTX_CODE
+#ifdef HAVE_MACHINE_MODES
+
+rtx rl78_emit_libcall (const char*, enum rtx_code,
+ enum machine_mode, enum machine_mode,
+ int, rtx*);
+
+#endif
+#endif
diff --git a/gcc/config/rl78/rl78.c b/gcc/config/rl78/rl78.c
index ce66866ef84..d2baa8ccfae 100644
--- a/gcc/config/rl78/rl78.c
+++ b/gcc/config/rl78/rl78.c
@@ -362,6 +362,7 @@ rl78_option_override (void)
if (TARGET_ES0
&& strcmp (lang_hooks.name, "GNU C")
&& strcmp (lang_hooks.name, "GNU C11")
+ && strcmp (lang_hooks.name, "GNU C17")
&& strcmp (lang_hooks.name, "GNU C89")
&& strcmp (lang_hooks.name, "GNU C99")
/* Compiling with -flto results in a language of GNU GIMPLE being used... */
@@ -4793,6 +4794,45 @@ rl78_addsi3_internal (rtx * operands, unsigned int alternative)
}
}
+rtx
+rl78_emit_libcall (const char *name, enum rtx_code code,
+ enum machine_mode dmode, enum machine_mode smode,
+ int noperands, rtx *operands)
+{
+ rtx ret;
+ rtx_insn *insns;
+ rtx libcall;
+ rtx equiv;
+
+ start_sequence ();
+ libcall = gen_rtx_SYMBOL_REF (Pmode, name);
+
+ switch (noperands)
+ {
+ case 2:
+ ret = emit_library_call_value (libcall, NULL_RTX, LCT_CONST,
+ dmode, operands[1], smode);
+ equiv = gen_rtx_fmt_e (code, dmode, operands[1]);
+ break;
+
+ case 3:
+ ret = emit_library_call_value (libcall, NULL_RTX,
+ LCT_CONST, dmode,
+ operands[1], smode, operands[2],
+ smode);
+ equiv = gen_rtx_fmt_ee (code, dmode, operands[1], operands[2]);
+ break;
+
+ default:
+ gcc_unreachable ();
+ }
+
+ insns = get_insns ();
+ end_sequence ();
+ emit_libcall_block (insns, operands[0], ret, equiv);
+ return ret;
+}
+
#undef TARGET_PREFERRED_RELOAD_CLASS
#define TARGET_PREFERRED_RELOAD_CLASS rl78_preferred_reload_class
diff --git a/gcc/config/rl78/rl78.md b/gcc/config/rl78/rl78.md
index 722d98439b2..c53ca0ff98d 100644
--- a/gcc/config/rl78/rl78.md
+++ b/gcc/config/rl78/rl78.md
@@ -224,6 +224,16 @@
DONE;"
)
+(define_expand "adddi3"
+ [(set (match_operand:DI 0 "nonimmediate_operand" "")
+ (plus:DI (match_operand:DI 1 "general_operand" "")
+ (match_operand:DI 2 "general_operand" "")))
+ ]
+ ""
+ "rl78_emit_libcall (\"__adddi3\", PLUS, DImode, DImode, 3, operands);
+ DONE;"
+)
+
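As a quick illustration (hypothetical source, not from the patch): with this expander, 64-bit addition on RL78 is lowered through rl78_emit_libcall to a call to the libgcc helper __adddi3 rather than being open-coded.

/* Hypothetical C input; the DImode addition expands via the adddi3 pattern
   above into a library call to __adddi3.  */
long long
add64 (long long a, long long b)
{
  return a + b;
}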
(define_insn "addsi3_internal_virt"
[(set (match_operand:SI 0 "nonimmediate_operand" "=v,&vm, vm")
(plus:SI (match_operand:SI 1 "general_operand" "0, vim, vim")
@@ -258,6 +268,16 @@
DONE;"
)
+(define_expand "subdi3"
+ [(set (match_operand:DI 0 "nonimmediate_operand" "")
+ (minus:DI (match_operand:DI 1 "general_operand" "")
+ (match_operand:DI 2 "general_operand" "")))
+ ]
+ ""
+ "rl78_emit_libcall (\"__subdi3\", MINUS, DImode, DImode, 3, operands);
+ DONE;"
+)
+
(define_insn "subsi3_internal_virt"
[(set (match_operand:SI 0 "nonimmediate_operand" "=v,&vm, vm")
(minus:SI (match_operand:SI 1 "general_operand" "0, vim, vim")
diff --git a/gcc/config/rs6000/aix.h b/gcc/config/rs6000/aix.h
index 36c4a522b4f..31fda583c2c 100644
--- a/gcc/config/rs6000/aix.h
+++ b/gcc/config/rs6000/aix.h
@@ -77,6 +77,9 @@
#undef TARGET_IEEEQUAD
#define TARGET_IEEEQUAD 0
+#undef TARGET_IEEEQUAD_DEFAULT
+#define TARGET_IEEEQUAD_DEFAULT 0
+
/* The AIX linker will discard static constructors in object files before
collect has a chance to see them, so scan the object files directly. */
#define COLLECT_EXPORT_LIST
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 8f2631ccc7c..d0fcd1c3d8a 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -789,11 +789,12 @@
rtx zero = gen_reg_rtx (V8HImode);
emit_insn (gen_altivec_vspltish (zero, const0_rtx));
- emit_insn (gen_altivec_vmladduhm(operands[0], operands[1], operands[2], zero));
+ emit_insn (gen_fmav8hi4 (operands[0], operands[1], operands[2], zero));
DONE;
})
+
;; Fused multiply subtract
(define_insn "*altivec_vnmsubfp"
[(set (match_operand:V4SF 0 "register_operand" "=v")
@@ -930,7 +931,7 @@
"vmhraddshs %0,%1,%2,%3"
[(set_attr "type" "veccomplex")])
-(define_insn "altivec_vmladduhm"
+(define_insn "fmav8hi4"
[(set (match_operand:V8HI 0 "register_operand" "=v")
(plus:V8HI (mult:V8HI (match_operand:V8HI 1 "register_operand" "v")
(match_operand:V8HI 2 "register_operand" "v"))
diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
index 2c3b04c613b..5f03fa5f206 100644
--- a/gcc/config/rs6000/darwin.h
+++ b/gcc/config/rs6000/darwin.h
@@ -272,6 +272,9 @@ extern int darwin_emit_branch_islands;
#undef TARGET_IEEEQUAD
#define TARGET_IEEEQUAD 0
+#undef TARGET_IEEEQUAD_DEFAULT
+#define TARGET_IEEEQUAD_DEFAULT 0
+
/* Since Darwin doesn't do TOCs, stub this out. */
#define ASM_OUTPUT_SPECIAL_POOL_ENTRY_P(X, MODE) ((void)X, (void)MODE, 0)
diff --git a/gcc/config/rs6000/emmintrin.h b/gcc/config/rs6000/emmintrin.h
new file mode 100644
index 00000000000..6a33ad9dd13
--- /dev/null
+++ b/gcc/config/rs6000/emmintrin.h
@@ -0,0 +1,2340 @@
+/* Copyright (C) 2003-2017 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ Under Section 7 of GPL version 3, you are granted additional
+ permissions described in the GCC Runtime Library Exception, version
+ 3.1, as published by the Free Software Foundation.
+
+ You should have received a copy of the GNU General Public License and
+ a copy of the GCC Runtime Library Exception along with this program;
+ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* Implemented from the specification included in the Intel C++ Compiler
+ User Guide and Reference, version 9.0. */
+
+#ifndef NO_WARN_X86_INTRINSICS
+/* This header is distributed to simplify porting x86_64 code that
+ makes explicit use of Intel intrinsics to powerpc64le.
+ It is the user's responsibility to determine if the results are
+ acceptable and make additional changes as necessary.
+ Note that much code that uses Intel intrinsics can be rewritten in
+ standard C or GNU C extensions, which are more portable and better
+ optimized across multiple targets.
+
+ In the specific case of X86 SSE2 (__m128i, __m128d) intrinsics,
+ the PowerPC VMX/VSX ISA is a good match for vector double SIMD
+ operations. However scalar double operations in vector (XMM)
+ registers require the POWER8 VSX ISA (2.07) level. Also there are
+ important differences for data format and placement of double
+ scalars in the vector register.
+
+   For PowerISA, scalar double is in FPRs (the leftmost 64 bits of the
+   low 32 VSRs), while x86_64 SSE2 uses the rightmost 64 bits of the
+   XMM register.  These differences require extra steps on POWER to
+   match the SSE2 scalar double semantics.
+
+ Most SSE2 scalar double intrinsic operations can be performed more
+ efficiently as C language double scalar operations or optimized to
+ use vector SIMD operations. We recommend this for new applications.
+
+   Another difference is the format and details of the x86_64 MXCSR vs
+   the PowerISA FPSCR / VSCR registers.  We recommend that applications
+   replace direct access to the MXCSR with the more portable <fenv.h>
+   POSIX APIs.  */
+#warning "Please read comment above. Use -DNO_WARN_X86_INTRINSICS to disable this warning."
+#endif
+
+#ifndef EMMINTRIN_H_
+#define EMMINTRIN_H_
+
+#include <altivec.h>
+#include <assert.h>
+
+/* We need definitions from the SSE header files. */
+#include <xmmintrin.h>
+
+/* SSE2 */
+typedef __vector double __v2df;
+typedef __vector long long __v2di;
+typedef __vector unsigned long long __v2du;
+typedef __vector int __v4si;
+typedef __vector unsigned int __v4su;
+typedef __vector short __v8hi;
+typedef __vector unsigned short __v8hu;
+typedef __vector signed char __v16qi;
+typedef __vector unsigned char __v16qu;
+
+/* The Intel API is flexible enough that we must allow aliasing with other
+ vector types, and their scalar components. */
+typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
+typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
+
+/* Unaligned version of the same types. */
+typedef long long __m128i_u __attribute__ ((__vector_size__ (16), __may_alias__, __aligned__ (1)));
+typedef double __m128d_u __attribute__ ((__vector_size__ (16), __may_alias__, __aligned__ (1)));
+
+/* Create a vector with element 0 as F and the rest zero. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set_sd (double __F)
+{
+ return __extension__ (__m128d){ __F, 0.0 };
+}
+
+/* Create a vector with both elements equal to F. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set1_pd (double __F)
+{
+ return __extension__ (__m128d){ __F, __F };
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set_pd1 (double __F)
+{
+ return _mm_set1_pd (__F);
+}
+
+/* Create a vector with the lower value X and upper value W. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set_pd (double __W, double __X)
+{
+ return __extension__ (__m128d){ __X, __W };
+}
+
+/* Create a vector with the lower value W and upper value X. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_setr_pd (double __W, double __X)
+{
+ return __extension__ (__m128d){ __W, __X };
+}
+
+/* Create an undefined vector. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_undefined_pd (void)
+{
+ __m128d __Y = __Y;
+ return __Y;
+}
+
+/* Create a vector of zeros. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_setzero_pd (void)
+{
+ return (__m128d) vec_splats (0);
+}
+
+/* Sets the low DPFP value of A from the low value of B. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_move_sd (__m128d __A, __m128d __B)
+{
+ __v2df result = (__v2df) __A;
+ result [0] = ((__v2df) __B)[0];
+ return (__m128d) result;
+}
+
+/* Load two DPFP values from P. The address must be 16-byte aligned. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_load_pd (double const *__P)
+{
+ assert(((unsigned long)__P & 0xfUL) == 0UL);
+ return ((__m128d)vec_ld(0, (__v16qu*)__P));
+}
+
+/* Load two DPFP values from P. The address need not be 16-byte aligned. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_loadu_pd (double const *__P)
+{
+ return (vec_vsx_ld(0, __P));
+}
+
+/* Create a vector with both elements equal to *P.  */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_load1_pd (double const *__P)
+{
+ return (vec_splats (*__P));
+}
+
+/* Create a vector with element 0 as *P and the rest zero. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_load_sd (double const *__P)
+{
+ return _mm_set_sd (*__P);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_load_pd1 (double const *__P)
+{
+ return _mm_load1_pd (__P);
+}
+
+/* Load two DPFP values in reverse order. The address must be aligned. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_loadr_pd (double const *__P)
+{
+ __v2df __tmp = _mm_load_pd (__P);
+ return (__m128d)vec_xxpermdi (__tmp, __tmp, 2);
+}
+
+/* Store two DPFP values. The address must be 16-byte aligned. */
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_store_pd (double *__P, __m128d __A)
+{
+ assert(((unsigned long)__P & 0xfUL) == 0UL);
+ vec_st((__v16qu)__A, 0, (__v16qu*)__P);
+}
+
+/* Store two DPFP values. The address need not be 16-byte aligned. */
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_storeu_pd (double *__P, __m128d __A)
+{
+ *(__m128d *)__P = __A;
+}
+
+/* Stores the lower DPFP value. */
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_store_sd (double *__P, __m128d __A)
+{
+ *__P = ((__v2df)__A)[0];
+}
+
+extern __inline double __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsd_f64 (__m128d __A)
+{
+ return ((__v2df)__A)[0];
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_storel_pd (double *__P, __m128d __A)
+{
+ _mm_store_sd (__P, __A);
+}
+
+/* Stores the upper DPFP value. */
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_storeh_pd (double *__P, __m128d __A)
+{
+ *__P = ((__v2df)__A)[1];
+}
+/* Store the lower DPFP value across two words.
+ The address must be 16-byte aligned. */
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_store1_pd (double *__P, __m128d __A)
+{
+ _mm_store_pd (__P, vec_splat (__A, 0));
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_store_pd1 (double *__P, __m128d __A)
+{
+ _mm_store1_pd (__P, __A);
+}
+
+/* Store two DPFP values in reverse order. The address must be aligned. */
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_storer_pd (double *__P, __m128d __A)
+{
+ _mm_store_pd (__P, vec_xxpermdi (__A, __A, 2));
+}
+
+/* Intel intrinsic. */
+extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsi128_si64 (__m128i __A)
+{
+ return ((__v2di)__A)[0];
+}
+
+/* Microsoft intrinsic. */
+extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsi128_si64x (__m128i __A)
+{
+ return ((__v2di)__A)[0];
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_add_pd (__m128d __A, __m128d __B)
+{
+ return (__m128d) ((__v2df)__A + (__v2df)__B);
+}
+
+/* Add the lower double-precision (64-bit) floating-point element in
+ a and b, store the result in the lower element of dst, and copy
+ the upper element from a to the upper element of dst. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_add_sd (__m128d __A, __m128d __B)
+{
+ __A[0] = __A[0] + __B[0];
+ return (__A);
+}
+
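A brief usage sketch, not part of the header, of the point made in the opening comment: an SSE2 scalar-double sequence built from the intrinsics above, next to the plain C form that is generally preferable on POWER.

/* Hypothetical comparison of the two forms.  */
static double
add_sd_via_intrinsics (double a, double b)
{
  __m128d va = _mm_set_sd (a);
  __m128d vb = _mm_set_sd (b);
  return _mm_cvtsd_f64 (_mm_add_sd (va, vb));
}

static double
add_sd_plain (double a, double b)
{
  return a + b;   /* avoids the vector set-up and extract steps */
}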
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sub_pd (__m128d __A, __m128d __B)
+{
+ return (__m128d) ((__v2df)__A - (__v2df)__B);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sub_sd (__m128d __A, __m128d __B)
+{
+ __A[0] = __A[0] - __B[0];
+ return (__A);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mul_pd (__m128d __A, __m128d __B)
+{
+ return (__m128d) ((__v2df)__A * (__v2df)__B);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mul_sd (__m128d __A, __m128d __B)
+{
+ __A[0] = __A[0] * __B[0];
+ return (__A);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_div_pd (__m128d __A, __m128d __B)
+{
+ return (__m128d) ((__v2df)__A / (__v2df)__B);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_div_sd (__m128d __A, __m128d __B)
+{
+ __A[0] = __A[0] / __B[0];
+ return (__A);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sqrt_pd (__m128d __A)
+{
+ return (vec_sqrt (__A));
+}
+
+/* Return pair {sqrt (B[0]), A[1]}. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sqrt_sd (__m128d __A, __m128d __B)
+{
+ __v2df c;
+ c = vec_sqrt ((__v2df) _mm_set1_pd (__B[0]));
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_min_pd (__m128d __A, __m128d __B)
+{
+ return (vec_min (__A, __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_min_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ c = vec_min (a, b);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_max_pd (__m128d __A, __m128d __B)
+{
+ return (vec_max (__A, __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_max_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ c = vec_max (a, b);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpeq_pd (__m128d __A, __m128d __B)
+{
+ return ((__m128d)vec_cmpeq ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmplt_pd (__m128d __A, __m128d __B)
+{
+ return ((__m128d)vec_cmplt ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmple_pd (__m128d __A, __m128d __B)
+{
+ return ((__m128d)vec_cmple ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpgt_pd (__m128d __A, __m128d __B)
+{
+ return ((__m128d)vec_cmpgt ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpge_pd (__m128d __A, __m128d __B)
+{
+ return ((__m128d)vec_cmpge ((__v2df) __A,(__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpneq_pd (__m128d __A, __m128d __B)
+{
+ __v2df temp = (__v2df) vec_cmpeq ((__v2df) __A, (__v2df)__B);
+ return ((__m128d)vec_nor (temp, temp));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpnlt_pd (__m128d __A, __m128d __B)
+{
+ return ((__m128d)vec_cmpge ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpnle_pd (__m128d __A, __m128d __B)
+{
+ return ((__m128d)vec_cmpgt ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpngt_pd (__m128d __A, __m128d __B)
+{
+ return ((__m128d)vec_cmple ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpnge_pd (__m128d __A, __m128d __B)
+{
+ return ((__m128d)vec_cmplt ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpord_pd (__m128d __A, __m128d __B)
+{
+#if _ARCH_PWR8
+ __v2du c, d;
+ /* Compare against self will return false (0's) if NAN. */
+ c = (__v2du)vec_cmpeq (__A, __A);
+ d = (__v2du)vec_cmpeq (__B, __B);
+#else
+ __v2du a, b;
+ __v2du c, d;
+ const __v2du double_exp_mask = {0x7ff0000000000000, 0x7ff0000000000000};
+ a = (__v2du)vec_abs ((__v2df)__A);
+ b = (__v2du)vec_abs ((__v2df)__B);
+ c = (__v2du)vec_cmpgt (double_exp_mask, a);
+ d = (__v2du)vec_cmpgt (double_exp_mask, b);
+#endif
+ /* A != NAN and B != NAN. */
+ return ((__m128d)vec_and(c, d));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpunord_pd (__m128d __A, __m128d __B)
+{
+#if _ARCH_PWR8
+ __v2du c, d;
+ /* Compare against self will return false (0's) if NAN. */
+ c = (__v2du)vec_cmpeq ((__v2df)__A, (__v2df)__A);
+ d = (__v2du)vec_cmpeq ((__v2df)__B, (__v2df)__B);
+  /* A == NAN OR B == NAN converts to:
+ NOT(A != NAN) OR NOT(B != NAN). */
+ c = vec_nor (c, c);
+ return ((__m128d)vec_orc(c, d));
+#else
+ __v2du c, d;
+ /* Compare against self will return false (0's) if NAN. */
+ c = (__v2du)vec_cmpeq ((__v2df)__A, (__v2df)__A);
+ d = (__v2du)vec_cmpeq ((__v2df)__B, (__v2df)__B);
+  /* Convert so that true ('1's) means NAN.  */
+ c = vec_nor (c, c);
+ d = vec_nor (d, d);
+ return ((__m128d)vec_or(c, d));
+#endif
+}
+
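A short usage sketch (hypothetical helper, not part of the header) of the ordered/unordered predicates just defined: with a NaN in a lane, _mm_cmpord_pd yields an all-zero mask and _mm_cmpunord_pd an all-ones mask for that lane.

/* Hypothetical helper: true iff neither x[0] nor y[0] is a NaN.  */
static int
lane0_is_ordered (__m128d x, __m128d y)
{
  __m128d mask = _mm_cmpord_pd (x, y);
  return ((__v2du) mask)[0] != 0;
}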
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpeq_sd(__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ /* PowerISA VSX does not allow partial (for just lower double)
+     results.  So to ensure we don't generate spurious exceptions
+ (from the upper double values) we splat the lower double
+ before we do the operation. */
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ c = (__v2df) vec_cmpeq(a, b);
+ /* Then we merge the lower double result with the original upper
+ double from __A. */
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
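A minimal usage sketch of the scalar-compare pattern above (only element 0 is compared; element 1 of the first operand passes through), assuming this header is on the include path as <emmintrin.h> on a VSX-capable PowerPC target and that _mm_set_pd is defined earlier in the header:

#include <emmintrin.h>
#include <stdio.h>

int main (void)
{
  __m128d a = _mm_set_pd (9.0, 1.0);   /* element 0 = 1.0, element 1 = 9.0 */
  __m128d b = _mm_set_pd (7.0, 1.0);   /* element 0 = 1.0, element 1 = 7.0 */
  __m128d r = _mm_cmpeq_sd (a, b);
  /* Element 0 is an all-ones mask (1.0 == 1.0); element 1 is __A's 9.0.
     Element access below uses the GCC vector extension.  */
  printf ("upper double passed through: %f\n", r[1]);
  return 0;
}
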
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmplt_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ c = (__v2df) vec_cmplt(a, b);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmple_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ c = (__v2df) vec_cmple(a, b);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpgt_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ c = (__v2df) vec_cmpgt(a, b);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpge_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ c = (__v2df) vec_cmpge(a, b);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpneq_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ c = (__v2df) vec_cmpeq(a, b);
+ c = vec_nor (c, c);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpnlt_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ /* Not less than is just greater than or equal. */
+ c = (__v2df) vec_cmpge(a, b);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpnle_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ /* Not less than or equal is just greater than. */
+ c = (__v2df) vec_cmpgt(a, b);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpngt_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ /* Not greater than is just less than or equal. */
+ c = (__v2df) vec_cmple(a, b);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpnge_sd (__m128d __A, __m128d __B)
+{
+ __v2df a, b, c;
+ a = vec_splats (__A[0]);
+ b = vec_splats (__B[0]);
+ /* Not greater than or equal is just less than. */
+ c = (__v2df) vec_cmplt(a, b);
+ return (__m128d) _mm_setr_pd (c[0], __A[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpord_sd (__m128d __A, __m128d __B)
+{
+ __v2df r;
+ r = (__v2df)_mm_cmpord_pd (vec_splats (__A[0]), vec_splats (__B[0]));
+ return (__m128d) _mm_setr_pd (r[0], ((__v2df)__A)[1]);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpunord_sd (__m128d __A, __m128d __B)
+{
+ __v2df r;
+ r = _mm_cmpunord_pd (vec_splats (__A[0]), vec_splats (__B[0]));
+ return (__m128d) _mm_setr_pd (r[0], __A[1]);
+}
+
+/* FIXME
+ The _mm_comi??_sd and _mm_ucomi??_sd implementations below are
+ exactly the same because GCC for PowerPC only generates unordered
+ compares (scalar and vector).
+ Technically _mm_comieq_sd et al. should be using the ordered
+ compare and signal for QNaNs. The _mm_ucomieq_sd et al. should
+ be OK. */
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_comieq_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] == __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_comilt_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] < __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_comile_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] <= __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_comigt_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] > __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_comige_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] >= __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_comineq_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] != __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_ucomieq_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] == __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_ucomilt_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] < __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_ucomile_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] <= __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_ucomigt_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] > __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_ucomige_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] >= __B[0]);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_ucomineq_sd (__m128d __A, __m128d __B)
+{
+ return (__A[0] != __B[0]);
+}
+
+/* Create a vector of Qi, where i is the element number. */
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set_epi64x (long long __q1, long long __q0)
+{
+ return __extension__ (__m128i)(__v2di){ __q0, __q1 };
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set_epi64 (__m64 __q1, __m64 __q0)
+{
+ return _mm_set_epi64x ((long long)__q1, (long long)__q0);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set_epi32 (int __q3, int __q2, int __q1, int __q0)
+{
+ return __extension__ (__m128i)(__v4si){ __q0, __q1, __q2, __q3 };
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set_epi16 (short __q7, short __q6, short __q5, short __q4,
+ short __q3, short __q2, short __q1, short __q0)
+{
+ return __extension__ (__m128i)(__v8hi){
+ __q0, __q1, __q2, __q3, __q4, __q5, __q6, __q7 };
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set_epi8 (char __q15, char __q14, char __q13, char __q12,
+ char __q11, char __q10, char __q09, char __q08,
+ char __q07, char __q06, char __q05, char __q04,
+ char __q03, char __q02, char __q01, char __q00)
+{
+ return __extension__ (__m128i)(__v16qi){
+ __q00, __q01, __q02, __q03, __q04, __q05, __q06, __q07,
+ __q08, __q09, __q10, __q11, __q12, __q13, __q14, __q15
+ };
+}
+
+/* Set all of the elements of the vector to A. */
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set1_epi64x (long long __A)
+{
+ return _mm_set_epi64x (__A, __A);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set1_epi64 (__m64 __A)
+{
+ return _mm_set_epi64 (__A, __A);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set1_epi32 (int __A)
+{
+ return _mm_set_epi32 (__A, __A, __A, __A);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set1_epi16 (short __A)
+{
+ return _mm_set_epi16 (__A, __A, __A, __A, __A, __A, __A, __A);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set1_epi8 (char __A)
+{
+ return _mm_set_epi8 (__A, __A, __A, __A, __A, __A, __A, __A,
+ __A, __A, __A, __A, __A, __A, __A, __A);
+}
+
+/* Create a vector of Qi, where i is the element number.
+ The parameter order is reversed from the _mm_set_epi* functions. */
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_setr_epi64 (__m64 __q0, __m64 __q1)
+{
+ return _mm_set_epi64 (__q1, __q0);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_setr_epi32 (int __q0, int __q1, int __q2, int __q3)
+{
+ return _mm_set_epi32 (__q3, __q2, __q1, __q0);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_setr_epi16 (short __q0, short __q1, short __q2, short __q3,
+ short __q4, short __q5, short __q6, short __q7)
+{
+ return _mm_set_epi16 (__q7, __q6, __q5, __q4, __q3, __q2, __q1, __q0);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_setr_epi8 (char __q00, char __q01, char __q02, char __q03,
+ char __q04, char __q05, char __q06, char __q07,
+ char __q08, char __q09, char __q10, char __q11,
+ char __q12, char __q13, char __q14, char __q15)
+{
+ return _mm_set_epi8 (__q15, __q14, __q13, __q12, __q11, __q10, __q09, __q08,
+ __q07, __q06, __q05, __q04, __q03, __q02, __q01, __q00);
+}
+
+/* Load 128 bits of integer data from memory. */
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_load_si128 (__m128i const *__P)
+{
+ return *__P;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_loadu_si128 (__m128i_u const *__P)
+{
+ return (__m128i) (vec_vsx_ld(0, (signed int const *)__P));
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_loadl_epi64 (__m128i_u const *__P)
+{
+ return _mm_set_epi64 ((__m64)0LL, *(__m64 *)__P);
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_store_si128 (__m128i *__P, __m128i __B)
+{
+ assert(((unsigned long )__P & 0xfUL) == 0UL);
+ vec_st ((__v16qu) __B, 0, (__v16qu*)__P);
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_storeu_si128 (__m128i_u *__P, __m128i __B)
+{
+ *__P = __B;
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_storel_epi64 (__m128i_u *__P, __m128i __B)
+{
+ *(long long *)__P = ((__v2di)__B)[0];
+}
+
+extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_movepi64_pi64 (__m128i_u __B)
+{
+ return (__m64) ((__v2di)__B)[0];
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_movpi64_epi64 (__m64 __A)
+{
+ return _mm_set_epi64 ((__m64)0LL, __A);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_move_epi64 (__m128i __A)
+{
+ return _mm_set_epi64 ((__m64)0LL, (__m64)__A[0]);
+}
+
+/* Create an undefined vector. */
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_undefined_si128 (void)
+{
+ __m128i __Y = __Y;
+ return __Y;
+}
+
+/* Create a vector of zeros. */
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_setzero_si128 (void)
+{
+ return __extension__ (__m128i)(__v4si){ 0, 0, 0, 0 };
+}
+
+#ifdef _ARCH_PWR8
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtepi32_pd (__m128i __A)
+{
+ __v2di val;
+ /* For LE we need Vector Unpack Low Signed Word, which is
+ what vec_unpackh generates here. */
+ val = (__v2di)vec_unpackh ((__v4si)__A);
+
+ return (__m128d)vec_ctf (val, 0);
+}
+#endif
+
+extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtepi32_ps (__m128i __A)
+{
+ return ((__m128)vec_ctf((__v4si)__A, 0));
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtpd_epi32 (__m128d __A)
+{
+ __v2df rounded = vec_rint (__A);
+ __v4si result, temp;
+ const __v4si vzero =
+ { 0, 0, 0, 0 };
+
+ /* VSX Vector truncate Double-Precision to integer and Convert to
+ Signed Integer Word format with Saturate. */
+ __asm__(
+ "xvcvdpsxws %x0,%x1"
+ : "=wa" (temp)
+ : "wa" (rounded)
+ : );
+
+#ifdef _ARCH_PWR8
+ temp = vec_mergeo (temp, temp);
+ result = (__v4si)vec_vpkudum ((vector long)temp, (vector long)vzero);
+#else
+ {
+ const __v16qu pkperm = {0x00, 0x01, 0x02, 0x03, 0x08, 0x09, 0x0a, 0x0b,
+ 0x14, 0x15, 0x16, 0x17, 0x1c, 0x1d, 0x1e, 0x1f };
+ result = (__v4si) vec_perm ((__v16qu) temp, (__v16qu) vzero, pkperm);
+ }
+#endif
+ return (__m128i) result;
+}
+
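A sketch contrasting the rounding convert above with the truncating _mm_cvttpd_epi32 defined below, assuming the default round-to-nearest-even mode and _mm_set_pd from earlier in the header:

#include <emmintrin.h>
#include <stdio.h>

int main (void)
{
  __m128d v  = _mm_set_pd (-1.5, 2.5);      /* element 0 = 2.5, element 1 = -1.5 */
  __m128i r  = _mm_cvtpd_epi32 (v);         /* rounds: {2, -2} */
  __m128i rt = _mm_cvttpd_epi32 (v);        /* truncates: {2, -1} */
  printf ("%d %d  %d %d\n",
          _mm_cvtsi128_si32 (r),  _mm_cvtsi128_si32 (_mm_srli_si128 (r, 4)),
          _mm_cvtsi128_si32 (rt), _mm_cvtsi128_si32 (_mm_srli_si128 (rt, 4)));
  return 0;
}
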
+extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtpd_pi32 (__m128d __A)
+{
+ __m128i result = _mm_cvtpd_epi32(__A);
+
+ return (__m64) result[0];
+}
+
+extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtpd_ps (__m128d __A)
+{
+ __v4sf result;
+ __v4si temp;
+ const __v4si vzero = { 0, 0, 0, 0 };
+
+ __asm__(
+ "xvcvdpsp %x0,%x1"
+ : "=wa" (temp)
+ : "wa" (__A)
+ : );
+
+#ifdef _ARCH_PWR8
+ temp = vec_mergeo (temp, temp);
+ result = (__v4sf)vec_vpkudum ((vector long)temp, (vector long)vzero);
+#else
+ {
+ const __v16qu pkperm = {0x00, 0x01, 0x02, 0x03, 0x08, 0x09, 0x0a, 0x0b,
+ 0x14, 0x15, 0x16, 0x17, 0x1c, 0x1d, 0x1e, 0x1f };
+ result = (__v4sf) vec_perm ((__v16qu) temp, (__v16qu) vzero, pkperm);
+ }
+#endif
+ return ((__m128)result);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvttpd_epi32 (__m128d __A)
+{
+ __v4si result;
+ __v4si temp;
+ const __v4si vzero = { 0, 0, 0, 0 };
+
+ /* VSX Vector truncate Double-Precision to integer and Convert to
+ Signed Integer Word format with Saturate. */
+ __asm__(
+ "xvcvdpsxws %x0,%x1"
+ : "=wa" (temp)
+ : "wa" (__A)
+ : );
+
+#ifdef _ARCH_PWR8
+ temp = vec_mergeo (temp, temp);
+ result = (__v4si)vec_vpkudum ((vector long)temp, (vector long)vzero);
+#else
+ {
+ const __v16qu pkperm = {0x00, 0x01, 0x02, 0x03, 0x08, 0x09, 0x0a, 0x0b,
+ 0x14, 0x15, 0x16, 0x17, 0x1c, 0x1d, 0x1e, 0x1f };
+ result = (__v4si) vec_perm ((__v16qu) temp, (__v16qu) vzero, pkperm);
+ }
+#endif
+
+ return ((__m128i) result);
+}
+
+extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvttpd_pi32 (__m128d __A)
+{
+ __m128i result = _mm_cvttpd_epi32 (__A);
+
+ return (__m64) result[0];
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsi128_si32 (__m128i __A)
+{
+ return ((__v4si)__A)[0];
+}
+
+#ifdef _ARCH_PWR8
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtpi32_pd (__m64 __A)
+{
+ __v4si temp;
+ __v2di tmp2;
+ __v2df result;
+
+ temp = (__v4si)vec_splats (__A);
+ tmp2 = (__v2di)vec_unpackl (temp);
+ result = vec_ctf ((__vector signed long)tmp2, 0);
+ return (__m128d)result;
+}
+#endif
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtps_epi32 (__m128 __A)
+{
+ __v4sf rounded;
+ __v4si result;
+
+ rounded = vec_rint((__v4sf) __A);
+ result = vec_cts (rounded, 0);
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvttps_epi32 (__m128 __A)
+{
+ __v4si result;
+
+ result = vec_cts ((__v4sf) __A, 0);
+ return (__m128i) result;
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtps_pd (__m128 __A)
+{
+ /* Check if vec_doubleh is defined by <altivec.h>. If so use that. */
+#ifdef vec_doubleh
+ return (__m128d) vec_doubleh ((__v4sf)__A);
+#else
+ /* Otherwise the compiler is not current, so we need to generate the
+ equivalent code. */
+ __v4sf a = (__v4sf)__A;
+ __v4sf temp;
+ __v2df result;
+#ifdef __LITTLE_ENDIAN__
+ /* The input float values are in elements {[0], [1]} but the convert
+ instruction needs them in elements {[1], [3]}, so we use two
+ shift left double vector word immediates to get the elements
+ lined up. */
+ temp = __builtin_vsx_xxsldwi (a, a, 3);
+ temp = __builtin_vsx_xxsldwi (a, temp, 2);
+#elif __BIG_ENDIAN__
+ /* The input float values are in elements {[0], [1]} but the convert
+ instruction needs them in elements {[0], [2]}, so we use a
+ merge-high word operation to get the elements lined up. */
+ temp = vec_vmrghw (a, a);
+#endif
+ __asm__(
+ " xvcvspdp %x0,%x1"
+ : "=wa" (result)
+ : "wa" (temp)
+ : );
+ return (__m128d) result;
+#endif
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsd_si32 (__m128d __A)
+{
+ __v2df rounded = vec_rint((__v2df) __A);
+ int result = ((__v2df)rounded)[0];
+
+ return result;
+}
+/* Intel intrinsic. */
+extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsd_si64 (__m128d __A)
+{
+ __v2df rounded = vec_rint ((__v2df) __A );
+ long long result = ((__v2df) rounded)[0];
+
+ return result;
+}
+
+/* Microsoft intrinsic. */
+extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsd_si64x (__m128d __A)
+{
+ return _mm_cvtsd_si64 ((__v2df)__A);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvttsd_si32 (__m128d __A)
+{
+ int result = ((__v2df)__A)[0];
+
+ return result;
+}
+
+/* Intel intrinsic. */
+extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvttsd_si64 (__m128d __A)
+{
+ long long result = ((__v2df)__A)[0];
+
+ return result;
+}
+
+/* Microsoft intrinsic. */
+extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvttsd_si64x (__m128d __A)
+{
+ return _mm_cvttsd_si64 (__A);
+}
+
+extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsd_ss (__m128 __A, __m128d __B)
+{
+ __v4sf result = (__v4sf)__A;
+
+#ifdef __LITTLE_ENDIAN__
+ __v4sf temp_s;
+ /* Copy double element[0] to element [1] for conversion. */
+ __v2df temp_b = vec_splat((__v2df)__B, 0);
+
+ /* Pre-rotate __A left 3 (logically right 1) elements. */
+ result = __builtin_vsx_xxsldwi (result, result, 3);
+ /* Convert double to single float scalar in a vector. */
+ __asm__(
+ "xscvdpsp %x0,%x1"
+ : "=wa" (temp_s)
+ : "wa" (temp_b)
+ : );
+ /* Shift the resulting scalar into vector element [0]. */
+ result = __builtin_vsx_xxsldwi (result, temp_s, 1);
+#else
+ result [0] = ((__v2df)__B)[0];
+#endif
+ return (__m128) result;
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsi32_sd (__m128d __A, int __B)
+{
+ __v2df result = (__v2df)__A;
+ double db = __B;
+ result [0] = db;
+ return (__m128d)result;
+}
+
+/* Intel intrinsic. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsi64_sd (__m128d __A, long long __B)
+{
+ __v2df result = (__v2df)__A;
+ double db = __B;
+ result [0] = db;
+ return (__m128d)result;
+}
+
+/* Microsoft intrinsic. */
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsi64x_sd (__m128d __A, long long __B)
+{
+ return _mm_cvtsi64_sd (__A, __B);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtss_sd (__m128d __A, __m128 __B)
+{
+#ifdef __LITTLE_ENDIAN__
+ /* Use splat to move element [0] into position for the convert. */
+ __v4sf temp = vec_splat ((__v4sf)__B, 0);
+ __v2df res;
+ /* Convert single float scalar to double in a vector. */
+ __asm__(
+ "xscvspdp %x0,%x1"
+ : "=wa" (res)
+ : "wa" (temp)
+ : );
+ return (__m128d) vec_mergel (res, (__v2df)__A);
+#else
+ __v2df res = (__v2df)__A;
+ res [0] = ((__v4sf)__B) [0];
+ return (__m128d) res;
+#endif
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_shuffle_pd(__m128d __A, __m128d __B, const int __mask)
+{
+ __vector double result;
+ const int litmsk = __mask & 0x3;
+
+ if (litmsk == 0)
+ result = vec_mergeh (__A, __B);
+#if __GNUC__ < 6
+ else if (litmsk == 1)
+ result = vec_xxpermdi (__B, __A, 2);
+ else if (litmsk == 2)
+ result = vec_xxpermdi (__B, __A, 1);
+#else
+ else if (litmsk == 1)
+ result = vec_xxpermdi (__A, __B, 2);
+ else if (litmsk == 2)
+ result = vec_xxpermdi (__A, __B, 1);
+#endif
+ else
+ result = vec_mergel (__A, __B);
+
+ return result;
+}
+
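As the litmsk cases above suggest, bit 0 of the mask picks which element of __A becomes result element 0 and bit 1 picks which element of __B becomes result element 1. A small sketch of that mapping, under the same assumptions as the earlier examples:

#include <emmintrin.h>
#include <stdio.h>

int main (void)
{
  __m128d a = _mm_set_pd (2.0, 1.0);        /* a = {1.0, 2.0} */
  __m128d b = _mm_set_pd (4.0, 3.0);        /* b = {3.0, 4.0} */
  __m128d r = _mm_shuffle_pd (a, b, 0x1);   /* take a[1] and b[0] */
  printf ("%.1f %.1f\n", r[0], r[1]);       /* expected: 2.0 3.0 */
  return 0;
}
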
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_unpackhi_pd (__m128d __A, __m128d __B)
+{
+ return (__m128d) vec_mergel ((__v2df)__A, (__v2df)__B);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_unpacklo_pd (__m128d __A, __m128d __B)
+{
+ return (__m128d) vec_mergeh ((__v2df)__A, (__v2df)__B);
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_loadh_pd (__m128d __A, double const *__B)
+{
+ __v2df result = (__v2df)__A;
+ result [1] = *__B;
+ return (__m128d)result;
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_loadl_pd (__m128d __A, double const *__B)
+{
+ __v2df result = (__v2df)__A;
+ result [0] = *__B;
+ return (__m128d)result;
+}
+
+#ifdef _ARCH_PWR8
+/* Intrinsic functions that require PowerISA 2.07 minimum. */
+
+/* Creates a 2-bit mask from the most significant bits of the DPFP values. */
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_movemask_pd (__m128d __A)
+{
+ __vector __m64 result;
+ static const __vector unsigned int perm_mask =
+ {
+#ifdef __LITTLE_ENDIAN__
+ 0x80800040, 0x80808080, 0x80808080, 0x80808080
+#elif __BIG_ENDIAN__
+ 0x80808080, 0x80808080, 0x80808080, 0x80800040
+#endif
+ };
+
+ result = (__vector __m64) vec_vbpermq ((__vector unsigned char) __A,
+ (__vector unsigned char) perm_mask);
+
+#ifdef __LITTLE_ENDIAN__
+ return result[1];
+#elif __BIG_ENDIAN__
+ return result[0];
+#endif
+}
+#endif /* _ARCH_PWR8 */
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_packs_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_packs ((__v8hi) __A, (__v8hi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_packs_epi32 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_packs ((__v4si)__A, (__v4si)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_packus_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_packsu ((__v8hi) __A, (__v8hi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_unpackhi_epi8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_mergel ((__v16qu)__A, (__v16qu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_unpackhi_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_mergel ((__v8hu)__A, (__v8hu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_unpackhi_epi32 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_mergel ((__v4su)__A, (__v4su)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_unpackhi_epi64 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_mergel ((__vector long)__A, (__vector long)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_unpacklo_epi8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_mergeh ((__v16qu)__A, (__v16qu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_unpacklo_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_mergeh ((__v8hi)__A, (__v8hi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_unpacklo_epi32 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_mergeh ((__v4si)__A, (__v4si)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_unpacklo_epi64 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_mergeh ((__vector long)__A, (__vector long)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_add_epi8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) ((__v16qu)__A + (__v16qu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_add_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) ((__v8hu)__A + (__v8hu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_add_epi32 (__m128i __A, __m128i __B)
+{
+ return (__m128i) ((__v4su)__A + (__v4su)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_add_epi64 (__m128i __A, __m128i __B)
+{
+ return (__m128i) ((__v2du)__A + (__v2du)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_adds_epi8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_adds ((__v16qi)__A, (__v16qi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_adds_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_adds ((__v8hi)__A, (__v8hi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_adds_epu8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_adds ((__v16qu)__A, (__v16qu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_adds_epu16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_adds ((__v8hu)__A, (__v8hu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sub_epi8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) ((__v16qu)__A - (__v16qu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sub_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) ((__v8hu)__A - (__v8hu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sub_epi32 (__m128i __A, __m128i __B)
+{
+ return (__m128i) ((__v4su)__A - (__v4su)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sub_epi64 (__m128i __A, __m128i __B)
+{
+ return (__m128i) ((__v2du)__A - (__v2du)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_subs_epi8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_subs ((__v16qi)__A, (__v16qi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_subs_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_subs ((__v8hi)__A, (__v8hi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_subs_epu8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_subs ((__v16qu)__A, (__v16qu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_subs_epu16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_subs ((__v8hu)__A, (__v8hu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_madd_epi16 (__m128i __A, __m128i __B)
+{
+ __vector signed int zero = {0, 0, 0, 0};
+
+ return (__m128i) vec_vmsumshm ((__v8hi)__A, (__v8hi)__B, zero);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mulhi_epi16 (__m128i __A, __m128i __B)
+{
+ __vector signed int w0, w1;
+
+ __vector unsigned char xform1 = {
+#ifdef __LITTLE_ENDIAN__
+ 0x02, 0x03, 0x12, 0x13, 0x06, 0x07, 0x16, 0x17,
+ 0x0A, 0x0B, 0x1A, 0x1B, 0x0E, 0x0F, 0x1E, 0x1F
+#elif __BIG_ENDIAN__
+ 0x00, 0x01, 0x10, 0x11, 0x04, 0x05, 0x14, 0x15,
+ 0x08, 0x09, 0x18, 0x19, 0x0C, 0x0D, 0x1C, 0x1D
+#endif
+ };
+
+ w0 = vec_vmulesh ((__v8hi)__A, (__v8hi)__B);
+ w1 = vec_vmulosh ((__v8hi)__A, (__v8hi)__B);
+ return (__m128i) vec_perm (w0, w1, xform1);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mullo_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) ((__v8hi)__A * (__v8hi)__B);
+}
+
+extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mul_su32 (__m64 __A, __m64 __B)
+{
+ unsigned int a = __A;
+ unsigned int b = __B;
+
+ return ((__m64)a * (__m64)b);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mul_epu32 (__m128i __A, __m128i __B)
+{
+#if __GNUC__ < 8
+ __v2du result;
+
+#ifdef __LITTLE_ENDIAN__
+ /* VMX Vector Multiply Odd Unsigned Word. */
+ __asm__(
+ "vmulouw %0,%1,%2"
+ : "=v" (result)
+ : "v" (__A), "v" (__B)
+ : );
+#elif __BIG_ENDIAN__
+ /* VMX Vector Multiply Even Unsigned Word. */
+ __asm__(
+ "vmuleuw %0,%1,%2"
+ : "=v" (result)
+ : "v" (__A), "v" (__B)
+ : );
+#endif
+ return (__m128i) result;
+#else
+#ifdef __LITTLE_ENDIAN__
+ return (__m128i) vec_mule ((__v4su)__A, (__v4su)__B);
+#elif __BIG_ENDIAN__
+ return (__m128i) vec_mulo ((__v4su)__A, (__v4su)__B);
+#endif
+#endif
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_slli_epi16 (__m128i __A, int __B)
+{
+ __v8hu lshift;
+ __v8hi result = { 0, 0, 0, 0, 0, 0, 0, 0 };
+
+ if (__B < 16)
+ {
+ if (__builtin_constant_p(__B))
+ lshift = (__v8hu) vec_splat_s16(__B);
+ else
+ lshift = vec_splats ((unsigned short) __B);
+
+ result = vec_vslh ((__v8hi) __A, lshift);
+ }
+
+ return (__m128i) result;
+}
+
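Note that, as with the SSE2 instruction, a shift count of 16 or more falls through to the zero-initialized result above. A brief sketch:

#include <emmintrin.h>
#include <stdio.h>

int main (void)
{
  __m128i v  = _mm_set1_epi16 (0x0101);
  __m128i r1 = _mm_slli_epi16 (v, 4);       /* each lane becomes 0x1010 */
  __m128i r2 = _mm_slli_epi16 (v, 16);      /* count >= 16 gives all zeros */
  printf ("0x%04x 0x%04x\n",
          _mm_extract_epi16 (r1, 0) & 0xffff,
          _mm_extract_epi16 (r2, 0) & 0xffff);
  return 0;
}
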
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_slli_epi32 (__m128i __A, int __B)
+{
+ __v4su lshift;
+ __v4si result = { 0, 0, 0, 0 };
+
+ if (__B < 32)
+ {
+ if (__builtin_constant_p(__B))
+ lshift = (__v4su) vec_splat_s32(__B);
+ else
+ lshift = vec_splats ((unsigned int) __B);
+
+ result = vec_vslw ((__v4si) __A, lshift);
+ }
+
+ return (__m128i) result;
+}
+
+#ifdef _ARCH_PWR8
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_slli_epi64 (__m128i __A, int __B)
+{
+ __v2du lshift;
+ __v2di result = { 0, 0 };
+
+ if (__B < 64)
+ {
+ if (__builtin_constant_p(__B))
+ {
+ if (__B < 32)
+ lshift = (__v2du) vec_splat_s32(__B);
+ else
+ lshift = (__v2du) vec_splats((unsigned long long)__B);
+ }
+ else
+ lshift = (__v2du) vec_splats ((unsigned int) __B);
+
+ result = vec_vsld ((__v2di) __A, lshift);
+ }
+
+ return (__m128i) result;
+}
+#endif
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_srai_epi16 (__m128i __A, int __B)
+{
+ __v8hu rshift = { 15, 15, 15, 15, 15, 15, 15, 15 };
+ __v8hi result;
+
+ if (__B < 16)
+ {
+ if (__builtin_constant_p(__B))
+ rshift = (__v8hu) vec_splat_s16(__B);
+ else
+ rshift = vec_splats ((unsigned short) __B);
+ }
+ result = vec_vsrah ((__v8hi) __A, rshift);
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_srai_epi32 (__m128i __A, int __B)
+{
+ __v4su rshift = { 31, 31, 31, 31 };
+ __v4si result;
+
+ if (__B < 32)
+ {
+ if (__builtin_constant_p(__B))
+ {
+ if (__B < 16)
+ rshift = (__v4su) vec_splat_s32(__B);
+ else
+ rshift = (__v4su) vec_splats((unsigned int)__B);
+ }
+ else
+ rshift = vec_splats ((unsigned int) __B);
+ }
+ result = vec_vsraw ((__v4si) __A, rshift);
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_bslli_si128 (__m128i __A, const int __N)
+{
+ __v16qu result;
+ const __v16qu zeros = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+
+ if (__N < 16)
+ result = vec_sld ((__v16qu) __A, zeros, __N);
+ else
+ result = zeros;
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_bsrli_si128 (__m128i __A, const int __N)
+{
+ __v16qu result;
+ const __v16qu zeros = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+
+ if (__N < 16)
+ if (__builtin_constant_p(__N))
+ /* Would like to use Vector Shift Left Double by Octet
+ Immediate here to use the immediate form and avoid
+ load of __N * 8 value into a separate VR. */
+ result = vec_sld (zeros, (__v16qu) __A, (16 - __N));
+ else
+ {
+ __v16qu shift = vec_splats((unsigned char)(__N*8));
+ result = vec_sro ((__v16qu)__A, shift);
+ }
+ else
+ result = zeros;
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_srli_si128 (__m128i __A, const int __N)
+{
+ return _mm_bsrli_si128 (__A, __N);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_slli_si128 (__m128i __A, const int _imm5)
+{
+ __v16qu result;
+ const __v16qu zeros = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+
+ if (_imm5 < 16)
+#ifdef __LITTLE_ENDIAN__
+ result = vec_sld ((__v16qu) __A, zeros, _imm5);
+#elif __BIG_ENDIAN__
+ result = vec_sld (zeros, (__v16qu) __A, (16 - _imm5));
+#endif
+ else
+ result = zeros;
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_srli_epi16 (__m128i __A, int __B)
+{
+ __v8hu rshift;
+ __v8hi result = { 0, 0, 0, 0, 0, 0, 0, 0 };
+
+ if (__B < 16)
+ {
+ if (__builtin_constant_p(__B))
+ rshift = (__v8hu) vec_splat_s16(__B);
+ else
+ rshift = vec_splats ((unsigned short) __B);
+
+ result = vec_vsrh ((__v8hi) __A, rshift);
+ }
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_srli_epi32 (__m128i __A, int __B)
+{
+ __v4su rshift;
+ __v4si result = { 0, 0, 0, 0 };
+
+ if (__B < 32)
+ {
+ if (__builtin_constant_p(__B))
+ {
+ if (__B < 16)
+ rshift = (__v4su) vec_splat_s32(__B);
+ else
+ rshift = (__v4su) vec_splats((unsigned int)__B);
+ }
+ else
+ rshift = vec_splats ((unsigned int) __B);
+
+ result = vec_vsrw ((__v4si) __A, rshift);
+ }
+
+ return (__m128i) result;
+}
+
+#ifdef _ARCH_PWR8
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_srli_epi64 (__m128i __A, int __B)
+{
+ __v2du rshift;
+ __v2di result = { 0, 0 };
+
+ if (__B < 64)
+ {
+ if (__builtin_constant_p(__B))
+ {
+ if (__B < 16)
+ rshift = (__v2du) vec_splat_s32(__B);
+ else
+ rshift = (__v2du) vec_splats((unsigned long long)__B);
+ }
+ else
+ rshift = (__v2du) vec_splats ((unsigned int) __B);
+
+ result = vec_vsrd ((__v2di) __A, rshift);
+ }
+
+ return (__m128i) result;
+}
+#endif
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sll_epi16 (__m128i __A, __m128i __B)
+{
+ __v8hu lshift, shmask;
+ const __v8hu shmax = { 15, 15, 15, 15, 15, 15, 15, 15 };
+ __v8hu result;
+
+#ifdef __LITTLE_ENDIAN__
+ lshift = vec_splat ((__v8hu)__B, 0);
+#elif __BIG_ENDIAN__
+ lshift = vec_splat ((__v8hu)__B, 3);
+#endif
+ shmask = lshift <= shmax;
+ result = vec_vslh ((__v8hu) __A, lshift);
+ result = vec_sel (shmask, result, shmask);
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sll_epi32 (__m128i __A, __m128i __B)
+{
+ __v4su lshift, shmask;
+ const __v4su shmax = { 32, 32, 32, 32 };
+ __v4su result;
+#ifdef __LITTLE_ENDIAN__
+ lshift = vec_splat ((__v4su)__B, 0);
+#elif __BIG_ENDIAN__
+ lshift = vec_splat ((__v4su)__B, 1);
+#endif
+ shmask = lshift < shmax;
+ result = vec_vslw ((__v4su) __A, lshift);
+ result = vec_sel (shmask, result, shmask);
+
+ return (__m128i) result;
+}
+
+#ifdef _ARCH_PWR8
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sll_epi64 (__m128i __A, __m128i __B)
+{
+ __v2du lshift, shmask;
+ const __v2du shmax = { 64, 64 };
+ __v2du result;
+
+ lshift = (__v2du) vec_splat ((__v2du)__B, 0);
+ shmask = lshift < shmax;
+ result = vec_vsld ((__v2du) __A, lshift);
+ result = (__v2du) vec_sel ((__v2df) shmask, (__v2df) result,
+ (__v2df) shmask);
+
+ return (__m128i) result;
+}
+#endif
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sra_epi16 (__m128i __A, __m128i __B)
+{
+ const __v8hu rshmax = { 15, 15, 15, 15, 15, 15, 15, 15 };
+ __v8hu rshift;
+ __v8hi result;
+
+#ifdef __LITTLE_ENDIAN__
+ rshift = vec_splat ((__v8hu)__B, 0);
+#elif __BIG_ENDIAN__
+ rshift = vec_splat ((__v8hu)__B, 3);
+#endif
+ rshift = vec_min (rshift, rshmax);
+ result = vec_vsrah ((__v8hi) __A, rshift);
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sra_epi32 (__m128i __A, __m128i __B)
+{
+ const __v4su rshmax = { 31, 31, 31, 31 };
+ __v4su rshift;
+ __v4si result;
+
+#ifdef __LITTLE_ENDIAN__
+ rshift = vec_splat ((__v4su)__B, 0);
+#elif __BIG_ENDIAN__
+ rshift = vec_splat ((__v4su)__B, 1);
+#endif
+ rshift = vec_min (rshift, rshmax);
+ result = vec_vsraw ((__v4si) __A, rshift);
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_srl_epi16 (__m128i __A, __m128i __B)
+{
+ __v8hu rshift, shmask;
+ const __v8hu shmax = { 15, 15, 15, 15, 15, 15, 15, 15 };
+ __v8hu result;
+
+#ifdef __LITTLE_ENDIAN__
+ rshift = vec_splat ((__v8hu)__B, 0);
+#elif __BIG_ENDIAN__
+ rshift = vec_splat ((__v8hu)__B, 3);
+#endif
+ shmask = rshift <= shmax;
+ result = vec_vsrh ((__v8hu) __A, rshift);
+ result = vec_sel (shmask, result, shmask);
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_srl_epi32 (__m128i __A, __m128i __B)
+{
+ __v4su rshift, shmask;
+ const __v4su shmax = { 32, 32, 32, 32 };
+ __v4su result;
+
+#ifdef __LITTLE_ENDIAN__
+ rshift = vec_splat ((__v4su)__B, 0);
+#elif __BIG_ENDIAN__
+ rshift = vec_splat ((__v4su)__B, 1);
+#endif
+ shmask = rshift < shmax;
+ result = vec_vsrw ((__v4su) __A, rshift);
+ result = vec_sel (shmask, result, shmask);
+
+ return (__m128i) result;
+}
+
+#ifdef _ARCH_PWR8
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_srl_epi64 (__m128i __A, __m128i __B)
+{
+ __v2du rshift, shmask;
+ const __v2du shmax = { 64, 64 };
+ __v2du result;
+
+ rshift = (__v2du) vec_splat ((__v2du)__B, 0);
+ shmask = rshift < shmax;
+ result = vec_vsrd ((__v2du) __A, rshift);
+ result = (__v2du)vec_sel ((__v2du)shmask, (__v2du)result, (__v2du)shmask);
+
+ return (__m128i) result;
+}
+#endif
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_and_pd (__m128d __A, __m128d __B)
+{
+ return (vec_and ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_andnot_pd (__m128d __A, __m128d __B)
+{
+ return (vec_andc ((__v2df) __B, (__v2df) __A));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_or_pd (__m128d __A, __m128d __B)
+{
+ return (vec_or ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_xor_pd (__m128d __A, __m128d __B)
+{
+ return (vec_xor ((__v2df) __A, (__v2df) __B));
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpeq_epi8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_cmpeq ((__v16qi) __A, (__v16qi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpeq_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_cmpeq ((__v8hi) __A, (__v8hi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpeq_epi32 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_cmpeq ((__v4si) __A, (__v4si)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmplt_epi8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_cmplt ((__v16qi) __A, (__v16qi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmplt_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_cmplt ((__v8hi) __A, (__v8hi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmplt_epi32 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_cmplt ((__v4si) __A, (__v4si)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpgt_epi8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_cmpgt ((__v16qi) __A, (__v16qi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpgt_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_cmpgt ((__v8hi) __A, (__v8hi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpgt_epi32 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_cmpgt ((__v4si) __A, (__v4si)__B);
+}
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_extract_epi16 (__m128i const __A, int const __N)
+{
+ return (unsigned short) ((__v8hi)__A)[__N & 7];
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_insert_epi16 (__m128i const __A, int const __D, int const __N)
+{
+ __v8hi result = (__v8hi)__A;
+
+ result [(__N & 7)] = __D;
+
+ return (__m128i) result;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_max_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_max ((__v8hi)__A, (__v8hi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_max_epu8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_max ((__v16qu) __A, (__v16qu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_min_epi16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_min ((__v8hi) __A, (__v8hi)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_min_epu8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_min ((__v16qu) __A, (__v16qu)__B);
+}
+
+
+#ifdef _ARCH_PWR8
+/* Intrinsic functions that require PowerISA 2.07 minimum. */
+
+/* Creates a 16-bit mask from the most significant bits of the 16 bytes. */
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_movemask_epi8 (__m128i __A)
+{
+ __vector __m64 result;
+ static const __vector unsigned char perm_mask =
+ {
+#ifdef __LITTLE_ENDIAN__
+ 0x78, 0x70, 0x68, 0x60, 0x58, 0x50, 0x48, 0x40,
+ 0x38, 0x30, 0x28, 0x20, 0x18, 0x10, 0x08, 0x00
+#elif __BIG_ENDIAN__
+ 0x00, 0x08, 0x10, 0x18, 0x20, 0x28, 0x30, 0x38,
+ 0x40, 0x48, 0x50, 0x58, 0x60, 0x68, 0x70, 0x78
+#endif
+ };
+
+ result = (__vector __m64) vec_vbpermq ((__vector unsigned char) __A,
+ (__vector unsigned char) perm_mask);
+
+#ifdef __LITTLE_ENDIAN__
+ return result[1];
+#elif __BIG_ENDIAN__
+ return result[0];
+#endif
+}
+#endif /* _ARCH_PWR8 */
+
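The vec_vbpermq permute above gathers the sixteen byte sign bits into one halfword in the x86 bit order (byte 0's sign bit in result bit 0). A sketch, assuming a POWER8 (_ARCH_PWR8) build since the wrapper is only defined there:

#include <emmintrin.h>
#include <stdio.h>

int main (void)
{
  /* Sign bit set in bytes 0 and 15 only (the first argument is byte 15).  */
  __m128i v = _mm_set_epi8 (-1, 0, 0, 0, 0, 0, 0, 0,
                            0, 0, 0, 0, 0, 0, 0, -1);
  printf ("0x%04x\n", _mm_movemask_epi8 (v));   /* expected: 0x8001 */
  return 0;
}
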
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mulhi_epu16 (__m128i __A, __m128i __B)
+{
+ __v4su w0, w1;
+ __v16qu xform1 = {
+#ifdef __LITTLE_ENDIAN__
+ 0x02, 0x03, 0x12, 0x13, 0x06, 0x07, 0x16, 0x17,
+ 0x0A, 0x0B, 0x1A, 0x1B, 0x0E, 0x0F, 0x1E, 0x1F
+#elif __BIG_ENDIAN__
+ 0x00, 0x01, 0x10, 0x11, 0x04, 0x05, 0x14, 0x15,
+ 0x08, 0x09, 0x18, 0x19, 0x0C, 0x0D, 0x1C, 0x1D
+#endif
+ };
+
+ w0 = vec_vmuleuh ((__v8hu)__A, (__v8hu)__B);
+ w1 = vec_vmulouh ((__v8hu)__A, (__v8hu)__B);
+ return (__m128i) vec_perm (w0, w1, xform1);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_shufflehi_epi16 (__m128i __A, const int __mask)
+{
+ unsigned long element_selector_98 = __mask & 0x03;
+ unsigned long element_selector_BA = (__mask >> 2) & 0x03;
+ unsigned long element_selector_DC = (__mask >> 4) & 0x03;
+ unsigned long element_selector_FE = (__mask >> 6) & 0x03;
+ static const unsigned short permute_selectors[4] =
+ {
+#ifdef __LITTLE_ENDIAN__
+ 0x0908, 0x0B0A, 0x0D0C, 0x0F0E
+#elif __BIG_ENDIAN__
+ 0x0607, 0x0405, 0x0203, 0x0001
+#endif
+ };
+ __v2du pmask =
+#ifdef __LITTLE_ENDIAN__
+ { 0x1716151413121110UL, 0x1f1e1d1c1b1a1918UL};
+#elif __BIG_ENDIAN__
+ { 0x1011121314151617UL, 0x18191a1b1c1d1e1fUL};
+#endif
+ __m64_union t;
+ __v2du a, r;
+
+#ifdef __LITTLE_ENDIAN__
+ t.as_short[0] = permute_selectors[element_selector_98];
+ t.as_short[1] = permute_selectors[element_selector_BA];
+ t.as_short[2] = permute_selectors[element_selector_DC];
+ t.as_short[3] = permute_selectors[element_selector_FE];
+#elif __BIG_ENDIAN__
+ t.as_short[3] = permute_selectors[element_selector_98];
+ t.as_short[2] = permute_selectors[element_selector_BA];
+ t.as_short[1] = permute_selectors[element_selector_DC];
+ t.as_short[0] = permute_selectors[element_selector_FE];
+#endif
+#ifdef __LITTLE_ENDIAN__
+ pmask[1] = t.as_m64;
+#elif __BIG_ENDIAN__
+ pmask[0] = t.as_m64;
+#endif
+ a = (__v2du)__A;
+ r = vec_perm (a, a, (__vector unsigned char)pmask);
+ return (__m128i) r;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_shufflelo_epi16 (__m128i __A, const int __mask)
+{
+ unsigned long element_selector_10 = __mask & 0x03;
+ unsigned long element_selector_32 = (__mask >> 2) & 0x03;
+ unsigned long element_selector_54 = (__mask >> 4) & 0x03;
+ unsigned long element_selector_76 = (__mask >> 6) & 0x03;
+ static const unsigned short permute_selectors[4] =
+ {
+#ifdef __LITTLE_ENDIAN__
+ 0x0100, 0x0302, 0x0504, 0x0706
+#elif __BIG_ENDIAN__
+ 0x0e0f, 0x0c0d, 0x0a0b, 0x0809
+#endif
+ };
+ __v2du pmask = { 0x1011121314151617UL, 0x1f1e1d1c1b1a1918UL};
+ __m64_union t;
+ __v2du a, r;
+
+#ifdef __LITTLE_ENDIAN__
+ t.as_short[0] = permute_selectors[element_selector_10];
+ t.as_short[1] = permute_selectors[element_selector_32];
+ t.as_short[2] = permute_selectors[element_selector_54];
+ t.as_short[3] = permute_selectors[element_selector_76];
+#elif __BIG_ENDIAN__
+ t.as_short[3] = permute_selectors[element_selector_10];
+ t.as_short[2] = permute_selectors[element_selector_32];
+ t.as_short[1] = permute_selectors[element_selector_54];
+ t.as_short[0] = permute_selectors[element_selector_76];
+#endif
+#ifdef __LITTLE_ENDIAN__
+ pmask[0] = t.as_m64;
+#elif __BIG_ENDIAN__
+ pmask[1] = t.as_m64;
+#endif
+ a = (__v2du)__A;
+ r = vec_perm (a, a, (__vector unsigned char)pmask);
+ return (__m128i) r;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_shuffle_epi32 (__m128i __A, const int __mask)
+{
+ unsigned long element_selector_10 = __mask & 0x03;
+ unsigned long element_selector_32 = (__mask >> 2) & 0x03;
+ unsigned long element_selector_54 = (__mask >> 4) & 0x03;
+ unsigned long element_selector_76 = (__mask >> 6) & 0x03;
+ static const unsigned int permute_selectors[4] =
+ {
+#ifdef __LITTLE_ENDIAN__
+ 0x03020100, 0x07060504, 0x0B0A0908, 0x0F0E0D0C
+#elif __BIG_ENDIAN__
+ 0x0C0D0E0F, 0x08090A0B, 0x04050607, 0x00010203
+#endif
+ };
+ __v4su t;
+
+#ifdef __LITTLE_ENDIAN__
+ t[0] = permute_selectors[element_selector_10];
+ t[1] = permute_selectors[element_selector_32];
+ t[2] = permute_selectors[element_selector_54] + 0x10101010;
+ t[3] = permute_selectors[element_selector_76] + 0x10101010;
+#elif __BIG_ENDIAN__
+ t[3] = permute_selectors[element_selector_10] + 0x10101010;
+ t[2] = permute_selectors[element_selector_32] + 0x10101010;
+ t[1] = permute_selectors[element_selector_54];
+ t[0] = permute_selectors[element_selector_76];
+#endif
+ return (__m128i)vec_perm ((__v4si) __A, (__v4si)__A, (__vector unsigned char)t);
+}
+
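The permute_selectors table turns each 2-bit field of the mask into a 4-byte selector, so the usual SSE2 encoding applies unchanged. A sketch:

#include <emmintrin.h>
#include <stdio.h>

int main (void)
{
  __m128i v = _mm_set_epi32 (3, 2, 1, 0);    /* elements {0, 1, 2, 3} */
  __m128i r = _mm_shuffle_epi32 (v, 0x1B);   /* 0b00011011 reverses the elements */
  printf ("%d %d %d %d\n",
          _mm_cvtsi128_si32 (r),
          _mm_cvtsi128_si32 (_mm_srli_si128 (r, 4)),
          _mm_cvtsi128_si32 (_mm_srli_si128 (r, 8)),
          _mm_cvtsi128_si32 (_mm_srli_si128 (r, 12)));   /* expected: 3 2 1 0 */
  return 0;
}
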
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskmoveu_si128 (__m128i __A, __m128i __B, char *__C)
+{
+ __v2du hibit = { 0x7f7f7f7f7f7f7f7fUL, 0x7f7f7f7f7f7f7f7fUL};
+ __v16qu mask, tmp;
+ __m128i *p = (__m128i*)__C;
+
+ tmp = (__v16qu)_mm_loadu_si128(p);
+ mask = (__v16qu)vec_cmpgt ((__v16qu)__B, (__v16qu)hibit);
+ tmp = vec_sel (tmp, (__v16qu)__A, mask);
+ _mm_storeu_si128 (p, (__m128i)tmp);
+}
+
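The load/select/store sequence above emulates the byte-granular conditional store: only bytes whose mask byte has its most significant bit set are written. A sketch:

#include <emmintrin.h>
#include <stdio.h>
#include <string.h>

int main (void)
{
  char buf[16];
  memset (buf, '.', sizeof buf);
  __m128i data = _mm_set1_epi8 ('X');
  __m128i mask = _mm_set_epi64x (0, -1);   /* high bit set in the low 8 bytes only */
  _mm_maskmoveu_si128 (data, mask, buf);
  printf ("%.16s\n", buf);                 /* expected: XXXXXXXX........ */
  return 0;
}
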
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_avg_epu8 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_avg ((__v16qu)__A, (__v16qu)__B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_avg_epu16 (__m128i __A, __m128i __B)
+{
+ return (__m128i) vec_avg ((__v8hu)__A, (__v8hu)__B);
+}
+
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_sad_epu8 (__m128i __A, __m128i __B)
+{
+ __v16qu a, b;
+ __v16qu vmin, vmax, vabsdiff;
+ __v4si vsum;
+ const __v4su zero = { 0, 0, 0, 0 };
+ __v4si result;
+
+ a = (__v16qu) __A;
+ b = (__v16qu) __B;
+ vmin = vec_min (a, b);
+ vmax = vec_max (a, b);
+ vabsdiff = vec_sub (vmax, vmin);
+ /* Sum four groups of bytes into integers. */
+ vsum = (__vector signed int) vec_sum4s (vabsdiff, zero);
+ /* Sum across four integers with two integer results. */
+ result = vec_sum2s (vsum, (__vector signed int) zero);
+ /* Rotate the sums into the correct position. */
+#ifdef __LITTLE_ENDIAN__
+ result = vec_sld (result, result, 4);
+#elif __BIG_ENDIAN__
+ result = vec_sld (result, result, 6);
+#endif
+ return (__m128i) result;
+}
+
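The min/max/subtract sequence above computes |a - b| per byte, and the two 16-bit sums land in halfword elements 0 and 4, matching the x86 layout. A sketch:

#include <emmintrin.h>
#include <stdio.h>

int main (void)
{
  __m128i a = _mm_set1_epi8 (10);
  __m128i b = _mm_set1_epi8 (3);
  __m128i r = _mm_sad_epu8 (a, b);   /* each 8-byte half sums to 8 * 7 = 56 */
  printf ("%d %d\n",
          _mm_extract_epi16 (r, 0),
          _mm_extract_epi16 (r, 4));   /* expected: 56 56 */
  return 0;
}
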
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_stream_si32 (int *__A, int __B)
+{
+ /* Use the data cache block touch for store transient. */
+ __asm__ (
+ "dcbtstt 0,%0"
+ :
+ : "b" (__A)
+ : "memory"
+ );
+ *__A = __B;
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_stream_si64 (long long int *__A, long long int __B)
+{
+ /* Use the data cache block touch for store transient. */
+ __asm__ (
+ " dcbtstt 0,%0"
+ :
+ : "b" (__A)
+ : "memory"
+ );
+ *__A = __B;
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_stream_si128 (__m128i *__A, __m128i __B)
+{
+ /* Use the data cache block touch for store transient. */
+ __asm__ (
+ "dcbtstt 0,%0"
+ :
+ : "b" (__A)
+ : "memory"
+ );
+ *__A = __B;
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_stream_pd (double *__A, __m128d __B)
+{
+ /* Use the data cache block touch for store transient. */
+ __asm__ (
+ "dcbtstt 0,%0"
+ :
+ : "b" (__A)
+ : "memory"
+ );
+ *(__m128d*)__A = __B;
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_clflush (void const *__A)
+{
+ /* Use the data cache block flush. */
+ __asm__ (
+ "dcbf 0,%0"
+ :
+ : "b" (__A)
+ : "memory"
+ );
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_lfence (void)
+{
+ /* Use light weight sync for load to load ordering. */
+ __atomic_thread_fence (__ATOMIC_RELEASE);
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mfence (void)
+{
+ /* Use heavy weight sync for any to any ordering. */
+ __atomic_thread_fence (__ATOMIC_SEQ_CST);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsi32_si128 (int __A)
+{
+ return _mm_set_epi32 (0, 0, 0, __A);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsi64_si128 (long long __A)
+{
+ return __extension__ (__m128i)(__v2di){ __A, 0LL };
+}
+
+/* Microsoft intrinsic. */
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsi64x_si128 (long long __A)
+{
+ return __extension__ (__m128i)(__v2di){ __A, 0LL };
+}
+
+/* Casts between various SP, DP, INT vector types. Note that these do no
+ conversion of values; they just change the type. */
+extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_castpd_ps(__m128d __A)
+{
+ return (__m128) __A;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_castpd_si128(__m128d __A)
+{
+ return (__m128i) __A;
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_castps_pd(__m128 __A)
+{
+ return (__m128d) __A;
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_castps_si128(__m128 __A)
+{
+ return (__m128i) __A;
+}
+
+extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_castsi128_ps(__m128i __A)
+{
+ return (__m128) __A;
+}
+
+extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_castsi128_pd(__m128i __A)
+{
+ return (__m128d) __A;
+}
+
+#endif /* EMMINTRIN_H_ */
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index ac9ddae3ef0..c8a425cba7e 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -961,7 +961,7 @@ BU_SPECIAL_X (RS6000_BUILTIN_NONE, NULL, 0, RS6000_BTC_MISC)
BU_ALTIVEC_3 (VMADDFP, "vmaddfp", FP, fmav4sf4)
BU_ALTIVEC_3 (VMHADDSHS, "vmhaddshs", SAT, altivec_vmhaddshs)
BU_ALTIVEC_3 (VMHRADDSHS, "vmhraddshs", SAT, altivec_vmhraddshs)
-BU_ALTIVEC_3 (VMLADDUHM, "vmladduhm", CONST, altivec_vmladduhm)
+BU_ALTIVEC_3 (VMLADDUHM, "vmladduhm", CONST, fmav8hi4)
BU_ALTIVEC_3 (VMSUMUBM, "vmsumubm", CONST, altivec_vmsumubm)
BU_ALTIVEC_3 (VMSUMMBM, "vmsummbm", CONST, altivec_vmsummbm)
BU_ALTIVEC_3 (VMSUMUHM, "vmsumuhm", CONST, altivec_vmsumuhm)
@@ -2374,17 +2374,13 @@ BU_FLOAT128_1 (FABSQ, "fabsq", CONST, abskf2)
BU_FLOAT128_2 (COPYSIGNQ, "copysignq", CONST, copysignkf3)
/* 1, 2, and 3 argument IEEE 128-bit floating point functions that require ISA
- 3.0 hardware. These functions use the new 'f128' suffix. Eventually the
- standard functions should be folded into the common built-in function
- handling. */
-BU_FLOAT128_1_HW (SQRTF128, "sqrtf128", CONST, sqrtkf2)
+ 3.0 hardware. These functions use the new 'f128' suffix. */
BU_FLOAT128_1_HW (SQRTF128_ODD, "sqrtf128_round_to_odd", CONST, sqrtkf2_odd)
BU_FLOAT128_1_HW (TRUNCF128_ODD, "truncf128_round_to_odd", CONST, trunckfdf2_odd)
BU_FLOAT128_2_HW (ADDF128_ODD, "addf128_round_to_odd", CONST, addkf3_odd)
BU_FLOAT128_2_HW (SUBF128_ODD, "subf128_round_to_odd", CONST, subkf3_odd)
BU_FLOAT128_2_HW (MULF128_ODD, "mulf128_round_to_odd", CONST, mulkf3_odd)
BU_FLOAT128_2_HW (DIVF128_ODD, "divf128_round_to_odd", CONST, divkf3_odd)
-BU_FLOAT128_3_HW (FMAF128, "fmaf128", CONST, fmakf4_hw)
BU_FLOAT128_3_HW (FMAF128_ODD, "fmaf128_round_to_odd", CONST, fmakf4_odd)
/* 1 argument crypto functions. */
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index db0e692739c..721b906ee65 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -124,7 +124,6 @@ extern void print_operand_address (FILE *, rtx);
extern enum rtx_code rs6000_reverse_condition (machine_mode,
enum rtx_code);
extern rtx rs6000_emit_eqne (machine_mode, rtx, rtx, rtx);
-extern void rs6000_emit_sISEL (machine_mode, rtx[]);
extern void rs6000_emit_sCOND (machine_mode, rtx[]);
extern void rs6000_emit_cbranch (machine_mode, rtx[]);
extern char * output_cbranch (rtx, const char *, int, rtx_insn *);
@@ -132,6 +131,7 @@ extern const char * output_probe_stack_range (rtx, rtx, rtx);
extern void rs6000_emit_dot_insn (rtx dst, rtx src, int dot, rtx ccreg);
extern bool rs6000_emit_set_const (rtx, rtx);
extern int rs6000_emit_cmove (rtx, rtx, rtx, rtx);
+extern int rs6000_emit_int_cmove (rtx, rtx, rtx, rtx);
extern int rs6000_emit_vector_cond_expr (rtx, rtx, rtx, rtx, rtx, rtx);
extern void rs6000_emit_minmax (rtx, enum rtx_code, rtx, rtx);
extern void rs6000_split_signbit (rtx, rtx);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index e02b0863dbf..6402c0386a6 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -88,6 +88,20 @@
#define TARGET_NO_PROTOTYPE 0
#endif
+ /* Set -mabi=ieeelongdouble on some old targets. In the future, power server
+ systems will also set long double to be IEEE 128-bit. AIX and Darwin
+ explicitly redefine TARGET_IEEEQUAD and TARGET_IEEEQUAD_DEFAULT to 0, so
+ those systems will not pick up this default. This needs to be after all
+ of the include files, so that POWERPC_LINUX and POWERPC_FREEBSD are
+ properly defined. */
+#ifndef TARGET_IEEEQUAD_DEFAULT
+#if !defined (POWERPC_LINUX) && !defined (POWERPC_FREEBSD)
+#define TARGET_IEEEQUAD_DEFAULT 1
+#else
+#define TARGET_IEEEQUAD_DEFAULT 0
+#endif
+#endif
+
#define min(A,B) ((A) < (B) ? (A) : (B))
#define max(A,B) ((A) > (B) ? (A) : (B))
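
The hunk above centralizes the choice between IEEE binary128 and IBM double-double long double in TARGET_IEEEQUAD_DEFAULT. For reference, a program can check which layout it was built with using only <float.h> (113 mantissa bits for IEEE binary128, 106 for IBM double-double); a minimal sketch:

#include <float.h>
#include <stdio.h>

int
main (void)
{
#if LDBL_MANT_DIG == 113
  puts ("long double is IEEE 128-bit (binary128)");
#elif LDBL_MANT_DIG == 106
  puts ("long double is IBM extended (double-double)");
#else
  printf ("long double has %d mantissa bits\n", (int) LDBL_MANT_DIG);
#endif
  return 0;
}
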
@@ -1345,7 +1359,6 @@ static void rs6000_common_init_builtins (void);
static void paired_init_builtins (void);
static rtx paired_expand_predicate_builtin (enum insn_code, tree, rtx);
static void htm_init_builtins (void);
-static int rs6000_emit_int_cmove (rtx, rtx, rtx, rtx);
static rs6000_stack_t *rs6000_stack_info (void);
static void is_altivec_return_reg (rtx, void *);
int easy_vector_constant (rtx, machine_mode);
@@ -2880,6 +2893,13 @@ rs6000_debug_reg_global (void)
fprintf (stderr, DEBUG_FMT_D, "tls_size", rs6000_tls_size);
fprintf (stderr, DEBUG_FMT_D, "long_double_size",
rs6000_long_double_type_size);
+ if (rs6000_long_double_type_size == 128)
+ {
+ fprintf (stderr, DEBUG_FMT_S, "long double type",
+ TARGET_IEEEQUAD ? "IEEE" : "IBM");
+ fprintf (stderr, DEBUG_FMT_S, "default long double type",
+ TARGET_IEEEQUAD_DEFAULT ? "IEEE" : "IBM");
+ }
fprintf (stderr, DEBUG_FMT_D, "sched_restricted_insns_priority",
(int)rs6000_sched_restricted_insns_priority);
fprintf (stderr, DEBUG_FMT_D, "Number of standard builtins",
@@ -4562,13 +4582,26 @@ rs6000_option_override_internal (bool global_init_p)
rs6000_long_double_type_size = RS6000_DEFAULT_LONG_DOUBLE_SIZE;
}
- /* Set -mabi=ieeelongdouble on some old targets. Note, AIX and Darwin
- explicitly redefine TARGET_IEEEQUAD to 0, so those systems will not
- pick up this default. */
-#if !defined (POWERPC_LINUX) && !defined (POWERPC_FREEBSD)
+ /* Set -mabi=ieeelongdouble on some old targets. In the future, power server
+ systems will also set long double to be IEEE 128-bit. AIX and Darwin
+ explicitly redefine TARGET_IEEEQUAD and TARGET_IEEEQUAD_DEFAULT to 0, so
+ those systems will not pick up this default. Warn if the user changes the
+ default unless -Wno-psabi. */
if (!global_options_set.x_rs6000_ieeequad)
- rs6000_ieeequad = 1;
-#endif
+ rs6000_ieeequad = TARGET_IEEEQUAD_DEFAULT;
+
+ else if (rs6000_ieeequad != TARGET_IEEEQUAD_DEFAULT && TARGET_LONG_DOUBLE_128)
+ {
+ static bool warned_change_long_double;
+ if (!warned_change_long_double)
+ {
+ warned_change_long_double = true;
+ if (TARGET_IEEEQUAD)
+ warning (OPT_Wpsabi, "Using IEEE extended precision long double");
+ else
+ warning (OPT_Wpsabi, "Using IBM extended precision long double");
+ }
+ }
/* Enable the default support for IEEE 128-bit floating point on Linux VSX
 systems. In GCC 7, we would enable the IEEE 128-bit floating point
@@ -5424,9 +5457,7 @@ rs6000_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
return 3;
case unaligned_load:
- if (TARGET_P9_VECTOR)
- return 3;
-
+ case vector_gather_load:
if (TARGET_EFFICIENT_UNALIGNED_VSX)
return 1;
@@ -5465,6 +5496,7 @@ rs6000_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
return 2;
case unaligned_store:
+ case vector_scatter_store:
if (TARGET_EFFICIENT_UNALIGNED_VSX)
return 1;
@@ -8991,6 +9023,8 @@ rs6000_delegitimize_address (rtx orig_x)
static bool
rs6000_const_not_ok_for_debug_p (rtx x)
{
+ if (GET_CODE (x) == UNSPEC)
+ return true;
if (GET_CODE (x) == SYMBOL_REF
&& CONSTANT_POOL_ADDRESS_P (x))
{
@@ -16614,6 +16648,22 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
gsi_replace (gsi, g, true);
return true;
}
+
+ /* Vector Fused multiply-add (fma). */
+ case ALTIVEC_BUILTIN_VMADDFP:
+ case VSX_BUILTIN_XVMADDDP:
+ case ALTIVEC_BUILTIN_VMLADDUHM:
+ {
+ arg0 = gimple_call_arg (stmt, 0);
+ arg1 = gimple_call_arg (stmt, 1);
+ tree arg2 = gimple_call_arg (stmt, 2);
+ lhs = gimple_call_lhs (stmt);
+ gimple *g = gimple_build_assign (lhs, FMA_EXPR, arg0, arg1, arg2);
+ gimple_set_location (g, gimple_location (stmt));
+ gsi_replace (gsi, g, true);
+ return true;
+ }
+
default:
if (TARGET_DEBUG_BUILTIN)
fprintf (stderr, "gimple builtin intrinsic not matched:%d %s %s\n",
@@ -22429,14 +22479,6 @@ rs6000_expand_float128_convert (rtx dest, rtx src, bool unsigned_p)
}
-/* Emit the RTL for an sISEL pattern. */
-
-void
-rs6000_emit_sISEL (machine_mode mode ATTRIBUTE_UNUSED, rtx operands[])
-{
- rs6000_emit_int_cmove (operands[0], operands[1], const1_rtx, const0_rtx);
-}
-
/* Emit RTL that sets a register to zero if OP1 and OP2 are equal. SCRATCH
can be used as that dest register. Return the dest register. */
@@ -23212,7 +23254,7 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
/* Same as above, but for ints (isel). */
-static int
+int
rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
{
rtx condition_rtx, cr;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index aad382ced33..ed5ff397e07 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3829,11 +3829,19 @@
; Special case for less-than-0. We can do it with just one machine
; instruction, but the generic optimizers do not realise it is cheap.
-(define_insn "*lt0_disi"
- [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
- (lt:DI (match_operand:SI 1 "gpc_reg_operand" "r")
- (const_int 0)))]
+(define_insn "*lt0_<mode>di"
+ [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+ (lt:GPR (match_operand:DI 1 "gpc_reg_operand" "r")
+ (const_int 0)))]
"TARGET_POWERPC64"
+ "srdi %0,%1,63"
+ [(set_attr "type" "shift")])
+
+(define_insn "*lt0_<mode>si"
+ [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+ (lt:GPR (match_operand:SI 1 "gpc_reg_operand" "r")
+ (const_int 0)))]
+ ""
"rlwinm %0,%1,1,31,31"
[(set_attr "type" "shift")])
@@ -10333,6 +10341,9 @@
{
rtx loop_lab, end_loop;
bool rotated = CONST_INT_P (rounded_size);
+ rtx update = GEN_INT (-probe_interval);
+ if (probe_interval > 32768)
+ update = force_reg (Pmode, update);
emit_stack_clash_protection_probe_loop_start (&loop_lab, &end_loop,
last_addr, rotated);
@@ -10340,13 +10351,11 @@
if (Pmode == SImode)
emit_insn (gen_movsi_update_stack (stack_pointer_rtx,
stack_pointer_rtx,
- GEN_INT (-probe_interval),
- chain));
+ update, chain));
else
emit_insn (gen_movdi_di_update_stack (stack_pointer_rtx,
stack_pointer_rtx,
- GEN_INT (-probe_interval),
- chain));
+ update, chain));
emit_stack_clash_protection_probe_loop_end (loop_lab, end_loop,
last_addr, rotated);
}
@@ -11157,7 +11166,7 @@
[(call (mem:SI (match_operand:P 0 "register_operand" "c,*l"))
(match_operand 1 "" "g,g"))
(use (match_operand:P 2 "memory_operand" "<ptrm>,<ptrm>"))
- (set (reg:P TOC_REGNUM) (unspec [(match_operand:P 3 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
+ (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 3 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
"DEFAULT_ABI == ABI_AIX"
"<ptrload> 2,%2\;b%T0l\;<ptrload> 2,%3(1)"
@@ -11169,7 +11178,7 @@
(call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
(match_operand 2 "" "g,g")))
(use (match_operand:P 3 "memory_operand" "<ptrm>,<ptrm>"))
- (set (reg:P TOC_REGNUM) (unspec [(match_operand:P 4 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
+ (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 4 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
"DEFAULT_ABI == ABI_AIX"
"<ptrload> 2,%3\;b%T1l\;<ptrload> 2,%4(1)"
@@ -11183,7 +11192,7 @@
(define_insn "*call_indirect_elfv2<mode>"
[(call (mem:SI (match_operand:P 0 "register_operand" "c,*l"))
(match_operand 1 "" "g,g"))
- (set (reg:P TOC_REGNUM) (unspec [(match_operand:P 2 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
+ (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 2 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
"DEFAULT_ABI == ABI_ELFv2"
"b%T0l\;<ptrload> 2,%2(1)"
@@ -11194,7 +11203,7 @@
[(set (match_operand 0 "" "")
(call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
(match_operand 2 "" "g,g")))
- (set (reg:P TOC_REGNUM) (unspec [(match_operand:P 3 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
+ (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 3 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
"DEFAULT_ABI == ABI_ELFv2"
"b%T1l\;<ptrload> 2,%3(1)"
@@ -11773,7 +11782,7 @@
{
/* Use ISEL if the user asked for it. */
if (TARGET_ISEL)
- rs6000_emit_sISEL (<MODE>mode, operands);
+ rs6000_emit_int_cmove (operands[0], operands[1], const1_rtx, const0_rtx);
/* Expanding EQ and NE directly to some machine instructions does not help
but does hurt combine. So don't. */
@@ -12133,7 +12142,7 @@
[(set (match_operand:SI 0 "gpc_reg_operand" "=r")
(unspec:SI [(match_operand:CC 1 "cc_reg_operand" "y")]
UNSPEC_MV_CR_OV))]
- "TARGET_ISEL"
+ "TARGET_PAIRED_FLOAT"
"mfcr %0\;rlwinm %0,%0,%t1,1"
[(set_attr "type" "mfcr")
(set_attr "length" "8")])
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index c42818fbc04..e7d0829495e 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -381,10 +381,10 @@ mabi=d32
Target RejectNegative Undocumented Warn(using old darwin ABI) Var(rs6000_darwin64_abi, 0)
mabi=ieeelongdouble
-Target RejectNegative Undocumented Warn(using IEEE extended precision long double) Var(rs6000_ieeequad) Save
+Target RejectNegative Var(rs6000_ieeequad) Save
mabi=ibmlongdouble
-Target RejectNegative Undocumented Warn(using IBM extended precision long double) Var(rs6000_ieeequad, 0)
+Target RejectNegative Var(rs6000_ieeequad, 0)
mcpu=
Target RejectNegative Joined Var(rs6000_cpu_index) Init(-1) Enum(rs6000_cpu_opt_value) Save
diff --git a/gcc/config/rs6000/x86intrin.h b/gcc/config/rs6000/x86intrin.h
index 624e498a292..33e3176108b 100644
--- a/gcc/config/rs6000/x86intrin.h
+++ b/gcc/config/rs6000/x86intrin.h
@@ -39,6 +39,8 @@
#include <mmintrin.h>
#include <xmmintrin.h>
+
+#include <emmintrin.h>
#endif /* __ALTIVEC__ */
#include <bmiintrin.h>
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 2258148c573..29d017ff148 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -85,6 +85,7 @@ along with GCC; see the file COPYING3. If not see
#include "symbol-summary.h"
#include "ipa-prop.h"
#include "ipa-fnsummary.h"
+#include "sched-int.h"
/* This file should be included last. */
#include "target-def.h"
@@ -357,6 +358,18 @@ static rtx_insn *last_scheduled_insn;
#define MAX_SCHED_UNITS 3
static int last_scheduled_unit_distance[MAX_SCHED_UNITS];
+#define NUM_SIDES 2
+static int current_side = 1;
+#define LONGRUNNING_THRESHOLD 5
+
+/* Estimate of the number of cycles a long-running insn occupies an
+ execution unit. */
+static unsigned fxu_longrunning[NUM_SIDES];
+static unsigned vfu_longrunning[NUM_SIDES];
+
+/* Factor to scale latencies by, determined by measurements. */
+#define LATENCY_FACTOR 4
+
/* The maximum score added for an instruction whose unit hasn't been
in use for MAX_SCHED_MIX_DISTANCE steps. Increase this value to
give instruction mix scheduling more priority over instruction
@@ -3719,6 +3732,8 @@ s390_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
case vector_stmt:
case vector_load:
case vector_store:
+ case vector_gather_load:
+ case vector_scatter_store:
case vec_to_scalar:
case scalar_to_vec:
case cond_branch_not_taken:
@@ -14606,6 +14621,28 @@ s390_z10_prevent_earlyload_conflicts (rtx_insn **ready, int *nready_p)
ready[0] = tmp;
}
+/* Returns TRUE if BB is entered via a fallthru edge and all other
+ incoming edges are less than unlikely. */
+static bool
+s390_bb_fallthru_entry_likely (basic_block bb)
+{
+ edge e, fallthru_edge;
+ edge_iterator ei;
+
+ if (!bb)
+ return false;
+
+ fallthru_edge = find_fallthru_edge (bb->preds);
+ if (!fallthru_edge)
+ return false;
+
+ FOR_EACH_EDGE (e, ei, bb->preds)
+ if (e != fallthru_edge
+ && e->probability >= profile_probability::unlikely ())
+ return false;
+
+ return true;
+}
/* The s390_sched_state variable tracks the state of the current or
the last instruction group.
@@ -14614,7 +14651,7 @@ s390_z10_prevent_earlyload_conflicts (rtx_insn **ready, int *nready_p)
3 the last group is complete - normal insns
4 the last group was a cracked/expanded insn */
-static int s390_sched_state;
+static int s390_sched_state = 0;
#define S390_SCHED_STATE_NORMAL 3
#define S390_SCHED_STATE_CRACKED 4
@@ -14755,7 +14792,24 @@ s390_sched_score (rtx_insn *insn)
if (m & unit_mask)
score += (last_scheduled_unit_distance[i] * MAX_SCHED_MIX_SCORE /
MAX_SCHED_MIX_DISTANCE);
+
+ unsigned latency = insn_default_latency (insn);
+
+ int other_side = 1 - current_side;
+
+ /* Try to delay long-running insns when side is busy. */
+ if (latency > LONGRUNNING_THRESHOLD)
+ {
+ if (get_attr_z13_unit_fxu (insn) && fxu_longrunning[current_side]
+ && fxu_longrunning[other_side] <= fxu_longrunning[current_side])
+ score = MAX (0, score - 10);
+
+ if (get_attr_z13_unit_vfu (insn) && vfu_longrunning[current_side]
+ && vfu_longrunning[other_side] <= vfu_longrunning[current_side])
+ score = MAX (0, score - 10);
+ }
}
+
return score;
}
@@ -14874,6 +14928,8 @@ s390_sched_variable_issue (FILE *file, int verbose, rtx_insn *insn, int more)
{
last_scheduled_insn = insn;
+ bool starts_group = false;
+
if (s390_tune >= PROCESSOR_2827_ZEC12
&& reload_completed
&& recog_memoized (insn) >= 0)
@@ -14881,6 +14937,11 @@ s390_sched_variable_issue (FILE *file, int verbose, rtx_insn *insn, int more)
unsigned int mask = s390_get_sched_attrmask (insn);
if ((mask & S390_SCHED_ATTR_MASK_CRACKED) != 0
+ || (mask & S390_SCHED_ATTR_MASK_EXPANDED) != 0
+ || (mask & S390_SCHED_ATTR_MASK_GROUPALONE) != 0)
+ starts_group = true;
+
+ if ((mask & S390_SCHED_ATTR_MASK_CRACKED) != 0
|| (mask & S390_SCHED_ATTR_MASK_EXPANDED) != 0)
s390_sched_state = S390_SCHED_STATE_CRACKED;
else if ((mask & S390_SCHED_ATTR_MASK_ENDGROUP) != 0
@@ -14892,14 +14953,15 @@ s390_sched_variable_issue (FILE *file, int verbose, rtx_insn *insn, int more)
switch (s390_sched_state)
{
case 0:
+ starts_group = true;
+ /* fallthrough */
case 1:
case 2:
+ s390_sched_state++;
+ break;
case S390_SCHED_STATE_NORMAL:
- if (s390_sched_state == S390_SCHED_STATE_NORMAL)
- s390_sched_state = 1;
- else
- s390_sched_state++;
-
+ starts_group = true;
+ s390_sched_state = 1;
break;
case S390_SCHED_STATE_CRACKED:
s390_sched_state = S390_SCHED_STATE_NORMAL;
@@ -14922,6 +14984,27 @@ s390_sched_variable_issue (FILE *file, int verbose, rtx_insn *insn, int more)
last_scheduled_unit_distance[i]++;
}
+ /* If this insn started a new group, the side flipped. */
+ if (starts_group)
+ current_side = current_side ? 0 : 1;
+
+ for (int i = 0; i < 2; i++)
+ {
+ if (fxu_longrunning[i] >= 1)
+ fxu_longrunning[i] -= 1;
+ if (vfu_longrunning[i] >= 1)
+ vfu_longrunning[i] -= 1;
+ }
+
+ unsigned latency = insn_default_latency (insn);
+ if (latency > LONGRUNNING_THRESHOLD)
+ {
+ if (get_attr_z13_unit_fxu (insn))
+ fxu_longrunning[current_side] = latency * LATENCY_FACTOR;
+ else
+ vfu_longrunning[current_side] = latency * LATENCY_FACTOR;
+ }
+
if (verbose > 5)
{
unsigned int sched_mask;
@@ -14978,7 +15061,21 @@ s390_sched_init (FILE *file ATTRIBUTE_UNUSED,
{
last_scheduled_insn = NULL;
memset (last_scheduled_unit_distance, 0, MAX_SCHED_UNITS * sizeof (int));
- s390_sched_state = 0;
+
+ /* If the next basic block is most likely entered via a fallthru edge
+ we keep the last sched state. Otherwise we start a new group.
+ The scheduler traverses basic blocks in "instruction stream" ordering
+ so if we see a fallthru edge here, s390_sched_state will be that of its
+ source block.
+
+ current_sched_info->prev_head is the insn before the first insn of the
+ block of insns to be scheduled.
+ */
+ rtx_insn *insn = current_sched_info->prev_head
+ ? NEXT_INSN (current_sched_info->prev_head) : NULL;
+ basic_block bb = insn ? BLOCK_FOR_INSN (insn) : NULL;
+ if (s390_tune < PROCESSOR_2964_Z13 || !s390_bb_fallthru_entry_likely (bb))
+ s390_sched_state = 0;
}
/* This target hook implementation for TARGET_LOOP_UNROLL_ADJUST calculates
diff --git a/gcc/config/spu/spu.c b/gcc/config/spu/spu.c
index e792650184b..606934bcfe7 100644
--- a/gcc/config/spu/spu.c
+++ b/gcc/config/spu/spu.c
@@ -6633,6 +6633,8 @@ spu_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
return 2;
case unaligned_load:
+ case vector_gather_load:
+ case vector_scatter_store:
return 2;
case cond_branch_taken:
diff --git a/gcc/config/stormy16/stormy16.h b/gcc/config/stormy16/stormy16.h
index 094a2f08e43..dfc659c2e98 100644
--- a/gcc/config/stormy16/stormy16.h
+++ b/gcc/config/stormy16/stormy16.h
@@ -446,7 +446,7 @@ enum reg_class
#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
-/* Macros for SDB and Dwarf Output. */
+/* Macros for Dwarf Output. */
/* Define this macro if addresses in Dwarf 2 debugging info should not
be the same size as pointers on the target architecture. The
diff --git a/gcc/config/visium/visium.c b/gcc/config/visium/visium.c
index e028dc479d3..6861f127867 100644
--- a/gcc/config/visium/visium.c
+++ b/gcc/config/visium/visium.c
@@ -2940,12 +2940,6 @@ visium_select_cc_mode (enum rtx_code code, rtx op0, rtx op1)
/* This is a btst, the result is in C instead of Z. */
return CCCmode;
- case CONST_INT:
- /* This is a degenerate case, typically an uninitialized variable. */
- gcc_assert (op0 == constm1_rtx);
-
- /* ... fall through ... */
-
case REG:
case AND:
case IOR:
@@ -2962,6 +2956,17 @@ visium_select_cc_mode (enum rtx_code code, rtx op0, rtx op1)
when applied to a comparison with zero. */
return CCmode;
+ /* ??? Cater to the junk RTXes sent by try_merge_compare. */
+ case ASM_OPERANDS:
+ case CALL:
+ case CONST_INT:
+ case LO_SUM:
+ case HIGH:
+ case MEM:
+ case UNSPEC:
+ case ZERO_EXTEND:
+ return CCmode;
+
default:
gcc_unreachable ();
}
diff --git a/gcc/config/visium/visium.h b/gcc/config/visium/visium.h
index 3b229f1a1e6..85735953968 100644
--- a/gcc/config/visium/visium.h
+++ b/gcc/config/visium/visium.h
@@ -1527,9 +1527,8 @@ do \
automatic variable having address X (an RTL expression). The
default computation assumes that X is based on the frame-pointer
and gives the offset from the frame-pointer. This is required for
- targets that produce debugging output for DBX or COFF-style
- debugging output for SDB and allow the frame-pointer to be
- eliminated when the `-g' options is used. */
+ targets that produce debugging output for DBX and allow the frame-pointer
+ to be eliminated when the `-g' option is used. */
#define DEBUGGER_AUTO_OFFSET(X) \
(GET_CODE (X) == PLUS ? INTVAL (XEXP (X, 1)) : 0)
diff --git a/gcc/config/vx-common.h b/gcc/config/vx-common.h
index 5cc965cab78..d8f04eced4d 100644
--- a/gcc/config/vx-common.h
+++ b/gcc/config/vx-common.h
@@ -72,7 +72,6 @@ along with GCC; see the file COPYING3. If not see
/* None of these other formats is supported. */
#undef DWARF_DEBUGGING_INFO
#undef DBX_DEBUGGING_INFO
-#undef SDB_DEBUGGING_INFO
#undef XCOFF_DEBUGGING_INFO
#undef VMS_DEBUGGING_INFO
diff --git a/gcc/configure b/gcc/configure
index 13f97cd3663..fb40ead9204 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -942,6 +942,7 @@ enable_fix_cortex_a53_843419
with_glibc_version
enable_gnu_unique_object
enable_linker_build_id
+enable_libssp
enable_default_ssp
with_long_double_128
with_gc
@@ -1682,6 +1683,7 @@ Optional Features:
extension on glibc systems
--enable-linker-build-id
compiler will always pass --build-id to linker
+ --enable-libssp enable linking against libssp
--enable-default-ssp enable Stack Smashing Protection as default
--enable-maintainer-mode
enable make rules and dependencies not useful (and
@@ -4987,7 +4989,7 @@ acx_cv_cc_gcc_supports_ada=no
# Other compilers, like HP Tru64 UNIX cc, exit successfully when
# given a .adb file, but produce no object file. So we must check
# if an object file was really produced to guard against this.
-errors=`(${CC} -I"$srcdir"/ada -c conftest.adb) 2>&1 || echo failure`
+errors=`(${CC} -I"$srcdir"/ada/libgnat -c conftest.adb) 2>&1 || echo failure`
if test x"$errors" = x && test -f conftest.$ac_objext; then
acx_cv_cc_gcc_supports_ada=yes
fi
@@ -7321,10 +7323,10 @@ fi
if test "${enable_coverage+set}" = set; then :
enableval=$enable_coverage; case "${enableval}" in
yes|noopt)
- coverage_flags="-fprofile-arcs -ftest-coverage -frandom-seed=\$@ -O0"
+ coverage_flags="-fprofile-arcs -ftest-coverage -frandom-seed=\$@ -O0 -fkeep-inline-functions -fkeep-static-functions"
;;
opt)
- coverage_flags="-fprofile-arcs -ftest-coverage -frandom-seed=\$@ -O2"
+ coverage_flags="-fprofile-arcs -ftest-coverage -frandom-seed=\$@ -O2 -fkeep-inline-functions -fkeep-static-functions"
;;
no)
# a.k.a. --disable-coverage
@@ -18440,7 +18442,7 @@ else
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
lt_status=$lt_dlunknown
cat > conftest.$ac_ext <<_LT_EOF
-#line 18443 "configure"
+#line 18445 "configure"
#include "confdefs.h"
#if HAVE_DLFCN_H
@@ -18546,7 +18548,7 @@ else
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
lt_status=$lt_dlunknown
cat > conftest.$ac_ext <<_LT_EOF
-#line 18549 "configure"
+#line 18551 "configure"
#include "confdefs.h"
#if HAVE_DLFCN_H
@@ -22783,15 +22785,25 @@ if test $in_tree_ld != yes ; then
else
case "${target}" in
*-*-solaris2*)
- # See acinclude.m4 (gcc_SUN_LD_VERSION) for the version number
- # format.
+ # Solaris 2 ld -V output looks like this for a regular version:
#
- # Don't reuse gcc_gv_sun_ld_vers_* in case a linker other than
- # /usr/ccs/bin/ld has been configured.
+ # ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1699
+ #
+ # but test versions add stuff at the end:
+ #
+ # ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1701:onnv-ab196087-6931056-03/25/10
+ #
+ # In Solaris 11.4, this was changed to
+ #
+ # ld: Solaris ELF Utilities: 11.4-1.3123
+ #
+ # ld and ld.so.1 are guaranteed to be updated in lockstep, so ld version
+ # numbers can be used in ld.so.1 feature checks even if a different
+ # linker is configured.
ld_ver=`$gcc_cv_ld -V 2>&1`
- if echo "$ld_ver" | grep 'Solaris Link Editors' > /dev/null; then
+ if echo "$ld_ver" | $EGREP 'Solaris Link Editors|Solaris ELF Utilities' > /dev/null; then
ld_vers=`echo $ld_ver | sed -n \
- -e 's,^.*: 5\.[0-9][0-9]*-\([0-9]\.[0-9][0-9]*\).*$,\1,p'`
+ -e 's,^.*: \(5\|1[0-9]\)\.[0-9][0-9]*-\([0-9]\.[0-9][0-9]*\).*$,\2,p'`
ld_vers_major=`expr "$ld_vers" : '\([0-9]*\)'`
ld_vers_minor=`expr "$ld_vers" : '[0-9]*\.\([0-9]*\)'`
fi
@@ -22915,29 +22927,6 @@ fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_ld_ro_rw_mix" >&5
$as_echo "$gcc_cv_ld_ro_rw_mix" >&6; }
-if test "x${build}" = "x${target}" && test "x${build}" = "x${host}"; then
- case "${target}" in
- *-*-solaris2*)
- #
- # Solaris 2 ld -V output looks like this for a regular version:
- #
- # ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1699
- #
- # but test versions add stuff at the end:
- #
- # ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1701:onnv-ab196087-6931056-03/25/10
- #
- gcc_cv_sun_ld_ver=`/usr/ccs/bin/ld -V 2>&1`
- if echo "$gcc_cv_sun_ld_ver" | grep 'Solaris Link Editors' > /dev/null; then
- gcc_cv_sun_ld_vers=`echo $gcc_cv_sun_ld_ver | sed -n \
- -e 's,^.*: 5\.[0-9][0-9]*-\([0-9]\.[0-9][0-9]*\).*$,\1,p'`
- gcc_cv_sun_ld_vers_major=`expr "$gcc_cv_sun_ld_vers" : '\([0-9]*\)'`
- gcc_cv_sun_ld_vers_minor=`expr "$gcc_cv_sun_ld_vers" : '[0-9]*\.\([0-9]*\)'`
- fi
- ;;
- esac
-fi
-
# Check whether --enable-initfini-array was given.
if test "${enable_initfini_array+set}" = set; then :
enableval=$enable_initfini_array;
@@ -25552,6 +25541,38 @@ $as_echo "$as_me: WARNING: LTO for $target requires binutils >= 2.20.1, but vers
;;
esac
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for -xbrace_comment" >&5
+$as_echo_n "checking assembler for -xbrace_comment... " >&6; }
+if test "${gcc_cv_as_ix86_xbrace_comment+set}" = set; then :
+ $as_echo_n "(cached) " >&6
+else
+ gcc_cv_as_ix86_xbrace_comment=no
+ if test x$gcc_cv_as != x; then
+ $as_echo '.text' > conftest.s
+ if { ac_try='$gcc_cv_as $gcc_cv_as_flags -xbrace_comment=no -o conftest.o conftest.s >&5'
+ { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+ (eval $ac_try) 2>&5
+ ac_status=$?
+ $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+ test $ac_status = 0; }; }
+ then
+ gcc_cv_as_ix86_xbrace_comment=yes
+ else
+ echo "configure: failed program was" >&5
+ cat conftest.s >&5
+ fi
+ rm -f conftest.o conftest.s
+ fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_ix86_xbrace_comment" >&5
+$as_echo "$gcc_cv_as_ix86_xbrace_comment" >&6; }
+if test $gcc_cv_as_ix86_xbrace_comment = yes; then
+
+$as_echo "#define HAVE_AS_XBRACE_COMMENT_OPTION 1" >>confdefs.h
+
+fi
+
+
# Test if the assembler supports the section flag 'e' for specifying
# an excluded section.
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for .section with e" >&5
@@ -29008,6 +29029,18 @@ $as_echo "#define HAVE_SOLARIS_CRTS 1" >>confdefs.h
fi
+# Check whether --enable-libssp was given.
+if test "${enable_libssp+set}" = set; then :
+ enableval=$enable_libssp; case "${enableval}" in
+ yes|no)
+ ;;
+ *)
+ as_fn_error "unknown libssp setting $enableval" "$LINENO" 5
+ ;;
+esac
+fi
+
+
# Test for stack protector support in target C library.
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking __stack_chk_fail in target C library" >&5
$as_echo_n "checking __stack_chk_fail in target C library... " >&6; }
@@ -29015,6 +29048,11 @@ if test "${gcc_cv_libc_provides_ssp+set}" = set; then :
$as_echo_n "(cached) " >&6
else
gcc_cv_libc_provides_ssp=no
+ if test "x$enable_libssp" = "xno"; then
+ gcc_cv_libc_provides_ssp=yes
+ elif test "x$enable_libssp" = "xyes"; then
+ gcc_cv_libc_provides_ssp=no
+ else
case "$target" in
*-*-musl*)
# All versions of musl provide stack protector
@@ -29062,8 +29100,9 @@ else
fi
;;
- *) gcc_cv_libc_provides_ssp=no ;;
+ *) gcc_cv_libc_provides_ssp=no ;;
esac
+ fi
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_libc_provides_ssp" >&5
$as_echo "$gcc_cv_libc_provides_ssp" >&6; }
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 82711389281..0e5167695a2 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -362,7 +362,7 @@ rm -f a.out a.exe b.out
# Find the native compiler
AC_PROG_CC
AC_PROG_CXX
-ACX_PROG_GNAT([-I"$srcdir"/ada])
+ACX_PROG_GNAT([-I"$srcdir"/ada/libgnat])
# Do configure tests with the C++ compiler, since that's what we build with.
AC_LANG(C++)
@@ -728,10 +728,10 @@ AC_ARG_ENABLE(coverage,
default is noopt])],
[case "${enableval}" in
yes|noopt)
- coverage_flags="-fprofile-arcs -ftest-coverage -frandom-seed=\$@ -O0"
+ coverage_flags="-fprofile-arcs -ftest-coverage -frandom-seed=\$@ -O0 -fkeep-inline-functions -fkeep-static-functions"
;;
opt)
- coverage_flags="-fprofile-arcs -ftest-coverage -frandom-seed=\$@ -O2"
+ coverage_flags="-fprofile-arcs -ftest-coverage -frandom-seed=\$@ -O2 -fkeep-inline-functions -fkeep-static-functions"
;;
no)
# a.k.a. --disable-coverage
@@ -2587,15 +2587,25 @@ if test $in_tree_ld != yes ; then
else
case "${target}" in
*-*-solaris2*)
- # See acinclude.m4 (gcc_SUN_LD_VERSION) for the version number
- # format.
+ # Solaris 2 ld -V output looks like this for a regular version:
#
- # Don't reuse gcc_gv_sun_ld_vers_* in case a linker other than
- # /usr/ccs/bin/ld has been configured.
+ # ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1699
+ #
+ # but test versions add stuff at the end:
+ #
+ # ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1701:onnv-ab196087-6931056-03/25/10
+ #
+ # In Solaris 11.4, this was changed to
+ #
+ # ld: Solaris ELF Utilities: 11.4-1.3123
+ #
+ # ld and ld.so.1 are guaranteed to be updated in lockstep, so ld version
+ # numbers can be used in ld.so.1 feature checks even if a different
+ # linker is configured.
ld_ver=`$gcc_cv_ld -V 2>&1`
- if echo "$ld_ver" | grep 'Solaris Link Editors' > /dev/null; then
+ if echo "$ld_ver" | $EGREP 'Solaris Link Editors|Solaris ELF Utilities' > /dev/null; then
ld_vers=`echo $ld_ver | sed -n \
- -e 's,^.*: 5\.[0-9][0-9]*-\([0-9]\.[0-9][0-9]*\).*$,\1,p'`
+ -e 's,^.*: \(5\|1[0-9]\)\.[0-9][0-9]*-\([0-9]\.[0-9][0-9]*\).*$,\2,p'`
ld_vers_major=`expr "$ld_vers" : '\([0-9]*\)'`
ld_vers_minor=`expr "$ld_vers" : '[0-9]*\.\([0-9]*\)'`
fi
@@ -4103,6 +4113,11 @@ foo: nop
;;
esac
+ gcc_GAS_CHECK_FEATURE([-xbrace_comment], gcc_cv_as_ix86_xbrace_comment,,
+ [-xbrace_comment=no], [.text],,
+ [AC_DEFINE(HAVE_AS_XBRACE_COMMENT_OPTION, 1,
+ [Define if your assembler supports -xbrace_comment option.])])
+
# Test if the assembler supports the section flag 'e' for specifying
# an excluded section.
gcc_GAS_CHECK_FEATURE([.section with e], gcc_cv_as_section_has_e,
@@ -5751,10 +5766,25 @@ if test x$gcc_cv_solaris_crts = xyes; then
[Define if the system-provided CRTs are present on Solaris.])
fi
+AC_ARG_ENABLE(libssp,
+[AS_HELP_STRING([--enable-libssp], [enable linking against libssp])],
+[case "${enableval}" in
+ yes|no)
+ ;;
+ *)
+ AC_MSG_ERROR([unknown libssp setting $enableval])
+ ;;
+esac], [])
+
# Test for stack protector support in target C library.
AC_CACHE_CHECK(__stack_chk_fail in target C library,
- gcc_cv_libc_provides_ssp,
- [gcc_cv_libc_provides_ssp=no
+ gcc_cv_libc_provides_ssp,
+ [gcc_cv_libc_provides_ssp=no
+ if test "x$enable_libssp" = "xno"; then
+ gcc_cv_libc_provides_ssp=yes
+ elif test "x$enable_libssp" = "xyes"; then
+ gcc_cv_libc_provides_ssp=no
+ else
case "$target" in
*-*-musl*)
# All versions of musl provide stack protector
@@ -5791,8 +5821,9 @@ AC_CACHE_CHECK(__stack_chk_fail in target C library,
AC_CHECK_FUNC(__stack_chk_fail,[gcc_cv_libc_provides_ssp=yes],
[echo "no __stack_chk_fail on this target"])
;;
- *) gcc_cv_libc_provides_ssp=no ;;
- esac])
+ *) gcc_cv_libc_provides_ssp=no ;;
+ esac
+ fi])
if test x$gcc_cv_libc_provides_ssp = xyes; then
AC_DEFINE(TARGET_LIBC_PROVIDES_SSP, 1,
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index 3abf79440cc..590e3221c8c 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,363 @@
+2017-11-03 Nathan Sidwell <nathan@acm.org>
+
+ PR c++/82710
+ * decl.c (grokdeclarator): Protect MAYBE_CLASS things from paren
+ warning too.
+
+2017-11-02 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/81957
+ * pt.c (make_pack_expansion): Add tsubst_flags_t parameter.
+ (expand_integer_pack, convert_template_argument, coerce_template_parms,
+ gen_elem_of_pack_expansion_instantiation, tsubst_pack_expansion,
+ unify): Adjust calls.
+ * tree.c (cp_build_qualified_type_real): Likewise.
+ * cp-tree.h (make_pack_expansion): Adjust declaration.
+
+2017-11-02 Nathan Sidwell <nathan@acm.org>
+
+ * cp-tree.h (IDENTIFIER_NEWDEL_OP_P): Restore, adjust.
+ (IDENTIFIER_NEW_OP_P): New.
+ * decl.c (grokdeclarator): Restore IDENTIFIER_NEWDEL_OP_P use.
+ * pt.c (push_template_decl_real): Likewise.
+ * typeck.c (check_return_expr): Use IDENTIFIER_NEW_OP_P.
+
+ PR c++/82710
+ * decl.c (grokdeclarator): Don't warn when parens protect a return
+ type from a qualified name.
+
+2017-11-01 Nathan Sidwell <nathan@acm.org>
+
+ * cp-tree.h (enum cp_identifier_kind): Delete cik_newdel_op.
+ Renumber and reserve udlit value.
+ (IDENTIFIER_NEWDEL_OP_P): Delete.
+ (IDENTIFIER_OVL_OP_P): New.
+ (IDENTIFIER_ASSIGN_OP_P): Adjust.
+ (IDENTIFIER_CONV_OP_P): Adjust.
+ (IDENTIFIER_OVL_OP_INFO): Adjust.
+ (IDENTIFIER_OVL_OP_FLAGS): New.
+ * decl.c (grokdeclarator): Use IDENTIFIER_OVL_OP_FLAGS.
+ * lex.c (get_identifier_kind_name): Adjust.
+ (init_operators): Don't special case new/delete ops.
+ * mangle.c (write_unqualified_id): Use IDENTIFIER_OVL_OP_P.
+ * pt.c (push_template_decl_real): Use IDENTIFIER_OVL_OP_FLAGS.
+ * typeck.c (check_return_expr): Likewise.
+
+ * cp-tree.h (assign_op_identifier, call_op_identifier): Use
+ compressed code.
+ (struct lang_decl_fn): Use compressed operator code.
+ (DECL_OVERLOADED_OPERATOR_CODE): Replace with ...
+ (DECL_OVERLOADED_OPERATOR_CODE_RAW): ... this.
+ (DECL_OVERLOADED_OPERATOR_CODE_IS): Use it.
+ * decl.c (duplicate_decls): Use DECL_OVERLOADED_OPERATOR_CODE_RAW.
+ (build_library_fn): Likewise.
+ (grok_op_properties): Likewise.
+ * mangle.c (write_unqualified_name): Likewise.
+ * method.c (implicitly_declare_fn): Likewise.
+ * typeck.c (check_return_expr): Use DECL_OVERLOADED_OPERATOR_IS.
+
+ * cp-tree.h (IDENTIFIER_CP_INDEX): Define.
+ (enum ovl_op_flags): Add OVL_OP_FLAG_AMBIARY.
+ (enum ovl_op_code): New.
+ (struct ovl_op_info): Add ovl_op_code field.
+ (ovl_op_info): Size by OVL_OP_MAX.
+ (ovl_op_mapping, ovl_op_alternate): Declare.
+ (OVL_OP_INFO): Adjust for mapping array.
+ (IDENTIFIER_OVL_OP_INFO): New.
+ * decl.c (ambi_op_p, unary_op_p): Delete.
+ (grok_op_properties): Use IDENTIFIER_OVL_OP_INFO and
+ ovl_op_alternate.
+ * lex.c (ovl_op_info): Adjust and static initialize.
+ (ovl_op_mapping, ovl_op_alternate): Define.
+ (init_operators): Iterate over ovl_op_info array and init mappings
+ & alternate arrays.
+ * mangle.c (write_unqualified_id): Use IDENTIFIER_OVL_OP_INFO.
+ * operators.def (DEF_OPERATOR): Remove KIND parm.
+ (DEF_SIMPLE_OPERATOR): Delete.
+ (OPERATOR_TRANSITION): Expand if defined.
+
+2017-10-31 David Malcolm <dmalcolm@redhat.com>
+
+ * pt.c (listify): Use %< and %> for description of #include.
+
+2017-10-31 David Malcolm <dmalcolm@redhat.com>
+
+ * class.c (explain_non_literal_class): Use UNKNOWN_LOCATION rather
+ than 0.
+ * name-lookup.c (suggest_alternatives_for): Update for renaming of
+ inform_at_rich_loc.
+ (maybe_suggest_missing_header): Likewise.
+ (suggest_alternative_in_explicit_scope): Likewise.
+ * parser.c (cp_parser_diagnose_invalid_type_name): Likewise for
+ renaming of error_at_rich_loc.
+ (cp_parser_string_literal): Likewise.
+ (cp_parser_nested_name_specifier_opt): Likewise.
+ (cp_parser_cast_expression): Likewise for renaming of
+ warning_at_rich_loc.
+ (cp_parser_decl_specifier_seq): Likewise for renaming of
+ error_at_rich_loc and warning_at_rich_loc.
+ (cp_parser_elaborated_type_specifier): Likewise for renaming of
+ pedwarn_at_rich_loc.
+ (cp_parser_cv_qualifier_seq_opt): Likewise for renaming of
+ error_at_rich_loc.
+ (cp_parser_virt_specifier_seq_opt): Likewise.
+ (cp_parser_class_specifier_1): Likewise.
+ (cp_parser_class_head): Likewise.
+ (cp_parser_member_declaration): Likewise for renaming of
+ pedwarn_at_rich_loc, warning_at_rich_loc, and error_at_rich_loc.
+ (cp_parser_enclosed_template_argument_list): Likewise for renaming
+ of error_at_rich_loc.
+ (set_and_check_decl_spec_loc): Likewise.
+ * pt.c (listify): Likewise.
+ * rtti.c (typeid_ok_p): Likewise.
+ * semantics.c (process_outer_var_ref): Use UNKNOWN_LOCATION rather
+ than 0.
+ * typeck.c (access_failure_info::maybe_suggest_accessor): Update
+ for renaming of inform_at_rich_loc.
+ (finish_class_member_access_expr): Likewise for renaming of
+ error_at_rich_loc.
+
+2017-10-31 Nathan Sidwell <nathan@acm.org>
+
+ * cp-tree.h (struct operator_name_info_t): Rename to ...
+ (struct ovl_op_info_t): ... here. Add tree_code field.
+ (operator_name_info, assignment_operator_name_info): Delete.
+ (ovl_op_info): Declare.
+ (OVL_OP_INFO): Adjust.
+ * decl.c (grok_op_properties): Use ovl_op_flags.
+ * lex.c (operator_name_info, assignment_operator_name_info):
+ Delete.
+ (ovl_op_info): Define.
+ (set_operator_ident): Adjust.
+ (init_operators): Set tree_code.
+ * mangle.c (write_unqualified_id): Adjust operator array scan.
+
+ * lex.c (init_operators): Allow NULL operator name. Don't add
+ special cases.
+ * operators.def: Use NULL for mangling only operators. Move to
+ after regular operators but move assignment operators last.
+
+ * cp-tree.h (enum ovl_op_flags): New.
+ (struct operator_name_info_t): Rename arity to flags.
+ * lex.c (set_operator_ident): New.
+ (init_operators): Use it. Adjust for flags.
+ * mangle.c (write_unqualified_id): Adjust for flags.
+ * operators.def: Replace arity with flags.
+
+ * cp-tree.h (ovl_op_identifier): New.
+ (assign_op_identifier, call_op_identifier): Adjust.
+ (cp_operator_id, cp_assignment_operator_id): Delete.
+ (SET_OVERLOADED_OPERATOR_CODE): Delete.
+ (OVL_OP_INFO): New.
+ * call.c (op_error): Use OVL_OP_INFO.
+ (build_conditional_expr_1): Use ovl_op_identifier.
+ (build_new_op_1): Use OVL_OP_INFO & ovl_op_identifier.
+ (build_op_delete_call): Likewise.
+ * class.c (type_requires_array_cookie): Use ovl_op_identifier.
+ * decl.c (duplicate_decls): Directly copy operator code.
+ (builtin_function_1): Do not set operator code.
+ (build_library_fn): Directly set operator code.
+ (push_cp_library_fn): Use ovl_op_identifier.
+ (grok_op_properties): Directly set operator code.
+ * decl2.c (maybe_warn_sized_delete): Use ovl_op_identifier.
+ * error.c (dump_expr): Use OVL_OP_INFO.
+ (op_to_string): Add assop arg. Use OVL_OP_INFO.
+ (assop_to_string): Delete.
+ (args_to_string): Adjust.
+ * init.c (build_new_1): Use ovl_op_identifier.
+ * mangle.c (write_unqualified_name): Use OVL_OP_INFO.
+ (write_expression): Likewise.
+ * method.c (synthesized_method_walk): Use ovl_op_identifier.
+ (implicitly_declare_fn): Use assign_op_identifier. Directly set
+ operator code.
+ * name-lookup.c (get_class_binding): Use assign_op_identifier.
+ * parser.c (cp_parser_operator): Use ovl_op_identifier.
+ (cp_parser_omp_clause_reduction): Likewise.
+ * semantics.c (omp_reduction_id): Likewise.
+ * typeck.c (cxx_sizeof_or_alignof_type): Use OVL_OP_INFO.
+
+ * cp-tree.h (assign_op_identifier, call_op_identifier): Define.
+ (LAMBDA_FUNCTION_P): Use DECL_OVERLOADED_OPERATOR_IS.
+ (DECL_OVERLOADED_OPERATOR_P): Just return true/false.
+ (DECL_OVERLOADED_OPERATOR_CODE, DECL_OVERLOADED_OPERATOR_IS): Define.
+ * call.c (add_function_candidate): Use
+ DECL_OVERLOADED_OPERATOR_IS.
+ (build_op_call_1): Use call_op_identifier &
+ DECL_OVERLOADED_OPERATOR_IS.
+ (build_over_call): Likewise.
+ (has_trivial_copy_assign_p): Use assign_op_identifier.
+ (build_special_member_call): Likewise.
+ * class.c (dfs_declare_virt_assop_and_dtor): Likewise.
+ (vbase_has_user_provided_move_assign,
+ classtype_has_move_assign_or_move_ctor_p): Likewise.
+ * decl.c (duplicate_decls): Use DECL_OVERLOADED_OPERATOR_CODE.
+ (grok_special_member_properties): Use assign_op_identifier.
+ (start_preparsed_function): Use DECL_OVERLOADED_OPERATOR_IS.
+ * decl2.c (mark_used): Use DECL_CONV_FN_P.
+ * dump.c (dump_access): Delete prototype.
+ (dump_op): Delete.
+ (cp_dump_tree): Don't call it.
+ * lambda.c (lambda_function): Use call_op_identifier.
+ (maybe_add_lambda_conv_op): Not an overloaded operator. Remove
+ unneeded braces.
+ * mangle.c (write_unqualified_name): Use DECL_OVERLOADED_OPERATOR_CODE.
+ * method.c (do_build_copy_assign): Use assign_op_identifier.
+ (synthesize_method): Use DECL_OVERLOADED_OPERATOR_IS.
+ (get_copy_assign): Use assign_op_identifier.
+ (synthesized_method_walk): Likewise.
+ (defaultable_fn_check): Use DECL_OVERLOADED_OPERATOR_IS.
+ * parser.c (cp_parser_lambda_declarator_opt): Use
+ call_op_identifier.
+ * semantics.c (classtype_has_nothrow_assign_or_copy_p): Use
+ assign_op_identifier.
+ * tree.c (special_function_p): Use DECL_OVERLOADED_OPERATOR_IS.
+ * typeck.c (check_return_expr): Use DECL_OVERLOADED_OPERATOR_CODE.
+ (check_return_expr): Use assign_op_identifier.
+
+2017-10-30 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82085
+ * pt.c (tsubst_copy_and_build, [INDIRECT_REF]): For a REFERENCE_REF_P,
+ unconditionally call convert_from_reference.
+
+2017-10-30 Nathan Sidwell <nathan@acm.org>
+
+ * call.c (build_op_call_1): Test for FUNCTION_DECL in same manner
+ as a few lines earlier.
+ * cp-tree.h (PACK_EXPANSION_PATTERN): Fix white space.
+ * decl.c (grokfndecl): Fix indentation.
+ (compute_array_index_type): Use processing_template_decl_sentinel.
+ (grok_op_properties): Move warnings to end. Reorder other checks
+ to group similar entities. Tweak diagnostics.
+ * lex.c (unqualified_name_lookup_error): No need to check name is
+ not ERROR_MARK operator.
+ * parser.c (cp_parser_operator): Select operator code before
+ looking it up.
+ * typeck.c (check_return_expr): Fix indentation and line wrapping.
+
+2017-10-27 Paolo Carlini <paolo.carlini@oracle.com>
+
+ * pt.c (invalid_nontype_parm_type_p): Return a bool instead of an int.
+
+2017-10-26 Nathan Sidwell <nathan@acm.org>
+
+ * decl.c (sort_labels): Restore function.
+ (pop_labels): Sort labels.
+ (identify_goto): Add translation markup.
+
+2017-10-25 Nathan Sidwell <nathan@acm.org>
+
+ Kill IDENTIFIER_LABEL_VALUE.
+ * cp-tree.h (lang_identifier): Delete label_value slot.
+ (IDENTIFIER_LABEL_VALUE, SET_IDENTIFIER_LABEL_VALUE): Delete.
+ (struct named_label_hasher): Rename to ...
+ (struct named_label_hash): ... here. Reimplement.
+ (struct language_function): Adjust x_named_labels.
+ * name-lookup.h (struct cp_label_binding): Delete.
+ (struct cp_binding_level): Delete shadowed_labels slot.
+ * decl.c (struct named_label_entry): Add name and outer slots.
+ (pop_label): Rename to ...
+ (check_label_used): ... here. Don't pop.
+ (note_label, sort_labels): Delete.
+ (pop_labels, pop_local_label): Reimplement.
+ (poplevel): Pop local labels as any other decl. Remove
+ shadowed_labels handling.
+ (named_label_hash::hash, named_label_hash::equal): New.
+ (make_label_decl): Absorb into ...
+ (lookup_label_1): ... here. Add making_local_p arg, reimplement.
+ (lookup_label, declare_local_label): Adjust.
+ (check_goto, define_label): Adjust.
+ * lex.c (make_conv_op_name): Don't clear IDENTIFIER_LABEL_VALUE.
+ * ptree.c (cxx_print_identifier): Don't print identifier binding.
+
+ * decl.c (identify_goto): Reduce duplication.
+ (check_previous_goto_1): Likewise.
+ (check_goto): Move var decls to initialization.
+ (check_omp_return, define_label_1, define_label): Likewise.
+
+2017-10-25 Jakub Jelinek <jakub@redhat.com>
+
+ PR libstdc++/81706
+ * decl.c (duplicate_decls): Copy "omp declare simd" attributes from
+ newdecl to corresponding __builtin_ if any.
+
+2017-10-24 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82466
+ * decl.c (duplicate_decls): Warn for built-in functions declared as
+ non-function, use OPT_Wbuiltin_declaration_mismatch.
+
+ * decl.c (duplicate_decls): Avoid redundant '+' in warning_at.
+
+2017-10-24 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/80991
+ * pt.c (value_dependent_expression_p, [TRAIT_EXPR]): Handle
+ a TREE_LIST as TRAIT_EXPR_TYPE2.
+
+2017-10-24 Mukesh Kapoor <mukesh.kapoor@oracle.com>
+ Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82307
+ * cvt.c (type_promotes_to): Implement C++17, 7.6/4, about unscoped
+ enumeration type whose underlying type is fixed.
+
+2017-10-23 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/80449
+ * semantics.c (finish_compound_literal): Check do_auto_deduction
+ return value for error_mark_node.
+
+2017-10-23 Jason Merrill <jason@redhat.com>
+
+ PR c++/77369 - wrong noexcept handling in C++14 and below
+ * tree.c (strip_typedefs): Canonicalize TYPE_RAISES_EXCEPTIONS.
+
+2017-10-20 Nathan Sidwell <nathan@acm.org>
+
+ * class.c (layout_class_type): Cleanup as-base creation, determine
+ mode here.
+ (finish_struct_1): ... not here.
+
+2017-10-19 Jakub Jelinek <jakub@redhat.com>
+
+ PR c++/82600
+ * typeck.c (check_return_expr): Don't call
+ maybe_warn_about_returning_address_of_local in templates.
+
+2017-10-17 Nathan Sidwell <nathan@acm.org>
+
+ PR c++/82560
+ * call.c (build_over_call): Don't pass tf_no_cleanup to nested
+ calls.
+
+ PR middle-end/82546
+ * cp-objcp-common.c (cp_tree_size): Reformat. Adjust returned size
+ of TYPE nodes.
+
+2017-10-13 Jason Merrill <jason@redhat.com>
+
+ PR c++/82357 - bit-field in template
+ * tree.c (cp_stabilize_reference): Just return a NON_DEPENDENT_EXPR.
+
+2017-10-13 David Malcolm <dmalcolm@redhat.com>
+
+ * cp-tree.h (maybe_show_extern_c_location): New decl.
+ * decl.c (grokfndecl): When complaining about literal operators
+ with C linkage, issue a note giving the location of the
+ extern "C".
+ * parser.c (cp_parser_new): Initialize new field
+ "innermost_linkage_specification_location".
+ (cp_parser_linkage_specification): Store the location
+ of the linkage specification within the cp_parser.
+ (cp_parser_explicit_specialization): When complaining about
+ template specializations with C linkage, issue a note giving the
+ location of the extern "C".
+ (cp_parser_explicit_template_declaration): Likewise for templates.
+ (maybe_show_extern_c_location): New function.
+ * parser.h (struct cp_parser): New field
+ "innermost_linkage_specification_location".
+
2017-10-12 Nathan Sidwell <nathan@acm.org>
* cp-tree.h (cp_expr): Add const operator * and operator->
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 8794210be0a..49cda986f44 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -2082,7 +2082,7 @@ add_function_candidate (struct z_candidate **candidates,
if (DECL_CONSTRUCTOR_P (fn))
i = 1;
else if (DECL_ASSIGNMENT_OPERATOR_P (fn)
- && DECL_OVERLOADED_OPERATOR_P (fn) == NOP_EXPR)
+ && DECL_OVERLOADED_OPERATOR_IS (fn, NOP_EXPR))
i = 2;
else
i = 0;
@@ -4474,7 +4474,7 @@ build_op_call_1 (tree obj, vec<tree, va_gc> **args, tsubst_flags_t complain)
if (TYPE_BINFO (type))
{
- fns = lookup_fnfields (TYPE_BINFO (type), cp_operator_id (CALL_EXPR), 1);
+ fns = lookup_fnfields (TYPE_BINFO (type), call_op_identifier, 1);
if (fns == error_mark_node)
return error_mark_node;
}
@@ -4557,19 +4557,20 @@ build_op_call_1 (tree obj, vec<tree, va_gc> **args, tsubst_flags_t complain)
}
result = error_mark_node;
}
- /* Since cand->fn will be a type, not a function, for a conversion
- function, we must be careful not to unconditionally look at
- DECL_NAME here. */
else if (TREE_CODE (cand->fn) == FUNCTION_DECL
- && DECL_OVERLOADED_OPERATOR_P (cand->fn) == CALL_EXPR)
+ && DECL_OVERLOADED_OPERATOR_P (cand->fn)
+ && DECL_OVERLOADED_OPERATOR_IS (cand->fn, CALL_EXPR))
result = build_over_call (cand, LOOKUP_NORMAL, complain);
else
{
- if (DECL_P (cand->fn))
+ if (TREE_CODE (cand->fn) == FUNCTION_DECL)
obj = convert_like_with_context (cand->convs[0], obj, cand->fn,
-1, complain);
else
- obj = convert_like (cand->convs[0], obj, complain);
+ {
+ gcc_checking_assert (TYPE_P (cand->fn));
+ obj = convert_like (cand->convs[0], obj, complain);
+ }
obj = convert_from_reference (obj);
result = cp_build_function_call_vec (obj, args, complain);
}
@@ -4619,12 +4620,8 @@ static void
op_error (location_t loc, enum tree_code code, enum tree_code code2,
tree arg1, tree arg2, tree arg3, bool match)
{
- const char *opname;
-
- if (code == MODIFY_EXPR)
- opname = assignment_operator_name_info[code2].name;
- else
- opname = operator_name_info[code].name;
+ bool assop = code == MODIFY_EXPR;
+ const char *opname = OVL_OP_INFO (assop, assop ? code2 : code)->name;
switch (code)
{
@@ -5183,7 +5180,7 @@ build_conditional_expr_1 (location_t loc, tree arg1, tree arg2, tree arg3,
add_builtin_candidates (&candidates,
COND_EXPR,
NOP_EXPR,
- cp_operator_id (COND_EXPR),
+ ovl_op_identifier (false, COND_EXPR),
args,
LOOKUP_NORMAL, complain);
@@ -5573,7 +5570,6 @@ build_new_op_1 (location_t loc, enum tree_code code, int flags, tree arg1,
{
struct z_candidate *candidates = 0, *cand;
vec<tree, va_gc> *arglist;
- tree fnname;
tree args[3];
tree result = NULL_TREE;
bool result_valid_p = false;
@@ -5590,14 +5586,13 @@ build_new_op_1 (location_t loc, enum tree_code code, int flags, tree arg1,
|| error_operand_p (arg3))
return error_mark_node;
- if (code == MODIFY_EXPR)
+ bool ismodop = code == MODIFY_EXPR;
+ if (ismodop)
{
code2 = TREE_CODE (arg3);
arg3 = NULL_TREE;
- fnname = cp_assignment_operator_id (code2);
}
- else
- fnname = cp_operator_id (code);
+ tree fnname = ovl_op_identifier (ismodop, ismodop ? code2 : code);
arg1 = prep_operand (arg1);
@@ -5792,7 +5787,7 @@ build_new_op_1 (location_t loc, enum tree_code code, int flags, tree arg1,
? G_("no %<%D(int)%> declared for postfix %qs,"
" trying prefix operator instead")
: G_("no %<%D(int)%> declared for postfix %qs");
- permerror (loc, msg, fnname, operator_name_info[code].name);
+ permerror (loc, msg, fnname, OVL_OP_INFO (false, code)->name);
}
if (!flag_permissive)
@@ -6204,7 +6199,7 @@ build_op_delete_call (enum tree_code code, tree addr, tree size,
type = strip_array_types (TREE_TYPE (TREE_TYPE (addr)));
- fnname = cp_operator_id (code);
+ fnname = ovl_op_identifier (false, code);
if (CLASS_TYPE_P (type)
&& COMPLETE_TYPE_P (complete_type (type))
@@ -6433,7 +6428,7 @@ build_op_delete_call (enum tree_code code, tree addr, tree size,
if (complain & tf_error)
error ("no suitable %<operator %s%> for %qT",
- operator_name_info[(int)code].name, type);
+ OVL_OP_INFO (false, code)->name, type);
return error_mark_node;
}
@@ -7717,8 +7712,11 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
}
/* N3276 magic doesn't apply to nested calls. */
- int decltype_flag = (complain & tf_decltype);
+ tsubst_flags_t decltype_flag = (complain & tf_decltype);
complain &= ~tf_decltype;
+ /* No-Cleanup doesn't apply to nested calls either. */
+ tsubst_flags_t no_cleanup_complain = complain;
+ complain &= ~tf_no_cleanup;
/* Find maximum size of vector to hold converted arguments. */
parmlen = list_length (parm);
@@ -7916,7 +7914,7 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
if (flags & LOOKUP_NO_CONVERSION)
conv->user_conv_p = true;
- tsubst_flags_t arg_complain = complain & (~tf_no_cleanup);
+ tsubst_flags_t arg_complain = complain;
if (!conversion_warning)
arg_complain &= ~tf_warning;
@@ -8110,7 +8108,8 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
return val;
}
}
- else if (DECL_OVERLOADED_OPERATOR_P (fn) == NOP_EXPR
+ else if (DECL_ASSIGNMENT_OPERATOR_P (fn)
+ && DECL_OVERLOADED_OPERATOR_IS (fn, NOP_EXPR)
&& trivial_fn_p (fn)
&& !DECL_DELETED_FN (fn))
{
@@ -8164,7 +8163,8 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
else if (default_ctor_p (fn))
{
if (is_dummy_object (argarray[0]))
- return force_target_expr (DECL_CONTEXT (fn), void_node, complain);
+ return force_target_expr (DECL_CONTEXT (fn), void_node,
+ no_cleanup_complain);
else
return cp_build_indirect_ref (argarray[0], RO_NULL, complain);
}
@@ -8277,7 +8277,7 @@ first_non_public_field (tree type)
static bool
has_trivial_copy_assign_p (tree type, bool access, bool *hasassign)
{
- tree fns = get_class_binding (type, cp_assignment_operator_id (NOP_EXPR));
+ tree fns = get_class_binding (type, assign_op_identifier);
bool all_trivial = true;
/* Iterate over overloads of the assignment operator, checking
@@ -8777,8 +8777,7 @@ build_special_member_call (tree instance, tree name, vec<tree, va_gc> **args,
vec<tree, va_gc> *allocated = NULL;
tree ret;
- gcc_assert (IDENTIFIER_CDTOR_P (name)
- || name == cp_assignment_operator_id (NOP_EXPR));
+ gcc_assert (IDENTIFIER_CDTOR_P (name) || name == assign_op_identifier);
if (TYPE_P (binfo))
{
/* Resolve the name. */
@@ -8804,7 +8803,7 @@ build_special_member_call (tree instance, tree name, vec<tree, va_gc> **args,
if (!same_type_ignoring_top_level_qualifiers_p
(TREE_TYPE (instance), BINFO_TYPE (binfo)))
{
- if (name != cp_assignment_operator_id (NOP_EXPR))
+ if (IDENTIFIER_CDTOR_P (name))
/* For constructors and destructors, either the base is
non-virtual, or it is virtual but we are doing the
conversion from a constructor or destructor for the
@@ -8812,10 +8811,13 @@ build_special_member_call (tree instance, tree name, vec<tree, va_gc> **args,
statically. */
instance = convert_to_base_statically (instance, binfo);
else
- /* However, for assignment operators, we must convert
- dynamically if the base is virtual. */
- instance = build_base_path (PLUS_EXPR, instance,
- binfo, /*nonnull=*/1, complain);
+ {
+ /* However, for assignment operators, we must convert
+ dynamically if the base is virtual. */
+ gcc_checking_assert (name == assign_op_identifier);
+ instance = build_base_path (PLUS_EXPR, instance,
+ binfo, /*nonnull=*/1, complain);
+ }
}
}
@@ -9062,7 +9064,6 @@ build_new_method_call_1 (tree instance, tree fns, vec<tree, va_gc> **args,
static member function. */
instance = mark_type_use (instance);
-
/* Figure out whether to skip the first argument for the error
message we will display to users if an error occurs. We don't
want to display any compiler-generated arguments. The "this"
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index a90b85f2a5c..98e62c6ad45 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -3011,7 +3011,7 @@ static tree
dfs_declare_virt_assop_and_dtor (tree binfo, void *data)
{
tree bv, fn, t = (tree)data;
- tree opname = cp_assignment_operator_id (NOP_EXPR);
+ tree opname = assign_op_identifier;
gcc_assert (t && CLASS_TYPE_P (t));
gcc_assert (binfo && TREE_CODE (binfo) == TREE_BINFO);
@@ -5038,7 +5038,7 @@ vbase_has_user_provided_move_assign (tree type)
/* Does the type itself have a user-provided move assignment operator? */
if (!CLASSTYPE_LAZY_MOVE_ASSIGN (type))
for (ovl_iterator iter (get_class_binding_direct
- (type, cp_assignment_operator_id (NOP_EXPR)));
+ (type, assign_op_identifier));
iter; ++iter)
if (!DECL_ARTIFICIAL (*iter) && move_fn_p (*iter))
return true;
@@ -5186,7 +5186,7 @@ classtype_has_move_assign_or_move_ctor_p (tree t, bool user_p)
if (!CLASSTYPE_LAZY_MOVE_ASSIGN (t))
for (ovl_iterator iter (get_class_binding_direct
- (t, cp_assignment_operator_id (NOP_EXPR)));
+ (t, assign_op_identifier));
iter; ++iter)
if ((!user_p || !DECL_ARTIFICIAL (*iter)) && move_fn_p (*iter))
return true;
@@ -5304,7 +5304,7 @@ type_requires_array_cookie (tree type)
the array to the deallocation function, so we will need to store
a cookie. */
fns = lookup_fnfields (TYPE_BINFO (type),
- cp_operator_id (VEC_DELETE_EXPR),
+ ovl_op_identifier (false, VEC_DELETE_EXPR),
/*protect=*/0);
/* If there are no `operator []' members, or the lookup is
ambiguous, then we don't need a cookie. */
@@ -5394,18 +5394,20 @@ explain_non_literal_class (tree t)
/* Already explained. */
return;
- inform (0, "%q+T is not literal because:", t);
+ inform (UNKNOWN_LOCATION, "%q+T is not literal because:", t);
if (cxx_dialect < cxx17 && LAMBDA_TYPE_P (t))
- inform (0, " %qT is a closure type, which is only literal in "
+ inform (UNKNOWN_LOCATION,
+ " %qT is a closure type, which is only literal in "
"C++17 and later", t);
else if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t))
- inform (0, " %q+T has a non-trivial destructor", t);
+ inform (UNKNOWN_LOCATION, " %q+T has a non-trivial destructor", t);
else if (CLASSTYPE_NON_AGGREGATE (t)
&& !TYPE_HAS_TRIVIAL_DFLT (t)
&& !LAMBDA_TYPE_P (t)
&& !TYPE_HAS_CONSTEXPR_CTOR (t))
{
- inform (0, " %q+T is not an aggregate, does not have a trivial "
+ inform (UNKNOWN_LOCATION,
+ " %q+T is not an aggregate, does not have a trivial "
"default constructor, and has no constexpr constructor that "
"is not a copy or move constructor", t);
if (type_has_non_user_provided_default_constructor (t))
@@ -5437,7 +5439,8 @@ explain_non_literal_class (tree t)
tree basetype = TREE_TYPE (base_binfo);
if (!CLASSTYPE_LITERAL_P (basetype))
{
- inform (0, " base class %qT of %q+T is non-literal",
+ inform (UNKNOWN_LOCATION,
+ " base class %qT of %q+T is non-literal",
basetype, t);
explain_non_literal_class (basetype);
return;
@@ -5992,8 +5995,6 @@ layout_class_type (tree t, tree *virtuals_p)
bool last_field_was_bitfield = false;
/* The location at which the next field should be inserted. */
tree *next_field;
- /* T, as a base class. */
- tree base_t;
/* Keep track of the first non-static data member. */
non_static_data_members = TYPE_FIELDS (t);
@@ -6218,15 +6219,11 @@ layout_class_type (tree t, tree *virtuals_p)
that the type is laid out they are no longer important. */
remove_zero_width_bit_fields (t);
- /* Create the version of T used for virtual bases. We do not use
- make_class_type for this version; this is an artificial type. For
- a POD type, we just reuse T. */
if (CLASSTYPE_NON_LAYOUT_POD_P (t) || CLASSTYPE_EMPTY_P (t))
{
- base_t = make_node (TREE_CODE (t));
-
- /* Set the size and alignment for the new type. */
- tree eoc;
+ /* T needs a different layout as a base (eliding virtual bases
+ or whatever). Create that version. */
+ tree base_t = make_node (TREE_CODE (t));
/* If the ABI version is not at least two, and the last
field was a bit-field, RLI may not be on a byte
@@ -6235,7 +6232,7 @@ layout_class_type (tree t, tree *virtuals_p)
indicates the total number of bits used. Therefore,
rli_size_so_far, rather than rli_size_unit_so_far, is
used to compute TYPE_SIZE_UNIT. */
- eoc = end_of_class (t, /*include_virtuals_p=*/0);
+ tree eoc = end_of_class (t, /*include_virtuals_p=*/0);
TYPE_SIZE_UNIT (base_t)
= size_binop (MAX_EXPR,
fold_convert (sizetype,
@@ -6252,7 +6249,8 @@ layout_class_type (tree t, tree *virtuals_p)
SET_TYPE_ALIGN (base_t, rli->record_align);
TYPE_USER_ALIGN (base_t) = TYPE_USER_ALIGN (t);
- /* Copy the fields from T. */
+ /* Copy the non-static data members of T. This will include its
+ direct non-virtual bases & vtable. */
next_field = &TYPE_FIELDS (base_t);
for (field = TYPE_FIELDS (t); field; field = DECL_CHAIN (field))
if (TREE_CODE (field) == FIELD_DECL)
@@ -6263,9 +6261,14 @@ layout_class_type (tree t, tree *virtuals_p)
}
*next_field = NULL_TREE;
+ /* We use the base type for trivial assignments, and hence it
+ needs a mode. */
+ compute_record_mode (base_t);
+
+ TYPE_CONTEXT (base_t) = t;
+
/* Record the base version of the type. */
CLASSTYPE_AS_BASE (t) = base_t;
- TYPE_CONTEXT (base_t) = t;
}
else
CLASSTYPE_AS_BASE (t) = t;
@@ -6822,11 +6825,6 @@ finish_struct_1 (tree t)
set_class_bindings (t);
- if (CLASSTYPE_AS_BASE (t) != t)
- /* We use the base type for trivial assignments, and hence it
- needs a mode. */
- compute_record_mode (CLASSTYPE_AS_BASE (t));
-
/* With the layout complete, check for flexible array members and
zero-length arrays that might overlap other members in the final
layout. */
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 1aa529eb8dc..fdc296908af 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1269,8 +1269,7 @@ cxx_bind_parameters_in_call (const constexpr_ctx *ctx, tree t,
{
x = ctx->object;
/* We don't use cp_build_addr_expr here because we don't want to
- capture the object argument until we've chosen a non-static member
- function. */
+ capture the object argument during constexpr evaluation. */
x = build_address (x);
}
bool lval = false;
diff --git a/gcc/cp/cp-objcp-common.c b/gcc/cp/cp-objcp-common.c
index f251b05775b..e051d66b67b 100644
--- a/gcc/cp/cp-objcp-common.c
+++ b/gcc/cp/cp-objcp-common.c
@@ -61,43 +61,34 @@ cxx_warn_unused_global_decl (const_tree decl)
size_t
cp_tree_size (enum tree_code code)
{
+ gcc_checking_assert (code >= NUM_TREE_CODES);
switch (code)
{
- case PTRMEM_CST: return sizeof (struct ptrmem_cst);
- case BASELINK: return sizeof (struct tree_baselink);
+ case PTRMEM_CST: return sizeof (ptrmem_cst);
+ case BASELINK: return sizeof (tree_baselink);
case TEMPLATE_PARM_INDEX: return sizeof (template_parm_index);
- case DEFAULT_ARG: return sizeof (struct tree_default_arg);
- case DEFERRED_NOEXCEPT: return sizeof (struct tree_deferred_noexcept);
- case OVERLOAD: return sizeof (struct tree_overload);
- case STATIC_ASSERT: return sizeof (struct tree_static_assert);
+ case DEFAULT_ARG: return sizeof (tree_default_arg);
+ case DEFERRED_NOEXCEPT: return sizeof (tree_deferred_noexcept);
+ case OVERLOAD: return sizeof (tree_overload);
+ case STATIC_ASSERT: return sizeof (tree_static_assert);
case TYPE_ARGUMENT_PACK:
- case TYPE_PACK_EXPANSION:
- return sizeof (struct tree_common);
-
+ case TYPE_PACK_EXPANSION: return sizeof (tree_type_non_common);
case NONTYPE_ARGUMENT_PACK:
- case EXPR_PACK_EXPANSION:
- return sizeof (struct tree_exp);
-
- case ARGUMENT_PACK_SELECT:
- return sizeof (struct tree_argument_pack_select);
-
- case TRAIT_EXPR:
- return sizeof (struct tree_trait_expr);
-
- case LAMBDA_EXPR: return sizeof (struct tree_lambda_expr);
-
- case TEMPLATE_INFO: return sizeof (struct tree_template_info);
-
- case CONSTRAINT_INFO: return sizeof (struct tree_constraint_info);
-
- case USERDEF_LITERAL: return sizeof (struct tree_userdef_literal);
-
- case TEMPLATE_DECL: return sizeof (struct tree_template_decl);
-
+ case EXPR_PACK_EXPANSION: return sizeof (tree_exp);
+ case ARGUMENT_PACK_SELECT: return sizeof (tree_argument_pack_select);
+ case TRAIT_EXPR: return sizeof (tree_trait_expr);
+ case LAMBDA_EXPR: return sizeof (tree_lambda_expr);
+ case TEMPLATE_INFO: return sizeof (tree_template_info);
+ case CONSTRAINT_INFO: return sizeof (tree_constraint_info);
+ case USERDEF_LITERAL: return sizeof (tree_userdef_literal);
+ case TEMPLATE_DECL: return sizeof (tree_template_decl);
default:
- if (TREE_CODE_CLASS (code) == tcc_declaration)
- return sizeof (struct tree_decl_non_common);
- gcc_unreachable ();
+ switch (TREE_CODE_CLASS (code))
+ {
+ case tcc_declaration: return sizeof (tree_decl_non_common);
+ case tcc_type: return sizeof (tree_type_non_common);
+ default: gcc_unreachable ();
+ }
}
/* NOTREACHED */
}
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index dc98dd881c5..874cbcbd2bd 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -217,7 +217,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
things) to iterate over their overloads defined by/for a type. For
example:
- tree ovlid = cp_assignment_operator_id (NOP_EXPR);
+ tree ovlid = assign_op_identifier;
tree overloads = get_class_binding (type, ovlid);
for (ovl_iterator it (overloads); it; ++it) { ... }
@@ -244,22 +244,15 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
/* The name of a destructor that destroys virtual base classes, and
then deletes the entire object. */
#define deleting_dtor_identifier cp_global_trees[CPTI_DELETING_DTOR_IDENTIFIER]
+
+#define ovl_op_identifier(ISASS, CODE) (OVL_OP_INFO(ISASS, CODE)->identifier)
+#define assign_op_identifier (ovl_op_info[true][OVL_OP_NOP_EXPR].identifier)
+#define call_op_identifier (ovl_op_info[false][OVL_OP_CALL_EXPR].identifier)
/* The name used for conversion operators -- but note that actual
conversion functions use special identifiers outside the identifier
table. */
#define conv_op_identifier cp_global_trees[CPTI_CONV_OP_IDENTIFIER]
-/* The name of the identifier used internally to represent operator CODE. */
-#define cp_operator_id(CODE) \
- (operator_name_info[(int) (CODE)].identifier)
-
-/* The name of the identifier used to represent assignment operator CODE,
- both simple (i.e., operator= with CODE == NOP_EXPR) and compound (e.g.,
- operator+= with CODE == PLUS_EXPR). Includes copy and move assignment.
- Use copy_fn_p() to test specifically for copy assignment. */
-#define cp_assignment_operator_id(CODE) \
- (assignment_operator_name_info[(int) (CODE)].identifier)
-
#define delta_identifier cp_global_trees[CPTI_DELTA_IDENTIFIER]
#define in_charge_identifier cp_global_trees[CPTI_IN_CHARGE_IDENTIFIER]
/* The name of the parameter that contains a pointer to the VTT to use
@@ -561,7 +554,6 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
struct GTY(()) lang_identifier {
struct c_common_identifier c_common;
cxx_binding *bindings;
- tree label_value;
};
/* Return a typed pointer version of T if it designates a
@@ -996,11 +988,6 @@ enum GTY(()) abstract_class_use {
#define SET_IDENTIFIER_TYPE_VALUE(NODE,TYPE) (TREE_TYPE (NODE) = (TYPE))
#define IDENTIFIER_HAS_TYPE_VALUE(NODE) (IDENTIFIER_TYPE_VALUE (NODE) ? 1 : 0)
-#define IDENTIFIER_LABEL_VALUE(NODE) \
- (LANG_IDENTIFIER_CAST (NODE)->label_value)
-#define SET_IDENTIFIER_LABEL_VALUE(NODE, VALUE) \
- IDENTIFIER_LABEL_VALUE (NODE) = (VALUE)
-
/* Kinds of identifiers. Values are carefully chosen. */
enum cp_identifier_kind {
cik_normal = 0, /* Not a special identifier. */
@@ -1009,9 +996,9 @@ enum cp_identifier_kind {
cik_dtor = 3, /* Destructor (in-chg, deleting, complete or
base). */
cik_simple_op = 4, /* Non-assignment operator name. */
- cik_newdel_op = 5, /* New or delete operator name. */
- cik_assign_op = 6, /* An assignment operator name. */
- cik_conv_op = 7, /* Conversion operator name. */
+ cik_assign_op = 5, /* An assignment operator name. */
+ cik_conv_op = 6, /* Conversion operator name. */
+ cik_reserved_for_udlit = 7, /* Not yet in use */
cik_max
};
@@ -1066,24 +1053,38 @@ enum cp_identifier_kind {
#define IDENTIFIER_ANY_OP_P(NODE) \
(IDENTIFIER_KIND_BIT_2 (NODE))
-/* True if this identifier is for new or delete operator. Value 5. */
-#define IDENTIFIER_NEWDEL_OP_P(NODE) \
- (IDENTIFIER_KIND_BIT_2 (NODE) \
- & (!IDENTIFIER_KIND_BIT_1 (NODE)) \
- & IDENTIFIER_KIND_BIT_0 (NODE))
+/* True if this identifier is for an overloaded operator. Values 4, 5. */
+#define IDENTIFIER_OVL_OP_P(NODE) \
+ (IDENTIFIER_ANY_OP_P (NODE) \
+ & (!IDENTIFIER_KIND_BIT_1 (NODE)))
-/* True if this identifier is for any assignment. Values 6. */
+/* True if this identifier is for any assignment. Values 5. */
#define IDENTIFIER_ASSIGN_OP_P(NODE) \
- (IDENTIFIER_KIND_BIT_2 (NODE) \
- & IDENTIFIER_KIND_BIT_1 (NODE) \
- & (!IDENTIFIER_KIND_BIT_0 (NODE)))
+ (IDENTIFIER_OVL_OP_P (NODE) \
+ & IDENTIFIER_KIND_BIT_0 (NODE))
/* True if this identifier is the name of a type-conversion
operator. Value 7. */
#define IDENTIFIER_CONV_OP_P(NODE) \
- (IDENTIFIER_KIND_BIT_2 (NODE) \
+ (IDENTIFIER_ANY_OP_P (NODE) \
& IDENTIFIER_KIND_BIT_1 (NODE) \
- & IDENTIFIER_KIND_BIT_0 (NODE))
+ & (!IDENTIFIER_KIND_BIT_0 (NODE)))
+
+/* True if this identifier is a new or delete operator. */
+#define IDENTIFIER_NEWDEL_OP_P(NODE) \
+ (IDENTIFIER_OVL_OP_P (NODE) \
+ && IDENTIFIER_OVL_OP_FLAGS (NODE) & OVL_OP_FLAG_ALLOC)
+
+/* True if this identifier is a new operator. */
+#define IDENTIFIER_NEW_OP_P(NODE) \
+ (IDENTIFIER_OVL_OP_P (NODE) \
+ && (IDENTIFIER_OVL_OP_FLAGS (NODE) \
+ & (OVL_OP_FLAG_ALLOC | OVL_OP_FLAG_DELETE)) == OVL_OP_FLAG_ALLOC)
+
+/* Access a C++-specific index for identifier NODE.
+ Used to optimize operator mappings etc. */
+#define IDENTIFIER_CP_INDEX(NODE) \
+ (IDENTIFIER_NODE_CHECK(NODE)->base.u.bits.address_space)
/* In a RECORD_TYPE or UNION_TYPE, nonzero if any component is read-only. */
#define C_TYPE_FIELDS_READONLY(TYPE) \
@@ -1209,9 +1210,10 @@ struct GTY (()) tree_trait_expr {
(CLASS_TYPE_P (NODE) && CLASSTYPE_LAMBDA_EXPR (NODE))
/* Test if FUNCTION_DECL is a lambda function. */
-#define LAMBDA_FUNCTION_P(FNDECL) \
- (DECL_DECLARES_FUNCTION_P (FNDECL) \
- && DECL_OVERLOADED_OPERATOR_P (FNDECL) == CALL_EXPR \
+#define LAMBDA_FUNCTION_P(FNDECL) \
+ (DECL_DECLARES_FUNCTION_P (FNDECL) \
+ && DECL_OVERLOADED_OPERATOR_P (FNDECL) \
+ && DECL_OVERLOADED_OPERATOR_IS (FNDECL, CALL_EXPR) \
&& LAMBDA_TYPE_P (CP_DECL_CONTEXT (FNDECL)))
enum cp_lambda_default_capture_mode_type {
@@ -1662,12 +1664,22 @@ struct cxx_int_tree_map_hasher : ggc_ptr_hash<cxx_int_tree_map>
static bool equal (cxx_int_tree_map *, cxx_int_tree_map *);
};
-struct named_label_entry;
+struct named_label_entry; /* Defined in decl.c. */
-struct named_label_hasher : ggc_ptr_hash<named_label_entry>
+struct named_label_hash : ggc_remove <named_label_entry *>
{
- static hashval_t hash (named_label_entry *);
- static bool equal (named_label_entry *, named_label_entry *);
+ typedef named_label_entry *value_type;
+ typedef tree compare_type; /* An identifier. */
+
+ inline static hashval_t hash (value_type);
+ inline static bool equal (const value_type, compare_type);
+
+ inline static void mark_empty (value_type &p) {p = NULL;}
+ inline static bool is_empty (value_type p) {return !p;}
+
+ /* Nothing is deletable. Everything is insertable. */
+ inline static bool is_deleted (value_type) { return false; }
+ inline static void mark_deleted (value_type) { gcc_unreachable (); }
};
/* Global state pertinent to the current function. */
@@ -1696,7 +1708,8 @@ struct GTY(()) language_function {
BOOL_BITFIELD invalid_constexpr : 1;
- hash_table<named_label_hasher> *x_named_labels;
+ hash_table<named_label_hash> *x_named_labels;
+
cp_binding_level *bindings;
vec<tree, va_gc> *x_local_names;
/* Tracking possibly infinite loops. This is a vec<tree> only because
@@ -2475,26 +2488,24 @@ struct GTY(()) lang_decl_min {
struct GTY(()) lang_decl_fn {
struct lang_decl_min min;
- /* In an overloaded operator, this is the value of
- DECL_OVERLOADED_OPERATOR_P.
- FIXME: We should really do better in compressing this. */
- ENUM_BITFIELD (tree_code) operator_code : 16;
-
+ /* In an overloaded operator, this is the compressed operator code. */
+ unsigned ovl_op_code : 6;
unsigned global_ctor_p : 1;
unsigned global_dtor_p : 1;
+
unsigned static_function : 1;
unsigned pure_virtual : 1;
unsigned defaulted_p : 1;
unsigned has_in_charge_parm_p : 1;
unsigned has_vtt_parm_p : 1;
unsigned pending_inline_p : 1;
-
unsigned nonconverting : 1;
unsigned thunk_p : 1;
+
unsigned this_thunk_p : 1;
unsigned hidden_friend_p : 1;
unsigned omp_declare_reduction_p : 1;
- /* 3 spare bits. */
+ unsigned spare : 13;
/* 32-bits padding on 64-bit host. */
@@ -2801,23 +2812,24 @@ struct GTY(()) lang_decl {
#define SET_VAR_HAD_UNKNOWN_BOUND(NODE) \
(DECL_LANG_SPECIFIC (VAR_DECL_CHECK (NODE))->u.base.unknown_bound_p = true)
-/* Set the overloaded operator code for NODE to CODE. */
-#define SET_OVERLOADED_OPERATOR_CODE(NODE, CODE) \
- (LANG_DECL_FN_CHECK (NODE)->operator_code = (CODE))
-
-/* If NODE is an overloaded operator, then this returns the TREE_CODE
- associated with the overloaded operator. If NODE is not an
- overloaded operator, ERROR_MARK is returned. Since the numerical
- value of ERROR_MARK is zero, this macro can be used as a predicate
- to test whether or not NODE is an overloaded operator. */
+/* True iff decl NODE is for an overloaded operator. */
#define DECL_OVERLOADED_OPERATOR_P(NODE) \
- (IDENTIFIER_ANY_OP_P (DECL_NAME (NODE)) \
- ? LANG_DECL_FN_CHECK (NODE)->operator_code : ERROR_MARK)
+ IDENTIFIER_ANY_OP_P (DECL_NAME (NODE))
/* Nonzero if NODE is an assignment operator (including += and such). */
-#define DECL_ASSIGNMENT_OPERATOR_P(NODE) \
+#define DECL_ASSIGNMENT_OPERATOR_P(NODE) \
IDENTIFIER_ASSIGN_OP_P (DECL_NAME (NODE))
+/* NODE is a function_decl for an overloaded operator. Return its
+ compressed (raw) operator code. Note that this is not a TREE_CODE. */
+#define DECL_OVERLOADED_OPERATOR_CODE_RAW(NODE) \
+ (LANG_DECL_FN_CHECK (NODE)->ovl_op_code)
+
+/* DECL is an overloaded operator. Test whether it is for TREE_CODE
+ (a literal constant). */
+#define DECL_OVERLOADED_OPERATOR_IS(DECL, CODE) \
+ (DECL_OVERLOADED_OPERATOR_CODE_RAW (DECL) == OVL_OP_##CODE)
+
/* For FUNCTION_DECLs: nonzero means that this function is a
constructor or a destructor with an extra in-charge parameter to
control whether or not virtual bases are constructed. */
@@ -3455,7 +3467,7 @@ extern void decl_shadowed_for_var_insert (tree, tree);
/* Extracts the type or expression pattern from a TYPE_PACK_EXPANSION or
EXPR_PACK_EXPANSION. */
#define PACK_EXPANSION_PATTERN(NODE) \
- (TREE_CODE (NODE) == TYPE_PACK_EXPANSION? TREE_TYPE (NODE) \
+ (TREE_CODE (NODE) == TYPE_PACK_EXPANSION ? TREE_TYPE (NODE) \
: TREE_OPERAND (NODE, 0))
/* Sets the type or expression pattern for a TYPE_PACK_EXPANSION or
@@ -5474,23 +5486,63 @@ enum auto_deduction_context
extern void init_reswords (void);
-typedef struct GTY(()) operator_name_info_t {
+/* Various flags for the overloaded operator information. */
+enum ovl_op_flags
+ {
+ OVL_OP_FLAG_NONE = 0, /* Don't care. */
+ OVL_OP_FLAG_UNARY = 1, /* Is unary. */
+ OVL_OP_FLAG_BINARY = 2, /* Is binary. */
+ OVL_OP_FLAG_AMBIARY = 3, /* May be unary or binary. */
+ OVL_OP_FLAG_ALLOC = 4, /* operator new or delete. */
+ OVL_OP_FLAG_DELETE = 1, /* operator delete. */
+ OVL_OP_FLAG_VEC = 2 /* vector new or delete. */
+ };
+
+/* Compressed operator codes. Order is determined by operators.def
+ and does not match that of tree_codes. */
+enum ovl_op_code
+ {
+ OVL_OP_ERROR_MARK,
+ OVL_OP_NOP_EXPR,
+#define DEF_OPERATOR(NAME, CODE, MANGLING, FLAGS) OVL_OP_##CODE,
+#define DEF_ASSN_OPERATOR(NAME, CODE, MANGLING) /* NOTHING */
+#include "operators.def"
+ OVL_OP_MAX
+ };
+
+struct GTY(()) ovl_op_info_t {
/* The IDENTIFIER_NODE for the operator. */
tree identifier;
/* The name of the operator. */
const char *name;
/* The mangled name of the operator. */
const char *mangled_name;
- /* The arity of the operator. */
- int arity;
-} operator_name_info_t;
+ /* The (regular) tree code. */
+ enum tree_code tree_code : 16;
+ /* The (compressed) operator code. */
+ enum ovl_op_code ovl_op_code : 8;
+ /* The ovl_op_flags of the operator. */
+ unsigned flags : 8;
+};
-/* A mapping from tree codes to operator name information. */
-extern GTY(()) operator_name_info_t operator_name_info
- [(int) MAX_TREE_CODES];
-/* Similar, but for assignment operators. */
-extern GTY(()) operator_name_info_t assignment_operator_name_info
- [(int) MAX_TREE_CODES];
+/* Overloaded operator info indexed by ass_op_p & ovl_op_code. */
+extern GTY(()) ovl_op_info_t ovl_op_info[2][OVL_OP_MAX];
+/* Mapping from tree_codes to ovl_op_codes. */
+extern GTY(()) unsigned char ovl_op_mapping[MAX_TREE_CODES];
+/* Mapping for ambi-ary operators from the binary to the unary. */
+extern GTY(()) unsigned char ovl_op_alternate[OVL_OP_MAX];
+
+/* Given an ass_op_p boolean and a tree code, return a pointer to its
+ overloaded operator info. Tree codes for non-overloaded operators
+ map to the error-operator. */
+#define OVL_OP_INFO(IS_ASS_P, TREE_CODE) \
+ (&ovl_op_info[(IS_ASS_P) != 0][ovl_op_mapping[(TREE_CODE)]])
+/* Overloaded operator info for an identifier for which
+ IDENTIFIER_OVL_OP_P is true. */
+#define IDENTIFIER_OVL_OP_INFO(NODE) \
+ (&ovl_op_info[IDENTIFIER_KIND_BIT_0 (NODE)][IDENTIFIER_CP_INDEX (NODE)])
+#define IDENTIFIER_OVL_OP_FLAGS(NODE) \
+ (IDENTIFIER_OVL_OP_INFO (NODE)->flags)
/* A type-qualifier, or bitmask therefore, using the TYPE_QUAL
constants. */
@@ -6356,6 +6408,7 @@ extern bool parsing_nsdmi (void);
extern bool parsing_default_capturing_generic_lambda_in_template (void);
extern void inject_this_parameter (tree, cp_cv_quals);
extern location_t defarg_location (tree);
+extern void maybe_show_extern_c_location (void);
/* in pt.c */
extern bool check_template_shadow (tree);
@@ -6429,7 +6482,7 @@ extern bool uses_parameter_packs (tree);
extern bool template_parameter_pack_p (const_tree);
extern bool function_parameter_pack_p (const_tree);
extern bool function_parameter_expanded_from_pack_p (tree, tree);
-extern tree make_pack_expansion (tree);
+extern tree make_pack_expansion (tree, tsubst_flags_t = tf_warning_or_error);
extern bool check_for_bare_parameter_packs (tree);
extern tree build_template_info (tree, tree);
extern tree get_template_info (const_tree);
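The cp-tree.h hunks above replace the two operator_name_info arrays with a single
two-dimensional ovl_op_info table plus the ovl_op_mapping/ovl_op_alternate side
tables. A minimal sketch of how the new accessors fit together inside the C++
front end (a fragment only, assuming the table layout set up by operators.def;
not buildable outside the GCC tree):

  /* Map a tree code to its overload identifier, for both halves of the
     table: plain operators and their compound-assignment forms.  */
  tree plus_id   = ovl_op_identifier (/*ISASS=*/false, PLUS_EXPR); /* operator+  */
  tree pluseq_id = ovl_op_identifier (/*ISASS=*/true,  PLUS_EXPR); /* operator+= */

  /* The identifier-kind bits distinguish the two without a table lookup.  */
  gcc_checking_assert (IDENTIFIER_OVL_OP_P (plus_id)
                       && IDENTIFIER_ASSIGN_OP_P (pluseq_id));

  /* The per-identifier info hands back the tree code and the flags.  */
  const ovl_op_info_t *info = IDENTIFIER_OVL_OP_INFO (plus_id);
  gcc_checking_assert (info->tree_code == PLUS_EXPR
                       && (info->flags & OVL_OP_FLAG_BINARY));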
diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
index c0d0a600562..9ce094eb2a5 100644
--- a/gcc/cp/cvt.c
+++ b/gcc/cp/cvt.c
@@ -1834,12 +1834,27 @@ type_promotes_to (tree type)
|| type == char32_type_node
|| type == wchar_type_node)
{
+ tree prom = type;
+
+ if (TREE_CODE (type) == ENUMERAL_TYPE)
+ {
+ prom = ENUM_UNDERLYING_TYPE (prom);
+ if (!ENUM_IS_SCOPED (type)
+ && ENUM_FIXED_UNDERLYING_TYPE_P (type))
+ {
+ /* ISO C++17, 7.6/4. A prvalue of an unscoped enumeration type
+ whose underlying type is fixed (10.2) can be converted to a
+ prvalue of its underlying type. Moreover, if integral promotion
+ can be applied to its underlying type, a prvalue of an unscoped
+ enumeration type whose underlying type is fixed can also be
+ converted to a prvalue of the promoted underlying type. */
+ return type_promotes_to (prom);
+ }
+ }
+
int precision = MAX (TYPE_PRECISION (type),
TYPE_PRECISION (integer_type_node));
tree totype = c_common_type_for_size (precision, 0);
- tree prom = type;
- if (TREE_CODE (prom) == ENUMERAL_TYPE)
- prom = ENUM_UNDERLYING_TYPE (prom);
if (TYPE_UNSIGNED (prom)
&& ! int_fits_type_p (TYPE_MAX_VALUE (prom), totype))
prom = c_common_type_for_size (precision, 1);
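The type_promotes_to change above makes unscoped enumerations with a fixed
underlying type promote through type_promotes_to of that underlying type, as the
quoted [expr.prom] wording requires, rather than through a type chosen only by
precision and signedness. A small self-contained illustration of the rule
(standard C++, not GCC-internal):

  #include <type_traits>

  enum E : short { e };   // unscoped enumeration, fixed underlying type 'short'

  // Unary + applies the integral promotions, so the operand promotes to the
  // promoted underlying type: short -> int.
  static_assert (std::is_same<decltype (+e), int>::value, "promotes to int");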
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 5d3f39e1f59..49b871564d6 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -65,8 +65,6 @@ static const char *redeclaration_error_message (tree, tree);
static int decl_jump_unsafe (tree);
static void require_complete_types_for_parms (tree);
-static bool ambi_op_p (enum tree_code);
-static bool unary_op_p (enum tree_code);
static void push_local_name (tree);
static tree grok_reference_init (tree, tree, tree, int);
static tree grokvardecl (tree, tree, tree, const cp_decl_specifier_seq *,
@@ -189,27 +187,33 @@ struct GTY((chain_next ("%h.next"))) named_label_use_entry {
function, and so we can check the validity of jumps to these labels. */
struct GTY((for_user)) named_label_entry {
- /* The decl itself. */
- tree label_decl;
+
+ tree name; /* Name of decl. */
+
+ tree label_decl; /* LABEL_DECL, unless deleted local label. */
+
+ named_label_entry *outer; /* Outer shadowed chain. */
/* The binding level to which the label is *currently* attached.
This is initially set to the binding level in which the label
is defined, but is modified as scopes are closed. */
cp_binding_level *binding_level;
+
/* The head of the names list that was current when the label was
defined, or the inner scope popped. These are the decls that will
be skipped when jumping to the label. */
tree names_in_scope;
+
/* A vector of all decls from all binding levels that would be
crossed by a backward branch to the label. */
vec<tree, va_gc> *bad_decls;
/* A list of uses of the label, before the label is defined. */
- struct named_label_use_entry *uses;
+ named_label_use_entry *uses;
/* The following bits are set after the label is defined, and are
- updated as scopes are popped. They indicate that a backward jump
- to the label will illegally enter a scope of the given flavor. */
+ updated as scopes are popped. They indicate that a jump to the
+ label will illegally enter a scope of the given flavor. */
bool in_try_scope;
bool in_catch_scope;
bool in_omp_scope;
@@ -347,7 +351,7 @@ finish_scope (void)
in a valid manner, and issue any appropriate warnings or errors. */
static void
-pop_label (tree label, tree old_value)
+check_label_used (tree label)
{
if (!processing_template_decl)
{
@@ -364,18 +368,6 @@ pop_label (tree label, tree old_value)
else
warn_for_unused_label (label);
}
-
- SET_IDENTIFIER_LABEL_VALUE (DECL_NAME (label), old_value);
-}
-
-/* Push all named labels into a vector, so that we can sort it on DECL_UID
- to avoid code generation differences. */
-
-int
-note_label (named_label_entry **slot, vec<named_label_entry **> &labels)
-{
- labels.quick_push (slot);
- return 1;
}
/* Helper function to sort named label entries in a vector by DECL_UID. */
@@ -383,13 +375,11 @@ note_label (named_label_entry **slot, vec<named_label_entry **> &labels)
static int
sort_labels (const void *a, const void *b)
{
- named_label_entry **slot1 = *(named_label_entry **const *) a;
- named_label_entry **slot2 = *(named_label_entry **const *) b;
- if (DECL_UID ((*slot1)->label_decl) < DECL_UID ((*slot2)->label_decl))
- return -1;
- if (DECL_UID ((*slot1)->label_decl) > DECL_UID ((*slot2)->label_decl))
- return 1;
- return 0;
+ tree label1 = *(tree const *) a;
+ tree label2 = *(tree const *) b;
+
+ /* DECL_UIDs can never be equal. */
+ return DECL_UID (label1) > DECL_UID (label2) ? -1 : +1;
}
/* At the end of a function, all labels declared within the function
@@ -399,46 +389,58 @@ sort_labels (const void *a, const void *b)
static void
pop_labels (tree block)
{
- if (named_labels)
+ if (!named_labels)
+ return;
+
+ /* We need to add the labels to the block chain, so debug
+ information is emitted. But, we want the order to be stable so
+ need to sort them first. Otherwise the debug output could be
+ randomly ordered. I guess it's mostly stable, unless the hash
+ table implementation changes. */
+ auto_vec<tree, 32> labels (named_labels->elements ());
+ hash_table<named_label_hash>::iterator end (named_labels->end ());
+ for (hash_table<named_label_hash>::iterator iter
+ (named_labels->begin ()); iter != end; ++iter)
{
- auto_vec<named_label_entry **, 32> labels;
- named_label_entry **slot;
- unsigned int i;
+ named_label_entry *ent = *iter;
- /* Push all the labels into a vector and sort them by DECL_UID,
- so that gaps between DECL_UIDs don't affect code generation. */
- labels.reserve_exact (named_labels->elements ());
- named_labels->traverse<vec<named_label_entry **> &, note_label> (labels);
- labels.qsort (sort_labels);
- FOR_EACH_VEC_ELT (labels, i, slot)
- {
- struct named_label_entry *ent = *slot;
+ gcc_checking_assert (!ent->outer);
+ if (ent->label_decl)
+ labels.quick_push (ent->label_decl);
+ ggc_free (ent);
+ }
+ named_labels = NULL;
+ labels.qsort (sort_labels);
- pop_label (ent->label_decl, NULL_TREE);
+ while (labels.length ())
+ {
+ tree label = labels.pop ();
- /* Put the labels into the "variables" of the top-level block,
- so debugger can see them. */
- DECL_CHAIN (ent->label_decl) = BLOCK_VARS (block);
- BLOCK_VARS (block) = ent->label_decl;
+ DECL_CHAIN (label) = BLOCK_VARS (block);
+ BLOCK_VARS (block) = label;
- named_labels->clear_slot (slot);
- }
- named_labels = NULL;
+ check_label_used (label);
}
}
/* At the end of a block with local labels, restore the outer definition. */
static void
-pop_local_label (tree label, tree old_value)
+pop_local_label (tree id, tree label)
{
- struct named_label_entry dummy;
-
- pop_label (label, old_value);
+ check_label_used (label);
+ named_label_entry **slot = named_labels->find_slot_with_hash
+ (id, IDENTIFIER_HASH_VALUE (id), NO_INSERT);
+ named_label_entry *ent = *slot;
- dummy.label_decl = label;
- named_label_entry **slot = named_labels->find_slot (&dummy, NO_INSERT);
- named_labels->clear_slot (slot);
+ if (ent->outer)
+ ent = ent->outer;
+ else
+ {
+ ent = ggc_cleared_alloc<named_label_entry> ();
+ ent->name = id;
+ }
+ *slot = ent;
}
/* The following two routines are used to interface to Objective-C++.
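The label bookkeeping above drops IDENTIFIER_LABEL_VALUE in favour of a
per-function hash table keyed by the identifier, with an `outer' chain so that a
GNU local label can shadow an enclosing label of the same name and be restored
when its scope closes. At the source level that is the situation
pop_local_label now unwinds, for example (GNU extension, illustrative only):

  void f (int n)
  {
   out:
    {
      __label__ out;        /* local label shadows the function-scope one */
      if (n > 2)
        goto out;           /* targets the local label below */
     out:
      n -= 2;
    }
    if (n > 0)
      goto out;             /* targets the function-scope label again */
  }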
@@ -579,7 +581,6 @@ poplevel (int keep, int reverse, int functionbody)
int leaving_for_scope;
scope_kind kind;
unsigned ix;
- cp_label_binding *label_bind;
bool subtime = timevar_cond_start (TV_NAME_LOOKUP);
restart:
@@ -613,11 +614,12 @@ poplevel (int keep, int reverse, int functionbody)
Usually current_binding_level->names is in reverse order.
But parameter decls were previously put in forward order. */
+ decls = current_binding_level->names;
if (reverse)
- current_binding_level->names
- = decls = nreverse (current_binding_level->names);
- else
- decls = current_binding_level->names;
+ {
+ decls = nreverse (decls);
+ current_binding_level->names = decls;
+ }
/* If there were any declarations or structure tags in that level,
or if this level is a function body,
@@ -770,7 +772,10 @@ poplevel (int keep, int reverse, int functionbody)
}
}
/* Remove the binding. */
- pop_local_binding (name, decl);
+ if (TREE_CODE (decl) == LABEL_DECL)
+ pop_local_label (name, decl);
+ else
+ pop_local_binding (name, decl);
}
/* Remove declarations for any `for' variables from inner scopes
@@ -784,11 +789,6 @@ poplevel (int keep, int reverse, int functionbody)
link; link = TREE_CHAIN (link))
SET_IDENTIFIER_TYPE_VALUE (TREE_PURPOSE (link), TREE_VALUE (link));
- /* Restore the IDENTIFIER_LABEL_VALUEs for local labels. */
- FOR_EACH_VEC_SAFE_ELT_REVERSE (current_binding_level->shadowed_labels,
- ix, label_bind)
- pop_local_label (label_bind->label, label_bind->prev_value);
-
/* There may be OVERLOADs (wrapped in TREE_LISTs) on the BLOCK_VARs
list if a `using' declaration put them there. The debugging
back ends won't understand OVERLOAD, so we remove them here.
@@ -1431,7 +1431,15 @@ duplicate_decls (tree newdecl, tree olddecl, bool newdecl_is_friend)
/* Avoid warnings redeclaring built-ins which have not been
explicitly declared. */
if (DECL_ANTICIPATED (olddecl))
- return NULL_TREE;
+ {
+ if (TREE_PUBLIC (newdecl)
+ && CP_DECL_CONTEXT (newdecl) == global_namespace)
+ warning_at (DECL_SOURCE_LOCATION (newdecl),
+ OPT_Wbuiltin_declaration_mismatch,
+ "built-in function %qD declared as non-function",
+ newdecl);
+ return NULL_TREE;
+ }
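The added check gives -Wbuiltin-declaration-mismatch something to say when a
name GCC knows as a built-in function is redeclared as an object at global
scope. Presumably a translation unit along these lines now gets the new warning
(hypothetical example, not taken from the patch's testsuite):

  // g++ -Wbuiltin-declaration-mismatch
  int abs;   // warning: built-in function 'abs' declared as non-function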
/* If you declare a built-in or predefined function name as static,
the old definition is overridden, but optionally warn this was a
@@ -1522,7 +1530,7 @@ next_arg:;
warning_at (DECL_SOURCE_LOCATION (newdecl),
OPT_Wbuiltin_declaration_mismatch,
- "declaration of %q+#D conflicts with built-in "
+ "declaration of %q#D conflicts with built-in "
"declaration %q#D", newdecl, olddecl);
}
else if ((DECL_EXTERN_C_P (newdecl)
@@ -1911,9 +1919,9 @@ next_arg:;
DECL_FINAL_P (newdecl) |= DECL_FINAL_P (olddecl);
DECL_OVERRIDE_P (newdecl) |= DECL_OVERRIDE_P (olddecl);
DECL_THIS_STATIC (newdecl) |= DECL_THIS_STATIC (olddecl);
- if (DECL_OVERLOADED_OPERATOR_P (olddecl) != ERROR_MARK)
- SET_OVERLOADED_OPERATOR_CODE
- (newdecl, DECL_OVERLOADED_OPERATOR_P (olddecl));
+ if (DECL_OVERLOADED_OPERATOR_P (olddecl))
+ DECL_OVERLOADED_OPERATOR_CODE_RAW (newdecl)
+ = DECL_OVERLOADED_OPERATOR_CODE_RAW (olddecl);
new_defines_function = DECL_INITIAL (newdecl) != NULL_TREE;
/* Optionally warn about more than one declaration for the same
@@ -2470,6 +2478,8 @@ next_arg:;
break;
}
}
+
+ copy_attributes_to_builtin (newdecl);
}
if (new_defines_function)
/* If defining a function declared with other language
@@ -2939,81 +2949,83 @@ redeclaration_error_message (tree newdecl, tree olddecl)
}
}
+
/* Hash and equality functions for the named_label table. */
hashval_t
-named_label_hasher::hash (named_label_entry *ent)
+named_label_hash::hash (const value_type entry)
{
- return DECL_UID (ent->label_decl);
+ return IDENTIFIER_HASH_VALUE (entry->name);
}
bool
-named_label_hasher::equal (named_label_entry *a, named_label_entry *b)
+named_label_hash::equal (const value_type entry, compare_type name)
{
- return a->label_decl == b->label_decl;
+ return name == entry->name;
}
-/* Create a new label, named ID. */
+/* Look for a label named ID in the current function. If one cannot
+ be found, create one. Return the named_label_entry, or NULL on
+ failure. */
-static tree
-make_label_decl (tree id, int local_p)
+static named_label_entry *
+lookup_label_1 (tree id, bool making_local_p)
{
- struct named_label_entry *ent;
- tree decl;
-
- decl = build_decl (input_location, LABEL_DECL, id, void_type_node);
-
- DECL_CONTEXT (decl) = current_function_decl;
- SET_DECL_MODE (decl, VOIDmode);
- C_DECLARED_LABEL_FLAG (decl) = local_p;
-
- /* Say where one reference is to the label, for the sake of the
- error if it is not defined. */
- DECL_SOURCE_LOCATION (decl) = input_location;
-
- /* Record the fact that this identifier is bound to this label. */
- SET_IDENTIFIER_LABEL_VALUE (id, decl);
+ /* You can't use labels at global scope. */
+ if (current_function_decl == NULL_TREE)
+ {
+ error ("label %qE referenced outside of any function", id);
+ return NULL;
+ }
- /* Create the label htab for the function on demand. */
if (!named_labels)
- named_labels = hash_table<named_label_hasher>::create_ggc (13);
+ named_labels = hash_table<named_label_hash>::create_ggc (13);
- /* Record this label on the list of labels used in this function.
- We do this before calling make_label_decl so that we get the
- IDENTIFIER_LABEL_VALUE before the new label is declared. */
- ent = ggc_cleared_alloc<named_label_entry> ();
- ent->label_decl = decl;
-
- named_label_entry **slot = named_labels->find_slot (ent, INSERT);
- gcc_assert (*slot == NULL);
- *slot = ent;
+ hashval_t hash = IDENTIFIER_HASH_VALUE (id);
+ named_label_entry **slot
+ = named_labels->find_slot_with_hash (id, hash, INSERT);
+ named_label_entry *old = *slot;
+
+ if (old && old->label_decl)
+ {
+ if (!making_local_p)
+ return old;
- return decl;
-}
+ if (old->binding_level == current_binding_level)
+ {
+ error ("local label %qE conflicts with existing label", id);
+ inform (DECL_SOURCE_LOCATION (old->label_decl), "previous label");
+ return NULL;
+ }
+ }
-/* Look for a label named ID in the current function. If one cannot
- be found, create one. (We keep track of used, but undefined,
- labels, and complain about them at the end of a function.) */
+ /* We are making a new decl; create or reuse the named_label_entry. */
+ named_label_entry *ent = NULL;
+ if (old && !old->label_decl)
+ ent = old;
+ else
+ {
+ ent = ggc_cleared_alloc<named_label_entry> ();
+ ent->name = id;
+ ent->outer = old;
+ *slot = ent;
+ }
-static tree
-lookup_label_1 (tree id)
-{
- tree decl;
+ /* Now create the LABEL_DECL. */
+ tree decl = build_decl (input_location, LABEL_DECL, id, void_type_node);
- /* You can't use labels at global scope. */
- if (current_function_decl == NULL_TREE)
+ DECL_CONTEXT (decl) = current_function_decl;
+ SET_DECL_MODE (decl, VOIDmode);
+ if (making_local_p)
{
- error ("label %qE referenced outside of any function", id);
- return NULL_TREE;
+ C_DECLARED_LABEL_FLAG (decl) = true;
+ DECL_CHAIN (decl) = current_binding_level->names;
+ current_binding_level->names = decl;
}
- /* See if we've already got this label. */
- decl = IDENTIFIER_LABEL_VALUE (id);
- if (decl != NULL_TREE && DECL_CONTEXT (decl) == current_function_decl)
- return decl;
+ ent->label_decl = decl;
- decl = make_label_decl (id, /*local_p=*/0);
- return decl;
+ return ent;
}
/* Wrapper for lookup_label_1. */
@@ -3021,30 +3033,19 @@ lookup_label_1 (tree id)
tree
lookup_label (tree id)
{
- tree ret;
bool subtime = timevar_cond_start (TV_NAME_LOOKUP);
- ret = lookup_label_1 (id);
+ named_label_entry *ent = lookup_label_1 (id, false);
timevar_cond_stop (TV_NAME_LOOKUP, subtime);
- return ret;
+ return ent ? ent->label_decl : NULL_TREE;
}
-/* Declare a local label named ID. */
-
tree
declare_local_label (tree id)
{
- tree decl;
- cp_label_binding bind;
-
- /* Add a new entry to the SHADOWED_LABELS list so that when we leave
- this scope we can restore the old value of IDENTIFIER_TYPE_VALUE. */
- bind.prev_value = IDENTIFIER_LABEL_VALUE (id);
-
- decl = make_label_decl (id, /*local_p=*/1);
- bind.label = decl;
- vec_safe_push (current_binding_level->shadowed_labels, bind);
-
- return decl;
+ bool subtime = timevar_cond_start (TV_NAME_LOOKUP);
+ named_label_entry *ent = lookup_label_1 (id, true);
+ timevar_cond_stop (TV_NAME_LOOKUP, subtime);
+ return ent ? ent->label_decl : NULL_TREE;
}
/* Returns nonzero if it is ill-formed to jump past the declaration of
@@ -3083,8 +3084,9 @@ identify_goto (tree decl, location_t loc, const location_t *locus,
diagnostic_t diag_kind)
{
bool complained
- = (decl ? emit_diagnostic (diag_kind, loc, 0, "jump to label %qD", decl)
- : emit_diagnostic (diag_kind, loc, 0, "jump to case label"));
+ = emit_diagnostic (diag_kind, loc, 0,
+ decl ? N_("jump to label %qD")
+ : N_("jump to case label"), decl);
if (complained && locus)
inform (*locus, " from here");
return complained;
@@ -3139,68 +3141,62 @@ check_previous_goto_1 (tree decl, cp_binding_level* level, tree names,
" crosses initialization of %q#D", new_decls);
else
inform (DECL_SOURCE_LOCATION (new_decls),
- " enters scope of %q#D which has "
+ " enters scope of %q#D, which has "
"non-trivial destructor", new_decls);
}
}
if (b == level)
break;
- if ((b->kind == sk_try || b->kind == sk_catch) && !saw_eh)
+
+ const char *inf = NULL;
+ location_t loc = input_location;
+ switch (b->kind)
{
- if (identified < 2)
- {
- complained = identify_goto (decl, input_location, locus,
- DK_ERROR);
- identified = 2;
- }
- if (complained)
- {
- if (b->kind == sk_try)
- inform (input_location, " enters try block");
- else
- inform (input_location, " enters catch block");
- }
+ case sk_try:
+ if (!saw_eh)
+ inf = N_("enters try block");
saw_eh = true;
- }
- if (b->kind == sk_omp && !saw_omp)
- {
- if (identified < 2)
- {
- complained = identify_goto (decl, input_location, locus,
- DK_ERROR);
- identified = 2;
- }
- if (complained)
- inform (input_location, " enters OpenMP structured block");
+ break;
+
+ case sk_catch:
+ if (!saw_eh)
+ inf = N_("enters catch block");
+ saw_eh = true;
+ break;
+
+ case sk_omp:
+ if (!saw_omp)
+ inf = N_("enters OpenMP structured block");
saw_omp = true;
- }
- if (b->kind == sk_transaction && !saw_tm)
- {
- if (identified < 2)
+ break;
+
+ case sk_transaction:
+ if (!saw_tm)
+ inf = N_("enters synchronized or atomic statement");
+ saw_tm = true;
+ break;
+
+ case sk_block:
+ if (!saw_cxif && level_for_constexpr_if (b->level_chain))
{
- complained = identify_goto (decl, input_location, locus,
- DK_ERROR);
- identified = 2;
+ inf = N_("enters constexpr if statement");
+ loc = EXPR_LOCATION (b->level_chain->this_entity);
+ saw_cxif = true;
}
- if (complained)
- inform (input_location,
- " enters synchronized or atomic statement");
- saw_tm = true;
+ break;
+
+ default:
+ break;
}
- if (!saw_cxif && b->kind == sk_block
- && level_for_constexpr_if (b->level_chain))
+
+ if (inf)
{
if (identified < 2)
- {
- complained = identify_goto (decl, input_location, locus,
- DK_ERROR);
- identified = 2;
- }
+ complained = identify_goto (decl, input_location, locus, DK_ERROR);
+ identified = 2;
if (complained)
- inform (EXPR_LOCATION (b->level_chain->this_entity),
- " enters constexpr if statement");
- saw_cxif = true;
+ inform (loc, " %s", inf);
}
}
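The rewritten loop folds the special scope kinds into one switch that feeds
identify_goto; the diagnostic texts themselves are unchanged. For reference,
this is the kind of jump it rejects (plain ISO C++, illustrative only):

  void g (bool b)
  {
    if (b)
      goto in;          // error: jump to label 'in'
    try                 //   note: enters try block
      {
       in: ;
      }
    catch (...) { }
  }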
@@ -3227,12 +3223,6 @@ check_switch_goto (cp_binding_level* level)
void
check_goto (tree decl)
{
- struct named_label_entry *ent, dummy;
- bool saw_catch = false, complained = false;
- int identified = 0;
- tree bad;
- unsigned ix;
-
/* We can't know where a computed goto is jumping.
So we assume that it's OK. */
if (TREE_CODE (decl) != LABEL_DECL)
@@ -3243,22 +3233,22 @@ check_goto (tree decl)
if (decl == cdtor_label)
return;
- dummy.label_decl = decl;
- ent = named_labels->find (&dummy);
- gcc_assert (ent != NULL);
+ hashval_t hash = IDENTIFIER_HASH_VALUE (DECL_NAME (decl));
+ named_label_entry **slot
+ = named_labels->find_slot_with_hash (DECL_NAME (decl), hash, NO_INSERT);
+ named_label_entry *ent = *slot;
/* If the label hasn't been defined yet, defer checking. */
if (! DECL_INITIAL (decl))
{
- struct named_label_use_entry *new_use;
-
/* Don't bother creating another use if the last goto had the
same data, and will therefore create the same set of errors. */
if (ent->uses
&& ent->uses->names_in_scope == current_binding_level->names)
return;
- new_use = ggc_alloc<named_label_use_entry> ();
+ named_label_use_entry *new_use
+ = ggc_alloc<named_label_use_entry> ();
new_use->binding_level = current_binding_level;
new_use->names_in_scope = current_binding_level->names;
new_use->o_goto_locus = input_location;
@@ -3269,6 +3259,11 @@ check_goto (tree decl)
return;
}
+ bool saw_catch = false, complained = false;
+ int identified = 0;
+ tree bad;
+ unsigned ix;
+
if (ent->in_try_scope || ent->in_catch_scope || ent->in_transaction_scope
|| ent->in_constexpr_if
|| ent->in_omp_scope || !vec_safe_is_empty (ent->bad_decls))
@@ -3329,27 +3324,24 @@ check_goto (tree decl)
inform (input_location, " enters OpenMP structured block");
}
else if (flag_openmp)
- {
- cp_binding_level *b;
- for (b = current_binding_level; b ; b = b->level_chain)
- {
- if (b == ent->binding_level)
+ for (cp_binding_level *b = current_binding_level; b ; b = b->level_chain)
+ {
+ if (b == ent->binding_level)
+ break;
+ if (b->kind == sk_omp)
+ {
+ if (identified < 2)
+ {
+ complained = identify_goto (decl,
+ DECL_SOURCE_LOCATION (decl),
+ &input_location, DK_ERROR);
+ identified = 2;
+ }
+ if (complained)
+ inform (input_location, " exits OpenMP structured block");
break;
- if (b->kind == sk_omp)
- {
- if (identified < 2)
- {
- complained = identify_goto (decl,
- DECL_SOURCE_LOCATION (decl),
- &input_location, DK_ERROR);
- identified = 2;
- }
- if (complained)
- inform (input_location, " exits OpenMP structured block");
- break;
- }
- }
- }
+ }
+ }
}
/* Check that a return is ok wrt OpenMP structured blocks.
@@ -3358,8 +3350,7 @@ check_goto (tree decl)
bool
check_omp_return (void)
{
- cp_binding_level *b;
- for (b = current_binding_level; b ; b = b->level_chain)
+ for (cp_binding_level *b = current_binding_level; b ; b = b->level_chain)
if (b->kind == sk_omp)
{
error ("invalid exit from OpenMP structured block");
@@ -3376,25 +3367,15 @@ check_omp_return (void)
static tree
define_label_1 (location_t location, tree name)
{
- struct named_label_entry *ent, dummy;
- cp_binding_level *p;
- tree decl;
-
- decl = lookup_label (name);
-
- dummy.label_decl = decl;
- ent = named_labels->find (&dummy);
- gcc_assert (ent != NULL);
-
/* After labels, make any new cleanups in the function go into their
own new (temporary) binding contour. */
- for (p = current_binding_level;
+ for (cp_binding_level *p = current_binding_level;
p->kind != sk_function_parms;
p = p->level_chain)
p->more_cleanups_ok = 0;
- if (name == get_identifier ("wchar_t"))
- permerror (input_location, "label named wchar_t");
+ named_label_entry *ent = lookup_label_1 (name, false);
+ tree decl = ent->label_decl;
if (DECL_INITIAL (decl) != NULL_TREE)
{
@@ -3403,8 +3384,6 @@ define_label_1 (location_t location, tree name)
}
else
{
- struct named_label_use_entry *use;
-
/* Mark label as having been defined. */
DECL_INITIAL (decl) = error_mark_node;
/* Say where in the source. */
@@ -3413,7 +3392,7 @@ define_label_1 (location_t location, tree name)
ent->binding_level = current_binding_level;
ent->names_in_scope = current_binding_level->names;
- for (use = ent->uses; use ; use = use->next)
+ for (named_label_use_entry *use = ent->uses; use; use = use->next)
check_previous_goto (decl, use);
ent->uses = NULL;
}
@@ -3426,9 +3405,8 @@ define_label_1 (location_t location, tree name)
tree
define_label (location_t location, tree name)
{
- tree ret;
bool running = timevar_cond_start (TV_NAME_LOOKUP);
- ret = define_label_1 (location, name);
+ tree ret = define_label_1 (location, name);
timevar_cond_stop (TV_NAME_LOOKUP, running);
return ret;
}
@@ -4379,7 +4357,6 @@ builtin_function_1 (tree decl, tree context, bool is_global)
retrofit_lang_decl (decl);
DECL_ARTIFICIAL (decl) = 1;
- SET_OVERLOADED_OPERATOR_CODE (decl, ERROR_MARK);
SET_DECL_LANGUAGE (decl, lang_c);
/* Runtime library routines are, by definition, available in an
external shared object. */
@@ -4467,7 +4444,8 @@ build_library_fn (tree name, enum tree_code operator_code, tree type,
DECL_EXTERNAL (fn) = 1;
TREE_PUBLIC (fn) = 1;
DECL_ARTIFICIAL (fn) = 1;
- SET_OVERLOADED_OPERATOR_CODE (fn, operator_code);
+ DECL_OVERLOADED_OPERATOR_CODE_RAW (fn)
+ = OVL_OP_INFO (false, operator_code)->ovl_op_code;
SET_DECL_LANGUAGE (fn, lang_c);
/* Runtime library routines are, by definition, available in an
external shared object. */
@@ -4532,9 +4510,8 @@ static tree
push_cp_library_fn (enum tree_code operator_code, tree type,
int ecf_flags)
{
- tree fn = build_cp_library_fn (cp_operator_id (operator_code),
- operator_code,
- type, ecf_flags);
+ tree fn = build_cp_library_fn (ovl_op_identifier (false, operator_code),
+ operator_code, type, ecf_flags);
pushdecl (fn);
if (flag_tm)
apply_tm_attr (fn, get_identifier ("transaction_safe"));
@@ -8722,7 +8699,7 @@ grokfndecl (tree ctype,
"deduction guide %qD must not have a function body", decl);
}
else if (IDENTIFIER_ANY_OP_P (DECL_NAME (decl))
- && !grok_op_properties (decl, /*complain=*/true))
+ && !grok_op_properties (decl, /*complain=*/true))
return NULL_TREE;
else if (UDLIT_OPER_P (DECL_NAME (decl)))
{
@@ -8733,6 +8710,7 @@ grokfndecl (tree ctype,
if (DECL_LANGUAGE (decl) == lang_c)
{
error ("literal operator with C linkage");
+ maybe_show_extern_c_location ();
return NULL_TREE;
}
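maybe_show_extern_c_location (declared in the cp-tree.h hunk above) lets this
error point back at the enclosing linkage specification. The rejected construct
looks like the following (illustrative only; the note text is assumed from the
helper's name):

  extern "C" {
    int operator "" _k (unsigned long long);   // error: literal operator with C linkage
  }                                            //   note: points at the extern "C" block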
@@ -9495,22 +9473,20 @@ compute_array_index_type (tree name, tree size, tsubst_flags_t complain)
itype = build_min (MINUS_EXPR, sizetype, size, integer_one_node);
else
{
- HOST_WIDE_INT saved_processing_template_decl;
-
/* Compute the index of the largest element in the array. It is
one less than the number of elements in the array. We save
and restore PROCESSING_TEMPLATE_DECL so that computations in
cp_build_binary_op will be appropriately folded. */
- saved_processing_template_decl = processing_template_decl;
- processing_template_decl = 0;
- itype = cp_build_binary_op (input_location,
- MINUS_EXPR,
- cp_convert (ssizetype, size, complain),
- cp_convert (ssizetype, integer_one_node,
- complain),
- complain);
- itype = maybe_constant_value (itype);
- processing_template_decl = saved_processing_template_decl;
+ {
+ processing_template_decl_sentinel s;
+ itype = cp_build_binary_op (input_location,
+ MINUS_EXPR,
+ cp_convert (ssizetype, size, complain),
+ cp_convert (ssizetype, integer_one_node,
+ complain),
+ complain);
+ itype = maybe_constant_value (itype);
+ }
if (!TREE_CONSTANT (itype))
{
@@ -10816,18 +10792,27 @@ grokdeclarator (const cp_declarator *declarator,
attr_flags);
}
+ inner_declarator = declarator->declarator;
+
/* We don't want to warn in parmeter context because we don't
yet know if the parse will succeed, and this might turn out
to be a constructor call. */
if (decl_context != PARM
- && declarator->parenthesized != UNKNOWN_LOCATION)
+ && declarator->parenthesized != UNKNOWN_LOCATION
+ /* If the type is class-like and the inner name used a
+ global namespace qualifier, we need the parens.
+ Unfortunately all we can tell is whether a qualified name
+ was used or not. */
+ && !(inner_declarator
+ && inner_declarator->kind == cdk_id
+ && inner_declarator->u.id.qualifying_scope
+ && (MAYBE_CLASS_TYPE_P (type)
+ || TREE_CODE (type) == ENUMERAL_TYPE)))
warning_at (declarator->parenthesized, OPT_Wparentheses,
"unnecessary parentheses in declaration of %qs", name);
if (declarator->kind == cdk_id || declarator->kind == cdk_decomp)
break;
- inner_declarator = declarator->declarator;
-
switch (declarator->kind)
{
case cdk_array:
@@ -12841,7 +12826,7 @@ grok_special_member_properties (tree decl)
&& !ctor && !move_fn_p (decl))
TYPE_HAS_CONSTEXPR_CTOR (class_type) = 1;
}
- else if (DECL_NAME (decl) == cp_assignment_operator_id (NOP_EXPR))
+ else if (DECL_NAME (decl) == assign_op_identifier)
{
/* [class.copy]
@@ -12901,114 +12886,72 @@ grok_ctor_properties (const_tree ctype, const_tree decl)
return true;
}
-/* An operator with this code is unary, but can also be binary. */
-
-static bool
-ambi_op_p (enum tree_code code)
-{
- return (code == INDIRECT_REF
- || code == ADDR_EXPR
- || code == UNARY_PLUS_EXPR
- || code == NEGATE_EXPR
- || code == PREINCREMENT_EXPR
- || code == PREDECREMENT_EXPR);
-}
-
-/* An operator with this name can only be unary. */
-
-static bool
-unary_op_p (enum tree_code code)
-{
- return (code == TRUTH_NOT_EXPR
- || code == BIT_NOT_EXPR
- || code == COMPONENT_REF
- || code == TYPE_EXPR);
-}
-
-/* DECL is a declaration for an overloaded operator. If COMPLAIN is true,
- errors are issued for invalid declarations. */
+/* DECL is a declaration for an overloaded or conversion operator. If
+ COMPLAIN is true, errors are issued for invalid declarations. */
bool
grok_op_properties (tree decl, bool complain)
{
tree argtypes = TYPE_ARG_TYPES (TREE_TYPE (decl));
- tree argtype;
- int methodp = (TREE_CODE (TREE_TYPE (decl)) == METHOD_TYPE);
+ bool methodp = TREE_CODE (TREE_TYPE (decl)) == METHOD_TYPE;
tree name = DECL_NAME (decl);
- enum tree_code operator_code;
- int arity;
- bool ellipsis_p;
- tree class_type;
- /* Count the number of arguments and check for ellipsis. */
- for (argtype = argtypes, arity = 0;
- argtype && argtype != void_list_node;
- argtype = TREE_CHAIN (argtype))
- ++arity;
- ellipsis_p = !argtype;
-
- class_type = DECL_CONTEXT (decl);
+ tree class_type = DECL_CONTEXT (decl);
if (class_type && !CLASS_TYPE_P (class_type))
class_type = NULL_TREE;
+ tree_code operator_code;
+ unsigned op_flags;
if (IDENTIFIER_CONV_OP_P (name))
- operator_code = TYPE_EXPR;
+ {
+ /* Conversion operators are TYPE_EXPR for the purposes of this
+ function. */
+ operator_code = TYPE_EXPR;
+ op_flags = OVL_OP_FLAG_UNARY;
+ }
else
{
- /* It'd be nice to hang something else of the identifier to
- find CODE more directly. */
- bool assign_op = IDENTIFIER_ASSIGN_OP_P (name);
- const operator_name_info_t *oni
- = (assign_op ? assignment_operator_name_info : operator_name_info);
+ const ovl_op_info_t *ovl_op = IDENTIFIER_OVL_OP_INFO (name);
- if (false)
- ;
-#define DEF_OPERATOR(NAME, CODE, MANGLING, ARITY, KIND) \
- else if (assign_op == (KIND == cik_assign_op) \
- && oni[int (CODE)].identifier == name) \
- operator_code = (CODE);
-#include "operators.def"
-#undef DEF_OPERATOR
- else
- gcc_unreachable ();
- }
- while (0);
- gcc_assert (operator_code != MAX_TREE_CODES);
- SET_OVERLOADED_OPERATOR_CODE (decl, operator_code);
+ operator_code = ovl_op->tree_code;
+ op_flags = ovl_op->flags;
+ gcc_checking_assert (operator_code != ERROR_MARK);
+ DECL_OVERLOADED_OPERATOR_CODE_RAW (decl) = ovl_op->ovl_op_code;
+ }
- if (class_type)
- switch (operator_code)
- {
- case NEW_EXPR:
- TYPE_HAS_NEW_OPERATOR (class_type) = 1;
- break;
+ if (op_flags & OVL_OP_FLAG_ALLOC)
+ {
+ /* operator new and operator delete are quite special. */
+ if (class_type)
+ switch (op_flags)
+ {
+ case OVL_OP_FLAG_ALLOC:
+ TYPE_HAS_NEW_OPERATOR (class_type) = 1;
+ break;
- case DELETE_EXPR:
- TYPE_GETS_DELETE (class_type) |= 1;
- break;
+ case OVL_OP_FLAG_ALLOC | OVL_OP_FLAG_DELETE:
+ TYPE_GETS_DELETE (class_type) |= 1;
+ break;
- case VEC_NEW_EXPR:
- TYPE_HAS_ARRAY_NEW_OPERATOR (class_type) = 1;
- break;
+ case OVL_OP_FLAG_ALLOC | OVL_OP_FLAG_VEC:
+ TYPE_HAS_ARRAY_NEW_OPERATOR (class_type) = 1;
+ break;
- case VEC_DELETE_EXPR:
- TYPE_GETS_DELETE (class_type) |= 2;
- break;
+ case OVL_OP_FLAG_ALLOC | OVL_OP_FLAG_DELETE | OVL_OP_FLAG_VEC:
+ TYPE_GETS_DELETE (class_type) |= 2;
+ break;
- default:
- break;
- }
+ default:
+ gcc_unreachable ();
+ }
- /* [basic.std.dynamic.allocation]/1:
+ /* [basic.std.dynamic.allocation]/1:
- A program is ill-formed if an allocation function is declared
- in a namespace scope other than global scope or declared static
- in global scope.
+ A program is ill-formed if an allocation function is declared
+ in a namespace scope other than global scope or declared
+ static in global scope.
- The same also holds true for deallocation functions. */
- if (operator_code == NEW_EXPR || operator_code == VEC_NEW_EXPR
- || operator_code == DELETE_EXPR || operator_code == VEC_DELETE_EXPR)
- {
+ The same also holds true for deallocation functions. */
if (DECL_NAMESPACE_SCOPE_P (decl))
{
if (CP_DECL_CONTEXT (decl) != global_namespace)
@@ -13016,287 +12959,269 @@ grok_op_properties (tree decl, bool complain)
error ("%qD may not be declared within a namespace", decl);
return false;
}
- else if (!TREE_PUBLIC (decl))
+
+ if (!TREE_PUBLIC (decl))
{
error ("%qD may not be declared as static", decl);
return false;
}
}
- }
- if (operator_code == NEW_EXPR || operator_code == VEC_NEW_EXPR)
- {
- TREE_TYPE (decl) = coerce_new_type (TREE_TYPE (decl));
- DECL_IS_OPERATOR_NEW (decl) = 1;
+ if (op_flags & OVL_OP_FLAG_DELETE)
+ TREE_TYPE (decl) = coerce_delete_type (TREE_TYPE (decl));
+ else
+ {
+ DECL_IS_OPERATOR_NEW (decl) = 1;
+ TREE_TYPE (decl) = coerce_new_type (TREE_TYPE (decl));
+ }
+
+ return true;
}
- else if (operator_code == DELETE_EXPR || operator_code == VEC_DELETE_EXPR)
- TREE_TYPE (decl) = coerce_delete_type (TREE_TYPE (decl));
- else
+
+ /* An operator function must either be a non-static member function
+ or have at least one parameter of a class, a reference to a class,
+ an enumeration, or a reference to an enumeration. 13.4.0.6 */
+ if (! methodp || DECL_STATIC_FUNCTION_P (decl))
{
- /* An operator function must either be a non-static member function
- or have at least one parameter of a class, a reference to a class,
- an enumeration, or a reference to an enumeration. 13.4.0.6 */
- if (! methodp || DECL_STATIC_FUNCTION_P (decl))
+ if (operator_code == TYPE_EXPR
+ || operator_code == CALL_EXPR
+ || operator_code == COMPONENT_REF
+ || operator_code == ARRAY_REF
+ || operator_code == NOP_EXPR)
+ {
+ error ("%qD must be a nonstatic member function", decl);
+ return false;
+ }
+
+ if (DECL_STATIC_FUNCTION_P (decl))
{
- if (operator_code == TYPE_EXPR
- || operator_code == CALL_EXPR
- || operator_code == COMPONENT_REF
- || operator_code == ARRAY_REF
- || operator_code == NOP_EXPR)
+ error ("%qD must be either a non-static member "
+ "function or a non-member function", decl);
+ return false;
+ }
+
+ for (tree arg = argtypes; ; arg = TREE_CHAIN (arg))
+ {
+ if (!arg || arg == void_list_node)
{
- error ("%qD must be a nonstatic member function", decl);
+ if (complain)
+ error ("%qD must have an argument of class or "
+ "enumerated type", decl);
return false;
}
- else
- {
- tree p;
+
+ tree type = non_reference (TREE_VALUE (arg));
+ if (type == error_mark_node)
+ return false;
+
+ /* MAYBE_CLASS_TYPE_P, rather than CLASS_TYPE_P, is used
+ because these checks are performed even on template
+ functions. */
+ if (MAYBE_CLASS_TYPE_P (type)
+ || TREE_CODE (type) == ENUMERAL_TYPE)
+ break;
+ }
+ }
- if (DECL_STATIC_FUNCTION_P (decl))
- {
- error ("%qD must be either a non-static member "
- "function or a non-member function", decl);
- return false;
- }
+ if (operator_code == CALL_EXPR)
+ /* There are no further restrictions on the arguments to an overloaded
+ "operator ()". */
+ return true;
- for (p = argtypes; p && p != void_list_node; p = TREE_CHAIN (p))
- {
- tree arg = non_reference (TREE_VALUE (p));
- if (arg == error_mark_node)
- return false;
-
- /* MAYBE_CLASS_TYPE_P, rather than CLASS_TYPE_P, is used
- because these checks are performed even on
- template functions. */
- if (MAYBE_CLASS_TYPE_P (arg)
- || TREE_CODE (arg) == ENUMERAL_TYPE)
- break;
- }
+ if (operator_code == COND_EXPR)
+ {
+ /* 13.4.0.3 */
+ error ("ISO C++ prohibits overloading operator ?:");
+ return false;
+ }
- if (!p || p == void_list_node)
- {
- if (complain)
- error ("%qD must have an argument of class or "
- "enumerated type", decl);
- return false;
- }
- }
+ /* Count the number of arguments and check for ellipsis. */
+ int arity = 0;
+ for (tree arg = argtypes; arg != void_list_node; arg = TREE_CHAIN (arg))
+ {
+ if (!arg)
+ {
+ /* Variadic. */
+ error ("%qD must not have variable number of arguments", decl);
+ return false;
}
+ ++arity;
+ }
- /* There are no restrictions on the arguments to an overloaded
- "operator ()". */
- if (operator_code == CALL_EXPR)
- return true;
-
- /* Warn about conversion operators that will never be used. */
- if (IDENTIFIER_CONV_OP_P (name)
- && ! DECL_TEMPLATE_INFO (decl)
- && warn_conversion
- /* Warn only declaring the function; there is no need to
- warn again about out-of-class definitions. */
- && class_type == current_class_type)
+ /* Verify correct number of arguments. */
+ switch (op_flags)
+ {
+ case OVL_OP_FLAG_AMBIARY:
+ if (arity == 1)
{
- tree t = TREE_TYPE (name);
- int ref = (TREE_CODE (t) == REFERENCE_TYPE);
-
- if (ref)
- t = TYPE_MAIN_VARIANT (TREE_TYPE (t));
-
- if (VOID_TYPE_P (t))
- warning (OPT_Wconversion,
- ref
- ? G_("conversion to a reference to void "
- "will never use a type conversion operator")
- : G_("conversion to void "
- "will never use a type conversion operator"));
- else if (class_type)
- {
- if (t == class_type)
- warning (OPT_Wconversion,
- ref
- ? G_("conversion to a reference to the same type "
- "will never use a type conversion operator")
- : G_("conversion to the same type "
- "will never use a type conversion operator"));
- /* Don't force t to be complete here. */
- else if (MAYBE_CLASS_TYPE_P (t)
- && COMPLETE_TYPE_P (t)
- && DERIVED_FROM_P (t, class_type))
- warning (OPT_Wconversion,
- ref
- ? G_("conversion to a reference to a base class "
- "will never use a type conversion operator")
- : G_("conversion to a base class "
- "will never use a type conversion operator"));
- }
-
+ /* We have a unary instance of an ambi-ary op. Remap to the
+ unary one. */
+ unsigned alt = ovl_op_alternate[ovl_op_mapping [operator_code]];
+ const ovl_op_info_t *ovl_op = &ovl_op_info[false][alt];
+ gcc_checking_assert (ovl_op->flags == OVL_OP_FLAG_UNARY);
+ operator_code = ovl_op->tree_code;
+ DECL_OVERLOADED_OPERATOR_CODE_RAW (decl) = ovl_op->ovl_op_code;
}
-
- if (operator_code == COND_EXPR)
+ else if (arity != 2)
{
- /* 13.4.0.3 */
- error ("ISO C++ prohibits overloading operator ?:");
+ /* This was an ambiguous operator but is invalid. */
+ error (methodp
+ ? G_("%qD must have either zero or one argument")
+ : G_("%qD must have either one or two arguments"), decl);
return false;
}
- else if (ellipsis_p)
+ else if ((operator_code == POSTINCREMENT_EXPR
+ || operator_code == POSTDECREMENT_EXPR)
+ && ! processing_template_decl
+ /* x++ and x--'s second argument must be an int. */
+ && ! same_type_p (TREE_VALUE (TREE_CHAIN (argtypes)),
+ integer_type_node))
+ {
+ error (methodp
+ ? G_("postfix %qD must have %<int%> as its argument")
+ : G_("postfix %qD must have %<int%> as its second argument"),
+ decl);
+ return false;
+ }
+ break;
+
+ case OVL_OP_FLAG_UNARY:
+ if (arity != 1)
{
- error ("%qD must not have variable number of arguments", decl);
+ error (methodp
+ ? G_("%qD must have no arguments")
+ : G_("%qD must have exactly one argument"), decl);
return false;
}
- else if (ambi_op_p (operator_code))
+ break;
+
+ case OVL_OP_FLAG_BINARY:
+ if (arity != 2)
{
- if (arity == 1)
- /* We pick the one-argument operator codes by default, so
- we don't have to change anything. */
- ;
- else if (arity == 2)
- {
- /* If we thought this was a unary operator, we now know
- it to be a binary operator. */
- switch (operator_code)
- {
- case INDIRECT_REF:
- operator_code = MULT_EXPR;
- break;
+ error (methodp
+ ? G_("%qD must have exactly one argument")
+ : G_("%qD must have exactly two arguments"), decl);
+ return false;
+ }
+ break;
- case ADDR_EXPR:
- operator_code = BIT_AND_EXPR;
- break;
+ default:
+ gcc_unreachable ();
+ }
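The switch just added replaces the old ambi_op_p/unary_op_p helpers: an ambi-ary
operator declared with unary arity is remapped to its unary tree code via
ovl_op_alternate, and the arity errors are now issued from one place. At the
user level the rules it enforces look like this (illustrative only):

  struct S
  {
    S operator* () const;     // arity 1 (just 'this'): remapped to unary INDIRECT_REF
    S operator++ (int);       // postfix form: the extra argument must be 'int'
  };
  S operator* (S, S);         // arity 2: stays the binary MULT_EXPR
  // S operator- (S, S, S);   // error: 'operator-' must have either one or two arguments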
- case UNARY_PLUS_EXPR:
- operator_code = PLUS_EXPR;
- break;
+ /* There can be no default arguments. */
+ for (tree arg = argtypes; arg != void_list_node; arg = TREE_CHAIN (arg))
+ if (TREE_PURPOSE (arg))
+ {
+ TREE_PURPOSE (arg) = NULL_TREE;
+ if (operator_code == POSTINCREMENT_EXPR
+ || operator_code == POSTDECREMENT_EXPR)
+ pedwarn (input_location, OPT_Wpedantic,
+ "%qD cannot have default arguments", decl);
+ else
+ {
+ error ("%qD cannot have default arguments", decl);
+ return false;
+ }
+ }
- case NEGATE_EXPR:
- operator_code = MINUS_EXPR;
- break;
+ /* At this point the declaration is well-formed. It may not be
+ sensible though. */
- case PREINCREMENT_EXPR:
- operator_code = POSTINCREMENT_EXPR;
- break;
+ /* Check member function warnings only on the in-class declaration.
+ There's no point warning on an out-of-class definition. */
+ if (class_type && class_type != current_class_type)
+ return true;
- case PREDECREMENT_EXPR:
- operator_code = POSTDECREMENT_EXPR;
- break;
+ /* Warn about conversion operators that will never be used. */
+ if (IDENTIFIER_CONV_OP_P (name)
+ && ! DECL_TEMPLATE_INFO (decl)
+ && warn_conversion)
+ {
+ tree t = TREE_TYPE (name);
+ int ref = (TREE_CODE (t) == REFERENCE_TYPE);
- default:
- gcc_unreachable ();
- }
+ if (ref)
+ t = TYPE_MAIN_VARIANT (TREE_TYPE (t));
- SET_OVERLOADED_OPERATOR_CODE (decl, operator_code);
+ if (VOID_TYPE_P (t))
+ warning (OPT_Wconversion,
+ ref
+ ? G_("conversion to a reference to void "
+ "will never use a type conversion operator")
+ : G_("conversion to void "
+ "will never use a type conversion operator"));
+ else if (class_type)
+ {
+ if (t == class_type)
+ warning (OPT_Wconversion,
+ ref
+ ? G_("conversion to a reference to the same type "
+ "will never use a type conversion operator")
+ : G_("conversion to the same type "
+ "will never use a type conversion operator"));
+ /* Don't force t to be complete here. */
+ else if (MAYBE_CLASS_TYPE_P (t)
+ && COMPLETE_TYPE_P (t)
+ && DERIVED_FROM_P (t, class_type))
+ warning (OPT_Wconversion,
+ ref
+ ? G_("conversion to a reference to a base class "
+ "will never use a type conversion operator")
+ : G_("conversion to a base class "
+ "will never use a type conversion operator"));
+ }
+ }
- if ((operator_code == POSTINCREMENT_EXPR
- || operator_code == POSTDECREMENT_EXPR)
- && ! processing_template_decl
- && ! same_type_p (TREE_VALUE (TREE_CHAIN (argtypes)), integer_type_node))
- {
- if (methodp)
- error ("postfix %qD must take %<int%> as its argument",
- decl);
- else
- error ("postfix %qD must take %<int%> as its second "
- "argument", decl);
- return false;
- }
- }
- else
- {
- if (methodp)
- error ("%qD must take either zero or one argument", decl);
- else
- error ("%qD must take either one or two arguments", decl);
- return false;
- }
+ if (!warn_ecpp)
+ return true;
- /* More Effective C++ rule 6. */
- if (warn_ecpp
- && (operator_code == POSTINCREMENT_EXPR
- || operator_code == POSTDECREMENT_EXPR
- || operator_code == PREINCREMENT_EXPR
- || operator_code == PREDECREMENT_EXPR))
- {
- tree arg = TREE_VALUE (argtypes);
- tree ret = TREE_TYPE (TREE_TYPE (decl));
- if (methodp || TREE_CODE (arg) == REFERENCE_TYPE)
- arg = TREE_TYPE (arg);
- arg = TYPE_MAIN_VARIANT (arg);
- if (operator_code == PREINCREMENT_EXPR
- || operator_code == PREDECREMENT_EXPR)
- {
- if (TREE_CODE (ret) != REFERENCE_TYPE
- || !same_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (ret)),
- arg))
- warning (OPT_Weffc__, "prefix %qD should return %qT", decl,
- build_reference_type (arg));
- }
- else
- {
- if (!same_type_p (TYPE_MAIN_VARIANT (ret), arg))
- warning (OPT_Weffc__, "postfix %qD should return %qT", decl, arg);
- }
- }
+ /* Effective C++ rules below. */
+
+ /* More Effective C++ rule 7. */
+ if (operator_code == TRUTH_ANDIF_EXPR
+ || operator_code == TRUTH_ORIF_EXPR
+ || operator_code == COMPOUND_EXPR)
+ warning (OPT_Weffc__,
+ "user-defined %qD always evaluates both arguments", decl);
+
+ /* More Effective C++ rule 6. */
+ if (operator_code == POSTINCREMENT_EXPR
+ || operator_code == POSTDECREMENT_EXPR
+ || operator_code == PREINCREMENT_EXPR
+ || operator_code == PREDECREMENT_EXPR)
+ {
+ tree arg = TREE_VALUE (argtypes);
+ tree ret = TREE_TYPE (TREE_TYPE (decl));
+ if (methodp || TREE_CODE (arg) == REFERENCE_TYPE)
+ arg = TREE_TYPE (arg);
+ arg = TYPE_MAIN_VARIANT (arg);
+
+ if (operator_code == PREINCREMENT_EXPR
+ || operator_code == PREDECREMENT_EXPR)
+ {
+ if (TREE_CODE (ret) != REFERENCE_TYPE
+ || !same_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (ret)), arg))
+ warning (OPT_Weffc__, "prefix %qD should return %qT", decl,
+ build_reference_type (arg));
}
- else if (unary_op_p (operator_code))
+ else
{
- if (arity != 1)
- {
- if (methodp)
- error ("%qD must take %<void%>", decl);
- else
- error ("%qD must take exactly one argument", decl);
- return false;
- }
+ if (!same_type_p (TYPE_MAIN_VARIANT (ret), arg))
+ warning (OPT_Weffc__, "postfix %qD should return %qT", decl, arg);
}
- else /* if (binary_op_p (operator_code)) */
- {
- if (arity != 2)
- {
- if (methodp)
- error ("%qD must take exactly one argument", decl);
- else
- error ("%qD must take exactly two arguments", decl);
- return false;
- }
+ }
- /* More Effective C++ rule 7. */
- if (warn_ecpp
- && (operator_code == TRUTH_ANDIF_EXPR
- || operator_code == TRUTH_ORIF_EXPR
- || operator_code == COMPOUND_EXPR))
- warning (OPT_Weffc__, "user-defined %qD always evaluates both arguments",
- decl);
- }
+ /* Effective C++ rule 23. */
+ if (!DECL_ASSIGNMENT_OPERATOR_P (decl)
+ && (operator_code == PLUS_EXPR
+ || operator_code == MINUS_EXPR
+ || operator_code == TRUNC_DIV_EXPR
+ || operator_code == MULT_EXPR
+ || operator_code == TRUNC_MOD_EXPR)
+ && TREE_CODE (TREE_TYPE (TREE_TYPE (decl))) == REFERENCE_TYPE)
+ warning (OPT_Weffc__, "%qD should return by value", decl);
- /* Effective C++ rule 23. */
- if (warn_ecpp
- && arity == 2
- && !DECL_ASSIGNMENT_OPERATOR_P (decl)
- && (operator_code == PLUS_EXPR
- || operator_code == MINUS_EXPR
- || operator_code == TRUNC_DIV_EXPR
- || operator_code == MULT_EXPR
- || operator_code == TRUNC_MOD_EXPR)
- && TREE_CODE (TREE_TYPE (TREE_TYPE (decl))) == REFERENCE_TYPE)
- warning (OPT_Weffc__, "%qD should return by value", decl);
-
- /* [over.oper]/8 */
- for (; argtypes && argtypes != void_list_node;
- argtypes = TREE_CHAIN (argtypes))
- if (TREE_PURPOSE (argtypes))
- {
- TREE_PURPOSE (argtypes) = NULL_TREE;
- if (operator_code == POSTINCREMENT_EXPR
- || operator_code == POSTDECREMENT_EXPR)
- {
- pedwarn (input_location, OPT_Wpedantic, "%qD cannot have default arguments",
- decl);
- }
- else
- {
- error ("%qD cannot have default arguments", decl);
- return false;
- }
- }
- }
return true;
}
@@ -14799,9 +14724,11 @@ start_preparsed_function (tree decl1, tree attrs, int flags)
/* Effective C++ rule 15. */
if (warn_ecpp
- && DECL_OVERLOADED_OPERATOR_P (decl1) == NOP_EXPR
+ && DECL_ASSIGNMENT_OPERATOR_P (decl1)
+ && DECL_OVERLOADED_OPERATOR_IS (decl1, NOP_EXPR)
&& VOID_TYPE_P (TREE_TYPE (fntype)))
- warning (OPT_Weffc__, "%<operator=%> should return a reference to %<*this%>");
+ warning (OPT_Weffc__,
+ "%<operator=%> should return a reference to %<*this%>");
/* Make the init_value nonzero so pushdecl knows this is not tentative.
error_mark_node is replaced below (in poplevel) with the BLOCK. */
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index bc509623b36..a23b96c53e7 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -4455,7 +4455,7 @@ maybe_warn_sized_delete (enum tree_code code)
tree sized = NULL_TREE;
tree unsized = NULL_TREE;
- for (ovl_iterator iter (get_global_binding (cp_operator_id (code)));
+ for (ovl_iterator iter (get_global_binding (ovl_op_identifier (false, code)));
iter; ++iter)
{
tree fn = *iter;
@@ -5101,11 +5101,11 @@ mark_used (tree decl, tsubst_flags_t complain)
&& DECL_DELETED_FN (decl))
{
if (DECL_ARTIFICIAL (decl)
- && DECL_OVERLOADED_OPERATOR_P (decl) == TYPE_EXPR
+ && DECL_CONV_FN_P (decl)
&& LAMBDA_TYPE_P (DECL_CONTEXT (decl)))
/* We mark a lambda conversion op as deleted if we can't
generate it properly; see maybe_add_lambda_conv_op. */
- sorry ("converting lambda which uses %<...%> to function pointer");
+ sorry ("converting lambda that uses %<...%> to function pointer");
else if (complain & tf_error)
{
error ("use of deleted function %qD", decl);
diff --git a/gcc/cp/dump.c b/gcc/cp/dump.c
index 6fafa5b792e..2e4740f71eb 100644
--- a/gcc/cp/dump.c
+++ b/gcc/cp/dump.c
@@ -24,10 +24,6 @@ along with GCC; see the file COPYING3. If not see
#include "cp-tree.h"
#include "tree-dump.h"
-static void dump_access (dump_info_p, tree);
-
-static void dump_op (dump_info_p, tree);
-
/* Dump a representation of the accessibility information associated
with T. */
@@ -42,163 +38,6 @@ dump_access (dump_info_p di, tree t)
dump_string_field (di, "accs", "pub");
}
-/* Dump a representation of the specific operator for an overloaded
- operator associated with node t. */
-
-static void
-dump_op (dump_info_p di, tree t)
-{
- switch (DECL_OVERLOADED_OPERATOR_P (t)) {
- case NEW_EXPR:
- dump_string (di, "new");
- break;
- case VEC_NEW_EXPR:
- dump_string (di, "vecnew");
- break;
- case DELETE_EXPR:
- dump_string (di, "delete");
- break;
- case VEC_DELETE_EXPR:
- dump_string (di, "vecdelete");
- break;
- case UNARY_PLUS_EXPR:
- dump_string (di, "pos");
- break;
- case NEGATE_EXPR:
- dump_string (di, "neg");
- break;
- case ADDR_EXPR:
- dump_string (di, "addr");
- break;
- case INDIRECT_REF:
- dump_string(di, "deref");
- break;
- case BIT_NOT_EXPR:
- dump_string(di, "not");
- break;
- case TRUTH_NOT_EXPR:
- dump_string(di, "lnot");
- break;
- case PREINCREMENT_EXPR:
- dump_string(di, "preinc");
- break;
- case PREDECREMENT_EXPR:
- dump_string(di, "predec");
- break;
- case PLUS_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "plusassign");
- else
- dump_string(di, "plus");
- break;
- case MINUS_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "minusassign");
- else
- dump_string(di, "minus");
- break;
- case MULT_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "multassign");
- else
- dump_string (di, "mult");
- break;
- case TRUNC_DIV_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "divassign");
- else
- dump_string (di, "div");
- break;
- case TRUNC_MOD_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "modassign");
- else
- dump_string (di, "mod");
- break;
- case BIT_AND_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "andassign");
- else
- dump_string (di, "and");
- break;
- case BIT_IOR_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "orassign");
- else
- dump_string (di, "or");
- break;
- case BIT_XOR_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "xorassign");
- else
- dump_string (di, "xor");
- break;
- case LSHIFT_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "lshiftassign");
- else
- dump_string (di, "lshift");
- break;
- case RSHIFT_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "rshiftassign");
- else
- dump_string (di, "rshift");
- break;
- case EQ_EXPR:
- dump_string (di, "eq");
- break;
- case NE_EXPR:
- dump_string (di, "ne");
- break;
- case LT_EXPR:
- dump_string (di, "lt");
- break;
- case GT_EXPR:
- dump_string (di, "gt");
- break;
- case LE_EXPR:
- dump_string (di, "le");
- break;
- case GE_EXPR:
- dump_string (di, "ge");
- break;
- case TRUTH_ANDIF_EXPR:
- dump_string (di, "land");
- break;
- case TRUTH_ORIF_EXPR:
- dump_string (di, "lor");
- break;
- case COMPOUND_EXPR:
- dump_string (di, "compound");
- break;
- case MEMBER_REF:
- dump_string (di, "memref");
- break;
- case COMPONENT_REF:
- dump_string (di, "ref");
- break;
- case ARRAY_REF:
- dump_string (di, "subs");
- break;
- case POSTINCREMENT_EXPR:
- dump_string (di, "postinc");
- break;
- case POSTDECREMENT_EXPR:
- dump_string (di, "postdec");
- break;
- case CALL_EXPR:
- dump_string (di, "call");
- break;
- case NOP_EXPR:
- if (DECL_ASSIGNMENT_OPERATOR_P (t))
- dump_string (di, "assign");
- break;
- default:
- break;
- }
-}
-
/* Dump information common to statements from STMT. */
static void
@@ -303,10 +142,8 @@ cp_dump_tree (void* dump_info, tree t)
case FUNCTION_DECL:
if (!DECL_THUNK_P (t))
{
- if (DECL_OVERLOADED_OPERATOR_P (t)) {
+ if (DECL_OVERLOADED_OPERATOR_P (t))
dump_string_field (di, "note", "operator");
- dump_op (di, t);
- }
if (DECL_FUNCTION_MEMBER_P (t))
{
dump_string_field (di, "note", "member");
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 7a98d2e3594..2537713b5c9 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -50,13 +50,12 @@ static cxx_pretty_printer * const cxx_pp = &actual_pretty_printer;
# define NEXT_CODE(T) (TREE_CODE (TREE_TYPE (T)))
static const char *args_to_string (tree, int);
-static const char *assop_to_string (enum tree_code);
static const char *code_to_string (enum tree_code);
static const char *cv_to_string (tree, int);
static const char *decl_to_string (tree, int);
static const char *expr_to_string (tree);
static const char *fndecl_to_string (tree, int);
-static const char *op_to_string (enum tree_code);
+static const char *op_to_string (bool, enum tree_code);
static const char *parm_to_string (int);
static const char *type_to_string (tree, int);
@@ -2230,8 +2229,7 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
case INIT_EXPR:
case MODIFY_EXPR:
- dump_binary_op (pp, assignment_operator_name_info[NOP_EXPR].name,
- t, flags);
+ dump_binary_op (pp, OVL_OP_INFO (true, NOP_EXPR)->name, t, flags);
break;
case PLUS_EXPR:
@@ -2255,7 +2253,7 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
case EQ_EXPR:
case NE_EXPR:
case EXACT_DIV_EXPR:
- dump_binary_op (pp, operator_name_info[TREE_CODE (t)].name, t, flags);
+ dump_binary_op (pp, OVL_OP_INFO (false, TREE_CODE (t))->name, t, flags);
break;
case CEIL_DIV_EXPR:
@@ -2386,14 +2384,14 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
case TRUTH_NOT_EXPR:
case PREDECREMENT_EXPR:
case PREINCREMENT_EXPR:
- dump_unary_op (pp, operator_name_info [TREE_CODE (t)].name, t, flags);
+ dump_unary_op (pp, OVL_OP_INFO (false, TREE_CODE (t))->name, t, flags);
break;
case POSTDECREMENT_EXPR:
case POSTINCREMENT_EXPR:
pp_cxx_left_paren (pp);
dump_expr (pp, TREE_OPERAND (t, 0), flags | TFF_EXPR_IN_PARENS);
- pp_cxx_ws_string (pp, operator_name_info[TREE_CODE (t)].name);
+ pp_cxx_ws_string (pp, OVL_OP_INFO (false, TREE_CODE (t))->name);
pp_cxx_right_paren (pp);
break;
@@ -2656,7 +2654,7 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
case REALPART_EXPR:
case IMAGPART_EXPR:
- pp_cxx_ws_string (pp, operator_name_info[TREE_CODE (t)].name);
+ pp_cxx_ws_string (pp, OVL_OP_INFO (false, TREE_CODE (t))->name);
pp_cxx_whitespace (pp);
dump_expr (pp, TREE_OPERAND (t, 0), flags);
break;
@@ -3136,9 +3134,9 @@ parm_to_string (int p)
}
static const char *
-op_to_string (enum tree_code p)
+op_to_string (bool assop, enum tree_code p)
{
- tree id = operator_name_info[p].identifier;
+ tree id = ovl_op_identifier (assop, p);
return id ? IDENTIFIER_POINTER (id) : M_("<unknown>");
}
@@ -3180,13 +3178,6 @@ type_to_string (tree typ, int verbose)
}
static const char *
-assop_to_string (enum tree_code p)
-{
- tree id = assignment_operator_name_info[(int) p].identifier;
- return id ? IDENTIFIER_POINTER (id) : M_("{unknown}");
-}
-
-static const char *
args_to_string (tree p, int verbose)
{
int flags = 0;
@@ -4044,9 +4035,9 @@ cp_printer (pretty_printer *pp, text_info *text, const char *spec,
case 'E': result = expr_to_string (next_tree); break;
case 'F': result = fndecl_to_string (next_tree, verbose); break;
case 'L': result = language_to_string (next_lang); break;
- case 'O': result = op_to_string (next_tcode); break;
+ case 'O': result = op_to_string (false, next_tcode); break;
case 'P': result = parm_to_string (next_int); break;
- case 'Q': result = assop_to_string (next_tcode); break;
+ case 'Q': result = op_to_string (true, next_tcode); break;
case 'S': result = subst_to_string (next_tree); break;
case 'T': result = type_to_string (next_tree, verbose); break;
case 'V': result = cv_to_string (next_tree, verbose); break;
diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index ad30f247cf6..9e6e3aff779 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -3055,7 +3055,7 @@ build_new_1 (vec<tree, va_gc> **placement, tree type, tree nelts,
tree fnname;
tree fns;
- fnname = cp_operator_id (array_p ? VEC_NEW_EXPR : NEW_EXPR);
+ fnname = ovl_op_identifier (false, array_p ? VEC_NEW_EXPR : NEW_EXPR);
member_new_p = !globally_qualified_p
&& CLASS_TYPE_P (elt_type)
diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 76f2f29578f..bb6c68a100a 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -201,7 +201,7 @@ lambda_function (tree lambda)
if (CLASSTYPE_TEMPLATE_INSTANTIATION (type)
&& !COMPLETE_OR_OPEN_TYPE_P (type))
return NULL_TREE;
- lambda = lookup_member (type, cp_operator_id (CALL_EXPR),
+ lambda = lookup_member (type, call_op_identifier,
/*protect=*/0, /*want_type=*/false,
tf_warning_or_error);
if (lambda)
@@ -1258,7 +1258,6 @@ maybe_add_lambda_conv_op (tree type)
tree fn = convfn;
DECL_SOURCE_LOCATION (fn) = DECL_SOURCE_LOCATION (callop);
SET_DECL_ALIGN (fn, MINIMUM_METHOD_BOUNDARY);
- SET_OVERLOADED_OPERATOR_CODE (fn, TYPE_EXPR);
grokclassfn (type, fn, NO_SPECIAL);
set_linkage_according_to_type (type, fn);
rest_of_decl_compilation (fn, namespace_bindings_p (), at_eof);
@@ -1312,11 +1311,9 @@ maybe_add_lambda_conv_op (tree type)
fn = add_inherited_template_parms (fn, DECL_TI_TEMPLATE (callop));
if (flag_sanitize & SANITIZE_NULL)
- {
- /* Don't UBsan this function; we're deliberately calling op() with a null
- object argument. */
- add_no_sanitize_value (fn, SANITIZE_UNDEFINED);
- }
+ /* Don't UBsan this function; we're deliberately calling op() with a null
+ object argument. */
+ add_no_sanitize_value (fn, SANITIZE_UNDEFINED);
add_method (type, fn, false);
diff --git a/gcc/cp/lex.c b/gcc/cp/lex.c
index fd93401f9a6..c097f4b54cf 100644
--- a/gcc/cp/lex.c
+++ b/gcc/cp/lex.c
@@ -77,17 +77,20 @@ cxx_finish (void)
c_common_finish ();
}
-/* A mapping from tree codes to operator name information. */
-operator_name_info_t operator_name_info[(int) MAX_TREE_CODES];
-/* Similar, but for assignment operators. */
-operator_name_info_t assignment_operator_name_info[(int) MAX_TREE_CODES];
-
-/* Initialize data structures that keep track of operator names. */
-
-#define DEF_OPERATOR(NAME, C, M, AR, AP) \
- CONSTRAINT (C, sizeof "operator " + sizeof NAME <= 256);
+ovl_op_info_t ovl_op_info[2][OVL_OP_MAX] =
+ {
+ {
+ {NULL_TREE, NULL, NULL, ERROR_MARK, OVL_OP_ERROR_MARK, 0},
+ {NULL_TREE, NULL, NULL, NOP_EXPR, OVL_OP_NOP_EXPR, 0},
+#define DEF_OPERATOR(NAME, CODE, MANGLING, FLAGS) \
+ {NULL_TREE, NAME, MANGLING, CODE, OVL_OP_##CODE, FLAGS},
+#define OPERATOR_TRANSITION }, { \
+ {NULL_TREE, NULL, NULL, ERROR_MARK, OVL_OP_ERROR_MARK, 0},
#include "operators.def"
-#undef DEF_OPERATOR
+ }
+ };
+unsigned char ovl_op_mapping[MAX_TREE_CODES];
+unsigned char ovl_op_alternate[OVL_OP_MAX];
/* Get the name of the kind of identifier T. */
@@ -97,7 +100,8 @@ get_identifier_kind_name (tree id)
/* Keep in sync with cp_id_kind enumeration. */
static const char *const names[cik_max] = {
"normal", "keyword", "constructor", "destructor",
- "assign-op", "op-assign-op", "simple-op", "conv-op", };
+ "simple-op", "assign-op", "conv-op", "<reserved>udlit-op"
+ };
unsigned kind = 0;
kind |= IDENTIFIER_KIND_BIT_2 (id) << 2;
@@ -120,66 +124,94 @@ set_identifier_kind (tree id, cp_identifier_kind kind)
IDENTIFIER_KIND_BIT_0 (id) |= (kind >> 0) & 1;
}
+/* Create and tag the internal operator name for the overloaded
+ operator PTR describes. */
+
+static tree
+set_operator_ident (ovl_op_info_t *ptr)
+{
+ char buffer[32];
+ size_t len = snprintf (buffer, sizeof (buffer), "operator%s%s",
+ &" "[ptr->name[0] && ptr->name[0] != '_'
+ && !ISALPHA (ptr->name[0])],
+ ptr->name);
+ gcc_checking_assert (len < sizeof (buffer));
+
+ tree ident = get_identifier_with_length (buffer, len);
+ ptr->identifier = ident;
+
+ return ident;
+}
+
+/* Initialize data structures that keep track of operator names. */
+
static void
init_operators (void)
{
- tree identifier;
- char buffer[256];
- struct operator_name_info_t *oni;
-
-#define DEF_OPERATOR(NAME, CODE, MANGLING, ARITY, KIND) \
- sprintf (buffer, "operator%s%s", !NAME[0] \
- || NAME[0] == '_' || ISALPHA (NAME[0]) ? " " : "", NAME); \
- identifier = get_identifier (buffer); \
- \
- if (KIND != cik_simple_op || !IDENTIFIER_ANY_OP_P (identifier)) \
- set_identifier_kind (identifier, KIND); \
- \
- oni = (KIND == cik_assign_op \
- ? &assignment_operator_name_info[(int) CODE] \
- : &operator_name_info[(int) CODE]); \
- oni->identifier = identifier; \
- oni->name = NAME; \
- oni->mangled_name = MANGLING; \
- oni->arity = ARITY;
+ /* We rely on both these being zero. */
+ gcc_checking_assert (!OVL_OP_ERROR_MARK && !ERROR_MARK);
-#include "operators.def"
-#undef DEF_OPERATOR
-
- operator_name_info[(int) TYPE_EXPR] = operator_name_info[(int) CAST_EXPR];
- operator_name_info[(int) ERROR_MARK].identifier
- = get_identifier ("<invalid operator>");
-
- /* Handle some special cases. These operators are not defined in
- the language, but can be produced internally. We may need them
- for error-reporting. (Eventually, we should ensure that this
- does not happen. Error messages involving these operators will
- be confusing to users.) */
-
- operator_name_info [(int) INIT_EXPR].name
- = operator_name_info [(int) MODIFY_EXPR].name;
-
- operator_name_info [(int) EXACT_DIV_EXPR].name = "(ceiling /)";
- operator_name_info [(int) CEIL_DIV_EXPR].name = "(ceiling /)";
- operator_name_info [(int) FLOOR_DIV_EXPR].name = "(floor /)";
- operator_name_info [(int) ROUND_DIV_EXPR].name = "(round /)";
- operator_name_info [(int) CEIL_MOD_EXPR].name = "(ceiling %)";
- operator_name_info [(int) FLOOR_MOD_EXPR].name = "(floor %)";
- operator_name_info [(int) ROUND_MOD_EXPR].name = "(round %)";
-
- operator_name_info [(int) ABS_EXPR].name = "abs";
- operator_name_info [(int) TRUTH_AND_EXPR].name = "strict &&";
- operator_name_info [(int) TRUTH_OR_EXPR].name = "strict ||";
- operator_name_info [(int) RANGE_EXPR].name = "...";
- operator_name_info [(int) UNARY_PLUS_EXPR].name = "+";
-
- assignment_operator_name_info [(int) EXACT_DIV_EXPR].name = "(exact /=)";
- assignment_operator_name_info [(int) CEIL_DIV_EXPR].name = "(ceiling /=)";
- assignment_operator_name_info [(int) FLOOR_DIV_EXPR].name = "(floor /=)";
- assignment_operator_name_info [(int) ROUND_DIV_EXPR].name = "(round /=)";
- assignment_operator_name_info [(int) CEIL_MOD_EXPR].name = "(ceiling %=)";
- assignment_operator_name_info [(int) FLOOR_MOD_EXPR].name = "(floor %=)";
- assignment_operator_name_info [(int) ROUND_MOD_EXPR].name = "(round %=)";
+ /* This loop iterates backwards because we need to move the
+ assignment operators down to their correct slots. I.e. morally
+ equivalent to an overlapping memmove where dest > src. Slot
+     zero is for error_mark, so has no operator.  */
+ for (unsigned ix = OVL_OP_MAX; --ix;)
+ {
+ ovl_op_info_t *op_ptr = &ovl_op_info[false][ix];
+
+ if (op_ptr->name)
+ {
+ /* Make sure it fits in lang_decl_fn::operator_code. */
+ gcc_checking_assert (op_ptr->ovl_op_code < (1 << 6));
+ tree ident = set_operator_ident (op_ptr);
+ if (unsigned index = IDENTIFIER_CP_INDEX (ident))
+ {
+ ovl_op_info_t *bin_ptr = &ovl_op_info[false][index];
+
+	      /* They should only differ in unary/binary-ness.  */
+ gcc_checking_assert ((op_ptr->flags ^ bin_ptr->flags)
+ == OVL_OP_FLAG_AMBIARY);
+ bin_ptr->flags |= op_ptr->flags;
+ ovl_op_alternate[index] = ix;
+ }
+ else
+ {
+ IDENTIFIER_CP_INDEX (ident) = ix;
+ set_identifier_kind (ident, cik_simple_op);
+ }
+ }
+ if (op_ptr->tree_code)
+ {
+ gcc_checking_assert (op_ptr->ovl_op_code == ix
+ && !ovl_op_mapping[op_ptr->tree_code]);
+ ovl_op_mapping[op_ptr->tree_code] = op_ptr->ovl_op_code;
+ }
+
+ ovl_op_info_t *as_ptr = &ovl_op_info[true][ix];
+ if (as_ptr->name)
+ {
+	  /* These will be placed at the start of the array; move to
+ the correct slot and initialize. */
+ if (as_ptr->ovl_op_code != ix)
+ {
+ ovl_op_info_t *dst_ptr = &ovl_op_info[true][as_ptr->ovl_op_code];
+ gcc_assert (as_ptr->ovl_op_code > ix && !dst_ptr->tree_code);
+ memcpy (dst_ptr, as_ptr, sizeof (*dst_ptr));
+ memset (as_ptr, 0, sizeof (*as_ptr));
+ as_ptr = dst_ptr;
+ }
+
+ tree ident = set_operator_ident (as_ptr);
+ gcc_checking_assert (!IDENTIFIER_CP_INDEX (ident));
+ IDENTIFIER_CP_INDEX (ident) = as_ptr->ovl_op_code;
+ set_identifier_kind (ident, cik_assign_op);
+
+ gcc_checking_assert (!ovl_op_mapping[as_ptr->tree_code]
+ || (ovl_op_mapping[as_ptr->tree_code]
+ == as_ptr->ovl_op_code));
+ ovl_op_mapping[as_ptr->tree_code] = as_ptr->ovl_op_code;
+ }
+ }
}
/* Initialize the reserved words. */
@@ -462,10 +494,7 @@ unqualified_name_lookup_error (tree name, location_t loc)
loc = EXPR_LOC_OR_LOC (name, input_location);
if (IDENTIFIER_ANY_OP_P (name))
- {
- if (name != cp_operator_id (ERROR_MARK))
- error_at (loc, "%qD not defined", name);
- }
+ error_at (loc, "%qD not defined", name);
else
{
if (!objc_diagnose_private_ivar (name))
@@ -585,7 +614,6 @@ make_conv_op_name (tree type)
/* Just in case something managed to bind. */
IDENTIFIER_BINDING (identifier) = NULL;
- IDENTIFIER_LABEL_VALUE (identifier) = NULL_TREE;
/* Hang TYPE off the identifier so it can be found easily later
when performing conversions. */
diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 19e7c62e483..ea7f55ca293 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -1263,35 +1263,10 @@ write_unqualified_id (tree identifier)
{
if (IDENTIFIER_CONV_OP_P (identifier))
write_conversion_operator_name (TREE_TYPE (identifier));
- else if (IDENTIFIER_ANY_OP_P (identifier))
+ else if (IDENTIFIER_OVL_OP_P (identifier))
{
- int i;
- const char *mangled_name = NULL;
-
- /* Unfortunately, there is no easy way to go from the
- name of the operator back to the corresponding tree
- code. */
- for (i = 0; i < MAX_TREE_CODES; ++i)
- if (operator_name_info[i].identifier == identifier)
- {
- /* The ABI says that we prefer binary operator
- names to unary operator names. */
- if (operator_name_info[i].arity == 2)
- {
- mangled_name = operator_name_info[i].mangled_name;
- break;
- }
- else if (!mangled_name)
- mangled_name = operator_name_info[i].mangled_name;
- }
- else if (assignment_operator_name_info[i].identifier
- == identifier)
- {
- mangled_name
- = assignment_operator_name_info[i].mangled_name;
- break;
- }
- write_string (mangled_name);
+ const ovl_op_info_t *ovl_op = IDENTIFIER_OVL_OP_INFO (identifier);
+ write_string (ovl_op->mangled_name);
}
else if (UDLIT_OPER_P (identifier))
write_literal_operator_name (identifier);
@@ -1345,13 +1320,10 @@ write_unqualified_name (tree decl)
}
else if (DECL_OVERLOADED_OPERATOR_P (decl))
{
- operator_name_info_t *oni;
- if (DECL_ASSIGNMENT_OPERATOR_P (decl))
- oni = assignment_operator_name_info;
- else
- oni = operator_name_info;
-
- write_string (oni[DECL_OVERLOADED_OPERATOR_P (decl)].mangled_name);
+ const char *mangled_name
+ = (ovl_op_info[DECL_ASSIGNMENT_OPERATOR_P (decl)]
+ [DECL_OVERLOADED_OPERATOR_CODE_RAW (decl)].mangled_name);
+ write_string (mangled_name);
}
else if (UDLIT_OPER_P (DECL_NAME (decl)))
write_literal_operator_name (DECL_NAME (decl));
@@ -3065,8 +3037,8 @@ write_expression (tree expr)
else if (TREE_CODE (expr) == MODOP_EXPR)
{
enum tree_code subop = TREE_CODE (TREE_OPERAND (expr, 1));
- const char *name = (assignment_operator_name_info[(int) subop]
- .mangled_name);
+ const char *name = OVL_OP_INFO (true, subop)->mangled_name;
+
write_string (name);
write_expression (TREE_OPERAND (expr, 0));
write_expression (TREE_OPERAND (expr, 2));
@@ -3091,7 +3063,7 @@ write_expression (tree expr)
if (NEW_EXPR_USE_GLOBAL (expr))
write_string ("gs");
- write_string (operator_name_info[(int) code].mangled_name);
+ write_string (OVL_OP_INFO (false, code)->mangled_name);
for (t = placement; t; t = TREE_CHAIN (t))
write_expression (TREE_VALUE (t));
@@ -3131,7 +3103,7 @@ write_expression (tree expr)
if (DELETE_EXPR_USE_GLOBAL (expr))
write_string ("gs");
- write_string (operator_name_info[(int) code].mangled_name);
+ write_string (OVL_OP_INFO (false, code)->mangled_name);
write_expression (TREE_OPERAND (expr, 0));
}
@@ -3196,7 +3168,7 @@ write_expression (tree expr)
if (TREE_CODE (ob) == ARROW_EXPR)
{
- write_string (operator_name_info[(int)code].mangled_name);
+ write_string (OVL_OP_INFO (false, code)->mangled_name);
ob = TREE_OPERAND (ob, 0);
write_expression (ob);
}
@@ -3213,7 +3185,7 @@ write_expression (tree expr)
}
/* If it wasn't any of those, recursively expand the expression. */
- name = operator_name_info[(int) code].mangled_name;
+ name = OVL_OP_INFO (false, code)->mangled_name;
/* We used to mangle const_cast and static_cast like a C cast. */
if (code == CONST_CAST_EXPR
@@ -3222,7 +3194,7 @@ write_expression (tree expr)
if (abi_warn_or_compat_version_crosses (6))
G.need_abi_warning = 1;
if (!abi_version_at_least (6))
- name = operator_name_info[CAST_EXPR].mangled_name;
+ name = OVL_OP_INFO (false, CAST_EXPR)->mangled_name;
}
if (name == NULL)
@@ -3323,7 +3295,7 @@ write_expression (tree expr)
if (i == 0)
{
int fcode = TREE_INT_CST_LOW (operand);
- write_string (operator_name_info[fcode].mangled_name);
+ write_string (OVL_OP_INFO (false, fcode)->mangled_name);
continue;
}
else if (code == BINARY_LEFT_FOLD_EXPR)
diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 4e56874ae26..714b5087991 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -815,7 +815,7 @@ do_build_copy_assign (tree fndecl)
parmvec = make_tree_vector_single (converted_parm);
finish_expr_stmt
(build_special_member_call (current_class_ref,
- cp_assignment_operator_id (NOP_EXPR),
+ assign_op_identifier,
&parmvec,
base_binfo,
flags,
@@ -929,7 +929,8 @@ synthesize_method (tree fndecl)
start_preparsed_function (fndecl, NULL_TREE, SF_DEFAULT | SF_PRE_PARSED);
stmt = begin_function_body ();
- if (DECL_OVERLOADED_OPERATOR_P (fndecl) == NOP_EXPR)
+ if (DECL_ASSIGNMENT_OPERATOR_P (fndecl)
+ && DECL_OVERLOADED_OPERATOR_IS (fndecl, NOP_EXPR))
{
do_build_copy_assign (fndecl);
need_body = false;
@@ -1108,7 +1109,7 @@ get_copy_assign (tree type)
int quals = (TYPE_HAS_CONST_COPY_ASSIGN (type)
? TYPE_QUAL_CONST : TYPE_UNQUALIFIED);
tree argtype = build_stub_type (type, quals, false);
- tree fn = locate_fn_flags (type, cp_assignment_operator_id (NOP_EXPR), argtype,
+ tree fn = locate_fn_flags (type, assign_op_identifier, argtype,
LOOKUP_NORMAL, tf_warning_or_error);
if (fn == error_mark_node)
return NULL_TREE;
@@ -1565,7 +1566,7 @@ synthesized_method_walk (tree ctype, special_function_kind sfk, bool const_p,
case sfk_move_assignment:
case sfk_copy_assignment:
assign_p = true;
- fnname = cp_assignment_operator_id (NOP_EXPR);
+ fnname = assign_op_identifier;
break;
case sfk_destructor:
@@ -1702,12 +1703,12 @@ synthesized_method_walk (tree ctype, special_function_kind sfk, bool const_p,
{
/* Unlike for base ctor/op=/dtor, for operator delete it's fine
to have a null fn (no class-specific op delete). */
- fn = locate_fn_flags (ctype, cp_operator_id (DELETE_EXPR),
+ fn = locate_fn_flags (ctype, ovl_op_identifier (false, DELETE_EXPR),
ptr_type_node, flags, tf_none);
if (fn && fn == error_mark_node)
{
if (complain & tf_error)
- locate_fn_flags (ctype, cp_operator_id (DELETE_EXPR),
+ locate_fn_flags (ctype, ovl_op_identifier (false, DELETE_EXPR),
ptr_type_node, flags, complain);
if (deleted_p)
*deleted_p = true;
@@ -2007,7 +2008,7 @@ implicitly_declare_fn (special_function_kind kind, tree type,
|| kind == sfk_move_assignment)
{
return_type = build_reference_type (type);
- name = cp_assignment_operator_id (NOP_EXPR);
+ name = assign_op_identifier;
}
else
name = ctor_identifier;
@@ -2077,7 +2078,7 @@ implicitly_declare_fn (special_function_kind kind, tree type,
if (!IDENTIFIER_CDTOR_P (name))
/* Assignment operator. */
- SET_OVERLOADED_OPERATOR_CODE (fn, NOP_EXPR);
+ DECL_OVERLOADED_OPERATOR_CODE_RAW (fn) = OVL_OP_NOP_EXPR;
else if (IDENTIFIER_CTOR_P (name))
DECL_CXX_CONSTRUCTOR_P (fn) = true;
else
@@ -2318,7 +2319,7 @@ defaultable_fn_check (tree fn)
else if (DECL_DESTRUCTOR_P (fn))
kind = sfk_destructor;
else if (DECL_ASSIGNMENT_OPERATOR_P (fn)
- && DECL_OVERLOADED_OPERATOR_P (fn) == NOP_EXPR)
+ && DECL_OVERLOADED_OPERATOR_IS (fn, NOP_EXPR))
{
if (copy_fn_p (fn))
kind = sfk_copy_assignment;
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index b1b4ebbb7de..b4976d8b7cc 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -1299,7 +1299,7 @@ get_class_binding (tree klass, tree name, int type_or_fns)
if (CLASSTYPE_LAZY_DESTRUCTOR (klass))
lazily_declare_fn (sfk_destructor, klass);
}
- else if (name == cp_assignment_operator_id (NOP_EXPR))
+ else if (name == assign_op_identifier)
{
if (CLASSTYPE_LAZY_COPY_ASSIGN (klass))
lazily_declare_fn (sfk_copy_assignment, klass);
@@ -5377,7 +5377,7 @@ suggest_alternatives_for (location_t location, tree name,
gcc_rich_location richloc (location);
richloc.add_fixit_replace (fuzzy);
- inform_at_rich_loc (&richloc, "suggested alternative: %qs", fuzzy);
+ inform (&richloc, "suggested alternative: %qs", fuzzy);
}
}
@@ -5485,10 +5485,10 @@ maybe_suggest_missing_header (location_t location, tree name, tree scope)
gcc_rich_location richloc (location);
maybe_add_include_fixit (&richloc, header_hint);
- inform_at_rich_loc (&richloc,
- "%<std::%s%> is defined in header %qs;"
- " did you forget to %<#include %s%>?",
- name_str, header_hint, header_hint);
+ inform (&richloc,
+ "%<std::%s%> is defined in header %qs;"
+ " did you forget to %<#include %s%>?",
+ name_str, header_hint, header_hint);
return true;
}
@@ -5518,8 +5518,8 @@ suggest_alternative_in_explicit_scope (location_t location, tree name,
{
gcc_rich_location richloc (location);
richloc.add_fixit_replace (fuzzy_name);
- inform_at_rich_loc (&richloc, "suggested alternative: %qs",
- fuzzy_name);
+ inform (&richloc, "suggested alternative: %qs",
+ fuzzy_name);
return true;
}
diff --git a/gcc/cp/name-lookup.h b/gcc/cp/name-lookup.h
index bf0bf85cd53..1fc128070d3 100644
--- a/gcc/cp/name-lookup.h
+++ b/gcc/cp/name-lookup.h
@@ -148,15 +148,6 @@ struct GTY(()) cp_class_binding {
tree identifier;
};
-
-struct GTY(()) cp_label_binding {
- /* The bound LABEL_DECL. */
- tree label;
- /* The previous IDENTIFIER_LABEL_VALUE. */
- tree prev_value;
-};
-
-
/* For each binding contour we allocate a binding_level structure
which records the names defined in that contour.
Contours include:
@@ -202,10 +193,6 @@ struct GTY(()) cp_binding_level {
the class. */
tree type_shadowed;
- /* Similar to class_shadowed, but for IDENTIFIER_LABEL_VALUE, and
- used for all binding levels. */
- vec<cp_label_binding, va_gc> *shadowed_labels;
-
/* For each level (except not the global one),
a chain of BLOCK nodes for all the levels
that were entered and exited one level down. */
diff --git a/gcc/cp/operators.def b/gcc/cp/operators.def
index 7dfdd227241..119529ccddd 100644
--- a/gcc/cp/operators.def
+++ b/gcc/cp/operators.def
@@ -45,116 +45,114 @@ along with GCC; see the file COPYING3. If not see
mangled under the new ABI. For `operator +', for example, this
would be "pl".
- ARITY
+ FLAGS
- The arity of the operator, or -1 if any arity is allowed. (As
- for `operator ()'.) Postincrement and postdecrement operators
- are marked as binary.
-
- ASSN_P
-
- A boolean value. If nonzero, this is an assignment operator.
+ ovl_op_flags bits. Postincrement and postdecrement operators are
+ marked as binary.
Before including this file, you should define DEF_OPERATOR
to take these arguments.
There is code (such as in grok_op_properties) that depends on the
- order the operators are presented in this file. In particular,
- unary operators must precede binary operators. */
-
-/* Use DEF_SIMPLE_OPERATOR to define a non-assignment operator. Its
- arguments are as for DEF_OPERATOR, but there is no need to provide
- an ASSIGNMENT_P argument; it is always zero. */
-
-#define DEF_SIMPLE_OPERATOR(NAME, CODE, MANGLING, ARITY) \
- DEF_OPERATOR(NAME, CODE, MANGLING, ARITY, cik_simple_op)
+ order the operators are presented in this file. Unary_ops must
+   precede a matching binary op (i.e. '+').  Assignment operators must
+ be last, after OPERATOR_TRANSITION. */
/* Use DEF_ASSN_OPERATOR to define an assignment operator. Its
arguments are as for DEF_OPERATOR, but there is no need to provide
- an ASSIGNMENT_P argument; it is always one. */
-
-#define DEF_ASSN_OPERATOR(NAME, CODE, MANGLING, ARITY) \
- DEF_OPERATOR(NAME, CODE, MANGLING, ARITY, cik_assign_op)
-
-/* Memory allocation operators. */
-DEF_OPERATOR ("new", NEW_EXPR, "nw", -1, cik_newdel_op)
-DEF_OPERATOR ("new []", VEC_NEW_EXPR, "na", -1, cik_newdel_op)
-DEF_OPERATOR ("delete", DELETE_EXPR, "dl", -1, cik_newdel_op)
-DEF_OPERATOR ("delete []", VEC_DELETE_EXPR, "da", -1, cik_newdel_op)
+ FLAGS (OVL_OP_FLAG_BINARY). */
+
+#ifndef DEF_ASSN_OPERATOR
+#define DEF_ASSN_OPERATOR(NAME, CODE, MANGLING) \
+ DEF_OPERATOR(NAME, CODE, MANGLING, OVL_OP_FLAG_BINARY)
+#endif
+
+/* Memory allocation operators.  FLAGS has special meaning.  */
+DEF_OPERATOR ("new", NEW_EXPR, "nw", OVL_OP_FLAG_ALLOC)
+DEF_OPERATOR ("new []", VEC_NEW_EXPR, "na",
+ OVL_OP_FLAG_ALLOC | OVL_OP_FLAG_VEC)
+DEF_OPERATOR ("delete", DELETE_EXPR, "dl",
+ OVL_OP_FLAG_ALLOC | OVL_OP_FLAG_DELETE)
+DEF_OPERATOR ("delete []", VEC_DELETE_EXPR, "da",
+ OVL_OP_FLAG_ALLOC | OVL_OP_FLAG_DELETE | OVL_OP_FLAG_VEC)
/* Unary operators. */
-DEF_SIMPLE_OPERATOR ("+", UNARY_PLUS_EXPR, "ps", 1)
-DEF_SIMPLE_OPERATOR ("-", NEGATE_EXPR, "ng", 1)
-DEF_SIMPLE_OPERATOR ("&", ADDR_EXPR, "ad", 1)
-DEF_SIMPLE_OPERATOR ("*", INDIRECT_REF, "de", 1)
-DEF_SIMPLE_OPERATOR ("~", BIT_NOT_EXPR, "co", 1)
-DEF_SIMPLE_OPERATOR ("!", TRUTH_NOT_EXPR, "nt", 1)
-DEF_SIMPLE_OPERATOR ("++", PREINCREMENT_EXPR, "pp", 1)
-DEF_SIMPLE_OPERATOR ("--", PREDECREMENT_EXPR, "mm", 1)
-DEF_SIMPLE_OPERATOR ("sizeof", SIZEOF_EXPR, "sz", 1)
-/* These are extensions. */
-DEF_SIMPLE_OPERATOR ("alignof", ALIGNOF_EXPR, "az", 1)
-DEF_SIMPLE_OPERATOR ("__imag__", IMAGPART_EXPR, "v18__imag__", 1)
-DEF_SIMPLE_OPERATOR ("__real__", REALPART_EXPR, "v18__real__", 1)
+DEF_OPERATOR ("+", UNARY_PLUS_EXPR, "ps", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("-", NEGATE_EXPR, "ng", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("&", ADDR_EXPR, "ad", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("*", INDIRECT_REF, "de", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("~", BIT_NOT_EXPR, "co", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("!", TRUTH_NOT_EXPR, "nt", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("++", PREINCREMENT_EXPR, "pp", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("--", PREDECREMENT_EXPR, "mm", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("->", COMPONENT_REF, "pt", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("sizeof", SIZEOF_EXPR, "sz", OVL_OP_FLAG_UNARY)
-/* The cast operators. */
-DEF_SIMPLE_OPERATOR ("", CAST_EXPR, "cv", 1)
-DEF_SIMPLE_OPERATOR ("dynamic_cast", DYNAMIC_CAST_EXPR, "dc", 1)
-DEF_SIMPLE_OPERATOR ("reinterpret_cast", REINTERPRET_CAST_EXPR, "rc", 1)
-DEF_SIMPLE_OPERATOR ("const_cast", CONST_CAST_EXPR, "cc", 1)
-DEF_SIMPLE_OPERATOR ("static_cast", STATIC_CAST_EXPR, "sc", 1)
+/* These are extensions. */
+DEF_OPERATOR ("alignof", ALIGNOF_EXPR, "az", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("__imag__", IMAGPART_EXPR, "v18__imag__", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("__real__", REALPART_EXPR, "v18__real__", OVL_OP_FLAG_UNARY)
/* Binary operators. */
-DEF_SIMPLE_OPERATOR ("+", PLUS_EXPR, "pl", 2)
-DEF_SIMPLE_OPERATOR ("-", MINUS_EXPR, "mi", 2)
-DEF_SIMPLE_OPERATOR ("*", MULT_EXPR, "ml", 2)
-DEF_SIMPLE_OPERATOR ("/", TRUNC_DIV_EXPR, "dv", 2)
-DEF_SIMPLE_OPERATOR ("%", TRUNC_MOD_EXPR, "rm", 2)
-DEF_SIMPLE_OPERATOR ("&", BIT_AND_EXPR, "an", 2)
-DEF_SIMPLE_OPERATOR ("|", BIT_IOR_EXPR, "or", 2)
-DEF_SIMPLE_OPERATOR ("^", BIT_XOR_EXPR, "eo", 2)
-DEF_SIMPLE_OPERATOR ("<<", LSHIFT_EXPR, "ls", 2)
-DEF_SIMPLE_OPERATOR (">>", RSHIFT_EXPR, "rs", 2)
-DEF_SIMPLE_OPERATOR ("==", EQ_EXPR, "eq", 2)
-DEF_SIMPLE_OPERATOR ("!=", NE_EXPR, "ne", 2)
-DEF_SIMPLE_OPERATOR ("<", LT_EXPR, "lt", 2)
-DEF_SIMPLE_OPERATOR (">", GT_EXPR, "gt", 2)
-DEF_SIMPLE_OPERATOR ("<=", LE_EXPR, "le", 2)
-DEF_SIMPLE_OPERATOR (">=", GE_EXPR, "ge", 2)
-DEF_SIMPLE_OPERATOR ("&&", TRUTH_ANDIF_EXPR, "aa", 2)
-DEF_SIMPLE_OPERATOR ("||", TRUTH_ORIF_EXPR, "oo", 2)
-DEF_SIMPLE_OPERATOR (",", COMPOUND_EXPR, "cm", 2)
-DEF_SIMPLE_OPERATOR ("->*", MEMBER_REF, "pm", 2)
-DEF_SIMPLE_OPERATOR (".*", DOTSTAR_EXPR, "ds", 2)
-DEF_SIMPLE_OPERATOR ("->", COMPONENT_REF, "pt", 2)
-DEF_SIMPLE_OPERATOR ("[]", ARRAY_REF, "ix", 2)
-DEF_SIMPLE_OPERATOR ("++", POSTINCREMENT_EXPR, "pp", 2)
-DEF_SIMPLE_OPERATOR ("--", POSTDECREMENT_EXPR, "mm", 2)
-/* This one is needed for mangling. */
-DEF_SIMPLE_OPERATOR ("::", SCOPE_REF, "sr", 2)
-
-/* Assignment operators. */
-DEF_ASSN_OPERATOR ("=", NOP_EXPR, "aS", 2)
-DEF_ASSN_OPERATOR ("+=", PLUS_EXPR, "pL", 2)
-DEF_ASSN_OPERATOR ("-=", MINUS_EXPR, "mI", 2)
-DEF_ASSN_OPERATOR ("*=", MULT_EXPR, "mL", 2)
-DEF_ASSN_OPERATOR ("/=", TRUNC_DIV_EXPR, "dV", 2)
-DEF_ASSN_OPERATOR ("%=", TRUNC_MOD_EXPR, "rM", 2)
-DEF_ASSN_OPERATOR ("&=", BIT_AND_EXPR, "aN", 2)
-DEF_ASSN_OPERATOR ("|=", BIT_IOR_EXPR, "oR", 2)
-DEF_ASSN_OPERATOR ("^=", BIT_XOR_EXPR, "eO", 2)
-DEF_ASSN_OPERATOR ("<<=", LSHIFT_EXPR, "lS", 2)
-DEF_ASSN_OPERATOR (">>=", RSHIFT_EXPR, "rS", 2)
-
-/* Ternary operators. */
-DEF_SIMPLE_OPERATOR ("?:", COND_EXPR, "qu", 3)
+DEF_OPERATOR ("+", PLUS_EXPR, "pl", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("-", MINUS_EXPR, "mi", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("*", MULT_EXPR, "ml", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("/", TRUNC_DIV_EXPR, "dv", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("%", TRUNC_MOD_EXPR, "rm", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("&", BIT_AND_EXPR, "an", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("|", BIT_IOR_EXPR, "or", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("^", BIT_XOR_EXPR, "eo", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("<<", LSHIFT_EXPR, "ls", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR (">>", RSHIFT_EXPR, "rs", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("==", EQ_EXPR, "eq", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("!=", NE_EXPR, "ne", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("<", LT_EXPR, "lt", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR (">", GT_EXPR, "gt", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("<=", LE_EXPR, "le", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR (">=", GE_EXPR, "ge", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("&&", TRUTH_ANDIF_EXPR, "aa", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("||", TRUTH_ORIF_EXPR, "oo", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR (",", COMPOUND_EXPR, "cm", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("->*", MEMBER_REF, "pm", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR (".*", DOTSTAR_EXPR, "ds", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("[]", ARRAY_REF, "ix", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("++", POSTINCREMENT_EXPR, "pp", OVL_OP_FLAG_BINARY)
+DEF_OPERATOR ("--", POSTDECREMENT_EXPR, "mm", OVL_OP_FLAG_BINARY)
/* Miscellaneous. */
-DEF_SIMPLE_OPERATOR ("()", CALL_EXPR, "cl", -1)
-
-/* Variadic templates extension. */
-DEF_SIMPLE_OPERATOR ("...", EXPR_PACK_EXPANSION, "sp", 1)
-DEF_SIMPLE_OPERATOR ("... +", UNARY_LEFT_FOLD_EXPR, "fl", 2)
-DEF_SIMPLE_OPERATOR ("+ ...", UNARY_RIGHT_FOLD_EXPR, "fr", 2)
-DEF_SIMPLE_OPERATOR ("+ ... +", BINARY_LEFT_FOLD_EXPR, "fL", 3)
-DEF_SIMPLE_OPERATOR ("+ ... +", BINARY_RIGHT_FOLD_EXPR, "fR", 3)
+DEF_OPERATOR ("?:", COND_EXPR, "qu", OVL_OP_FLAG_NONE)
+DEF_OPERATOR ("()", CALL_EXPR, "cl", OVL_OP_FLAG_NONE)
+
+/* Operators needed for mangling. */
+DEF_OPERATOR (NULL, CAST_EXPR, "cv", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR (NULL, DYNAMIC_CAST_EXPR, "dc", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR (NULL, REINTERPRET_CAST_EXPR, "rc", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR (NULL, CONST_CAST_EXPR, "cc", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR (NULL, STATIC_CAST_EXPR, "sc", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR (NULL, SCOPE_REF, "sr", OVL_OP_FLAG_NONE)
+DEF_OPERATOR (NULL, EXPR_PACK_EXPANSION, "sp", OVL_OP_FLAG_NONE)
+DEF_OPERATOR (NULL, UNARY_LEFT_FOLD_EXPR, "fl", OVL_OP_FLAG_NONE)
+DEF_OPERATOR (NULL, UNARY_RIGHT_FOLD_EXPR, "fr", OVL_OP_FLAG_NONE)
+DEF_OPERATOR (NULL, BINARY_LEFT_FOLD_EXPR, "fL", OVL_OP_FLAG_NONE)
+DEF_OPERATOR (NULL, BINARY_RIGHT_FOLD_EXPR, "fR", OVL_OP_FLAG_NONE)
+
+#ifdef OPERATOR_TRANSITION
+OPERATOR_TRANSITION
+#undef OPERATOR_TRANSITION
+#endif
+
+/* Assignment operators. */
+DEF_ASSN_OPERATOR ("=", NOP_EXPR, "aS")
+DEF_ASSN_OPERATOR ("+=", PLUS_EXPR, "pL")
+DEF_ASSN_OPERATOR ("-=", MINUS_EXPR, "mI")
+DEF_ASSN_OPERATOR ("*=", MULT_EXPR, "mL")
+DEF_ASSN_OPERATOR ("/=", TRUNC_DIV_EXPR, "dV")
+DEF_ASSN_OPERATOR ("%=", TRUNC_MOD_EXPR, "rM")
+DEF_ASSN_OPERATOR ("&=", BIT_AND_EXPR, "aN")
+DEF_ASSN_OPERATOR ("|=", BIT_IOR_EXPR, "oR")
+DEF_ASSN_OPERATOR ("^=", BIT_XOR_EXPR, "eO")
+DEF_ASSN_OPERATOR ("<<=", LSHIFT_EXPR, "lS")
+DEF_ASSN_OPERATOR (">>=", RSHIFT_EXPR, "rS")
+
+#undef DEF_ASSN_OPERATOR
+#undef DEF_OPERATOR
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 810e2b7f72e..77b96376e13 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -3294,9 +3294,9 @@ cp_parser_diagnose_invalid_type_name (cp_parser *parser, tree id,
{
gcc_rich_location richloc (location);
richloc.add_fixit_replace (suggestion);
- error_at_rich_loc (&richloc,
- "%qE does not name a type; did you mean %qs?",
- id, suggestion);
+ error_at (&richloc,
+ "%qE does not name a type; did you mean %qs?",
+ id, suggestion);
}
else
error_at (location, "%qE does not name a type", id);
@@ -3937,6 +3937,9 @@ cp_parser_new (void)
/* Allow constrained-type-specifiers. */
parser->prevent_constrained_type_specifiers = 0;
+ /* We haven't yet seen an 'extern "C"'. */
+ parser->innermost_linkage_specification_location = UNKNOWN_LOCATION;
+
return parser;
}
@@ -4104,9 +4107,9 @@ cp_parser_string_literal (cp_parser *parser, bool translate, bool wide_ok,
{
rich_location rich_loc (line_table, tok->location);
rich_loc.add_range (last_tok_loc, false);
- error_at_rich_loc (&rich_loc,
- "unsupported non-standard concatenation "
- "of string literals");
+ error_at (&rich_loc,
+ "unsupported non-standard concatenation "
+ "of string literals");
}
}
@@ -6160,9 +6163,9 @@ cp_parser_nested_name_specifier_opt (cp_parser *parser,
{
gcc_rich_location richloc (token->location);
richloc.add_fixit_replace ("::");
- error_at_rich_loc (&richloc,
- "found %<:%> in nested-name-specifier, "
- "expected %<::%>");
+ error_at (&richloc,
+ "found %<:%> in nested-name-specifier, "
+ "expected %<::%>");
token->type = CPP_SCOPE;
}
@@ -9094,8 +9097,8 @@ cp_parser_cast_expression (cp_parser *parser, bool address_p, bool cast_p,
gcc_rich_location rich_loc (input_location);
maybe_add_cast_fixit (&rich_loc, open_paren_loc, close_paren_loc,
expr, type);
- warning_at_rich_loc (&rich_loc, OPT_Wold_style_cast,
- "use of old-style cast to %q#T", type);
+ warning_at (&rich_loc, OPT_Wold_style_cast,
+ "use of old-style cast to %q#T", type);
}
/* Only type conversions to integral or enumeration types
@@ -10611,8 +10614,7 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr)
p = obstack_alloc (&declarator_obstack, 0);
- declarator = make_id_declarator (NULL_TREE, cp_operator_id (CALL_EXPR),
- sfk_none);
+ declarator = make_id_declarator (NULL_TREE, call_op_identifier, sfk_none);
quals = (LAMBDA_EXPR_MUTABLE_P (lambda_expr)
? TYPE_UNQUALIFIED : TYPE_QUAL_CONST);
@@ -13547,7 +13549,7 @@ cp_parser_decl_specifier_seq (cp_parser* parser,
{
gcc_rich_location richloc (token->location);
richloc.add_fixit_remove ();
- error_at_rich_loc (&richloc, "%<friend%> used outside of class");
+ error_at (&richloc, "%<friend%> used outside of class");
cp_lexer_purge_token (parser->lexer);
}
else
@@ -13613,9 +13615,9 @@ cp_parser_decl_specifier_seq (cp_parser* parser,
we're complaining about C++0x compatibility. */
gcc_rich_location richloc (token->location);
richloc.add_fixit_remove ();
- warning_at_rich_loc (&richloc, OPT_Wc__11_compat,
- "%<auto%> changes meaning in C++11; "
- "please remove it");
+ warning_at (&richloc, OPT_Wc__11_compat,
+ "%<auto%> changes meaning in C++11; "
+ "please remove it");
/* Set the storage class anyway. */
cp_parser_set_storage_class (parser, decl_specs, RID_AUTO,
@@ -13848,9 +13850,11 @@ cp_parser_linkage_specification (cp_parser* parser)
tree linkage;
/* Look for the `extern' keyword. */
- cp_parser_require_keyword (parser, RID_EXTERN, RT_EXTERN);
+ cp_token *extern_token
+ = cp_parser_require_keyword (parser, RID_EXTERN, RT_EXTERN);
/* Look for the string-literal. */
+ cp_token *string_token = cp_lexer_peek_token (parser->lexer);
linkage = cp_parser_string_literal (parser, false, false);
/* Transform the literal into an identifier. If the literal is a
@@ -13869,6 +13873,20 @@ cp_parser_linkage_specification (cp_parser* parser)
/* We're now using the new linkage. */
push_lang_context (linkage);
+  /* Preserve the location of the innermost linkage specification,
+ tracking the locations of nested specifications via a local. */
+ location_t saved_location
+ = parser->innermost_linkage_specification_location;
+ /* Construct a location ranging from the start of the "extern" to
+ the end of the string-literal, with the caret at the start, e.g.:
+ extern "C" {
+ ^~~~~~~~~~
+ */
+ parser->innermost_linkage_specification_location
+ = make_location (extern_token->location,
+ extern_token->location,
+ get_finish (string_token->location));
+
/* If the next token is a `{', then we're using the first
production. */
if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
@@ -13899,6 +13917,9 @@ cp_parser_linkage_specification (cp_parser* parser)
/* We're done with the linkage-specification. */
pop_lang_context ();
+
+ /* Restore location of parent linkage specification, if any. */
+ parser->innermost_linkage_specification_location = saved_location;
}
/* Parse a static_assert-declaration.
@@ -14693,12 +14714,13 @@ cp_parser_operator (cp_parser* parser)
location_t start_loc = token->location;
/* Figure out which operator we have. */
+ enum tree_code op = ERROR_MARK;
+ bool assop = false;
+ bool consumed = false;
switch (token->type)
{
case CPP_KEYWORD:
{
- enum tree_code op;
-
/* The keyword should be either `new' or `delete'. */
if (token->keyword == RID_NEW)
op = NEW_EXPR;
@@ -14722,160 +14744,166 @@ cp_parser_operator (cp_parser* parser)
if (cp_token *close_token
= cp_parser_require (parser, CPP_CLOSE_SQUARE, RT_CLOSE_SQUARE))
end_loc = close_token->location;
- id = cp_operator_id (op == NEW_EXPR
- ? VEC_NEW_EXPR : VEC_DELETE_EXPR);
+ op = op == NEW_EXPR ? VEC_NEW_EXPR : VEC_DELETE_EXPR;
}
- /* Otherwise, we have the non-array variant. */
- else
- id = cp_operator_id (op);
-
- location_t loc = make_location (start_loc, start_loc, end_loc);
-
- return cp_expr (id, loc);
+ start_loc = make_location (start_loc, start_loc, end_loc);
+ consumed = true;
+ break;
}
case CPP_PLUS:
- id = cp_operator_id (PLUS_EXPR);
+ op = PLUS_EXPR;
break;
case CPP_MINUS:
- id = cp_operator_id (MINUS_EXPR);
+ op = MINUS_EXPR;
break;
case CPP_MULT:
- id = cp_operator_id (MULT_EXPR);
+ op = MULT_EXPR;
break;
case CPP_DIV:
- id = cp_operator_id (TRUNC_DIV_EXPR);
+ op = TRUNC_DIV_EXPR;
break;
case CPP_MOD:
- id = cp_operator_id (TRUNC_MOD_EXPR);
+ op = TRUNC_MOD_EXPR;
break;
case CPP_XOR:
- id = cp_operator_id (BIT_XOR_EXPR);
+ op = BIT_XOR_EXPR;
break;
case CPP_AND:
- id = cp_operator_id (BIT_AND_EXPR);
+ op = BIT_AND_EXPR;
break;
case CPP_OR:
- id = cp_operator_id (BIT_IOR_EXPR);
+ op = BIT_IOR_EXPR;
break;
case CPP_COMPL:
- id = cp_operator_id (BIT_NOT_EXPR);
+ op = BIT_NOT_EXPR;
break;
case CPP_NOT:
- id = cp_operator_id (TRUTH_NOT_EXPR);
+ op = TRUTH_NOT_EXPR;
break;
case CPP_EQ:
- id = cp_assignment_operator_id (NOP_EXPR);
+ assop = true;
+ op = NOP_EXPR;
break;
case CPP_LESS:
- id = cp_operator_id (LT_EXPR);
+ op = LT_EXPR;
break;
case CPP_GREATER:
- id = cp_operator_id (GT_EXPR);
+ op = GT_EXPR;
break;
case CPP_PLUS_EQ:
- id = cp_assignment_operator_id (PLUS_EXPR);
+ assop = true;
+ op = PLUS_EXPR;
break;
case CPP_MINUS_EQ:
- id = cp_assignment_operator_id (MINUS_EXPR);
+ assop = true;
+ op = MINUS_EXPR;
break;
case CPP_MULT_EQ:
- id = cp_assignment_operator_id (MULT_EXPR);
+ assop = true;
+ op = MULT_EXPR;
break;
case CPP_DIV_EQ:
- id = cp_assignment_operator_id (TRUNC_DIV_EXPR);
+ assop = true;
+ op = TRUNC_DIV_EXPR;
break;
case CPP_MOD_EQ:
- id = cp_assignment_operator_id (TRUNC_MOD_EXPR);
+ assop = true;
+ op = TRUNC_MOD_EXPR;
break;
case CPP_XOR_EQ:
- id = cp_assignment_operator_id (BIT_XOR_EXPR);
+ assop = true;
+ op = BIT_XOR_EXPR;
break;
case CPP_AND_EQ:
- id = cp_assignment_operator_id (BIT_AND_EXPR);
+ assop = true;
+ op = BIT_AND_EXPR;
break;
case CPP_OR_EQ:
- id = cp_assignment_operator_id (BIT_IOR_EXPR);
+ assop = true;
+ op = BIT_IOR_EXPR;
break;
case CPP_LSHIFT:
- id = cp_operator_id (LSHIFT_EXPR);
+ op = LSHIFT_EXPR;
break;
case CPP_RSHIFT:
- id = cp_operator_id (RSHIFT_EXPR);
+ op = RSHIFT_EXPR;
break;
case CPP_LSHIFT_EQ:
- id = cp_assignment_operator_id (LSHIFT_EXPR);
+ assop = true;
+ op = LSHIFT_EXPR;
break;
case CPP_RSHIFT_EQ:
- id = cp_assignment_operator_id (RSHIFT_EXPR);
+ assop = true;
+ op = RSHIFT_EXPR;
break;
case CPP_EQ_EQ:
- id = cp_operator_id (EQ_EXPR);
+ op = EQ_EXPR;
break;
case CPP_NOT_EQ:
- id = cp_operator_id (NE_EXPR);
+ op = NE_EXPR;
break;
case CPP_LESS_EQ:
- id = cp_operator_id (LE_EXPR);
+ op = LE_EXPR;
break;
case CPP_GREATER_EQ:
- id = cp_operator_id (GE_EXPR);
+ op = GE_EXPR;
break;
case CPP_AND_AND:
- id = cp_operator_id (TRUTH_ANDIF_EXPR);
+ op = TRUTH_ANDIF_EXPR;
break;
case CPP_OR_OR:
- id = cp_operator_id (TRUTH_ORIF_EXPR);
+ op = TRUTH_ORIF_EXPR;
break;
case CPP_PLUS_PLUS:
- id = cp_operator_id (POSTINCREMENT_EXPR);
+ op = POSTINCREMENT_EXPR;
break;
case CPP_MINUS_MINUS:
- id = cp_operator_id (PREDECREMENT_EXPR);
+ op = PREDECREMENT_EXPR;
break;
case CPP_COMMA:
- id = cp_operator_id (COMPOUND_EXPR);
+ op = COMPOUND_EXPR;
break;
case CPP_DEREF_STAR:
- id = cp_operator_id (MEMBER_REF);
+ op = MEMBER_REF;
break;
case CPP_DEREF:
- id = cp_operator_id (COMPONENT_REF);
+ op = COMPONENT_REF;
break;
case CPP_OPEN_PAREN:
@@ -14885,7 +14913,9 @@ cp_parser_operator (cp_parser* parser)
parens.consume_open (parser);
/* Look for the matching `)'. */
parens.require_close (parser);
- return cp_operator_id (CALL_EXPR);
+ op = CALL_EXPR;
+ consumed = true;
+ break;
}
case CPP_OPEN_SQUARE:
@@ -14893,7 +14923,9 @@ cp_parser_operator (cp_parser* parser)
cp_lexer_consume_token (parser->lexer);
/* Look for the matching `]'. */
cp_parser_require (parser, CPP_CLOSE_SQUARE, RT_CLOSE_SQUARE);
- return cp_operator_id (ARRAY_REF);
+ op = ARRAY_REF;
+ consumed = true;
+ break;
case CPP_UTF8STRING:
case CPP_UTF8STRING_USERDEF:
@@ -14972,8 +15004,12 @@ cp_parser_operator (cp_parser* parser)
/* If we have selected an identifier, we need to consume the
operator token. */
- if (id)
- cp_lexer_consume_token (parser->lexer);
+ if (op != ERROR_MARK)
+ {
+ id = ovl_op_identifier (assop, op);
+ if (!consumed)
+ cp_lexer_consume_token (parser->lexer);
+ }
/* Otherwise, no valid operator name was present. */
else
{
@@ -16643,6 +16679,7 @@ cp_parser_explicit_specialization (cp_parser* parser)
if (current_lang_name == lang_name_c)
{
error_at (token->location, "template specialization with C linkage");
+ maybe_show_extern_c_location ();
/* Give it C++ linkage to avoid confusing other parts of the
front end. */
push_lang_context (lang_name_cplusplus);
@@ -17624,9 +17661,9 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,
gcc_rich_location richloc (token->location);
richloc.add_range (input_location, false);
richloc.add_fixit_remove ();
- pedwarn_at_rich_loc (&richloc, 0, "elaborated-type-specifier for "
- "a scoped enum must not use the %qD keyword",
- token->u.value);
+ pedwarn (&richloc, 0, "elaborated-type-specifier for "
+ "a scoped enum must not use the %qD keyword",
+ token->u.value);
/* Consume the `struct' or `class' and parse it anyway. */
cp_lexer_consume_token (parser->lexer);
}
@@ -20622,7 +20659,7 @@ cp_parser_cv_qualifier_seq_opt (cp_parser* parser)
{
gcc_rich_location richloc (token->location);
richloc.add_fixit_remove ();
- error_at_rich_loc (&richloc, "duplicate cv-qualifier");
+ error_at (&richloc, "duplicate cv-qualifier");
cp_lexer_purge_token (parser->lexer);
}
else
@@ -20771,7 +20808,7 @@ cp_parser_virt_specifier_seq_opt (cp_parser* parser)
{
gcc_rich_location richloc (token->location);
richloc.add_fixit_remove ();
- error_at_rich_loc (&richloc, "duplicate virt-specifier");
+ error_at (&richloc, "duplicate virt-specifier");
cp_lexer_purge_token (parser->lexer);
}
else
@@ -22569,14 +22606,14 @@ cp_parser_class_specifier_1 (cp_parser* parser)
richloc.add_fixit_insert_before (next_loc, ";");
if (CLASSTYPE_DECLARED_CLASS (type))
- error_at_rich_loc (&richloc,
- "expected %<;%> after class definition");
+ error_at (&richloc,
+ "expected %<;%> after class definition");
else if (TREE_CODE (type) == RECORD_TYPE)
- error_at_rich_loc (&richloc,
- "expected %<;%> after struct definition");
+ error_at (&richloc,
+ "expected %<;%> after struct definition");
else if (TREE_CODE (type) == UNION_TYPE)
- error_at_rich_loc (&richloc,
- "expected %<;%> after union definition");
+ error_at (&richloc,
+ "expected %<;%> after union definition");
else
gcc_unreachable ();
@@ -23023,9 +23060,9 @@ cp_parser_class_head (cp_parser* parser,
rich_location richloc (line_table, reported_loc);
richloc.add_fixit_insert_before (class_head_start_location,
"template <> ");
- error_at_rich_loc
- (&richloc,
- "an explicit specialization must be preceded by %<template <>%>");
+ error_at (&richloc,
+ "an explicit specialization must be preceded by"
+ " %<template <>%>");
invalid_explicit_specialization_p = true;
/* Take the same action that would have been taken by
cp_parser_explicit_specialization. */
@@ -23493,7 +23530,7 @@ cp_parser_member_declaration (cp_parser* parser)
{
gcc_rich_location richloc (token->location);
richloc.add_fixit_remove ();
- pedwarn_at_rich_loc (&richloc, OPT_Wpedantic, "extra %<;%>");
+ pedwarn (&richloc, OPT_Wpedantic, "extra %<;%>");
}
}
else
@@ -23836,9 +23873,9 @@ cp_parser_member_declaration (cp_parser* parser)
= cp_lexer_consume_token (parser->lexer)->location;
gcc_rich_location richloc (semicolon_loc);
richloc.add_fixit_remove ();
- warning_at_rich_loc (&richloc, OPT_Wextra_semi,
- "extra %<;%> after in-class "
- "function definition");
+ warning_at (&richloc, OPT_Wextra_semi,
+ "extra %<;%> after in-class "
+ "function definition");
}
goto out;
}
@@ -23881,8 +23918,8 @@ cp_parser_member_declaration (cp_parser* parser)
cp_token *token = cp_lexer_previous_token (parser->lexer);
gcc_rich_location richloc (token->location);
richloc.add_fixit_remove ();
- error_at_rich_loc (&richloc, "stray %<,%> at end of "
- "member declaration");
+ error_at (&richloc, "stray %<,%> at end of "
+ "member declaration");
}
}
/* If the next token isn't a `;', then we have a parse error. */
@@ -23895,8 +23932,8 @@ cp_parser_member_declaration (cp_parser* parser)
cp_token *token = cp_lexer_previous_token (parser->lexer);
gcc_rich_location richloc (token->location);
richloc.add_fixit_insert_after (";");
- error_at_rich_loc (&richloc, "expected %<;%> at end of "
- "member declaration");
+ error_at (&richloc, "expected %<;%> at end of "
+ "member declaration");
/* Assume that the user meant to provide a semicolon. If
we were to cp_parser_skip_to_end_of_statement, we might
@@ -26979,6 +27016,7 @@ cp_parser_explicit_template_declaration (cp_parser* parser, bool member_p)
if (current_lang_name == lang_name_c)
{
error_at (location, "template with C linkage");
+ maybe_show_extern_c_location ();
/* Give it C++ linkage to avoid confusing other parts of the
front end. */
push_lang_context (lang_name_cplusplus);
@@ -27506,8 +27544,8 @@ cp_parser_enclosed_template_argument_list (cp_parser* parser)
cp_token *token = cp_lexer_peek_token (parser->lexer);
gcc_rich_location richloc (token->location);
richloc.add_fixit_replace ("> >");
- error_at_rich_loc (&richloc, "%<>>%> should be %<> >%> "
- "within a nested template argument list");
+ error_at (&richloc, "%<>>%> should be %<> >%> "
+ "within a nested template argument list");
token->type = CPP_GREATER;
}
@@ -28136,7 +28174,7 @@ set_and_check_decl_spec_loc (cp_decl_specifier_seq *decl_specs,
{
gcc_rich_location richloc (location);
richloc.add_fixit_remove ();
- error_at_rich_loc (&richloc, "duplicate %qD", token->u.value);
+ error_at (&richloc, "duplicate %qD", token->u.value);
}
}
else
@@ -28160,7 +28198,7 @@ set_and_check_decl_spec_loc (cp_decl_specifier_seq *decl_specs,
};
gcc_rich_location richloc (location);
richloc.add_fixit_remove ();
- error_at_rich_loc (&richloc, "duplicate %qs", decl_spec_names[ds]);
+ error_at (&richloc, "duplicate %qs", decl_spec_names[ds]);
}
}
}
@@ -32550,21 +32588,21 @@ cp_parser_omp_clause_reduction (cp_parser *parser, tree list)
code = MIN_EXPR;
else if (strcmp (p, "max") == 0)
code = MAX_EXPR;
- else if (id == cp_operator_id (PLUS_EXPR))
+ else if (id == ovl_op_identifier (false, PLUS_EXPR))
code = PLUS_EXPR;
- else if (id == cp_operator_id (MULT_EXPR))
+ else if (id == ovl_op_identifier (false, MULT_EXPR))
code = MULT_EXPR;
- else if (id == cp_operator_id (MINUS_EXPR))
+ else if (id == ovl_op_identifier (false, MINUS_EXPR))
code = MINUS_EXPR;
- else if (id == cp_operator_id (BIT_AND_EXPR))
+ else if (id == ovl_op_identifier (false, BIT_AND_EXPR))
code = BIT_AND_EXPR;
- else if (id == cp_operator_id (BIT_IOR_EXPR))
+ else if (id == ovl_op_identifier (false, BIT_IOR_EXPR))
code = BIT_IOR_EXPR;
- else if (id == cp_operator_id (BIT_XOR_EXPR))
+ else if (id == ovl_op_identifier (false, BIT_XOR_EXPR))
code = BIT_XOR_EXPR;
- else if (id == cp_operator_id (TRUTH_ANDIF_EXPR))
+ else if (id == ovl_op_identifier (false, TRUTH_ANDIF_EXPR))
code = TRUTH_ANDIF_EXPR;
- else if (id == cp_operator_id (TRUTH_ORIF_EXPR))
+ else if (id == ovl_op_identifier (false, TRUTH_ORIF_EXPR))
code = TRUTH_ORIF_EXPR;
id = omp_reduction_id (code, id, NULL_TREE);
tree scope = parser->scope;
@@ -39552,4 +39590,17 @@ finish_fully_implicit_template (cp_parser *parser, tree member_decl_opt)
return member_decl_opt;
}
+/* Helper function for diagnostics that have complained about things
+ being used with 'extern "C"' linkage.
+
+ Attempt to issue a note showing where the 'extern "C"' linkage began. */
+
+void
+maybe_show_extern_c_location (void)
+{
+ if (the_parser->innermost_linkage_specification_location != UNKNOWN_LOCATION)
+ inform (the_parser->innermost_linkage_specification_location,
+ "%<extern \"C\"%> linkage started here");
+}
+
#include "gt-cp-parser.h"
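To make the effect of the new maybe_show_extern_c_location helper concrete, here is a minimal, illustrative translation unit (the function name is invented); the diagnostic texts are the ones used by the parser changes above, with the note pointing at the string-literal of the linkage-specification:

// Compiled as C++:
extern "C" {                      // note: 'extern "C"' linkage started here
  template <typename T>           // error: template with C linkage
  T twice (T t) { return t + t; }
}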
diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
index 0994e1e7f4f..f4f4a010964 100644
--- a/gcc/cp/parser.h
+++ b/gcc/cp/parser.h
@@ -412,6 +412,10 @@ struct GTY(()) cp_parser {
context e.g., because they could never be deduced. */
int prevent_constrained_type_specifiers;
+ /* Location of the string-literal token within the current linkage
+ specification, if any, or UNKNOWN_LOCATION otherwise. */
+ location_t innermost_linkage_specification_location;
+
};
/* In parser.c */
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index ba52f3b57a6..710333ddaba 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -203,7 +203,7 @@ static void tsubst_default_arguments (tree, tsubst_flags_t);
static tree for_each_template_parm_r (tree *, int *, void *);
static tree copy_default_args_to_explicit_spec_1 (tree, tree);
static void copy_default_args_to_explicit_spec (tree);
-static int invalid_nontype_parm_type_p (tree, tsubst_flags_t);
+static bool invalid_nontype_parm_type_p (tree, tsubst_flags_t);
static bool dependent_template_arg_p (tree);
static bool any_template_arguments_need_structural_equality_p (tree);
static bool dependent_type_p_r (tree);
@@ -3435,7 +3435,7 @@ expand_integer_pack (tree call, tree args, tsubst_flags_t complain,
call = copy_node (call);
CALL_EXPR_ARG (call, 0) = hi;
}
- tree ex = make_pack_expansion (call);
+ tree ex = make_pack_expansion (call, complain);
tree vec = make_tree_vec (1);
TREE_VEC_ELT (vec, 0) = ex;
return vec;
@@ -3724,7 +3724,7 @@ uses_parameter_packs (tree t)
EXPR_PACK_EXPANSION, TYPE_PACK_EXPANSION, or TREE_LIST,
respectively. */
tree
-make_pack_expansion (tree arg)
+make_pack_expansion (tree arg, tsubst_flags_t complain)
{
tree result;
tree parameter_packs = NULL_TREE;
@@ -3770,7 +3770,9 @@ make_pack_expansion (tree arg)
if (parameter_packs == NULL_TREE)
{
- error ("base initializer expansion %qT contains no parameter packs", arg);
+ if (complain & tf_error)
+ error ("base initializer expansion %qT contains no parameter packs",
+ arg);
delete ppd.visited;
return error_mark_node;
}
@@ -3834,10 +3836,13 @@ make_pack_expansion (tree arg)
/* Make sure we found some parameter packs. */
if (parameter_packs == NULL_TREE)
{
- if (TYPE_P (arg))
- error ("expansion pattern %qT contains no argument packs", arg);
- else
- error ("expansion pattern %qE contains no argument packs", arg);
+ if (complain & tf_error)
+ {
+ if (TYPE_P (arg))
+ error ("expansion pattern %qT contains no argument packs", arg);
+ else
+ error ("expansion pattern %qE contains no argument packs", arg);
+ }
return error_mark_node;
}
PACK_EXPANSION_PARAMETER_PACKS (result) = parameter_packs;
@@ -5564,7 +5569,7 @@ push_template_decl_real (tree decl, bool is_friend)
(TI_ARGS (tinfo),
TI_ARGS (get_template_info (DECL_TEMPLATE_RESULT (tmpl)))))
{
- error ("template arguments to %qD do not match original"
+ error ("template arguments to %qD do not match original "
"template %qD", decl, DECL_TEMPLATE_RESULT (tmpl));
if (!uses_template_parms (TI_ARGS (tinfo)))
inform (input_location, "use %<template<>%> for"
@@ -7694,7 +7699,7 @@ convert_template_argument (tree parm,
if (DECL_TEMPLATE_TEMPLATE_PARM_P (val))
val = TREE_TYPE (val);
if (TREE_CODE (orig_arg) == TYPE_PACK_EXPANSION)
- val = make_pack_expansion (val);
+ val = make_pack_expansion (val, complain);
}
}
else
@@ -8188,7 +8193,7 @@ coerce_template_parms (tree parms,
else if (TYPE_P (conv) && !TYPE_P (pattern))
/* Recover from missing typename. */
TREE_VEC_ELT (inner_args, arg_idx)
- = make_pack_expansion (conv);
+ = make_pack_expansion (conv, complain);
/* We don't know how many args we have yet, just
use the unconverted ones for now. */
@@ -11161,7 +11166,7 @@ gen_elem_of_pack_expansion_instantiation (tree pattern,
the Ith element resulting from the substituting is going to
be a pack expansion as well. */
if (ith_elem_is_expansion)
- t = make_pack_expansion (t);
+ t = make_pack_expansion (t, complain);
return t;
}
@@ -11573,7 +11578,7 @@ tsubst_pack_expansion (tree t, tree args, tsubst_flags_t complain,
/* We got some full packs, but we can't substitute them in until we
have values for all the packs. So remember these until then. */
- t = make_pack_expansion (pattern);
+ t = make_pack_expansion (pattern, complain);
PACK_EXPANSION_EXTRA_ARGS (t) = args;
return t;
}
@@ -11588,7 +11593,7 @@ tsubst_pack_expansion (tree t, tree args, tsubst_flags_t complain,
/*integral_constant_expression_p=*/false);
else
t = tsubst (pattern, args, complain, in_decl);
- t = make_pack_expansion (t);
+ t = make_pack_expansion (t, complain);
return t;
}
@@ -12195,7 +12200,7 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain,
We also deal with the peculiar case:
template <class T> struct S {
- template <class U> friend void f();
+ template <class U> friend void f();
};
template <class U> void f() {}
template S<int>;
@@ -17079,8 +17084,7 @@ tsubst_copy_and_build (tree t,
/* A type conversion to reference type will be enclosed in
such an indirect ref, but the substitution of the cast
will have also added such an indirect ref. */
- if (TREE_CODE (TREE_TYPE (r)) == REFERENCE_TYPE)
- r = convert_from_reference (r);
+ r = convert_from_reference (r);
}
else
r = build_x_indirect_ref (input_location, r, RO_UNARY_STAR,
@@ -21324,7 +21328,7 @@ unify (tree tparms, tree targs, tree parm, tree arg, int strict,
if (REFERENCE_REF_P (arg))
arg = TREE_OPERAND (arg, 0);
if (pexp)
- arg = make_pack_expansion (arg);
+ arg = make_pack_expansion (arg, complain);
return unify (tparms, targs, TREE_OPERAND (parm, 0), arg,
strict, explain_p);
}
@@ -23618,31 +23622,31 @@ instantiating_current_function_p (void)
}
/* [temp.param] Check that template non-type parm TYPE is of an allowable
- type. Return zero for ok, nonzero for disallowed. Issue error and
- warning messages under control of COMPLAIN. */
+ type. Return false for ok, true for disallowed. Issue error and
+ inform messages under control of COMPLAIN. */
-static int
+static bool
invalid_nontype_parm_type_p (tree type, tsubst_flags_t complain)
{
if (INTEGRAL_OR_ENUMERATION_TYPE_P (type))
- return 0;
+ return false;
else if (POINTER_TYPE_P (type))
- return 0;
+ return false;
else if (TYPE_PTRMEM_P (type))
- return 0;
+ return false;
else if (TREE_CODE (type) == TEMPLATE_TYPE_PARM)
- return 0;
+ return false;
else if (TREE_CODE (type) == TYPENAME_TYPE)
- return 0;
+ return false;
else if (TREE_CODE (type) == DECLTYPE_TYPE)
- return 0;
+ return false;
else if (TREE_CODE (type) == NULLPTR_TYPE)
- return 0;
+ return false;
/* A bound template template parm could later be instantiated to have a valid
nontype parm type via an alias template. */
else if (cxx_dialect >= cxx11
&& TREE_CODE (type) == BOUND_TEMPLATE_TEMPLATE_PARM)
- return 0;
+ return false;
if (complain & tf_error)
{
@@ -23652,7 +23656,7 @@ invalid_nontype_parm_type_p (tree type, tsubst_flags_t complain)
error ("%q#T is not a valid type for a template non-type parameter",
type);
}
- return 1;
+ return true;
}
/* Returns TRUE if TYPE is dependent, in the sense of [temp.dep.type].
@@ -24019,8 +24023,21 @@ value_dependent_expression_p (tree expression)
case TRAIT_EXPR:
{
tree type2 = TRAIT_EXPR_TYPE2 (expression);
- return (dependent_type_p (TRAIT_EXPR_TYPE1 (expression))
- || (type2 ? dependent_type_p (type2) : false));
+
+ if (dependent_type_p (TRAIT_EXPR_TYPE1 (expression)))
+ return true;
+
+ if (!type2)
+ return false;
+
+ if (TREE_CODE (type2) != TREE_LIST)
+ return dependent_type_p (type2);
+
+ for (; type2; type2 = TREE_CHAIN (type2))
+ if (dependent_type_p (TREE_VALUE (type2)))
+ return true;
+
+ return false;
}
case MODOP_EXPR:
@@ -25118,9 +25135,9 @@ listify (tree arg)
{
gcc_rich_location richloc (input_location);
maybe_add_include_fixit (&richloc, "<initializer_list>");
- error_at_rich_loc (&richloc,
- "deducing from brace-enclosed initializer list"
- " requires #include <initializer_list>");
+ error_at (&richloc,
+ "deducing from brace-enclosed initializer list"
+ " requires %<#include <initializer_list>%>");
return error_mark_node;
}
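The recurring change in pt.c is that make_pack_expansion now takes the caller's tsubst_flags_t, so contexts running under tf_none no longer emit the "contains no parameter packs" errors. A minimal sketch of a hypothetical caller (the helper name is invented for illustration):

/* Probe PATTERN for parameter packs without issuing diagnostics; under
   tf_none, make_pack_expansion stays quiet and simply returns
   error_mark_node when no packs are found.  */
static tree
maybe_expand_pattern (tree pattern)
{
  tree expansion = make_pack_expansion (pattern, tf_none);
  if (expansion == error_mark_node)
    return NULL_TREE;
  return expansion;
}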
diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index 50c717e286e..90bae2a7039 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -177,7 +177,6 @@ cxx_print_identifier (FILE *file, tree node, int indent)
indent_to (file, indent + 4);
fprintf (file, "%s local bindings <%p>", get_identifier_kind_name (node),
(void *) IDENTIFIER_BINDING (node));
- print_node (file, "label", IDENTIFIER_LABEL_VALUE (node), indent + 4);
}
void
diff --git a/gcc/cp/rtti.c b/gcc/cp/rtti.c
index 5b2326cbbb6..10ecbfd9589 100644
--- a/gcc/cp/rtti.c
+++ b/gcc/cp/rtti.c
@@ -319,9 +319,9 @@ typeid_ok_p (void)
{
gcc_rich_location richloc (input_location);
maybe_add_include_fixit (&richloc, "<typeinfo>");
- error_at_rich_loc (&richloc,
- "must %<#include <typeinfo>%> before using"
- " %<typeid%>");
+ error_at (&richloc,
+ "must %<#include <typeinfo>%> before using"
+ " %<typeid%>");
return false;
}
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index a512664e396..664952e749c 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2711,8 +2711,12 @@ finish_compound_literal (tree type, tree compound_literal,
if (tree anode = type_uses_auto (type))
if (CLASS_PLACEHOLDER_TEMPLATE (anode))
- type = do_auto_deduction (type, compound_literal, anode, complain,
- adc_variable_type);
+ {
+ type = do_auto_deduction (type, compound_literal, anode, complain,
+ adc_variable_type);
+ if (type == error_mark_node)
+ return error_mark_node;
+ }
if (processing_template_decl)
{
@@ -3395,7 +3399,7 @@ process_outer_var_ref (tree decl, tsubst_flags_t complain, bool force_use)
inform (location_of (closure),
"the lambda has no capture-default");
else if (TYPE_CLASS_SCOPE_P (closure))
- inform (0, "lambda in local class %q+T cannot "
+ inform (UNKNOWN_LOCATION, "lambda in local class %q+T cannot "
"capture variables from the enclosing context",
TYPE_CONTEXT (closure));
inform (DECL_SOURCE_LOCATION (decl), "%q#D declared here", decl);
@@ -5095,7 +5099,7 @@ omp_reduction_id (enum tree_code reduction_code, tree reduction_id, tree type)
case BIT_IOR_EXPR:
case TRUTH_ANDIF_EXPR:
case TRUTH_ORIF_EXPR:
- reduction_id = cp_operator_id (reduction_code);
+ reduction_id = ovl_op_identifier (false, reduction_code);
break;
case MIN_EXPR:
p = "min";
@@ -9016,8 +9020,7 @@ classtype_has_nothrow_assign_or_copy_p (tree type, bool assign_p)
tree fns = NULL_TREE;
if (assign_p || TYPE_HAS_COPY_CTOR (type))
- fns = get_class_binding (type,
- assign_p ? cp_assignment_operator_id (NOP_EXPR)
+ fns = get_class_binding (type, assign_p ? assign_op_identifier
: ctor_identifier);
bool saw_copy = false;
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index e21ff6a1572..b63f2ae4c5d 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -333,6 +333,10 @@ cp_stabilize_reference (tree ref)
{
switch (TREE_CODE (ref))
{
+ case NON_DEPENDENT_EXPR:
+ /* We aren't actually evaluating this. */
+ return ref;
+
/* We need to treat specially anything stabilize_reference doesn't
handle specifically. */
case VAR_DECL:
@@ -1204,7 +1208,7 @@ cp_build_qualified_type_real (tree type,
tree t = PACK_EXPANSION_PATTERN (type);
t = cp_build_qualified_type_real (t, type_quals, complain);
- return make_pack_expansion (t);
+ return make_pack_expansion (t, complain);
}
/* A reference or method type shall not be cv-qualified.
@@ -1435,7 +1439,11 @@ strip_typedefs (tree t, bool *remove_attributes)
is_variant = true;
type = strip_typedefs (TREE_TYPE (t), remove_attributes);
- changed = type != TREE_TYPE (t) || is_variant;
+ tree canon_spec = (flag_noexcept_type
+ ? canonical_eh_spec (TYPE_RAISES_EXCEPTIONS (t))
+ : NULL_TREE);
+ changed = (type != TREE_TYPE (t) || is_variant
+ || TYPE_RAISES_EXCEPTIONS (t) != canon_spec);
for (arg_node = TYPE_ARG_TYPES (t);
arg_node;
@@ -1494,9 +1502,8 @@ strip_typedefs (tree t, bool *remove_attributes)
type_memfn_rqual (t));
}
- if (TYPE_RAISES_EXCEPTIONS (t))
- result = build_exception_variant (result,
- TYPE_RAISES_EXCEPTIONS (t));
+ if (canon_spec)
+ result = build_exception_variant (result, canon_spec);
if (TYPE_HAS_LATE_RETURN_TYPE (t))
TYPE_HAS_LATE_RETURN_TYPE (result) = 1;
}
@@ -4840,7 +4847,8 @@ special_function_p (const_tree decl)
return sfk_move_constructor;
if (DECL_CONSTRUCTOR_P (decl))
return sfk_constructor;
- if (DECL_OVERLOADED_OPERATOR_P (decl) == NOP_EXPR)
+ if (DECL_ASSIGNMENT_OPERATOR_P (decl)
+ && DECL_OVERLOADED_OPERATOR_IS (decl, NOP_EXPR))
{
if (copy_fn_p (decl))
return sfk_copy_assignment;
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index d87ee62ad1a..7db8719d50d 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -1562,7 +1562,7 @@ cxx_sizeof_or_alignof_type (tree type, enum tree_code op, bool complain)
if (complain)
pedwarn (input_location, OPT_Wpointer_arith,
"invalid application of %qs to a member function",
- operator_name_info[(int) op].name);
+ OVL_OP_INFO (false, op)->name);
else
return error_mark_node;
value = size_one_node;
@@ -2677,8 +2677,8 @@ access_failure_info::maybe_suggest_accessor (bool const_p) const
pretty_printer pp;
pp_printf (&pp, "%s()", IDENTIFIER_POINTER (DECL_NAME (accessor)));
richloc.add_fixit_replace (pp_formatted_text (&pp));
- inform_at_rich_loc (&richloc, "field %q#D can be accessed via %q#D",
- m_field_decl, accessor);
+ inform (&richloc, "field %q#D can be accessed via %q#D",
+ m_field_decl, accessor);
}
/* This function is called by the parser to process a class member
@@ -2883,12 +2883,12 @@ finish_class_member_access_expr (cp_expr object, tree name, bool template_p,
gcc_rich_location rich_loc (bogus_component_loc);
rich_loc.add_fixit_misspelled_id (bogus_component_loc,
guessed_id);
- error_at_rich_loc
- (&rich_loc,
- "%q#T has no member named %qE; did you mean %qE?",
- TREE_CODE (access_path) == TREE_BINFO
- ? TREE_TYPE (access_path) : object_type, name,
- guessed_id);
+ error_at (&rich_loc,
+ "%q#T has no member named %qE;"
+ " did you mean %qE?",
+ TREE_CODE (access_path) == TREE_BINFO
+ ? TREE_TYPE (access_path) : object_type,
+ name, guessed_id);
}
else
error ("%q#T has no member named %qE",
@@ -9048,10 +9048,11 @@ check_return_expr (tree retval, bool *no_warning)
/* You can return a `void' value from a function of `void'
type. In that case, we have to evaluate the expression for
its side-effects. */
- finish_expr_stmt (retval);
+ finish_expr_stmt (retval);
else
- permerror (input_location, "return-statement with a value, in function "
- "returning 'void'");
+ permerror (input_location,
+ "return-statement with a value, in function "
+ "returning %qT", valtype);
current_function_returns_null = 1;
/* There's really no value to return, after all. */
@@ -9075,8 +9076,7 @@ check_return_expr (tree retval, bool *no_warning)
}
/* Only operator new(...) throw(), can return NULL [expr.new/13]. */
- if ((DECL_OVERLOADED_OPERATOR_P (current_function_decl) == NEW_EXPR
- || DECL_OVERLOADED_OPERATOR_P (current_function_decl) == VEC_NEW_EXPR)
+ if (IDENTIFIER_NEW_OP_P (DECL_NAME (current_function_decl))
&& !TYPE_NOTHROW_P (TREE_TYPE (current_function_decl))
&& ! flag_check_new
&& retval && null_ptr_cst_p (retval))
@@ -9085,7 +9085,7 @@ check_return_expr (tree retval, bool *no_warning)
/* Effective C++ rule 15. See also start_function. */
if (warn_ecpp
- && DECL_NAME (current_function_decl) == cp_assignment_operator_id (NOP_EXPR))
+ && DECL_NAME (current_function_decl) == assign_op_identifier)
{
bool warn = true;
@@ -9231,7 +9231,8 @@ check_return_expr (tree retval, bool *no_warning)
&& TREE_CODE (TREE_OPERAND (retval, 1)) == AGGR_INIT_EXPR)
retval = build2 (COMPOUND_EXPR, TREE_TYPE (retval), retval,
TREE_OPERAND (retval, 0));
- else if (maybe_warn_about_returning_address_of_local (retval))
+ else if (!processing_template_decl
+ && maybe_warn_about_returning_address_of_local (retval))
retval = build2 (COMPOUND_EXPR, TREE_TYPE (retval), retval,
build_zero_cst (TREE_TYPE (retval)));
}
diff --git a/gcc/dbxout.c b/gcc/dbxout.c
index 18e16658227..5a2bbfaedbc 100644
--- a/gcc/dbxout.c
+++ b/gcc/dbxout.c
@@ -2486,7 +2486,7 @@ dbxout_expand_expr (tree expr)
return NULL;
x = adjust_address_nv (x, mode, tree_to_shwi (offset));
}
- if (maybe_nonzero (bitpos))
+ if (may_ne (bitpos, 0))
x = adjust_address_nv (x, mode, bits_to_bytes_round_down (bitpos));
return x;
diff --git a/gcc/debug.h b/gcc/debug.h
index 915420baded..19b27848ca8 100644
--- a/gcc/debug.h
+++ b/gcc/debug.h
@@ -228,7 +228,6 @@ extern void debug_nothing_tree_charstar_uhwi (tree, const char *,
/* Hooks for various debug formats. */
extern const struct gcc_debug_hooks do_nothing_debug_hooks;
extern const struct gcc_debug_hooks dbx_debug_hooks;
-extern const struct gcc_debug_hooks sdb_debug_hooks;
extern const struct gcc_debug_hooks xcoff_debug_hooks;
extern const struct gcc_debug_hooks dwarf2_debug_hooks;
extern const struct gcc_debug_hooks dwarf2_lineno_debug_hooks;
diff --git a/gcc/defaults.h b/gcc/defaults.h
index 99cd9db5191..768c9879df9 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -894,14 +894,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
#define DEFAULT_GDB_EXTENSIONS 1
#endif
-#ifndef SDB_DEBUGGING_INFO
-#define SDB_DEBUGGING_INFO 0
-#endif
-
/* If more than one debugging type is supported, you must define
PREFERRED_DEBUGGING_TYPE to choose the default. */
-#if 1 < (defined (DBX_DEBUGGING_INFO) + (SDB_DEBUGGING_INFO) \
+#if 1 < (defined (DBX_DEBUGGING_INFO) \
+ defined (DWARF2_DEBUGGING_INFO) + defined (XCOFF_DEBUGGING_INFO) \
+ defined (VMS_DEBUGGING_INFO))
#ifndef PREFERRED_DEBUGGING_TYPE
@@ -913,9 +909,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
#elif defined DBX_DEBUGGING_INFO
#define PREFERRED_DEBUGGING_TYPE DBX_DEBUG
-#elif SDB_DEBUGGING_INFO
-#define PREFERRED_DEBUGGING_TYPE SDB_DEBUG
-
#elif defined DWARF2_DEBUGGING_INFO || defined DWARF2_LINENO_DEBUGGING_INFO
#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
diff --git a/gcc/diagnostic-color.c b/gcc/diagnostic-color.c
index b8cf6f2c045..ccbae4ba223 100644
--- a/gcc/diagnostic-color.c
+++ b/gcc/diagnostic-color.c
@@ -24,90 +24,7 @@
# include <windows.h>
#endif
-/* Select Graphic Rendition (SGR, "\33[...m") strings. */
-/* Also Erase in Line (EL) to Right ("\33[K") by default. */
-/* Why have EL to Right after SGR?
- -- The behavior of line-wrapping when at the bottom of the
- terminal screen and at the end of the current line is often
- such that a new line is introduced, entirely cleared with
- the current background color which may be different from the
- default one (see the boolean back_color_erase terminfo(5)
- capability), thus scrolling the display by one line.
- The end of this new line will stay in this background color
- even after reverting to the default background color with
- "\33[m', unless it is explicitly cleared again with "\33[K"
- (which is the behavior the user would instinctively expect
- from the whole thing). There may be some unavoidable
- background-color flicker at the end of this new line because
- of this (when timing with the monitor's redraw is just right).
- -- The behavior of HT (tab, "\t") is usually the same as that of
- Cursor Forward Tabulation (CHT) with a default parameter
- of 1 ("\33[I"), i.e., it performs pure movement to the next
- tab stop, without any clearing of either content or screen
- attributes (including background color); try
- printf 'asdfqwerzxcv\rASDF\tZXCV\n'
- in a bash(1) shell to demonstrate this. This is not what the
- user would instinctively expect of HT (but is ok for CHT).
- The instinctive behavior would include clearing the terminal
- cells that are skipped over by HT with blank cells in the
- current screen attributes, including background color;
- the boolean dest_tabs_magic_smso terminfo(5) capability
- indicates this saner behavior for HT, but only some rare
- terminals have it (although it also indicates a special
- glitch with standout mode in the Teleray terminal for which
- it was initially introduced). The remedy is to add "\33K"
- after each SGR sequence, be it START (to fix the behavior
- of any HT after that before another SGR) or END (to fix the
- behavior of an HT in default background color that would
- follow a line-wrapping at the bottom of the screen in another
- background color, and to complement doing it after START).
- Piping GCC's output through a pager such as less(1) avoids
- any HT problems since the pager performs tab expansion.
-
- Generic disadvantages of this remedy are:
- -- Some very rare terminals might support SGR but not EL (nobody
- will use "gcc -fdiagnostics-color" on a terminal that does not
- support SGR in the first place).
- -- Having these extra control sequences might somewhat complicate
- the task of any program trying to parse "gcc -fdiagnostics-color"
- output in order to extract structuring information from it.
- A specific disadvantage to doing it after SGR START is:
- -- Even more possible background color flicker (when timing
- with the monitor's redraw is just right), even when not at the
- bottom of the screen.
- There are no additional disadvantages specific to doing it after
- SGR END.
-
- It would be impractical for GCC to become a full-fledged
- terminal program linked against ncurses or the like, so it will
- not detect terminfo(5) capabilities. */
-#define COLOR_SEPARATOR ";"
-#define COLOR_NONE "00"
-#define COLOR_BOLD "01"
-#define COLOR_UNDERSCORE "04"
-#define COLOR_BLINK "05"
-#define COLOR_REVERSE "07"
-#define COLOR_FG_BLACK "30"
-#define COLOR_FG_RED "31"
-#define COLOR_FG_GREEN "32"
-#define COLOR_FG_YELLOW "33"
-#define COLOR_FG_BLUE "34"
-#define COLOR_FG_MAGENTA "35"
-#define COLOR_FG_CYAN "36"
-#define COLOR_FG_WHITE "37"
-#define COLOR_BG_BLACK "40"
-#define COLOR_BG_RED "41"
-#define COLOR_BG_GREEN "42"
-#define COLOR_BG_YELLOW "43"
-#define COLOR_BG_BLUE "44"
-#define COLOR_BG_MAGENTA "45"
-#define COLOR_BG_CYAN "46"
-#define COLOR_BG_WHITE "47"
-#define SGR_START "\33["
-#define SGR_END "m\33[K"
-#define SGR_SEQ(str) SGR_START str SGR_END
-#define SGR_RESET SGR_SEQ("")
-
+#include "color-macros.h"
/* The context and logic for choosing default --color screen attributes
(foreground and background colors, etc.) are the following.
diff --git a/gcc/diagnostic-core.h b/gcc/diagnostic-core.h
index 1fa28027b5b..24025f1ef80 100644
--- a/gcc/diagnostic-core.h
+++ b/gcc/diagnostic-core.h
@@ -61,33 +61,32 @@ extern void internal_error_no_backtrace (const char *, ...)
extern bool warning (int, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
extern bool warning_n (location_t, int, int, const char *, const char *, ...)
ATTRIBUTE_GCC_DIAG(4,6) ATTRIBUTE_GCC_DIAG(5,6);
+extern bool warning_n (rich_location *, int, int, const char *,
+ const char *, ...)
+ ATTRIBUTE_GCC_DIAG(4, 6) ATTRIBUTE_GCC_DIAG(5, 6);
extern bool warning_at (location_t, int, const char *, ...)
ATTRIBUTE_GCC_DIAG(3,4);
-extern bool warning_at_rich_loc (rich_location *, int, const char *, ...)
+extern bool warning_at (rich_location *, int, const char *, ...)
ATTRIBUTE_GCC_DIAG(3,4);
-extern bool warning_at_rich_loc_n (rich_location *, int, int, const char *,
- const char *, ...)
- ATTRIBUTE_GCC_DIAG(4, 6) ATTRIBUTE_GCC_DIAG(5, 6);
extern void error (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
extern void error_n (location_t, int, const char *, const char *, ...)
ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
extern void error_at (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern void error_at_rich_loc (rich_location *, const char *, ...)
+extern void error_at (rich_location *, const char *, ...)
ATTRIBUTE_GCC_DIAG(2,3);
extern void fatal_error (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3)
ATTRIBUTE_NORETURN;
/* Pass one of the OPT_W* from options.h as the second parameter. */
extern bool pedwarn (location_t, int, const char *, ...)
ATTRIBUTE_GCC_DIAG(3,4);
-extern bool pedwarn_at_rich_loc (rich_location *, int, const char *, ...)
+extern bool pedwarn (rich_location *, int, const char *, ...)
ATTRIBUTE_GCC_DIAG(3,4);
extern bool permerror (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern bool permerror_at_rich_loc (rich_location *, const char *,
+extern bool permerror (rich_location *, const char *,
...) ATTRIBUTE_GCC_DIAG(2,3);
extern void sorry (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
extern void inform (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern void inform_at_rich_loc (rich_location *, const char *,
- ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void inform (rich_location *, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
extern void inform_n (location_t, int, const char *, const char *, ...)
ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
extern void verbatim (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
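With the *_at_rich_loc spellings folded into plain overloads, a call site chooses between location_t and rich_location * implicitly. A minimal sketch of a hypothetical GCC-internal caller (assumes diagnostic-core.h and gcc-rich-location.h are in scope; the message string is a placeholder):

/* Attach a remove fix-it and report it through the overload that used to
   be spelled error_at_rich_loc; pedwarn, permerror, inform, warning_at
   and warning_n gained the same rich_location * overloads.  */
static void
report_duplicate (location_t loc)
{
  gcc_rich_location richloc (loc);
  richloc.add_fixit_remove ();   /* Fix-it: delete the duplicate token.  */
  error_at (&richloc, "duplicate cv-qualifier");
}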
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 35121117f49..a1ce682403b 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. If not see
#include "diagnostic-color.h"
#include "gcc-rich-location.h"
#include "selftest.h"
+#include "selftest-diagnostic.h"
#ifdef HAVE_TERMIOS_H
# include <termios.h>
@@ -1987,34 +1988,6 @@ namespace selftest {
/* Selftests for diagnostic_show_locus. */
-/* Convenience subclass of diagnostic_context for testing
- diagnostic_show_locus. */
-
-class test_diagnostic_context : public diagnostic_context
-{
- public:
- test_diagnostic_context ()
- {
- diagnostic_initialize (this, 0);
- show_caret = true;
- show_column = true;
- start_span = start_span_cb;
- }
- ~test_diagnostic_context ()
- {
- diagnostic_finish (this);
- }
-
- /* Implementation of diagnostic_start_span_fn, hiding the
- real filename (to avoid printing the names of tempfiles). */
- static void
- start_span_cb (diagnostic_context *context, expanded_location exploc)
- {
- exploc.file = "FILENAME";
- default_diagnostic_start_span_fn (context, exploc);
- }
-};
-
/* Verify that diagnostic_show_locus works sanely on UNKNOWN_LOCATION. */
static void
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index a98bf4a3333..813bca6f65d 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3. If not see
#include "diagnostic-color.h"
#include "edit-context.h"
#include "selftest.h"
+#include "selftest-diagnostic.h"
#ifdef HAVE_TERMIOS_H
# include <termios.h>
@@ -50,12 +51,9 @@ along with GCC; see the file COPYING3. If not see
/* Prototypes. */
static bool diagnostic_impl (rich_location *, int, const char *,
va_list *, diagnostic_t) ATTRIBUTE_GCC_DIAG(3,0);
-static bool diagnostic_n_impl (location_t, int, int, const char *,
+static bool diagnostic_n_impl (rich_location *, int, int, const char *,
const char *, va_list *,
diagnostic_t) ATTRIBUTE_GCC_DIAG(5,0);
-static bool diagnostic_n_impl_richloc (rich_location *, int, int, const char *,
- const char *, va_list *,
- diagnostic_t) ATTRIBUTE_GCC_DIAG(5,0);
static void error_recursion (diagnostic_context *) ATTRIBUTE_NORETURN;
static void real_abort (void) ATTRIBUTE_NORETURN;
@@ -1074,10 +1072,9 @@ diagnostic_append_note (diagnostic_context *context,
va_end (ap);
}
-/* Implement emit_diagnostic, inform, inform_at_rich_loc, warning, warning_at,
- warning_at_rich_loc, pedwarn, permerror, permerror_at_rich_loc, error,
- error_at, error_at_rich_loc, sorry, fatal_error, internal_error, and
- internal_error_no_backtrace, as documented and defined below. */
+/* Implement emit_diagnostic, inform, warning, warning_at, pedwarn,
+ permerror, error, error_at (both overloads), sorry, fatal_error, internal_error,
+ and internal_error_no_backtrace, as documented and defined below. */
static bool
diagnostic_impl (rich_location *richloc, int opt,
const char *gmsgid,
@@ -1099,12 +1096,13 @@ diagnostic_impl (rich_location *richloc, int opt,
return diagnostic_report_diagnostic (global_dc, &diagnostic);
}
-/* Same as diagonostic_n_impl taking rich_location instead of location_t. */
+/* Implement inform_n, warning_n, and error_n, as documented and
+ defined below. */
static bool
-diagnostic_n_impl_richloc (rich_location *richloc, int opt, int n,
- const char *singular_gmsgid,
- const char *plural_gmsgid,
- va_list *ap, diagnostic_t kind)
+diagnostic_n_impl (rich_location *richloc, int opt, int n,
+ const char *singular_gmsgid,
+ const char *plural_gmsgid,
+ va_list *ap, diagnostic_t kind)
{
diagnostic_info diagnostic;
diagnostic_set_info_translated (&diagnostic,
@@ -1113,19 +1111,6 @@ diagnostic_n_impl_richloc (rich_location *richloc, int opt, int n,
if (kind == DK_WARNING)
diagnostic.option_index = opt;
return diagnostic_report_diagnostic (global_dc, &diagnostic);
-}
-
-/* Implement inform_n, warning_n, and error_n, as documented and
- defined below. */
-static bool
-diagnostic_n_impl (location_t location, int opt, int n,
- const char *singular_gmsgid,
- const char *plural_gmsgid,
- va_list *ap, diagnostic_t kind)
-{
- rich_location richloc (line_table, location);
- return diagnostic_n_impl_richloc (&richloc, opt, n,
- singular_gmsgid, plural_gmsgid, ap, kind);
}
/* Wrapper around diagnostic_impl taking a variable argument list. */
@@ -1164,10 +1149,12 @@ inform (location_t location, const char *gmsgid, ...)
va_end (ap);
}
-/* Same as "inform", but at RICHLOC. */
+/* Same as "inform" above, but at RICHLOC. */
void
-inform_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+inform (rich_location *richloc, const char *gmsgid, ...)
{
+ gcc_assert (richloc);
+
va_list ap;
va_start (ap, gmsgid);
diagnostic_impl (richloc, -1, gmsgid, &ap, DK_NOTE);
@@ -1182,7 +1169,8 @@ inform_n (location_t location, int n, const char *singular_gmsgid,
{
va_list ap;
va_start (ap, plural_gmsgid);
- diagnostic_n_impl (location, -1, n, singular_gmsgid, plural_gmsgid,
+ rich_location richloc (line_table, location);
+ diagnostic_n_impl (&richloc, -1, n, singular_gmsgid, plural_gmsgid,
&ap, DK_NOTE);
va_end (ap);
}
@@ -1216,11 +1204,13 @@ warning_at (location_t location, int opt, const char *gmsgid, ...)
return ret;
}
-/* Same as warning at, but using RICHLOC. */
+/* Same as "warning_at" above, but using RICHLOC. */
bool
-warning_at_rich_loc (rich_location *richloc, int opt, const char *gmsgid, ...)
+warning_at (rich_location *richloc, int opt, const char *gmsgid, ...)
{
+ gcc_assert (richloc);
+
va_list ap;
va_start (ap, gmsgid);
bool ret = diagnostic_impl (richloc, opt, gmsgid, &ap, DK_WARNING);
@@ -1228,17 +1218,19 @@ warning_at_rich_loc (rich_location *richloc, int opt, const char *gmsgid, ...)
return ret;
}
-/* Same as warning_at_rich_loc but for plural variant. */
+/* Same as the warning_n plural variant below, but using RICHLOC. */
bool
-warning_at_rich_loc_n (rich_location *richloc, int opt, int n,
- const char *singular_gmsgid, const char *plural_gmsgid, ...)
+warning_n (rich_location *richloc, int opt, int n,
+ const char *singular_gmsgid, const char *plural_gmsgid, ...)
{
+ gcc_assert (richloc);
+
va_list ap;
va_start (ap, plural_gmsgid);
- bool ret = diagnostic_n_impl_richloc (richloc, opt, n,
- singular_gmsgid, plural_gmsgid,
- &ap, DK_WARNING);
+ bool ret = diagnostic_n_impl (richloc, opt, n,
+ singular_gmsgid, plural_gmsgid,
+ &ap, DK_WARNING);
va_end (ap);
return ret;
}
@@ -1253,7 +1245,8 @@ warning_n (location_t location, int opt, int n, const char *singular_gmsgid,
{
va_list ap;
va_start (ap, plural_gmsgid);
- bool ret = diagnostic_n_impl (location, opt, n,
+ rich_location richloc (line_table, location);
+ bool ret = diagnostic_n_impl (&richloc, opt, n,
singular_gmsgid, plural_gmsgid,
&ap, DK_WARNING);
va_end (ap);
@@ -1284,11 +1277,13 @@ pedwarn (location_t location, int opt, const char *gmsgid, ...)
return ret;
}
-/* Same as pedwarn, but using RICHLOC. */
+/* Same as pedwarn above, but using RICHLOC. */
bool
-pedwarn_at_rich_loc (rich_location *richloc, int opt, const char *gmsgid, ...)
+pedwarn (rich_location *richloc, int opt, const char *gmsgid, ...)
{
+ gcc_assert (richloc);
+
va_list ap;
va_start (ap, gmsgid);
bool ret = diagnostic_impl (richloc, opt, gmsgid, &ap, DK_PEDWARN);
@@ -1314,11 +1309,13 @@ permerror (location_t location, const char *gmsgid, ...)
return ret;
}
-/* Same as "permerror", but at RICHLOC. */
+/* Same as "permerror" above, but at RICHLOC. */
bool
-permerror_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+permerror (rich_location *richloc, const char *gmsgid, ...)
{
+ gcc_assert (richloc);
+
va_list ap;
va_start (ap, gmsgid);
bool ret = diagnostic_impl (richloc, -1, gmsgid, &ap, DK_PERMERROR);
@@ -1346,7 +1343,8 @@ error_n (location_t location, int n, const char *singular_gmsgid,
{
va_list ap;
va_start (ap, plural_gmsgid);
- diagnostic_n_impl (location, -1, n, singular_gmsgid, plural_gmsgid,
+ rich_location richloc (line_table, location);
+ diagnostic_n_impl (&richloc, -1, n, singular_gmsgid, plural_gmsgid,
&ap, DK_ERROR);
va_end (ap);
}
@@ -1365,8 +1363,10 @@ error_at (location_t loc, const char *gmsgid, ...)
/* Same as above, but use RICH_LOC. */
void
-error_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+error_at (rich_location *richloc, const char *gmsgid, ...)
{
+ gcc_assert (richloc);
+
va_list ap;
va_start (ap, gmsgid);
diagnostic_impl (richloc, -1, gmsgid, &ap, DK_ERROR);
@@ -1628,6 +1628,45 @@ test_print_parseable_fixits_replace ()
pp_formatted_text (&pp));
}
+/* Verify that
+ diagnostic_get_location_text (..., SHOW_COLUMN)
+ generates EXPECTED_LOC_TEXT, given FILENAME, LINE, COLUMN, with
+ colorization disabled. */
+
+static void
+assert_location_text (const char *expected_loc_text,
+ const char *filename, int line, int column,
+ bool show_column)
+{
+ test_diagnostic_context dc;
+ dc.show_column = show_column;
+
+ expanded_location xloc;
+ xloc.file = filename;
+ xloc.line = line;
+ xloc.column = column;
+ xloc.data = NULL;
+ xloc.sysp = false;
+
+ char *actual_loc_text = diagnostic_get_location_text (&dc, xloc);
+ ASSERT_STREQ (expected_loc_text, actual_loc_text);
+ free (actual_loc_text);
+}
+
+/* Verify that diagnostic_get_location_text works as expected. */
+
+static void
+test_diagnostic_get_location_text ()
+{
+ const char *old_progname = progname;
+ progname = "PROGNAME";
+ assert_location_text ("PROGNAME:", NULL, 0, 0, true);
+ assert_location_text ("<built-in>:", "<built-in>", 42, 10, true);
+ assert_location_text ("foo.c:42:10:", "foo.c", 42, 10, true);
+ assert_location_text ("foo.c:42:", "foo.c", 42, 10, false);
+ progname = old_progname;
+}
+
/* Run all of the selftests within this file. */
void
@@ -1638,6 +1677,7 @@ diagnostic_c_tests ()
test_print_parseable_fixits_insert ();
test_print_parseable_fixits_remove ();
test_print_parseable_fixits_replace ();
+ test_diagnostic_get_location_text ();
}
} // namespace selftest
diff --git a/gcc/doc/cpp.texi b/gcc/doc/cpp.texi
index 52f2606eadc..8cafb6554f8 100644
--- a/gcc/doc/cpp.texi
+++ b/gcc/doc/cpp.texi
@@ -211,8 +211,8 @@ Standard C@. In its default mode, the GNU C preprocessor does not do a
few things required by the standard. These are features which are
rarely, if ever, used, and may cause surprising changes to the meaning
of a program which does not expect them. To get strict ISO Standard C,
-you should use the @option{-std=c90}, @option{-std=c99} or
-@option{-std=c11} options, depending
+you should use the @option{-std=c90}, @option{-std=c99},
+@option{-std=c11} or @option{-std=c17} options, depending
on which version of the standard you want. To get all the mandatory
diagnostics, you must also use @option{-pedantic}. @xref{Invocation}.
@@ -1857,8 +1857,11 @@ implementation, unless GNU CPP is being used with GCC@.
The value @code{199409L} signifies the 1989 C standard as amended in
1994, which is the current default; the value @code{199901L} signifies
-the 1999 revision of the C standard. Support for the 1999 revision is
-not yet complete.
+the 1999 revision of the C standard; the value @code{201112L}
+signifies the 2011 revision of the C standard; the value
+@code{201710L} signifies the 2017 revision of the C standard (which is
+otherwise identical to the 2011 version apart from correction of
+defects).
This macro is not defined if the @option{-traditional-cpp} option is
used, nor when compiling C++ or Objective-C@.
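As a quick illustration of the values listed above (a sketch; the HAVE_C11_FEATURES macro is invented), a translation unit can key off the revision like this; note that, per the paragraph above, the test is simply false when compiling C++ or Objective-C:

#if defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L
/* C11 (201112L) or C17 (201710L).  */
# define HAVE_C11_FEATURES 1
#else
/* C90, C94 (199409L), C99 (199901L), C++ or Objective-C.  */
# define HAVE_C11_FEATURES 0
#endif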
@@ -2366,6 +2369,21 @@ the include file @file{math.h} can define the macros
@code{FP_FAST_FMA}, @code{FP_FAST_FMAF}, and @code{FP_FAST_FMAL}
for compatibility with the 1999 C standard.
+@item __FP_FAST_FMAF16
+@itemx __FP_FAST_FMAF32
+@itemx __FP_FAST_FMAF64
+@itemx __FP_FAST_FMAF128
+@itemx __FP_FAST_FMAF32X
+@itemx __FP_FAST_FMAF64X
+@itemx __FP_FAST_FMAF128X
+These macros are defined with the value 1 if the backend supports the
+@code{fma} functions using the additional @code{_Float@var{n}} and
+@code{_Float@var{n}x} types that are defined in ISO/IEC TS
+18661-3:2015. The include file @file{math.h} can define the
+@code{FP_FAST_FMAF@var{n}} and @code{FP_FAST_FMAF@var{n}x} macros if
+the user defined @code{__STDC_WANT_IEC_60559_TYPES_EXT__} before
+including @file{math.h}.
+
@item __GCC_IEC_559
This macro is defined to indicate the intended level of support for
IEEE 754 (IEC 60559) floating-point arithmetic. It expands to a
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index d9b7a540cbd..8aa443f87fb 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -5690,6 +5690,58 @@ Specify which floating-point unit to use. You must specify the
@code{target("fpmath=sse,387")} option as
@code{target("fpmath=sse+387")} because the comma would separate
different options.
+
+@item nocf_check
+@cindex @code{nocf_check} function attribute
+The @code{nocf_check} attribute on a function is used to inform the
+compiler that the function's prologue should not be instrumented when
+compiled with the @option{-fcf-protection=branch} option. The
+compiler assumes that the function's address is a valid target for a
+control-flow transfer.
+
+The @code{nocf_check} attribute on a type of pointer to function is
+used to inform the compiler that a call through the pointer should
+not be instrumented when compiled with the
+@option{-fcf-protection=branch} option. The compiler assumes
+that the function's address from the pointer is a valid target for
+a control-flow transfer. A direct function call through a function
+name is assumed to be a safe call, and thus direct calls are not
+instrumented by the compiler.
+
+The @code{nocf_check} attribute is applied to an object's type.
+In case of assignment of a function address or a function pointer to
+another pointer, the attribute is not carried over from the right-hand
+object's type; the type of the left-hand object stays unchanged. The
+compiler checks for a @code{nocf_check} attribute mismatch and reports
+a warning when one is found.
+
+@smallexample
+int foo (void) __attribute__((nocf_check));
+void (*foo1)(void) __attribute__((nocf_check));
+void (*foo2)(void);
+
+int
+foo (void) /* The function's address is assumed to be valid. */
+@{
+
+ /* This call site is not checked for control-flow validity. */
+ (*foo1)();
+
+ foo1 = foo2; /* A warning is printed about attribute mismatch. */
+ /* This call site is still not checked for control-flow validity. */
+ (*foo1)();
+
+ /* This call site is checked for control-flow validity. */
+ (*foo2)();
+
+ foo2 = foo1; /* A warning is printed about attribute mismatch. */
+ /* This call site is still checked for control-flow validity. */
+ (*foo2)();
+
+ return 0;
+@}
+@end smallexample
+
@end table
On the x86, the inliner does not inline a
@@ -7723,8 +7775,8 @@ GCC implements three different semantics of declaring a function
inline. One is available with @option{-std=gnu89} or
@option{-fgnu89-inline} or when @code{gnu_inline} attribute is present
on all inline declarations, another when
-@option{-std=c99}, @option{-std=c11},
-@option{-std=gnu99} or @option{-std=gnu11}
+@option{-std=c99},
+@option{-std=gnu99} or an option for a later C version is used
(without @option{-fgnu89-inline}), and the third
is used when compiling C++.
@@ -10869,6 +10921,7 @@ in the Cilk Plus language manual which can be found at
@cindex built-in functions
@findex __builtin_alloca
@findex __builtin_alloca_with_align
+@findex __builtin_alloca_with_align_and_max
@findex __builtin_call_with_static_chain
@findex __builtin_fpclassify
@findex __builtin_isfinite
@@ -11516,6 +11569,16 @@ an extension. @xref{Variable Length}, for details.
@end deftypefn
+@deftypefn {Built-in Function} void *__builtin_alloca_with_align_and_max (size_t size, size_t alignment, size_t max_size)
+Similar to @code{__builtin_alloca_with_align} but takes an extra argument
+specifying an upper bound for @var{size} in case its value cannot be computed
+at compile time, for use by @option{-fstack-usage}, @option{-Wstack-usage}
+and @option{-Walloca-larger-than}. @var{max_size} must be a constant integer
+expression; it has no effect on code generation, and no attempt is made to
+check its compatibility with @var{size}.
+
+@end deftypefn
+
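A minimal usage sketch (assumptions: the alignment argument is in bits, as with __builtin_alloca_with_align, and the 4096-byte bound is an arbitrary example value):

void
clear_scratch (unsigned int n)
{
  /* Variable-sized, 16-byte-aligned allocation; 4096 only feeds the
     -fstack-usage / -Wstack-usage / -Walloca-larger-than diagnostics.  */
  char *buf
    = (char *) __builtin_alloca_with_align_and_max (n, 128, 4096);
  __builtin_memset (buf, 0, n);
}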
@deftypefn {Built-in Function} int __builtin_types_compatible_p (@var{type1}, @var{type2})
You can use the built-in function @code{__builtin_types_compatible_p} to
@@ -21456,6 +21519,25 @@ void __builtin_ia32_wrpkru (unsigned int)
unsigned int __builtin_ia32_rdpkru ()
@end smallexample
+The following built-in functions are available when @option{-mcet} is used.
+They are used to support Intel Control-flow Enforcement Technology (CET).
+Each built-in function generates the machine instruction that is part of the
+function's name.
+@smallexample
+unsigned int __builtin_ia32_rdsspd (unsigned int)
+unsigned long long __builtin_ia32_rdsspq (unsigned long long)
+void __builtin_ia32_incsspd (unsigned int)
+void __builtin_ia32_incsspq (unsigned long long)
+void __builtin_ia32_saveprevssp(void);
+void __builtin_ia32_rstorssp(void *);
+void __builtin_ia32_wrssd(unsigned int, void *);
+void __builtin_ia32_wrssq(unsigned long long, void *);
+void __builtin_ia32_wrussd(unsigned int, void *);
+void __builtin_ia32_wrussq(unsigned long long, void *);
+void __builtin_ia32_setssbsy(void);
+void __builtin_ia32_clrssbsy(void *);
+@end smallexample
+
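A hedged sketch of how one of these builtins might be used (assumes an x86 target built with -mcet; passing 0 as the input operand of the rdssp builtins is illustrative, not taken from this manual):

unsigned long long
current_ssp (void)
{
  /* RDSSPQ reads the shadow-stack pointer; the operand supplies the value
     that is returned unchanged when shadow stacks are not enabled.  */
  return __builtin_ia32_rdsspq (0);
}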
@node x86 transactional memory intrinsics
@subsection x86 Transactional Memory Intrinsics
diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index 706aa6cf0b0..5c4ba8a51a7 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -125,6 +125,8 @@ gcov [@option{-v}|@option{--version}] [@option{-h}|@option{--help}]
[@option{-d}|@option{--display-progress}]
[@option{-f}|@option{--function-summaries}]
[@option{-i}|@option{--intermediate-format}]
+ [@option{-j}|@option{--human-readable}]
+ [@option{-k}|@option{--use-colors}]
[@option{-l}|@option{--long-file-names}]
[@option{-m}|@option{--demangled-names}]
[@option{-n}|@option{--no-output}]
@@ -185,10 +187,14 @@ be used by @command{lcov} or other tools. The output is a single
The format of the intermediate @file{.gcov} file is plain text with
one entry per line
+@item -j
+@itemx --human-readable
+Write counts in human readable format (like 24k).
+
@smallexample
file:@var{source_file_name}
function:@var{line_number},@var{execution_count},@var{function_name}
-lcount:@var{line number},@var{execution_count}
+lcount:@var{line_number},@var{execution_count},@var{has_unexecuted_block}
branch:@var{line_number},@var{branch_coverage_type}
Where the @var{branch_coverage_type} is
@@ -207,14 +213,22 @@ Here is a sample when @option{-i} is used in conjunction with @option{-b} option
file:array.cc
function:11,1,_Z3sumRKSt6vectorIPiSaIS0_EE
function:22,1,main
-lcount:11,1
-lcount:12,1
-lcount:14,1
+lcount:11,1,0
+lcount:12,1,0
+lcount:14,1,0
branch:14,taken
-lcount:26,1
+lcount:26,1,0
branch:28,nottaken
@end smallexample
+@item -k
+@itemx --use-colors
+
+Use colors for lines of code that have zero coverage. Red is used for
+non-exceptional lines and cyan for exceptional lines. The same colors are
+used for basic blocks when the @option{-a} option is given.
+
+
@item -l
@itemx --long-file-names
Create long file names for included source files. For example, if the
@@ -327,6 +341,16 @@ non-exceptional paths or only exceptional paths such as C++ exception
handlers, respectively. Given @samp{-a} option, unexecuted blocks are
marked @samp{$$$$$} or @samp{%%%%%}, depending on whether a basic block
is reachable via non-exceptional or exceptional paths.
+Executed basic blocks that contain a statement with zero @var{execution_count}
+end with a @samp{*} character and are colored magenta when the @option{-k}
+option is given.
+
+Note that GCC can completely remove the bodies of functions that are
+not needed -- for instance if they are inlined everywhere. Such functions
+are marked with @samp{-}, which can be confusing.
+Use the @option{-fkeep-inline-functions} and @option{-fkeep-static-functions}
+options to retain these functions and
+allow gcov to properly show their @var{execution_count}.
Some lines of information at the start have @var{line_number} of zero.
These preamble lines are of the form
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 5c39f453e7f..bb42269ef53 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1107,8 +1107,10 @@ constant @var{base} and @var{step}. The value of @var{base} is
given by @code{VEC_SERIES_CST_BASE} and the value of @var{step} is
given by @code{VEC_SERIES_CST_STEP}.
-These nodes are restricted to integral types, in order to avoid
-specifying the rounding behavior for floating-point types.
+At present only variable-length vectors use @code{VEC_SERIES_CST};
+constant-length vectors use @code{VECTOR_CST} instead. The nodes
+are also restricted to integral types, in order to avoid specifying
+the rounding behavior for floating-point types.
@item STRING_CST
These nodes represent string-constants. The @code{TREE_STRING_LENGTH}
diff --git a/gcc/doc/gimple.texi b/gcc/doc/gimple.texi
index 635abd39b6e..fa98800a675 100644
--- a/gcc/doc/gimple.texi
+++ b/gcc/doc/gimple.texi
@@ -1310,11 +1310,13 @@ operand is validated with @code{is_gimple_operand}).
@end deftypefn
-@deftypefn {GIMPLE function} gcall *gimple_build_call_from_tree (tree call_expr)
-Build a @code{GIMPLE_CALL} from a @code{CALL_EXPR} node. The arguments and the
-function are taken from the expression directly. This routine
-assumes that @code{call_expr} is already in GIMPLE form. That is, its
-operands are GIMPLE values and the function call needs no further
+@deftypefn {GIMPLE function} gcall *gimple_build_call_from_tree (tree call_expr, @
+tree fnptrtype)
+Build a @code{GIMPLE_CALL} from a @code{CALL_EXPR} node. The arguments
+and the function are taken from the expression directly. The type of the
+@code{GIMPLE_CALL} is set from the second parameter passed by a caller.
+This routine assumes that @code{call_expr} is already in GIMPLE form.
+That is, its operands are GIMPLE values and the function call needs no further
simplification. All the call flags in @code{call_expr} are copied over
to the new @code{GIMPLE_CALL}.
@end deftypefn
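A sketch of a hypothetical caller of the updated interface (the helper name is invented; deriving the pointer type from CALL_EXPR_FN is one plausible way to obtain the new second argument):

static gcall *
lower_call (tree call_expr)
{
  /* The type of the GIMPLE_CALL now comes from the caller rather than
     being recomputed from the CALL_EXPR.  */
  tree fnptrtype = TREE_TYPE (CALL_EXPR_FN (call_expr));
  return gimple_build_call_from_tree (call_expr, fnptrtype);
}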
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index da360da1c50..b10c94af5ca 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1007,31 +1007,6 @@ Specify that stabs debugging
information should be used instead of whatever format the host normally
uses. Normally GCC uses the same debug format as the host system.
-On MIPS based systems and on Alphas, you must specify whether you want
-GCC to create the normal ECOFF debugging format, or to use BSD-style
-stabs passed through the ECOFF symbol table. The normal ECOFF debug
-format cannot fully handle languages other than C@. BSD stabs format can
-handle other languages, but it only works with the GNU debugger GDB@.
-
-Normally, GCC uses the ECOFF debugging format by default; if you
-prefer BSD stabs, specify @option{--with-stabs} when you configure GCC@.
-
-No matter which default you choose when you configure GCC, the user
-can use the @option{-gcoff} and @option{-gstabs+} options to specify explicitly
-the debug format for a particular compilation.
-
-@option{--with-stabs} is meaningful on the ISC system on the 386, also, if
-@option{--with-gas} is used. It selects use of stabs debugging
-information embedded in COFF output. This kind of debugging information
-supports C++ well; ordinary COFF debugging information does not.
-
-@option{--with-stabs} is also meaningful on 386 systems running SVR4. It
-selects use of stabs debugging information embedded in ELF output. The
-C++ compiler currently (2.6.0) does not support the DWARF debugging
-information normally used on 386 SVR4 platforms; stabs provide a
-workable alternative. This requires gas and gdb, as the normal SVR4
-tools can not generate or interpret stabs.
-
@item --with-tls=@var{dialect}
Specify the default TLS dialect, for systems were there is a choice.
For ARM targets, possible values for @var{dialect} are @code{gnu} or
@@ -1665,7 +1640,8 @@ not be built.
@item --disable-libssp
Specify that the run-time libraries for stack smashing protection
-should not be built.
+should not be built or linked against. On many targets library support
+is provided by the C library instead.
@item --disable-libquadmath
Specify that the GCC quad-precision math library should not be built.
@@ -2492,6 +2468,13 @@ useful to verify the full @option{-fcompare-debug} testing coverage. It
must be used along with @code{bootstrap-debug-lean} and
@code{bootstrap-debug-lib}.
+@item @samp{bootstrap-cet}
+This option enables Intel CET for host tools during bootstrapping.
+@samp{BUILD_CONFIG=bootstrap-cet} is equivalent to adding
+@option{-fcf-protection -mcet} to @samp{BOOT_CFLAGS}. This option
+assumes that the host supports Intel CET (e.g. GNU assembler version
+2.30 or later).
+
@item @samp{bootstrap-time}
Arranges for the run time of each program started by the GCC driver,
built in any stage, to be logged to @file{time.log}, in the top level of
@@ -3161,8 +3144,6 @@ information have to.
@item
@uref{#alpha-x-x,,alpha*-*-*}
@item
-@uref{#alpha-dec-osf51,,alpha*-dec-osf5.1}
-@item
@uref{#amd64-x-solaris210,,amd64-*-solaris2.10}
@item
@uref{#arm-x-eabi,,arm-*-eabi}
@@ -3213,10 +3194,6 @@ information have to.
@item
@uref{#mips-x-x,,mips-*-*}
@item
-@uref{#mips-sgi-irix5,,mips-sgi-irix5}
-@item
-@uref{#mips-sgi-irix6,,mips-sgi-irix6}
-@item
@uref{#nds32le-x-elf,,nds32le-*-elf}
@item
@uref{#nds32be-x-elf,,nds32be-*-elf}
@@ -3346,8 +3323,7 @@ The workaround is disabled by default if neither of
@anchor{alpha-x-x}
@heading alpha*-*-*
This section contains general configuration information for all
-alpha-based platforms using ELF (in particular, ignore this section for
-DEC OSF/1, Digital UNIX and Tru64 UNIX)@. In addition to reading this
+Alpha-based platforms using ELF@. In addition to reading this
section, please read all other sections that match your target.
We require binutils 2.11.2 or newer.
@@ -3358,20 +3334,6 @@ shared libraries.
@html
<hr />
@end html
-@anchor{alpha-dec-osf51}
-@heading alpha*-dec-osf5.1
-Systems using processors that implement the DEC Alpha architecture and
-are running the DEC/Compaq/HP Unix (DEC OSF/1, Digital UNIX, or Compaq/HP
-Tru64 UNIX) operating system, for example the DEC Alpha AXP systems.
-
-Support for Tru64 UNIX V5.1 has been removed in GCC 4.8. As of GCC 4.6,
-support for Tru64 UNIX V4.0 and V5.0 has been removed. As of GCC 3.2,
-versions before @code{alpha*-dec-osf4} are no longer supported. (These
-are the versions which identify themselves as DEC OSF/1.)
-
-@html
-<hr />
-@end html
@anchor{amd64-x-solaris210}
@heading amd64-*-solaris2.1[0-9]*
This is a synonym for @samp{x86_64-*-solaris2.1[0-9]*}.
@@ -3799,12 +3761,10 @@ It is recommended that you configure GCC to use the GNU assembler. The
versions included in Solaris 10, from GNU binutils 2.15 (in
@file{/usr/sfw/bin/gas}), and Solaris 11, from GNU binutils 2.19 or
newer (also available as @file{/usr/bin/gas} and
-@file{/usr/gnu/bin/as}), work fine. Please note that the current
-version, from GNU binutils 2.26, only works on Solaris 12 when using the
-Solaris linker. On Solaris 10 and 11, you either have to wait for GNU
-binutils 2.26.1 or newer, or stay with GNU binutils 2.25.1. Recent
-versions of the Solaris assembler in @file{/usr/ccs/bin/as} work almost
-as well, though.
+@file{/usr/gnu/bin/as}), work fine. The current version, from GNU
+binutils 2.29, is known to work, but the version from GNU binutils 2.26
+must be avoided. Recent versions of the Solaris assembler in
+@file{/usr/ccs/bin/as} work almost as well, though.
@c FIXME: as patch requirements?
For linking, the Solaris linker is preferred. If you want to use the GNU
@@ -3812,7 +3772,7 @@ linker instead, note that due to a packaging bug the version in Solaris
10, from GNU binutils 2.15 (in @file{/usr/sfw/bin/gld}), cannot be used,
while the version in Solaris 11, from GNU binutils 2.19 or newer (also
in @file{/usr/gnu/bin/ld} and @file{/usr/bin/gld}), works, as does the
-latest version, from GNU binutils 2.26.
+latest version, from GNU binutils 2.29.
To use GNU @command{as}, configure with the options
@option{--with-gnu-as --with-as=@//usr/@/sfw/@/bin/@/gas}. It may be necessary
@@ -4160,22 +4120,6 @@ use traps on systems that support them.
@html
<hr />
@end html
-@anchor{mips-sgi-irix5}
-@heading mips-sgi-irix5
-Support for IRIX 5 has been removed in GCC 4.6.
-
-@html
-<hr />
-@end html
-@anchor{mips-sgi-irix6}
-@heading mips-sgi-irix6
-Support for IRIX 6.5 has been removed in GCC 4.8. Support for IRIX 6
-releases before 6.5 has been removed in GCC 4.6, as well as support for
-the O32 ABI.
-
-@html
-<hr />
-@end html
@anchor{moxie-x-elf}
@heading moxie-*-elf
The moxie processor.
@@ -4449,9 +4393,8 @@ versions included in Solaris 10, from GNU binutils 2.15 (in
@file{/usr/sfw/bin/gas}), and Solaris 11,
from GNU binutils 2.19 or newer (also in @file{/usr/bin/gas} and
@file{/usr/gnu/bin/as}), are known to work.
-Current versions of GNU binutils (2.26)
-are known to work as well, with the caveat mentioned in
-@uref{#ix86-x-solaris210,,i?86-*-solaris2.10} . Note that your mileage may vary
+The current version, from GNU binutils 2.29,
+is known to work as well. Note that your mileage may vary
if you use a combination of the GNU tools and the Solaris tools: while the
combination GNU @command{as} + Sun @command{ld} should reasonably work,
the reverse combination Sun @command{as} + GNU @command{ld} may fail to
@@ -4459,7 +4402,7 @@ build or cause memory corruption at runtime in some cases for C++ programs.
@c FIXME: still?
GNU @command{ld} usually works as well, although the version included in
Solaris 10 cannot be used due to several bugs. Again, the current
-version (2.26) is known to work, but generally lacks platform specific
+version (2.29) is known to work, but generally lacks platform specific
features, so better stay with Solaris @command{ld}. To use the LTO linker
plugin (@option{-fuse-linker-plugin}) with GNU @command{ld}, GNU
binutils @emph{must} be configured with @option{--enable-largefile}.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9bf1a17ebfb..4e96a3942c2 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -45,7 +45,7 @@ remainder. @command{g++} accepts mostly the same options as @command{gcc}.
@c man end
@c man begin SEEALSO
gpl(7), gfdl(7), fsf-funding(7),
-cpp(1), gcov(1), as(1), ld(1), gdb(1), adb(1), dbx(1), sdb(1)
+cpp(1), gcov(1), as(1), ld(1), gdb(1), dbx(1)
and the Info entries for @file{gcc}, @file{cpp}, @file{as},
@file{ld}, @file{binutils} and @file{gdb}.
@c man end
@@ -315,7 +315,7 @@ Objective-C and Objective-C++ Dialects}.
-Wstack-protector -Wstack-usage=@var{len} -Wstrict-aliasing @gol
-Wstrict-aliasing=n -Wstrict-overflow -Wstrict-overflow=@var{n} @gol
-Wstringop-overflow=@var{n} @gol
--Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{]} @gol
+-Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{|}malloc@r{]} @gol
-Wsuggest-final-types @gol -Wsuggest-final-methods -Wsuggest-override @gol
-Wmissing-format-attribute -Wsubobject-linkage @gol
-Wswitch -Wswitch-bool -Wswitch-default -Wswitch-enum @gol
@@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
@item Debugging Options
@xref{Debugging Options,,Options for Debugging Your Program}.
-@gccoptlist{-g -g@var{level} -gcoff -gdwarf -gdwarf-@var{version} @gol
+@gccoptlist{-g -g@var{level} -gdwarf -gdwarf-@var{version} @gol
-ggdb -grecord-gcc-switches -gno-record-gcc-switches @gol
-gstabs -gstabs+ -gstrict-dwarf -gno-strict-dwarf @gol
-gcolumn-info -gno-column-info @gol
@@ -461,6 +461,7 @@ Objective-C and Objective-C++ Dialects}.
-fchkp-check-read -fchkp-check-write -fchkp-store-bounds @gol
-fchkp-instrument-calls -fchkp-instrument-marked-only @gol
-fchkp-use-wrappers -fchkp-flexible-struct-trailing-arrays@gol
+-fcf-protection=@r{[}full@r{|}branch@r{|}return@r{|}none@r{]} @gol
-fstack-protector -fstack-protector-all -fstack-protector-strong @gol
-fstack-protector-explicit -fstack-check @gol
-fstack-limit-register=@var{reg} -fstack-limit-symbol=@var{sym} @gol
@@ -742,7 +743,7 @@ Objective-C and Objective-C++ Dialects}.
@gccoptlist{-msmall-model -mno-lsim}
@emph{FT32 Options}
-@gccoptlist{-msim -mlra -mnodiv}
+@gccoptlist{-msim -mlra -mnodiv -mft32b -mcompress -mnopm}
@emph{FRV Options}
@gccoptlist{-mgpr-32 -mgpr-64 -mfpr-32 -mfpr-64 @gol
@@ -947,6 +948,7 @@ Objective-C and Objective-C++ Dialects}.
@emph{Nios II Options}
@gccoptlist{-G @var{num} -mgpopt=@var{option} -mgpopt -mno-gpopt @gol
+-mgprel-sec=@var{regexp} -mr0rel-sec=@var{regexp} @gol
-mel -meb @gol
-mno-bypass-cache -mbypass-cache @gol
-mno-cache-volatile -mcache-volatile @gol
@@ -987,7 +989,7 @@ See RS/6000 and PowerPC Options.
-msmall-data-limit=@var{N-bytes} @gol
-msave-restore -mno-save-restore @gol
-mstrict-align -mno-strict-align @gol
--mcmodel=@var{code-model} @gol
+-mcmodel=medlow -mcmodel=medany @gol
-mexplicit-relocs -mno-explicit-relocs @gol}
@emph{RL78 Options}
@@ -1203,6 +1205,7 @@ See RS/6000 and PowerPC Options.
-msse4a -m3dnow -m3dnowa -mpopcnt -mabm -mbmi -mtbm -mfma4 -mxop @gol
-mlzcnt -mbmi2 -mfxsr -mxsave -mxsaveopt -mrtm -mlwp -mmpx @gol
-mmwaitx -mclzero -mpku -mthreads @gol
+-mcet -mibt -mshstk @gol
-mms-bitfields -mno-align-stringops -minline-all-stringops @gol
-minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol
-mmemcpy-strategy=@var{strategy} -mmemset-strategy=@var{strategy} @gol
@@ -1828,6 +1831,13 @@ substantially completely supported, modulo bugs, floating-point issues
Annexes F and G) and the optional Annexes K (Bounds-checking
interfaces) and L (Analyzability). The name @samp{c1x} is deprecated.
+@item c17
+@itemx iso9899:2017
+ISO C17, the 2017 revision of the ISO C standard. This standard is
+the same as C11 except for corrections of defects (all of which are also
+applied with @option{-std=c11}) and a new value of
+@code{__STDC_VERSION__}, and so is supported to the same extent as C11.
+
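+For example (a purely illustrative program, not part of the standard
+text), the standard version in effect can be inspected through
+@code{__STDC_VERSION__}:
+
+@smallexample
+#include <stdio.h>
+
+int
+main (void)
+@{
+  /* With -std=c17 this prints the C17 value of __STDC_VERSION__
+     (expected to be 201710L).  */
+  printf ("%ld\n", (long) __STDC_VERSION__);
+  return 0;
+@}
+@end smallexample
+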
@item gnu90
@itemx gnu89
GNU dialect of ISO C90 (including some C99 features).
@@ -1838,9 +1848,12 @@ GNU dialect of ISO C99. The name @samp{gnu9x} is deprecated.
@item gnu11
@itemx gnu1x
-GNU dialect of ISO C11. This is the default for C code.
+GNU dialect of ISO C11.
The name @samp{gnu1x} is deprecated.
+@item gnu17
+GNU dialect of ISO C17. This is the default for C code.
+
@item c++98
@itemx c++03
The 1998 ISO C++ standard plus the 2003 technical corrigendum and some
@@ -5201,7 +5214,7 @@ whether to issue a warning. Similarly to @option{-Wstringop-overflow=3} this
setting of the option may result in warnings for benign code.
@end table
-@item -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{|}cold@r{]}
+@item -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{|}cold@r{|}malloc@r{]}
@opindex Wsuggest-attribute=
@opindex Wno-suggest-attribute=
Warn for cases where adding an attribute may be beneficial. The
@@ -5211,21 +5224,25 @@ attributes currently supported are listed below.
@item -Wsuggest-attribute=pure
@itemx -Wsuggest-attribute=const
@itemx -Wsuggest-attribute=noreturn
+@itemx -Wsuggest-attribute=malloc
@opindex Wsuggest-attribute=pure
@opindex Wno-suggest-attribute=pure
@opindex Wsuggest-attribute=const
@opindex Wno-suggest-attribute=const
@opindex Wsuggest-attribute=noreturn
@opindex Wno-suggest-attribute=noreturn
+@opindex Wsuggest-attribute=malloc
+@opindex Wno-suggest-attribute=malloc
Warn about functions that might be candidates for attributes
-@code{pure}, @code{const} or @code{noreturn}. The compiler only warns for
-functions visible in other compilation units or (in the case of @code{pure} and
-@code{const}) if it cannot prove that the function returns normally. A function
-returns normally if it doesn't contain an infinite loop or return abnormally
-by throwing, calling @code{abort} or trapping. This analysis requires option
-@option{-fipa-pure-const}, which is enabled by default at @option{-O} and
-higher. Higher optimization levels improve the accuracy of the analysis.
+@code{pure}, @code{const}, @code{noreturn} or @code{malloc}. The compiler
+only warns for functions visible in other compilation units or (in the case of
+@code{pure} and @code{const}) if it cannot prove that the function returns
+normally. A function returns normally if it doesn't contain an infinite loop or
+return abnormally by throwing, calling @code{abort} or trapping. This analysis
+requires option @option{-fipa-pure-const}, which is enabled by default at
+@option{-O} and higher. Higher optimization levels improve the accuracy
+of the analysis.
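+
+For instance (an illustrative sketch; the function name is made up), a
+simple allocation wrapper such as the following may, at sufficient
+optimization levels, prompt a suggestion to add
+@code{__attribute__ ((malloc))}:
+
+@smallexample
+#include <stdlib.h>
+
+/* Returns freshly allocated storage that aliases nothing else, so it
+   is a candidate for the malloc attribute.  */
+void *
+make_buffer (size_t n)
+@{
+  return calloc (n, 1);
+@}
+@end smallexample
+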
@item -Wsuggest-attribute=format
@itemx -Wmissing-format-attribute
@@ -6318,7 +6335,8 @@ attributes.
@item -Wno-builtin-declaration-mismatch
@opindex Wno-builtin-declaration-mismatch
@opindex Wbuiltin-declaration-mismatch
-Warn if a built-in function is declared with the wrong signature.
+Warn if a built-in function is declared with the wrong signature or
+as a non-function.
This warning is enabled by default.
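+
+For instance (purely illustrative), a declaration along the lines of
+the following conflicts with the built-in prototype of @code{strlen}
+and is expected to be diagnosed:
+
+@smallexample
+/* The built-in strlen takes a const char * and returns size_t.  */
+int strlen (int);
+@end smallexample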
@item -Wno-builtin-macro-redefined
@@ -6893,7 +6911,7 @@ in their names, but apply to all currently-supported versions of DWARF.
Produce debugging information in stabs format (if that is supported),
without GDB extensions. This is the format used by DBX on most BSD
systems. On MIPS, Alpha and System V Release 4 systems this option
-produces stabs debugging output that is not understood by DBX or SDB@.
+produces stabs debugging output that is not understood by DBX@.
On System V Release 4 systems this option requires the GNU assembler.
@item -gstabs+
@@ -6903,12 +6921,6 @@ using GNU extensions understood only by the GNU debugger (GDB)@. The
use of these extensions is likely to make other debuggers crash or
refuse to read the program.
-@item -gcoff
-@opindex gcoff
-Produce debugging information in COFF format (if that is supported).
-This is the format used by SDB on most System V systems prior to
-System V Release 4.
-
@item -gxcoff
@opindex gxcoff
Produce debugging information in XCOFF format (if that is supported).
@@ -6930,7 +6942,6 @@ supported). This is the format used by DEBUG on Alpha/VMS systems.
@item -g@var{level}
@itemx -ggdb@var{level}
@itemx -gstabs@var{level}
-@itemx -gcoff@var{level}
@itemx -gxcoff@var{level}
@itemx -gvms@var{level}
Request debugging information and also use @var{level} to specify how
@@ -6979,7 +6990,12 @@ link processing time. Merging is enabled by default.
@item -fdebug-prefix-map=@var{old}=@var{new}
@opindex fdebug-prefix-map
When compiling files in directory @file{@var{old}}, record debugging
-information describing them as in @file{@var{new}} instead.
+information describing them as in @file{@var{new}} instead. This can be
+used to replace a build-time path with an install-time path in the debug info.
+It can also be used to change an absolute path to a relative path by using
+@file{.} for @var{new}. This can give more reproducible builds, which are
+location independent, but may require an extra command to tell GDB where to
+find the source files.
@item -fvar-tracking
@opindex fvar-tracking
@@ -7063,7 +7079,7 @@ Allow using extensions of later DWARF standard version than selected with
@opindex gno-column-info
Emit location column information into DWARF debugging information, rather
than just file and line.
-This option is disabled by default.
+This option is enabled by default.
@item -gz@r{[}=@var{type}@r{]}
@opindex gz
@@ -7837,7 +7853,7 @@ Use @option{-fno-delete-null-pointer-checks} to disable this optimization
for programs that depend on that behavior.
This option is enabled by default on most targets. On Nios II ELF, it
-defaults to off. On AVR and CR16, this option is completely disabled.
+defaults to off. On AVR, CR16, and MSP430, this option is completely disabled.
Passes that use the dataflow information
are enabled independently at different optimization levels.
@@ -9712,18 +9728,26 @@ file if the target supports arbitrary sections. The name of the
function or the name of the data item determines the section's name
in the output file.
-Use these options on systems where the linker can perform optimizations
-to improve locality of reference in the instruction space. Most systems
-using the ELF object format and SPARC processors running Solaris 2 have
-linkers with such optimizations. AIX may have these optimizations in
-the future.
-
-Only use these options when there are significant benefits from doing
-so. When you specify these options, the assembler and linker
-create larger object and executable files and are also slower.
-You cannot use @command{gprof} on all systems if you
-specify this option, and you may have problems with debugging if
-you specify both this option and @option{-g}.
+Use these options on systems where the linker can perform optimizations to
+improve locality of reference in the instruction space. Most systems using the
+ELF object format have linkers with such optimizations. On AIX, the linker
+rearranges sections (CSECTs) based on the call graph. The performance impact
+varies.
+
+Together with linker garbage collection (the linker @option{--gc-sections}
+option) these options may lead to smaller statically-linked executables (after
+stripping).
+
+On ELF/DWARF systems these options do not degrade the quality of the debug
+information. There could be issues with other object files/debug info formats.
+
+Only use these options when there are significant benefits from doing so. When
+you specify these options, the assembler and linker create larger object and
+executable files and are also slower. These options affect code generation.
+They prevent optimizations by the compiler and assembler using relative
+locations inside a translation unit since the locations are unknown until
+link time. An example of such an optimization is relaxing calls to short call
+instructions.
@item -fbranch-target-load-optimize
@opindex fbranch-target-load-optimize
@@ -10851,9 +10875,9 @@ Link your object files with @option{-lgcov} or @option{-fprofile-arcs}
Run the program on a representative workload to generate the arc profile
information. This may be repeated any number of times. You can run
concurrent instances of your program, and provided that the file system
-supports locking, the data files will be correctly updated. Also
-@code{fork} calls are detected and correctly handled (double counting
-will not happen).
+supports locking, the data files will be correctly updated. Unless
+a strict ISO C dialect option is in effect, @code{fork} calls are
+detected and correctly handled without double counting.
@item
For profile-directed optimizations, compile the source files again with
@@ -11141,6 +11165,15 @@ to verify the referenced object has the correct dynamic type.
This option enables instrumentation of pointer arithmetic. If the pointer
arithmetic overflows, a run-time error is issued.
+@item -fsanitize=builtin
+@opindex fsanitize=builtin
+
+This option enables instrumentation of arguments to selected builtin
+functions. If an invalid value is passed to such arguments, a run-time
+error is issued. E.g.@ passing 0 as the argument to @code{__builtin_ctz}
+or @code{__builtin_clz} invokes undefined behavior and is diagnosed
+by this option.
+
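+For instance (a minimal sketch), the following call is diagnosed at
+run time when the program is built with @option{-fsanitize=builtin}
+and @code{x} happens to be zero:
+
+@smallexample
+int
+count_trailing_zeros (unsigned x)
+@{
+  /* Undefined when x == 0; the sanitizer reports the invalid
+     argument at run time.  */
+  return __builtin_ctz (x);
+@}
+@end smallexample
+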
@end table
While @option{-ftrapv} causes traps for signed overflows to be emitted,
@@ -11401,6 +11434,33 @@ is used to link a program, the GCC driver automatically links
against @file{libmpxwrappers}. See also @option{-static-libmpxwrappers}.
Enabled by default.
+@item -fcf-protection=@r{[}full@r{|}branch@r{|}return@r{|}none@r{]}
+@opindex fcf-protection
+Enable code instrumentation of control-flow transfers to increase
+program security by checking that target addresses of control-flow
+transfer instructions (such as indirect function call, function return,
+indirect jump) are valid. This prevents diverting the flow of control
+to an unexpected target. This is intended to protect against such
+threats as Return-oriented Programming (ROP), and similarly
+call/jmp-oriented programming (COP/JOP).
+
+The value @code{branch} tells the compiler to implement checking of
+validity of control-flow transfer at the point of indirect branch
+instructions, i.e. call/jmp instructions. The value @code{return}
+implements checking of validity at the point of returning from a
+function. The value @code{full} is an alias for specifying both
+@code{branch} and @code{return}. The value @code{none} turns off
+instrumentation.
+
+You can also use the @code{nocf_check} attribute to identify
+which functions and calls should be skipped from instrumentation
+(@pxref{Function Attributes}).
+
+Currently the x86 GNU/Linux target provides an implementation based
+on Intel Control-flow Enforcement Technology (CET). Instrumentation
+for x86 is controlled by target-specific options @option{-mcet},
+@option{-mibt} and @option{-mshstk} (@pxref{x86 Options}).
+
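+As an illustration (a sketch only; the names are made up), the
+@code{nocf_check} attribute can exempt calls through a particular
+function pointer from the instrumentation:
+
+@smallexample
+/* Calls through this pointer are not instrumented even when
+   -fcf-protection=branch is in effect.  */
+void (*legacy_handler) (void) __attribute__ ((nocf_check));
+
+void
+run_legacy (void)
+@{
+  legacy_handler ();
+@}
+@end smallexample
+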
@item -fstack-protector
@opindex fstack-protector
Emit extra code to check for buffer overflows, such as stack smashing
@@ -14279,7 +14339,7 @@ Specify the name of the target processor for which GCC should tune the
performance of the code. Permissible values for this option are:
@samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
@samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
-@samp{exynos-m1}, @samp{falkor}, @samp{qdf24xx},
+@samp{exynos-m1}, @samp{falkor}, @samp{qdf24xx}, @samp{saphira},
@samp{xgene1}, @samp{vulcan}, @samp{thunderx},
@samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
@samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
@@ -14361,6 +14421,9 @@ replacing it with a number selects vector-length specific output.
The possible lengths in the latter case are: 128, 256, 512, 1024
and 2048. @samp{scalable} is the default.
+At present, @samp{-msve-vector-bits=128} produces the same output
+as @samp{-msve-vector-bits=scalable}.
+
@end table
@subsubsection @option{-march} and @option{-mcpu} Feature Modifiers
@@ -14399,6 +14462,8 @@ Enable FP16 extension. This also enables floating-point instructions.
Enable the RcPc extension. This does not change code generation from GCC,
but is passed on to the assembler, enabling inline asm statements to use
instructions from the RcPc extension.
+@item dotprod
+Enable the Dot Product extension. This also enables Advanced SIMD instructions.
@end table
@@ -15639,6 +15704,9 @@ The ARMv8.1 Advanced SIMD and floating-point instructions.
The cryptographic instructions. This also enables the Advanced SIMD and
floating-point instructions.
+@item +dotprod
+Enable the Dot Product extension. This also enables Advanced SIMD instructions.
+
@item +nocrypto
Disable the cryptographic extension.
@@ -15825,6 +15893,9 @@ Permissible names for this option are the same as those for
The following extension options are common to the listed CPUs:
@table @samp
+@item +nodsp
+Disable the DSP instructions on @samp{cortex-m33}.
+
@item +nofp
Disables the floating-point instructions on @samp{arm9e},
@samp{arm946e-s}, @samp{arm966e-s}, @samp{arm968e-s}, @samp{arm10e},
@@ -17717,6 +17788,18 @@ so by default the compiler uses standard reload.
@opindex mnodiv
Do not use div and mod instructions.
+@item -mft32b
+@opindex mft32b
+Enable use of the extended instructions of the FT32B processor.
+
+@item -mcompress
+@opindex mcompress
+Compress all code using the FT32B code compression scheme.
+
+@item -mnopm
+@opindex mnopm
+Do not generate code that reads program memory.
+
@end table
@node FRV Options
@@ -21128,6 +21211,32 @@ GOT data sections. In this case, the 16-bit offset for GP-relative
addressing may not be large enough to allow access to the entire
small data section.
+@item -mgprel-sec=@var{regexp}
+@opindex mgprel-sec
+This option specifies additional section names that can be accessed via
+GP-relative addressing. It is most useful in conjunction with
+@code{section} attributes on variable declarations
+(@pxref{Common Variable Attributes}) and a custom linker script.
+The @var{regexp} is a POSIX Extended Regular Expression.
+
+This option does not affect the behavior of the @option{-G} option, and
+the specified sections are in addition to the standard @code{.sdata}
+and @code{.sbss} small-data sections that are recognized by @option{-mgpopt}.
+
+@item -mr0rel-sec=@var{regexp}
+@opindex mr0rel-sec
+This option specifies names of sections that can be accessed via a
+16-bit offset from @code{r0}; that is, in the low 32K or high 32K
+of the 32-bit address space. It is most useful in conjunction with
+@code{section} attributes on variable declarations
+(@pxref{Common Variable Attributes}) and a custom linker script.
+The @var{regexp} is a POSIX Extended Regular Expression.
+
+In contrast to the use of GP-relative addressing for small data,
+zero-based addressing is never generated by default and there are no
+conventional section names used in standard linker scripts for sections
+in the low or high areas of memory.
+
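+As a purely illustrative sketch (the section name and regular
+expression are made up), a variable can be placed in a custom section
+and then matched by @option{-mgprel-sec}:
+
+@smallexample
+/* Compiled with e.g. -mgprel-sec="\.frequent\..*" and laid out by a
+   matching custom linker script.  */
+int hit_count __attribute__ ((section (".frequent.data")));
+@end smallexample
+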
@item -mel
@itemx -meb
@opindex mel
@@ -21631,9 +21740,26 @@ When generating PIC code, allow the use of PLTs. Ignored for non-PIC.
@item -mabi=@var{ABI-string}
@opindex mabi
-Specify integer and floating-point calling convention. This defaults to the
-natural calling convention: e.g.@ LP64 for RV64I, ILP32 for RV32I, LP64D for
-RV64G.
+Specify integer and floating-point calling convention. @var{ABI-string}
+contains two parts: the size of integer types and the registers used for
+floating-point types. For example @samp{-march=rv64ifd -mabi=lp64d} means that
+@samp{long} and pointers are 64-bit (implicitly defining @samp{int} to be
+32-bit), and that floating-point values up to 64 bits wide are passed in F
+registers. Contrast this with @samp{-march=rv64ifd -mabi=lp64f}, which still
+allows the compiler to generate code that uses the F and D extensions but only
+allows floating-point values up to 32 bits long to be passed in registers; or
+@samp{-march=rv64ifd -mabi=lp64}, in which no floating-point arguments will be
+passed in registers.
+
+The default for this argument is system dependent; users who want a specific
+calling convention should specify one explicitly. The valid calling
+conventions are: @samp{ilp32}, @samp{ilp32f}, @samp{ilp32d}, @samp{lp64},
+@samp{lp64f}, and @samp{lp64d}. Some calling conventions are impossible to
+implement on some ISAs: for example, @samp{-march=rv32if -mabi=ilp32d} is
+invalid because the ABI requires 64-bit values be passed in F registers, but F
+registers are only 32 bits wide.
@item -mfdiv
@itemx -mno-fdiv
@@ -21671,9 +21797,18 @@ Use smaller but slower prologue and epilogue code.
@opindex mstrict-align
Do not generate unaligned memory accesses.
-@item -mcmodel=@var{code-model}
-@opindex mcmodel
-Specify the code model.
+@item -mcmodel=medlow
+@opindex mcmodel=medlow
+Generate code for the medium-low code model. The program and its statically
+defined symbols must lie within a single 2 GiB address range and must lie
+between absolute addresses @minus{}2 GiB and +2 GiB. Programs can be
+statically or dynamically linked. This is the default code model.
+
+@item -mcmodel=medany
+@opindex mcmodel=medany
+Generate code for the medium-any code model. The program and its statically
+defined symbols must be within any single 2 GiB address range. Programs can be
+statically or dynamically linked.
@end table
@@ -22576,12 +22711,18 @@ Disable Book-E SPE ABI extensions for the current ABI@.
@item -mabi=ibmlongdouble
@opindex mabi=ibmlongdouble
Change the current ABI to use IBM extended-precision long double.
-This is a PowerPC 32-bit SYSV ABI option.
+This is not likely to work if your system defaults to using IEEE
+extended-precision long double. If you change the long double type
+from IEEE extended-precision, the compiler will issue a warning unless
+you use the @option{-Wno-psabi} option.
@item -mabi=ieeelongdouble
@opindex mabi=ieeelongdouble
Change the current ABI to use IEEE extended-precision long double.
-This is a PowerPC 32-bit Linux ABI option.
+This is not likely to work if your system defaults to using IBM
+extended-precision long double. If you change the long double type
+from IBM extended-precision, the compiler will issue a warning unless
+you use the @option{-Wno-psabi} option.
@item -mabi=elfv1
@opindex mabi=elfv1
@@ -25821,15 +25962,19 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
@need 200
@itemx -mclzero
@opindex mclzero
+@need 200
@itemx -mpku
@opindex mpku
+@need 200
+@itemx -mcet
+@opindex mcet
These switches enable the use of instructions in the MMX, SSE,
SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, AVX512F, AVX512PF, AVX512ER, AVX512CD,
SHA, AES, PCLMUL, FSGSBASE, RDRND, F16C, FMA, SSE4A, FMA4, XOP, LWP, ABM,
AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA, AVX512VBMI, BMI, BMI2, FXSR,
-XSAVE, XSAVEOPT, LZCNT, RTM, MPX, MWAITX, PKU, 3DNow!@: or enhanced 3DNow!@:
-extended instruction sets. Each has a corresponding @option{-mno-} option
-to disable use of these instructions.
+XSAVE, XSAVEOPT, LZCNT, RTM, MPX, MWAITX, PKU, IBT, SHSTK,
+3DNow!@: or enhanced 3DNow!@: extended instruction sets. Each has a
+corresponding @option{-mno-} option to disable use of these instructions.
These extensions are also available as built-in functions: see
@ref{x86 Built-in Functions}, for details of the functions enabled and
@@ -25849,6 +25994,13 @@ supported architecture, using the appropriate flags. In particular,
the file containing the CPU detection code should be compiled without
these options.
+The @option{-mcet} option turns on the @option{-mibt} and @option{-mshstk}
+options. The @option{-mibt} option enables indirect branch tracking support
+and the @option{-mshstk} option enables shadow stack support from
+Intel Control-flow Enforcement Technology (CET). The compiler also provides
+a number of built-in functions for fine-grained control in a CET-based
+application. @xref{x86 Built-in Functions}, for more information.
+
@item -mdump-tune-features
@opindex mdump-tune-features
This option instructs GCC to dump the names of the x86 performance
@@ -25927,6 +26079,24 @@ see @ref{Other Builtins} for details.
This option enables use of the @code{movbe} instruction to implement
@code{__builtin_bswap32} and @code{__builtin_bswap64}.
+@item -mibt
+@opindex mibt
+This option tells the compiler to use indirect branch tracking support
+(for indirect calls and jumps) from x86 Control-flow Enforcement
+Technology (CET). The option has effect only if the
+@option{-fcf-protection=full} or @option{-fcf-protection=branch} option
+is specified. The option @option{-mibt} is on by default when the
+@option{-mcet} option is specified.
+
+@item -mshstk
+@opindex mshstk
+This option tells the compiler to use shadow stack support (return
+address tracking) from x86 Control-flow Enforcement Technology (CET).
+The option has effect only if the @option{-fcf-protection=full} or
+@option{-fcf-protection=return} option is specified. The option
+@option{-mshstk} is on by default when the @option{-mcet} option is
+specified.
+
@item -mcrc32
@opindex mcrc32
This option enables built-in functions @code{__builtin_ia32_crc32qi},
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 0aa8ec8812c..e4fed29a95b 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6896,6 +6896,15 @@ scheduler and other passes from moving instructions and using register
equivalences across the boundary defined by the blockage insn.
This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM.
+@cindex @code{memory_blockage} instruction pattern
+@item @samp{memory_blockage}
+This pattern, if defined, represents a compiler memory barrier, and will be
+placed at points across which RTL passes may not propagate memory accesses.
+This instruction needs to read and write volatile BLKmode memory. It does
+not need to generate any machine instruction. If this pattern is not defined,
+the compiler falls back to emitting an instruction corresponding
+to @code{asm volatile ("" ::: "memory")}.
+
@cindex @code{memory_barrier} instruction pattern
@item @samp{memory_barrier}
If the target memory model is not fully synchronous, then this pattern
diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index 93e70ee6406..2bf786c2568 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -981,11 +981,10 @@ these files.
This is run after final because it must output the stack slot offsets
for pseudo registers that did not get hard registers. Source files
-are @file{dbxout.c} for DBX symbol table format, @file{sdbout.c} for
-SDB symbol table format, @file{dwarfout.c} for DWARF symbol table
-format, files @file{dwarf2out.c} and @file{dwarf2asm.c} for DWARF2
-symbol table format, and @file{vmsdbgout.c} for VMS debug symbol table
-format.
+are @file{dbxout.c} for DBX symbol table format, @file{dwarfout.c} for
+DWARF symbol table format, files @file{dwarf2out.c} and @file{dwarf2asm.c}
+for DWARF2 symbol table format, and @file{vmsdbgout.c} for VMS debug
+symbol table format.
@end itemize
diff --git a/gcc/doc/poly-int.texi b/gcc/doc/poly-int.texi
index 851a86a431c..9519b620274 100644
--- a/gcc/doc/poly-int.texi
+++ b/gcc/doc/poly-int.texi
@@ -256,20 +256,6 @@ must_gt (@var{a}, @var{b}) == !may_le (@var{a}, @var{b})
must_ne (@var{a}, @var{b}) == !may_eq (@var{a}, @var{b})
@end example
-There are also helper functions for certain common operations:
-
-@example
-known_zero (@var{a}) == must_eq (@var{a}, 0)
-maybe_zero (@var{a}) == may_eq (@var{a}, 0)
-known_nonzero (@var{a}) == must_ne (@var{a}, 0)
-maybe_nonzero (@var{a}) == may_ne (@var{a}, 0)
-known_one (@var{a}) == must_eq (@var{a}, 1)
-known_all_ones (@var{a}) == must_eq (@var{a}, -1)
-@end example
-
-Using these helper functions removes the need to distinguish between
-signed and unsigned constants, such as @samp{0} and @samp{0U}.
-
@node Properties of the @code{poly_int} comparisons
@subsection Properties of the @code{poly_int} comparisons
diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 2156daf64a8..f583940b944 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -1599,7 +1599,12 @@ compute integer values.
@findex MAX_BITSIZE_MODE_ANY_MODE
@item MAX_BITSIZE_MODE_ANY_MODE
-The bitsize of the largest mode on the target.
+The bitsize of the largest mode on the target. The default value is
+the largest mode size given in the mode definition file, which is
+always correct for targets whose modes have a fixed size. Targets
+that might increase the size of a mode beyond this default should define
+@code{MAX_BITSIZE_MODE_ANY_MODE} to the actual upper limit in
+@file{@var{machine}-modes.def}.
@end table
@findex byte_mode
@@ -4189,6 +4194,22 @@ is used in place of the actual insn pattern. This is done in cases where
the pattern is either complex or misleading.
@end table
+The note @code{REG_CALL_NOCF_CHECK} is used in conjunction with the
+@option{-fcf-protection=branch} option. The note is set if a
+@code{nocf_check} attribute is specified for a function type or a
+pointer to function type. The note is stored in the @code{REG_NOTES}
+field of an insn.
+
+@table @code
+@findex REG_CALL_NOCF_CHECK
+@item REG_CALL_NOCF_CHECK
+Users have control through the @code{nocf_check} attribute to identify
+which calls to a function should be skipped from control-flow instrumentation
+when the option @option{-fcf-protection=branch} is specified. The compiler
+puts a @code{REG_CALL_NOCF_CHECK} note on each @code{CALL_INSN} instruction
+that has a function type marked with a @code{nocf_check} attribute.
+@end table
+
For convenience, the machine mode in an @code{insn_list} or
@code{expr_list} is printed using these symbolic codes in debugging dumps.
@@ -4308,6 +4329,20 @@ There is only one @code{pc} expression.
@item
There is only one @code{cc0} expression.
+@cindex @code{const}, RTL sharing
+@item
+There is only one instance of the following structures for a given
+@var{m}, @var{x} and @var{y}:
+@example
+(const:@var{m} (vec_duplicate:@var{m} @var{x}))
+(const:@var{m} (vec_series:@var{m} @var{x} @var{y}))
+@end example
+This means, for example, that for a given @var{n} there is only ever a
+single instance of an expression like:
+@example
+(const:V@var{n}DI (vec_duplicate:V@var{n}DI (const_int 0)))
+@end example
+
@cindex @code{const_double}, RTL sharing
@item
There is only one @code{const_double} expression with value 0 for
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index ff4ba5751d7..390bfcacccd 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1373,6 +1373,10 @@ Target supports Fortran @code{real} kinds larger than @code{real(8)}.
@subsubsection Vector-specific attributes
@table @code
+@item vect_align_stack_vars
+The target's ABI allows stack variables to be aligned to the preferred
+vector alignment.
+
@item vect_condition
Target supports vector conditional operations.
@@ -1383,6 +1387,10 @@ have different type from the value operands.
@item vect_double
Target supports hardware vectors of @code{double}.
+@item vect_element_align_preferred
+The target's preferred vector alignment is the same as the element
+alignment.
+
@item vect_float
Target supports hardware vectors of @code{float}.
@@ -1395,6 +1403,9 @@ Target supports hardware vectors of @code{long}.
@item vect_long_long
Target supports hardware vectors of @code{long long}.
+@item vect_masked_store
+Target supports vector masked stores.
+
@item vect_aligned_arrays
Target aligns arrays to vector alignment boundary.
@@ -1448,9 +1459,43 @@ element types.
@item vect_perm
Target supports vector permutation.
+@item vect_perm_byte
+Target supports permutation of vectors with 8-bit elements.
+
+@item vect_perm_short
+Target supports permutation of vectors with 16-bit elements.
+
+@item vect_perm3_byte
+Target supports permutation of vectors with 8-bit elements, and for the
+default vector length it is possible to permute:
+@example
+@{ a0, a1, a2, b0, b1, b2, @dots{} @}
+@end example
+to:
+@example
+@{ a0, a0, a0, b0, b0, b0, @dots{} @}
+@{ a1, a1, a1, b1, b1, b1, @dots{} @}
+@{ a2, a2, a2, b2, b2, b2, @dots{} @}
+@end example
+using only two-vector permutes, regardless of how long the sequence is.
+
+@item vect_perm3_int
+Like @code{vect_perm3_byte}, but for 32-bit elements.
+
+@item vect_perm3_short
+Like @code{vect_perm3_byte}, but for 16-bit elements.
+
@item vect_shift
Target supports a hardware vector shift operation.
+@item vect_unaligned_possible
+Target prefers vectors to have an alignment greater than element
+alignment, but also allows unaligned vector accesses in some
+circumstances.
+
+@item vect_variable_length
+Target has variable-length vectors.
+
@item vect_widen_sum_hi_to_si
Target supports a vector widening summation of @code{short} operands
into @code{int} results, or can promote (unpack) from @code{short}
@@ -1705,6 +1750,17 @@ ARM target supports executing instructions from ARMv8.2 with the FP16
extension. Some multilibs may be incompatible with these options.
Implies arm_v8_2a_fp16_neon_ok and arm_v8_2a_fp16_scalar_hw.
+@item arm_v8_2a_dotprod_neon_ok
+@anchor{arm_v8_2a_dotprod_neon_ok}
+ARM target supports options to generate instructions from ARMv8.2 with
+the Dot Product extension. Some multilibs may be incompatible with these
+options.
+
+@item arm_v8_2a_dotprod_neon_hw
+ARM target supports executing instructions from ARMv8.2 with the Dot
+Product extension. Some multilibs may be incompatible with these options.
+Implies arm_v8_2a_dotprod_neon_ok.
+
@item arm_prefer_ldrd_strd
ARM target prefers @code{LDRD} and @code{STRD} instructions over
@code{LDM} and @code{STM} instructions.
@@ -2311,6 +2367,11 @@ supported by the target; see the
@ref{arm_v8_2a_fp16_neon_ok,,arm_v8_2a_fp16_neon_ok} effective target
keyword.
+@item arm_v8_2a_dotprod_neon
+Add options for ARMv8.2 with Adv.SIMD Dot Product support, if this is
+supported by the target; see the
+@ref{arm_v8_2a_dotprod_neon_ok} effective target keyword.
+
@item bind_pic_locally
Add the target-specific flags needed to enable functions to bind
locally when using pic/PIC passes in the testsuite.
@@ -2361,6 +2422,9 @@ Skip the test if the target does not support the @code{-fstack-check}
option. If @var{check} is @code{""}, support for @code{-fstack-check}
is checked, for @code{-fstack-check=("@var{check}")} otherwise.
+@item dg-require-stack-size @var{size}
+Skip the test if the target does not support a stack size of @var{size}.
+
@item dg-require-visibility @var{vis}
Skip the test if the target does not support the @code{visibility} attribute.
If @var{vis} is @code{""}, support for @code{visibility("hidden")} is
diff --git a/gcc/doc/standards.texi b/gcc/doc/standards.texi
index d4112b37863..a40899dba85 100644
--- a/gcc/doc/standards.texi
+++ b/gcc/doc/standards.texi
@@ -36,6 +36,8 @@ with some exceptions, and possibly with some extensions.
@cindex C11
@cindex ISO C1X
@cindex C1X
+@cindex ISO C17
+@cindex C17
@cindex Technical Corrigenda
@cindex TC1
@cindex Technical Corrigendum 1
@@ -100,7 +102,11 @@ in 2011 as ISO/IEC 9899:2011. (While in development, drafts of this
standard version were referred to as @dfn{C1X}.)
GCC has substantially complete support
for this standard, enabled with @option{-std=c11} or
-@option{-std=iso9899:2011}.
+@option{-std=iso9899:2011}. A version with corrections integrated is
+known as @dfn{C17} and is supported with @option{-std=c17} or
+@option{-std=iso9899:2017}; the corrections are also applied with
+@option{-std=c11}, and the only difference between the options is the
+value of @code{__STDC_VERSION__}.
By default, GCC provides some extensions to the C language that, on
rare occasions conflict with the C standard. @xref{C
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index a6bdb8ff277..acadc73667e 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4313,6 +4313,17 @@ ISO/IEC TS 18661-3:2015; that is, @var{n} is one of 32, 64, 128, or,
if @var{extended} is false, 16 or greater than 128 and a multiple of 32.
@end deftypefn
+@deftypefn {Target Hook} bool TARGET_FLOATN_BUILTIN_P (int @var{func})
+Define this to return true if the @code{_Float@var{n}} and
+@code{_Float@var{n}x} built-in functions should implicitly enable the
+built-in function without the @code{__builtin_} prefix in addition to the
+normal built-in function with the @code{__builtin_} prefix. The default is
+to only enable built-in functions without the @code{__builtin_} prefix for
+the GNU C language. In strict ANSI/ISO mode, the built-in function without
+the @code{__builtin_} prefix is not enabled. The argument @code{FUNC} is the
+@code{enum built_in_function} id of the function to be enabled.
+@end deftypefn
+
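+A purely illustrative sketch of a definition in a target's
+@file{@var{machine}.c} file (the function name is hypothetical) might
+look like:
+
+@smallexample
+/* Always expose the unprefixed _FloatN built-ins, even in strict
+   ISO mode.  */
+static bool
+example_floatn_builtin_p (int func ATTRIBUTE_UNUSED)
+@{
+  return true;
+@}
+
+#undef TARGET_FLOATN_BUILTIN_P
+#define TARGET_FLOATN_BUILTIN_P example_floatn_builtin_p
+@end smallexample
+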
@deftypefn {Target Hook} bool TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P (machine_mode @var{mode})
Define this to return nonzero for machine modes for which the port has
small register classes. If this target hook returns nonzero for a given
@@ -7744,7 +7755,7 @@ for the file format in use is appropriate.
@end defmac
@deftypefn {Target Hook} void TARGET_ASM_OUTPUT_SOURCE_FILENAME (FILE *@var{file}, const char *@var{name})
-Output COFF information or DWARF debugging information which indicates that filename @var{name} is the current source file to the stdio stream @var{file}.
+Output DWARF debugging information which indicates that filename @var{name} is the current source file to the stdio stream @var{file}.
This target hook need not be defined if the standard form of output for the file format in use is appropriate.
@end deftypefn
@@ -9392,7 +9403,7 @@ This macro need only be defined if the target might save registers in the
function prologue at an offset to the stack pointer that is not aligned to
@code{UNITS_PER_WORD}. The definition should be the negative minimum
alignment if @code{STACK_GROWS_DOWNWARD} is true, and the positive
-minimum alignment otherwise. @xref{SDB and DWARF}. Only applicable if
+minimum alignment otherwise. @xref{DWARF}. Only applicable if
the target supports DWARF 2 frame unwind information.
@end defmac
@@ -9566,7 +9577,7 @@ This describes how to specify debugging information.
* DBX Options:: Macros enabling specific options in DBX format.
* DBX Hooks:: Hook macros for varying DBX format.
* File Names and DBX:: Macros controlling output of file names in DBX format.
-* SDB and DWARF:: Macros for SDB (COFF) and DWARF formats.
+* DWARF:: Macros for DWARF format.
* VMS Debug:: Macros for VMS debug format.
@end menu
@@ -9600,9 +9611,8 @@ A C expression that returns the integer offset value for an automatic
variable having address @var{x} (an RTL expression). The default
computation assumes that @var{x} is based on the frame-pointer and
gives the offset from the frame-pointer. This is required for targets
-that produce debugging output for DBX or COFF-style debugging output
-for SDB and allow the frame-pointer to be eliminated when the
-@option{-g} options is used.
+that produce debugging output for DBX and allow the frame-pointer to be
+eliminated when the @option{-g} option is used.
@end defmac
@defmac DEBUGGER_ARG_OFFSET (@var{offset}, @var{x})
@@ -9616,7 +9626,7 @@ A C expression that returns the type of debugging output GCC should
produce when the user specifies just @option{-g}. Define
this if you have arranged for GCC to support more than one format of
debugging output. Currently, the allowable values are @code{DBX_DEBUG},
-@code{SDB_DEBUG}, @code{DWARF_DEBUG}, @code{DWARF2_DEBUG},
+@code{DWARF_DEBUG}, @code{DWARF2_DEBUG},
@code{XCOFF_DEBUG}, @code{VMS_DEBUG}, and @code{VMS_AND_DWARF2_DEBUG}.
When the user specifies @option{-ggdb}, GCC normally also uses the
@@ -9627,7 +9637,7 @@ defined, GCC uses @code{DBX_DEBUG}.
The value of this macro only affects the default debugging output; the
user can always get a specific type of output by using @option{-gstabs},
-@option{-gcoff}, @option{-gdwarf-2}, @option{-gxcoff}, or @option{-gvms}.
+@option{-gdwarf-2}, @option{-gxcoff}, or @option{-gvms}.
@end defmac
@node DBX Options
@@ -9845,16 +9855,11 @@ whose value is the highest absolute text address in the file.
@end defmac
@need 2000
-@node SDB and DWARF
-@subsection Macros for SDB and DWARF Output
+@node DWARF
+@subsection Macros for DWARF Output
@c prevent bad page break with this line
-Here are macros for SDB and DWARF output.
-
-@defmac SDB_DEBUGGING_INFO
-Define this macro to 1 if GCC should produce COFF-style debugging output
-for SDB in response to the @option{-g} option.
-@end defmac
+Here are macros for DWARF output.
@defmac DWARF2_DEBUGGING_INFO
Define this macro if GCC should produce dwarf version 2 format
@@ -9960,40 +9965,6 @@ If defined, this target hook is a function which outputs a DTP-relative
reference to the given TLS symbol of the specified size.
@end deftypefn
-@defmac PUT_SDB_@dots{}
-Define these macros to override the assembler syntax for the special
-SDB assembler directives. See @file{sdbout.c} for a list of these
-macros and their arguments. If the standard syntax is used, you need
-not define them yourself.
-@end defmac
-
-@defmac SDB_DELIM
-Some assemblers do not support a semicolon as a delimiter, even between
-SDB assembler directives. In that case, define this macro to be the
-delimiter to use (usually @samp{\n}). It is not necessary to define
-a new set of @code{PUT_SDB_@var{op}} macros if this is the only change
-required.
-@end defmac
-
-@defmac SDB_ALLOW_UNKNOWN_REFERENCES
-Define this macro to allow references to unknown structure,
-union, or enumeration tags to be emitted. Standard COFF does not
-allow handling of unknown references, MIPS ECOFF has support for
-it.
-@end defmac
-
-@defmac SDB_ALLOW_FORWARD_REFERENCES
-Define this macro to allow references to structure, union, or
-enumeration tags that have not yet been seen to be handled. Some
-assemblers choke if forward tags are used, while some require it.
-@end defmac
-
-@defmac SDB_OUTPUT_SOURCE_LINE (@var{stream}, @var{line})
-A C statement to output SDB debugging information before code for line
-number @var{line} of the current source file to the stdio stream
-@var{stream}. The default is to emit an @code{.ln} directive.
-@end defmac
-
@need 2000
@node VMS Debug
@subsection Macros for VMS Debug Format
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index b5e2771a831..7cbce20b877 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3330,6 +3330,8 @@ stack.
@hook TARGET_FLOATN_MODE
+@hook TARGET_FLOATN_BUILTIN_P
+
@hook TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P
@node Scalar Return
@@ -6458,7 +6460,7 @@ This macro need only be defined if the target might save registers in the
function prologue at an offset to the stack pointer that is not aligned to
@code{UNITS_PER_WORD}. The definition should be the negative minimum
alignment if @code{STACK_GROWS_DOWNWARD} is true, and the positive
-minimum alignment otherwise. @xref{SDB and DWARF}. Only applicable if
+minimum alignment otherwise. @xref{DWARF}. Only applicable if
the target supports DWARF 2 frame unwind information.
@end defmac
@@ -6582,7 +6584,7 @@ This describes how to specify debugging information.
* DBX Options:: Macros enabling specific options in DBX format.
* DBX Hooks:: Hook macros for varying DBX format.
* File Names and DBX:: Macros controlling output of file names in DBX format.
-* SDB and DWARF:: Macros for SDB (COFF) and DWARF formats.
+* DWARF:: Macros for DWARF format.
* VMS Debug:: Macros for VMS debug format.
@end menu
@@ -6616,9 +6618,8 @@ A C expression that returns the integer offset value for an automatic
variable having address @var{x} (an RTL expression). The default
computation assumes that @var{x} is based on the frame-pointer and
gives the offset from the frame-pointer. This is required for targets
-that produce debugging output for DBX or COFF-style debugging output
-for SDB and allow the frame-pointer to be eliminated when the
-@option{-g} options is used.
+that produce debugging output for DBX and allow the frame-pointer to be
+eliminated when the @option{-g} option is used.
@end defmac
@defmac DEBUGGER_ARG_OFFSET (@var{offset}, @var{x})
@@ -6632,7 +6633,7 @@ A C expression that returns the type of debugging output GCC should
produce when the user specifies just @option{-g}. Define
this if you have arranged for GCC to support more than one format of
debugging output. Currently, the allowable values are @code{DBX_DEBUG},
-@code{SDB_DEBUG}, @code{DWARF_DEBUG}, @code{DWARF2_DEBUG},
+@code{DWARF_DEBUG}, @code{DWARF2_DEBUG},
@code{XCOFF_DEBUG}, @code{VMS_DEBUG}, and @code{VMS_AND_DWARF2_DEBUG}.
When the user specifies @option{-ggdb}, GCC normally also uses the
@@ -6643,7 +6644,7 @@ defined, GCC uses @code{DBX_DEBUG}.
The value of this macro only affects the default debugging output; the
user can always get a specific type of output by using @option{-gstabs},
-@option{-gcoff}, @option{-gdwarf-2}, @option{-gxcoff}, or @option{-gvms}.
+@option{-gdwarf-2}, @option{-gxcoff}, or @option{-gvms}.
@end defmac
@node DBX Options
@@ -6861,16 +6862,11 @@ whose value is the highest absolute text address in the file.
@end defmac
@need 2000
-@node SDB and DWARF
-@subsection Macros for SDB and DWARF Output
+@node DWARF
+@subsection Macros for DWARF Output
@c prevent bad page break with this line
-Here are macros for SDB and DWARF output.
-
-@defmac SDB_DEBUGGING_INFO
-Define this macro to 1 if GCC should produce COFF-style debugging output
-for SDB in response to the @option{-g} option.
-@end defmac
+Here are macros for DWARF output.
@defmac DWARF2_DEBUGGING_INFO
Define this macro if GCC should produce dwarf version 2 format
@@ -6946,40 +6942,6 @@ is referenced by a function.
@hook TARGET_ASM_OUTPUT_DWARF_DTPREL
-@defmac PUT_SDB_@dots{}
-Define these macros to override the assembler syntax for the special
-SDB assembler directives. See @file{sdbout.c} for a list of these
-macros and their arguments. If the standard syntax is used, you need
-not define them yourself.
-@end defmac
-
-@defmac SDB_DELIM
-Some assemblers do not support a semicolon as a delimiter, even between
-SDB assembler directives. In that case, define this macro to be the
-delimiter to use (usually @samp{\n}). It is not necessary to define
-a new set of @code{PUT_SDB_@var{op}} macros if this is the only change
-required.
-@end defmac
-
-@defmac SDB_ALLOW_UNKNOWN_REFERENCES
-Define this macro to allow references to unknown structure,
-union, or enumeration tags to be emitted. Standard COFF does not
-allow handling of unknown references, MIPS ECOFF has support for
-it.
-@end defmac
-
-@defmac SDB_ALLOW_FORWARD_REFERENCES
-Define this macro to allow references to structure, union, or
-enumeration tags that have not yet been seen to be handled. Some
-assemblers choke if forward tags are used, while some require it.
-@end defmac
-
-@defmac SDB_OUTPUT_SOURCE_LINE (@var{stream}, @var{line})
-A C statement to output SDB debugging information before code for line
-number @var{line} of the current source file to the stdio stream
-@var{stream}. The default is to emit an @code{.ln} directive.
-@end defmac
-
@need 2000
@node VMS Debug
@subsection Macros for VMS Debug Format
diff --git a/gcc/dojump.c b/gcc/dojump.c
index 97c4100b663..9efc4a465ba 100644
--- a/gcc/dojump.c
+++ b/gcc/dojump.c
@@ -89,7 +89,7 @@ do_pending_stack_adjust (void)
{
if (inhibit_defer_pop == 0)
{
- if (maybe_nonzero (pending_stack_adjust))
+ if (may_ne (pending_stack_adjust, 0))
adjust_stack (gen_int_mode (pending_stack_adjust, Pmode));
pending_stack_adjust = 0;
}
diff --git a/gcc/dse.c b/gcc/dse.c
index 90ec76a36f7..8ff2137876f 100644
--- a/gcc/dse.c
+++ b/gcc/dse.c
@@ -1492,7 +1492,7 @@ record_store (rtx body, bb_info_t bb_info)
group_info *group = rtx_group_vec[group_id];
mem_addr = group->canon_base_addr;
}
- if (maybe_nonzero (offset))
+ if (may_ne (offset, 0))
mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
while (ptr)
@@ -1827,7 +1827,7 @@ get_stored_val (store_info *store_info, machine_mode read_mode,
else
gap = read_offset - store_info->offset;
- if (maybe_nonzero (gap))
+ if (may_ne (gap, 0))
{
poly_int64 shift = gap * BITS_PER_UNIT;
poly_int64 access_size = GET_MODE_SIZE (read_mode) + gap;
@@ -2099,7 +2099,7 @@ check_mem_read_rtx (rtx *loc, bb_info_t bb_info)
group_info *group = rtx_group_vec[group_id];
mem_addr = group->canon_base_addr;
}
- if (maybe_nonzero (offset))
+ if (may_ne (offset, 0))
mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
if (group_id >= 0)
diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index 82e6571964b..ce1f1a21124 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -793,14 +793,17 @@ def_cfa_0 (dw_cfa_location *old_cfa, dw_cfa_location *new_cfa)
cfi->dw_cfi_opc = DW_CFA_def_cfa_offset;
cfi->dw_cfi_oprnd1.dw_cfi_offset = const_offset;
}
- else if (must_eq (new_cfa->offset, old_cfa->offset)
+ else if (new_cfa->offset.is_constant ()
+ && must_eq (new_cfa->offset, old_cfa->offset)
&& old_cfa->reg != INVALID_REGNUM
&& !new_cfa->indirect
&& !old_cfa->indirect)
{
/* Construct a "DW_CFA_def_cfa_register <register>" instruction,
indicating the CFA register has changed to <register> but the
- offset has not changed. */
+ offset has not changed. This requires the old CFA to have
+ been set as a register plus offset rather than a general
+ DW_CFA_def_cfa_expression. */
cfi->dw_cfi_opc = DW_CFA_def_cfa_register;
cfi->dw_cfi_oprnd1.dw_cfi_reg_num = new_cfa->reg;
}
@@ -937,7 +940,7 @@ notice_args_size (rtx_insn *insn)
args_size = get_args_size (note);
delta = args_size - cur_trace->end_true_args_size;
- if (known_zero (delta))
+ if (must_eq (delta, 0))
return;
cur_trace->end_true_args_size = args_size;
@@ -1946,7 +1949,7 @@ dwarf2out_frame_debug_expr (rtx expr)
{
/* We're storing the current CFA reg into the stack. */
- if (known_zero (cur_cfa->offset))
+ if (must_eq (cur_cfa->offset, 0))
{
/* Rule 19 */
/* If stack is aligned, putting CFA reg into stack means
@@ -2358,7 +2361,7 @@ maybe_record_trace_start_abnormal (rtx_insn *start, rtx_insn *origin)
dw_cfa_location save_cfa;
save_args_size = cur_trace->end_true_args_size;
- if (known_zero (save_args_size))
+ if (must_eq (save_args_size, 0))
{
maybe_record_trace_start (start, origin);
return;
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 82b46169a18..c0f93d763f5 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -1489,7 +1489,7 @@ loc_descr_plus_const (dw_loc_descr_ref *list_head, poly_int64 poly_offset)
gcc_assert (*list_head != NULL);
- if (known_zero (poly_offset))
+ if (must_eq (poly_offset, 0))
return;
/* Find the end of the chain. */
@@ -5381,6 +5381,16 @@ splice_child_die (dw_die_ref parent, dw_die_ref child)
reparent_child (child, parent);
}
+/* Create and return a new die with TAG_VALUE as tag. */
+
+static inline dw_die_ref
+new_die_raw (enum dwarf_tag tag_value)
+{
+ dw_die_ref die = ggc_cleared_alloc<die_node> ();
+ die->die_tag = tag_value;
+ return die;
+}
+
/* Create and return a new die with a parent of PARENT_DIE. If
PARENT_DIE is NULL, the new DIE is placed in limbo and an
associated tree T must be supplied to determine parenthood
@@ -5389,9 +5399,7 @@ splice_child_die (dw_die_ref parent, dw_die_ref child)
static inline dw_die_ref
new_die (enum dwarf_tag tag_value, dw_die_ref parent_die, tree t)
{
- dw_die_ref die = ggc_cleared_alloc<die_node> ();
-
- die->die_tag = tag_value;
+ dw_die_ref die = new_die_raw (tag_value);
if (parent_die != NULL)
add_child_die (parent_die, die);
@@ -5585,8 +5593,7 @@ add_AT_external_die_ref (dw_die_ref die, enum dwarf_attribute attr_kind,
{
/* Create a fake DIE that contains the reference. Don't use
new_die because we don't want to end up in the limbo list. */
- dw_die_ref ref = ggc_cleared_alloc<die_node> ();
- ref->die_tag = die->die_tag;
+ dw_die_ref ref = new_die_raw (die->die_tag);
ref->die_id.die_symbol = IDENTIFIER_POINTER (get_identifier (symbol));
ref->die_offset = offset;
ref->with_offset = 1;
@@ -7727,13 +7734,10 @@ should_move_die_to_comdat (dw_die_ref die)
static dw_die_ref
clone_die (dw_die_ref die)
{
- dw_die_ref clone;
+ dw_die_ref clone = new_die_raw (die->die_tag);
dw_attr_node *a;
unsigned ix;
- clone = ggc_cleared_alloc<die_node> ();
- clone->die_tag = die->die_tag;
-
FOR_EACH_VEC_SAFE_ELT (die->die_attr, ix, a)
add_dwarf_attr (clone, a);
@@ -7777,8 +7781,7 @@ clone_as_declaration (dw_die_ref die)
return clone;
}
- clone = ggc_cleared_alloc<die_node> ();
- clone->die_tag = die->die_tag;
+ clone = new_die_raw (die->die_tag);
FOR_EACH_VEC_SAFE_ELT (die->die_attr, ix, a)
{
@@ -12105,9 +12108,6 @@ base_type_die (tree type, bool reverse)
struct fixed_point_type_info fpt_info;
tree type_bias = NULL_TREE;
- if (TREE_CODE (type) == ERROR_MARK || TREE_CODE (type) == VOID_TYPE)
- return 0;
-
/* If this is a subtype that should not be emitted as a subrange type,
use the base type. See subrange_type_for_debug_p. */
if (TREE_CODE (type) == INTEGER_TYPE && TREE_TYPE (type) != NULL_TREE)
@@ -12200,7 +12200,7 @@ base_type_die (tree type, bool reverse)
gcc_unreachable ();
}
- base_type_result = new_die (DW_TAG_base_type, comp_unit_die (), type);
+ base_type_result = new_die_raw (DW_TAG_base_type);
add_AT_unsigned (base_type_result, DW_AT_byte_size,
int_size_in_bytes (type));
@@ -12256,8 +12256,6 @@ base_type_die (tree type, bool reverse)
| dw_scalar_form_reference,
NULL);
- add_pubtype (type, base_type_result);
-
return base_type_result;
}
@@ -12285,8 +12283,6 @@ is_base_type (tree type)
{
switch (TREE_CODE (type))
{
- case ERROR_MARK:
- case VOID_TYPE:
case INTEGER_TYPE:
case REAL_TYPE:
case FIXED_POINT_TYPE:
@@ -12295,6 +12291,7 @@ is_base_type (tree type)
case POINTER_BOUNDS_TYPE:
return 1;
+ case VOID_TYPE:
case ARRAY_TYPE:
case RECORD_TYPE:
case UNION_TYPE:
@@ -12500,6 +12497,8 @@ modified_type_die (tree type, int cv_quals, bool reverse,
/* Only these cv-qualifiers are currently handled. */
const int cv_qual_mask = (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE
| TYPE_QUAL_RESTRICT | TYPE_QUAL_ATOMIC);
+ const bool reverse_base_type
+ = need_endianity_attribute_p (reverse) && is_base_type (type);
if (code == ERROR_MARK)
return NULL;
@@ -12550,29 +12549,33 @@ modified_type_die (tree type, int cv_quals, bool reverse,
qualified_type = size_type_node;
}
-
/* If we do, then we can just use its DIE, if it exists. */
if (qualified_type)
{
mod_type_die = lookup_type_die (qualified_type);
- /* DW_AT_endianity doesn't come from a qualifier on the type. */
+ /* DW_AT_endianity doesn't come from a qualifier on the type, so it is
+ dealt with specially: the DIE with the attribute, if it exists, is
+ placed immediately after the regular DIE for the same base type. */
if (mod_type_die
- && (!need_endianity_attribute_p (reverse)
- || !is_base_type (type)
- || get_AT_unsigned (mod_type_die, DW_AT_endianity)))
+ && (!reverse_base_type
+ || ((mod_type_die = mod_type_die->die_sib) != NULL
+ && get_AT_unsigned (mod_type_die, DW_AT_endianity))))
return mod_type_die;
}
name = qualified_type ? TYPE_NAME (qualified_type) : NULL;
/* Handle C typedef types. */
- if (name && TREE_CODE (name) == TYPE_DECL && DECL_ORIGINAL_TYPE (name)
+ if (name
+ && TREE_CODE (name) == TYPE_DECL
+ && DECL_ORIGINAL_TYPE (name)
&& !DECL_ARTIFICIAL (name))
{
tree dtype = TREE_TYPE (name);
- if (qualified_type == dtype)
+ /* Skip the typedef for base types with DW_AT_endianity, no big deal. */
+ if (qualified_type == dtype && !reverse_base_type)
{
tree origin = decl_ultimate_origin (name);
@@ -12685,8 +12688,7 @@ modified_type_die (tree type, int cv_quals, bool reverse,
}
if (first)
{
- d = ggc_cleared_alloc<die_node> ();
- d->die_tag = dwarf_qual_info[i].t;
+ d = new_die_raw (dwarf_qual_info[i].t);
add_child_die_after (mod_scope, d, last);
last = d;
}
@@ -12744,7 +12746,21 @@ modified_type_die (tree type, int cv_quals, bool reverse,
item_type = TREE_TYPE (type);
}
else if (is_base_type (type))
- mod_type_die = base_type_die (type, reverse);
+ {
+ mod_type_die = base_type_die (type, reverse);
+
+ /* The DIE with DW_AT_endianity is placed right after the naked DIE. */
+ if (reverse_base_type)
+ {
+ dw_die_ref after_die
+ = modified_type_die (type, cv_quals, false, context_die);
+ add_child_die_after (comp_unit_die (), mod_type_die, after_die);
+ }
+ else
+ add_child_die (comp_unit_die (), mod_type_die);
+
+ add_pubtype (type, mod_type_die);
+ }
else
{
gen_type_die (type, context_die);
@@ -12806,7 +12822,7 @@ modified_type_die (tree type, int cv_quals, bool reverse,
name ? IDENTIFIER_POINTER (name) : "__unknown__");
}
- if (qualified_type)
+ if (qualified_type && !reverse_base_type)
equate_type_number_to_die (qualified_type, mod_type_die);
if (item_type)
@@ -13760,7 +13776,7 @@ tls_mem_loc_descriptor (rtx mem)
if (loc_result == NULL)
return NULL;
- if (maybe_nonzero (MEM_OFFSET (mem)))
+ if (may_ne (MEM_OFFSET (mem), 0))
loc_descr_plus_const (&loc_result, MEM_OFFSET (mem));
return loc_result;
@@ -13791,9 +13807,17 @@ expansion_failed (tree expr, rtx rtl, char const *reason)
static bool
const_ok_for_output_1 (rtx rtl)
{
- if (GET_CODE (rtl) == UNSPEC)
+ if (targetm.const_not_ok_for_debug_p (rtl))
{
- /* If delegitimize_address couldn't do anything with the UNSPEC, assume
+ if (GET_CODE (rtl) != UNSPEC)
+ {
+ expansion_failed (NULL_TREE, rtl,
+ "Expression rejected for debug by the backend.\n");
+ return false;
+ }
+
+ /* If delegitimize_address couldn't do anything with the UNSPEC, and
+ the target hook doesn't explicitly allow it in debug info, assume
we can't express it in the debug info. */
/* Don't complain about TLS UNSPECs, those are just too hard to
delegitimize. Note this could be a non-decl SYMBOL_REF such as
@@ -16668,7 +16692,7 @@ loc_list_for_address_of_addr_expr_of_indirect_ref (tree loc, bool toplev,
NULL_RTX, "no indirect ref in inner refrence");
return 0;
}
- if (!offset && known_zero (bitpos))
+ if (!offset && must_eq (bitpos, 0))
list_ret = loc_list_from_tree (TREE_OPERAND (obj, 0), toplev ? 2 : 1,
context);
else if (toplev
@@ -16694,7 +16718,7 @@ loc_list_for_address_of_addr_expr_of_indirect_ref (tree loc, bool toplev,
if (bytepos.is_constant (&value) && value > 0)
add_loc_descr_to_each (list_ret,
new_loc_descr (DW_OP_plus_uconst, value, 0));
- else if (maybe_nonzero (bytepos))
+ else if (may_ne (bytepos, 0))
loc_list_plus_const (list_ret, bytepos);
add_loc_descr_to_each (list_ret,
new_loc_descr (DW_OP_stack_value, 0, 0));
@@ -17675,7 +17699,7 @@ loc_list_from_tree_1 (tree loc, int want_address,
list_ret = loc_list_from_tree_1 (obj,
want_address == 2
- && known_zero (bitpos)
+ && must_eq (bitpos, 0)
&& !offset ? 2 : 1,
context);
/* TODO: We can extract value of the small expression via shifting even
@@ -17706,7 +17730,7 @@ loc_list_from_tree_1 (tree loc, int want_address,
if (bytepos.is_constant (&value) && value > 0)
add_loc_descr_to_each (list_ret, new_loc_descr (DW_OP_plus_uconst,
value, 0));
- else if (maybe_nonzero (bytepos))
+ else if (may_ne (bytepos, 0))
loc_list_plus_const (list_ret, bytepos);
have_address = 1;
@@ -19188,7 +19212,7 @@ rtl_for_decl_location (tree decl)
fact we are not. We need to adjust the offset of the
storage location to reflect the actual value's bytes,
else gdb will not be able to display it. */
- if (maybe_nonzero (offset))
+ if (may_ne (offset, 0))
rtl = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (decl)),
plus_constant (addr_mode, XEXP (rtl, 0), offset));
}
@@ -20599,8 +20623,7 @@ dwarf2out_vms_debug_main_pointer (void)
dw_die_ref die;
/* Allocate the VMS debug main subprogram die. */
- die = ggc_cleared_alloc<die_node> ();
- die->die_tag = DW_TAG_subprogram;
+ die = new_die_raw (DW_TAG_subprogram);
add_name_attribute (die, VMS_DEBUG_MAIN_POINTER);
ASM_GENERATE_INTERNAL_LABEL (label, PROLOGUE_END_LABEL,
current_function_funcdef_no);
@@ -21022,10 +21045,10 @@ gen_array_type_die (tree type, dw_die_ref context_die)
add_AT_unsigned (array_die, DW_AT_ordering, DW_ORD_col_major);
#if 0
- /* We default the array ordering. SDB will probably do
- the right things even if DW_AT_ordering is not present. It's not even
- an issue until we start to get into multidimensional arrays anyway. If
- SDB is ever caught doing the Wrong Thing for multi-dimensional arrays,
+ /* We default the array ordering. Debuggers will probably do the right
+ things even if DW_AT_ordering is not present. It's not even an issue
+ until we start to get into multidimensional arrays anyway. If a debugger
+ is ever caught doing the Wrong Thing for multi-dimensional arrays,
then we'll have to put the DW_AT_ordering attribute back in. (But if
and when we find out that we need to put these in, we will only do so
for multidimensional arrays. */
@@ -23517,6 +23540,8 @@ highest_c_language (const char *lang1, const char *lang2)
if (strcmp ("GNU C++98", lang1) == 0 || strcmp ("GNU C++98", lang2) == 0)
return "GNU C++98";
+ if (strcmp ("GNU C17", lang1) == 0 || strcmp ("GNU C17", lang2) == 0)
+ return "GNU C17";
if (strcmp ("GNU C11", lang1) == 0 || strcmp ("GNU C11", lang2) == 0)
return "GNU C11";
if (strcmp ("GNU C99", lang1) == 0 || strcmp ("GNU C99", lang2) == 0)
@@ -23593,7 +23618,8 @@ gen_compile_unit_die (const char *filename)
language = DW_LANG_C99;
if (dwarf_version >= 5 /* || !dwarf_strict */)
- if (strcmp (language_string, "GNU C11") == 0)
+ if (strcmp (language_string, "GNU C11") == 0
+ || strcmp (language_string, "GNU C17") == 0)
language = DW_LANG_C11;
}
}
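Many hunks in this patch replace known_zero (x) with must_eq (x, 0) and maybe_nonzero (x) with may_ne (x, 0). A self-contained toy — deliberately not GCC's poly-int.h — sketching the semantics those predicates assume for a value of the form C0 + C1 * N, where N is only known at run time:

// Toy stand-in for the SVE branch's poly_int comparisons; the real
// implementation lives in gcc/poly-int.h and is far more general.
#include <cstdint>
#include <cstdio>

struct toy_poly_int64
{
  int64_t coeffs[2];   // value = coeffs[0] + coeffs[1] * N, with N >= 0 unknown
};

// must_eq (a, b): the value equals B for every possible N
// (the replacement for known_zero when B is 0).
static bool must_eq (const toy_poly_int64 &a, int64_t b)
{
  return a.coeffs[1] == 0 && a.coeffs[0] == b;
}

// may_ne (a, b): the value can differ from B for some N
// (the replacement for maybe_nonzero when B is 0).
static bool may_ne (const toy_poly_int64 &a, int64_t b)
{
  return !must_eq (a, b);
}

int main ()
{
  toy_poly_int64 offset = { { 0, 2 } };   // 0 + 2 * N: zero only when N == 0
  printf ("must_eq (offset, 0) = %d\n", must_eq (offset, 0));   // prints 0
  printf ("may_ne  (offset, 0) = %d\n", may_ne (offset, 0));    // prints 1
  return 0;
}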
diff --git a/gcc/early-remat.c b/gcc/early-remat.c
index a6e5b797dee..94e87d96ffe 100644
--- a/gcc/early-remat.c
+++ b/gcc/early-remat.c
@@ -672,10 +672,11 @@ early_remat::dump_block_info (basic_block bb)
fprintf (dump_file, "\n;;%*s:", width, "successors");
dump_edge_list (bb, true);
- fprintf (dump_file, "\n;;%*s: %d", width, "Frequency", bb->frequency);
+ fprintf (dump_file, "\n;;%*s: %d", width, "frequency",
+ bb->count.to_frequency (m_fn));
if (info->last_call)
- fprintf (dump_file, "\n;;%*s: %d", width, "Last call",
+ fprintf (dump_file, "\n;;%*s: %d", width, "last call",
INSN_UID (info->last_call));
if (!empty_p (info->rd_in))
@@ -2218,7 +2219,7 @@ early_remat::local_remat_cheaper_p (unsigned int query_bb_index)
}
else if (m_block_info[e->src->index].last_call)
/* We'll rematerialize after the call. */
- frequency += e->src->frequency;
+ frequency += e->src->count.to_frequency (m_fn);
else if (m_block_info[e->src->index].remat_frequency_valid_p)
/* Add the cost of rematerializing at the head of E->src
or in its predecessors (whichever is cheaper). */
@@ -2234,10 +2235,11 @@ early_remat::local_remat_cheaper_p (unsigned int query_bb_index)
/* If rematerializing in and before the block have equal cost, prefer
rematerializing in the block. This should shorten the live range. */
- if (!can_move_p || frequency >= bb->frequency)
+ int bb_frequency = bb->count.to_frequency (m_fn);
+ if (!can_move_p || frequency >= bb_frequency)
{
info->local_remat_cheaper_p = true;
- info->remat_frequency = bb->frequency;
+ info->remat_frequency = bb_frequency;
}
else
info->remat_frequency = frequency;
@@ -2252,7 +2254,7 @@ early_remat::local_remat_cheaper_p (unsigned int query_bb_index)
{
fprintf (dump_file, ";; Block %d has frequency %d,"
" rematerializing in predecessors has frequency %d",
- bb->index, bb->frequency, frequency);
+ bb->index, bb_frequency, frequency);
if (info->local_remat_cheaper_p)
fprintf (dump_file, "; prefer to rematerialize"
" in the block\n");
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 28e6dd85e0d..af4a038d75a 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -450,7 +450,7 @@ get_reg_attrs (tree decl, poly_int64 offset)
reg_attrs attrs;
/* If everything is the default, we can just return zero. */
- if (decl == 0 && known_zero (offset))
+ if (decl == 0 && must_eq (offset, 0))
return 0;
attrs.decl = decl;
@@ -982,7 +982,7 @@ validate_subreg (machine_mode omode, machine_mode imode,
/* Paradoxical subregs must have offset zero. */
if (may_gt (osize, isize))
- return known_zero (offset);
+ return must_eq (offset, 0U);
/* This is a normal subreg. Verify that the offset is representable. */
@@ -1028,7 +1028,7 @@ validate_subreg (machine_mode omode, machine_mode imode,
if (!can_div_trunc_p (offset, block_size, &start_reg, &offset_within_reg)
|| (BYTES_BIG_ENDIAN
? may_ne (offset_within_reg, block_size - osize)
- : maybe_nonzero (offset_within_reg)))
+ : may_ne (offset_within_reg, 0U)))
return false;
}
return true;
@@ -1156,7 +1156,7 @@ subreg_memory_offset (machine_mode outer_mode, machine_mode inner_mode,
{
if (paradoxical_subreg_p (outer_mode, inner_mode))
{
- gcc_assert (known_zero (offset));
+ gcc_assert (must_eq (offset, 0U));
return -subreg_lowpart_offset (inner_mode, outer_mode);
}
return offset;
@@ -2178,7 +2178,7 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int objectp,
/* If we modified OFFSET based on T, then subtract the outstanding
bit position offset. Similarly, increase the size of the accessed
object to contain the negative offset. */
- if (maybe_nonzero (apply_bitpos))
+ if (may_ne (apply_bitpos, 0))
{
gcc_assert (attrs.offset_known_p);
poly_int64 bytepos = bits_to_bytes_round_down (apply_bitpos);
@@ -2402,8 +2402,8 @@ adjust_address_1 (rtx memref, machine_mode mode, poly_int64 offset,
/* If there are no changes, just return the original memory reference. */
if (mode == GET_MODE (memref)
- && known_zero (offset)
- && (known_zero (size)
+ && must_eq (offset, 0)
+ && (must_eq (size, 0)
|| (attrs.size_known_p && must_eq (attrs.size, size)))
&& (!validate || memory_address_addr_space_p (mode, addr,
attrs.addrspace)))
@@ -2451,7 +2451,7 @@ adjust_address_1 (rtx memref, machine_mode mode, poly_int64 offset,
/* If the address is a REG, change_address_1 rightfully returns memref,
but this would destroy memref's MEM_ATTRS. */
- if (new_rtx == memref && maybe_nonzero (offset))
+ if (new_rtx == memref && may_ne (offset, 0))
new_rtx = copy_rtx (new_rtx);
/* Conservatively drop the object if we don't know where we start from. */
@@ -2478,13 +2478,13 @@ adjust_address_1 (rtx memref, machine_mode mode, poly_int64 offset,
/* Compute the new alignment by taking the MIN of the alignment and the
lowest-order set bit in OFFSET, but don't change the alignment if OFFSET
if zero. */
- if (maybe_nonzero (offset))
+ if (may_ne (offset, 0))
{
max_align = known_alignment (offset) * BITS_PER_UNIT;
attrs.align = MIN (attrs.align, max_align);
}
- if (maybe_nonzero (size))
+ if (may_ne (size, 0))
{
/* Drop the object if the new right end is not within its bounds. */
if (adjust_object && may_gt (offset + size, attrs.size))
@@ -3935,6 +3935,7 @@ try_split (rtx pat, rtx_insn *trial, int last)
case REG_NORETURN:
case REG_SETJMP:
case REG_TM:
+ case REG_CALL_NOCF_CHECK:
for (insn = insn_last; insn != NULL_RTX; insn = PREV_INSN (insn))
{
if (CALL_P (insn))
diff --git a/gcc/except.c b/gcc/except.c
index 041f89a55e5..30f303a7bf8 100644
--- a/gcc/except.c
+++ b/gcc/except.c
@@ -1003,7 +1003,6 @@ dw2_build_landing_pads (void)
bb = emit_to_new_bb_before (seq, label_rtx (lp->post_landing_pad));
bb->count = bb->next_bb->count;
- bb->frequency = bb->next_bb->frequency;
make_single_succ_edge (bb, bb->next_bb, e_flags);
if (current_loops)
{
diff --git a/gcc/explow.c b/gcc/explow.c
index 5d738c1c422..4c99d4e2871 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -102,7 +102,7 @@ plus_constant (machine_mode mode, rtx x, poly_int64 c, bool inplace)
gcc_assert (GET_MODE (x) == VOIDmode || GET_MODE (x) == mode);
- if (known_zero (c))
+ if (must_eq (c, 0))
return x;
restart:
@@ -195,7 +195,7 @@ plus_constant (machine_mode mode, rtx x, poly_int64 c, bool inplace)
break;
}
- if (maybe_nonzero (c))
+ if (may_ne (c, 0))
x = gen_rtx_PLUS (mode, x, gen_int_mode (c, mode));
if (GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF)
@@ -1334,6 +1334,9 @@ get_stack_check_protect (void)
REQUIRED_ALIGN is the alignment (in bits) required for the region
of memory.
+ MAX_SIZE is an upper bound for SIZE, if SIZE is not constant, or -1 if
+ no such upper bound is known.
+
If CANNOT_ACCUMULATE is set to TRUE, the caller guarantees that the
stack space allocated by the generated code cannot be added with itself
in the course of the execution of the function. It is always safe to
@@ -1343,7 +1346,9 @@ get_stack_check_protect (void)
rtx
allocate_dynamic_stack_space (rtx size, unsigned size_align,
- unsigned required_align, bool cannot_accumulate)
+ unsigned required_align,
+ HOST_WIDE_INT max_size,
+ bool cannot_accumulate)
{
HOST_WIDE_INT stack_usage_size = -1;
rtx_code_label *final_label;
@@ -1382,8 +1387,12 @@ allocate_dynamic_stack_space (rtx size, unsigned size_align,
}
}
- /* If the size is not constant, we can't say anything. */
- if (stack_usage_size == -1)
+ /* If the size is not constant, try the maximum size. */
+ if (stack_usage_size < 0)
+ stack_usage_size = max_size;
+
+ /* If the size is still not constant, we can't say anything. */
+ if (stack_usage_size < 0)
{
current_function_has_unbounded_dynamic_stack_size = 1;
stack_usage_size = 0;
diff --git a/gcc/explow.h b/gcc/explow.h
index c79065113b5..e981a656254 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -94,7 +94,8 @@ extern void update_nonlocal_goto_save_area (void);
extern void record_new_stack_level (void);
/* Allocate some space on the stack dynamically and return its address. */
-extern rtx allocate_dynamic_stack_space (rtx, unsigned, unsigned, bool);
+extern rtx allocate_dynamic_stack_space (rtx, unsigned, unsigned,
+ HOST_WIDE_INT, bool);
/* Calculate the necessary size of a constant dynamic stack allocation from the
size of the variable area. */
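The explow.c change above lets -fstack-usage record a caller-supplied upper bound when a dynamic allocation's size is not constant. The decision ladder, restated as a standalone helper (the names here are illustrative, not GCC's):

#include <cstdio>

// Returns the value to record for stack usage: the constant size if known,
// else the caller-supplied upper bound, else -1 meaning "unbounded".
static long stack_usage_for_alloc (long const_size, long max_size)
{
  long usage = const_size;        // -1 when the size is not a constant
  if (usage < 0)
    usage = max_size;             // try the upper bound instead
  return usage;                   // still -1 => unbounded dynamic stack
}

int main ()
{
  printf ("%ld\n", stack_usage_for_alloc (128, -1));   // constant size: 128
  printf ("%ld\n", stack_usage_for_alloc (-1, 4096));  // bounded VLA: 4096
  printf ("%ld\n", stack_usage_for_alloc (-1, -1));    // unbounded: -1
  return 0;
}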
diff --git a/gcc/expmed.c b/gcc/expmed.c
index e99ff2e3dde..4565bcce99e 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -561,7 +561,7 @@ strict_volatile_bitfield_p (rtx op0, unsigned HOST_WIDE_INT bitsize,
return false;
/* Check for cases where the C++ memory model applies. */
- if (maybe_nonzero (bitregion_end)
+ if (may_ne (bitregion_end, 0U)
&& (may_lt (bitnum - bitnum % modesize, bitregion_start)
|| may_gt (bitnum - bitnum % modesize + modesize - 1,
bitregion_end)))
@@ -779,7 +779,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
rtx sub;
HOST_WIDE_INT regnum;
poly_uint64 regsize = REGMODE_NATURAL_SIZE (GET_MODE (op0));
- if (known_zero (bitnum)
+ if (must_eq (bitnum, 0U)
&& must_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (op0))))
{
sub = simplify_gen_subreg (GET_MODE (op0), value, fieldmode, 0);
@@ -1129,7 +1129,7 @@ store_bit_field (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
/* Under the C++0x memory model, we must not touch bits outside the
bit region. Adjust the address to start at the beginning of the
bit region. */
- if (MEM_P (str_rtx) && maybe_nonzero (bitregion_start))
+ if (MEM_P (str_rtx) && may_ne (bitregion_start, 0U))
{
scalar_int_mode best_mode;
machine_mode addr_mode = VOIDmode;
@@ -1370,7 +1370,7 @@ store_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
UNIT close to the end of the region as needed. If op0 is a REG
or SUBREG of REG, don't do this, as there can't be data races
on a register and we can expand shorter code in some cases. */
- if (maybe_nonzero (bitregion_end)
+ if (may_ne (bitregion_end, 0U)
&& unit > BITS_PER_UNIT
&& may_gt (bitpos + bitsdone - thispos + unit, bitregion_end + 1)
&& !REG_P (op0)
@@ -1617,7 +1617,7 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
if (REG_P (op0)
&& mode == GET_MODE (op0)
- && known_zero (bitnum)
+ && must_eq (bitnum, 0U)
&& must_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (op0))))
{
if (reverse)
@@ -1631,7 +1631,7 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
&& !MEM_P (op0)
&& VECTOR_MODE_P (tmode)
&& must_eq (bitsize, GET_MODE_SIZE (tmode))
- && may_gt (GET_MODE (op0), GET_MODE_SIZE (tmode)))
+ && may_gt (GET_MODE_SIZE (GET_MODE (op0)), GET_MODE_SIZE (tmode)))
{
machine_mode new_mode = GET_MODE (op0);
if (GET_MODE_INNER (new_mode) != GET_MODE_INNER (tmode))
@@ -1708,13 +1708,14 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
}
/* Use vec_extract patterns for extracting parts of vectors whenever
- available. */
+ available. If that fails, see whether the current modes and bitregion
+ give a natural subreg. */
machine_mode outermode = GET_MODE (op0);
if (VECTOR_MODE_P (outermode) && !MEM_P (op0))
{
scalar_mode innermode = GET_MODE_INNER (outermode);
- enum insn_code icode = convert_optab_handler (vec_extract_optab,
- outermode, innermode);
+ enum insn_code icode
+ = convert_optab_handler (vec_extract_optab, outermode, innermode);
poly_uint64 pos;
if (icode != CODE_FOR_nothing
&& must_eq (bitsize, GET_MODE_BITSIZE (innermode))
@@ -1736,6 +1737,15 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
return target;
}
}
+ /* Using subregs is useful if we're extracting the least-significant
+ vector element, or if we're extracting one register vector from
+ a multi-register vector. extract_bit_field_as_subreg checks
+ for valid bitsize and bitnum, so we don't need to do that here.
+
+ The mode check makes sure that we're extracting either
+ a single element or a subvector with the same element type.
+ If the modes aren't such a natural fit, fall through and
+ bitcast to integers first. */
if (GET_MODE_INNER (mode) == innermode)
{
rtx sub = extract_bit_field_as_subreg (mode, op0, bitsize, bitnum);
@@ -2051,9 +2061,9 @@ extract_bit_field (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
machine_mode mode1;
/* Handle -fstrict-volatile-bitfields in the cases where it applies. */
- if (maybe_nonzero (GET_MODE_BITSIZE (GET_MODE (str_rtx))))
+ if (may_ne (GET_MODE_BITSIZE (GET_MODE (str_rtx)), 0))
mode1 = GET_MODE (str_rtx);
- else if (target && maybe_nonzero (GET_MODE_BITSIZE (GET_MODE (target))))
+ else if (target && may_ne (GET_MODE_BITSIZE (GET_MODE (target)), 0))
mode1 = GET_MODE (target);
else
mode1 = tmode;
diff --git a/gcc/expr.c b/gcc/expr.c
index 8e290dedda7..31ff5e188e1 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -2210,7 +2210,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type,
can be used to determine the object and the bit field
to be extracted. */
tmps[i] = XEXP (src, elt);
- if (maybe_nonzero (subpos)
+ if (may_ne (subpos, 0)
|| may_ne (subpos + bytelen, slen0)
|| (!CONSTANT_P (tmps[i])
&& (!REG_P (tmps[i]) || GET_MODE (tmps[i]) != mode)))
@@ -2223,7 +2223,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type,
{
rtx mem;
- gcc_assert (known_zero (bytepos));
+ gcc_assert (must_eq (bytepos, 0));
mem = assign_stack_temp (GET_MODE (src), slen);
emit_move_insn (mem, src);
tmps[i] = extract_bit_field (mem, bytelen * BITS_PER_UNIT,
@@ -2271,7 +2271,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type,
bytepos * BITS_PER_UNIT, 1, NULL_RTX,
mode, mode, false, NULL);
- if (maybe_nonzero (shift))
+ if (may_ne (shift, 0))
tmps[i] = expand_shift (LSHIFT_EXPR, mode, tmps[i],
shift, tmps[i], 0);
}
@@ -2540,7 +2540,7 @@ emit_group_store (rtx orig_dst, rtx src, tree type ATTRIBUTE_UNUSED,
machine_mode dest_mode = GET_MODE (dest);
machine_mode tmp_mode = GET_MODE (tmps[i]);
- gcc_assert (known_zero (bytepos) && XVECLEN (src, 0));
+ gcc_assert (must_eq (bytepos, 0) && XVECLEN (src, 0));
if (GET_MODE_ALIGNMENT (dest_mode)
>= GET_MODE_ALIGNMENT (tmp_mode))
@@ -3884,12 +3884,12 @@ push_block (rtx size, poly_int64 extra, int below)
size = convert_modes (Pmode, ptr_mode, size, 1);
if (CONSTANT_P (size))
anti_adjust_stack (plus_constant (Pmode, size, extra));
- else if (REG_P (size) && known_zero (extra))
+ else if (REG_P (size) && must_eq (extra, 0))
anti_adjust_stack (size);
else
{
temp = copy_to_mode_reg (Pmode, size);
- if (maybe_nonzero (extra))
+ if (may_ne (extra, 0))
temp = expand_binop (Pmode, add_optab, temp,
gen_int_mode (extra, Pmode),
temp, 0, OPTAB_LIB_WIDEN);
@@ -3899,7 +3899,7 @@ push_block (rtx size, poly_int64 extra, int below)
if (STACK_GROWS_DOWNWARD)
{
temp = virtual_outgoing_args_rtx;
- if (maybe_nonzero (extra) && below)
+ if (may_ne (extra, 0) && below)
temp = plus_constant (Pmode, temp, extra);
}
else
@@ -3908,7 +3908,7 @@ push_block (rtx size, poly_int64 extra, int below)
if (poly_int_rtx_p (size, &csize))
temp = plus_constant (Pmode, virtual_outgoing_args_rtx,
-csize - (below ? 0 : extra));
- else if (maybe_nonzero (extra) && !below)
+ else if (may_ne (extra, 0) && !below)
temp = gen_rtx_PLUS (Pmode, virtual_outgoing_args_rtx,
negate_rtx (Pmode, plus_constant (Pmode, size,
extra)));
@@ -4091,7 +4091,7 @@ fixup_args_size_notes (rtx_insn *prev, rtx_insn *last,
continue;
poly_int64 this_delta = find_args_size_adjust (insn);
- if (known_zero (this_delta))
+ if (must_eq (this_delta, 0))
{
if (!CALL_P (insn)
|| ACCUMULATE_OUTGOING_ARGS
@@ -4363,7 +4363,7 @@ emit_push_insn (rtx x, machine_mode mode, tree type, rtx size,
/* Push padding now if padding above and stack grows down,
or if padding below and stack grows up.
But if space already allocated, this has already been done. */
- if (maybe_nonzero (extra)
+ if (may_ne (extra, 0)
&& args_addr == 0
&& where_pad != PAD_NONE
&& where_pad != stack_direction)
@@ -4490,7 +4490,7 @@ emit_push_insn (rtx x, machine_mode mode, tree type, rtx size,
/* Push padding now if padding above and stack grows down,
or if padding below and stack grows up.
But if space already allocated, this has already been done. */
- if (maybe_nonzero (extra)
+ if (may_ne (extra, 0)
&& args_addr == 0
&& where_pad != PAD_NONE
&& where_pad != stack_direction)
@@ -4543,7 +4543,7 @@ emit_push_insn (rtx x, machine_mode mode, tree type, rtx size,
/* Push padding now if padding above and stack grows down,
or if padding below and stack grows up.
But if space already allocated, this has already been done. */
- if (maybe_nonzero (extra)
+ if (may_ne (extra, 0)
&& args_addr == 0
&& where_pad != PAD_NONE
&& where_pad != stack_direction)
@@ -4592,7 +4592,7 @@ emit_push_insn (rtx x, machine_mode mode, tree type, rtx size,
}
}
- if (maybe_nonzero (extra) && args_addr == 0 && where_pad == stack_direction)
+ if (may_ne (extra, 0) && args_addr == 0 && where_pad == stack_direction)
anti_adjust_stack (gen_int_mode (extra, Pmode));
if (alignment_pad && args_addr == 0)
@@ -5087,7 +5087,7 @@ expand_assignment (tree to, tree from, bool nontemporal)
be expected to result in single move instructions. */
poly_int64 bytepos;
if (mode1 != VOIDmode
- && maybe_nonzero (bitpos)
+ && may_ne (bitpos, 0)
&& may_gt (bitsize, 0)
&& multiple_p (bitpos, BITS_PER_UNIT, &bytepos)
&& multiple_p (bitpos, bitsize)
@@ -5124,13 +5124,13 @@ expand_assignment (tree to, tree from, bool nontemporal)
poly_int64 mode_bitsize = GET_MODE_BITSIZE (to_mode);
unsigned short inner_bitsize = GET_MODE_UNIT_BITSIZE (to_mode);
if (COMPLEX_MODE_P (TYPE_MODE (TREE_TYPE (from)))
- && known_zero (bitpos)
+ && must_eq (bitpos, 0)
&& must_eq (bitsize, mode_bitsize))
result = store_expr (from, to_rtx, false, nontemporal, reversep);
else if (must_eq (bitsize, inner_bitsize)
- && (known_zero (bitpos)
+ && (must_eq (bitpos, 0)
|| must_eq (bitpos, inner_bitsize)))
- result = store_expr (from, XEXP (to_rtx, maybe_nonzero (bitpos)),
+ result = store_expr (from, XEXP (to_rtx, may_ne (bitpos, 0)),
false, nontemporal, reversep);
else if (must_le (bitpos + bitsize, inner_bitsize))
result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
@@ -5143,8 +5143,7 @@ expand_assignment (tree to, tree from, bool nontemporal)
bitregion_start, bitregion_end,
mode1, from, get_alias_set (to),
nontemporal, reversep);
- else if (known_zero (bitpos)
- && must_eq (bitsize, mode_bitsize))
+ else if (must_eq (bitpos, 0) && must_eq (bitsize, mode_bitsize))
{
rtx from_rtx;
result = expand_normal (from);
@@ -6118,12 +6117,12 @@ store_constructor_field (rtx target, poly_uint64 bitsize, poly_int64 bitpos,
/* We can only call store_constructor recursively if the size and
bit position are on a byte boundary. */
&& multiple_p (bitpos, BITS_PER_UNIT, &bytepos)
- && maybe_nonzero (bitsize)
+ && may_ne (bitsize, 0U)
&& multiple_p (bitsize, BITS_PER_UNIT, &bytesize)
/* If we have a nonzero bitpos for a register target, then we just
let store_field do the bitfield handling. This is unlikely to
generate unnecessary clear instructions anyways. */
- && (known_zero (bitpos) || MEM_P (target)))
+ && (must_eq (bitpos, 0) || MEM_P (target)))
{
if (MEM_P (target))
{
@@ -6197,7 +6196,7 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size,
reverse = TYPE_REVERSE_STORAGE_ORDER (type);
/* If size is zero or the target is already cleared, do nothing. */
- if (known_zero (size) || cleared)
+ if (must_eq (size, 0) || cleared)
cleared = 1;
/* We either clear the aggregate or indicate the value is dead. */
else if ((TREE_CODE (type) == UNION_TYPE
@@ -6805,7 +6804,7 @@ store_field (rtx target, poly_int64 bitsize, poly_int64 bitpos,
/* If we have nothing to store, do nothing unless the expression has
side-effects. Don't do that for zero sized addressable lhs of
calls. */
- if (known_zero (bitsize)
+ if (must_eq (bitsize, 0)
&& (!TREE_ADDRESSABLE (TREE_TYPE (exp))
|| TREE_CODE (exp) != CALL_EXPR))
return expand_expr (exp, const0_rtx, VOIDmode, EXPAND_NORMAL);
@@ -6814,7 +6813,7 @@ store_field (rtx target, poly_int64 bitsize, poly_int64 bitpos,
{
/* We're storing into a struct containing a single __complex. */
- gcc_assert (known_zero (bitpos));
+ gcc_assert (must_eq (bitpos, 0));
return store_expr (exp, target, 0, nontemporal, reverse);
}
@@ -7901,7 +7900,7 @@ expand_expr_addr_expr_1 (tree exp, rtx target, scalar_int_mode tmode,
/* We must have made progress. */
gcc_assert (inner != exp);
- subtarget = offset || maybe_nonzero (bitpos) ? NULL_RTX : target;
+ subtarget = offset || may_ne (bitpos, 0) ? NULL_RTX : target;
/* For VIEW_CONVERT_EXPR, where the outer alignment is bigger than
inner alignment, force the inner to be sufficiently aligned. */
if (CONSTANT_CLASS_P (inner)
@@ -7936,13 +7935,13 @@ expand_expr_addr_expr_1 (tree exp, rtx target, scalar_int_mode tmode,
result = simplify_gen_binary (PLUS, tmode, result, tmp);
else
{
- subtarget = maybe_nonzero (bitpos) ? NULL_RTX : target;
+ subtarget = may_ne (bitpos, 0) ? NULL_RTX : target;
result = expand_simple_binop (tmode, PLUS, result, tmp, subtarget,
1, OPTAB_LIB_WIDEN);
}
}
- if (maybe_nonzero (bitpos))
+ if (may_ne (bitpos, 0))
{
/* Someone beforehand should have rejected taking the address
of an object that isn't byte-aligned. */
@@ -9969,43 +9968,24 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
&& GET_MODE (decl_rtl) != dmode)
{
machine_mode pmode;
- bool always_initialized_rtx;
/* Get the signedness to be used for this variable. Ensure we get
the same mode we got when the variable was declared. */
if (code != SSA_NAME)
- {
- pmode = promote_decl_mode (exp, &unsignedp);
- always_initialized_rtx = true;
- }
+ pmode = promote_decl_mode (exp, &unsignedp);
else if ((g = SSA_NAME_DEF_STMT (ssa_name))
&& gimple_code (g) == GIMPLE_CALL
&& !gimple_call_internal_p (g))
- {
- pmode = promote_function_mode (type, mode, &unsignedp,
- gimple_call_fntype (g), 2);
- always_initialized_rtx
- = always_initialized_rtx_for_ssa_name_p (ssa_name);
- }
+ pmode = promote_function_mode (type, mode, &unsignedp,
+ gimple_call_fntype (g),
+ 2);
else
- {
- pmode = promote_ssa_mode (ssa_name, &unsignedp);
- always_initialized_rtx
- = always_initialized_rtx_for_ssa_name_p (ssa_name);
- }
-
+ pmode = promote_ssa_mode (ssa_name, &unsignedp);
gcc_assert (GET_MODE (decl_rtl) == pmode);
temp = gen_lowpart_SUBREG (mode, decl_rtl);
-
- /* We cannot assume anything about an existing extension if the
- register may contain uninitialized bits. */
- if (always_initialized_rtx)
- {
- SUBREG_PROMOTED_VAR_P (temp) = 1;
- SUBREG_PROMOTED_SET (temp, unsignedp);
- }
-
+ SUBREG_PROMOTED_VAR_P (temp) = 1;
+ SUBREG_PROMOTED_SET (temp, unsignedp);
return temp;
}
@@ -10242,7 +10222,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
poly_int64 offset = mem_ref_offset (exp).force_shwi ();
base = TREE_OPERAND (base, 0);
poly_uint64 type_size;
- if (known_zero (offset)
+ if (must_eq (offset, 0)
&& !reverse
&& poly_int_tree_p (TYPE_SIZE (type), &type_size)
&& must_eq (GET_MODE_BITSIZE (DECL_MODE (base)), type_size))
@@ -10582,7 +10562,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
/* Handle CONCAT first. */
if (GET_CODE (op0) == CONCAT && !must_force_mem)
{
- if (known_zero (bitpos)
+ if (must_eq (bitpos, 0)
&& must_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (op0)))
&& COMPLEX_MODE_P (mode1)
&& COMPLEX_MODE_P (GET_MODE (op0))
@@ -10615,10 +10595,10 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
}
return op0;
}
- if (known_zero (bitpos)
+ if (must_eq (bitpos, 0)
&& must_eq (bitsize,
GET_MODE_BITSIZE (GET_MODE (XEXP (op0, 0))))
- && maybe_nonzero (bitsize))
+ && may_ne (bitsize, 0))
{
op0 = XEXP (op0, 0);
mode2 = GET_MODE (op0);
@@ -10627,8 +10607,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
GET_MODE_BITSIZE (GET_MODE (XEXP (op0, 0))))
&& must_eq (bitsize,
GET_MODE_BITSIZE (GET_MODE (XEXP (op0, 1))))
- && maybe_nonzero (bitpos)
- && maybe_nonzero (bitsize))
+ && may_ne (bitpos, 0)
+ && may_ne (bitsize, 0))
{
op0 = XEXP (op0, 1);
bitpos = 0;
@@ -10683,7 +10663,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
/* See the comment in expand_assignment for the rationale. */
if (mode1 != VOIDmode
- && maybe_nonzero (bitpos)
+ && may_ne (bitpos, 0)
&& may_gt (bitsize, 0)
&& multiple_p (bitpos, BITS_PER_UNIT, &bytepos)
&& multiple_p (bitpos, bitsize)
@@ -10701,7 +10681,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
/* If OFFSET is making OP0 more aligned than BIGGEST_ALIGNMENT,
record its alignment as BIGGEST_ALIGNMENT. */
if (MEM_P (op0)
- && known_zero (bitpos)
+ && must_eq (bitpos, 0)
&& offset != 0
&& is_aligning_offset (offset, tem))
set_mem_align (op0, BIGGEST_ALIGNMENT);
@@ -10775,7 +10755,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
/* ??? Unlike the similar test a few lines below, this one is
very likely obsolete. */
- if (known_zero (bitsize))
+ if (must_eq (bitsize, 0))
return target;
/* In this case, BITPOS must start at a byte boundary and
@@ -10798,7 +10778,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
with SHIFT_COUNT_TRUNCATED == 0 and garbage otherwise. Always
return 0 for the sake of consistency, as reading a zero-sized
bitfield is valid in Ada and the value is fully specified. */
- if (known_zero (bitsize))
+ if (must_eq (bitsize, 0))
return const0_rtx;
op0 = validize_mem (op0);
diff --git a/gcc/file-find.c b/gcc/file-find.c
index b072a4993d7..b5a1fe8494e 100644
--- a/gcc/file-find.c
+++ b/gcc/file-find.c
@@ -208,38 +208,3 @@ prefix_from_string (const char *p, struct path_prefix *pprefix)
}
free (nstore);
}
-
-void
-remove_prefix (const char *prefix, struct path_prefix *pprefix)
-{
- struct prefix_list *remove, **prev, **remove_prev = NULL;
- int max_len = 0;
-
- if (pprefix->plist)
- {
- prev = &pprefix->plist;
- for (struct prefix_list *pl = pprefix->plist; pl->next; pl = pl->next)
- {
- if (strcmp (prefix, pl->prefix) == 0)
- {
- remove = pl;
- remove_prev = prev;
- continue;
- }
-
- int l = strlen (pl->prefix);
- if (l > max_len)
- max_len = l;
-
- prev = &pl;
- }
-
- if (remove_prev)
- {
- *remove_prev = remove->next;
- free (remove);
- }
-
- pprefix->max_len = max_len;
- }
-}
diff --git a/gcc/file-find.h b/gcc/file-find.h
index 8f49a3af273..407feba26e7 100644
--- a/gcc/file-find.h
+++ b/gcc/file-find.h
@@ -41,7 +41,6 @@ extern void find_file_set_debug (bool);
extern char *find_a_file (struct path_prefix *, const char *, int);
extern void add_prefix (struct path_prefix *, const char *);
extern void add_prefix_begin (struct path_prefix *, const char *);
-extern void remove_prefix (const char *prefix, struct path_prefix *);
extern void prefix_from_env (const char *, struct path_prefix *);
extern void prefix_from_string (const char *, struct path_prefix *);
diff --git a/gcc/final.c b/gcc/final.c
index 8dfaa2ead1b..3a127d9e7e7 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -92,8 +92,6 @@ along with GCC; see the file COPYING3. If not see
#include "dbxout.h"
#endif
-#include "sdbout.h"
-
/* Most ports that aren't using cc0 don't need to define CC_STATUS_INIT.
So define a null default for it to save conditionalization later. */
#ifndef CC_STATUS_INIT
@@ -696,8 +694,8 @@ compute_alignments (void)
}
loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
FOR_EACH_BB_FN (bb, cfun)
- if (bb->frequency > freq_max)
- freq_max = bb->frequency;
+ if (bb->count.to_frequency (cfun) > freq_max)
+ freq_max = bb->count.to_frequency (cfun);
freq_threshold = freq_max / PARAM_VALUE (PARAM_ALIGN_THRESHOLD);
if (dump_file)
@@ -715,7 +713,8 @@ compute_alignments (void)
if (dump_file)
fprintf (dump_file,
"BB %4i freq %4i loop %2i loop_depth %2i skipped.\n",
- bb->index, bb->frequency, bb->loop_father->num,
+ bb->index, bb->count.to_frequency (cfun),
+ bb->loop_father->num,
bb_loop_depth (bb));
continue;
}
@@ -733,7 +732,7 @@ compute_alignments (void)
{
fprintf (dump_file, "BB %4i freq %4i loop %2i loop_depth"
" %2i fall %4i branch %4i",
- bb->index, bb->frequency, bb->loop_father->num,
+ bb->index, bb->count.to_frequency (cfun), bb->loop_father->num,
bb_loop_depth (bb),
fallthru_frequency, branch_frequency);
if (!bb->loop_father->inner && bb->loop_father->num)
@@ -755,9 +754,10 @@ compute_alignments (void)
if (!has_fallthru
&& (branch_frequency > freq_threshold
- || (bb->frequency > bb->prev_bb->frequency * 10
- && (bb->prev_bb->frequency
- <= ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency / 2))))
+ || (bb->count.to_frequency (cfun)
+ > bb->prev_bb->count.to_frequency (cfun) * 10
+ && (bb->prev_bb->count.to_frequency (cfun)
+ <= ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.to_frequency (cfun) / 2))))
{
log = JUMP_ALIGN (label);
if (dump_file)
@@ -1945,8 +1945,6 @@ dump_basic_block_info (FILE *file, rtx_insn *insn, basic_block *start_to_bb,
edge_iterator ei;
fprintf (file, "%s BLOCK %d", ASM_COMMENT_START, bb->index);
- if (bb->frequency)
- fprintf (file, " freq:%d", bb->frequency);
if (bb->count.initialized_p ())
{
fprintf (file, ", count:");
@@ -2329,8 +2327,7 @@ final_scan_insn (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
TREE_ASM_WRITTEN (NOTE_BLOCK (insn)) = 1;
BLOCK_IN_COLD_SECTION_P (NOTE_BLOCK (insn)) = in_cold_section_p;
}
- if (write_symbols == DBX_DEBUG
- || write_symbols == SDB_DEBUG)
+ if (write_symbols == DBX_DEBUG)
{
location_t *locus_ptr
= block_nonartificial_location (NOTE_BLOCK (insn));
@@ -2364,8 +2361,7 @@ final_scan_insn (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
gcc_assert (BLOCK_IN_COLD_SECTION_P (NOTE_BLOCK (insn))
== in_cold_section_p);
}
- if (write_symbols == DBX_DEBUG
- || write_symbols == SDB_DEBUG)
+ if (write_symbols == DBX_DEBUG)
{
tree outer_block = BLOCK_SUPERCONTEXT (NOTE_BLOCK (insn));
location_t *locus_ptr
@@ -4686,12 +4682,6 @@ rest_of_clean_state (void)
}
}
- /* In case the function was not output,
- don't leave any temporary anonymous types
- queued up for sdb output. */
- if (SDB_DEBUGGING_INFO && write_symbols == SDB_DEBUG)
- sdbout_types (NULL_TREE);
-
flag_rerun_cse_after_global_opts = 0;
reload_completed = 0;
epilogue_completed = 0;
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 1f439d35b07..591b74457cd 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -24,7 +24,6 @@ enum debug_info_type
{
NO_DEBUG, /* Write no debug info. */
DBX_DEBUG, /* Write BSD .stabs for DBX (using dbxout.c). */
- SDB_DEBUG, /* Write COFF for (old) SDB (using sdbout.c). */
DWARF2_DEBUG, /* Write Dwarf v2 debug info (using dwarf2out.c). */
XCOFF_DEBUG, /* Write IBM/Xcoff debug info (using dbxout.c). */
VMS_DEBUG, /* Write VMS debug info (using vmsdbgout.c). */
@@ -246,6 +245,7 @@ enum sanitize_code {
SANITIZE_VPTR = 1UL << 22,
SANITIZE_BOUNDS_STRICT = 1UL << 23,
SANITIZE_POINTER_OVERFLOW = 1UL << 24,
+ SANITIZE_BUILTIN = 1UL << 25,
SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
| SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
@@ -254,7 +254,7 @@ enum sanitize_code {
| SANITIZE_NONNULL_ATTRIBUTE
| SANITIZE_RETURNS_NONNULL_ATTRIBUTE
| SANITIZE_OBJECT_SIZE | SANITIZE_VPTR
- | SANITIZE_POINTER_OVERFLOW,
+ | SANITIZE_POINTER_OVERFLOW | SANITIZE_BUILTIN,
SANITIZE_UNDEFINED_NONDEFAULT = SANITIZE_FLOAT_DIVIDE | SANITIZE_FLOAT_CAST
| SANITIZE_BOUNDS_STRICT
};
@@ -325,4 +325,13 @@ enum gfc_convert
};
+/* Control-Flow Protection values. */
+enum cf_protection_level
+{
+ CF_NONE = 0,
+ CF_BRANCH = 1 << 0,
+ CF_RETURN = 1 << 1,
+ CF_FULL = CF_BRANCH | CF_RETURN,
+ CF_SET = 1 << 2
+};
#endif /* ! GCC_FLAG_TYPES_H */
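flag-types.h now defines cf_protection_level as a bit mask. A minimal sketch of combining and testing such a mask — the enumerators are copied from the hunk above, while the option-handling scenario around them is an assumption, not something this patch adds:

#include <cstdio>

enum cf_protection_level
{
  CF_NONE = 0,
  CF_BRANCH = 1 << 0,
  CF_RETURN = 1 << 1,
  CF_FULL = CF_BRANCH | CF_RETURN,
  CF_SET = 1 << 2
};

int main ()
{
  // A hypothetical "full" setting: both kinds of protection, plus the
  // CF_SET marker recording that the user chose a value explicitly.
  unsigned level = CF_FULL | CF_SET;

  if (level & CF_BRANCH)
    printf ("indirect-branch protection enabled\n");
  if (level & CF_RETURN)
    printf ("return-address protection enabled\n");
  if (level & CF_SET)
    printf ("protection level was set explicitly\n");
  return 0;
}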
diff --git a/gcc/fold-const-call.c b/gcc/fold-const-call.c
index 7b5e6819381..5d88a356d3b 100644
--- a/gcc/fold-const-call.c
+++ b/gcc/fold-const-call.c
@@ -596,6 +596,7 @@ fold_const_call_ss (real_value *result, combined_fn fn,
switch (fn)
{
CASE_CFN_SQRT:
+ CASE_CFN_SQRT_FN:
return (real_compare (GE_EXPR, arg, &dconst0)
&& do_mpfr_arg1 (result, mpfr_sqrt, arg, format));
@@ -1179,14 +1180,17 @@ fold_const_call_sss (real_value *result, combined_fn fn,
return do_mpfr_arg2 (result, mpfr_hypot, arg0, arg1, format);
CASE_CFN_COPYSIGN:
+ CASE_CFN_COPYSIGN_FN:
*result = *arg0;
real_copysign (result, arg1);
return true;
CASE_CFN_FMIN:
+ CASE_CFN_FMIN_FN:
return do_mpfr_arg2 (result, mpfr_min, arg0, arg1, format);
CASE_CFN_FMAX:
+ CASE_CFN_FMAX_FN:
return do_mpfr_arg2 (result, mpfr_max, arg0, arg1, format);
CASE_CFN_POW:
@@ -1473,6 +1477,7 @@ fold_const_call_ssss (real_value *result, combined_fn fn,
switch (fn)
{
CASE_CFN_FMA:
+ CASE_CFN_FMA_FN:
return do_mpfr_arg3 (result, mpfr_fma, arg0, arg1, arg2, format);
case CFN_FMS:
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 72e4f2d4c96..e92b5efed4a 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -3593,7 +3593,8 @@ operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags)
#undef OP_SAME_WITH_NULL
}
-/* Similar to operand_equal_p, but strip nops first. */
+/* Similar to operand_equal_p, but see if ARG0 might be a variant of ARG1
+ with a different signedness or a narrower precision. */
static bool
operand_equal_for_comparison_p (tree arg0, tree arg1)
@@ -3608,9 +3609,20 @@ operand_equal_for_comparison_p (tree arg0, tree arg1)
/* Discard any conversions that don't change the modes of ARG0 and ARG1
and see if the inner values are the same. This removes any
signedness comparison, which doesn't matter here. */
- STRIP_NOPS (arg0);
- STRIP_NOPS (arg1);
- if (operand_equal_p (arg0, arg1, 0))
+ tree op0 = arg0;
+ tree op1 = arg1;
+ STRIP_NOPS (op0);
+ STRIP_NOPS (op1);
+ if (operand_equal_p (op0, op1, 0))
+ return true;
+
+ /* Discard a single widening conversion from ARG1 and see if the inner
+ value is the same as ARG0. */
+ if (CONVERT_EXPR_P (arg1)
+ && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (arg1, 0)))
+ && TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg1, 0)))
+ < TYPE_PRECISION (TREE_TYPE (arg1))
+ && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
return true;
return false;
@@ -4023,47 +4035,6 @@ invert_truthvalue_loc (location_t loc, tree arg)
: TRUTH_NOT_EXPR,
type, arg);
}
-
-/* Knowing that ARG0 and ARG1 are both RDIV_EXPRs, simplify a binary operation
- with code CODE. This optimization is unsafe. */
-static tree
-distribute_real_division (location_t loc, enum tree_code code, tree type,
- tree arg0, tree arg1)
-{
- bool mul0 = TREE_CODE (arg0) == MULT_EXPR;
- bool mul1 = TREE_CODE (arg1) == MULT_EXPR;
-
- /* (A / C) +- (B / C) -> (A +- B) / C. */
- if (mul0 == mul1
- && operand_equal_p (TREE_OPERAND (arg0, 1),
- TREE_OPERAND (arg1, 1), 0))
- return fold_build2_loc (loc, mul0 ? MULT_EXPR : RDIV_EXPR, type,
- fold_build2_loc (loc, code, type,
- TREE_OPERAND (arg0, 0),
- TREE_OPERAND (arg1, 0)),
- TREE_OPERAND (arg0, 1));
-
- /* (A / C1) +- (A / C2) -> A * (1 / C1 +- 1 / C2). */
- if (operand_equal_p (TREE_OPERAND (arg0, 0),
- TREE_OPERAND (arg1, 0), 0)
- && TREE_CODE (TREE_OPERAND (arg0, 1)) == REAL_CST
- && TREE_CODE (TREE_OPERAND (arg1, 1)) == REAL_CST)
- {
- REAL_VALUE_TYPE r0, r1;
- r0 = TREE_REAL_CST (TREE_OPERAND (arg0, 1));
- r1 = TREE_REAL_CST (TREE_OPERAND (arg1, 1));
- if (!mul0)
- real_arithmetic (&r0, RDIV_EXPR, &dconst1, &r0);
- if (!mul1)
- real_arithmetic (&r1, RDIV_EXPR, &dconst1, &r1);
- real_arithmetic (&r0, code, &r0, &r1);
- return fold_build2_loc (loc, MULT_EXPR, type,
- TREE_OPERAND (arg0, 0),
- build_real (type, r0));
- }
-
- return NULL_TREE;
-}
/* Return a BIT_FIELD_REF of type TYPE to refer to BITSIZE bits of INNER
starting at BITPOS. The field is unsigned if UNSIGNEDP is nonzero
@@ -4107,7 +4078,7 @@ make_bit_field_ref (location_t loc, tree inner, tree orig_inner, tree type,
build_fold_addr_expr (inner),
build_int_cst (ptr_type_node, 0));
- if (known_zero (bitpos) && !reversep)
+ if (must_eq (bitpos, 0) && !reversep)
{
tree size = TYPE_SIZE (TREE_TYPE (inner));
if ((INTEGRAL_TYPE_P (TREE_TYPE (inner))
@@ -4250,21 +4221,20 @@ optimize_bit_field_compare (location_t loc, enum tree_code code,
size_int (nbitsize - lbitsize - lbitpos));
if (! const_p)
- /* If not comparing with constant, just rework the comparison
- and return. */
- return fold_build2_loc (loc, code, compare_type,
- fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type,
- make_bit_field_ref (loc, linner, lhs,
- unsigned_type,
- nbitsize, nbitpos,
- 1, lreversep),
- mask),
- fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type,
- make_bit_field_ref (loc, rinner, rhs,
- unsigned_type,
- nbitsize, nbitpos,
- 1, rreversep),
- mask));
+ {
+ if (nbitpos < 0)
+ return 0;
+
+ /* If not comparing with constant, just rework the comparison
+ and return. */
+ tree t1 = make_bit_field_ref (loc, linner, lhs, unsigned_type,
+ nbitsize, nbitpos, 1, lreversep);
+ t1 = fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type, t1, mask);
+ tree t2 = make_bit_field_ref (loc, rinner, rhs, unsigned_type,
+ nbitsize, nbitpos, 1, rreversep);
+ t2 = fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type, t2, mask);
+ return fold_build2_loc (loc, code, compare_type, t1, t2);
+ }
/* Otherwise, we are handling the constant case. See if the constant is too
big for the field. Warn and return a tree for 0 (false) if so. We do
@@ -4295,6 +4265,9 @@ optimize_bit_field_compare (location_t loc, enum tree_code code,
}
}
+ if (nbitpos < 0)
+ return 0;
+
/* Single-bit compares should always be against zero. */
if (lbitsize == 1 && ! integer_zerop (rhs))
{
@@ -6117,7 +6090,10 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
results. */
ll_mask = const_binop (BIT_IOR_EXPR, ll_mask, rl_mask);
lr_mask = const_binop (BIT_IOR_EXPR, lr_mask, rr_mask);
- if (lnbitsize == rnbitsize && xll_bitpos == xlr_bitpos)
+ if (lnbitsize == rnbitsize
+ && xll_bitpos == xlr_bitpos
+ && lnbitpos >= 0
+ && rnbitpos >= 0)
{
lhs = make_bit_field_ref (loc, ll_inner, ll_arg,
lntype, lnbitsize, lnbitpos,
@@ -6141,10 +6117,14 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
Note that we still must mask the lhs/rhs expressions. Furthermore,
the mask must be shifted to account for the shift done by
make_bit_field_ref. */
- if ((ll_bitsize + ll_bitpos == rl_bitpos
- && lr_bitsize + lr_bitpos == rr_bitpos)
- || (ll_bitpos == rl_bitpos + rl_bitsize
- && lr_bitpos == rr_bitpos + rr_bitsize))
+ if (((ll_bitsize + ll_bitpos == rl_bitpos
+ && lr_bitsize + lr_bitpos == rr_bitpos)
+ || (ll_bitpos == rl_bitpos + rl_bitsize
+ && lr_bitpos == rr_bitpos + rr_bitsize))
+ && ll_bitpos >= 0
+ && rl_bitpos >= 0
+ && lr_bitpos >= 0
+ && rr_bitpos >= 0)
{
tree type;
@@ -6213,6 +6193,9 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
}
}
+ if (lnbitpos < 0)
+ return 0;
+
/* Construct the expression we will return. First get the component
reference we will make. Unless the mask is all ones the width of
that field, perform the mask operation. Then compare with the
@@ -7977,7 +7960,7 @@ fold_unary_loc (location_t loc, enum tree_code code, tree type, tree op0)
the address of the base if it has the same base type
as the result type and the pointer type is unqualified. */
if (!offset
- && known_zero (bitpos)
+ && must_eq (bitpos, 0)
&& (TYPE_MAIN_VARIANT (TREE_TYPE (type))
== TYPE_MAIN_VARIANT (TREE_TYPE (base)))
&& TYPE_QUALS (type) == TYPE_UNQUALIFIED)
@@ -8501,7 +8484,7 @@ pointer_may_wrap_p (tree base, tree offset, poly_int64 bitpos)
if (!total.to_uhwi (&total_hwi)
|| !poly_int_tree_p (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (base))),
&size)
- || known_zero (size))
+ || must_eq (size, 0U))
return true;
if (must_le (total_hwi, size))
@@ -8512,7 +8495,7 @@ pointer_may_wrap_p (tree base, tree offset, poly_int64 bitpos)
if (TREE_CODE (base) == ADDR_EXPR
&& poly_int_tree_p (TYPE_SIZE_UNIT (TREE_TYPE (TREE_OPERAND (base, 0))),
&size)
- && maybe_nonzero (size)
+ && may_ne (size, 0U)
&& must_le (total_hwi, size))
return false;
@@ -8794,7 +8777,7 @@ fold_comparison (location_t loc, enum tree_code code, tree type,
eliminated. When ptr is null, although the -> expression
is strictly speaking invalid, GCC retains it as a matter
of QoI. See PR c/44555. */
- && (offset0 == NULL_TREE && known_nonzero (bitpos0)))
+ && (offset0 == NULL_TREE && must_ne (bitpos0, 0)))
|| CONSTANT_CLASS_P (base0))
&& indirect_base0
/* The caller guarantees that when one of the arguments is
@@ -9674,12 +9657,6 @@ fold_binary_loc (location_t loc,
}
}
- if (flag_unsafe_math_optimizations
- && (TREE_CODE (arg0) == RDIV_EXPR || TREE_CODE (arg0) == MULT_EXPR)
- && (TREE_CODE (arg1) == RDIV_EXPR || TREE_CODE (arg1) == MULT_EXPR)
- && (tem = distribute_real_division (loc, code, type, arg0, arg1)))
- return tem;
-
/* Convert a + (b*c + d*e) into (a + b*c) + d*e.
We associate floats only if the user has specified
-fassociative-math. */
@@ -10079,13 +10056,6 @@ fold_binary_loc (location_t loc,
return tem;
}
- if (FLOAT_TYPE_P (type)
- && flag_unsafe_math_optimizations
- && (TREE_CODE (arg0) == RDIV_EXPR || TREE_CODE (arg0) == MULT_EXPR)
- && (TREE_CODE (arg1) == RDIV_EXPR || TREE_CODE (arg1) == MULT_EXPR)
- && (tem = distribute_real_division (loc, code, type, arg0, arg1)))
- return tem;
-
/* Handle (A1 * C1) - (A2 * C2) with A1, A2 or C1, C2 being the same or
one. Make sure the type is not saturating and has the signedness of
the stripped operands, as fold_plusminus_mult_expr will re-associate.
@@ -11502,8 +11472,8 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type,
Also try swapping the arguments and inverting the conditional. */
if (COMPARISON_CLASS_P (arg0)
- && operand_equal_for_comparison_p (TREE_OPERAND (arg0, 0), arg1)
- && !HONOR_SIGNED_ZEROS (element_mode (arg1)))
+ && operand_equal_for_comparison_p (TREE_OPERAND (arg0, 0), op1)
+ && !HONOR_SIGNED_ZEROS (element_mode (op1)))
{
tem = fold_cond_expr_with_comparison (loc, type, arg0, op1, op2);
if (tem)
@@ -13097,6 +13067,7 @@ tree_call_nonnegative_warnv_p (tree type, combined_fn fn, tree arg0, tree arg1,
return true;
CASE_CFN_SQRT:
+ CASE_CFN_SQRT_FN:
/* sqrt(-0.0) is -0.0. */
if (!HONOR_SIGNED_ZEROS (element_mode (type)))
return true;
@@ -13141,14 +13112,17 @@ tree_call_nonnegative_warnv_p (tree type, combined_fn fn, tree arg0, tree arg1,
return RECURSE (arg0);
CASE_CFN_FMAX:
+ CASE_CFN_FMAX_FN:
/* True if the 1st OR 2nd arguments are nonnegative. */
return RECURSE (arg0) || RECURSE (arg1);
CASE_CFN_FMIN:
+ CASE_CFN_FMIN_FN:
/* True if the 1st AND 2nd arguments are nonnegative. */
return RECURSE (arg0) && RECURSE (arg1);
CASE_CFN_COPYSIGN:
+ CASE_CFN_COPYSIGN_FN:
/* True if the 2nd argument is nonnegative. */
return RECURSE (arg1);
@@ -13647,7 +13621,9 @@ integer_valued_real_call_p (combined_fn fn, tree arg0, tree arg1, int depth)
return true;
CASE_CFN_FMIN:
+ CASE_CFN_FMIN_FN:
CASE_CFN_FMAX:
+ CASE_CFN_FMAX_FN:
return RECURSE (arg0) && RECURSE (arg1);
default:
@@ -14732,9 +14708,14 @@ test_vector_folding ()
static void
test_vec_duplicate_folding ()
{
- tree type = build_vector_type (ssizetype, 4);
- tree dup5 = build_vec_duplicate_cst (type, ssize_int (5));
- tree dup3 = build_vec_duplicate_cst (type, ssize_int (3));
+ scalar_int_mode int_mode = SCALAR_INT_TYPE_MODE (ssizetype);
+ machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (int_mode);
+ /* This will be 1 if VEC_MODE isn't a vector mode. */
+ poly_uint64 nunits = GET_MODE_NUNITS (vec_mode);
+
+ tree type = build_vector_type (ssizetype, nunits);
+ tree dup5 = build_vector_from_val (type, ssize_int (5));
+ tree dup3 = build_vector_from_val (type, ssize_int (3));
tree neg_dup5 = fold_unary (NEGATE_EXPR, type, dup5);
ASSERT_EQ (uniform_vector_p (neg_dup5), ssize_int (-5));
@@ -14748,7 +14729,7 @@ test_vec_duplicate_folding ()
tree dup5_lsl_2 = fold_binary (LSHIFT_EXPR, type, dup5, ssize_int (2));
ASSERT_EQ (uniform_vector_p (dup5_lsl_2), ssize_int (20));
- tree size_vector = build_vector_type (sizetype, 4);
+ tree size_vector = build_vector_type (sizetype, nunits);
tree size_dup5 = fold_convert (size_vector, dup5);
ASSERT_EQ (uniform_vector_p (size_dup5), size_int (5));
@@ -14757,6 +14738,54 @@ test_vec_duplicate_folding ()
ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));
}
+/* Verify folding of VEC_SERIES_CSTs and VEC_SERIES_EXPRs. */
+
+static void
+test_vec_series_folding ()
+{
+ scalar_int_mode int_mode = SCALAR_INT_TYPE_MODE (ssizetype);
+ machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (int_mode);
+ poly_uint64 nunits = GET_MODE_NUNITS (vec_mode);
+ if (must_eq (nunits, 1U))
+ nunits = 4;
+
+ tree type = build_vector_type (ssizetype, nunits);
+ tree s5_4 = build_vec_series (type, ssize_int (5), ssize_int (4));
+ tree s3_9 = build_vec_series (type, ssize_int (3), ssize_int (9));
+
+ tree neg_s5_4_a = fold_unary (NEGATE_EXPR, type, s5_4);
+ tree neg_s5_4_b = build_vec_series (type, ssize_int (-5), ssize_int (-4));
+ ASSERT_TRUE (operand_equal_p (neg_s5_4_a, neg_s5_4_b, 0));
+
+ tree s8_s13_a = fold_binary (PLUS_EXPR, type, s5_4, s3_9);
+ tree s8_s13_b = build_vec_series (type, ssize_int (8), ssize_int (13));
+ ASSERT_TRUE (operand_equal_p (s8_s13_a, s8_s13_b, 0));
+
+ tree s2_m5_a = fold_binary (MINUS_EXPR, type, s5_4, s3_9);
+ tree s2_m5_b = build_vec_series (type, ssize_int (2), ssize_int (-5));
+ ASSERT_TRUE (operand_equal_p (s2_m5_a, s2_m5_b, 0));
+
+ tree s11 = build_vector_from_val (type, ssize_int (11));
+ tree s16_4_a = fold_binary (PLUS_EXPR, type, s5_4, s11);
+ tree s16_4_b = fold_binary (PLUS_EXPR, type, s11, s5_4);
+ tree s16_4_c = build_vec_series (type, ssize_int (16), ssize_int (4));
+ ASSERT_TRUE (operand_equal_p (s16_4_a, s16_4_c, 0));
+ ASSERT_TRUE (operand_equal_p (s16_4_b, s16_4_c, 0));
+
+ tree sm6_4_a = fold_binary (MINUS_EXPR, type, s5_4, s11);
+ tree sm6_4_b = build_vec_series (type, ssize_int (-6), ssize_int (4));
+ ASSERT_TRUE (operand_equal_p (sm6_4_a, sm6_4_b, 0));
+
+ tree s6_m4_a = fold_binary (MINUS_EXPR, type, s11, s5_4);
+ tree s6_m4_b = build_vec_series (type, ssize_int (6), ssize_int (-4));
+ ASSERT_TRUE (operand_equal_p (s6_m4_a, s6_m4_b, 0));
+
+ tree s5_4_expr = fold_binary (VEC_SERIES_EXPR, type,
+ ssize_int (5), ssize_int (4));
+ ASSERT_TRUE (operand_equal_p (s5_4_expr, s5_4, 0));
+ ASSERT_FALSE (operand_equal_p (s5_4_expr, s3_9, 0));
+}
+
/* Run all of the selftests within this file. */
void
@@ -14765,6 +14794,7 @@ fold_const_c_tests ()
test_arithmetic_folding ();
test_vector_folding ();
test_vec_duplicate_folding ();
+ test_vec_series_folding ();
}
} // namespace selftest
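The new test_vec_series_folding cases rest on the identity that a series vector with base B and step S has elements {B, B + S, B + 2S, ...}, so adding two series adds their bases and their steps. A small standalone check of that arithmetic, with plain arrays standing in for GCC's vector trees (an assumption made only for illustration):

#include <cassert>
#include <cstddef>

// Fill OUT[i] = base + i * step, the element pattern of a series vector.
static void series (long *out, size_t n, long base, long step)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = base + (long) i * step;
}

int main ()
{
  const size_t n = 4;
  long s5_4[n], s3_9[n], sum[n], expect[n];

  series (s5_4, n, 5, 4);
  series (s3_9, n, 3, 9);
  series (expect, n, 8, 13);      // bases add (5 + 3) and steps add (4 + 9)

  for (size_t i = 0; i < n; ++i)
    {
      sum[i] = s5_4[i] + s3_9[i];
      assert (sum[i] == expect[i]);
    }
  return 0;
}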
diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index a37d16e51fa..aa43ff4ebff 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -1,3 +1,175 @@
+2017-11-04 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/81735
+ * trans-decl.c (gfc_trans_deferred_vars): Do a better job of a
+ case where 'tmp' was used uninitialized and remove TODO.
+
+2017-11-03 Steven G. Kargl <kargl@gcc.gnu.org>
+
+ PR fortran/82796
+ * resolve.c (resolve_equivalence): An entity in a common block within
+ a module cannot appear in an equivalence statement if the entity is
+ with a pure procedure.
+
+2017-10-31 Jim Wilson <wilson@tuliptree.org>
+
+ * parse.c (unexpected_eof): Call gcc_unreachable before return.
+
+2017-10-30 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/80850
+ * trans_expr.c (gfc_conv_procedure_call): When passing a class
+ argument to an unlimited polymorphic dummy, it is wrong to cast
+ the passed expression as unlimited, unless it is unlimited. The
+ correct way is to assign to each of the fields and set the _len
+ field to zero.
+
+2017-10-30 Steven G. Kargl <kargl@gcc.gnu.org>
+
+ * resolve.c (resolve_transfer): Set derived to correct symbol for
+ BT_CLASS.
+
+2017-10-29 Jim Wilson <wilson@tuliptree.org>
+
+ * invoke.texi: Delete adb and sdb references.
+
+2017-10-28 Andre Vehreschild <vehre@gcc.gnu.org>
+
+ * check.c (gfc_check_co_reduce): Clarify error message.
+
+2017-10-28 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/81758
+ * trans-expr.c (trans_class_vptr_len_assignment): 'vptr_expr'
+ must only be set if the right hand side expression is of type
+ class.
+
+2017-10-27 Steven G. Kargl <kargl@gcc.gnu.org>
+
+ PR fortran/82620
+ * match.c (gfc_match_allocate): Exit early on syntax error.
+
+2017-10-27 Thomas Koenig <tkoenig@gcc.gnu.org>
+
+ PR fortran/56342
+ * simplify.c (is_constant_array_expr): If the expression is
+ a parameter array, call gfc_simplify_expr.
+
+2017-10-25 Bernhard Reutner-Fischer <aldot@gcc.gnu.org>
+
+ * match.c (gfc_match_type_is): Fix typo in error message.
+
+2017-10-21 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/82586
+ * decl.c (gfc_get_pdt_instance): Remove the error message that
+ the parameter does not have a corresponding component since
+ this is now taken care of when the derived type is resolved. Go
+ straight to error return instead.
+ (gfc_match_formal_arglist): Make the PDT relevant errors
+ immediate so that parsing of the derived type can continue.
+ (gfc_match_derived_decl): Do not check the match status on
+ return from gfc_match_formal_arglist for the same reason.
+ * resolve.c (resolve_fl_derived0): Check that each type
+ parameter has a corresponding component.
+
+ PR fortran/82587
+ * resolve.c (resolve_generic_f): Check that the derived type
+ can be used before resolving the structure constructor.
+
+ PR fortran/82589
+ * symbol.c (check_conflict): Add the conflicts involving PDT
+ KIND and LEN attributes.
+
+2017-10-19 Bernhard Reutner-Fischer <aldot@gcc.gnu.org>
+
+ * interface.c (check_sym_interfaces, check_uop_interfaces,
+ gfc_check_interfaces): Base interface_name buffer off
+ GFC_MAX_SYMBOL_LEN.
+
+2017-10-19 Jakub Jelinek <jakub@redhat.com>
+
+ PR fortran/82568
+ * gfortran.h (gfc_resolve_do_iterator): Add a bool arg.
+ (gfc_resolve_omp_local_vars): New declaration.
+ * openmp.c (omp_current_ctx): Make static.
+ (gfc_resolve_omp_parallel_blocks): Handle EXEC_OMP_TASKLOOP
+ and EXEC_OMP_TASKLOOP_SIMD.
+ (gfc_resolve_do_iterator): Add ADD_CLAUSE argument, if false,
+ don't actually add any clause. Move omp_current_ctx test
+ earlier.
+ (handle_local_var, gfc_resolve_omp_local_vars): New functions.
+ * resolve.c (gfc_resolve_code): Call gfc_resolve_omp_parallel_blocks
+ instead of just gfc_resolve_omp_do_blocks for EXEC_OMP_TASKLOOP
+ and EXEC_OMP_TASKLOOP_SIMD.
+ (gfc_resolve_code): Adjust gfc_resolve_do_iterator caller.
+ (resolve_codes): Call gfc_resolve_omp_local_vars.
+
+2017-10-19 Bernhard Reutner-Fischer <aldot@gcc.gnu.org>
+
+ * gfortran.h (gfc_lookup_function_fuzzy): New declaration.
+ (gfc_closest_fuzzy_match): New declaration.
+ (vec_push): New definition.
+ * misc.c (gfc_closest_fuzzy_match): New definition.
+ * resolve.c: Include spellcheck.h.
+ (lookup_function_fuzzy_find_candidates): New static function.
+ (lookup_uop_fuzzy_find_candidates): Likewise.
+ (lookup_uop_fuzzy): Likewise.
+ (resolve_operator) <INTRINSIC_USER>: Call lookup_uop_fuzzy.
+ (gfc_lookup_function_fuzzy): New definition.
+ (resolve_unknown_f): Call gfc_lookup_function_fuzzy.
+ * interface.c (check_interface0): Likewise.
+ (lookup_arg_fuzzy_find_candidates): New static function.
+ (lookup_arg_fuzzy ): Likewise.
+ (compare_actual_formal): Call lookup_arg_fuzzy.
+ * symbol.c: Include spellcheck.h.
+ (lookup_symbol_fuzzy_find_candidates): New static function.
+ (lookup_symbol_fuzzy): Likewise.
+ (gfc_set_default_type): Call lookup_symbol_fuzzy.
+ (lookup_component_fuzzy_find_candidates): New static function.
+ (lookup_component_fuzzy): Likewise.
+ (gfc_find_component): Call lookup_component_fuzzy.
+
+2017-10-18 Thomas Koenig <tkoenig@gcc.gnu.org>
+
+ PR fortran/82567
+ * frontend-passes.c (combine_array_constructor): If an array
+ constructor consists only of constants and has more elements than
+ a small limit, don't convert a*[b,c] to [a*b,a*c]; this keeps
+ compile times down.
+
+2017-10-18 Thomas Koenig <tkoenig@gcc.gnu.org>
+
+ PR fortran/79795
+ * resolve.c (resolve_symbol): Change gcc_assert to a
+ sensible error message.
+
+2017-10-18 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/82550
+ * trans-decl.c (gfc_get_symbol_decl): Procedure symbols that
+ have the 'used_in_submodule' attribute should be processed by
+ 'gfc_get_extern_function_decl'.
+
+2017-10-16 Fritz Reese <fritzoreese@gmail.com>
+
+ PR fortran/82511
+ * trans-io.c (transfer_expr): Treat BT_UNION as BT_DERIVED.
+
+2017-10-15 Thomas Koenig <tkoenig@gcc.gnu.org>
+
+ PR fortran/82372
+ * scanner.c (last_error_char): New global variable.
+ (gfc_scanner_init_1): Set last_error_char to NULL.
+ (gfc_gobble_whitespace): If a character is neither printable
+ nor a newline, issue an error.
+
+2017-10-13 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/81048
+ * resolve.c (resolve_symbol): Ensure that derived type array
+ results get default initialization.
+
2017-10-11 Nathan Sidwell <nathan@acm.org>
* cpp.c (gfc_cpp_add_include_path): Update incpath_e names.
diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index 681950e782f..759c15adaec 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1731,7 +1731,7 @@ gfc_check_co_reduce (gfc_expr *a, gfc_expr *op, gfc_expr *result_image,
if (!gfc_compare_types (&a->ts, &sym->result->ts))
{
- gfc_error ("A argument at %L has type %s but the function passed as "
+ gfc_error ("The A argument at %L has type %s but the function passed as "
"OPERATOR at %L returns %s",
&a->where, gfc_typename (&a->ts), &op->where,
gfc_typename (&sym->result->ts));
diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 5bf56c4d4b0..1a2d8f004ca 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -3242,13 +3242,10 @@ gfc_get_pdt_instance (gfc_actual_arglist *param_list, gfc_symbol **sym,
param = type_param_name_list->sym;
c1 = gfc_find_component (pdt, param->name, false, true, NULL);
+ /* An error should already have been thrown in resolve.c
+ (resolve_fl_derived0). */
if (!pdt->attr.use_assoc && !c1)
- {
- gfc_error ("The type parameter name list at %L contains a parameter "
- "'%qs' , which is not declared as a component of the type",
- &pdt->declared_at, param->name);
- goto error_return;
- }
+ goto error_return;
kind_expr = NULL;
if (!name_seen)
@@ -5984,7 +5981,7 @@ gfc_match_formal_arglist (gfc_symbol *progname, int st_flag,
/* The name of a program unit can be in a different namespace,
so check for it explicitly. After the statement is accepted,
the name is checked for especially in gfc_get_symbol(). */
- if (gfc_new_block != NULL && sym != NULL
+ if (gfc_new_block != NULL && sym != NULL && !typeparam
&& strcmp (sym->name, gfc_new_block->name) == 0)
{
gfc_error ("Name %qs at %C is the name of the procedure",
@@ -5999,7 +5996,11 @@ gfc_match_formal_arglist (gfc_symbol *progname, int st_flag,
m = gfc_match_char (',');
if (m != MATCH_YES)
{
- gfc_error ("Unexpected junk in formal argument list at %C");
+ if (typeparam)
+ gfc_error_now ("Expected parameter list in type declaration "
+ "at %C");
+ else
+ gfc_error ("Unexpected junk in formal argument list at %C");
goto cleanup;
}
}
@@ -6016,8 +6017,12 @@ ok:
for (q = p->next; q; q = q->next)
if (p->sym == q->sym)
{
- gfc_error ("Duplicate symbol %qs in formal argument list "
- "at %C", p->sym->name);
+ if (typeparam)
+ gfc_error_now ("Duplicate name %qs in parameter "
+ "list at %C", p->sym->name);
+ else
+ gfc_error ("Duplicate symbol %qs in formal argument "
+ "list at %C", p->sym->name);
m = MATCH_ERROR;
goto cleanup;
@@ -9814,9 +9819,9 @@ gfc_match_derived_decl (void)
if (parameterized_type)
{
- m = gfc_match_formal_arglist (sym, 0, 0, true);
- if (m != MATCH_YES)
- return m;
+ /* Ignore errors or mismatches to avoid the component declarations
+ causing problems later. */
+ gfc_match_formal_arglist (sym, 0, 0, true);
m = gfc_match_eos ();
if (m != MATCH_YES)
return m;
diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index ae4fba63b3c..fcfaf9508c2 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -1635,6 +1635,8 @@ combine_array_constructor (gfc_expr *e)
gfc_constructor *c, *new_c;
gfc_constructor_base oldbase, newbase;
bool scalar_first;
+ int n_elem;
+ bool all_const;
/* Array constructors have rank one. */
if (e->rank != 1)
@@ -1674,12 +1676,38 @@ combine_array_constructor (gfc_expr *e)
if (op2->ts.type == BT_CHARACTER)
return false;
- scalar = create_var (gfc_copy_expr (op2), "constr");
+ /* This might be an expanded constructor with very many constant values. If
+ we perform the operation here, we might end up with a long compile time
+ and actually longer execution time, so a length bound is in order here.
+ If the constructor contains something which is not a constant, it did
+ not come from an expansion, so leave it alone. */
+
+#define CONSTR_LEN_MAX 4
oldbase = op1->value.constructor;
+
+ n_elem = 0;
+ all_const = true;
+ for (c = gfc_constructor_first (oldbase); c; c = gfc_constructor_next(c))
+ {
+ if (c->expr->expr_type != EXPR_CONSTANT)
+ {
+ all_const = false;
+ break;
+ }
+ n_elem += 1;
+ }
+
+ if (all_const && n_elem > CONSTR_LEN_MAX)
+ return false;
+
+#undef CONSTR_LEN_MAX
+
newbase = NULL;
e->expr_type = EXPR_ARRAY;
+ scalar = create_var (gfc_copy_expr (op2), "constr");
+
for (c = gfc_constructor_first (oldbase); c;
c = gfc_constructor_next (c))
{
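
The length bound added above is a compile-time guard: distributing a scalar over an already-expanded constant constructor only multiplies work. A minimal standalone sketch of the same test, assuming an invented element type and threshold (worth_distributing and MAX_DISTRIBUTED_LEN are illustrative names, not the gfc_* API):

#include <cstdio>
#include <vector>

/* Invented element type: either a known constant or something symbolic.  */
struct Elem { bool is_constant; };

/* Illustrative threshold mirroring CONSTR_LEN_MAX in the hunk above.  */
static const int MAX_DISTRIBUTED_LEN = 4;

/* Mirror of the guard in combine_array_constructor: distribute a scalar
   over the constructor only when the constructor is short or contains a
   non-constant element (i.e. it did not come from array expansion).  */
static bool
worth_distributing (const std::vector<Elem> &ctor)
{
  int n_const = 0;
  for (const Elem &e : ctor)
    {
      if (!e.is_constant)
        return true;            /* Not an expanded constant array.  */
      ++n_const;
    }
  return n_const <= MAX_DISTRIBUTED_LEN;  /* Small enough to stay cheap.  */
}

int
main ()
{
  std::vector<Elem> expanded (10, Elem { true });   /* e.g. [1, 2, ..., 10] */
  std::vector<Elem> mixed = { { true }, { false } };
  printf ("%d %d\n", worth_distributing (expanded),
          worth_distributing (mixed));              /* prints "0 1" */
}

Anything above the threshold that is all-constant is left for ordinary constant folding instead of being rewritten element by element.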
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index b5fc1452747..2c2fc636708 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2796,6 +2796,17 @@ void gfc_done_2 (void);
int get_c_kind (const char *, CInteropKind_t *);
+const char *gfc_closest_fuzzy_match (const char *, char **);
+static inline void
+vec_push (char **&optr, size_t &osz, const char *elt)
+{
+ /* Minimal {auto,}vec.safe_push () replacement: keep the array NULL-terminated. */
+ /* Skipping very short names here would be a premature optimization; the
+ distance cutoff in gfc_closest_fuzzy_match already filters them out. */
+ optr = XRESIZEVEC (char *, optr, osz + 2);
+ optr[osz] = CONST_CAST (char *, elt);
+ optr[++osz] = NULL;
+}
+
/* options.c */
unsigned int gfc_option_lang_mask (void);
void gfc_init_options_struct (struct gcc_options *);
@@ -3103,7 +3114,8 @@ void gfc_free_omp_declare_simd_list (gfc_omp_declare_simd *);
void gfc_free_omp_udr (gfc_omp_udr *);
gfc_omp_udr *gfc_omp_udr_find (gfc_symtree *, gfc_typespec *);
void gfc_resolve_omp_directive (gfc_code *, gfc_namespace *);
-void gfc_resolve_do_iterator (gfc_code *, gfc_symbol *);
+void gfc_resolve_do_iterator (gfc_code *, gfc_symbol *, bool);
+void gfc_resolve_omp_local_vars (gfc_namespace *);
void gfc_resolve_omp_parallel_blocks (gfc_code *, gfc_namespace *);
void gfc_resolve_omp_do_blocks (gfc_code *, gfc_namespace *);
void gfc_resolve_omp_declare_simd (gfc_namespace *);
@@ -3228,6 +3240,7 @@ bool gfc_type_is_extensible (gfc_symbol *);
bool gfc_resolve_intrinsic (gfc_symbol *, locus *);
bool gfc_explicit_interface_required (gfc_symbol *, char *, int);
extern int gfc_do_concurrent_flag;
+const char* gfc_lookup_function_fuzzy (const char *, gfc_symtree *);
/* array.c */
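
The vec_push helper declared above keeps a heap-allocated, NULL-terminated array of C strings so that gfc_closest_fuzzy_match can walk the candidates without a separate length. A standalone sketch of that idiom using plain realloc in place of GCC's XRESIZEVEC wrapper (push_candidate is an illustrative name, not part of the front end):

#include <cstdio>
#include <cstdlib>

/* Grow a NULL-terminated array of C strings by one element; N counts the
   stored elements, not the trailing NULL.  Plain realloc stands in for
   XRESIZEVEC, and no CONST_CAST is needed because the array holds
   const char * directly.  */
static void
push_candidate (const char **&vec, size_t &n, const char *elt)
{
  vec = static_cast<const char **> (realloc (vec, (n + 2) * sizeof *vec));
  vec[n] = elt;
  vec[++n] = nullptr;           /* Keep the array NULL-terminated.  */
}

int
main ()
{
  const char **cands = nullptr;
  size_t n = 0;
  push_candidate (cands, n, "sin");
  push_candidate (cands, n, "sinh");
  for (const char **p = cands; p && *p; p++)    /* Walk without a length.  */
    printf ("%s\n", *p);
  free (cands);
}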
diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index f8ef33fc778..9f0fcc82f24 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -1793,13 +1793,27 @@ check_interface0 (gfc_interface *p, const char *interface_name)
|| !p->sym->attr.if_source)
&& !gfc_fl_struct (p->sym->attr.flavor))
{
+ const char *guessed
+ = gfc_lookup_function_fuzzy (p->sym->name, p->sym->ns->sym_root);
+
if (p->sym->attr.external)
- gfc_error ("Procedure %qs in %s at %L has no explicit interface",
- p->sym->name, interface_name, &p->sym->declared_at);
+ if (guessed)
+ gfc_error ("Procedure %qs in %s at %L has no explicit interface"
+ "; did you mean %qs?",
+ p->sym->name, interface_name, &p->sym->declared_at,
+ guessed);
+ else
+ gfc_error ("Procedure %qs in %s at %L has no explicit interface",
+ p->sym->name, interface_name, &p->sym->declared_at);
else
- gfc_error ("Procedure %qs in %s at %L is neither function nor "
- "subroutine", p->sym->name, interface_name,
- &p->sym->declared_at);
+ if (guessed)
+ gfc_error ("Procedure %qs in %s at %L is neither function nor "
+ "subroutine; did you mean %qs?", p->sym->name,
+ interface_name, &p->sym->declared_at, guessed);
+ else
+ gfc_error ("Procedure %qs in %s at %L is neither function nor "
+ "subroutine", p->sym->name, interface_name,
+ &p->sym->declared_at);
return true;
}
@@ -1904,7 +1918,7 @@ check_interface1 (gfc_interface *p, gfc_interface *q0,
static void
check_sym_interfaces (gfc_symbol *sym)
{
- char interface_name[100];
+ char interface_name[GFC_MAX_SYMBOL_LEN + sizeof("generic interface ''")];
gfc_interface *p;
if (sym->ns != gfc_current_ns)
@@ -1941,7 +1955,7 @@ check_sym_interfaces (gfc_symbol *sym)
static void
check_uop_interfaces (gfc_user_op *uop)
{
- char interface_name[100];
+ char interface_name[GFC_MAX_SYMBOL_LEN + sizeof("operator interface ''")];
gfc_user_op *uop2;
gfc_namespace *ns;
@@ -2018,7 +2032,7 @@ void
gfc_check_interfaces (gfc_namespace *ns)
{
gfc_namespace *old_ns, *ns2;
- char interface_name[100];
+ char interface_name[GFC_MAX_SYMBOL_LEN + sizeof("intrinsic '' operator")];
int i;
old_ns = gfc_current_ns;
@@ -2778,6 +2792,31 @@ is_procptr_result (gfc_expr *expr)
}
+/* Append candidate argument names from ARG to CANDIDATES. Store the
+ number of total candidates in CANDIDATES_LEN. */
+
+static void
+lookup_arg_fuzzy_find_candidates (gfc_formal_arglist *arg,
+ char **&candidates,
+ size_t &candidates_len)
+{
+ for (gfc_formal_arglist *p = arg; p && p->sym; p = p->next)
+ vec_push (candidates, candidates_len, p->sym->name);
+}
+
+
+/* Lookup argument ARG fuzzily, taking names in ARGUMENTS into account. */
+
+static const char*
+lookup_arg_fuzzy (const char *arg, gfc_formal_arglist *arguments)
+{
+ char **candidates = NULL;
+ size_t candidates_len = 0;
+ lookup_arg_fuzzy_find_candidates (arguments, candidates, candidates_len);
+ return gfc_closest_fuzzy_match (arg, candidates);
+}
+
+
/* Given formal and actual argument lists, see if they are compatible.
If they are compatible, the actual argument list is sorted to
correspond with the formal list, and elements for missing optional
@@ -2831,8 +2870,16 @@ compare_actual_formal (gfc_actual_arglist **ap, gfc_formal_arglist *formal,
if (f == NULL)
{
if (where)
- gfc_error ("Keyword argument %qs at %L is not in "
- "the procedure", a->name, &a->expr->where);
+ {
+ const char *guessed = lookup_arg_fuzzy (a->name, formal);
+ if (guessed)
+ gfc_error ("Keyword argument %qs at %L is not in "
+ "the procedure; did you mean %qs?",
+ a->name, &a->expr->where, guessed);
+ else
+ gfc_error ("Keyword argument %qs at %L is not in "
+ "the procedure", a->name, &a->expr->where);
+ }
return false;
}
@@ -3552,8 +3599,15 @@ gfc_procedure_use (gfc_symbol *sym, gfc_actual_arglist **ap, locus *where)
{
if (sym->ns->has_implicit_none_export && sym->attr.proc == PROC_UNKNOWN)
{
- gfc_error ("Procedure %qs called at %L is not explicitly declared",
- sym->name, where);
+ const char *guessed
+ = gfc_lookup_function_fuzzy (sym->name, sym->ns->sym_root);
+ if (guessed)
+ gfc_error ("Procedure %qs called at %L is not explicitly declared"
+ "; did you mean %qs?",
+ sym->name, where, guessed);
+ else
+ gfc_error ("Procedure %qs called at %L is not explicitly declared",
+ sym->name, where);
return false;
}
if (warn_implicit_interface)
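
The interface_name buffers above are now sized as GFC_MAX_SYMBOL_LEN plus the sizeof of the decorating literal; since sizeof on a string literal includes the terminating NUL, the sum exactly covers the fixed text, the longest possible symbol name and the terminator. A small check of that arithmetic, with an assumed limit of 63 standing in for GFC_MAX_SYMBOL_LEN:

#include <cstdio>
#include <cstring>

/* Assumption: stands in for GFC_MAX_SYMBOL_LEN.  */
static const int MAX_SYMBOL_LEN = 63;

int
main ()
{
  /* sizeof counts the NUL, so the buffer holds the fixed text, the longest
     possible name and the terminator with nothing to spare.  */
  char interface_name[MAX_SYMBOL_LEN + sizeof ("generic interface ''")];

  char name[MAX_SYMBOL_LEN + 1];
  memset (name, 'x', MAX_SYMBOL_LEN);
  name[MAX_SYMBOL_LEN] = '\0';

  int needed = snprintf (interface_name, sizeof interface_name,
                         "generic interface '%s'", name);
  printf ("needed %d, have %zu\n", needed, sizeof interface_name);
  return needed < (int) sizeof interface_name ? 0 : 1;   /* 83 < 84 */
}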
diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
index 8892d501d58..261f2535bb5 100644
--- a/gcc/fortran/invoke.texi
+++ b/gcc/fortran/invoke.texi
@@ -41,7 +41,7 @@ remainder.
@c man end
@c man begin SEEALSO
gpl(7), gfdl(7), fsf-funding(7),
-cpp(1), gcov(1), gcc(1), as(1), ld(1), gdb(1), adb(1), dbx(1), sdb(1)
+cpp(1), gcov(1), gcc(1), as(1), ld(1), gdb(1), dbx(1)
and the Info entries for @file{gcc}, @file{cpp}, @file{gfortran}, @file{as},
@file{ld}, @file{binutils} and @file{gdb}.
@c man end
diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index 4d657e0bc34..dcabe269e61 100644
--- a/gcc/fortran/match.c
+++ b/gcc/fortran/match.c
@@ -3968,7 +3968,10 @@ gfc_match_allocate (void)
saw_stat = saw_errmsg = saw_source = saw_mold = saw_deferred = false;
if (gfc_match_char ('(') != MATCH_YES)
- goto syntax;
+ {
+ gfc_syntax_error (ST_ALLOCATE);
+ return MATCH_ERROR;
+ }
/* Match an optional type-spec. */
old_locus = gfc_current_locus;
@@ -6204,7 +6207,7 @@ gfc_match_type_is (void)
return MATCH_YES;
syntax:
- gfc_error ("Ssyntax error in TYPE IS specification at %C");
+ gfc_error ("Syntax error in TYPE IS specification at %C");
cleanup:
if (c != NULL)
diff --git a/gcc/fortran/misc.c b/gcc/fortran/misc.c
index a2c199efb56..f47d111ba47 100644
--- a/gcc/fortran/misc.c
+++ b/gcc/fortran/misc.c
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3. If not see
#include "system.h"
#include "coretypes.h"
#include "gfortran.h"
+#include "spellcheck.h"
/* Initialize a typespec to unknown. */
@@ -280,3 +281,43 @@ get_c_kind(const char *c_kind_name, CInteropKind_t kinds_table[])
return ISOCBINDING_INVALID;
}
+
+
+ /* For a given name TYPO, determine the best candidate from CANDIDATES
+ using the Levenshtein distance. Frees CANDIDATES before returning. */
+
+const char *
+gfc_closest_fuzzy_match (const char *typo, char **candidates)
+{
+ /* Determine closest match. */
+ const char *best = NULL;
+ char **cand = candidates;
+ edit_distance_t best_distance = MAX_EDIT_DISTANCE;
+ const size_t tl = strlen (typo);
+
+ while (cand && *cand)
+ {
+ edit_distance_t dist = levenshtein_distance (typo, tl, *cand,
+ strlen (*cand));
+ if (dist < best_distance)
+ {
+ best_distance = dist;
+ best = *cand;
+ }
+ cand++;
+ }
+ /* If more than half of the letters were misspelled, the suggestion is
+ likely to be meaningless. */
+ if (best)
+ {
+ unsigned int cutoff = MAX (tl, strlen (best)) / 2;
+
+ if (best_distance > cutoff)
+ {
+ XDELETEVEC (candidates);
+ return NULL;
+ }
+ XDELETEVEC (candidates);
+ }
+ return best;
+}
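
gfc_closest_fuzzy_match above keeps the candidate with the smallest edit distance and then discards it when more than half of the letters would have to change. A self-contained sketch of that selection and cutoff; edit_distance here is a plain reimplementation, since levenshtein_distance and edit_distance_t belong to GCC's spellcheck.h:

#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

/* Plain dynamic-programming edit distance; GCC's spellcheck.h provides the
   real levenshtein_distance.  */
static size_t
edit_distance (const std::string &a, const std::string &b)
{
  std::vector<size_t> prev (b.size () + 1), cur (b.size () + 1);
  for (size_t j = 0; j <= b.size (); j++)
    prev[j] = j;
  for (size_t i = 1; i <= a.size (); i++)
    {
      cur[0] = i;
      for (size_t j = 1; j <= b.size (); j++)
        cur[j] = std::min ({ prev[j] + 1, cur[j - 1] + 1,
                             prev[j - 1] + (a[i - 1] != b[j - 1]) });
      std::swap (prev, cur);
    }
  return prev[b.size ()];
}

/* Mirror of the selection and half-length cutoff used above.  */
static const char *
closest_match (const std::string &typo, const std::vector<std::string> &cands)
{
  const std::string *best = nullptr;
  size_t best_distance = SIZE_MAX;
  for (const std::string &c : cands)
    {
      size_t d = edit_distance (typo, c);
      if (d < best_distance)
        {
          best_distance = d;
          best = &c;
        }
    }
  if (!best)
    return nullptr;
  /* More than half the letters misspelled: a suggestion would be noise.  */
  size_t cutoff = std::max (typo.size (), best->size ()) / 2;
  return best_distance > cutoff ? nullptr : best->c_str ();
}

int
main ()
{
  std::vector<std::string> cands = { "wavelength", "number", "flag" };
  const char *hit = closest_match ("wavelenght", cands);
  printf ("%s\n", hit ? hit : "(no suggestion)");
}

Running it suggests "wavelength" for the misspelling "wavelenght", while a name with nothing close enough falls past the half-length cutoff and yields no suggestion.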
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index c5e00888bbe..2606323d42a 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -5262,7 +5262,7 @@ resolve_omp_atomic (gfc_code *code)
}
-struct fortran_omp_context
+static struct fortran_omp_context
{
gfc_code *code;
hash_set<gfc_symbol *> *sharing_clauses;
@@ -5345,6 +5345,8 @@ gfc_resolve_omp_parallel_blocks (gfc_code *code, gfc_namespace *ns)
case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO:
case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD:
case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD:
+ case EXEC_OMP_TASKLOOP:
+ case EXEC_OMP_TASKLOOP_SIMD:
case EXEC_OMP_TEAMS_DISTRIBUTE:
case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO:
case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD:
@@ -5390,8 +5392,11 @@ gfc_omp_restore_state (struct gfc_omp_saved_state *state)
construct, where they are predetermined private. */
void
-gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym)
+gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym, bool add_clause)
{
+ if (omp_current_ctx == NULL)
+ return;
+
int i = omp_current_do_collapse;
gfc_code *c = omp_current_do_code;
@@ -5410,9 +5415,6 @@ gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym)
c = c->block->next;
}
- if (omp_current_ctx == NULL)
- return;
-
/* An openacc context may represent a data clause. Abort if so. */
if (!omp_current_ctx->is_openmp && !oacc_is_loop (omp_current_ctx->code))
return;
@@ -5421,7 +5423,7 @@ gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym)
&& omp_current_ctx->sharing_clauses->contains (sym))
return;
- if (! omp_current_ctx->private_iterators->add (sym))
+ if (! omp_current_ctx->private_iterators->add (sym) && add_clause)
{
gfc_omp_clauses *omp_clauses = omp_current_ctx->code->ext.omp_clauses;
gfc_omp_namelist *p;
@@ -5433,6 +5435,22 @@ gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym)
}
}
+static void
+handle_local_var (gfc_symbol *sym)
+{
+ if (sym->attr.flavor != FL_VARIABLE
+ || sym->as != NULL
+ || (sym->ts.type != BT_INTEGER && sym->ts.type != BT_REAL))
+ return;
+ gfc_resolve_do_iterator (sym->ns->code, sym, false);
+}
+
+void
+gfc_resolve_omp_local_vars (gfc_namespace *ns)
+{
+ if (omp_current_ctx)
+ gfc_traverse_ns (ns, handle_local_var);
+}
static void
resolve_omp_do (gfc_code *code)
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index eb0f92e734b..e4deff9c79e 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -2737,6 +2737,9 @@ unexpected_eof (void)
gfc_done_2 ();
longjmp (eof_buf, 1);
+
+ /* Avoids build error on systems where longjmp is not declared noreturn. */
+ gcc_unreachable ();
}
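
The gcc_unreachable call added above only placates compilers that do not know longjmp never returns; without it, a noreturn function that ends in longjmp can draw a "'noreturn' function does return" warning that -Werror turns into a build error. A minimal illustration of the pattern, using __builtin_unreachable (or abort as a portable fallback) in place of gcc_unreachable; bail_out is an invented name:

#include <csetjmp>
#include <cstdlib>

static jmp_buf recovery_point;

/* Hypothetical bail-out helper.  It is declared noreturn, but the compiler
   only believes that if control provably never falls off the end; where
   <setjmp.h> does not mark longjmp as noreturn, the trailing unreachable
   marker does the job of gcc_unreachable.  */
[[noreturn]] static void
bail_out (int code)
{
  longjmp (recovery_point, code);
#if defined (__GNUC__)
  __builtin_unreachable ();
#else
  abort ();
#endif
}

int
main ()
{
  volatile int reached_call = 0;
  if (setjmp (recovery_point) != 0)
    return reached_call ? 0 : 1;    /* Resumed here after bail_out.  */
  reached_call = 1;
  bail_out (42);
}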
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index bd316344813..40c1cd3c96f 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -2694,6 +2694,8 @@ generic:
if (!gfc_convert_to_structure_constructor (expr, intr->sym, NULL,
NULL, false))
return false;
+ if (!gfc_use_derived (expr->ts.u.derived))
+ return false;
return resolve_structure_cons (expr, 0);
}
@@ -2801,6 +2803,43 @@ resolve_specific_f (gfc_expr *expr)
return true;
}
+/* Recursively append candidate SYM to CANDIDATES. Store the number of
+ candidates in CANDIDATES_LEN. */
+
+static void
+lookup_function_fuzzy_find_candidates (gfc_symtree *sym,
+ char **&candidates,
+ size_t &candidates_len)
+{
+ gfc_symtree *p;
+
+ if (sym == NULL)
+ return;
+ if ((sym->n.sym->ts.type != BT_UNKNOWN || sym->n.sym->attr.external)
+ && sym->n.sym->attr.flavor == FL_PROCEDURE)
+ vec_push (candidates, candidates_len, sym->name);
+
+ p = sym->left;
+ if (p)
+ lookup_function_fuzzy_find_candidates (p, candidates, candidates_len);
+
+ p = sym->right;
+ if (p)
+ lookup_function_fuzzy_find_candidates (p, candidates, candidates_len);
+}
+
+
+/* Lookup function FN fuzzily, taking names in SYMROOT into account. */
+
+const char*
+gfc_lookup_function_fuzzy (const char *fn, gfc_symtree *symroot)
+{
+ char **candidates = NULL;
+ size_t candidates_len = 0;
+ lookup_function_fuzzy_find_candidates (symroot, candidates, candidates_len);
+ return gfc_closest_fuzzy_match (fn, candidates);
+}
+
/* Resolve a procedure call not known to be generic nor specific. */
@@ -2851,8 +2890,15 @@ set_type:
if (ts->type == BT_UNKNOWN)
{
- gfc_error ("Function %qs at %L has no IMPLICIT type",
- sym->name, &expr->where);
+ const char *guessed
+ = gfc_lookup_function_fuzzy (sym->name, sym->ns->sym_root);
+ if (guessed)
+ gfc_error ("Function %qs at %L has no IMPLICIT type"
+ "; did you mean %qs?",
+ sym->name, &expr->where, guessed);
+ else
+ gfc_error ("Function %qs at %L has no IMPLICIT type",
+ sym->name, &expr->where);
return false;
}
else
@@ -3713,6 +3759,46 @@ logical_to_bitwise (gfc_expr *e)
return e;
}
+/* Recursively append candidate UOP to CANDIDATES. Store the number of
+ candidates in CANDIDATES_LEN. */
+static void
+lookup_uop_fuzzy_find_candidates (gfc_symtree *uop,
+ char **&candidates,
+ size_t &candidates_len)
+{
+ gfc_symtree *p;
+
+ if (uop == NULL)
+ return;
+
+ /* It is not clear how best to filter here, so use all candidates for now.
+ n.uop->op is NULL for empty interface operators (it is unclear whether
+ that is legal); disregard these, as they are unlikely to be useful. */
+
+ if (uop->n.uop->op != NULL)
+ vec_push (candidates, candidates_len, uop->name);
+
+ p = uop->left;
+ if (p)
+ lookup_uop_fuzzy_find_candidates (p, candidates, candidates_len);
+
+ p = uop->right;
+ if (p)
+ lookup_uop_fuzzy_find_candidates (p, candidates, candidates_len);
+}
+
+/* Lookup user-operator OP fuzzily, taking names in UOP into account. */
+
+static const char*
+lookup_uop_fuzzy (const char *op, gfc_symtree *uop)
+{
+ char **candidates = NULL;
+ size_t candidates_len = 0;
+ lookup_uop_fuzzy_find_candidates (uop, candidates, candidates_len);
+ return gfc_closest_fuzzy_match (op, candidates);
+}
+
+
/* Resolve an operator expression node. This can involve replacing the
operation with a user defined function call. */
@@ -3935,8 +4021,16 @@ resolve_operator (gfc_expr *e)
case INTRINSIC_USER:
if (e->value.op.uop->op == NULL)
- sprintf (msg, _("Unknown operator %%<%s%%> at %%L"),
- e->value.op.uop->name);
+ {
+ const char *name = e->value.op.uop->name;
+ const char *guessed;
+ guessed = lookup_uop_fuzzy (name, e->value.op.uop->ns->uop_root);
+ if (guessed)
+ sprintf (msg, _("Unknown operator %%<%s%%> at %%L; did you mean '%s'?"),
+ name, guessed);
+ else
+ sprintf (msg, _("Unknown operator %%<%s%%> at %%L"), name);
+ }
else if (op2 == NULL)
sprintf (msg, _("Operand of user operator %%<%s%%> at %%L is %s"),
e->value.op.uop->name, gfc_typename (&op1->ts));
@@ -9087,7 +9181,7 @@ resolve_transfer (gfc_code *code)
if (dt && dt->dt_io_kind->value.iokind != M_INQUIRE
&& (ts->type == BT_DERIVED || ts->type == BT_CLASS))
{
- if (ts->type == BT_DERIVED)
+ if (ts->type == BT_DERIVED || ts->type == BT_CLASS)
derived = ts->u.derived;
else
derived = ts->u.derived->components->ts.u.derived;
@@ -10916,6 +11010,8 @@ gfc_resolve_code (gfc_code *code, gfc_namespace *ns)
case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD:
case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD:
case EXEC_OMP_TASK:
+ case EXEC_OMP_TASKLOOP:
+ case EXEC_OMP_TASKLOOP_SIMD:
case EXEC_OMP_TEAMS:
case EXEC_OMP_TEAMS_DISTRIBUTE:
case EXEC_OMP_TEAMS_DISTRIBUTE_PARALLEL_DO:
@@ -10931,8 +11027,6 @@ gfc_resolve_code (gfc_code *code, gfc_namespace *ns)
case EXEC_OMP_DO_SIMD:
case EXEC_OMP_SIMD:
case EXEC_OMP_TARGET_SIMD:
- case EXEC_OMP_TASKLOOP:
- case EXEC_OMP_TASKLOOP_SIMD:
gfc_resolve_omp_do_blocks (code, ns);
break;
case EXEC_SELECT_TYPE:
@@ -11193,7 +11287,8 @@ start:
{
gfc_iterator *iter = code->ext.iterator;
if (gfc_resolve_iterator (iter, true, false))
- gfc_resolve_do_iterator (code, iter->var->symtree->n.sym);
+ gfc_resolve_do_iterator (code, iter->var->symtree->n.sym,
+ true);
}
break;
@@ -13844,6 +13939,7 @@ resolve_fl_derived0 (gfc_symbol *sym)
{
gfc_symbol* super_type;
gfc_component *c;
+ gfc_formal_arglist *f;
bool success;
if (sym->attr.unlimited_polymorphic)
@@ -13896,6 +13992,22 @@ resolve_fl_derived0 (gfc_symbol *sym)
&& !ensure_not_abstract (sym, super_type))
return false;
+ /* Check that there is a component for every PDT parameter. */
+ if (sym->attr.pdt_template)
+ {
+ for (f = sym->formal; f; f = f->next)
+ {
+ c = gfc_find_component (sym, f->sym->name, true, true, NULL);
+ if (c == NULL)
+ {
+ gfc_error ("Parameterized type %qs does not have a component "
+ "corresponding to parameter %qs at %L", sym->name,
+ f->sym->name, &sym->declared_at);
+ break;
+ }
+ }
+ }
+
/* Add derived type to the derived type list. */
add_dt_to_dt_list (sym);
@@ -14403,7 +14515,23 @@ resolve_symbol (gfc_symbol *sym)
if (as)
{
- gcc_assert (as->type != AS_IMPLIED_SHAPE);
+ /* If AS_IMPLIED_SHAPE makes it to here, it must be a bad
+ specification expression. */
+ if (as->type == AS_IMPLIED_SHAPE)
+ {
+ int i;
+ for (i=0; i<as->rank; i++)
+ {
+ if (as->lower[i] != NULL && as->upper[i] == NULL)
+ {
+ gfc_error ("Bad specification for assumed size array at %L",
+ &as->lower[i]->where);
+ return;
+ }
+ }
+ gcc_unreachable();
+ }
+
if (((as->type == AS_ASSUMED_SIZE && !as->cp_was_assumed)
|| as->type == AS_ASSUMED_SHAPE)
&& !sym->attr.dummy && !sym->attr.select_type_temporary)
@@ -14967,7 +15095,12 @@ resolve_symbol (gfc_symbol *sym)
if ((!a->save && !a->dummy && !a->pointer
&& !a->in_common && !a->use_assoc
- && !a->result && !a->function)
+ && a->referenced
+ && !((a->function || a->result)
+ && (!a->dimension
+ || sym->ts.u.derived->attr.alloc_comp
+ || sym->ts.u.derived->attr.pointer_comp))
+ && !(a->function && sym != sym->result))
|| (a->dummy && a->intent == INTENT_OUT && !a->pointer))
apply_default_init (sym);
else if (a->function && sym->result && a->access != ACCESS_PRIVATE
@@ -15803,9 +15936,22 @@ resolve_equivalence (gfc_equiv *eq)
&& sym->ns->proc_name->attr.pure
&& sym->attr.in_common)
{
- gfc_error ("Common block member %qs at %L cannot be an EQUIVALENCE "
- "object in the pure procedure %qs",
- sym->name, &e->where, sym->ns->proc_name->name);
+ /* Need to check for symbols that may have entered the pure
+ procedure via a USE statement. */
+ bool saw_sym = false;
+ if (sym->ns->use_stmts)
+ {
+ gfc_use_rename *r;
+ for (r = sym->ns->use_stmts->rename; r; r = r->next)
+ if (strcmp(r->use_name, sym->name) == 0) saw_sym = true;
+ }
+ else
+ saw_sym = true;
+
+ if (saw_sym)
+ gfc_error ("COMMON block member %qs at %L cannot be an "
+ "EQUIVALENCE object in the pure procedure %qs",
+ sym->name, &e->where, sym->ns->proc_name->name);
break;
}
@@ -16239,6 +16385,7 @@ resolve_codes (gfc_namespace *ns)
bitmap_obstack_initialize (&labels_obstack);
gfc_resolve_oacc_declare (ns);
+ gfc_resolve_omp_local_vars (ns);
gfc_resolve_code (ns->code, ns);
bitmap_obstack_release (&labels_obstack);
diff --git a/gcc/fortran/scanner.c b/gcc/fortran/scanner.c
index 82f431da527..49decfac52a 100644
--- a/gcc/fortran/scanner.c
+++ b/gcc/fortran/scanner.c
@@ -80,6 +80,7 @@ static struct gfc_file_change
size_t file_changes_cur, file_changes_count;
size_t file_changes_allocated;
+static gfc_char_t *last_error_char;
/* Functions dealing with our wide characters (gfc_char_t) and
sequences of such characters. */
@@ -269,6 +270,7 @@ gfc_scanner_init_1 (void)
continue_line = 0;
end_flag = 0;
+ last_error_char = NULL;
}
@@ -1700,6 +1702,14 @@ gfc_gobble_whitespace (void)
}
while (gfc_is_whitespace (c));
+ if (!ISPRINT(c) && c != '\n' && last_error_char != gfc_current_locus.nextc)
+ {
+ char buf[20];
+ last_error_char = gfc_current_locus.nextc;
+ snprintf (buf, 20, "%2.2X", c);
+ gfc_error_now ("Invalid character 0x%s at %C", buf);
+ }
+
gfc_current_locus = old_loc;
}
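
last_error_char above ensures the scanner reports a given offending position only once, so repeated whitespace gobbling over the same spot does not produce a cascade of identical diagnostics. A standalone sketch of that remember-the-last-reported-position pattern (diagnose and last_reported are illustrative names, not the gfc_* API):

#include <cctype>
#include <cstdio>

/* Last position already diagnosed.  */
static const char *last_reported;

/* Diagnose a non-printable, non-newline character, but only once per
   position, mirroring the last_error_char guard in gfc_gobble_whitespace.  */
static void
diagnose (const char *pos)
{
  unsigned char c = static_cast<unsigned char> (*pos);
  if (!isprint (c) && c != '\n' && pos != last_reported)
    {
      last_reported = pos;
      fprintf (stderr, "Invalid character 0x%2.2X\n", (unsigned) c);
    }
}

int
main ()
{
  const char line[] = "print *\x01, x";
  diagnose (line + 7);  /* First visit of the stray 0x01: reported.  */
  diagnose (line + 7);  /* Same position scanned again: silent.  */
}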
diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index 169aef1d892..ba010a0aebf 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -227,7 +227,8 @@ convert_boz (gfc_expr *x, int kind)
}
-/* Test that the expression is an constant array. */
+/* Test that the expression is a constant array, simplifying if
+ we are dealing with a parameter array. */
static bool
is_constant_array_expr (gfc_expr *e)
@@ -237,6 +238,10 @@ is_constant_array_expr (gfc_expr *e)
if (e == NULL)
return true;
+ if (e->expr_type == EXPR_VARIABLE && e->rank > 0
+ && e->symtree->n.sym->attr.flavor == FL_PARAMETER)
+ gfc_simplify_expr (e, 1);
+
if (e->expr_type != EXPR_ARRAY || !gfc_is_constant_expr (e))
return false;
diff --git a/gcc/fortran/symbol.c b/gcc/fortran/symbol.c
index 4c109fdfbad..11b6f600103 100644
--- a/gcc/fortran/symbol.c
+++ b/gcc/fortran/symbol.c
@@ -245,6 +245,44 @@ gfc_get_default_type (const char *name, gfc_namespace *ns)
}
+/* Recursively append candidate SYM to CANDIDATES. Store the number of
+ candidates in CANDIDATES_LEN. */
+
+static void
+lookup_symbol_fuzzy_find_candidates (gfc_symtree *sym,
+ char **&candidates,
+ size_t &candidates_len)
+{
+ gfc_symtree *p;
+
+ if (sym == NULL)
+ return;
+
+ if (sym->n.sym->ts.type != BT_UNKNOWN && sym->n.sym->ts.type != BT_PROCEDURE)
+ vec_push (candidates, candidates_len, sym->name);
+ p = sym->left;
+ if (p)
+ lookup_symbol_fuzzy_find_candidates (p, candidates, candidates_len);
+
+ p = sym->right;
+ if (p)
+ lookup_symbol_fuzzy_find_candidates (p, candidates, candidates_len);
+}
+
+
+/* Lookup symbol SYM_NAME fuzzily, taking names in SYMBOL into account. */
+
+static const char*
+lookup_symbol_fuzzy (const char *sym_name, gfc_symbol *symbol)
+{
+ char **candidates = NULL;
+ size_t candidates_len = 0;
+ lookup_symbol_fuzzy_find_candidates (symbol->ns->sym_root, candidates,
+ candidates_len);
+ return gfc_closest_fuzzy_match (sym_name, candidates);
+}
+
+
/* Given a pointer to a symbol, set its type according to the first
letter of its name. Fails if the letter in question has no default
type. */
@@ -263,8 +301,14 @@ gfc_set_default_type (gfc_symbol *sym, int error_flag, gfc_namespace *ns)
{
if (error_flag && !sym->attr.untyped)
{
- gfc_error ("Symbol %qs at %L has no IMPLICIT type",
- sym->name, &sym->declared_at);
+ const char *guessed = lookup_symbol_fuzzy (sym->name, sym);
+ if (guessed)
+ gfc_error ("Symbol %qs at %L has no IMPLICIT type"
+ "; did you mean %qs?",
+ sym->name, &sym->declared_at, guessed);
+ else
+ gfc_error ("Symbol %qs at %L has no IMPLICIT type",
+ sym->name, &sym->declared_at);
sym->attr.untyped = 1; /* Ensure we only give an error once. */
}
@@ -382,7 +426,8 @@ check_conflict (symbol_attribute *attr, const char *name, locus *where)
*is_bind_c = "BIND(C)", *procedure = "PROCEDURE",
*proc_pointer = "PROCEDURE POINTER", *abstract = "ABSTRACT",
*asynchronous = "ASYNCHRONOUS", *codimension = "CODIMENSION",
- *contiguous = "CONTIGUOUS", *generic = "GENERIC", *automatic = "AUTOMATIC";
+ *contiguous = "CONTIGUOUS", *generic = "GENERIC", *automatic = "AUTOMATIC",
+ *pdt_len = "LEN", *pdt_kind = "KIND";
static const char *threadprivate = "THREADPRIVATE";
static const char *omp_declare_target = "OMP DECLARE TARGET";
static const char *omp_declare_target_link = "OMP DECLARE TARGET LINK";
@@ -663,6 +708,23 @@ check_conflict (symbol_attribute *attr, const char *name, locus *where)
conf (entry, oacc_declare_deviceptr)
conf (entry, oacc_declare_device_resident)
+ conf (pdt_kind, allocatable)
+ conf (pdt_kind, pointer)
+ conf (pdt_kind, dimension)
+ conf (pdt_kind, codimension)
+
+ conf (pdt_len, allocatable)
+ conf (pdt_len, pointer)
+ conf (pdt_len, dimension)
+ conf (pdt_len, codimension)
+
+ if (attr->access == ACCESS_PRIVATE)
+ {
+ a1 = privat;
+ conf2 (pdt_kind);
+ conf2 (pdt_len);
+ }
+
a1 = gfc_code2string (flavors, attr->flavor);
if (attr->in_namelist
@@ -2336,6 +2398,32 @@ find_union_component (gfc_symbol *un, const char *name,
}
+/* Append candidate COMPONENT names to CANDIDATES. Store
+ the number of total candidates in CANDIDATES_LEN. */
+
+static void
+lookup_component_fuzzy_find_candidates (gfc_component *component,
+ char **&candidates,
+ size_t &candidates_len)
+{
+ for (gfc_component *p = component; p; p = p->next)
+ vec_push (candidates, candidates_len, p->name);
+}
+
+
+/* Lookup component MEMBER fuzzily, taking names in COMPONENT into account. */
+
+static const char*
+lookup_component_fuzzy (const char *member, gfc_component *component)
+{
+ char **candidates = NULL;
+ size_t candidates_len = 0;
+ lookup_component_fuzzy_find_candidates (component, candidates,
+ candidates_len);
+ return gfc_closest_fuzzy_match (member, candidates);
+}
+
+
/* Given a derived type node and a component name, try to locate the
component structure. Returns the NULL pointer if the component is
not found or the components are private. If noaccess is set, no access
@@ -2433,8 +2521,16 @@ gfc_find_component (gfc_symbol *sym, const char *name,
}
if (p == NULL && !silent)
- gfc_error ("%qs at %C is not a member of the %qs structure",
- name, sym->name);
+ {
+ const char *guessed = lookup_component_fuzzy (name, sym->components);
+ if (guessed)
+ gfc_error ("%qs at %C is not a member of the %qs structure"
+ "; did you mean %qs?",
+ name, sym->name, guessed);
+ else
+ gfc_error ("%qs at %C is not a member of the %qs structure",
+ name, sym->name);
+ }
/* Component was found; build the ultimate component reference. */
if (p != NULL && ref)
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 019b8035b6f..45d5119236a 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -1670,7 +1670,9 @@ gfc_get_symbol_decl (gfc_symbol * sym)
{
/* Catch functions. Only used for actual parameters,
procedure pointers and procptr initialization targets. */
- if (sym->attr.use_assoc || sym->attr.intrinsic
+ if (sym->attr.use_assoc
+ || sym->attr.used_in_submodule
+ || sym->attr.intrinsic
|| sym->attr.if_source != IFSRC_DECL)
{
decl = gfc_get_extern_function_decl (sym);
@@ -4582,7 +4584,10 @@ gfc_trans_deferred_vars (gfc_symbol * proc_sym, gfc_wrapped_block * block)
&& sym->ts.u.cl->passed_length)
tmp = gfc_null_and_pass_deferred_len (sym, &init, &loc);
else
- gfc_restore_backend_locus (&loc);
+ {
+ gfc_restore_backend_locus (&loc);
+ tmp = NULL_TREE;
+ }
/* Deallocate when leaving the scope. Nullifying is not
needed. */
@@ -4634,10 +4639,6 @@ gfc_trans_deferred_vars (gfc_symbol * proc_sym, gfc_wrapped_block * block)
}
gfc_add_init_cleanup (block, gfc_finish_block (&init), tmp);
- /* TODO find out why this is necessary to stop double calls to
- free. Somebody is reusing the expression in 'tmp' because
- it is being used unititialized. */
- tmp = NULL_TREE;
}
}
else if (sym->ts.type == BT_CHARACTER && sym->ts.deferred)
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 4e8bfc5d6f9..1a3e3d45e4c 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -5173,10 +5173,39 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
}
else
{
- gfc_add_modify (&parmse.pre, var,
- fold_build1_loc (input_location,
- VIEW_CONVERT_EXPR,
- type, parmse.expr));
+ /* Since the internal representation of unlimited
+ polymorphic expressions includes an extra field
+ that other class objects do not, a cast to the
+ formal type does not work. */
+ if (!UNLIMITED_POLY (e) && UNLIMITED_POLY (fsym))
+ {
+ tree efield;
+
+ /* Set the _data field. */
+ tmp = gfc_class_data_get (var);
+ efield = fold_convert (TREE_TYPE (tmp),
+ gfc_class_data_get (parmse.expr));
+ gfc_add_modify (&parmse.pre, tmp, efield);
+
+ /* Set the _vptr field. */
+ tmp = gfc_class_vptr_get (var);
+ efield = fold_convert (TREE_TYPE (tmp),
+ gfc_class_vptr_get (parmse.expr));
+ gfc_add_modify (&parmse.pre, tmp, efield);
+
+ /* Set the _len field. */
+ tmp = gfc_class_len_get (var);
+ gfc_add_modify (&parmse.pre, tmp,
+ build_int_cst (TREE_TYPE (tmp), 0));
+ }
+ else
+ {
+ tmp = fold_build1_loc (input_location,
+ VIEW_CONVERT_EXPR,
+ type, parmse.expr);
+ gfc_add_modify (&parmse.pre, var, tmp);
+ }
parmse.expr = gfc_build_addr_expr (NULL_TREE, var);
}
}
@@ -8053,7 +8082,7 @@ trans_class_vptr_len_assignment (stmtblock_t *block, gfc_expr * le,
{
/* Get the vptr from the rhs expression only, when it is variable.
Functions are expected to be assigned to a temporary beforehand. */
- vptr_expr = re->expr_type == EXPR_VARIABLE
+ vptr_expr = (re->expr_type == EXPR_VARIABLE && re->ts.type == BT_CLASS)
? gfc_find_and_cut_at_last_class_ref (re)
: NULL;
if (vptr_expr != NULL && vptr_expr->ts.type == BT_CLASS)
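
The first trans-expr.c hunk replaces a whole-object VIEW_CONVERT_EXPR with per-field assignments because an unlimited polymorphic descriptor carries one field more (_len) than an ordinary class descriptor, so reinterpreting the smaller object as the larger type would read past its end. A small standalone illustration of why field-wise copying is the safe route when the destination type has an extra member (the struct names are invented for the example):

#include <cstdio>

/* Invented stand-ins for the class descriptors: 'Limited' is an ordinary
   class container, 'Unlimited' additionally carries a _len field.  */
struct Limited   { void *data; const void *vptr; };
struct Unlimited { void *data; const void *vptr; long len; };

/* Copy the fields that exist in both and give the extra one a well-defined
   value -- the per-field assignments in the hunk above do the same.  */
static void
convert (Unlimited &dst, const Limited &src)
{
  dst.data = src.data;
  dst.vptr = src.vptr;
  dst.len = 0;          /* _len has no counterpart in the source.  */
}

int
main ()
{
  int payload = 7;
  Limited lim = { &payload, nullptr };
  Unlimited unlim;
  convert (unlim, lim);
  /* Reinterpreting 'lim' as an Unlimited (the old whole-object cast) would
     instead read sizeof (Unlimited) bytes from an object that only has
     sizeof (Limited) of them.  */
  printf ("%d %ld\n", *static_cast<int *> (unlim.data), unlim.len);
}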
diff --git a/gcc/fortran/trans-io.c b/gcc/fortran/trans-io.c
index 026f9a993d2..f3e1f3e4d09 100644
--- a/gcc/fortran/trans-io.c
+++ b/gcc/fortran/trans-io.c
@@ -2404,7 +2404,7 @@ transfer_expr (gfc_se * se, gfc_typespec * ts, tree addr_expr,
case BT_CLASS:
if (ts->u.derived->components == NULL)
return;
- if (ts->type == BT_DERIVED || ts->type == BT_CLASS)
+ if (gfc_bt_struct (ts->type) || ts->type == BT_CLASS)
{
gfc_symbol *derived;
gfc_symbol *dtio_sub = NULL;
@@ -2438,7 +2438,7 @@ transfer_expr (gfc_se * se, gfc_typespec * ts, tree addr_expr,
function = iocall[IOCALL_X_DERIVED];
break;
}
- else if (ts->type == BT_DERIVED)
+ else if (gfc_bt_struct (ts->type))
{
/* Recurse into the elements of the derived type. */
expr = gfc_evaluate_now (addr_expr, &se->pre);
diff --git a/gcc/function.c b/gcc/function.c
index 1c94329c063..3846555f963 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -415,7 +415,7 @@ assign_stack_local_1 (machine_mode mode, poly_int64 size,
requested size is 0 or the estimated stack
alignment >= mode alignment. */
gcc_assert ((kind & ASLK_REDUCE_ALIGN)
- || known_zero (size)
+ || must_eq (size, 0)
|| (crtl->stack_alignment_estimated
>= GET_MODE_ALIGNMENT (mode)));
alignment_in_bits = crtl->stack_alignment_estimated;
@@ -430,7 +430,7 @@ assign_stack_local_1 (machine_mode mode, poly_int64 size,
if (crtl->max_used_stack_slot_alignment < alignment_in_bits)
crtl->max_used_stack_slot_alignment = alignment_in_bits;
- if (mode != BLKmode || maybe_nonzero (size))
+ if (mode != BLKmode || may_ne (size, 0))
{
if (kind & ASLK_RECORD_PAD)
{
@@ -976,25 +976,26 @@ assign_temp (tree type_or_decl, int memory_required,
if (mode == BLKmode || memory_required)
{
- HOST_WIDE_INT size = int_size_in_bytes (type);
+ poly_int64 size;
rtx tmp;
- /* Zero sized arrays are GNU C extension. Set size to 1 to avoid
- problems with allocating the stack space. */
- if (size == 0)
- size = 1;
-
/* Unfortunately, we don't yet know how to allocate variable-sized
temporaries. However, sometimes we can find a fixed upper limit on
the size, so try that instead. */
- else if (size == -1)
+ if (!poly_int_tree_p (TYPE_SIZE_UNIT (type), &size))
size = max_int_size_in_bytes (type);
+ /* Zero sized arrays are GNU C extension. Set size to 1 to avoid
+ problems with allocating the stack space. */
+ if (must_eq (size, 0))
+ size = 1;
+
/* The size of the temporary may be too large to fit into an integer. */
/* ??? Not sure this should happen except for user silliness, so limit
this to things that aren't compiler-generated temporaries. The
rest of the time we'll die in assign_stack_temp_for_type. */
- if (decl && size == -1
+ if (decl
+ && !known_size_p (size)
&& TREE_CODE (TYPE_SIZE_UNIT (type)) == INTEGER_CST)
{
error ("size of variable %q+D is too large", decl);
@@ -1573,7 +1574,7 @@ instantiate_virtual_regs_in_insn (rtx_insn *insn)
move insn in the initial rtl stream. */
new_rtx = instantiate_new_reg (SET_SRC (set), &offset);
if (new_rtx
- && maybe_nonzero (offset)
+ && may_ne (offset, 0)
&& REG_P (SET_DEST (set))
&& REGNO (SET_DEST (set)) > LAST_VIRTUAL_REGISTER)
{
@@ -1610,7 +1611,7 @@ instantiate_virtual_regs_in_insn (rtx_insn *insn)
offset += delta;
/* If the sum is zero, then replace with a plain move. */
- if (known_zero (offset)
+ if (must_eq (offset, 0)
&& REG_P (SET_DEST (set))
&& REGNO (SET_DEST (set)) > LAST_VIRTUAL_REGISTER)
{
@@ -1688,7 +1689,7 @@ instantiate_virtual_regs_in_insn (rtx_insn *insn)
new_rtx = instantiate_new_reg (x, &offset);
if (new_rtx == NULL)
continue;
- if (known_zero (offset))
+ if (must_eq (offset, 0))
x = new_rtx;
else
{
@@ -1713,7 +1714,7 @@ instantiate_virtual_regs_in_insn (rtx_insn *insn)
new_rtx = instantiate_new_reg (SUBREG_REG (x), &offset);
if (new_rtx == NULL)
continue;
- if (maybe_nonzero (offset))
+ if (may_ne (offset, 0))
{
start_sequence ();
new_rtx = expand_simple_binop
@@ -2707,7 +2708,7 @@ assign_parm_find_stack_rtl (tree parm, struct assign_parm_data_one *data)
{
poly_int64 offset = subreg_lowpart_offset (DECL_MODE (parm),
data->promoted_mode);
- if (maybe_nonzero (offset))
+ if (may_ne (offset, 0))
set_mem_offset (stack_parm, MEM_OFFSET (stack_parm) - offset);
}
}
@@ -3440,7 +3441,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
/* ??? This may need a big-endian conversion on sparc64. */
data->stack_parm
= adjust_address (data->stack_parm, data->nominal_mode, 0);
- if (maybe_nonzero (offset) && MEM_OFFSET_KNOWN_P (data->stack_parm))
+ if (may_ne (offset, 0) && MEM_OFFSET_KNOWN_P (data->stack_parm))
set_mem_offset (data->stack_parm,
MEM_OFFSET (data->stack_parm) + offset);
}
@@ -4061,10 +4062,9 @@ gimplify_parameters (void)
DECL_IGNORED_P (addr) = 0;
local = build_fold_indirect_ref (addr);
- t = builtin_decl_explicit (BUILT_IN_ALLOCA_WITH_ALIGN);
- t = build_call_expr (t, 2, DECL_SIZE_UNIT (parm),
- size_int (DECL_ALIGN (parm)));
-
+ t = build_alloca_call_expr (DECL_SIZE_UNIT (parm),
+ DECL_ALIGN (parm),
+ max_int_size_in_bytes (type));
/* The call has been built for a variable-sized object. */
CALL_ALLOCA_FOR_VAR_P (t) = 1;
t = fold_convert (ptr_type, t);
@@ -4731,11 +4731,11 @@ number_blocks (tree fn)
int n_blocks;
tree *block_vector;
- /* For SDB and XCOFF debugging output, we start numbering the blocks
+ /* For XCOFF debugging output, we start numbering the blocks
from 1 within each function, rather than keeping a running
count. */
-#if SDB_DEBUGGING_INFO || defined (XCOFF_DEBUGGING_INFO)
- if (write_symbols == SDB_DEBUG || write_symbols == XCOFF_DEBUG)
+#if defined (XCOFF_DEBUGGING_INFO)
+ if (write_symbols == XCOFF_DEBUG)
next_block_index = 1;
#endif
@@ -5270,7 +5270,7 @@ expand_function_start (tree subr)
}
/* The following was moved from init_function_start.
- The move is supposed to make sdb output more accurate. */
+ The move was supposed to make sdb output more accurate. */
/* Indicate the beginning of the function body,
as opposed to parm setup. */
emit_note (NOTE_INSN_FUNCTION_BEG);
@@ -5461,7 +5461,7 @@ expand_function_end (void)
do_pending_stack_adjust ();
/* Output a linenumber for the end of the function.
- SDB depends on this. */
+ SDB depended on this. */
set_curr_insn_location (input_location);
/* Before the return label (if any), clobber the return
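
The function.c changes above move from known_zero/maybe_nonzero to must_eq/may_ne on poly_int64 sizes: a poly_int models a runtime-scaled quantity such as A + B*N (N being, for instance, the number of extra SVE vector chunks), so comparisons split into "true for every N" and "true for some N". A toy degree-1 model of those two predicates, independent of GCC's poly-int.h and only meant to show the distinction:

#include <cstdint>
#include <cstdio>

/* Toy model of a degree-1 poly_int: value = coeff0 + coeff1 * N, where N is
   an unknown non-negative runtime factor.  */
struct poly_val
{
  int64_t coeff0;
  int64_t coeff1;
};

/* Equal to B for every possible N: both coefficients must pin it down.  */
static bool
must_eq (poly_val a, int64_t b)
{
  return a.coeff1 == 0 && a.coeff0 == b;
}

/* Differs from B for at least one N: anything not must-equal may differ.  */
static bool
may_ne (poly_val a, int64_t b)
{
  return !must_eq (a, b);
}

int
main ()
{
  poly_val fixed8   = { 8, 0 };  /* compile-time constant 8 */
  poly_val scalable = { 0, 16 }; /* 16 * N bytes, N unknown */
  printf ("%d %d %d\n",
          must_eq (fixed8, 8),    /* 1: always 8 */
          may_ne (scalable, 0),   /* 1: nonzero whenever N > 0 */
          must_eq (scalable, 0)); /* 0: zero only when N == 0 */
}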
diff --git a/gcc/gcc-ar.c b/gcc/gcc-ar.c
index 78d2fc1ad30..d5d80e042e5 100644
--- a/gcc/gcc-ar.c
+++ b/gcc/gcc-ar.c
@@ -194,14 +194,6 @@ main (int ac, char **av)
#ifdef CROSS_DIRECTORY_STRUCTURE
real_exe_name = concat (target_machine, "-", PERSONALITY, NULL);
#endif
- /* Do not search original location in the same folder. */
- char *exe_folder = lrealpath (av[0]);
- exe_folder[strlen (exe_folder) - strlen (lbasename (exe_folder))] = '\0';
- char *location = concat (exe_folder, PERSONALITY, NULL);
-
- if (access (location, X_OK) == 0)
- remove_prefix (exe_folder, &path);
-
exe_name = find_a_file (&path, real_exe_name, X_OK);
if (!exe_name)
{
diff --git a/gcc/gcc.c b/gcc/gcc.c
index cec3ed5be5f..43e6d590c25 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -170,9 +170,10 @@ env_manager::restore ()
/* By default there is no special suffix for target executables. */
-/* FIXME: when autoconf is fixed, remove the host check - dj */
-#if defined(TARGET_EXECUTABLE_SUFFIX) && defined(HOST_EXECUTABLE_SUFFIX)
+#ifdef TARGET_EXECUTABLE_SUFFIX
#define HAVE_TARGET_EXECUTABLE_SUFFIX
+#else
+#define TARGET_EXECUTABLE_SUFFIX ""
#endif
/* By default there is no special suffix for host executables. */
@@ -1117,7 +1118,7 @@ static const char *cpp_unique_options =
%{MMD:-MMD %{!o:%b.d}%{o*:%.d%*}}\
%{M} %{MM} %{MF*} %{MG} %{MP} %{MQ*} %{MT*}\
%{!E:%{!M:%{!MM:%{!MT:%{!MQ:%{MD|MMD:%{o*:-MQ %*}}}}}}}\
- %{remap} %{g3|ggdb3|gstabs3|gcoff3|gxcoff3|gvms3:-dD}\
+ %{remap} %{g3|ggdb3|gstabs3|gxcoff3|gvms3:-dD}\
%{!iplugindir*:%{fplugin*:%:find-plugindir()}}\
%{H} %C %{D*&U*&A*} %{i*} %Z %i\
%{E|M|MM:%W{o*}}";
@@ -5314,7 +5315,7 @@ do_spec_1 (const char *spec, int inswitch, const char *soft_matched_part)
buf = (char *) alloca (p - q + 1);
strncpy (buf, q, p - q);
buf[p - q] = 0;
- inform (0, "%s", _(buf));
+ inform (UNKNOWN_LOCATION, "%s", _(buf));
if (*p)
p++;
}
@@ -8191,7 +8192,8 @@ driver::do_spec_on_infiles () const
else if (compare_debug && debug_check_temp_file[0])
{
if (verbose_flag)
- inform (0, "recompiling with -fcompare-debug");
+ inform (UNKNOWN_LOCATION,
+ "recompiling with -fcompare-debug");
compare_debug = -compare_debug;
n_switches = n_switches_debug_check[1];
@@ -8216,7 +8218,7 @@ driver::do_spec_on_infiles () const
debug_check_temp_file[1]));
if (verbose_flag)
- inform (0, "comparing final insns dumps");
+ inform (UNKNOWN_LOCATION, "comparing final insns dumps");
if (compare_files (debug_check_temp_file))
this_file_error = 1;
diff --git a/gcc/gcov.c b/gcc/gcov.c
index c56bac20278..48bcdc0d4c3 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -33,6 +33,7 @@ along with Gcov; see the file COPYING3. If not see
#include "config.h"
#define INCLUDE_ALGORITHM
#define INCLUDE_VECTOR
+#define INCLUDE_STRING
#include "system.h"
#include "coretypes.h"
#include "tm.h"
@@ -40,6 +41,7 @@ along with Gcov; see the file COPYING3. If not see
#include "diagnostic.h"
#include "version.h"
#include "demangle.h"
+#include "color-macros.h"
#include <getopt.h>
@@ -106,9 +108,6 @@ typedef struct arc_info
/* Loop making arc. */
unsigned int cycle : 1;
- /* Next branch on line. */
- struct arc_info *line_next;
-
/* Links to next arc on src and dst lists. */
struct arc_info *succ_next;
struct arc_info *pred_next;
@@ -243,67 +242,109 @@ typedef struct coverage_info
/* Describes a single line of source. Contains a chain of basic blocks
with code on it. */
-typedef struct line_info
+struct line_info
{
+ /* Default constructor. */
+ line_info ();
+
/* Return true when NEEDLE is one of basic blocks the line belongs to. */
bool has_block (block_t *needle);
- gcov_type count; /* execution count */
- arc_t *branches; /* branches from blocks that end on this line. */
- block_t *blocks; /* blocks which start on this line.
- Used in all-blocks mode. */
+ /* Execution count. */
+ gcov_type count;
+
+ /* Branches from blocks that end on this line. */
+ vector<arc_t *> branches;
+
+ /* blocks which start on this line. Used in all-blocks mode. */
+ vector<block_t *> blocks;
+
unsigned exists : 1;
unsigned unexceptional : 1;
-} line_t;
+ unsigned has_unexecuted_block : 1;
+};
-bool
-line_t::has_block (block_t *needle)
+line_info::line_info (): count (0), branches (), blocks (), exists (false),
+ unexceptional (0), has_unexecuted_block (0)
{
- for (block_t *n = blocks; n; n = n->chain)
- if (n == needle)
- return true;
+}
- return false;
+bool
+line_info::has_block (block_t *needle)
+{
+ return std::find (blocks.begin (), blocks.end (), needle) != blocks.end ();
}
/* Describes a file mentioned in the block graph. Contains an array
of line info. */
-typedef struct source_info
+struct source_info
{
+ /* Default constructor. */
+ source_info ();
+
/* Canonical name of source file. */
char *name;
time_t file_time;
- /* Array of line information. */
- line_t *lines;
- unsigned num_lines;
+ /* Vector of line information. */
+ vector<line_info> lines;
coverage_t coverage;
/* Functions in this source file. These are in ascending line
number order. */
function_t *functions;
-} source_t;
+};
+
+source_info::source_info (): name (NULL), file_time (), lines (),
+ coverage (), functions (NULL)
+{
+}
-typedef struct name_map
+class name_map
{
- char *name; /* Source file name */
+public:
+ name_map ()
+ {
+ }
+
+ name_map (char *_name, unsigned _src): name (_name), src (_src)
+ {
+ }
+
+ bool operator== (const name_map &rhs) const
+ {
+#if HAVE_DOS_BASED_FILE_SYSTEM
+ return strcasecmp (this->name, rhs.name) == 0;
+#else
+ return strcmp (this->name, rhs.name) == 0;
+#endif
+ }
+
+ bool operator< (const name_map &rhs) const
+ {
+#if HAVE_DOS_BASED_FILE_SYSTEM
+ return strcasecmp (this->name, rhs.name) < 0;
+#else
+ return strcmp (this->name, rhs.name) < 0;
+#endif
+ }
+
+ const char *name; /* Source file name */
unsigned src; /* Source file */
-} name_map_t;
+};
/* Holds a list of function basic block graphs. */
static function_t *functions;
static function_t **fn_end = &functions;
-static source_t *sources; /* Array of source files */
-static unsigned n_sources; /* Number of sources */
-static unsigned a_sources; /* Allocated sources */
+/* Vector of source files. */
+static vector<source_info> sources;
-static name_map_t *names; /* Mapping of file names to sources */
-static unsigned n_names; /* Number of names */
-static unsigned a_names; /* Allocated names */
+/* Mapping of file names to sources */
+static vector<name_map> names;
/* This holds data summary information. */
@@ -381,11 +422,19 @@ static int flag_hash_filenames = 0;
static int flag_verbose = 0;
+/* Print colored output. */
+
+static int flag_use_colors = 0;
+
/* Output count information for every basic block, not merely those
that contain line number information. */
static int flag_all_blocks = 0;
+/* Output human readable numbers. */
+
+static int flag_human_readable_numbers = 0;
+
/* Output summary info for each function. */
static int flag_function_summary = 0;
@@ -424,8 +473,6 @@ static void print_version (void) ATTRIBUTE_NORETURN;
static void process_file (const char *);
static void generate_results (const char *);
static void create_file_names (const char *);
-static int name_search (const void *, const void *);
-static int name_sort (const void *, const void *);
static char *canonicalize_name (const char *);
static unsigned find_source (const char *);
static function_t *read_graph_file (void);
@@ -437,10 +484,10 @@ static void add_line_counts (coverage_t *, function_t *);
static void executed_summary (unsigned, unsigned);
static void function_summary (const coverage_t *, const char *);
static const char *format_gcov (gcov_type, gcov_type, int);
-static void accumulate_line_counts (source_t *);
-static void output_gcov_file (const char *, source_t *);
+static void accumulate_line_counts (source_info *);
+static void output_gcov_file (const char *, source_info *);
static int output_branch_count (FILE *, int, const arc_t *);
-static void output_lines (FILE *, const source_t *);
+static void output_lines (FILE *, const source_info *);
static char *make_gcov_file_name (const char *, const char *);
static char *mangle_name (const char *, char *);
static void release_structures (void);
@@ -556,7 +603,7 @@ unblock (const block_t *u, block_vector_t &blocked,
static loop_type
circuit (block_t *v, arc_vector_t &path, block_t *start,
block_vector_t &blocked, vector<block_vector_t> &block_lists,
- line_t &linfo, int64_t &count)
+ line_info &linfo, int64_t &count)
{
loop_type result = NO_LOOP;
@@ -605,7 +652,7 @@ circuit (block_t *v, arc_vector_t &path, block_t *start,
contains a negative loop, then perform the same function once again. */
static gcov_type
-get_cycles_count (line_t &linfo, bool handle_negative_cycles = true)
+get_cycles_count (line_info &linfo, bool handle_negative_cycles = true)
{
/* Note that this algorithm works even if blocks aren't in sorted order.
Each iteration of the circuit detection is completely independent
@@ -615,12 +662,13 @@ get_cycles_count (line_t &linfo, bool handle_negative_cycles = true)
loop_type result = NO_LOOP;
gcov_type count = 0;
- for (block_t *block = linfo.blocks; block; block = block->chain)
+ for (vector<block_t *>::iterator it = linfo.blocks.begin ();
+ it != linfo.blocks.end (); it++)
{
arc_vector_t path;
block_vector_t blocked;
vector<block_vector_t > block_lists;
- result |= circuit (block, path, block, blocked, block_lists, linfo,
+ result |= circuit (*it, path, *it, blocked, block_lists, linfo,
count);
}
@@ -655,11 +703,6 @@ main (int argc, char **argv)
/* Handle response files. */
expandargv (&argc, &argv);
- a_names = 10;
- names = XNEWVEC (name_map_t, a_names);
- a_sources = 10;
- sources = XNEWVEC (source_t, a_sources);
-
argno = process_args (argc, argv);
if (optind == argc)
print_usage (true);
@@ -703,6 +746,8 @@ print_usage (int error_p)
fnotice (file, " -f, --function-summaries Output summaries for each function\n");
fnotice (file, " -h, --help Print this help, then exit\n");
fnotice (file, " -i, --intermediate-format Output .gcov file in intermediate text format\n");
+ fnotice (file, " -j, --human-readable Output human readable numbers\n");
+ fnotice (file, " -k, --use-colors Emit colored output\n");
fnotice (file, " -l, --long-file-names Use long output file names for included\n\
source files\n");
fnotice (file, " -m, --demangled-names Output demangled function names\n");
@@ -744,6 +789,7 @@ static const struct option options[] =
{ "branch-probabilities", no_argument, NULL, 'b' },
{ "branch-counts", no_argument, NULL, 'c' },
{ "intermediate-format", no_argument, NULL, 'i' },
+ { "human-readable", no_argument, NULL, 'j' },
{ "no-output", no_argument, NULL, 'n' },
{ "long-file-names", no_argument, NULL, 'l' },
{ "function-summaries", no_argument, NULL, 'f' },
@@ -756,6 +802,7 @@ static const struct option options[] =
{ "unconditional-branches", no_argument, NULL, 'u' },
{ "display-progress", no_argument, NULL, 'd' },
{ "hash-filenames", no_argument, NULL, 'x' },
+ { "use-colors", no_argument, NULL, 'k' },
{ 0, 0, 0, 0 }
};
@@ -766,7 +813,7 @@ process_args (int argc, char **argv)
{
int opt;
- const char *opts = "abcdfhilmno:prs:uvwx";
+ const char *opts = "abcdfhijklmno:prs:uvwx";
while ((opt = getopt_long (argc, argv, opts, options, NULL)) != -1)
{
switch (opt)
@@ -789,6 +836,12 @@ process_args (int argc, char **argv)
case 'l':
flag_long_names = 1;
break;
+ case 'j':
+ flag_human_readable_numbers = 1;
+ break;
+ case 'k':
+ flag_use_colors = 1;
+ break;
case 'm':
flag_demangled_names = 1;
break;
@@ -839,28 +892,7 @@ process_args (int argc, char **argv)
/* Output the result in intermediate format used by 'lcov'.
The intermediate format contains a single file named 'foo.cc.gcov',
-with no source code included. A sample output is
-
-file:foo.cc
-function:5,1,_Z3foov
-function:13,1,main
-function:19,1,_GLOBAL__sub_I__Z3foov
-function:19,1,_Z41__static_initialization_and_destruction_0ii
-lcount:5,1
-lcount:7,9
-lcount:9,8
-lcount:11,1
-file:/.../iostream
-lcount:74,1
-file:/.../basic_ios.h
-file:/.../ostream
-file:/.../ios_base.h
-function:157,0,_ZStorSt12_Ios_IostateS_
-lcount:157,0
-file:/.../char_traits.h
-function:258,0,_ZNSt11char_traitsIcE6lengthEPKc
-lcount:258,0
-...
+with no source code included.
The default gcov outputs multiple files: 'foo.cc.gcov',
'iostream.gcov', 'ios_base.h.gcov', etc. with source code
@@ -868,10 +900,10 @@ included. Instead the intermediate format here outputs only a single
file 'foo.cc.gcov' similar to the above example. */
static void
-output_intermediate_file (FILE *gcov_file, source_t *src)
+output_intermediate_file (FILE *gcov_file, source_info *src)
{
unsigned line_num; /* current line number. */
- const line_t *line; /* current line info ptr. */
+ const line_info *line; /* current line info ptr. */
function_t *fn; /* current function info ptr. */
fprintf (gcov_file, "file:%s\n", src->name); /* source file name */
@@ -885,32 +917,32 @@ output_intermediate_file (FILE *gcov_file, source_t *src)
}
for (line_num = 1, line = &src->lines[line_num];
- line_num < src->num_lines;
+ line_num < src->lines.size ();
line_num++, line++)
{
- arc_t *arc;
if (line->exists)
- fprintf (gcov_file, "lcount:%u,%s\n", line_num,
- format_gcov (line->count, 0, -1));
+ fprintf (gcov_file, "lcount:%u,%s,%d\n", line_num,
+ format_gcov (line->count, 0, -1), line->has_unexecuted_block);
if (flag_branches)
- for (arc = line->branches; arc; arc = arc->line_next)
- {
- if (!arc->is_unconditional && !arc->is_call_non_return)
- {
- const char *branch_type;
- /* branch:<line_num>,<branch_coverage_type>
- branch_coverage_type
- : notexec (Branch not executed)
- : taken (Branch executed and taken)
- : nottaken (Branch executed, but not taken)
- */
- if (arc->src->count)
- branch_type = (arc->count > 0) ? "taken" : "nottaken";
- else
- branch_type = "notexec";
- fprintf (gcov_file, "branch:%d,%s\n", line_num, branch_type);
- }
- }
+ for (vector<arc_t *>::const_iterator it = line->branches.begin ();
+ it != line->branches.end (); it++)
+ {
+ if (!(*it)->is_unconditional && !(*it)->is_call_non_return)
+ {
+ const char *branch_type;
+ /* branch:<line_num>,<branch_coverage_type>
+ branch_coverage_type
+ : notexec (Branch not executed)
+ : taken (Branch executed and taken)
+ : nottaken (Branch executed, but not taken)
+ */
+ if ((*it)->src->count)
+ branch_type = ((*it)->count > 0) ? "taken" : "nottaken";
+ else
+ branch_type = "notexec";
+ fprintf (gcov_file, "branch:%d,%s\n", line_num, branch_type);
+ }
+ }
}
}
@@ -939,7 +971,7 @@ process_file (const char *file_name)
unsigned line = fn->line;
unsigned block_no;
function_t *probe, **prev;
-
+
/* Now insert it into the source file's list of
functions. Normally functions will be encountered in
ascending order, so a simple scan is quick. Note we're
@@ -967,8 +999,8 @@ process_file (const char *file_name)
{
unsigned last_line
= block->locations[i].lines.back () + 1;
- if (last_line > sources[s].num_lines)
- sources[s].num_lines = last_line;
+ if (last_line > sources[s].lines.size ())
+ sources[s].lines.resize (last_line);
}
}
}
@@ -987,7 +1019,7 @@ process_file (const char *file_name)
}
static void
-output_gcov_file (const char *file_name, source_t *src)
+output_gcov_file (const char *file_name, source_info *src)
{
char *gcov_file_name = make_gcov_file_name (file_name, src->coverage.name);
@@ -1020,14 +1052,8 @@ output_gcov_file (const char *file_name, source_t *src)
static void
generate_results (const char *file_name)
{
- unsigned ix;
- source_t *src;
function_t *fn;
- for (ix = n_sources, src = sources; ix--; src++)
- if (src->num_lines)
- src->lines = XCNEWVEC (line_t, src->num_lines);
-
for (fn = functions; fn; fn = fn->next)
{
coverage_t coverage;
@@ -1042,18 +1068,23 @@ generate_results (const char *file_name)
}
}
+ name_map needle;
+
if (file_name)
{
- name_map_t *name_map = (name_map_t *)bsearch
- (file_name, names, n_names, sizeof (*names), name_search);
- if (name_map)
- file_name = sources[name_map->src].coverage.name;
+ needle.name = file_name;
+ vector<name_map>::iterator it = std::find (names.begin (), names.end (),
+ needle);
+ if (it != names.end ())
+ file_name = sources[it->src].coverage.name;
else
file_name = canonicalize_name (file_name);
}
- for (ix = n_sources, src = sources; ix--; src++)
+ for (vector<source_info>::iterator it = sources.begin ();
+ it != sources.end (); it++)
{
+ source_info *src = &(*it);
if (flag_relative_only)
{
/* Ignore this source, if it is an absolute path (after
@@ -1088,17 +1119,8 @@ generate_results (const char *file_name)
static void
release_structures (void)
{
- unsigned ix;
function_t *fn;
- for (ix = n_sources; ix--;)
- free (sources[ix].lines);
- free (sources);
-
- for (ix = n_names; ix--;)
- free (names[ix].name);
- free (names);
-
while ((fn = functions))
{
functions = fn->next;
@@ -1174,90 +1196,45 @@ create_file_names (const char *file_name)
return;
}
-/* A is a string and B is a pointer to name_map_t. Compare for file
- name orderability. */
-
-static int
-name_search (const void *a_, const void *b_)
-{
- const char *a = (const char *)a_;
- const name_map_t *b = (const name_map_t *)b_;
-
-#if HAVE_DOS_BASED_FILE_SYSTEM
- return strcasecmp (a, b->name);
-#else
- return strcmp (a, b->name);
-#endif
-}
-
-/* A and B are a pointer to name_map_t. Compare for file name
- orderability. */
-
-static int
-name_sort (const void *a_, const void *b_)
-{
- const name_map_t *a = (const name_map_t *)a_;
- return name_search (a->name, b_);
-}
-
/* Find or create a source file structure for FILE_NAME. Copies
FILE_NAME on creation */
static unsigned
find_source (const char *file_name)
{
- name_map_t *name_map;
char *canon;
unsigned idx;
struct stat status;
if (!file_name)
file_name = "<unknown>";
- name_map = (name_map_t *)bsearch
- (file_name, names, n_names, sizeof (*names), name_search);
- if (name_map)
- {
- idx = name_map->src;
- goto check_date;
- }
- if (n_names + 2 > a_names)
+ name_map needle;
+ needle.name = file_name;
+
+ vector<name_map>::iterator it = std::find (names.begin (), names.end (),
+ needle);
+ if (it != names.end ())
{
- /* Extend the name map array -- we'll be inserting one or two
- entries. */
- a_names *= 2;
- name_map = XNEWVEC (name_map_t, a_names);
- memcpy (name_map, names, n_names * sizeof (*names));
- free (names);
- names = name_map;
+ idx = it->src;
+ goto check_date;
}
/* Not found, try the canonical name. */
canon = canonicalize_name (file_name);
- name_map = (name_map_t *) bsearch (canon, names, n_names, sizeof (*names),
- name_search);
- if (!name_map)
+ needle.name = canon;
+ it = std::find (names.begin (), names.end (), needle);
+ if (it == names.end ())
{
/* Not found with canonical name, create a new source. */
- source_t *src;
-
- if (n_sources == a_sources)
- {
- a_sources *= 2;
- src = XNEWVEC (source_t, a_sources);
- memcpy (src, sources, n_sources * sizeof (*sources));
- free (sources);
- sources = src;
- }
-
- idx = n_sources;
+ source_info *src;
- name_map = &names[n_names++];
- name_map->name = canon;
- name_map->src = idx;
+ idx = sources.size ();
+ needle = name_map (canon, idx);
+ names.push_back (needle);
- src = &sources[n_sources++];
- memset (src, 0, sizeof (*src));
+ sources.push_back (source_info ());
+ src = &sources.back ();
src->name = canon;
src->coverage.name = src->name;
if (source_length
@@ -1274,18 +1251,17 @@ find_source (const char *file_name)
src->file_time = status.st_mtime;
}
else
- idx = name_map->src;
+ idx = it->src;
- if (name_search (file_name, name_map))
+ needle.name = file_name;
+ if (std::find (names.begin (), names.end (), needle) == names.end ())
{
/* Append the non-canonical name. */
- name_map = &names[n_names++];
- name_map->name = xstrdup (file_name);
- name_map->src = idx;
+ names.push_back (name_map (xstrdup (file_name), idx));
}
/* Resort the name map. */
- qsort (names, n_names, sizeof (*names), name_sort);
+ std::sort (names.begin (), names.end ());
check_date:
if (sources[idx].file_time > bbg_file_time)
@@ -1947,6 +1923,33 @@ add_branch_counts (coverage_t *coverage, const arc_t *arc)
}
}
+/* Format COUNT; if flag_human_readable_numbers is set, return it in
+ human readable format. */
+
+static char const *
+format_count (gcov_type count)
+{
+ static char buffer[64];
+ const char *units = " kMGTPEZY";
+
+ if (count < 1000 || !flag_human_readable_numbers)
+ {
+ sprintf (buffer, "%" PRId64, count);
+ return buffer;
+ }
+
+ unsigned i;
+ gcov_type divisor = 1;
+ for (i = 0; units[i+1]; i++, divisor *= 1000)
+ {
+ if (count + divisor / 2 < 1000 * divisor)
+ break;
+ }
+ gcov_type r = (count + divisor / 2) / divisor;
+ sprintf (buffer, "%" PRId64 "%c", r, units[i]);
+ return buffer;
+}
+
/* Format a GCOV_TYPE integer as either a percent ratio, or absolute
count. If dp >= 0, format TOP/BOTTOM * 100 to DP decimal places.
If DP is zero, no decimal point is printed. Only print 100% when
@@ -1994,7 +1997,7 @@ format_gcov (gcov_type top, gcov_type bottom, int dp)
}
}
else
- sprintf (buffer, "%" PRId64, (int64_t)top);
+ return format_count (top);
return buffer;
}
@@ -2257,13 +2260,13 @@ add_line_counts (coverage_t *coverage, function_t *fn)
/* Scan each basic block. */
for (unsigned ix = 0; ix != fn->blocks.size (); ix++)
{
- line_t *line = NULL;
+ line_info *line = NULL;
block_t *block = &fn->blocks[ix];
if (block->count && ix && ix + 1 != fn->blocks.size ())
fn->blocks_executed++;
for (unsigned i = 0; i < block->locations.size (); i++)
{
- const source_t *src = &sources[block->locations[i].source_file_idx];
+ source_info *src = &sources[block->locations[i].source_file_idx];
vector<unsigned> &lines = block->locations[i].lines;
for (unsigned j = 0; j < lines.size (); j++)
@@ -2278,7 +2281,11 @@ add_line_counts (coverage_t *coverage, function_t *fn)
}
line->exists = 1;
if (!block->exceptional)
- line->unexceptional = 1;
+ {
+ line->unexceptional = 1;
+ if (block->count == 0)
+ line->has_unexecuted_block = 1;
+ }
line->count += block->count;
}
}
@@ -2290,8 +2297,7 @@ add_line_counts (coverage_t *coverage, function_t *fn)
/* Entry or exit block */;
else if (line != NULL)
{
- block->chain = line->blocks;
- line->blocks = block;
+ line->blocks.push_back (block);
if (flag_branches)
{
@@ -2299,8 +2305,7 @@ add_line_counts (coverage_t *coverage, function_t *fn)
for (arc = block->succ; arc; arc = arc->succ_next)
{
- arc->line_next = line->branches;
- line->branches = arc;
+ line->branches.push_back (arc);
if (coverage && !arc->is_unconditional)
add_branch_counts (coverage, arc);
}
@@ -2315,11 +2320,10 @@ add_line_counts (coverage_t *coverage, function_t *fn)
/* Accumulate the line counts of a file. */
static void
-accumulate_line_counts (source_t *src)
+accumulate_line_counts (source_info *src)
{
- line_t *line;
function_t *fn, *fn_p, *fn_n;
- unsigned ix;
+ unsigned ix = 0;
/* Reverse the function order. */
for (fn = src->functions, fn_p = NULL; fn; fn_p = fn, fn = fn_n)
@@ -2329,9 +2333,11 @@ accumulate_line_counts (source_t *src)
}
src->functions = fn_p;
- for (ix = src->num_lines, line = src->lines; ix--; line++)
+ for (vector<line_info>::reverse_iterator it = src->lines.rbegin ();
+ it != src->lines.rend (); it++)
{
- if (line->blocks)
+ line_info *line = &(*it);
+ if (!line->blocks.empty ())
{
/* The user expects the line count to be the number of times
a line has been executed. Simply summing the block count
@@ -2339,36 +2345,27 @@ accumulate_line_counts (source_t *src)
is to sum the entry counts to the graph of blocks on this
line, then find the elementary cycles of the local graph
and add the transition counts of those cycles. */
- block_t *block, *block_p, *block_n;
gcov_type count = 0;
- /* Reverse the block information. */
- for (block = line->blocks, block_p = NULL; block;
- block_p = block, block = block_n)
- {
- block_n = block->chain;
- block->chain = block_p;
- block->cycle.ident = ix;
- }
- line->blocks = block_p;
-
/* Sum the entry arcs. */
- for (block = line->blocks; block; block = block->chain)
+ for (vector<block_t *>::iterator it = line->blocks.begin ();
+ it != line->blocks.end (); it++)
{
arc_t *arc;
- for (arc = block->pred; arc; arc = arc->pred_next)
+ for (arc = (*it)->pred; arc; arc = arc->pred_next)
if (flag_branches)
add_branch_counts (&src->coverage, arc);
}
/* Cycle detection. */
- for (block = line->blocks; block; block = block->chain)
+ for (vector<block_t *>::iterator it = line->blocks.begin ();
+ it != line->blocks.end (); it++)
{
- for (arc_t *arc = block->pred; arc; arc = arc->pred_next)
+ for (arc_t *arc = (*it)->pred; arc; arc = arc->pred_next)
if (!line->has_block (arc->src))
count += arc->count;
- for (arc_t *arc = block->succ; arc; arc = arc->succ_next)
+ for (arc_t *arc = (*it)->succ; arc; arc = arc->succ_next)
arc->cs_count = arc->count;
}
@@ -2383,6 +2380,8 @@ accumulate_line_counts (source_t *src)
if (line->count)
src->coverage.lines_executed++;
}
+
+ ix++;
}
}
@@ -2468,28 +2467,101 @@ read_line (FILE *file)
return pos ? string : NULL;
}
+/* Pad string S with spaces on the left so that its total width equals 9. */
+
+static void
+pad_count_string (string &s)
+{
+ if (s.size () < 9)
+ s.insert (0, 9 - s.size (), ' ');
+}
+
+/* Print the beginning of a GCOV line to the F stream. If EXISTS is true,
+ the line exists in the source file. UNEXCEPTIONAL indicates that it is
+ not in an exceptional statement. The output is printed for LINE_NUM
+ with the given COUNT of executions. EXCEPTIONAL_STRING and
+ UNEXCEPTIONAL_STRING are used to indicate non-executed blocks. */
+
+static void
+output_line_beginning (FILE *f, bool exists, bool unexceptional,
+ bool has_unexecuted_block,
+ gcov_type count, unsigned line_num,
+ const char *exceptional_string,
+ const char *unexceptional_string)
+{
+ string s;
+ if (exists)
+ {
+ if (count > 0)
+ {
+ s = format_gcov (count, 0, -1);
+ if (has_unexecuted_block)
+ {
+ if (flag_use_colors)
+ {
+ pad_count_string (s);
+ s.insert (0, SGR_SEQ (COLOR_BG_MAGENTA COLOR_SEPARATOR COLOR_FG_WHITE));
+ s += SGR_RESET;
+ }
+ else
+ s += "*";
+ }
+ pad_count_string (s);
+ }
+ else
+ {
+ if (flag_use_colors)
+ {
+ s = "0";
+ pad_count_string (s);
+ if (unexceptional)
+ s.insert (0, SGR_SEQ (COLOR_BG_RED
+ COLOR_SEPARATOR COLOR_FG_WHITE));
+ else
+ s.insert (0, SGR_SEQ (COLOR_BG_CYAN
+ COLOR_SEPARATOR COLOR_FG_WHITE));
+ s += SGR_RESET;
+ }
+ else
+ {
+ s = unexceptional ? unexceptional_string : exceptional_string;
+ pad_count_string (s);
+ }
+ }
+ }
+ else
+ {
+ s = "-";
+ pad_count_string (s);
+ }
+
+ fprintf (f, "%s:%5u", s.c_str (), line_num);
+}
+
/* Read in the source file one line at a time, and output that line to
the gcov file preceded by its execution count and other
information. */
static void
-output_lines (FILE *gcov_file, const source_t *src)
+output_lines (FILE *gcov_file, const source_info *src)
{
+#define DEFAULT_LINE_START "        -:    0:"
+
FILE *source_file;
unsigned line_num; /* current line number. */
- const line_t *line; /* current line info ptr. */
+ const line_info *line; /* current line info ptr. */
const char *retval = ""; /* status of source file reading. */
function_t *fn = NULL;
- fprintf (gcov_file, "%9s:%5d:Source:%s\n", "-", 0, src->coverage.name);
+ fprintf (gcov_file, DEFAULT_LINE_START "Source:%s\n", src->coverage.name);
if (!multiple_files)
{
- fprintf (gcov_file, "%9s:%5d:Graph:%s\n", "-", 0, bbg_file_name);
- fprintf (gcov_file, "%9s:%5d:Data:%s\n", "-", 0,
+ fprintf (gcov_file, DEFAULT_LINE_START "Graph:%s\n", bbg_file_name);
+ fprintf (gcov_file, DEFAULT_LINE_START "Data:%s\n",
no_data_file ? "-" : da_file_name);
- fprintf (gcov_file, "%9s:%5d:Runs:%u\n", "-", 0, object_runs);
+ fprintf (gcov_file, DEFAULT_LINE_START "Runs:%u\n", object_runs);
}
- fprintf (gcov_file, "%9s:%5d:Programs:%u\n", "-", 0, program_count);
+ fprintf (gcov_file, DEFAULT_LINE_START "Programs:%u\n", program_count);
source_file = fopen (src->name, "r");
if (!source_file)
@@ -2498,13 +2570,13 @@ output_lines (FILE *gcov_file, const source_t *src)
retval = NULL;
}
else if (src->file_time == 0)
- fprintf (gcov_file, "%9s:%5d:Source is newer than graph\n", "-", 0);
+ fprintf (gcov_file, DEFAULT_LINE_START "Source is newer than graph\n");
if (flag_branches)
fn = src->functions;
for (line_num = 1, line = &src->lines[line_num];
- line_num < src->num_lines; line_num++, line++)
+ line_num < src->lines.size (); line_num++, line++)
{
for (; fn && fn->line == line_num; fn = fn->next_file_fn)
{
@@ -2537,44 +2609,44 @@ output_lines (FILE *gcov_file, const source_t *src)
Otherwise, print the execution count before the source line.
There are 16 spaces of indentation added before the source
line so that tabs won't be messed up. */
- fprintf (gcov_file, "%9s:%5u:%s\n",
- !line->exists ? "-" : line->count
- ? format_gcov (line->count, 0, -1)
- : line->unexceptional ? "#####" : "=====", line_num,
- retval ? retval : "/*EOF*/");
+ output_line_beginning (gcov_file, line->exists, line->unexceptional,
+ line->has_unexecuted_block, line->count, line_num,
+ "=====", "#####");
+ fprintf (gcov_file, ":%s\n", retval ? retval : "/*EOF*/");
if (flag_all_blocks)
{
- block_t *block;
arc_t *arc;
int ix, jx;
- for (ix = jx = 0, block = line->blocks; block;
- block = block->chain)
+ ix = jx = 0;
+ for (vector<block_t *>::const_iterator it = line->blocks.begin ();
+ it != line->blocks.end (); it++)
{
- if (!block->is_call_return)
+ if (!(*it)->is_call_return)
{
- fprintf (gcov_file, "%9s:%5u-block %2d",
- !line->exists ? "-" : block->count
- ? format_gcov (block->count, 0, -1)
- : block->exceptional ? "%%%%%" : "$$$$$",
- line_num, ix++);
+ output_line_beginning (gcov_file, line->exists,
+ (*it)->exceptional, false,
+ (*it)->count, line_num,
+ "%%%%%", "$$$$$");
+ fprintf (gcov_file, "-block %2d", ix++);
if (flag_verbose)
- fprintf (gcov_file, " (BB %u)", block->id);
+ fprintf (gcov_file, " (BB %u)", (*it)->id);
fprintf (gcov_file, "\n");
}
if (flag_branches)
- for (arc = block->succ; arc; arc = arc->succ_next)
+ for (arc = (*it)->succ; arc; arc = arc->succ_next)
jx += output_branch_count (gcov_file, jx, arc);
}
}
else if (flag_branches)
{
int ix;
- arc_t *arc;
- for (ix = 0, arc = line->branches; arc; arc = arc->line_next)
- ix += output_branch_count (gcov_file, ix, arc);
+ ix = 0;
+ for (vector<arc_t *>::const_iterator it = line->branches.begin ();
+ it != line->branches.end (); it++)
+ ix += output_branch_count (gcov_file, ix, (*it));
}
}
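
format_count above is new in this revision; the following is a minimal standalone C++ sketch of the same rounding scheme, written for illustration only (it is compiled apart from gcov, and the sample values are made up), showing how counts are scaled to the nearest unit suffix:

  /* Illustration only: mirrors the rounding in format_count, outside gcov.  */
  #include <cinttypes>
  #include <cstdint>
  #include <cstdio>

  static const char *
  human_readable (int64_t count, char *buffer)
  {
    const char *units = " kMGTPEZY";
    if (count < 1000)
      {
        sprintf (buffer, "%" PRId64, count);
        return buffer;
      }
    unsigned i;
    int64_t divisor = 1;
    /* Pick the largest unit for which the rounded value stays below 1000.  */
    for (i = 0; units[i + 1]; i++, divisor *= 1000)
      if (count + divisor / 2 < 1000 * divisor)
        break;
    sprintf (buffer, "%" PRId64 "%c", (count + divisor / 2) / divisor, units[i]);
    return buffer;
  }

  int
  main ()
  {
    char buf[64];
    printf ("%s\n", human_readable (950, buf));     /* prints 950  */
    printf ("%s\n", human_readable (123456, buf));  /* prints 123k */
    printf ("%s\n", human_readable (999999, buf));  /* prints 1M: rounds up past 999k */
    return 0;
  }
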
diff --git a/gcc/gdbinit.in b/gcc/gdbinit.in
index be56b0ee25b..3e1279e5b2a 100644
--- a/gcc/gdbinit.in
+++ b/gcc/gdbinit.in
@@ -252,6 +252,9 @@ skip file is-a.h
# And line-map.h.
skip file line-map.h
+# And timevar.h.
+skip file timevar.h
+
# Likewise, skip various inline functions in rtl.h.
skip rtx_expr_list::next
skip rtx_expr_list::element
diff --git a/gcc/gencfn-macros.c b/gcc/gencfn-macros.c
index 269429fabfc..5b38ac20a4d 100644
--- a/gcc/gencfn-macros.c
+++ b/gcc/gencfn-macros.c
@@ -98,11 +98,12 @@ is_group (string_set *builtins, const char *name, const char *const *suffixes)
static void
print_case_cfn (const char *name, bool internal_p,
- const char *const *suffixes)
+ const char *const *suffixes, bool floatn_p)
{
- printf ("#define CASE_CFN_%s", name);
+ const char *floatn = (floatn_p) ? "_FN" : "";
+ printf ("#define CASE_CFN_%s%s", name, floatn);
if (internal_p)
- printf (" \\\n case CFN_%s", name);
+ printf (" \\\n case CFN_%s%s", name, floatn);
for (unsigned int i = 0; suffixes[i]; ++i)
printf ("%s \\\n case CFN_BUILT_IN_%s%s",
internal_p || i > 0 ? ":" : "", name, suffixes[i]);
@@ -115,9 +116,10 @@ print_case_cfn (const char *name, bool internal_p,
static void
print_define_operator_list (const char *name, bool internal_p,
- const char *const *suffixes)
+ const char *const *suffixes, bool floatn_p)
{
- printf ("(define_operator_list %s\n", name);
+ const char *floatn = (floatn_p) ? "_FN" : "";
+ printf ("(define_operator_list %s%s\n", name, floatn);
for (unsigned int i = 0; suffixes[i]; ++i)
printf (" BUILT_IN_%s%s\n", name, suffixes[i]);
if (internal_p)
@@ -148,6 +150,8 @@ const char *const internal_fn_int_names[] = {
};
static const char *const flt_suffixes[] = { "F", "", "L", NULL };
+static const char *const fltfn_suffixes[] = { "F16", "F32", "F64", "F128",
+ "F32X", "F64X", "F128X", NULL };
static const char *const int_suffixes[] = { "", "L", "LL", "IMAX", NULL };
static const char *const *const suffix_lists[] = {
@@ -200,15 +204,33 @@ main (int argc, char **argv)
{
const char *root = name + 9;
for (unsigned int j = 0; suffix_lists[j]; ++j)
- if (is_group (&builtins, root, suffix_lists[j]))
- {
- bool internal_p = internal_fns.contains (root);
- if (type == 'c')
- print_case_cfn (root, internal_p, suffix_lists[j]);
- else
- print_define_operator_list (root, internal_p,
- suffix_lists[j]);
- }
+ {
+ const char *const *const suffix = suffix_lists[j];
+
+ if (is_group (&builtins, root, suffix))
+ {
+ bool internal_p = internal_fns.contains (root);
+
+ if (type == 'c')
+ print_case_cfn (root, internal_p, suffix, false);
+ else
+ print_define_operator_list (root, internal_p,
+ suffix, false);
+
+ /* Support the _Float<N> and _Float<N>X math functions if
+ they exist. We put these out as a separate CFN macro,
+ so code can add support or not as needed. */
+ if (suffix == flt_suffixes
+ && is_group (&builtins, root, fltfn_suffixes))
+ {
+ if (type == 'c')
+ print_case_cfn (root, false, fltfn_suffixes, true);
+ else
+ print_define_operator_list (root, false, fltfn_suffixes,
+ true);
+ }
+ }
+ }
}
}
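
For orientation, the extra _FN macros produced by the change above would expand roughly as sketched below for copysign (illustrative only: the exact member list depends on which _Float<N>/_Float<N>x built-ins exist); the gimple-ssa-backprop.c hunks later in this diff show the intended use:

  #define CASE_CFN_COPYSIGN_FN \
    case CFN_BUILT_IN_COPYSIGNF16: \
    case CFN_BUILT_IN_COPYSIGNF32: \
    case CFN_BUILT_IN_COPYSIGNF64: \
    case CFN_BUILT_IN_COPYSIGNF128: \
    case CFN_BUILT_IN_COPYSIGNF32X: \
    case CFN_BUILT_IN_COPYSIGNF64X: \
    case CFN_BUILT_IN_COPYSIGNF128X

  /* ...which a pass then uses together with the existing macro:  */
  switch (gimple_call_combined_fn (call))
    {
    CASE_CFN_COPYSIGN:
    CASE_CFN_COPYSIGN_FN:
      /* Handle every copysign variant in one place.  */
      break;
    default:
      break;
    }
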
diff --git a/gcc/genmodes.c b/gcc/genmodes.c
index 9a5ed03b6ec..64074629ea1 100644
--- a/gcc/genmodes.c
+++ b/gcc/genmodes.c
@@ -45,10 +45,6 @@ static const char *const mode_class_names[MAX_MODE_CLASS] =
# define EXTRA_MODES_FILE ""
#endif
-static int bits_per_unit;
-static int max_bitsize_mode_any_int;
-static int max_bitsize_mode_any_mode;
-
/* Data structure for building up what we know about a mode.
They're clustered by mode class. */
struct mode_data
@@ -383,9 +379,7 @@ complete_mode (struct mode_data *m)
break;
case MODE_VECTOR_BOOL:
- validate_mode (m, UNSET, UNSET, SET, SET, UNSET);
- m->precision = m->ncomponents;
- m->bytesize = (m->ncomponents + bits_per_unit - 1) / bits_per_unit;
+ validate_mode (m, UNSET, SET, SET, SET, UNSET);
break;
case MODE_VECTOR_INT:
@@ -539,14 +533,12 @@ make_vector_modes (enum mode_class cl, unsigned int width,
}
}
-/* Create a vector of booleans with the given number of elements.
- Each element has BImode and by default the vector is packed,
- with element 0 being the lsb of the first byte in memory.
- The target can create an upacked representation by changing
- the size of the vector. */
-#define VECTOR_BOOL_MODE(BITS) make_vector_bool_mode (BITS, __FILE__, __LINE__)
+/* Create a vector of booleans with COUNT elements and BYTESIZE bytes
+ in total. */
+#define VECTOR_BOOL_MODE(COUNT, BYTESIZE) \
+ make_vector_bool_mode (COUNT, BYTESIZE, __FILE__, __LINE__)
static void ATTRIBUTE_UNUSED
-make_vector_bool_mode (unsigned int bits,
+make_vector_bool_mode (unsigned int count, unsigned int bytesize,
const char *file, unsigned int line)
{
struct mode_data *m = find_mode ("BI");
@@ -557,7 +549,7 @@ make_vector_bool_mode (unsigned int bits,
}
char buf[8];
- if ((size_t) snprintf (buf, sizeof buf, "V%uBI", bits) >= sizeof buf)
+ if ((size_t) snprintf (buf, sizeof buf, "V%uBI", count) >= sizeof buf)
{
error ("%s:%d: number of vector elements is too high",
file, line);
@@ -567,7 +559,8 @@ make_vector_bool_mode (unsigned int bits,
struct mode_data *v = new_mode (MODE_VECTOR_BOOL,
xstrdup (buf), file, line);
v->component = m;
- v->ncomponents = bits;
+ v->ncomponents = count;
+ v->bytesize = bytesize;
}
/* Input. */
@@ -803,6 +796,10 @@ make_vector_mode (enum mode_class bclass,
#define ADJUST_IBIT(M, X) _ADD_ADJUST (ibit, M, X, ACCUM, UACCUM)
#define ADJUST_FBIT(M, X) _ADD_ADJUST (fbit, M, X, FRACT, UACCUM)
+static int bits_per_unit;
+static int max_bitsize_mode_any_int;
+static int max_bitsize_mode_any_mode;
+
static void
create_modes (void)
{
@@ -969,9 +966,9 @@ calc_wider_mode (void)
#define print_decl(TYPE, NAME, ASIZE) \
puts ("\nconst " TYPE " " NAME "[" ASIZE "] =\n{");
-#define print_maybe_const_decl(TYPE, NAME, ASIZE, CATEGORY) \
+#define print_maybe_const_decl(TYPE, NAME, ASIZE, NEEDS_ADJ) \
printf ("\n" TYPE " " NAME "[" ASIZE "] = \n{\n", \
- CATEGORY ? "" : "const ")
+ NEEDS_ADJ ? "" : "const ")
#define print_closer() puts ("};")
@@ -1034,6 +1031,9 @@ emit_mode_size_inline (void)
for (m = a->mode->contained; m; m = m->next_cont)
m->need_bytesize_adj = true;
}
+
+ /* Changing the number of units by a factor of X also changes the size
+ by a factor of X. */
for (mode_adjust *a = adj_nunits; a; a = a->next)
a->mode->need_bytesize_adj = true;
@@ -1049,7 +1049,7 @@ mode_size_inline (machine_mode mode)\n\
extern %spoly_uint16_pod mode_size[NUM_MACHINE_MODES];\n\
gcc_assert (mode >= 0 && mode < NUM_MACHINE_MODES);\n\
switch (mode)\n\
- {\n", adj_bytesize ? "" : "const ");
+ {\n", adj_nunits || adj_bytesize ? "" : "const ");
for_all_modes (c, m)
if (!m->need_bytesize_adj)
@@ -1305,7 +1305,8 @@ enum machine_mode\n{");
/* I can't think of a better idea, can you? */
printf ("#define CONST_MODE_NUNITS%s\n", adj_nunits ? "" : " const");
printf ("#define CONST_MODE_PRECISION%s\n", adj_nunits ? "" : " const");
- printf ("#define CONST_MODE_SIZE%s\n", adj_bytesize ? "" : " const");
+ printf ("#define CONST_MODE_SIZE%s\n",
+ adj_bytesize || adj_nunits ? "" : " const");
printf ("#define CONST_MODE_UNIT_SIZE%s\n", adj_bytesize ? "" : " const");
printf ("#define CONST_MODE_BASE_ALIGN%s\n", adj_alignment ? "" : " const");
#if 0 /* disabled for backward compatibility, temporary */
@@ -1718,15 +1719,18 @@ emit_mode_adjustments (void)
for (a = adj_nunits; a; a = a->next)
{
m = a->mode;
- printf ("\n /* %s:%d */\n ps = %s;\n",
+ printf ("\n"
+ " {\n"
+ " /* %s:%d */\n ps = %s;\n",
a->file, a->line, a->adjustment);
- printf (" mode_nunits[E_%smode] = ps;\n", m->name);
- printf (" mode_size[E_%smode] = mode_unit_size[E_%smode] * ps;\n",
+ printf (" int old_factor = vector_element_size"
+ " (mode_precision[E_%smode], mode_nunits[E_%smode]);\n",
m->name, m->name);
- if (m->precision == (unsigned int) -1)
- printf (" mode_precision[E_%smode]"
- " = poly_uint16 (mode_size[E_%smode])"
- " * BITS_PER_UNIT;", m->name, m->name);
+ printf (" mode_precision[E_%smode] = ps * old_factor;\n", m->name);
+ printf (" mode_size[E_%smode] = exact_div (mode_precision[E_%smode],"
+ " BITS_PER_UNIT);\n", m->name, m->name);
+ printf (" mode_nunits[E_%smode] = ps;\n", m->name);
+ printf (" }\n");
}
/* Size adjustments must be propagated to all containing modes.
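
Concretely, with the new two-argument interface a target's machine-modes.def could define a packed boolean vector roughly as follows (the numbers and the adjustment expression here are hypothetical, not taken from any target in this diff):

  /* 16 BImode elements packed into 2 bytes of storage; genmodes names it V16BI.  */
  VECTOR_BOOL_MODE (16, 2);

  /* A target with a runtime-sized vector length could then scale the element
     count via a target-provided expression (hypothetical hook shown); per the
     genmodes change above, size and precision follow the nunits adjustment.  */
  ADJUST_NUNITS (V16BI, target_bool_vector_elements ());
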
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 0a460a693c7..ed474edb2fc 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -6369,7 +6369,7 @@ fold_ctor_reference (tree type, tree ctor, poly_uint64 poly_offset,
/* We found the field with exact match. */
if (useless_type_conversion_p (type, TREE_TYPE (ctor))
- && known_zero (poly_offset))
+ && must_eq (poly_offset, 0U))
return canonicalize_constructor_val (unshare_expr (ctor), from_decl);
/* The remaining optimizations need a constant size and offset. */
@@ -6584,7 +6584,6 @@ gimple_get_virt_method_for_vtable (HOST_WIDE_INT token,
gcc_assert (init);
if (init == error_mark_node)
{
- gcc_assert (in_lto_p);
/* Pass down that we lost track of the target. */
if (can_refer)
*can_refer = false;
diff --git a/gcc/gimple-laddress.c b/gcc/gimple-laddress.c
index 45fa7c3cfd4..85a2b5932e0 100644
--- a/gcc/gimple-laddress.c
+++ b/gcc/gimple-laddress.c
@@ -111,7 +111,7 @@ pass_laddress::execute (function *fun)
poly_int64 bytepos = exact_div (bitpos, BITS_PER_UNIT);
if (offset != NULL_TREE)
{
- if (maybe_nonzero (bytepos))
+ if (may_ne (bytepos, 0))
offset = size_binop (PLUS_EXPR, offset, size_int (bytepos));
offset = force_gimple_operand_gsi (&gsi, offset, true, NULL,
true, GSI_SAME_STMT);
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index bc0558fa818..8ee84a7cc0d 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -82,21 +82,17 @@ debug_gimple_stmt (gimple *gs)
by xstrdup_for_dump. */
static const char *
-dump_profile (int frequency, profile_count &count)
+dump_profile (profile_count &count)
{
- float minimum = 0.01f;
-
- gcc_assert (0 <= frequency && frequency <= REG_BR_PROB_BASE);
- float fvalue = frequency * 100.0f / REG_BR_PROB_BASE;
- if (fvalue < minimum && frequency > 0)
- return "[0.01%]";
-
char *buf;
- if (count.initialized_p ())
- buf = xasprintf ("[%.2f%%] [count: %" PRId64 "]", fvalue,
+ if (!count.initialized_p ())
+ return NULL;
+ if (count.ipa_p ())
+ buf = xasprintf ("[count: %" PRId64 "]",
+ count.to_gcov_type ());
+ else if (count.initialized_p ())
+ buf = xasprintf ("[local count: %" PRId64 "]",
count.to_gcov_type ());
- else
- buf = xasprintf ("[%.2f%%] [count: INV]", fvalue);
const char *ret = xstrdup_for_dump (buf);
free (buf);
@@ -109,7 +105,7 @@ dump_profile (int frequency, profile_count &count)
by xstrdup_for_dump. */
static const char *
-dump_probability (profile_probability probability, profile_count &count)
+dump_probability (profile_probability probability)
{
float minimum = 0.01f;
float fvalue = -1;
@@ -122,13 +118,10 @@ dump_probability (profile_probability probability, profile_count &count)
}
char *buf;
- if (count.initialized_p ())
- buf = xasprintf ("[%.2f%%] [count: %" PRId64 "]", fvalue,
- count.to_gcov_type ());
- else if (probability.initialized_p ())
- buf = xasprintf ("[%.2f%%] [count: INV]", fvalue);
+ if (probability.initialized_p ())
+ buf = xasprintf ("[%.2f%%]", fvalue);
else
- buf = xasprintf ("[INV] [count: INV]");
+ buf = xasprintf ("[INV]");
const char *ret = xstrdup_for_dump (buf);
free (buf);
@@ -141,7 +134,7 @@ dump_probability (profile_probability probability, profile_count &count)
static void
dump_edge_probability (pretty_printer *buffer, edge e)
{
- pp_scalar (buffer, " %s", dump_probability (e->probability, e->count));
+ pp_scalar (buffer, " %s", dump_probability (e->probability));
}
/* Print GIMPLE statement G to FILE using SPC indentation spaces and
@@ -2678,8 +2671,7 @@ dump_gimple_bb_header (FILE *outf, basic_block bb, int indent,
fprintf (outf, "%*sbb_%d:\n", indent, "", bb->index);
else
fprintf (outf, "%*s<bb %d> %s:\n",
- indent, "", bb->index, dump_profile (bb->frequency,
- bb->count));
+ indent, "", bb->index, dump_profile (bb->count));
}
}
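
The net effect on dumps (numbers made up): basic-block headers now carry only a count annotation, distinguishing IPA-level from local counts, and edges carry only a probability, e.g.:

  <bb 3> [local count: 97603128]:
  <bb 4> [count: 1024]:
  goto <bb 5>; [66.67%]
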
diff --git a/gcc/gimple-ssa-backprop.c b/gcc/gimple-ssa-backprop.c
index 93365ae9c62..16363003115 100644
--- a/gcc/gimple-ssa-backprop.c
+++ b/gcc/gimple-ssa-backprop.c
@@ -354,6 +354,7 @@ backprop::process_builtin_call_use (gcall *call, tree rhs, usage_info *info)
break;
CASE_CFN_COPYSIGN:
+ CASE_CFN_COPYSIGN_FN:
/* The sign of the first input is ignored. */
if (rhs != gimple_call_arg (call, 1))
info->flags.ignore_sign = true;
@@ -373,6 +374,7 @@ backprop::process_builtin_call_use (gcall *call, tree rhs, usage_info *info)
}
CASE_CFN_FMA:
+ CASE_CFN_FMA_FN:
case CFN_FMS:
case CFN_FNMA:
case CFN_FNMS:
@@ -683,6 +685,7 @@ strip_sign_op_1 (tree rhs)
switch (gimple_call_combined_fn (call))
{
CASE_CFN_COPYSIGN:
+ CASE_CFN_COPYSIGN_FN:
return gimple_call_arg (call, 0);
default:
diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
index 807e0032410..9a010a6b395 100644
--- a/gcc/gimple-ssa-isolate-paths.c
+++ b/gcc/gimple-ssa-isolate-paths.c
@@ -154,7 +154,6 @@ isolate_path (basic_block bb, basic_block duplicate,
if (!duplicate)
{
duplicate = duplicate_block (bb, NULL, NULL);
- bb->frequency = 0;
bb->count = profile_count::zero ();
if (!ret_zero)
for (ei = ei_start (duplicate->succs); (e2 = ei_safe_edge (ei)); )
@@ -168,8 +167,7 @@ isolate_path (basic_block bb, basic_block duplicate,
flush_pending_stmts (e2);
/* Update profile only when redirection is really processed. */
- bb->frequency += EDGE_FREQUENCY (e);
- bb->count += e->count;
+ bb->count += e->count ();
}
/* There may be more than one statement in DUPLICATE which exhibits
diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index 7899e09195f..35ceb2cfb75 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -79,6 +79,7 @@ along with GCC; see the file COPYING3. If not see
#include "toplev.h"
#include "substring-locations.h"
#include "diagnostic.h"
+#include "domwalk.h"
/* The likely worst case value of MB_LEN_MAX for the target, large enough
for UTF-8. Ideally, this would be obtained by a target hook if it were
@@ -113,6 +114,19 @@ static int warn_level;
struct format_result;
+class sprintf_dom_walker : public dom_walker
+{
+ public:
+ sprintf_dom_walker () : dom_walker (CDI_DOMINATORS) {}
+ ~sprintf_dom_walker () {}
+
+ edge before_dom_children (basic_block) FINAL OVERRIDE;
+ bool handle_gimple_call (gimple_stmt_iterator *);
+
+ struct call_info;
+ bool compute_format_length (call_info &, format_result *);
+};
+
class pass_sprintf_length : public gimple_opt_pass
{
bool fold_return_value;
@@ -135,10 +149,6 @@ public:
fold_return_value = param;
}
- bool handle_gimple_call (gimple_stmt_iterator *);
-
- struct call_info;
- bool compute_format_length (call_info &, format_result *);
};
bool
@@ -583,7 +593,7 @@ get_format_string (tree format, location_t *ploc)
/* For convenience and brevity. */
static bool
- (* const fmtwarn) (const substring_loc &, const source_range *,
+ (* const fmtwarn) (const substring_loc &, location_t,
const char *, int, const char *, ...)
= format_warning_at_substring;
@@ -976,7 +986,7 @@ bytes_remaining (unsigned HOST_WIDE_INT navail, const format_result &res)
/* Description of a call to a formatted function. */
-struct pass_sprintf_length::call_info
+struct sprintf_dom_walker::call_info
{
/* Function call statement. */
gimple *callstmt;
@@ -2348,7 +2358,7 @@ format_plain (const directive &dir, tree)
should be diagnosed given the AVAILable space in the destination. */
static bool
-should_warn_p (const pass_sprintf_length::call_info &info,
+should_warn_p (const sprintf_dom_walker::call_info &info,
const result_range &avail, const result_range &result)
{
if (result.max <= avail.min)
@@ -2418,8 +2428,8 @@ should_warn_p (const pass_sprintf_length::call_info &info,
Return true if a warning has been issued. */
static bool
-maybe_warn (substring_loc &dirloc, source_range *pargrange,
- const pass_sprintf_length::call_info &info,
+maybe_warn (substring_loc &dirloc, location_t argloc,
+ const sprintf_dom_walker::call_info &info,
const result_range &avail_range, const result_range &res,
const directive &dir)
{
@@ -2476,8 +2486,8 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
: G_("%qE writing a terminating nul past the end "
"of the destination")));
- return fmtwarn (dirloc, NULL, NULL, info.warnopt (), fmtstr,
- info.func);
+ return fmtwarn (dirloc, UNKNOWN_LOCATION, NULL, info.warnopt (),
+ fmtstr, info.func);
}
if (res.min == res.max)
@@ -2500,7 +2510,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
"%wu bytes into a region of size %wu"))
: G_("%<%.*s%> directive writing %wu bytes "
"into a region of size %wu")));
- return fmtwarn (dirloc, pargrange, NULL,
+ return fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dir.len,
target_to_host (hostdir, sizeof hostdir, dir.beg),
res.min, navail);
@@ -2517,7 +2527,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
"up to %wu bytes into a region of size %wu"))
: G_("%<%.*s%> directive writing up to %wu bytes "
"into a region of size %wu"));
- return fmtwarn (dirloc, pargrange, NULL,
+ return fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dir.len,
target_to_host (hostdir, sizeof hostdir, dir.beg),
res.max, navail);
@@ -2537,7 +2547,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
"likely %wu or more bytes into a region of size %wu"))
: G_("%<%.*s%> directive writing likely %wu or more bytes "
"into a region of size %wu"));
- return fmtwarn (dirloc, pargrange, NULL,
+ return fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dir.len,
target_to_host (hostdir, sizeof hostdir, dir.beg),
res.likely, navail);
@@ -2554,7 +2564,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
"between %wu and %wu bytes into a region of size %wu"))
: G_("%<%.*s%> directive writing between %wu and "
"%wu bytes into a region of size %wu"));
- return fmtwarn (dirloc, pargrange, NULL,
+ return fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dir.len,
target_to_host (hostdir, sizeof hostdir, dir.beg),
res.min, res.max, navail);
@@ -2569,7 +2579,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
"%wu or more bytes into a region of size %wu"))
: G_("%<%.*s%> directive writing %wu or more bytes "
"into a region of size %wu"));
- return fmtwarn (dirloc, pargrange, NULL,
+ return fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dir.len,
target_to_host (hostdir, sizeof hostdir, dir.beg),
res.min, navail);
@@ -2603,7 +2613,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
: G_("%qE writing a terminating nul past the end "
"of the destination")));
- return fmtwarn (dirloc, NULL, NULL, info.warnopt (), fmtstr,
+ return fmtwarn (dirloc, UNKNOWN_LOCATION, NULL, info.warnopt (), fmtstr,
info.func);
}
@@ -2628,7 +2638,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
: G_("%<%.*s%> directive writing %wu bytes "
"into a region of size between %wu and %wu")));
- return fmtwarn (dirloc, pargrange, NULL,
+ return fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dir.len,
target_to_host (hostdir, sizeof hostdir, dir.beg),
res.min, avail_range.min, avail_range.max);
@@ -2647,7 +2657,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
"%wu and %wu"))
: G_("%<%.*s%> directive writing up to %wu bytes "
"into a region of size between %wu and %wu"));
- return fmtwarn (dirloc, pargrange, NULL,
+ return fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dir.len,
target_to_host (hostdir, sizeof hostdir, dir.beg),
res.max, avail_range.min, avail_range.max);
@@ -2669,7 +2679,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
"%wu and %wu"))
: G_("%<%.*s%> directive writing likely %wu or more bytes "
"into a region of size between %wu and %wu"));
- return fmtwarn (dirloc, pargrange, NULL,
+ return fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dir.len,
target_to_host (hostdir, sizeof hostdir, dir.beg),
res.likely, avail_range.min, avail_range.max);
@@ -2688,7 +2698,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
"between %wu and %wu"))
: G_("%<%.*s%> directive writing between %wu and "
"%wu bytes into a region of size between %wu and %wu"));
- return fmtwarn (dirloc, pargrange, NULL,
+ return fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dir.len,
target_to_host (hostdir, sizeof hostdir, dir.beg),
res.min, res.max, avail_range.min, avail_range.max);
@@ -2705,7 +2715,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
"%wu and %wu"))
: G_("%<%.*s%> directive writing %wu or more bytes "
"into a region of size between %wu and %wu"));
- return fmtwarn (dirloc, pargrange, NULL,
+ return fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dir.len,
target_to_host (hostdir, sizeof hostdir, dir.beg),
res.min, avail_range.min, avail_range.max);
@@ -2716,7 +2726,7 @@ maybe_warn (substring_loc &dirloc, source_range *pargrange,
in *RES. Return true if the directive has been handled. */
static bool
-format_directive (const pass_sprintf_length::call_info &info,
+format_directive (const sprintf_dom_walker::call_info &info,
format_result *res, const directive &dir)
{
/* Offset of the beginning of the directive from the beginning
@@ -2730,17 +2740,11 @@ format_directive (const pass_sprintf_length::call_info &info,
substring_loc dirloc (info.fmtloc, TREE_TYPE (info.format),
offset, start, length);
- /* Also create a location range for the argument if possible.
+ /* Also get the location of the argument if possible.
This doesn't work for integer literals or function calls. */
- source_range argrange;
- source_range *pargrange;
- if (dir.arg && CAN_HAVE_LOCATION_P (dir.arg))
- {
- argrange = EXPR_LOCATION_RANGE (dir.arg);
- pargrange = &argrange;
- }
- else
- pargrange = NULL;
+ location_t argloc = UNKNOWN_LOCATION;
+ if (dir.arg)
+ argloc = EXPR_LOCATION (dir.arg);
/* Bail when there is no function to compute the output length,
or when minimum length checking has been disabled. */
@@ -2797,7 +2801,7 @@ format_directive (const pass_sprintf_length::call_info &info,
if (fmtres.nullp)
{
- fmtwarn (dirloc, pargrange, NULL, info.warnopt (),
+ fmtwarn (dirloc, argloc, NULL, info.warnopt (),
"%<%.*s%> directive argument is null",
dirlen, target_to_host (hostdir, sizeof hostdir, dir.beg));
@@ -2816,7 +2820,7 @@ format_directive (const pass_sprintf_length::call_info &info,
bool warned = res->warned;
if (!warned)
- warned = maybe_warn (dirloc, pargrange, info, avail_range,
+ warned = maybe_warn (dirloc, argloc, info, avail_range,
fmtres.range, dir);
/* Bump up the total maximum if it isn't too big. */
@@ -2862,7 +2866,7 @@ format_directive (const pass_sprintf_length::call_info &info,
(like Glibc does under some conditions). */
if (fmtres.range.min == fmtres.range.max)
- warned = fmtwarn (dirloc, pargrange, NULL,
+ warned = fmtwarn (dirloc, argloc, NULL,
info.warnopt (),
"%<%.*s%> directive output of %wu bytes exceeds "
"minimum required size of 4095",
@@ -2878,7 +2882,7 @@ format_directive (const pass_sprintf_length::call_info &info,
: G_("%<%.*s%> directive output between %wu and %wu "
"bytes exceeds minimum required size of 4095"));
- warned = fmtwarn (dirloc, pargrange, NULL,
+ warned = fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dirlen,
target_to_host (hostdir, sizeof hostdir, dir.beg),
fmtres.range.min, fmtres.range.max);
@@ -2906,7 +2910,7 @@ format_directive (const pass_sprintf_length::call_info &info,
to exceed INT_MAX bytes. */
if (fmtres.range.min == fmtres.range.max)
- warned = fmtwarn (dirloc, pargrange, NULL, info.warnopt (),
+ warned = fmtwarn (dirloc, argloc, NULL, info.warnopt (),
"%<%.*s%> directive output of %wu bytes causes "
"result to exceed %<INT_MAX%>",
dirlen,
@@ -2920,7 +2924,7 @@ format_directive (const pass_sprintf_length::call_info &info,
"bytes causes result to exceed %<INT_MAX%>")
: G_ ("%<%.*s%> directive output between %wu and %wu "
"bytes may cause result to exceed %<INT_MAX%>"));
- warned = fmtwarn (dirloc, pargrange, NULL,
+ warned = fmtwarn (dirloc, argloc, NULL,
info.warnopt (), fmtstr, dirlen,
target_to_host (hostdir, sizeof hostdir, dir.beg),
fmtres.range.min, fmtres.range.max);
@@ -3010,7 +3014,7 @@ format_directive (const pass_sprintf_length::call_info &info,
the directive. */
static size_t
-parse_directive (pass_sprintf_length::call_info &info,
+parse_directive (sprintf_dom_walker::call_info &info,
directive &dir, format_result *res,
const char *str, unsigned *argno)
{
@@ -3351,7 +3355,7 @@ parse_directive (pass_sprintf_length::call_info &info,
substring_loc dirloc (info.fmtloc, TREE_TYPE (info.format),
caret, begin, end);
- fmtwarn (dirloc, NULL, NULL,
+ fmtwarn (dirloc, UNKNOWN_LOCATION, NULL,
info.warnopt (), "%<%.*s%> directive width out of range",
dir.len, target_to_host (hostdir, sizeof hostdir, dir.beg));
}
@@ -3385,7 +3389,7 @@ parse_directive (pass_sprintf_length::call_info &info,
substring_loc dirloc (info.fmtloc, TREE_TYPE (info.format),
caret, begin, end);
- fmtwarn (dirloc, NULL, NULL,
+ fmtwarn (dirloc, UNKNOWN_LOCATION, NULL,
info.warnopt (), "%<%.*s%> directive precision out of range",
dir.len, target_to_host (hostdir, sizeof hostdir, dir.beg));
}
@@ -3437,7 +3441,7 @@ parse_directive (pass_sprintf_length::call_info &info,
that caused the processing to be terminated early). */
bool
-pass_sprintf_length::compute_format_length (call_info &info,
+sprintf_dom_walker::compute_format_length (call_info &info,
format_result *res)
{
if (dump_file)
@@ -3520,7 +3524,7 @@ get_destination_size (tree dest)
of its return values. */
static bool
-is_call_safe (const pass_sprintf_length::call_info &info,
+is_call_safe (const sprintf_dom_walker::call_info &info,
const format_result &res, bool under4k,
unsigned HOST_WIDE_INT retval[2])
{
@@ -3579,7 +3583,7 @@ is_call_safe (const pass_sprintf_length::call_info &info,
static bool
try_substitute_return_value (gimple_stmt_iterator *gsi,
- const pass_sprintf_length::call_info &info,
+ const sprintf_dom_walker::call_info &info,
const format_result &res)
{
tree lhs = gimple_get_lhs (info.callstmt);
@@ -3696,7 +3700,7 @@ try_substitute_return_value (gimple_stmt_iterator *gsi,
static bool
try_simplify_call (gimple_stmt_iterator *gsi,
- const pass_sprintf_length::call_info &info,
+ const sprintf_dom_walker::call_info &info,
const format_result &res)
{
unsigned HOST_WIDE_INT dummy[2];
@@ -3723,7 +3727,7 @@ try_simplify_call (gimple_stmt_iterator *gsi,
and gsi_next should not be performed in the caller. */
bool
-pass_sprintf_length::handle_gimple_call (gimple_stmt_iterator *gsi)
+sprintf_dom_walker::handle_gimple_call (gimple_stmt_iterator *gsi)
{
call_info info = call_info ();
@@ -3988,6 +3992,24 @@ pass_sprintf_length::handle_gimple_call (gimple_stmt_iterator *gsi)
return call_removed;
}
+edge
+sprintf_dom_walker::before_dom_children (basic_block bb)
+{
+ for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si); )
+ {
+ /* Iterate over statements, looking for function calls. */
+ gimple *stmt = gsi_stmt (si);
+
+ if (is_gimple_call (stmt) && handle_gimple_call (&si))
+ /* If handle_gimple_call returns true, the iterator is
+ already pointing to the next statement. */
+ continue;
+
+ gsi_next (&si);
+ }
+ return NULL;
+}
+
/* Execute the pass for function FUN. */
unsigned int
@@ -3995,26 +4017,13 @@ pass_sprintf_length::execute (function *fun)
{
init_target_to_host_charmap ();
- basic_block bb;
- FOR_EACH_BB_FN (bb, fun)
- {
- for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si); )
- {
- /* Iterate over statements, looking for function calls. */
- gimple *stmt = gsi_stmt (si);
+ calculate_dominance_info (CDI_DOMINATORS);
- if (is_gimple_call (stmt) && handle_gimple_call (&si))
- /* If handle_gimple_call returns true, the iterator is
- already pointing to the next statement. */
- continue;
-
- gsi_next (&si);
- }
- }
+ sprintf_dom_walker sprintf_dom_walker;
+ sprintf_dom_walker.walk (ENTRY_BLOCK_PTR_FOR_FN (fun));
/* Clean up object size info. */
fini_object_sizes ();
-
return 0;
}
diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c
index 0a15ffdd98b..c5f01d774f5 100644
--- a/gcc/gimple-ssa-store-merging.c
+++ b/gcc/gimple-ssa-store-merging.c
@@ -19,7 +19,8 @@
<http://www.gnu.org/licenses/>. */
/* The purpose of this pass is to combine multiple memory stores of
- constant values to consecutive memory locations into fewer wider stores.
+ constant values, values loaded from memory or bitwise operations
+ on those to consecutive memory locations into fewer wider stores.
For example, if we have a sequence performing four byte stores to
consecutive memory locations:
[p ] := imm1;
@@ -29,21 +30,49 @@
we can transform this into a single 4-byte store if the target supports it:
[p] := imm1:imm2:imm3:imm4 //concatenated immediates according to endianness.
+ Or:
+ [p ] := [q ];
+ [p + 1B] := [q + 1B];
+ [p + 2B] := [q + 2B];
+ [p + 3B] := [q + 3B];
+ if there is no overlap, this can be transformed into a single 4-byte
+ load followed by a single 4-byte store.
+
+ Or:
+ [p ] := [q ] ^ imm1;
+ [p + 1B] := [q + 1B] ^ imm2;
+ [p + 2B] := [q + 2B] ^ imm3;
+ [p + 3B] := [q + 3B] ^ imm4;
+ if there is no overlap, this can be transformed into a single 4-byte
+ load, xored with imm1:imm2:imm3:imm4 and stored using a single 4-byte store.
+
The algorithm is applied to each basic block in three phases:
- 1) Scan through the basic block recording constant assignments to
+ 1) Scan through the basic block recording assignments to
destinations that can be expressed as a store to memory of a certain size
- at a certain bit offset. Record store chains to different bases in a
- hash_map (m_stores) and make sure to terminate such chains when appropriate
- (for example when when the stored values get used subsequently).
+ at a certain bit offset, from expressions we can handle. For bit-fields
+ we also note the surrounding bit region, i.e. the bits that could be
+ stored in a read-modify-write operation when storing the bit-field.
+ Record store chains to different bases in a hash_map (m_stores) and make
+ sure to terminate such chains when appropriate (for example when the
+ stored values get used subsequently).
These stores can be a result of structure element initializers, array stores
etc. A store_immediate_info object is recorded for every such store.
Record as many such assignments to a single base as possible until a
statement that interferes with the store sequence is encountered.
+ Each store has up to 2 operands, which can be an immediate constant
+ or a memory load, from which the value to be stored can be computed.
+ At most one of the operands can be a constant. The operands are recorded
+ in a store_operand_info struct.
2) Analyze the chain of stores recorded in phase 1) (i.e. the vector of
store_immediate_info objects) and coalesce contiguous stores into
- merged_store_group objects.
+ merged_store_group objects. For bit-field stores the stores themselves
+ don't need to be contiguous; only their surrounding bit regions have to
+ be contiguous. If the expression being stored differs between adjacent
+ stores, such as one store storing a constant and the following one
+ storing a value loaded from memory, or if the loaded memory objects are
+ not adjacent, a new merged_store_group is created as well.
For example, given the stores:
[p ] := 0;
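
As a concrete, hypothetical source-level example of the new capability described in this comment, the pass can now merge a byte-wise copy combined with per-byte constants into one wider load, one xor and one wider store, subject to aliasing checks and target support. The function below (type and names invented for illustration) matches the [p + nB] := [q + nB] ^ immN pattern above:

  struct bytes { unsigned char a, b, c, d; };

  void
  mix (struct bytes *p, const struct bytes *q)
  {
    /* Four adjacent one-byte stores, each fed by a load and a constant.  */
    p->a = q->a ^ 0x01;
    p->b = q->b ^ 0x02;
    p->c = q->c ^ 0x03;
    p->d = q->d ^ 0x04;
  }
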
@@ -126,14 +155,43 @@
#include "tree-eh.h"
#include "target.h"
#include "gimplify-me.h"
+#include "rtl.h"
+#include "expr.h" /* For get_bit_range. */
#include "selftest.h"
/* The maximum size (in bits) of the stores this pass should generate. */
#define MAX_STORE_BITSIZE (BITS_PER_WORD)
#define MAX_STORE_BYTES (MAX_STORE_BITSIZE / BITS_PER_UNIT)
+/* Limit to bound the number of aliasing checks for loads with the same
+ vuse as the corresponding store. */
+#define MAX_STORE_ALIAS_CHECKS 64
+
namespace {
+/* Struct recording one operand of the store: either a constant, in which
+ case VAL represents the constant and all the other fields are zero,
+ or a memory load, in which case VAL represents the reference, BASE_ADDR
+ is non-NULL and the other fields also reflect the memory load. */
+
+struct store_operand_info
+{
+ tree val;
+ tree base_addr;
+ unsigned HOST_WIDE_INT bitsize;
+ unsigned HOST_WIDE_INT bitpos;
+ unsigned HOST_WIDE_INT bitregion_start;
+ unsigned HOST_WIDE_INT bitregion_end;
+ gimple *stmt;
+ store_operand_info ();
+};
+
+store_operand_info::store_operand_info ()
+ : val (NULL_TREE), base_addr (NULL_TREE), bitsize (0), bitpos (0),
+ bitregion_start (0), bitregion_end (0), stmt (NULL)
+{
+}
+
/* Struct recording the information about a single store of an immediate
to memory. These are created in the first phase and coalesced into
merged_store_group objects in the second phase. */
@@ -142,19 +200,45 @@ struct store_immediate_info
{
unsigned HOST_WIDE_INT bitsize;
unsigned HOST_WIDE_INT bitpos;
+ unsigned HOST_WIDE_INT bitregion_start;
+ /* This is one past the last bit of the bit region. */
+ unsigned HOST_WIDE_INT bitregion_end;
gimple *stmt;
unsigned int order;
+ /* INTEGER_CST for constant stores, MEM_REF for memory copy or
+ BIT_*_EXPR for logical bitwise operation. */
+ enum tree_code rhs_code;
+ /* Operands. For BIT_*_EXPR rhs_code both operands are used, otherwise
+ just the first one. */
+ store_operand_info ops[2];
store_immediate_info (unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
- gimple *, unsigned int);
+ unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ gimple *, unsigned int, enum tree_code,
+ const store_operand_info &,
+ const store_operand_info &);
};
store_immediate_info::store_immediate_info (unsigned HOST_WIDE_INT bs,
unsigned HOST_WIDE_INT bp,
+ unsigned HOST_WIDE_INT brs,
+ unsigned HOST_WIDE_INT bre,
gimple *st,
- unsigned int ord)
- : bitsize (bs), bitpos (bp), stmt (st), order (ord)
+ unsigned int ord,
+ enum tree_code rhscode,
+ const store_operand_info &op0r,
+ const store_operand_info &op1r)
+ : bitsize (bs), bitpos (bp), bitregion_start (brs), bitregion_end (bre),
+ stmt (st), order (ord), rhs_code (rhscode)
+#if __cplusplus >= 201103L
+ , ops { op0r, op1r }
+{
+}
+#else
{
+ ops[0] = op0r;
+ ops[1] = op1r;
}
+#endif
/* Struct representing a group of stores to contiguous memory locations.
These are produced by the second phase (coalescing) and consumed in the
@@ -164,26 +248,34 @@ struct merged_store_group
{
unsigned HOST_WIDE_INT start;
unsigned HOST_WIDE_INT width;
- /* The size of the allocated memory for val. */
+ unsigned HOST_WIDE_INT bitregion_start;
+ unsigned HOST_WIDE_INT bitregion_end;
+ /* The size of the allocated memory for val and mask. */
unsigned HOST_WIDE_INT buf_size;
+ unsigned HOST_WIDE_INT align_base;
+ unsigned HOST_WIDE_INT load_align_base[2];
unsigned int align;
+ unsigned int load_align[2];
unsigned int first_order;
unsigned int last_order;
- auto_vec<struct store_immediate_info *> stores;
+ auto_vec<store_immediate_info *> stores;
/* We record the first and last original statements in the sequence because
we'll need their vuse/vdef and replacement position. It's easier to keep
track of them separately as 'stores' is reordered by apply_stores. */
gimple *last_stmt;
gimple *first_stmt;
unsigned char *val;
+ unsigned char *mask;
merged_store_group (store_immediate_info *);
~merged_store_group ();
void merge_into (store_immediate_info *);
void merge_overlapping (store_immediate_info *);
bool apply_stores ();
+private:
+ void do_merge (store_immediate_info *);
};
/* Debug helper. Dump LEN elements of byte array PTR to FD in hex. */
@@ -287,8 +379,7 @@ clear_bit_region_be (unsigned char *ptr, unsigned int start,
&& len > BITS_PER_UNIT)
{
unsigned int nbytes = len / BITS_PER_UNIT;
- for (unsigned int i = 0; i < nbytes; i++)
- ptr[i] = 0U;
+ memset (ptr, 0, nbytes);
if (len % BITS_PER_UNIT != 0)
clear_bit_region_be (ptr + nbytes, BITS_PER_UNIT - 1,
len % BITS_PER_UNIT);
@@ -552,10 +643,30 @@ merged_store_group::merged_store_group (store_immediate_info *info)
{
start = info->bitpos;
width = info->bitsize;
+ bitregion_start = info->bitregion_start;
+ bitregion_end = info->bitregion_end;
/* VAL has memory allocated for it in apply_stores once the group
width has been finalized. */
val = NULL;
- align = get_object_alignment (gimple_assign_lhs (info->stmt));
+ mask = NULL;
+ unsigned HOST_WIDE_INT align_bitpos = 0;
+ get_object_alignment_1 (gimple_assign_lhs (info->stmt),
+ &align, &align_bitpos);
+ align_base = start - align_bitpos;
+ for (int i = 0; i < 2; ++i)
+ {
+ store_operand_info &op = info->ops[i];
+ if (op.base_addr == NULL_TREE)
+ {
+ load_align[i] = 0;
+ load_align_base[i] = 0;
+ }
+ else
+ {
+ get_object_alignment_1 (op.val, &load_align[i], &align_bitpos);
+ load_align_base[i] = op.bitpos - align_bitpos;
+ }
+ }
stores.create (1);
stores.safe_push (info);
last_stmt = info->stmt;
@@ -571,18 +682,37 @@ merged_store_group::~merged_store_group ()
XDELETEVEC (val);
}
-/* Merge a store recorded by INFO into this merged store.
- The store is not overlapping with the existing recorded
- stores. */
-
+/* Helper method for merge_into and merge_overlapping to do
+ the common part. */
void
-merged_store_group::merge_into (store_immediate_info *info)
+merged_store_group::do_merge (store_immediate_info *info)
{
- unsigned HOST_WIDE_INT wid = info->bitsize;
- /* Make sure we're inserting in the position we think we're inserting. */
- gcc_assert (info->bitpos == start + width);
+ bitregion_start = MIN (bitregion_start, info->bitregion_start);
+ bitregion_end = MAX (bitregion_end, info->bitregion_end);
+
+ unsigned int this_align;
+ unsigned HOST_WIDE_INT align_bitpos = 0;
+ get_object_alignment_1 (gimple_assign_lhs (info->stmt),
+ &this_align, &align_bitpos);
+ if (this_align > align)
+ {
+ align = this_align;
+ align_base = info->bitpos - align_bitpos;
+ }
+ for (int i = 0; i < 2; ++i)
+ {
+ store_operand_info &op = info->ops[i];
+ if (!op.base_addr)
+ continue;
+
+ get_object_alignment_1 (op.val, &this_align, &align_bitpos);
+ if (this_align > load_align[i])
+ {
+ load_align[i] = this_align;
+ load_align_base[i] = op.bitpos - align_bitpos;
+ }
+ }
- width += wid;
gimple *stmt = info->stmt;
stores.safe_push (info);
if (info->order > last_order)
@@ -597,6 +727,22 @@ merged_store_group::merge_into (store_immediate_info *info)
}
}
+/* Merge a store recorded by INFO into this merged store.
+ The store is not overlapping with the existing recorded
+ stores. */
+
+void
+merged_store_group::merge_into (store_immediate_info *info)
+{
+ unsigned HOST_WIDE_INT wid = info->bitsize;
+ /* Make sure we're inserting in the position we think we're inserting. */
+ gcc_assert (info->bitpos >= start + width
+ && info->bitregion_start <= bitregion_end);
+
+ width += wid;
+ do_merge (info);
+}
+
/* Merge a store described by INFO into this merged store.
INFO overlaps in some way with the current store (i.e. it's not contiguous
which is handled by merged_store_group::merge_into). */
@@ -604,23 +750,11 @@ merged_store_group::merge_into (store_immediate_info *info)
void
merged_store_group::merge_overlapping (store_immediate_info *info)
{
- gimple *stmt = info->stmt;
- stores.safe_push (info);
-
/* If the store extends the size of the group, extend the width. */
- if ((info->bitpos + info->bitsize) > (start + width))
+ if (info->bitpos + info->bitsize > start + width)
width += info->bitpos + info->bitsize - (start + width);
- if (info->order > last_order)
- {
- last_order = info->order;
- last_stmt = stmt;
- }
- else if (info->order < first_order)
- {
- first_order = info->order;
- first_stmt = stmt;
- }
+ do_merge (info);
}
/* Go through all the recorded stores in this group in program order and
@@ -630,37 +764,43 @@ merged_store_group::merge_overlapping (store_immediate_info *info)
bool
merged_store_group::apply_stores ()
{
- /* The total width of the stores must add up to a whole number of bytes
- and start at a byte boundary. We don't support emitting bitfield
- references for now. Also, make sure we have more than one store
- in the group, otherwise we cannot merge anything. */
- if (width % BITS_PER_UNIT != 0
- || start % BITS_PER_UNIT != 0
+ /* Make sure we have more than one store in the group, otherwise we cannot
+ merge anything. */
+ if (bitregion_start % BITS_PER_UNIT != 0
+ || bitregion_end % BITS_PER_UNIT != 0
|| stores.length () == 1)
return false;
stores.qsort (sort_by_order);
- struct store_immediate_info *info;
+ store_immediate_info *info;
unsigned int i;
/* Create a buffer of a size that is 2 times the number of bytes we're
storing. That way native_encode_expr can write power-of-2-sized
chunks without overrunning. */
- buf_size = 2 * (ROUND_UP (width, BITS_PER_UNIT) / BITS_PER_UNIT);
- val = XCNEWVEC (unsigned char, buf_size);
+ buf_size = 2 * ((bitregion_end - bitregion_start) / BITS_PER_UNIT);
+ val = XNEWVEC (unsigned char, 2 * buf_size);
+ mask = val + buf_size;
+ memset (val, 0, buf_size);
+ memset (mask, ~0U, buf_size);
FOR_EACH_VEC_ELT (stores, i, info)
{
- unsigned int pos_in_buffer = info->bitpos - start;
- bool ret = encode_tree_to_bitpos (gimple_assign_rhs1 (info->stmt),
- val, info->bitsize,
- pos_in_buffer, buf_size);
- if (dump_file && (dump_flags & TDF_DETAILS))
+ unsigned int pos_in_buffer = info->bitpos - bitregion_start;
+ tree cst = NULL_TREE;
+ if (info->ops[0].val && info->ops[0].base_addr == NULL_TREE)
+ cst = info->ops[0].val;
+ else if (info->ops[1].val && info->ops[1].base_addr == NULL_TREE)
+ cst = info->ops[1].val;
+ bool ret = true;
+ if (cst)
+ ret = encode_tree_to_bitpos (cst, val, info->bitsize,
+ pos_in_buffer, buf_size);
+ if (cst && dump_file && (dump_flags & TDF_DETAILS))
{
if (ret)
{
fprintf (dump_file, "After writing ");
- print_generic_expr (dump_file,
- gimple_assign_rhs1 (info->stmt), 0);
+ print_generic_expr (dump_file, cst, 0);
fprintf (dump_file, " of size " HOST_WIDE_INT_PRINT_DEC
" at position %d the merged region contains:\n",
info->bitsize, pos_in_buffer);
@@ -671,6 +811,13 @@ merged_store_group::apply_stores ()
}
if (!ret)
return false;
+ unsigned char *m = mask + (pos_in_buffer / BITS_PER_UNIT);
+ if (BYTES_BIG_ENDIAN)
+ clear_bit_region_be (m, (BITS_PER_UNIT - 1
+ - (pos_in_buffer % BITS_PER_UNIT)),
+ info->bitsize);
+ else
+ clear_bit_region (m, pos_in_buffer % BITS_PER_UNIT, info->bitsize);
}
return true;
}
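
As a rough aid to reading apply_stores above: the group keeps two parallel byte buffers, val (the constant bits written so far) and mask (initially all ones, cleared wherever a recorded constant store landed), so any byte whose mask is still ~0 is padding that must be preserved.  A minimal standalone sketch of that bookkeeping, simplified to whole-byte little-endian constant stores and using no GCC internals:

#include <stdio.h>
#include <string.h>

#define REGION_BYTES 4

static unsigned char val[REGION_BYTES];
static unsigned char mask[REGION_BYTES];  /* 0xff == byte never written.  */

/* Roughly what native_encode_expr plus clear_bit_region do for a
   byte-aligned constant store of LEN bytes at byte offset POS.  */
static void
record_const_store (unsigned pos, unsigned len, unsigned long long cst)
{
  for (unsigned i = 0; i < len; i++)
    {
      val[pos + i] = (cst >> (8 * i)) & 0xff;
      mask[pos + i] = 0;
    }
}

int
main (void)
{
  memset (val, 0, sizeof val);
  memset (mask, 0xff, sizeof mask);
  record_const_store (0, 2, 0x1234);  /* e.g. x.a = 0x1234;  */
  record_const_store (3, 1, 0x56);    /* e.g. x.d = 0x56; byte 2 untouched.  */
  for (unsigned i = 0; i < REGION_BYTES; i++)
    printf ("byte %u: val=0x%02x mask=0x%02x\n", i, val[i], mask[i]);
  return 0;
}
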
@@ -685,7 +832,7 @@ struct imm_store_chain_info
See pass_store_merging::m_stores_head for more rationale. */
imm_store_chain_info *next, **pnxp;
tree base_addr;
- auto_vec<struct store_immediate_info *> m_store_info;
+ auto_vec<store_immediate_info *> m_store_info;
auto_vec<merged_store_group *> m_merged_store_groups;
imm_store_chain_info (imm_store_chain_info *&inspt, tree b_a)
@@ -733,11 +880,16 @@ public:
{
}
- /* Pass not supported for PDP-endianness. */
+ /* Pass not supported for PDP-endianness, nor for insane hosts
+ or target character sizes where native_{encode,interpret}_expr
+ doesn't work properly. */
virtual bool
gate (function *)
{
- return flag_store_merging && (WORDS_BIG_ENDIAN == BYTES_BIG_ENDIAN);
+ return flag_store_merging
+ && WORDS_BIG_ENDIAN == BYTES_BIG_ENDIAN
+ && CHAR_BIT == 8
+ && BITS_PER_UNIT == 8;
}
virtual unsigned int execute (function *);
@@ -756,9 +908,10 @@ private:
decisions when going out of SSA). */
imm_store_chain_info *m_stores_head;
+ void process_store (gimple *);
bool terminate_and_process_all_chains ();
bool terminate_all_aliasing_chains (imm_store_chain_info **,
- bool, gimple *);
+ gimple *);
bool terminate_and_release_chain (imm_store_chain_info *);
}; // class pass_store_merging
@@ -788,7 +941,6 @@ pass_store_merging::terminate_and_process_all_chains ()
bool
pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info
**chain_info,
- bool var_offset_p,
gimple *stmt)
{
bool ret = false;
@@ -802,37 +954,21 @@ pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info
of a chain. */
if (chain_info)
{
- /* We have a chain at BASE and we're writing to [BASE + <variable>].
- This can interfere with any of the stores so terminate
- the chain. */
- if (var_offset_p)
- {
- terminate_and_release_chain (*chain_info);
- ret = true;
- }
- /* Otherwise go through every store in the chain to see if it
- aliases with any of them. */
- else
+ store_immediate_info *info;
+ unsigned int i;
+ FOR_EACH_VEC_ELT ((*chain_info)->m_store_info, i, info)
{
- struct store_immediate_info *info;
- unsigned int i;
- FOR_EACH_VEC_ELT ((*chain_info)->m_store_info, i, info)
+ if (ref_maybe_used_by_stmt_p (stmt, gimple_assign_lhs (info->stmt))
+ || stmt_may_clobber_ref_p (stmt, gimple_assign_lhs (info->stmt)))
{
- if (ref_maybe_used_by_stmt_p (stmt,
- gimple_assign_lhs (info->stmt))
- || stmt_may_clobber_ref_p (stmt,
- gimple_assign_lhs (info->stmt)))
+ if (dump_file && (dump_flags & TDF_DETAILS))
{
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file,
- "stmt causes chain termination:\n");
- print_gimple_stmt (dump_file, stmt, 0);
- }
- terminate_and_release_chain (*chain_info);
- ret = true;
- break;
+ fprintf (dump_file, "stmt causes chain termination:\n");
+ print_gimple_stmt (dump_file, stmt, 0);
}
+ terminate_and_release_chain (*chain_info);
+ ret = true;
+ break;
}
}
}
@@ -877,6 +1013,125 @@ pass_store_merging::terminate_and_release_chain (imm_store_chain_info *chain_inf
return ret;
}
+/* Return true if stmts in between FIRST (inclusive) and LAST (exclusive)
+ may clobber REF. FIRST and LAST must be in the same basic block and
+ have non-NULL vdef. */
+
+bool
+stmts_may_clobber_ref_p (gimple *first, gimple *last, tree ref)
+{
+ ao_ref r;
+ ao_ref_init (&r, ref);
+ unsigned int count = 0;
+ tree vop = gimple_vdef (last);
+ gimple *stmt;
+
+ gcc_checking_assert (gimple_bb (first) == gimple_bb (last));
+ do
+ {
+ stmt = SSA_NAME_DEF_STMT (vop);
+ if (stmt_may_clobber_ref_p_1 (stmt, &r))
+ return true;
+ /* Avoid quadratic compile time by bounding the number of checks
+ we perform. */
+ if (++count > MAX_STORE_ALIAS_CHECKS)
+ return true;
+ vop = gimple_vuse (stmt);
+ }
+ while (stmt != first);
+ return false;
+}
+
+/* Return true if INFO->ops[IDX] is mergeable with the
+ corresponding loads already in MERGED_STORE group.
+ BASE_ADDR is the base address of the whole store group. */
+
+bool
+compatible_load_p (merged_store_group *merged_store,
+ store_immediate_info *info,
+ tree base_addr, int idx)
+{
+ store_immediate_info *infof = merged_store->stores[0];
+ if (!info->ops[idx].base_addr
+ || (info->ops[idx].bitpos - infof->ops[idx].bitpos
+ != info->bitpos - infof->bitpos)
+ || !operand_equal_p (info->ops[idx].base_addr,
+ infof->ops[idx].base_addr, 0))
+ return false;
+
+ store_immediate_info *infol = merged_store->stores.last ();
+ tree load_vuse = gimple_vuse (info->ops[idx].stmt);
+ /* In this case all vuses should be the same, e.g.
+ _1 = s.a; _2 = s.b; _3 = _1 | 1; t.a = _3; _4 = _2 | 2; t.b = _4;
+ or
+ _1 = s.a; _2 = s.b; t.a = _1; t.b = _2;
+ and we can emit the coalesced load next to any of those loads. */
+ if (gimple_vuse (infof->ops[idx].stmt) == load_vuse
+ && gimple_vuse (infol->ops[idx].stmt) == load_vuse)
+ return true;
+
+ /* Otherwise, at least for now require that the load has the same
+ vuse as the store. See following examples. */
+ if (gimple_vuse (info->stmt) != load_vuse)
+ return false;
+
+ if (gimple_vuse (infof->stmt) != gimple_vuse (infof->ops[idx].stmt)
+ || (infof != infol
+ && gimple_vuse (infol->stmt) != gimple_vuse (infol->ops[idx].stmt)))
+ return false;
+
+  /* If the load is from the same location as the store, the construction
+     of the immediate store chain already guarantees that there are no
+     intervening stores, so no further checks are needed.  Example:
+ _1 = s.a; _2 = _1 & -7; s.a = _2; _3 = s.b; _4 = _3 & -7; s.b = _4; */
+ if (info->ops[idx].bitpos == info->bitpos
+ && operand_equal_p (info->ops[idx].base_addr, base_addr, 0))
+ return true;
+
+ /* Otherwise, we need to punt if any of the loads can be clobbered by any
+ of the stores in the group, or any other stores in between those.
+ Previous calls to compatible_load_p ensured that for all the
+ merged_store->stores IDX loads, no stmts starting with
+ merged_store->first_stmt and ending right before merged_store->last_stmt
+     clobber those loads.  */
+ gimple *first = merged_store->first_stmt;
+ gimple *last = merged_store->last_stmt;
+ unsigned int i;
+ store_immediate_info *infoc;
+  /* The stores are sorted by increasing store bitpos, so if the store in
+     info->stmt comes before the so-far first statement of the group, we'll
+     be changing merged_store->first_stmt.  In that case we need to give up
+     if any of the stmts in the new range clobber any of the earlier
+     processed loads.  */
+ if (info->order < merged_store->first_order)
+ {
+ FOR_EACH_VEC_ELT (merged_store->stores, i, infoc)
+ if (stmts_may_clobber_ref_p (info->stmt, first, infoc->ops[idx].val))
+ return false;
+ first = info->stmt;
+ }
+ /* Similarly, we could change merged_store->last_stmt, so ensure
+ in that case no stmts in the new range clobber any of the earlier
+ processed loads. */
+ else if (info->order > merged_store->last_order)
+ {
+ FOR_EACH_VEC_ELT (merged_store->stores, i, infoc)
+ if (stmts_may_clobber_ref_p (last, info->stmt, infoc->ops[idx].val))
+ return false;
+ last = info->stmt;
+ }
+ /* And finally, we'd be adding a new load to the set, ensure it isn't
+ clobbered in the new range. */
+ if (stmts_may_clobber_ref_p (first, last, info->ops[idx].val))
+ return false;
+
+ /* Otherwise, we are looking for:
+ _1 = s.a; _2 = _1 ^ 15; t.a = _2; _3 = s.b; _4 = _3 ^ 15; t.b = _4;
+ or
+ _1 = s.a; t.a = _1; _2 = s.b; t.b = _2; */
+ return true;
+}
+
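
To make the GIMPLE snippets in the comments above a little more concrete, here is a hypothetical C-level picture (struct S, the globals s and t and the pointer p are made up for illustration) of what compatible_load_p and stmts_may_clobber_ref_p allow and reject; whether merging actually happens still depends on the target and the usual profitability checks:

struct S { unsigned char a, b, c, d; };
struct S s, t;                 /* distinct objects, so no aliasing between them */

/* Each store's rhs is a load from s combined with a constant, the load
   offsets track the store offsets, and nothing in between clobbers s;
   the pass can emit one 4-byte load, one 4-byte BIT_IOR_EXPR and one
   4-byte store.  */
void
copy_or (void)
{
  t.a = s.a | 1;
  t.b = s.b | 2;
  t.c = s.c | 4;
  t.d = s.d | 8;
}

extern unsigned char *p;

/* Here *p may alias s or t, so the store through p terminates the chain
   (and the clobber checks would reject widening the loads across it).  */
void
maybe_clobbered (void)
{
  t.a = s.a;
  *p = 0;
  t.b = s.b;
}
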
/* Go through the candidate stores recorded in m_store_info and merge them
into merged_store_group objects recorded into m_merged_store_groups
representing the widened stores. Return true if coalescing was successful
@@ -924,38 +1179,63 @@ imm_store_chain_info::coalesce_immediate_stores ()
if (IN_RANGE (start, merged_store->start,
merged_store->start + merged_store->width - 1))
{
- merged_store->merge_overlapping (info);
- continue;
+ /* Only allow overlapping stores of constants. */
+ if (info->rhs_code == INTEGER_CST
+ && merged_store->stores[0]->rhs_code == INTEGER_CST)
+ {
+ merged_store->merge_overlapping (info);
+ continue;
+ }
}
-
- /* |---store 1---| <gap> |---store 2---|.
- Gap between stores. Start a new group. */
- if (start != merged_store->start + merged_store->width)
+ /* |---store 1---||---store 2---|
+ This store is consecutive to the previous one.
+ Merge it into the current store group. There can be gaps in between
+ the stores, but there can't be gaps in between bitregions. */
+ else if (info->bitregion_start <= merged_store->bitregion_end
+ && info->rhs_code == merged_store->stores[0]->rhs_code)
{
- /* Try to apply all the stores recorded for the group to determine
- the bitpattern they write and discard it if that fails.
- This will also reject single-store groups. */
- if (!merged_store->apply_stores ())
- delete merged_store;
- else
- m_merged_store_groups.safe_push (merged_store);
+ store_immediate_info *infof = merged_store->stores[0];
+
+	  /* All the rhs_code values that take 2 operands are commutative;
+	     swap the operands if that could make them compatible with the
+	     operands already in the group.  */
+ if (infof->ops[0].base_addr
+ && infof->ops[1].base_addr
+ && info->ops[0].base_addr
+ && info->ops[1].base_addr
+ && (info->ops[1].bitpos - infof->ops[0].bitpos
+ == info->bitpos - infof->bitpos)
+ && operand_equal_p (info->ops[1].base_addr,
+ infof->ops[0].base_addr, 0))
+ std::swap (info->ops[0], info->ops[1]);
+ if ((!infof->ops[0].base_addr
+ || compatible_load_p (merged_store, info, base_addr, 0))
+ && (!infof->ops[1].base_addr
+ || compatible_load_p (merged_store, info, base_addr, 1)))
+ {
+ merged_store->merge_into (info);
+ continue;
+ }
+ }
- merged_store = new merged_store_group (info);
+ /* |---store 1---| <gap> |---store 2---|.
+ Gap between stores or the rhs not compatible. Start a new group. */
- continue;
- }
+ /* Try to apply all the stores recorded for the group to determine
+ the bitpattern they write and discard it if that fails.
+ This will also reject single-store groups. */
+ if (!merged_store->apply_stores ())
+ delete merged_store;
+ else
+ m_merged_store_groups.safe_push (merged_store);
- /* |---store 1---||---store 2---|
- This store is consecutive to the previous one.
- Merge it into the current store group. */
- merged_store->merge_into (info);
+ merged_store = new merged_store_group (info);
}
- /* Record or discard the last store group. */
- if (!merged_store->apply_stores ())
- delete merged_store;
- else
- m_merged_store_groups.safe_push (merged_store);
+ /* Record or discard the last store group. */
+ if (!merged_store->apply_stores ())
+ delete merged_store;
+ else
+ m_merged_store_groups.safe_push (merged_store);
gcc_assert (m_merged_store_groups.length () <= m_store_info.length ());
bool success
@@ -964,41 +1244,63 @@ imm_store_chain_info::coalesce_immediate_stores ()
if (success && dump_file)
fprintf (dump_file, "Coalescing successful!\n"
- "Merged into %u stores\n",
- m_merged_store_groups.length ());
+ "Merged into %u stores\n",
+ m_merged_store_groups.length ());
return success;
}
-/* Return the type to use for the merged stores described by STMTS.
- This is needed to get the alias sets right. */
+/* Return the type to use for the merged stores or loads described by STMTS.
+   This is needed to get the alias sets right.  If IS_LOAD, look at each
+   statement's rhs, otherwise at its lhs.  Additionally set *CLIQUEP and
+   *BASEP to MR_DEPENDENCE_*
+ of the MEM_REFs if any. */
static tree
-get_alias_type_for_stmts (auto_vec<gimple *> &stmts)
+get_alias_type_for_stmts (vec<gimple *> &stmts, bool is_load,
+ unsigned short *cliquep, unsigned short *basep)
{
gimple *stmt;
unsigned int i;
- tree lhs = gimple_assign_lhs (stmts[0]);
- tree type = reference_alias_ptr_type (lhs);
+ tree type = NULL_TREE;
+ tree ret = NULL_TREE;
+ *cliquep = 0;
+ *basep = 0;
FOR_EACH_VEC_ELT (stmts, i, stmt)
{
- if (i == 0)
- continue;
+ tree ref = is_load ? gimple_assign_rhs1 (stmt)
+ : gimple_assign_lhs (stmt);
+ tree type1 = reference_alias_ptr_type (ref);
+ tree base = get_base_address (ref);
- lhs = gimple_assign_lhs (stmt);
- tree type1 = reference_alias_ptr_type (lhs);
+ if (i == 0)
+ {
+ if (TREE_CODE (base) == MEM_REF)
+ {
+ *cliquep = MR_DEPENDENCE_CLIQUE (base);
+ *basep = MR_DEPENDENCE_BASE (base);
+ }
+ ret = type = type1;
+ continue;
+ }
if (!alias_ptr_types_compatible_p (type, type1))
- return ptr_type_node;
+ ret = ptr_type_node;
+ if (TREE_CODE (base) != MEM_REF
+ || *cliquep != MR_DEPENDENCE_CLIQUE (base)
+ || *basep != MR_DEPENDENCE_BASE (base))
+ {
+ *cliquep = 0;
+ *basep = 0;
+ }
}
- return type;
+ return ret;
}
/* Return the location_t information we can find among the statements
in STMTS. */
static location_t
-get_location_for_stmts (auto_vec<gimple *> &stmts)
+get_location_for_stmts (vec<gimple *> &stmts)
{
gimple *stmt;
unsigned int i;
@@ -1018,7 +1320,9 @@ struct split_store
unsigned HOST_WIDE_INT bytepos;
unsigned HOST_WIDE_INT size;
unsigned HOST_WIDE_INT align;
- auto_vec<gimple *> orig_stmts;
+ auto_vec<store_immediate_info *> orig_stores;
+ /* True if there is a single orig stmt covering the whole split store. */
+ bool orig;
split_store (unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT);
};
@@ -1028,100 +1332,230 @@ struct split_store
split_store::split_store (unsigned HOST_WIDE_INT bp,
unsigned HOST_WIDE_INT sz,
unsigned HOST_WIDE_INT al)
- : bytepos (bp), size (sz), align (al)
+ : bytepos (bp), size (sz), align (al), orig (false)
{
- orig_stmts.create (0);
+ orig_stores.create (0);
}
-/* Record all statements corresponding to stores in GROUP that write to
- the region starting at BITPOS and is of size BITSIZE. Record such
- statements in STMTS. The stores in GROUP must be sorted by
- bitposition. */
+/* Record all stores in GROUP that write to the region starting at BITPOS
+   and of size BITSIZE.  Record infos for such statements in STORES if
+   non-NULL.  The stores in GROUP must be sorted by bitposition.  Return
+   the info of the single store if there is exactly one original store in
+   the range, otherwise return NULL.  */
-static void
-find_constituent_stmts (struct merged_store_group *group,
- auto_vec<gimple *> &stmts,
+static store_immediate_info *
+find_constituent_stores (struct merged_store_group *group,
+ vec<store_immediate_info *> *stores,
+ unsigned int *first,
unsigned HOST_WIDE_INT bitpos,
unsigned HOST_WIDE_INT bitsize)
{
- struct store_immediate_info *info;
+ store_immediate_info *info, *ret = NULL;
unsigned int i;
+ bool second = false;
+ bool update_first = true;
unsigned HOST_WIDE_INT end = bitpos + bitsize;
- FOR_EACH_VEC_ELT (group->stores, i, info)
+ for (i = *first; group->stores.iterate (i, &info); ++i)
{
unsigned HOST_WIDE_INT stmt_start = info->bitpos;
unsigned HOST_WIDE_INT stmt_end = stmt_start + info->bitsize;
- if (stmt_end < bitpos)
- continue;
+ if (stmt_end <= bitpos)
+ {
+	  /* BITPOS passed to this function never decreases within the
+	     same split_group call, so optimize and don't scan info records
+	     which are known to end before or at BITPOS next time.
+	     Only do it if all stores before this one also pass the check.  */
+ if (update_first)
+ *first = i + 1;
+ continue;
+ }
+ else
+ update_first = false;
+
/* The stores in GROUP are ordered by bitposition so if we're past
- the region for this group return early. */
- if (stmt_start > end)
- return;
-
- if (IN_RANGE (stmt_start, bitpos, bitpos + bitsize)
- || IN_RANGE (stmt_end, bitpos, end)
- /* The statement writes a region that completely encloses the region
- that this group writes. Unlikely to occur but let's
- handle it. */
- || IN_RANGE (bitpos, stmt_start, stmt_end))
- stmts.safe_push (info->stmt);
+ the region for this group return early. */
+ if (stmt_start >= end)
+ return ret;
+
+ if (stores)
+ {
+ stores->safe_push (info);
+ if (ret)
+ {
+ ret = NULL;
+ second = true;
+ }
+ }
+ else if (ret)
+ return NULL;
+ if (!second)
+ ret = info;
}
+ return ret;
}
/* Split a merged store described by GROUP by populating the SPLIT_STORES
- vector with split_store structs describing the byte offset (from the base),
- the bit size and alignment of each store as well as the original statements
- involved in each such split group.
+ vector (if non-NULL) with split_store structs describing the byte offset
+ (from the base), the bit size and alignment of each store as well as the
+ original statements involved in each such split group.
This is to separate the splitting strategy from the statement
building/emission/linking done in output_merged_store.
- At the moment just start with the widest possible size and keep emitting
- the widest we can until we have emitted all the bytes, halving the size
- when appropriate. */
-
-static bool
-split_group (merged_store_group *group,
- auto_vec<struct split_store *> &split_stores)
+ Return number of new stores.
+ If ALLOW_UNALIGNED_STORE is false, then all stores must be aligned.
+ If ALLOW_UNALIGNED_LOAD is false, then all loads must be aligned.
+ If SPLIT_STORES is NULL, it is just a dry run to count number of
+ new stores. */
+
+static unsigned int
+split_group (merged_store_group *group, bool allow_unaligned_store,
+ bool allow_unaligned_load,
+ vec<struct split_store *> *split_stores)
{
- unsigned HOST_WIDE_INT pos = group->start;
- unsigned HOST_WIDE_INT size = group->width;
+ unsigned HOST_WIDE_INT pos = group->bitregion_start;
+ unsigned HOST_WIDE_INT size = group->bitregion_end - pos;
unsigned HOST_WIDE_INT bytepos = pos / BITS_PER_UNIT;
- unsigned HOST_WIDE_INT align = group->align;
+ unsigned HOST_WIDE_INT group_align = group->align;
+ unsigned HOST_WIDE_INT align_base = group->align_base;
+ unsigned HOST_WIDE_INT group_load_align = group_align;
- /* We don't handle partial bitfields for now. We shouldn't have
- reached this far. */
gcc_assert ((size % BITS_PER_UNIT == 0) && (pos % BITS_PER_UNIT == 0));
- bool allow_unaligned
- = !STRICT_ALIGNMENT && PARAM_VALUE (PARAM_STORE_MERGING_ALLOW_UNALIGNED);
-
- unsigned int try_size = MAX_STORE_BITSIZE;
- while (try_size > size
- || (!allow_unaligned
- && try_size > align))
- {
- try_size /= 2;
- if (try_size < BITS_PER_UNIT)
- return false;
- }
-
+ unsigned int ret = 0, first = 0;
unsigned HOST_WIDE_INT try_pos = bytepos;
group->stores.qsort (sort_by_bitpos);
+ if (!allow_unaligned_load)
+ for (int i = 0; i < 2; ++i)
+ if (group->load_align[i])
+ group_load_align = MIN (group_load_align, group->load_align[i]);
+
while (size > 0)
{
- struct split_store *store = new split_store (try_pos, try_size, align);
+ if ((allow_unaligned_store || group_align <= BITS_PER_UNIT)
+ && group->mask[try_pos - bytepos] == (unsigned char) ~0U)
+ {
+ /* Skip padding bytes. */
+ ++try_pos;
+ size -= BITS_PER_UNIT;
+ continue;
+ }
+
unsigned HOST_WIDE_INT try_bitpos = try_pos * BITS_PER_UNIT;
- find_constituent_stmts (group, store->orig_stmts, try_bitpos, try_size);
- split_stores.safe_push (store);
+ unsigned int try_size = MAX_STORE_BITSIZE, nonmasked;
+ unsigned HOST_WIDE_INT align_bitpos
+ = (try_bitpos - align_base) & (group_align - 1);
+ unsigned HOST_WIDE_INT align = group_align;
+ if (align_bitpos)
+ align = least_bit_hwi (align_bitpos);
+ if (!allow_unaligned_store)
+ try_size = MIN (try_size, align);
+ if (!allow_unaligned_load)
+ {
+ /* If we can't do or don't want to do unaligned stores
+ as well as loads, we need to take the loads into account
+ as well. */
+ unsigned HOST_WIDE_INT load_align = group_load_align;
+ align_bitpos = (try_bitpos - align_base) & (load_align - 1);
+ if (align_bitpos)
+ load_align = least_bit_hwi (align_bitpos);
+ for (int i = 0; i < 2; ++i)
+ if (group->load_align[i])
+ {
+ align_bitpos = try_bitpos - group->stores[0]->bitpos;
+ align_bitpos += group->stores[0]->ops[i].bitpos;
+ align_bitpos -= group->load_align_base[i];
+ align_bitpos &= (group_load_align - 1);
+ if (align_bitpos)
+ {
+ unsigned HOST_WIDE_INT a = least_bit_hwi (align_bitpos);
+ load_align = MIN (load_align, a);
+ }
+ }
+ try_size = MIN (try_size, load_align);
+ }
+ store_immediate_info *info
+ = find_constituent_stores (group, NULL, &first, try_bitpos, try_size);
+ if (info)
+ {
+ /* If there is just one original statement for the range, see if
+ we can just reuse the original store which could be even larger
+ than try_size. */
+ unsigned HOST_WIDE_INT stmt_end
+ = ROUND_UP (info->bitpos + info->bitsize, BITS_PER_UNIT);
+ info = find_constituent_stores (group, NULL, &first, try_bitpos,
+ stmt_end - try_bitpos);
+ if (info && info->bitpos >= try_bitpos)
+ {
+ try_size = stmt_end - try_bitpos;
+ goto found;
+ }
+ }
- try_pos += try_size / BITS_PER_UNIT;
+ /* Approximate store bitsize for the case when there are no padding
+ bits. */
+ while (try_size > size)
+ try_size /= 2;
+ /* Now look for whole padding bytes at the end of that bitsize. */
+ for (nonmasked = try_size / BITS_PER_UNIT; nonmasked > 0; --nonmasked)
+ if (group->mask[try_pos - bytepos + nonmasked - 1]
+ != (unsigned char) ~0U)
+ break;
+ if (nonmasked == 0)
+ {
+ /* If entire try_size range is padding, skip it. */
+ try_pos += try_size / BITS_PER_UNIT;
+ size -= try_size;
+ continue;
+ }
+ /* Otherwise try to decrease try_size if second half, last 3 quarters
+ etc. are padding. */
+ nonmasked *= BITS_PER_UNIT;
+ while (nonmasked <= try_size / 2)
+ try_size /= 2;
+ if (!allow_unaligned_store && group_align > BITS_PER_UNIT)
+ {
+ /* Now look for whole padding bytes at the start of that bitsize. */
+ unsigned int try_bytesize = try_size / BITS_PER_UNIT, masked;
+ for (masked = 0; masked < try_bytesize; ++masked)
+ if (group->mask[try_pos - bytepos + masked] != (unsigned char) ~0U)
+ break;
+ masked *= BITS_PER_UNIT;
+ gcc_assert (masked < try_size);
+ if (masked >= try_size / 2)
+ {
+ while (masked >= try_size / 2)
+ {
+ try_size /= 2;
+ try_pos += try_size / BITS_PER_UNIT;
+ size -= try_size;
+ masked -= try_size;
+ }
+ /* Need to recompute the alignment, so just retry at the new
+ position. */
+ continue;
+ }
+ }
+
+ found:
+ ++ret;
+
+ if (split_stores)
+ {
+ struct split_store *store
+ = new split_store (try_pos, try_size, align);
+ info = find_constituent_stores (group, &store->orig_stores,
+ &first, try_bitpos, try_size);
+ if (info
+ && info->bitpos >= try_bitpos
+ && info->bitpos + info->bitsize <= try_bitpos + try_size)
+ store->orig = true;
+ split_stores->safe_push (store);
+ }
+ try_pos += try_size / BITS_PER_UNIT;
size -= try_size;
- align = try_size;
- while (size < try_size)
- try_size /= 2;
}
- return true;
+
+ return ret;
}
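
A minimal standalone sketch of the splitting loop above, byte-granular and ignoring alignment and the reuse-the-original-store shortcut (all names are made up, none of this is GCC code): skip bytes that are pure padding, then emit the largest power-of-two-sized store that fits and whose tail is not mostly padding.

#include <stdio.h>

/* MASK uses 0xff for "never written" bytes and 0 for written bytes.  */
static unsigned
split_region (const unsigned char *mask, unsigned nbytes, unsigned max_bytes)
{
  unsigned pos = 0, nstores = 0;
  while (pos < nbytes)
    {
      if (mask[pos] == 0xff)            /* whole byte is padding: skip it.  */
        {
          pos++;
          continue;
        }
      unsigned try_bytes = max_bytes;
      while (try_bytes > nbytes - pos)  /* don't run past the region.  */
        try_bytes /= 2;
      unsigned nonmasked = try_bytes;
      while (mask[pos + nonmasked - 1] == 0xff)
        nonmasked--;                    /* strip whole padding bytes at the end
                                           (stops because mask[pos] != 0xff).  */
      while (nonmasked <= try_bytes / 2)
        try_bytes /= 2;                 /* shrink if half or more is padding.  */
      printf ("store of %u byte(s) at offset %u\n", try_bytes, pos);
      nstores++;
      pos += try_bytes;
    }
  return nstores;
}

int
main (void)
{
  /* First byte never written: skipped; the remaining seven bytes are
     covered by 4-, 2- and 1-byte stores.  */
  unsigned char mask[8] = { 0xff, 0, 0, 0, 0, 0, 0, 0 };
  printf ("%u stores\n", split_region (mask, 8, 8));
  return 0;
}
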
/* Given a merged store group GROUP output the widened version of it.
@@ -1135,81 +1569,289 @@ split_group (merged_store_group *group,
bool
imm_store_chain_info::output_merged_store (merged_store_group *group)
{
- unsigned HOST_WIDE_INT start_byte_pos = group->start / BITS_PER_UNIT;
+ unsigned HOST_WIDE_INT start_byte_pos
+ = group->bitregion_start / BITS_PER_UNIT;
unsigned int orig_num_stmts = group->stores.length ();
if (orig_num_stmts < 2)
return false;
- auto_vec<struct split_store *> split_stores;
+ auto_vec<struct split_store *, 32> split_stores;
split_stores.create (0);
- if (!split_group (group, split_stores))
- return false;
+ bool allow_unaligned_store
+ = !STRICT_ALIGNMENT && PARAM_VALUE (PARAM_STORE_MERGING_ALLOW_UNALIGNED);
+ bool allow_unaligned_load = allow_unaligned_store;
+ if (allow_unaligned_store)
+ {
+      /* If unaligned stores are allowed, see how many stores we'd emit
+	 with unaligned stores and how many with aligned stores only.
+	 Use unaligned stores only if that results in fewer stores.  */
+ unsigned aligned_cnt
+ = split_group (group, false, allow_unaligned_load, NULL);
+ unsigned unaligned_cnt
+ = split_group (group, true, allow_unaligned_load, NULL);
+ if (aligned_cnt <= unaligned_cnt)
+ allow_unaligned_store = false;
+ }
+ split_group (group, allow_unaligned_store, allow_unaligned_load,
+ &split_stores);
+
+ if (split_stores.length () >= orig_num_stmts)
+ {
+ /* We didn't manage to reduce the number of statements. Bail out. */
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Exceeded original number of stmts (%u)."
+ " Not profitable to emit new sequence.\n",
+ orig_num_stmts);
+ }
+ return false;
+ }
gimple_stmt_iterator last_gsi = gsi_for_stmt (group->last_stmt);
gimple_seq seq = NULL;
- unsigned int num_stmts = 0;
tree last_vdef, new_vuse;
last_vdef = gimple_vdef (group->last_stmt);
new_vuse = gimple_vuse (group->last_stmt);
gimple *stmt = NULL;
- /* The new SSA names created. Keep track of them so that we can free them
- if we decide to not use the new sequence. */
- auto_vec<tree> new_ssa_names;
split_store *split_store;
unsigned int i;
- bool fail = false;
-
+ auto_vec<gimple *, 32> orig_stmts;
tree addr = force_gimple_operand_1 (unshare_expr (base_addr), &seq,
is_gimple_mem_ref_addr, NULL_TREE);
+
+ tree load_addr[2] = { NULL_TREE, NULL_TREE };
+ gimple_seq load_seq[2] = { NULL, NULL };
+ gimple_stmt_iterator load_gsi[2] = { gsi_none (), gsi_none () };
+ for (int j = 0; j < 2; ++j)
+ {
+ store_operand_info &op = group->stores[0]->ops[j];
+ if (op.base_addr == NULL_TREE)
+ continue;
+
+ store_immediate_info *infol = group->stores.last ();
+ if (gimple_vuse (op.stmt) == gimple_vuse (infol->ops[j].stmt))
+ {
+ load_gsi[j] = gsi_for_stmt (op.stmt);
+ load_addr[j]
+ = force_gimple_operand_1 (unshare_expr (op.base_addr),
+ &load_seq[j], is_gimple_mem_ref_addr,
+ NULL_TREE);
+ }
+ else if (operand_equal_p (base_addr, op.base_addr, 0))
+ load_addr[j] = addr;
+ else
+ load_addr[j]
+ = force_gimple_operand_1 (unshare_expr (op.base_addr),
+ &seq, is_gimple_mem_ref_addr,
+ NULL_TREE);
+ }
+
FOR_EACH_VEC_ELT (split_stores, i, split_store)
{
unsigned HOST_WIDE_INT try_size = split_store->size;
unsigned HOST_WIDE_INT try_pos = split_store->bytepos;
unsigned HOST_WIDE_INT align = split_store->align;
- tree offset_type = get_alias_type_for_stmts (split_store->orig_stmts);
- location_t loc = get_location_for_stmts (split_store->orig_stmts);
-
- tree int_type = build_nonstandard_integer_type (try_size, UNSIGNED);
- int_type = build_aligned_type (int_type, align);
- tree dest = fold_build2 (MEM_REF, int_type, addr,
- build_int_cst (offset_type, try_pos));
+ tree dest, src;
+ location_t loc;
+ if (split_store->orig)
+ {
+ /* If there is just a single constituent store which covers
+ the whole area, just reuse the lhs and rhs. */
+ gimple *orig_stmt = split_store->orig_stores[0]->stmt;
+ dest = gimple_assign_lhs (orig_stmt);
+ src = gimple_assign_rhs1 (orig_stmt);
+ loc = gimple_location (orig_stmt);
+ }
+ else
+ {
+ store_immediate_info *info;
+ unsigned short clique, base;
+ unsigned int k;
+ FOR_EACH_VEC_ELT (split_store->orig_stores, k, info)
+ orig_stmts.safe_push (info->stmt);
+ tree offset_type
+ = get_alias_type_for_stmts (orig_stmts, false, &clique, &base);
+ loc = get_location_for_stmts (orig_stmts);
+ orig_stmts.truncate (0);
+
+ tree int_type = build_nonstandard_integer_type (try_size, UNSIGNED);
+ int_type = build_aligned_type (int_type, align);
+ dest = fold_build2 (MEM_REF, int_type, addr,
+ build_int_cst (offset_type, try_pos));
+ if (TREE_CODE (dest) == MEM_REF)
+ {
+ MR_DEPENDENCE_CLIQUE (dest) = clique;
+ MR_DEPENDENCE_BASE (dest) = base;
+ }
- tree src = native_interpret_expr (int_type,
- group->val + try_pos - start_byte_pos,
- group->buf_size);
+ tree mask
+ = native_interpret_expr (int_type,
+ group->mask + try_pos - start_byte_pos,
+ group->buf_size);
- stmt = gimple_build_assign (dest, src);
- gimple_set_location (stmt, loc);
- gimple_set_vuse (stmt, new_vuse);
- gimple_seq_add_stmt_without_update (&seq, stmt);
+ tree ops[2];
+ for (int j = 0;
+ j < 1 + (split_store->orig_stores[0]->ops[1].val != NULL_TREE);
+ ++j)
+ {
+ store_operand_info &op = split_store->orig_stores[0]->ops[j];
+ if (op.base_addr)
+ {
+ FOR_EACH_VEC_ELT (split_store->orig_stores, k, info)
+ orig_stmts.safe_push (info->ops[j].stmt);
+
+ offset_type = get_alias_type_for_stmts (orig_stmts, true,
+ &clique, &base);
+ location_t load_loc = get_location_for_stmts (orig_stmts);
+ orig_stmts.truncate (0);
+
+ unsigned HOST_WIDE_INT load_align = group->load_align[j];
+ unsigned HOST_WIDE_INT align_bitpos
+ = (try_pos * BITS_PER_UNIT
+ - split_store->orig_stores[0]->bitpos
+ + op.bitpos) & (load_align - 1);
+ if (align_bitpos)
+ load_align = least_bit_hwi (align_bitpos);
+
+ tree load_int_type
+ = build_nonstandard_integer_type (try_size, UNSIGNED);
+ load_int_type
+ = build_aligned_type (load_int_type, load_align);
+
+ unsigned HOST_WIDE_INT load_pos
+ = (try_pos * BITS_PER_UNIT
+ - split_store->orig_stores[0]->bitpos
+ + op.bitpos) / BITS_PER_UNIT;
+ ops[j] = fold_build2 (MEM_REF, load_int_type, load_addr[j],
+ build_int_cst (offset_type, load_pos));
+ if (TREE_CODE (ops[j]) == MEM_REF)
+ {
+ MR_DEPENDENCE_CLIQUE (ops[j]) = clique;
+ MR_DEPENDENCE_BASE (ops[j]) = base;
+ }
+ if (!integer_zerop (mask))
+		  /* The load might load some bits (that will be masked off
+		     later on) uninitialized; avoid -W*uninitialized
+		     warnings in that case.  */
+ TREE_NO_WARNING (ops[j]) = 1;
+
+ stmt = gimple_build_assign (make_ssa_name (int_type),
+ ops[j]);
+ gimple_set_location (stmt, load_loc);
+ if (gsi_bb (load_gsi[j]))
+ {
+ gimple_set_vuse (stmt, gimple_vuse (op.stmt));
+ gimple_seq_add_stmt_without_update (&load_seq[j], stmt);
+ }
+ else
+ {
+ gimple_set_vuse (stmt, new_vuse);
+ gimple_seq_add_stmt_without_update (&seq, stmt);
+ }
+ ops[j] = gimple_assign_lhs (stmt);
+ }
+ else
+ ops[j] = native_interpret_expr (int_type,
+ group->val + try_pos
+ - start_byte_pos,
+ group->buf_size);
+ }
- /* We didn't manage to reduce the number of statements. Bail out. */
- if (++num_stmts == orig_num_stmts)
- {
- if (dump_file && (dump_flags & TDF_DETAILS))
+ switch (split_store->orig_stores[0]->rhs_code)
{
- fprintf (dump_file, "Exceeded original number of stmts (%u)."
- " Not profitable to emit new sequence.\n",
- orig_num_stmts);
+ case BIT_AND_EXPR:
+ case BIT_IOR_EXPR:
+ case BIT_XOR_EXPR:
+ FOR_EACH_VEC_ELT (split_store->orig_stores, k, info)
+ {
+ tree rhs1 = gimple_assign_rhs1 (info->stmt);
+ orig_stmts.safe_push (SSA_NAME_DEF_STMT (rhs1));
+ }
+ location_t bit_loc;
+ bit_loc = get_location_for_stmts (orig_stmts);
+ orig_stmts.truncate (0);
+
+ stmt
+ = gimple_build_assign (make_ssa_name (int_type),
+ split_store->orig_stores[0]->rhs_code,
+ ops[0], ops[1]);
+ gimple_set_location (stmt, bit_loc);
+ /* If there is just one load and there is a separate
+ load_seq[0], emit the bitwise op right after it. */
+ if (load_addr[1] == NULL_TREE && gsi_bb (load_gsi[0]))
+ gimple_seq_add_stmt_without_update (&load_seq[0], stmt);
+	      /* Otherwise, if at least one load is in seq, we need to
+		 emit the bitwise op right before the store.  If there
+		 are two loads and they are emitted somewhere else, it
+		 would be better to emit the bitwise op as early as
+		 possible; we don't track where that would be possible
+		 right now though.  */
+ else
+ gimple_seq_add_stmt_without_update (&seq, stmt);
+ src = gimple_assign_lhs (stmt);
+ break;
+ default:
+ src = ops[0];
+ break;
}
- unsigned int ssa_count;
- tree ssa_name;
- /* Don't forget to cleanup the temporary SSA names. */
- FOR_EACH_VEC_ELT (new_ssa_names, ssa_count, ssa_name)
- release_ssa_name (ssa_name);
- fail = true;
- break;
+ if (!integer_zerop (mask))
+ {
+ tree tem = make_ssa_name (int_type);
+ tree load_src = unshare_expr (dest);
+	  /* The load might load some or all bits uninitialized;
+	     avoid -W*uninitialized warnings in that case.
+	     As an optimization, if all the bits were provably
+	     uninitialized (no stores at all yet, or the previous store
+	     a CLOBBER), we could optimize away the load and replace
+	     it e.g. with 0.  */
+ TREE_NO_WARNING (load_src) = 1;
+ stmt = gimple_build_assign (tem, load_src);
+ gimple_set_location (stmt, loc);
+ gimple_set_vuse (stmt, new_vuse);
+ gimple_seq_add_stmt_without_update (&seq, stmt);
+
+ /* FIXME: If there is a single chunk of zero bits in mask,
+ perhaps use BIT_INSERT_EXPR instead? */
+ stmt = gimple_build_assign (make_ssa_name (int_type),
+ BIT_AND_EXPR, tem, mask);
+ gimple_set_location (stmt, loc);
+ gimple_seq_add_stmt_without_update (&seq, stmt);
+ tem = gimple_assign_lhs (stmt);
+
+ if (TREE_CODE (src) == INTEGER_CST)
+ src = wide_int_to_tree (int_type,
+ wi::bit_and_not (wi::to_wide (src),
+ wi::to_wide (mask)));
+ else
+ {
+ tree nmask
+ = wide_int_to_tree (int_type,
+ wi::bit_not (wi::to_wide (mask)));
+ stmt = gimple_build_assign (make_ssa_name (int_type),
+ BIT_AND_EXPR, src, nmask);
+ gimple_set_location (stmt, loc);
+ gimple_seq_add_stmt_without_update (&seq, stmt);
+ src = gimple_assign_lhs (stmt);
+ }
+ stmt = gimple_build_assign (make_ssa_name (int_type),
+ BIT_IOR_EXPR, tem, src);
+ gimple_set_location (stmt, loc);
+ gimple_seq_add_stmt_without_update (&seq, stmt);
+ src = gimple_assign_lhs (stmt);
+ }
}
+ stmt = gimple_build_assign (dest, src);
+ gimple_set_location (stmt, loc);
+ gimple_set_vuse (stmt, new_vuse);
+ gimple_seq_add_stmt_without_update (&seq, stmt);
+
tree new_vdef;
if (i < split_stores.length () - 1)
- {
- new_vdef = make_ssa_name (gimple_vop (cfun), stmt);
- new_ssa_names.safe_push (new_vdef);
- }
+ new_vdef = make_ssa_name (gimple_vop (cfun), stmt);
else
new_vdef = last_vdef;
@@ -1221,19 +1863,19 @@ imm_store_chain_info::output_merged_store (merged_store_group *group)
FOR_EACH_VEC_ELT (split_stores, i, split_store)
delete split_store;
- if (fail)
- return false;
-
gcc_assert (seq);
if (dump_file)
{
fprintf (dump_file,
"New sequence of %u stmts to replace old one of %u stmts\n",
- num_stmts, orig_num_stmts);
+ split_stores.length (), orig_num_stmts);
if (dump_flags & TDF_DETAILS)
print_gimple_seq (dump_file, seq, 0, TDF_VOPS | TDF_MEMSYMS);
}
gsi_insert_seq_after (&last_gsi, seq, GSI_SAME_STMT);
+ for (int j = 0; j < 2; ++j)
+ if (load_seq[j])
+ gsi_insert_seq_after (&load_gsi[j], load_seq[j], GSI_SAME_STMT);
return true;
}
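
When the mask for a split store is not all-zero (some bits of the region were never written by the recorded stores), the sequence built above is a masked read-modify-write.  Expressed as plain C over a hypothetical 32-bit chunk it behaves like the sketch below; the real code builds the equivalent GIMPLE BIT_AND_EXPR/BIT_IOR_EXPR assignments and pre-masks constant sources with wi::bit_and_not.

#include <stdint.h>

/* MASK has bits set exactly where the old contents must be preserved.  */
static inline void
masked_store32 (uint32_t *dest, uint32_t src, uint32_t mask)
{
  uint32_t tem = *dest;   /* load of the destination (TREE_NO_WARNING set)  */
  tem &= mask;            /* keep only the bits nobody stored to            */
  src &= ~mask;           /* drop the bits that must be preserved           */
  *dest = tem | src;      /* combine and emit the single wide store         */
}
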
@@ -1333,10 +1975,299 @@ rhs_valid_for_store_merging_p (tree rhs)
&& native_encode_expr (rhs, NULL, size) != 0);
}
+/* If MEM is a memory reference usable for store merging (either as
+ store destination or for loads), return the non-NULL base_addr
+ and set *PBITSIZE, *PBITPOS, *PBITREGION_START and *PBITREGION_END.
+   Otherwise return NULL; *PBITSIZE should still be valid even in that
+   case.  */
+
+static tree
+mem_valid_for_store_merging (tree mem, unsigned HOST_WIDE_INT *pbitsize,
+ unsigned HOST_WIDE_INT *pbitpos,
+ unsigned HOST_WIDE_INT *pbitregion_start,
+ unsigned HOST_WIDE_INT *pbitregion_end)
+{
+ poly_int64 var_bitsize, var_bitpos;
+ poly_uint64 var_bitregion_start = 0, var_bitregion_end = 0;
+ machine_mode mode;
+ int unsignedp = 0, reversep = 0, volatilep = 0;
+ tree offset;
+ tree base_addr = get_inner_reference (mem, &var_bitsize, &var_bitpos,
+ &offset, &mode, &unsignedp, &reversep,
+ &volatilep);
+ if (must_eq (var_bitsize, 0))
+ {
+ *pbitsize = 0;
+ return NULL_TREE;
+ }
+
+ *pbitsize = -1;
+ if (TREE_CODE (mem) == COMPONENT_REF
+ && DECL_BIT_FIELD_TYPE (TREE_OPERAND (mem, 1)))
+ {
+ get_bit_range (&var_bitregion_start, &var_bitregion_end, mem,
+ &var_bitpos, &offset);
+ if (may_ne (var_bitregion_end, 0U))
+ var_bitregion_end += 1;
+ }
+
+ if (reversep)
+ return NULL_TREE;
+
+ /* We do not want to rewrite TARGET_MEM_REFs. */
+ if (TREE_CODE (base_addr) == TARGET_MEM_REF)
+ return NULL_TREE;
+ /* In some cases get_inner_reference may return a
+ MEM_REF [ptr + byteoffset]. For the purposes of this pass
+ canonicalize the base_addr to MEM_REF [ptr] and take
+ byteoffset into account in the bitpos. This occurs in
+ PR 23684 and this way we can catch more chains. */
+ else if (TREE_CODE (base_addr) == MEM_REF)
+ {
+ poly_offset_int byte_off = mem_ref_offset (base_addr);
+ poly_offset_int bit_off = byte_off << LOG2_BITS_PER_UNIT;
+ bit_off += var_bitpos;
+ if (bit_off.to_shwi (&var_bitpos))
+ {
+ if (may_ne (var_bitregion_end, 0U))
+ {
+ bit_off = byte_off << LOG2_BITS_PER_UNIT;
+ bit_off += var_bitregion_start;
+ if (bit_off.to_uhwi (&var_bitregion_start))
+ {
+ bit_off = byte_off << LOG2_BITS_PER_UNIT;
+ bit_off += var_bitregion_end;
+ if (!bit_off.to_uhwi (&var_bitregion_end))
+ var_bitregion_end = 0;
+ }
+ else
+ var_bitregion_end = 0;
+ }
+ }
+ else
+ return NULL_TREE;
+ base_addr = TREE_OPERAND (base_addr, 0);
+ }
+ /* get_inner_reference returns the base object, get at its
+ address now. */
+ else
+ {
+ if (may_lt (var_bitpos, 0))
+ return NULL_TREE;
+ base_addr = build_fold_addr_expr (base_addr);
+ }
+
+ HOST_WIDE_INT bitsize, bitpos;
+ if (!var_bitsize.is_constant (&bitsize)
+ || !var_bitpos.is_constant (&bitpos))
+ return NULL_TREE;
+
+ unsigned HOST_WIDE_INT bitregion_start, bitregion_end;
+ if (!var_bitregion_start.is_constant (&bitregion_start)
+ || !var_bitregion_end.is_constant (&bitregion_end))
+ return NULL_TREE;
+
+ if (!bitregion_end)
+ {
+ bitregion_start = ROUND_DOWN (bitpos, BITS_PER_UNIT);
+ bitregion_end = ROUND_UP (bitpos + bitsize, BITS_PER_UNIT);
+ }
+
+ if (offset != NULL_TREE)
+ {
+ /* If the access is variable offset then a base decl has to be
+ address-taken to be able to emit pointer-based stores to it.
+ ??? We might be able to get away with re-using the original
+ base up to the first variable part and then wrapping that inside
+ a BIT_FIELD_REF. */
+ tree base = get_base_address (base_addr);
+ if (! base
+ || (DECL_P (base) && ! TREE_ADDRESSABLE (base)))
+ return NULL_TREE;
+
+ base_addr = build2 (POINTER_PLUS_EXPR, TREE_TYPE (base_addr),
+ base_addr, offset);
+ }
+
+ *pbitsize = bitsize;
+ *pbitpos = bitpos;
+ *pbitregion_start = bitregion_start;
+ *pbitregion_end = bitregion_end;
+ return base_addr;
+}
+
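
One thing the bitregion handling above buys (a made-up example; the exact merged constant depends on the target's bit-field layout): adjacent bit-field stores can now be widened, as long as the merged access stays inside the representable region computed by get_bit_range and starts and ends on a byte boundary.

struct B { unsigned a : 4, b : 4, c : 8; unsigned other; };

/* The three stores cover 16 contiguous bits of one bit-field region, so
   they can typically be merged into a single half-word constant store;
   bytes outside the region (and the member "other") are never touched.  */
void
set_fields (struct B *q)
{
  q->a = 1;
  q->b = 2;
  q->c = 0x35;
}
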
+/* Return true if STMT is a load that can be used for store merging.
+ In that case fill in *OP. BITSIZE, BITPOS, BITREGION_START and
+ BITREGION_END are properties of the corresponding store. */
+
+static bool
+handled_load (gimple *stmt, store_operand_info *op,
+ unsigned HOST_WIDE_INT bitsize, unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end)
+{
+ if (!is_gimple_assign (stmt) || !gimple_vuse (stmt))
+ return false;
+ if (gimple_assign_load_p (stmt)
+ && !stmt_can_throw_internal (stmt)
+ && !gimple_has_volatile_ops (stmt))
+ {
+ tree mem = gimple_assign_rhs1 (stmt);
+ op->base_addr
+ = mem_valid_for_store_merging (mem, &op->bitsize, &op->bitpos,
+ &op->bitregion_start,
+ &op->bitregion_end);
+ if (op->base_addr != NULL_TREE
+ && op->bitsize == bitsize
+ && ((op->bitpos - bitpos) % BITS_PER_UNIT) == 0
+ && op->bitpos - op->bitregion_start >= bitpos - bitregion_start
+ && op->bitregion_end - op->bitpos >= bitregion_end - bitpos)
+ {
+ op->stmt = stmt;
+ op->val = mem;
+ return true;
+ }
+ }
+ return false;
+}
+
+/* Record the store STMT for store merging optimization if it can be
+ optimized. */
+
+void
+pass_store_merging::process_store (gimple *stmt)
+{
+ tree lhs = gimple_assign_lhs (stmt);
+ tree rhs = gimple_assign_rhs1 (stmt);
+ unsigned HOST_WIDE_INT bitsize, bitpos;
+ unsigned HOST_WIDE_INT bitregion_start;
+ unsigned HOST_WIDE_INT bitregion_end;
+ tree base_addr
+ = mem_valid_for_store_merging (lhs, &bitsize, &bitpos,
+ &bitregion_start, &bitregion_end);
+ if (bitsize == 0)
+ return;
+
+ bool invalid = (base_addr == NULL_TREE
+ || ((bitsize > MAX_BITSIZE_MODE_ANY_INT)
+ && (TREE_CODE (rhs) != INTEGER_CST)));
+ enum tree_code rhs_code = ERROR_MARK;
+ store_operand_info ops[2];
+ if (invalid)
+ ;
+ else if (rhs_valid_for_store_merging_p (rhs))
+ {
+ rhs_code = INTEGER_CST;
+ ops[0].val = rhs;
+ }
+ else if (TREE_CODE (rhs) != SSA_NAME || !has_single_use (rhs))
+ invalid = true;
+ else
+ {
+ gimple *def_stmt = SSA_NAME_DEF_STMT (rhs), *def_stmt1, *def_stmt2;
+ if (!is_gimple_assign (def_stmt))
+ invalid = true;
+ else if (handled_load (def_stmt, &ops[0], bitsize, bitpos,
+ bitregion_start, bitregion_end))
+ rhs_code = MEM_REF;
+ else
+ switch ((rhs_code = gimple_assign_rhs_code (def_stmt)))
+ {
+ case BIT_AND_EXPR:
+ case BIT_IOR_EXPR:
+ case BIT_XOR_EXPR:
+ tree rhs1, rhs2;
+ rhs1 = gimple_assign_rhs1 (def_stmt);
+ rhs2 = gimple_assign_rhs2 (def_stmt);
+ invalid = true;
+ if (TREE_CODE (rhs1) != SSA_NAME || !has_single_use (rhs1))
+ break;
+ def_stmt1 = SSA_NAME_DEF_STMT (rhs1);
+ if (!is_gimple_assign (def_stmt1)
+ || !handled_load (def_stmt1, &ops[0], bitsize, bitpos,
+ bitregion_start, bitregion_end))
+ break;
+ if (rhs_valid_for_store_merging_p (rhs2))
+ ops[1].val = rhs2;
+ else if (TREE_CODE (rhs2) != SSA_NAME || !has_single_use (rhs2))
+ break;
+ else
+ {
+ def_stmt2 = SSA_NAME_DEF_STMT (rhs2);
+ if (!is_gimple_assign (def_stmt2))
+ break;
+ else if (!handled_load (def_stmt2, &ops[1], bitsize, bitpos,
+ bitregion_start, bitregion_end))
+ break;
+ }
+ invalid = false;
+ break;
+ default:
+ invalid = true;
+ break;
+ }
+ }
+
+ struct imm_store_chain_info **chain_info = NULL;
+ if (base_addr)
+ chain_info = m_stores.get (base_addr);
+
+ if (invalid)
+ {
+ terminate_all_aliasing_chains (chain_info, stmt);
+ return;
+ }
+
+ store_immediate_info *info;
+ if (chain_info)
+ {
+ unsigned int ord = (*chain_info)->m_store_info.length ();
+ info = new store_immediate_info (bitsize, bitpos, bitregion_start,
+ bitregion_end, stmt, ord, rhs_code,
+ ops[0], ops[1]);
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Recording immediate store from stmt:\n");
+ print_gimple_stmt (dump_file, stmt, 0);
+ }
+ (*chain_info)->m_store_info.safe_push (info);
+ /* If we reach the limit of stores to merge in a chain terminate and
+ process the chain now. */
+ if ((*chain_info)->m_store_info.length ()
+ == (unsigned int) PARAM_VALUE (PARAM_MAX_STORES_TO_MERGE))
+ {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file,
+ "Reached maximum number of statements to merge:\n");
+ terminate_and_release_chain (*chain_info);
+ }
+ return;
+ }
+
+ /* Store aliases any existing chain? */
+ terminate_all_aliasing_chains (chain_info, stmt);
+ /* Start a new chain. */
+ struct imm_store_chain_info *new_chain
+ = new imm_store_chain_info (m_stores_head, base_addr);
+ info = new store_immediate_info (bitsize, bitpos, bitregion_start,
+ bitregion_end, stmt, 0, rhs_code,
+ ops[0], ops[1]);
+ new_chain->m_store_info.safe_push (info);
+ m_stores.put (base_addr, new_chain);
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Starting new chain with statement:\n");
+ print_gimple_stmt (dump_file, stmt, 0);
+ fprintf (dump_file, "The base object is:\n");
+ print_generic_expr (dump_file, base_addr);
+ fprintf (dump_file, "\n");
+ }
+}
+
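
In source terms, the rhs shapes process_store now records are roughly the following (hypothetical example; struct S and the pointers are made up).  Anything else marks the candidate invalid and only runs the aliasing-chain termination, and whether the recorded stores later merge still depends on matching rhs_code, offsets and the alias checks above.

struct S { unsigned char a, b, c, d; };

void
shapes (struct S *t, const struct S *s, const struct S *u)
{
  t->a = 5;             /* constant: rhs_code INTEGER_CST                  */
  t->b = s->b;          /* plain load: rhs_code MEM_REF                    */
  t->c = s->c & 0x0f;   /* load op constant (BIT_AND_EXPR)                 */
  t->d = s->d ^ u->d;   /* load op load (BIT_XOR_EXPR, both single-use)    */
}
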
/* Entry point for the pass. Go over each basic block recording chains of
- immediate stores. Upon encountering a terminating statement (as defined
- by stmt_terminates_chain_p) process the recorded stores and emit the widened
- variants. */
+ immediate stores. Upon encountering a terminating statement (as defined
+ by stmt_terminates_chain_p) process the recorded stores and emit the widened
+ variants. */
unsigned int
pass_store_merging::execute (function *fun)
@@ -1386,131 +2317,9 @@ pass_store_merging::execute (function *fun)
if (gimple_assign_single_p (stmt) && gimple_vdef (stmt)
&& !stmt_can_throw_internal (stmt)
&& lhs_valid_for_store_merging_p (gimple_assign_lhs (stmt)))
- {
- tree lhs = gimple_assign_lhs (stmt);
- tree rhs = gimple_assign_rhs1 (stmt);
-
- poly_int64 bitsize, bitpos;
- machine_mode mode;
- int unsignedp = 0, reversep = 0, volatilep = 0;
- tree offset, base_addr;
- base_addr
- = get_inner_reference (lhs, &bitsize, &bitpos, &offset, &mode,
- &unsignedp, &reversep, &volatilep);
- /* As a future enhancement we could handle stores with the same
- base and offset. */
- bool invalid = reversep
- || !rhs_valid_for_store_merging_p (rhs);
-
- /* We do not want to rewrite TARGET_MEM_REFs. */
- if (TREE_CODE (base_addr) == TARGET_MEM_REF)
- invalid = true;
- /* In some cases get_inner_reference may return a
- MEM_REF [ptr + byteoffset]. For the purposes of this pass
- canonicalize the base_addr to MEM_REF [ptr] and take
- byteoffset into account in the bitpos. This occurs in
- PR 23684 and this way we can catch more chains. */
- else if (TREE_CODE (base_addr) == MEM_REF)
- {
- poly_offset_int byte_off = mem_ref_offset (base_addr);
- poly_offset_int bit_off = byte_off << LOG2_BITS_PER_UNIT;
- bit_off += bitpos;
- if (!bit_off.to_shwi (&bitpos))
- invalid = true;
- base_addr = TREE_OPERAND (base_addr, 0);
- }
- /* get_inner_reference returns the base object, get at its
- address now. */
- else
- base_addr = build_fold_addr_expr (base_addr);
-
- if (! invalid
- && offset != NULL_TREE)
- {
- /* If the access is variable offset then a base
- decl has to be address-taken to be able to
- emit pointer-based stores to it.
- ??? We might be able to get away with
- re-using the original base up to the first
- variable part and then wrapping that inside
- a BIT_FIELD_REF. */
- tree base = get_base_address (base_addr);
- if (! base
- || (DECL_P (base)
- && ! TREE_ADDRESSABLE (base)))
- invalid = true;
- else
- base_addr = build2 (POINTER_PLUS_EXPR,
- TREE_TYPE (base_addr),
- base_addr, offset);
- }
-
- struct imm_store_chain_info **chain_info
- = m_stores.get (base_addr);
-
- HOST_WIDE_INT const_bitsize, const_bitpos;
- if (!invalid
- && bitsize.is_constant (&const_bitsize)
- && bitpos.is_constant (&const_bitpos)
- && (const_bitsize <= MAX_BITSIZE_MODE_ANY_INT
- || TREE_CODE (rhs) == INTEGER_CST)
- && const_bitpos >= 0)
- {
- store_immediate_info *info;
- if (chain_info)
- {
- info = new store_immediate_info (
- const_bitsize, const_bitpos, stmt,
- (*chain_info)->m_store_info.length ());
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file,
- "Recording immediate store from stmt:\n");
- print_gimple_stmt (dump_file, stmt, 0);
- }
- (*chain_info)->m_store_info.safe_push (info);
- /* If we reach the limit of stores to merge in a chain
- terminate and process the chain now. */
- if ((*chain_info)->m_store_info.length ()
- == (unsigned int)
- PARAM_VALUE (PARAM_MAX_STORES_TO_MERGE))
- {
- if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf (dump_file,
- "Reached maximum number of statements"
- " to merge:\n");
- terminate_and_release_chain (*chain_info);
- }
- continue;
- }
-
- /* Store aliases any existing chain? */
- terminate_all_aliasing_chains (chain_info, false, stmt);
- /* Start a new chain. */
- struct imm_store_chain_info *new_chain
- = new imm_store_chain_info (m_stores_head, base_addr);
- info = new store_immediate_info (const_bitsize, const_bitpos,
- stmt, 0);
- new_chain->m_store_info.safe_push (info);
- m_stores.put (base_addr, new_chain);
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file,
- "Starting new chain with statement:\n");
- print_gimple_stmt (dump_file, stmt, 0);
- fprintf (dump_file, "The base object is:\n");
- print_generic_expr (dump_file, base_addr);
- fprintf (dump_file, "\n");
- }
- }
- else
- terminate_all_aliasing_chains (chain_info,
- offset != NULL_TREE, stmt);
-
- continue;
- }
-
- terminate_all_aliasing_chains (NULL, false, stmt);
+ process_store (stmt);
+ else
+ terminate_all_aliasing_chains (NULL, stmt);
}
terminate_and_process_all_chains ();
}
diff --git a/gcc/gimple-ssa-warn-alloca.c b/gcc/gimple-ssa-warn-alloca.c
index 2d255a493d0..08c2195575a 100644
--- a/gcc/gimple-ssa-warn-alloca.c
+++ b/gcc/gimple-ssa-warn-alloca.c
@@ -264,7 +264,7 @@ is_max (tree x, wide_int max)
// Analyze the alloca call in STMT and return the alloca type with its
// corresponding limit (if applicable). IS_VLA is set if the alloca
-// call is really a BUILT_IN_ALLOCA_WITH_ALIGN, signifying a VLA.
+// call was created by the gimplifier for a VLA.
//
// If the alloca call may be too large because of a cast from a signed
// type to an unsigned type, set *INVALID_CASTED_TYPE to the
@@ -278,7 +278,8 @@ alloca_call_type (gimple *stmt, bool is_vla, tree *invalid_casted_type)
tree len = gimple_call_arg (stmt, 0);
tree len_casted = NULL;
wide_int min, max;
- struct alloca_type_and_limit ret = alloca_type_and_limit (ALLOCA_UNBOUNDED);
+ edge_iterator ei;
+ edge e;
gcc_assert (!is_vla || (unsigned HOST_WIDE_INT) warn_vla_limit > 0);
gcc_assert (is_vla || (unsigned HOST_WIDE_INT) warn_alloca_limit > 0);
@@ -299,16 +300,18 @@ alloca_call_type (gimple *stmt, bool is_vla, tree *invalid_casted_type)
wi::to_wide (len));
if (integer_zerop (len))
return alloca_type_and_limit (ALLOCA_ARG_IS_ZERO);
- ret = alloca_type_and_limit (ALLOCA_OK);
+
+ return alloca_type_and_limit (ALLOCA_OK);
}
+
// Check the range info if available.
- else if (TREE_CODE (len) == SSA_NAME)
+ if (TREE_CODE (len) == SSA_NAME)
{
value_range_type range_type = get_range_info (len, &min, &max);
if (range_type == VR_RANGE)
{
if (wi::leu_p (max, max_size))
- ret = alloca_type_and_limit (ALLOCA_OK);
+ return alloca_type_and_limit (ALLOCA_OK);
else
{
// A cast may have created a range we don't care
@@ -391,52 +394,41 @@ alloca_call_type (gimple *stmt, bool is_vla, tree *invalid_casted_type)
// If we couldn't find anything, try a few heuristics for things we
// can easily determine. Check these misc cases but only accept
// them if all predecessors have a known bound.
- basic_block bb = gimple_bb (stmt);
- if (ret.type == ALLOCA_UNBOUNDED)
+ struct alloca_type_and_limit ret = alloca_type_and_limit (ALLOCA_OK);
+ FOR_EACH_EDGE (e, ei, gimple_bb (stmt)->preds)
{
- ret.type = ALLOCA_OK;
- for (unsigned ix = 0; ix < EDGE_COUNT (bb->preds); ix++)
- {
- gcc_assert (!len_casted || TYPE_UNSIGNED (TREE_TYPE (len_casted)));
- ret = alloca_call_type_by_arg (len, len_casted,
- EDGE_PRED (bb, ix), max_size);
- if (ret.type != ALLOCA_OK)
- break;
- }
+ gcc_assert (!len_casted || TYPE_UNSIGNED (TREE_TYPE (len_casted)));
+ ret = alloca_call_type_by_arg (len, len_casted, e, max_size);
+ if (ret.type != ALLOCA_OK)
+ break;
+ }
+
+ if (ret.type != ALLOCA_OK && tentative_cast_from_signed)
+ ret = alloca_type_and_limit (ALLOCA_CAST_FROM_SIGNED);
+
+ // If we have a declared maximum size, we can take it into account.
+ if (ret.type != ALLOCA_OK
+ && gimple_call_builtin_p (stmt, BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX))
+ {
+ tree arg = gimple_call_arg (stmt, 2);
+ if (compare_tree_int (arg, max_size) <= 0)
+ ret = alloca_type_and_limit (ALLOCA_OK);
+ else
+ ret = alloca_type_and_limit (ALLOCA_BOUND_MAYBE_LARGE,
+ wi::to_wide (arg));
}
- if (tentative_cast_from_signed && ret.type != ALLOCA_OK)
- return alloca_type_and_limit (ALLOCA_CAST_FROM_SIGNED);
return ret;
}
-// Return TRUE if the alloca call in STMT is in a loop, otherwise
-// return FALSE. As an exception, ignore alloca calls for VLAs that
-// occur in a loop since those will be cleaned up when they go out of
-// scope.
+// Return TRUE if STMT is in a loop, otherwise return FALSE.
static bool
-in_loop_p (bool is_vla, gimple *stmt)
+in_loop_p (gimple *stmt)
{
basic_block bb = gimple_bb (stmt);
- if (bb->loop_father
- && bb->loop_father->header != ENTRY_BLOCK_PTR_FOR_FN (cfun))
- {
- // Do not warn on VLAs occurring in a loop, since VLAs are
- // guaranteed to be cleaned up when they go out of scope.
- // That is, there is a corresponding __builtin_stack_restore
- // at the end of the scope in which the VLA occurs.
- tree fndecl = gimple_call_fn (stmt);
- while (TREE_CODE (fndecl) == ADDR_EXPR)
- fndecl = TREE_OPERAND (fndecl, 0);
- if (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
- && is_vla
- && DECL_FUNCTION_CODE (fndecl) == BUILT_IN_ALLOCA_WITH_ALIGN)
- return false;
-
- return true;
- }
- return false;
+ return
+ bb->loop_father && bb->loop_father->header != ENTRY_BLOCK_PTR_FOR_FN (cfun);
}
unsigned int
@@ -455,8 +447,8 @@ pass_walloca::execute (function *fun)
continue;
gcc_assert (gimple_call_num_args (stmt) >= 1);
- bool is_vla = gimple_alloca_call_p (stmt)
- && gimple_call_alloca_for_var_p (as_a <gcall *> (stmt));
+ const bool is_vla
+ = gimple_call_alloca_for_var_p (as_a <gcall *> (stmt));
// Strict mode whining for VLAs is handled by the front-end,
// so we can safely ignore this case. Also, ignore VLAs if
@@ -476,9 +468,10 @@ pass_walloca::execute (function *fun)
struct alloca_type_and_limit t
= alloca_call_type (stmt, is_vla, &invalid_casted_type);
- // Even if we think the alloca call is OK, make sure it's
- // not in a loop.
- if (t.type == ALLOCA_OK && in_loop_p (is_vla, stmt))
+ // Even if we think the alloca call is OK, make sure it's not in a
+ // loop, except for a VLA, since VLAs are guaranteed to be cleaned
+ // up when they go out of scope, including in a loop.
+ if (t.type == ALLOCA_OK && !is_vla && in_loop_p (stmt))
t = alloca_type_and_limit (ALLOCA_IN_LOOP);
enum opt_code wcode
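
An illustrative pair of functions for the distinction the code above preserves (assuming -Walloca-larger-than= / -Wvla-larger-than= limits large enough that the sizes themselves are acceptable): a plain alloca in a loop is still flagged, because each iteration's allocation lives until the function returns, while a VLA in a loop is not, because its storage is released when it goes out of scope at the end of each iteration.

extern void use (char *, unsigned);

void
alloca_in_loop (void)
{
  for (unsigned i = 0; i < 16; i++)
    {
      char *q = __builtin_alloca (64);  /* still diagnosed: alloca in a loop  */
      use (q, 64);
    }
}

void
vla_in_loop (unsigned n)
{
  for (unsigned i = 0; i < 16; i++)
    {
      char buf[n % 64 + 1];             /* VLA: freed each iteration, not flagged  */
      use (buf, n % 64 + 1);
    }
}
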
diff --git a/gcc/gimple-streamer-in.c b/gcc/gimple-streamer-in.c
index 23cf692e321..0dabe1adcf6 100644
--- a/gcc/gimple-streamer-in.c
+++ b/gcc/gimple-streamer-in.c
@@ -266,7 +266,6 @@ input_bb (struct lto_input_block *ib, enum LTO_tags tag,
bb->count = profile_count::stream_in (ib).apply_scale
(count_materialization_scale, REG_BR_PROB_BASE);
- bb->frequency = streamer_read_hwi (ib);
bb->flags = streamer_read_hwi (ib);
/* LTO_bb1 has statements. LTO_bb0 does not. */
diff --git a/gcc/gimple-streamer-out.c b/gcc/gimple-streamer-out.c
index cdd775388e1..c19e5f1b55f 100644
--- a/gcc/gimple-streamer-out.c
+++ b/gcc/gimple-streamer-out.c
@@ -210,7 +210,6 @@ output_bb (struct output_block *ob, basic_block bb, struct function *fn)
streamer_write_uhwi (ob, bb->index);
bb->count.stream_out (ob);
- streamer_write_hwi (ob, bb->frequency);
streamer_write_hwi (ob, bb->flags);
if (!gsi_end_p (bsi) || phi_nodes (bb))
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 26414d69ea9..af49405929a 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -346,7 +346,7 @@ gimple_build_call_internal_vec (enum internal_fn fn, vec<tree> args)
this fact. */
gcall *
-gimple_build_call_from_tree (tree t)
+gimple_build_call_from_tree (tree t, tree fnptrtype)
{
unsigned i, nargs;
gcall *call;
@@ -369,8 +369,7 @@ gimple_build_call_from_tree (tree t)
gimple_call_set_return_slot_opt (call, CALL_EXPR_RETURN_SLOT_OPT (t));
if (fndecl
&& DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
- && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_ALLOCA
- || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_ALLOCA_WITH_ALIGN))
+ && ALLOCA_FUNCTION_CODE_P (DECL_FUNCTION_CODE (fndecl)))
gimple_call_set_alloca_for_var (call, CALL_ALLOCA_FOR_VAR_P (t));
else
gimple_call_set_from_thunk (call, CALL_FROM_THUNK_P (t));
@@ -380,6 +379,23 @@ gimple_build_call_from_tree (tree t)
gimple_set_no_warning (call, TREE_NO_WARNING (t));
gimple_call_set_with_bounds (call, CALL_WITH_BOUNDS_P (t));
+ if (fnptrtype)
+ {
+ gimple_call_set_fntype (call, TREE_TYPE (fnptrtype));
+
+      /* Check if it's an indirect CALL and the type has the
+	 nocf_check attribute.  In that case propagate the information
+	 to the GIMPLE call statement.  */
+ if (!fndecl)
+ {
+ gcc_assert (POINTER_TYPE_P (fnptrtype));
+ tree fntype = TREE_TYPE (fnptrtype);
+
+ if (lookup_attribute ("nocf_check", TYPE_ATTRIBUTES (fntype)))
+ gimple_call_set_nocf_check (call, TRUE);
+ }
+ }
+
return call;
}
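
For background (an illustrative, hypothetical example; it assumes a target with -fcf-protection support, otherwise the attribute is ignored): the nocf_check type attribute is what gets propagated here, so an indirect call through such a function-pointer type ends up with gimple_call_nocf_check_p true and later passes can treat it as an untracked indirect call.

typedef void (*nocheck_fn) (void) __attribute__ ((nocf_check));

void
dispatch (nocheck_fn f1, void (*f2) (void))
{
  f1 ();   /* indirect call through a nocf_check type: flag set           */
  f2 ();   /* ordinary indirect call: control-flow check still wanted     */
}
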
@@ -1824,11 +1840,35 @@ gimple_copy (gimple *stmt)
gimple_omp_sections_set_clauses (copy, t);
t = unshare_expr (gimple_omp_sections_control (stmt));
gimple_omp_sections_set_control (copy, t);
- /* FALLTHRU */
+ goto copy_omp_body;
case GIMPLE_OMP_SINGLE:
+ {
+ gomp_single *omp_single_copy = as_a <gomp_single *> (copy);
+ t = unshare_expr (gimple_omp_single_clauses (stmt));
+ gimple_omp_single_set_clauses (omp_single_copy, t);
+ }
+ goto copy_omp_body;
+
case GIMPLE_OMP_TARGET:
+ {
+ gomp_target *omp_target_stmt = as_a <gomp_target *> (stmt);
+ gomp_target *omp_target_copy = as_a <gomp_target *> (copy);
+ t = unshare_expr (gimple_omp_target_clauses (omp_target_stmt));
+ gimple_omp_target_set_clauses (omp_target_copy, t);
+ t = unshare_expr (gimple_omp_target_data_arg (omp_target_stmt));
+ gimple_omp_target_set_data_arg (omp_target_copy, t);
+ }
+ goto copy_omp_body;
+
case GIMPLE_OMP_TEAMS:
+ {
+ gomp_teams *omp_teams_copy = as_a <gomp_teams *> (copy);
+ t = unshare_expr (gimple_omp_teams_clauses (stmt));
+ gimple_omp_teams_set_clauses (omp_teams_copy, t);
+ }
+ /* FALLTHRU */
+
case GIMPLE_OMP_SECTION:
case GIMPLE_OMP_MASTER:
case GIMPLE_OMP_TASKGROUP:
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 6213c49b91f..334def89398 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -148,6 +148,7 @@ enum gf_mask {
GF_CALL_WITH_BOUNDS = 1 << 8,
GF_CALL_MUST_TAIL_CALL = 1 << 9,
GF_CALL_BY_DESCRIPTOR = 1 << 10,
+ GF_CALL_NOCF_CHECK = 1 << 11,
GF_OMP_PARALLEL_COMBINED = 1 << 0,
GF_OMP_PARALLEL_GRID_PHONY = 1 << 1,
GF_OMP_TASK_TASKLOOP = 1 << 0,
@@ -1425,7 +1426,7 @@ gcall *gimple_build_call (tree, unsigned, ...);
gcall *gimple_build_call_valist (tree, unsigned, va_list);
gcall *gimple_build_call_internal (enum internal_fn, unsigned, ...);
gcall *gimple_build_call_internal_vec (enum internal_fn, vec<tree> );
-gcall *gimple_build_call_from_tree (tree);
+gcall *gimple_build_call_from_tree (tree, tree);
gassign *gimple_build_assign (tree, tree CXX_MEM_STAT_INFO);
gassign *gimple_build_assign (tree, enum tree_code,
tree, tree, tree CXX_MEM_STAT_INFO);
@@ -2893,6 +2894,25 @@ gimple_call_set_with_bounds (gimple *gs, bool with_bounds)
}
+/* Return true if call GS is marked as nocf_check. */
+
+static inline bool
+gimple_call_nocf_check_p (const gcall *gs)
+{
+ return (gs->subcode & GF_CALL_NOCF_CHECK) != 0;
+}
+
+/* Mark statement GS as nocf_check call. */
+
+static inline void
+gimple_call_set_nocf_check (gcall *gs, bool nocf_check)
+{
+ if (nocf_check)
+ gs->subcode |= GF_CALL_NOCF_CHECK;
+ else
+ gs->subcode &= ~GF_CALL_NOCF_CHECK;
+}
+
/* Return the target of internal call GS. */
static inline enum internal_fn
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 8a5c380027c..540d128a70d 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -27,6 +27,8 @@ along with GCC; see the file COPYING3. If not see
#include "target.h"
#include "rtl.h"
#include "tree.h"
+#include "memmodel.h"
+#include "tm_p.h"
#include "gimple.h"
#include "gimple-predict.h"
#include "tree-pass.h" /* FIXME: only for PROP_gimple_any */
@@ -1576,9 +1578,8 @@ gimplify_vla_decl (tree decl, gimple_seq *seq_p)
SET_DECL_VALUE_EXPR (decl, t);
DECL_HAS_VALUE_EXPR_P (decl) = 1;
- t = builtin_decl_explicit (BUILT_IN_ALLOCA_WITH_ALIGN);
- t = build_call_expr (t, 2, DECL_SIZE_UNIT (decl),
- size_int (DECL_ALIGN (decl)));
+ t = build_alloca_call_expr (DECL_SIZE_UNIT (decl), DECL_ALIGN (decl),
+ max_int_size_in_bytes (TREE_TYPE (decl)));
/* The call has been built for a variable-sized object. */
CALL_ALLOCA_FOR_VAR_P (t) = 1;
t = fold_convert (ptr_type, t);
@@ -1658,6 +1659,7 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
&& TREE_ADDRESSABLE (decl)
&& !TREE_STATIC (decl)
&& !DECL_HAS_VALUE_EXPR_P (decl)
+ && DECL_ALIGN (decl) <= MAX_SUPPORTED_STACK_ALIGNMENT
&& dbg_cnt (asan_use_after_scope))
{
asan_poisoned_variables->add (decl);
@@ -3175,8 +3177,7 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool want_value)
&& DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
switch (DECL_FUNCTION_CODE (fndecl))
{
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
/* If the call has been built for a variable-sized object, then we
want to restore the stack level when the enclosing BIND_EXPR is
exited to reclaim the allocated space; otherwise, we precisely
@@ -3381,8 +3382,7 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_p, bool want_value)
/* The CALL_EXPR in *EXPR_P is already in GIMPLE form, so all we
have to do is replicate it as a GIMPLE_CALL tuple. */
gimple_stmt_iterator gsi;
- call = gimple_build_call_from_tree (*expr_p);
- gimple_call_set_fntype (call, TREE_TYPE (fnptrtype));
+ call = gimple_build_call_from_tree (*expr_p, fnptrtype);
notice_special_calls (call);
if (EXPR_CILK_SPAWN (*expr_p))
gimplify_cilk_detach (pre_p);
@@ -5663,8 +5663,7 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
CALL_EXPR_ARG (*from_p, 2));
else
{
- call_stmt = gimple_build_call_from_tree (*from_p);
- gimple_call_set_fntype (call_stmt, TREE_TYPE (fnptrtype));
+ call_stmt = gimple_build_call_from_tree (*from_p, fnptrtype);
}
}
notice_special_calls (call_stmt);
@@ -6507,7 +6506,9 @@ gimplify_target_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
clobber = build2 (MODIFY_EXPR, TREE_TYPE (temp), temp, clobber);
gimple_push_cleanup (temp, clobber, false, pre_p, true);
}
- if (asan_poisoned_variables && dbg_cnt (asan_use_after_scope))
+ if (asan_poisoned_variables
+ && DECL_ALIGN (temp) <= MAX_SUPPORTED_STACK_ALIGNMENT
+ && dbg_cnt (asan_use_after_scope))
{
tree asan_cleanup = build_asan_poison_call_expr (temp);
if (asan_cleanup)
@@ -7972,7 +7973,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
o1 = wi::to_poly_offset (offset);
else
o1 = 0;
- if (maybe_nonzero (bitpos))
+ if (may_ne (bitpos, 0))
o1 += bits_to_bytes_round_down (bitpos);
sc = &OMP_CLAUSE_CHAIN (*osc);
if (*sc != c
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 418e1274fdf..0fa2cccebfe 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-44132970e4b6c1186036bf8eda8982fb6e905d6f
+64d570c590a76921cbdca4efb22e4675e19cc809
The first line of this file holds the git revision number of the last
merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 8337cbeb602..dad22ebd2c9 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -144,8 +144,8 @@ Expression::convert_for_assignment(Gogo*, Type* lhs_type,
|| rhs->is_error_expression())
return Expression::make_error(location);
- if (lhs_type->forwarded() != rhs_type->forwarded()
- && lhs_type->interface_type() != NULL)
+ bool are_identical = Type::are_identical(lhs_type, rhs_type, false, NULL);
+ if (!are_identical && lhs_type->interface_type() != NULL)
{
if (rhs_type->interface_type() == NULL)
return Expression::convert_type_to_interface(lhs_type, rhs, location);
@@ -153,8 +153,7 @@ Expression::convert_for_assignment(Gogo*, Type* lhs_type,
return Expression::convert_interface_to_interface(lhs_type, rhs, false,
location);
}
- else if (lhs_type->forwarded() != rhs_type->forwarded()
- && rhs_type->interface_type() != NULL)
+ else if (!are_identical && rhs_type->interface_type() != NULL)
return Expression::convert_interface_to_type(lhs_type, rhs, location);
else if (lhs_type->is_slice_type() && rhs_type->is_nil_type())
{
@@ -165,8 +164,15 @@ Expression::convert_for_assignment(Gogo*, Type* lhs_type,
}
else if (rhs_type->is_nil_type())
return Expression::make_nil(location);
- else if (Type::are_identical(lhs_type, rhs_type, false, NULL))
+ else if (are_identical)
{
+ if (lhs_type->forwarded() != rhs_type->forwarded())
+ {
+ // Different but identical types require an explicit
+ // conversion. This happens with type aliases.
+ return Expression::make_cast(lhs_type, rhs, location);
+ }
+
// No conversion is needed.
return rhs;
}
diff --git a/gcc/graphite-dependences.c b/gcc/graphite-dependences.c
index 2066b2ea59c..bd3e91ba860 100644
--- a/gcc/graphite-dependences.c
+++ b/gcc/graphite-dependences.c
@@ -67,9 +67,9 @@ add_pdr_constraints (poly_dr_p pdr, poly_bb_p pbb)
reads are returned in READS and writes in MUST_WRITES and MAY_WRITES. */
static void
-scop_get_reads_and_writes (scop_p scop, isl_union_map *reads,
- isl_union_map *must_writes,
- isl_union_map *may_writes)
+scop_get_reads_and_writes (scop_p scop, isl_union_map *&reads,
+ isl_union_map *&must_writes,
+ isl_union_map *&may_writes)
{
int i, j;
poly_bb_p pbb;
diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index e7d95e22110..0858672facc 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -56,17 +56,9 @@ along with GCC; see the file COPYING3. If not see
#include "cfganal.h"
#include "value-prof.h"
#include "tree-ssa.h"
+#include "tree-vectorizer.h"
#include "graphite.h"
-/* We always try to use signed 128 bit types, but fall back to smaller types
- in case a platform does not provide types of these sizes. In the future we
- should use isl to derive the optimal type for each subexpression. */
-
-static int max_mode_int_precision =
- GET_MODE_PRECISION (int_mode_for_size (MAX_FIXED_MODE_SIZE, 0).require ());
-static int graphite_expression_type_precision = 128 <= max_mode_int_precision ?
- 128 : max_mode_int_precision;
-
struct ast_build_info
{
ast_build_info()
@@ -143,8 +135,7 @@ enum phi_node_kind
class translate_isl_ast_to_gimple
{
public:
- translate_isl_ast_to_gimple (sese_info_p r)
- : region (r), codegen_error (false) { }
+ translate_isl_ast_to_gimple (sese_info_p r);
edge translate_isl_ast (loop_p context_loop, __isl_keep isl_ast_node *node,
edge next_e, ivs_params &ip);
edge translate_isl_ast_node_for (loop_p context_loop,
@@ -199,14 +190,12 @@ class translate_isl_ast_to_gimple
__isl_give isl_ast_node * scop_to_isl_ast (scop_p scop);
tree get_rename_from_scev (tree old_name, gimple_seq *stmts, loop_p loop,
- basic_block new_bb, basic_block old_bb,
vec<tree> iv_map);
- bool graphite_copy_stmts_from_block (basic_block bb, basic_block new_bb,
+ void graphite_copy_stmts_from_block (basic_block bb, basic_block new_bb,
vec<tree> iv_map);
edge copy_bb_and_scalar_dependences (basic_block bb, edge next_e,
vec<tree> iv_map);
void set_rename (tree old_name, tree expr);
- void set_rename_for_each_def (gimple *stmt);
void gsi_insert_earliest (gimple_seq seq);
bool codegen_error_p () const { return codegen_error; }
@@ -236,8 +225,24 @@ private:
/* A vector of all the edges at if_condition merge points. */
auto_vec<edge, 2> merge_points;
+
+ tree graphite_expr_type;
};
+translate_isl_ast_to_gimple::translate_isl_ast_to_gimple (sese_info_p r)
+ : region (r), codegen_error (false)
+{
+ /* We always try to use signed 128 bit types, but fall back to smaller types
+ in case a platform does not provide types of these sizes. In the future we
+ should use isl to derive the optimal type for each subexpression. */
+ int max_mode_int_precision
+ = GET_MODE_PRECISION (int_mode_for_size (MAX_FIXED_MODE_SIZE, 0).require ());
+ int graphite_expr_type_precision
+ = 128 <= max_mode_int_precision ? 128 : max_mode_int_precision;
+ graphite_expr_type
+ = build_nonstandard_integer_type (graphite_expr_type_precision, 0);
+}
+
/* Return the tree variable that corresponds to the given isl ast identifier
expression (an isl_ast_expr of type isl_ast_expr_id).
@@ -260,11 +265,9 @@ gcc_expression_from_isl_ast_expr_id (tree type,
"Could not map isl_id to tree expression");
isl_ast_expr_free (expr_id);
tree t = res->second;
- tree *val = region->parameter_rename_map->get(t);
-
- if (!val)
- val = &t;
- return fold_convert (type, *val);
+ if (useless_type_conversion_p (type, TREE_TYPE (t)))
+ return t;
+ return fold_convert (type, t);
}
/* Converts an isl_ast_expr_int expression E to a widest_int.
@@ -703,8 +706,7 @@ translate_isl_ast_node_for (loop_p context_loop, __isl_keep isl_ast_node *node,
edge next_e, ivs_params &ip)
{
gcc_assert (isl_ast_node_get_type (node) == isl_ast_node_for);
- tree type
- = build_nonstandard_integer_type (graphite_expression_type_precision, 0);
+ tree type = graphite_expr_type;
isl_ast_expr *for_init = isl_ast_node_for_get_init (node);
tree lb = gcc_expression_from_isl_expression (type, for_init, ip);
@@ -743,8 +745,7 @@ build_iv_mapping (vec<tree> iv_map, gimple_poly_bb_p gbb,
for (i = 1; i < isl_ast_expr_get_op_n_arg (user_expr); i++)
{
arg_expr = isl_ast_expr_get_op_arg (user_expr, i);
- tree type =
- build_nonstandard_integer_type (graphite_expression_type_precision, 0);
+ tree type = graphite_expr_type;
tree t = gcc_expression_from_isl_expression (type, arg_expr, ip);
/* To fail code generation, we generate wrong code until we discard it. */
@@ -791,13 +792,12 @@ translate_isl_ast_node_user (__isl_keep isl_ast_node *node,
isl_ast_expr_free (user_expr);
basic_block old_bb = GBB_BB (gbb);
- if (dump_file)
+ if (dump_file && (dump_flags & TDF_DETAILS))
{
fprintf (dump_file,
"[codegen] copying from bb_%d on edge (bb_%d, bb_%d)\n",
old_bb->index, next_e->src->index, next_e->dest->index);
print_loops_bb (dump_file, GBB_BB (gbb), 0, 3);
-
}
next_e = copy_bb_and_scalar_dependences (old_bb, next_e, iv_map);
@@ -807,7 +807,7 @@ translate_isl_ast_node_user (__isl_keep isl_ast_node *node,
if (codegen_error_p ())
return NULL;
- if (dump_file)
+ if (dump_file && (dump_flags & TDF_DETAILS))
{
fprintf (dump_file, "[codegen] (after copy) new basic block\n");
print_loops_bb (dump_file, next_e->src, 0, 3);
@@ -842,8 +842,7 @@ edge translate_isl_ast_to_gimple::
graphite_create_new_guard (edge entry_edge, __isl_take isl_ast_expr *if_cond,
ivs_params &ip)
{
- tree type =
- build_nonstandard_integer_type (graphite_expression_type_precision, 0);
+ tree type = graphite_expr_type;
tree cond_expr = gcc_expression_from_isl_expression (type, if_cond, ip);
/* To fail code generation, we generate wrong code until we discard it. */
@@ -933,32 +932,12 @@ set_rename (tree old_name, tree expr)
{
fprintf (dump_file, "[codegen] setting rename: old_name = ");
print_generic_expr (dump_file, old_name);
- fprintf (dump_file, ", new_name = ");
+ fprintf (dump_file, ", new decl = ");
print_generic_expr (dump_file, expr);
fprintf (dump_file, "\n");
}
-
- if (old_name == expr)
- return;
-
- vec <tree> *renames = region->rename_map->get (old_name);
-
- if (renames)
- renames->safe_push (expr);
- else
- {
- vec<tree> r;
- r.create (2);
- r.safe_push (expr);
- region->rename_map->put (old_name, r);
- }
-
- tree t;
- int i;
- /* For a parameter of a scop we don't want to rename it. */
- FOR_EACH_VEC_ELT (region->params, i, t)
- if (old_name == t)
- region->parameter_rename_map->put(old_name, expr);
+ bool res = region->rename_map->put (old_name, expr);
+ gcc_assert (! res);
}
/* Return an iterator to the instructions comes last in the execution order.
@@ -1070,9 +1049,9 @@ gsi_insert_earliest (gimple_seq seq)
if (dump_file)
{
- fprintf (dump_file, "[codegen] inserting statement: ");
+ fprintf (dump_file, "[codegen] inserting statement in BB %d: ",
+ gimple_bb (use_stmt)->index);
print_gimple_stmt (dump_file, use_stmt, 0, TDF_VOPS | TDF_MEMSYMS);
- print_loops_bb (dump_file, gimple_bb (use_stmt), 0, 3);
}
}
}
@@ -1082,7 +1061,6 @@ gsi_insert_earliest (gimple_seq seq)
tree translate_isl_ast_to_gimple::
get_rename_from_scev (tree old_name, gimple_seq *stmts, loop_p loop,
- basic_block new_bb, basic_block,
vec<tree> iv_map)
{
tree scev = scalar_evolution_in_region (region->region, loop, old_name);
@@ -1111,16 +1089,6 @@ get_rename_from_scev (tree old_name, gimple_seq *stmts, loop_p loop,
return build_zero_cst (TREE_TYPE (old_name));
}
- if (TREE_CODE (new_expr) == SSA_NAME)
- {
- basic_block bb = gimple_bb (SSA_NAME_DEF_STMT (new_expr));
- if (bb && !dominated_by_p (CDI_DOMINATORS, new_bb, bb))
- {
- set_codegen_error ();
- return build_zero_cst (TREE_TYPE (old_name));
- }
- }
-
/* Replace the old_name with the new_expr. */
return force_gimple_operand (unshare_expr (new_expr), stmts,
true, NULL_TREE);
@@ -1148,36 +1116,13 @@ should_copy_to_new_region (gimple *stmt, sese_info_p region)
&& scev_analyzable_p (lhs, region->region))
return false;
- /* Do not copy parameters that have been generated in the header of the
- scop. */
- if (is_gimple_assign (stmt)
- && (lhs = gimple_assign_lhs (stmt))
- && TREE_CODE (lhs) == SSA_NAME
- && region->parameter_rename_map->get(lhs))
- return false;
-
return true;
}
-/* Create new names for all the definitions created by COPY and add replacement
- mappings for each new name. */
-
-void translate_isl_ast_to_gimple::
-set_rename_for_each_def (gimple *stmt)
-{
- def_operand_p def_p;
- ssa_op_iter op_iter;
- FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, op_iter, SSA_OP_ALL_DEFS)
- {
- tree old_name = DEF_FROM_PTR (def_p);
- create_new_def_for (old_name, stmt, def_p);
- }
-}
-
/* Duplicates the statements of basic block BB into basic block NEW_BB
and compute the new induction variables according to the IV_MAP. */
-bool translate_isl_ast_to_gimple::
+void translate_isl_ast_to_gimple::
graphite_copy_stmts_from_block (basic_block bb, basic_block new_bb,
vec<tree> iv_map)
{
@@ -1194,7 +1139,6 @@ graphite_copy_stmts_from_block (basic_block bb, basic_block new_bb,
/* Create a new copy of STMT and duplicate STMT's virtual
operands. */
gimple *copy = gimple_copy (stmt);
- gsi_insert_after (&gsi_tgt, copy, GSI_NEW_STMT);
/* Rather than not copying debug stmts we reset them.
??? Where we can rewrite uses without inserting new
@@ -1209,57 +1153,54 @@ graphite_copy_stmts_from_block (basic_block bb, basic_block new_bb,
gcc_unreachable ();
}
- if (dump_file)
- {
- fprintf (dump_file, "[codegen] inserting statement: ");
- print_gimple_stmt (dump_file, copy, 0);
- }
-
maybe_duplicate_eh_stmt (copy, stmt);
gimple_duplicate_stmt_histograms (cfun, copy, cfun, stmt);
/* Crete new names for each def in the copied stmt. */
- set_rename_for_each_def (copy);
+ def_operand_p def_p;
+ ssa_op_iter op_iter;
+ FOR_EACH_SSA_DEF_OPERAND (def_p, copy, op_iter, SSA_OP_ALL_DEFS)
+ {
+ tree old_name = DEF_FROM_PTR (def_p);
+ create_new_def_for (old_name, copy, def_p);
+ }
- if (codegen_error_p ())
- return false;
+ gsi_insert_after (&gsi_tgt, copy, GSI_NEW_STMT);
+ if (dump_file)
+ {
+ fprintf (dump_file, "[codegen] inserting statement: ");
+ print_gimple_stmt (dump_file, copy, 0);
+ }
- /* For each SSA_NAME in the parameter_rename_map rename their usage. */
+      /* For each SCEV-analyzable SSA_NAME, rename its uses.  */
ssa_op_iter iter;
use_operand_p use_p;
if (!is_gimple_debug (copy))
- FOR_EACH_SSA_USE_OPERAND (use_p, copy, iter, SSA_OP_USE)
- {
- tree old_name = USE_FROM_PTR (use_p);
-
- if (TREE_CODE (old_name) != SSA_NAME
- || SSA_NAME_IS_DEFAULT_DEF (old_name))
- continue;
-
- tree *new_expr = region->parameter_rename_map->get (old_name);
- tree new_name;
- if (!new_expr
- && scev_analyzable_p (old_name, region->region))
- {
- gimple_seq stmts = NULL;
- new_name = get_rename_from_scev (old_name, &stmts,
- bb->loop_father,
- new_bb, bb, iv_map);
- if (! codegen_error_p ())
- gsi_insert_earliest (stmts);
- new_expr = &new_name;
- }
-
- if (!new_expr)
- continue;
-
- replace_exp (use_p, *new_expr);
- }
+ {
+ bool changed = false;
+ FOR_EACH_SSA_USE_OPERAND (use_p, copy, iter, SSA_OP_USE)
+ {
+ tree old_name = USE_FROM_PTR (use_p);
+
+ if (TREE_CODE (old_name) != SSA_NAME
+ || SSA_NAME_IS_DEFAULT_DEF (old_name)
+ || ! scev_analyzable_p (old_name, region->region))
+ continue;
+
+ gimple_seq stmts = NULL;
+ tree new_name = get_rename_from_scev (old_name, &stmts,
+ bb->loop_father, iv_map);
+ if (! codegen_error_p ())
+ gsi_insert_earliest (stmts);
+ replace_exp (use_p, new_name);
+ changed = true;
+ }
+ if (changed)
+ fold_stmt_inplace (&gsi_tgt);
+ }
update_stmt (copy);
}
-
- return true;
}
@@ -1282,39 +1223,21 @@ copy_bb_and_scalar_dependences (basic_block bb, edge next_e, vec<tree> iv_map)
continue;
tree new_phi_def;
- vec <tree> *renames = region->rename_map->get (res);
- if (! renames || renames->is_empty ())
+ tree *rename = region->rename_map->get (res);
+ if (! rename)
{
new_phi_def = create_tmp_reg (TREE_TYPE (res));
set_rename (res, new_phi_def);
}
else
- {
- gcc_assert (renames->length () == 1);
- new_phi_def = (*renames)[0];
- }
+ new_phi_def = *rename;
gassign *ass = gimple_build_assign (NULL_TREE, new_phi_def);
create_new_def_for (res, ass, NULL);
gsi_insert_after (&gsi_tgt, ass, GSI_NEW_STMT);
}
- vec <basic_block> *copied_bbs = region->copied_bb_map->get (bb);
- if (copied_bbs)
- copied_bbs->safe_push (new_bb);
- else
- {
- vec<basic_block> bbs;
- bbs.create (2);
- bbs.safe_push (new_bb);
- region->copied_bb_map->put (bb, bbs);
- }
-
- if (!graphite_copy_stmts_from_block (bb, new_bb, iv_map))
- {
- set_codegen_error ();
- return NULL;
- }
+ graphite_copy_stmts_from_block (bb, new_bb, iv_map);
/* Insert out-of SSA copies on the original BB outgoing edges. */
gsi_tgt = gsi_last_bb (new_bb);
@@ -1340,17 +1263,14 @@ copy_bb_and_scalar_dependences (basic_block bb, edge next_e, vec<tree> iv_map)
continue;
tree new_phi_def;
- vec <tree> *renames = region->rename_map->get (res);
- if (! renames || renames->is_empty ())
+ tree *rename = region->rename_map->get (res);
+ if (! rename)
{
new_phi_def = create_tmp_reg (TREE_TYPE (res));
set_rename (res, new_phi_def);
}
else
- {
- gcc_assert (renames->length () == 1);
- new_phi_def = (*renames)[0];
- }
+ new_phi_def = *rename;
tree arg = PHI_ARG_DEF_FROM_EDGE (phi, e);
if (TREE_CODE (arg) == SSA_NAME
@@ -1359,7 +1279,7 @@ copy_bb_and_scalar_dependences (basic_block bb, edge next_e, vec<tree> iv_map)
gimple_seq stmts = NULL;
tree new_name = get_rename_from_scev (arg, &stmts,
bb->loop_father,
- new_bb, bb, iv_map);
+ iv_map);
if (! codegen_error_p ())
gsi_insert_earliest (stmts);
arg = new_name;
@@ -1385,13 +1305,14 @@ add_parameters_to_ivs_params (scop_p scop, ivs_params &ip)
{
sese_info_p region = scop->scop_info;
unsigned nb_parameters = isl_set_dim (scop->param_context, isl_dim_param);
- gcc_assert (nb_parameters == region->params.length ());
+ gcc_assert (nb_parameters == sese_nb_params (region));
unsigned i;
- for (i = 0; i < nb_parameters; i++)
+ tree param;
+ FOR_EACH_VEC_ELT (region->params, i, param)
{
isl_id *tmp_id = isl_set_get_dim_id (scop->param_context,
isl_dim_param, i);
- ip[tmp_id] = region->params[i];
+ ip[tmp_id] = param;
}
}
@@ -1427,6 +1348,13 @@ ast_build_before_for (__isl_keep isl_ast_build *build, void *user)
__isl_give isl_ast_node *translate_isl_ast_to_gimple::
scop_to_isl_ast (scop_p scop)
{
+ int old_err = isl_options_get_on_error (scop->isl_context);
+ int old_max_operations = isl_ctx_get_max_operations (scop->isl_context);
+ int max_operations = PARAM_VALUE (PARAM_MAX_ISL_OPERATIONS);
+ if (max_operations)
+ isl_ctx_set_max_operations (scop->isl_context, max_operations);
+ isl_options_set_on_error (scop->isl_context, ISL_ON_ERROR_CONTINUE);
+
gcc_assert (scop->transformed_schedule);
/* Set the separate option to reduce control flow overhead. */
@@ -1445,70 +1373,56 @@ scop_to_isl_ast (scop_p scop)
isl_ast_node *ast_isl = isl_ast_build_node_from_schedule
(context_isl, schedule);
isl_ast_build_free (context_isl);
- return ast_isl;
-}
-
-/* Copy def from sese REGION to the newly created TO_REGION. TR is defined by
- DEF_STMT. GSI points to entry basic block of the TO_REGION. */
-
-static void
-copy_def (tree tr, gimple *def_stmt, sese_info_p region, sese_info_p to_region,
- gimple_stmt_iterator *gsi)
-{
- if (!defined_in_sese_p (tr, region->region))
- return;
- ssa_op_iter iter;
- use_operand_p use_p;
- FOR_EACH_SSA_USE_OPERAND (use_p, def_stmt, iter, SSA_OP_USE)
+ isl_options_set_on_error (scop->isl_context, old_err);
+ isl_ctx_reset_operations (scop->isl_context);
+ isl_ctx_set_max_operations (scop->isl_context, old_max_operations);
+ if (isl_ctx_last_error (scop->isl_context) != isl_error_none)
{
- tree use_tr = USE_FROM_PTR (use_p);
-
- /* Do not copy parameters that have been generated in the header of the
- scop. */
- if (region->parameter_rename_map->get(use_tr))
- continue;
-
- gimple *def_of_use = SSA_NAME_DEF_STMT (use_tr);
- if (!def_of_use)
- continue;
-
- copy_def (use_tr, def_of_use, region, to_region, gsi);
- }
-
- gimple *copy = gimple_copy (def_stmt);
- gsi_insert_after (gsi, copy, GSI_NEW_STMT);
-
- /* Create new names for all the definitions created by COPY and
- add replacement mappings for each new name. */
- def_operand_p def_p;
- ssa_op_iter op_iter;
- FOR_EACH_SSA_DEF_OPERAND (def_p, copy, op_iter, SSA_OP_ALL_DEFS)
- {
- tree old_name = DEF_FROM_PTR (def_p);
- tree new_name = create_new_def_for (old_name, copy, def_p);
- region->parameter_rename_map->put(old_name, new_name);
+ location_t loc = find_loop_location
+ (scop->scop_info->region.entry->dest->loop_father);
+ if (isl_ctx_last_error (scop->isl_context) == isl_error_quota)
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
+ "loop nest not optimized, AST generation timed out "
+ "after %d operations [--param max-isl-operations]\n",
+ max_operations);
+ else
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
+ "loop nest not optimized, ISL AST generation "
+ "signalled an error\n");
+ isl_ast_node_free (ast_isl);
+ return NULL;
}
- update_stmt (copy);
+ return ast_isl;
}
+/* Generate out-of-SSA copies for the entry edge FALSE_ENTRY/TRUE_ENTRY
+ in REGION. */
+
static void
-copy_internal_parameters (sese_info_p region, sese_info_p to_region)
+generate_entry_out_of_ssa_copies (edge false_entry,
+ edge true_entry,
+ sese_info_p region)
{
- /* For all the parameters which definitino is in the if_region->false_region,
- insert code on true_region (if_region->true_region->entry). */
-
- int i;
- tree tr;
- gimple_stmt_iterator gsi = gsi_start_bb(to_region->region.entry->dest);
-
- FOR_EACH_VEC_ELT (region->params, i, tr)
+ gimple_stmt_iterator gsi_tgt = gsi_start_bb (true_entry->dest);
+ for (gphi_iterator psi = gsi_start_phis (false_entry->dest);
+ !gsi_end_p (psi); gsi_next (&psi))
{
- // If def is not in region.
- gimple *def_stmt = SSA_NAME_DEF_STMT (tr);
- if (def_stmt)
- copy_def (tr, def_stmt, region, to_region, &gsi);
+ gphi *phi = psi.phi ();
+ tree res = gimple_phi_result (phi);
+ if (virtual_operand_p (res))
+ continue;
+      /* When there's no out-of-SSA var registered, do not bother
+ to create one. */
+ tree *rename = region->rename_map->get (res);
+ if (! rename)
+ continue;
+ tree new_phi_def = *rename;
+ gassign *ass = gimple_build_assign (new_phi_def,
+ PHI_ARG_DEF_FROM_EDGE (phi,
+ false_entry));
+ gsi_insert_after (&gsi_tgt, ass, GSI_NEW_STMT);
}
}
@@ -1528,6 +1442,12 @@ graphite_regenerate_ast_isl (scop_p scop)
timevar_push (TV_GRAPHITE_CODE_GEN);
t.add_parameters_to_ivs_params (scop, ip);
root_node = t.scop_to_isl_ast (scop);
+ if (! root_node)
+ {
+ ivs_params_clear (ip);
+ timevar_pop (TV_GRAPHITE_CODE_GEN);
+ return false;
+ }
if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -1546,10 +1466,6 @@ graphite_regenerate_ast_isl (scop_p scop)
region->if_region = if_region;
loop_p context_loop = region->region.entry->src->loop_father;
-
- /* Copy all the parameters which are defined in the region. */
- copy_internal_parameters(if_region->false_region, if_region->true_region);
-
edge e = single_succ_edge (if_region->true_region->region.entry->dest);
basic_block bb = split_edge (e);
@@ -1559,35 +1475,24 @@ graphite_regenerate_ast_isl (scop_p scop)
t.translate_isl_ast (context_loop, root_node, e, ip);
if (! t.codegen_error_p ())
{
+ generate_entry_out_of_ssa_copies (if_region->false_region->region.entry,
+ if_region->true_region->region.entry,
+ region);
sese_insert_phis_for_liveouts (region,
if_region->region->region.exit->src,
if_region->false_region->region.exit,
if_region->true_region->region.exit);
if (dump_file)
fprintf (dump_file, "[codegen] isl AST to Gimple succeeded.\n");
-
- mark_virtual_operands_for_renaming (cfun);
- update_ssa (TODO_update_ssa);
- checking_verify_ssa (true, true);
- rewrite_into_loop_closed_ssa (NULL, 0);
- /* We analyzed evolutions of all SCOPs during SCOP detection
- which cached evolutions. Now we've introduced PHIs for
- liveouts which causes those cached solutions to be invalid
- for code-generation purposes given we'd insert references
- to SSA names not dominating their new use. */
- scev_reset ();
}
if (t.codegen_error_p ())
{
- if (dump_file)
- fprintf (dump_file, "codegen error: "
- "reverting back to the original code.\n");
- set_ifsese_condition (if_region, integer_zero_node);
+ location_t loc = find_loop_location
+ (scop->scop_info->region.entry->dest->loop_father);
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
+ "loop nest not optimized, code generation error\n");
- /* We registered new names, scrap that. */
- if (need_ssa_update_p (cfun))
- delete_update_ssa ();
/* Remove the unreachable region. */
remove_edge_and_dominated_blocks (if_region->true_region->region.entry);
basic_block ifb = if_region->false_region->region.entry->src;
@@ -1603,9 +1508,11 @@ graphite_regenerate_ast_isl (scop_p scop)
delete_loop (loop);
}
- /* Verifies properties that GRAPHITE should maintain during translation. */
- checking_verify_loop_structure ();
- checking_verify_loop_closed_ssa (true);
+ /* We are delaying SSA update to after code-generating all SCOPs.
+ This is because we analyzed DRs and parameters on the unmodified
+ IL and thus rely on SSA update to pick up new dominating definitions
+     from, for example, SESE liveout PHIs.  This is also for efficiency
+ as SSA update does work depending on the size of the function. */
free (if_region->true_region);
free (if_region->region);
diff --git a/gcc/graphite-scop-detection.c b/gcc/graphite-scop-detection.c
index f9d69247b0c..1bef380b32a 100644
--- a/gcc/graphite-scop-detection.c
+++ b/gcc/graphite-scop-detection.c
@@ -1005,15 +1005,10 @@ scop_detection::graphite_can_represent_expr (sese_l scop, loop_p loop,
bool
scop_detection::stmt_has_simple_data_refs_p (sese_l scop, gimple *stmt)
{
- edge nest;
+  edge nest = scop.entry;
loop_p loop = loop_containing_stmt (stmt);
if (!loop_in_sese_p (loop, scop))
- {
- nest = scop.entry;
- loop = NULL;
- }
- else
- nest = loop_preheader_edge (outermost_loop_in_sese (scop, gimple_bb (stmt)));
+ loop = NULL;
auto_vec<data_reference_p> drs;
if (! graphite_find_data_references_in_stmt (nest, loop, stmt, &drs))
@@ -1108,7 +1103,7 @@ scop_detection::stmt_simple_for_scop_p (sese_l scop, gimple *stmt,
tree op = gimple_op (stmt, i);
if (!graphite_can_represent_expr (scop, loop, op)
/* We can only constrain on integer type. */
- || (TREE_CODE (TREE_TYPE (op)) != INTEGER_TYPE))
+ || ! INTEGRAL_TYPE_P (TREE_TYPE (op)))
{
DEBUG_PRINT (dp << "[scop-detection-fail] "
<< "Graphite cannot represent stmt:\n";
@@ -1151,49 +1146,23 @@ scop_detection::nb_pbbs_in_loops (scop_p scop)
return res;
}
-/* When parameter NAME is in REGION, returns its index in SESE_PARAMS.
- Otherwise returns -1. */
+/* Assigns the parameter NAME an index in REGION. */
-static inline int
-parameter_index_in_region_1 (tree name, sese_info_p region)
+static void
+assign_parameter_index_in_region (tree name, sese_info_p region)
{
+ gcc_assert (TREE_CODE (name) == SSA_NAME
+ && INTEGRAL_TYPE_P (TREE_TYPE (name))
+ && ! defined_in_sese_p (name, region->region));
+
int i;
tree p;
-
- gcc_assert (TREE_CODE (name) == SSA_NAME);
-
FOR_EACH_VEC_ELT (region->params, i, p)
if (p == name)
- return i;
-
- return -1;
-}
-
-/* When the parameter NAME is in REGION, returns its index in
- SESE_PARAMS. Otherwise this function inserts NAME in SESE_PARAMS
- and returns the index of NAME. */
-
-static int
-parameter_index_in_region (tree name, sese_info_p region)
-{
- int i;
-
- gcc_assert (TREE_CODE (name) == SSA_NAME);
-
- /* Cannot constrain on anything else than INTEGER_TYPE parameters. */
- if (TREE_CODE (TREE_TYPE (name)) != INTEGER_TYPE)
- return -1;
-
- if (!invariant_in_sese_p_rec (name, region->region, NULL))
- return -1;
-
- i = parameter_index_in_region_1 (name, region);
- if (i != -1)
- return i;
+ return;
i = region->params.length ();
region->params.safe_push (name);
- return i;
}
/* In the context of sese S, scan the expression E and translate it to
@@ -1235,7 +1204,7 @@ scan_tree_for_params (sese_info_p s, tree e)
break;
case SSA_NAME:
- parameter_index_in_region (e, s);
+ assign_parameter_index_in_region (e, s);
break;
case INTEGER_CST:
@@ -1383,15 +1352,10 @@ try_generate_gimple_bb (scop_p scop, basic_block bb)
vec<scalar_use> reads = vNULL;
sese_l region = scop->scop_info->region;
- edge nest;
+ edge nest = region.entry;
loop_p loop = bb->loop_father;
if (!loop_in_sese_p (loop, region))
- {
- nest = region.entry;
- loop = NULL;
- }
- else
- nest = loop_preheader_edge (outermost_loop_in_sese (region, bb));
+ loop = NULL;
for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
gsi_next (&gsi))
@@ -1710,10 +1674,6 @@ build_scops (vec<scop_p> *scops)
sese_l *s;
FOR_EACH_VEC_ELT (scops_l, i, s)
{
- /* For our out-of-SSA we need a block on s->entry, similar to how
- we include the LCSSA block in the region. */
- s->entry = single_pred_edge (split_edge (s->entry));
-
scop_p scop = new_scop (s->entry, s->exit);
/* Record all basic blocks and their conditions in REGION. */
diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index ed6cbeccca1..248c34a41c9 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -142,11 +142,8 @@ isl_id_for_dr (scop_p s)
/* Extract an affine expression from the ssa_name E. */
static isl_pw_aff *
-extract_affine_name (scop_p s, tree e, __isl_take isl_space *space)
+extract_affine_name (int dimension, __isl_take isl_space *space)
{
- isl_id *id = isl_id_for_ssa_name (s, e);
- int dimension = isl_space_find_dim_by_id (space, isl_dim_param, id);
- isl_id_free (id);
isl_set *dom = isl_set_universe (isl_space_copy (space));
isl_aff *aff = isl_aff_zero_on_domain (isl_local_space_from_space (space));
aff = isl_aff_add_coefficient_si (aff, isl_dim_param, dimension, 1);
@@ -211,17 +208,13 @@ wrap (isl_pw_aff *pwaff, unsigned width)
Otherwise returns -1. */
static inline int
-parameter_index_in_region_1 (tree name, sese_info_p region)
+parameter_index_in_region (tree name, sese_info_p region)
{
int i;
tree p;
-
- gcc_assert (TREE_CODE (name) == SSA_NAME);
-
FOR_EACH_VEC_ELT (region->params, i, p)
if (p == name)
return i;
-
return -1;
}
@@ -288,10 +281,13 @@ extract_affine (scop_p s, tree e, __isl_take isl_space *space)
break;
case SSA_NAME:
- gcc_assert (-1 != parameter_index_in_region_1 (e, s->scop_info)
- || defined_in_sese_p (e, s->scop_info->region));
- res = extract_affine_name (s, e, space);
- break;
+ {
+ gcc_assert (! defined_in_sese_p (e, s->scop_info->region));
+ int dim = parameter_index_in_region (e, s->scop_info);
+ gcc_assert (dim != -1);
+ res = extract_affine_name (dim, space);
+ break;
+ }
case INTEGER_CST:
res = extract_affine_int (e, space);
@@ -431,54 +427,40 @@ add_conditions_to_domain (poly_bb_p pbb)
of P. */
static void
-add_param_constraints (scop_p scop, graphite_dim_t p)
+add_param_constraints (scop_p scop, graphite_dim_t p, tree parameter)
{
- tree parameter = scop->scop_info->params[p];
tree type = TREE_TYPE (parameter);
- tree lb = NULL_TREE;
- tree ub = NULL_TREE;
+ wide_int min, max;
- if (POINTER_TYPE_P (type) || !TYPE_MIN_VALUE (type))
- lb = lower_bound_in_type (type, type);
- else
- lb = TYPE_MIN_VALUE (type);
+ gcc_assert (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type));
- if (POINTER_TYPE_P (type) || !TYPE_MAX_VALUE (type))
- ub = upper_bound_in_type (type, type);
+ if (INTEGRAL_TYPE_P (type)
+ && get_range_info (parameter, &min, &max) == VR_RANGE)
+ ;
else
- ub = TYPE_MAX_VALUE (type);
-
- if (lb)
{
- isl_space *space = isl_set_get_space (scop->param_context);
- isl_constraint *c;
- isl_val *v;
-
- c = isl_inequality_alloc (isl_local_space_from_space (space));
- v = isl_val_int_from_wi (scop->isl_context, wi::to_widest (lb));
- v = isl_val_neg (v);
- c = isl_constraint_set_constant_val (c, v);
- c = isl_constraint_set_coefficient_si (c, isl_dim_param, p, 1);
-
- scop->param_context = isl_set_coalesce
- (isl_set_add_constraint (scop->param_context, c));
+ min = wi::min_value (TYPE_PRECISION (type), TYPE_SIGN (type));
+ max = wi::max_value (TYPE_PRECISION (type), TYPE_SIGN (type));
}
- if (ub)
- {
- isl_space *space = isl_set_get_space (scop->param_context);
- isl_constraint *c;
- isl_val *v;
-
- c = isl_inequality_alloc (isl_local_space_from_space (space));
-
- v = isl_val_int_from_wi (scop->isl_context, wi::to_widest (ub));
- c = isl_constraint_set_constant_val (c, v);
- c = isl_constraint_set_coefficient_si (c, isl_dim_param, p, -1);
-
- scop->param_context = isl_set_coalesce
- (isl_set_add_constraint (scop->param_context, c));
- }
+ isl_space *space = isl_set_get_space (scop->param_context);
+ isl_constraint *c = isl_inequality_alloc (isl_local_space_from_space (space));
+ isl_val *v = isl_val_int_from_wi (scop->isl_context,
+ widest_int::from (min, TYPE_SIGN (type)));
+ v = isl_val_neg (v);
+ c = isl_constraint_set_constant_val (c, v);
+ c = isl_constraint_set_coefficient_si (c, isl_dim_param, p, 1);
+ scop->param_context = isl_set_coalesce
+ (isl_set_add_constraint (scop->param_context, c));
+
+ space = isl_set_get_space (scop->param_context);
+ c = isl_inequality_alloc (isl_local_space_from_space (space));
+ v = isl_val_int_from_wi (scop->isl_context,
+ widest_int::from (max, TYPE_SIGN (type)));
+ c = isl_constraint_set_constant_val (c, v);
+ c = isl_constraint_set_coefficient_si (c, isl_dim_param, p, -1);
+ scop->param_context = isl_set_coalesce
+ (isl_set_add_constraint (scop->param_context, c));
}
/* Add a constrain to the ACCESSES polyhedron for the alias set of
@@ -930,9 +912,8 @@ build_scop_context (scop_p scop)
scop->param_context = isl_set_universe (space);
- graphite_dim_t p;
- for (p = 0; p < nbp; p++)
- add_param_constraints (scop, p);
+ FOR_EACH_VEC_ELT (region->params, i, e)
+ add_param_constraints (scop, i, e);
}
/* Return true when loop A is nested in loop B. */
@@ -1194,7 +1175,7 @@ build_schedule_loop_nest (scop_p scop, int *index, loop_p context_loop)
/* Build the schedule of the SCOP. */
-static bool
+static void
build_original_schedule (scop_p scop)
{
int i = 0;
@@ -1216,9 +1197,6 @@ build_original_schedule (scop_p scop)
fprintf (dump_file, "[sese-to-poly] original schedule:\n");
print_isl_schedule (dump_file, scop->original_schedule);
}
- if (!scop->original_schedule)
- return false;
- return true;
}
/* Builds the polyhedral representation for a SESE region. */
diff --git a/gcc/graphite.c b/gcc/graphite.c
index 0bdcc28cba8..22d83307bd2 100644
--- a/gcc/graphite.c
+++ b/gcc/graphite.c
@@ -55,6 +55,8 @@ along with GCC; see the file COPYING3. If not see
#include "tree-cfgcleanup.h"
#include "tree-vectorizer.h"
#include "tree-ssa-loop-manip.h"
+#include "tree-ssa.h"
+#include "tree-into-ssa.h"
#include "graphite.h"
/* Print global statistics to FILE. */
@@ -109,7 +111,7 @@ print_global_statistics (FILE* file)
fprintf (file, "LOOPS:%ld, ", n_loops);
fprintf (file, "CONDITIONS:%ld, ", n_conditions);
fprintf (file, "STMTS:%ld)\n", n_stmts);
- fprintf (file, "\nGlobal profiling statistics (");
+ fprintf (file, "Global profiling statistics (");
fprintf (file, "BBS:");
n_p_bbs.dump (file);
fprintf (file, ", LOOPS:");
@@ -118,7 +120,7 @@ print_global_statistics (FILE* file)
n_p_conditions.dump (file);
fprintf (file, ", STMTS:");
n_p_stmts.dump (file);
- fprintf (file, ")\n");
+ fprintf (file, ")\n\n");
}
/* Print statistics for SCOP to FILE. */
@@ -183,7 +185,7 @@ print_graphite_scop_statistics (FILE* file, scop_p scop)
fprintf (file, "LOOPS:%ld, ", n_loops);
fprintf (file, "CONDITIONS:%ld, ", n_conditions);
fprintf (file, "STMTS:%ld)\n", n_stmts);
- fprintf (file, "\nSCoP profiling statistics (");
+ fprintf (file, "SCoP profiling statistics (");
fprintf (file, "BBS:");
n_p_bbs.dump (file);
fprintf (file, ", LOOPS:");
@@ -192,7 +194,7 @@ print_graphite_scop_statistics (FILE* file, scop_p scop)
n_p_conditions.dump (file);
fprintf (file, ", STMTS:");
n_p_stmts.dump (file);
- fprintf (file, ")\n");
+ fprintf (file, ")\n\n");
}
/* Print statistics for SCOPS to FILE. */
@@ -201,73 +203,10 @@ static void
print_graphite_statistics (FILE* file, vec<scop_p> scops)
{
int i;
-
scop_p scop;
FOR_EACH_VEC_ELT (scops, i, scop)
print_graphite_scop_statistics (file, scop);
-
- /* Print the loop structure. */
- print_loops (file, 2);
- print_loops (file, 3);
-}
-
-/* Initialize graphite: when there are no loops returns false. */
-
-static bool
-graphite_initialize (void)
-{
- int min_loops = PARAM_VALUE (PARAM_GRAPHITE_MIN_LOOPS_PER_FUNCTION);
- int nloops = number_of_loops (cfun);
-
- if (nloops <= min_loops)
- {
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- if (nloops <= min_loops)
- fprintf (dump_file, "\nFunction does not have enough loops: "
- "PARAM_GRAPHITE_MIN_LOOPS_PER_FUNCTION = %d.\n",
- min_loops);
-
- fprintf (dump_file, "\nnumber of SCoPs: 0\n");
- print_global_statistics (dump_file);
- }
-
- return false;
- }
-
- calculate_dominance_info (CDI_DOMINATORS);
- initialize_original_copy_tables ();
-
- if (dump_file && dump_flags)
- {
- dump_function_to_file (current_function_decl, dump_file, dump_flags);
- print_loops (dump_file, 3);
- }
-
- return true;
-}
-
-/* Finalize graphite: perform CFG cleanup when NEED_CFG_CLEANUP_P is
- true. */
-
-static void
-graphite_finalize (bool need_cfg_cleanup_p)
-{
- if (need_cfg_cleanup_p)
- {
- free_dominance_info (CDI_DOMINATORS);
- scev_reset ();
- cleanup_tree_cfg ();
- profile_status_for_fn (cfun) = PROFILE_ABSENT;
- release_recorded_exits (cfun);
- tree_estimate_probability (false);
- }
-
- free_original_copy_tables ();
-
- if (dump_file && dump_flags)
- print_loops (dump_file, 3);
}
/* Deletes all scops in SCOPS. */
@@ -396,7 +335,7 @@ graphite_transform_loops (void)
{
int i;
scop_p scop;
- bool need_cfg_cleanup_p = false;
+ bool changed = false;
vec<scop_p> scops = vNULL;
isl_ctx *ctx;
@@ -405,8 +344,7 @@ graphite_transform_loops (void)
if (parallelized_function_p (cfun->decl))
return;
- if (!graphite_initialize ())
- return;
+ calculate_dominance_info (CDI_DOMINATORS);
ctx = isl_ctx_alloc ();
isl_options_set_on_error (ctx, ISL_ON_ERROR_ABORT);
@@ -415,6 +353,13 @@ graphite_transform_loops (void)
sort_sibling_loops (cfun);
canonicalize_loop_closed_ssa_form ();
+ /* Print the loop structure. */
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ print_loops (dump_file, 2);
+ print_loops (dump_file, 3);
+ }
+
calculate_dominance_info (CDI_POST_DOMINATORS);
build_scops (&scops);
free_dominance_info (CDI_POST_DOMINATORS);
@@ -435,18 +380,26 @@ graphite_transform_loops (void)
if (!apply_poly_transforms (scop))
continue;
- location_t loc = find_loop_location
- (scops[i]->scop_info->region.entry->dest->loop_father);
-
- need_cfg_cleanup_p = true;
- if (!graphite_regenerate_ast_isl (scop))
- dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
- "loop nest not optimized, code generation error\n");
- else
- dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc,
- "loop nest optimized\n");
+ changed = true;
+ if (graphite_regenerate_ast_isl (scop))
+ {
+ location_t loc = find_loop_location
+ (scops[i]->scop_info->region.entry->dest->loop_father);
+ dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc,
+ "loop nest optimized\n");
+ }
}
+ if (changed)
+ {
+ mark_virtual_operands_for_renaming (cfun);
+ update_ssa (TODO_update_ssa);
+ checking_verify_ssa (true, true);
+ rewrite_into_loop_closed_ssa (NULL, 0);
+ scev_reset ();
+ checking_verify_loop_structure ();
+ }
+
if (dump_file && (dump_flags & TDF_DETAILS))
{
loop_p loop;
@@ -461,9 +414,17 @@ graphite_transform_loops (void)
}
free_scops (scops);
- graphite_finalize (need_cfg_cleanup_p);
the_isl_ctx = NULL;
isl_ctx_free (ctx);
+
+ if (changed)
+ {
+ cleanup_tree_cfg ();
+ profile_status_for_fn (cfun) = PROFILE_ABSENT;
+ release_recorded_exits (cfun);
+ tree_estimate_probability (false);
+ }
+
}
#else /* If isl is not available: #ifndef HAVE_isl. */
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index d6dab57101b..f5c06a95bb6 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -3084,8 +3084,7 @@ ready_sort_real (struct ready_list *ready)
if (n_ready_real == 2)
swap_sort (first, n_ready_real);
else if (n_ready_real > 2)
- /* HACK: Disable qsort checking for now (PR82396). */
- (qsort) (first, n_ready_real, sizeof (rtx), rank_for_schedule);
+ qsort (first, n_ready_real, sizeof (rtx), rank_for_schedule);
if (sched_verbose >= 4)
{
@@ -3918,8 +3917,8 @@ sched_pressure_start_bb (basic_block bb)
- call_saved_regs_num[cl]). */
{
int i;
- int entry_freq = ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency;
- int bb_freq = bb->frequency;
+ int entry_freq = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.to_frequency (cfun);
+ int bb_freq = bb->count.to_frequency (cfun);
if (bb_freq == 0)
{
@@ -5569,9 +5568,7 @@ autopref_multipass_init (const rtx_insn *insn, int write)
gcc_assert (data->status == AUTOPREF_MULTIPASS_DATA_UNINITIALIZED);
data->base = NULL_RTX;
- data->min_offset = 0;
- data->max_offset = 0;
- data->multi_mem_insn_p = false;
+ data->offset = 0;
/* Set insn entry initialized, but not relevant for auto-prefetcher. */
data->status = AUTOPREF_MULTIPASS_DATA_IRRELEVANT;
@@ -5586,10 +5583,9 @@ autopref_multipass_init (const rtx_insn *insn, int write)
{
int n_elems = XVECLEN (pat, 0);
- int i = 0;
- rtx prev_base = NULL_RTX;
- int min_offset = 0;
- int max_offset = 0;
+ int i, offset;
+ rtx base, prev_base = NULL_RTX;
+ int min_offset = INT_MAX;
for (i = 0; i < n_elems; i++)
{
@@ -5597,38 +5593,23 @@ autopref_multipass_init (const rtx_insn *insn, int write)
if (GET_CODE (set) != SET)
return;
- rtx base = NULL_RTX;
- int offset = 0;
if (!analyze_set_insn_for_autopref (set, write, &base, &offset))
return;
- if (i == 0)
- {
- prev_base = base;
- min_offset = offset;
- max_offset = offset;
- }
/* Ensure that all memory operations in the PARALLEL use the same
base register. */
- else if (REGNO (base) != REGNO (prev_base))
+ if (i > 0 && REGNO (base) != REGNO (prev_base))
return;
- else
- {
- min_offset = MIN (min_offset, offset);
- max_offset = MAX (max_offset, offset);
- }
+ prev_base = base;
+ min_offset = MIN (min_offset, offset);
}
- /* If we reached here then we have a valid PARALLEL of multiple memory
- ops with prev_base as the base and min_offset and max_offset
- containing the offsets range. */
+  /* If we reached here, then we have a valid PARALLEL of multiple memory ops
+     with prev_base as the base and min_offset containing the offset.  */
gcc_assert (prev_base);
data->base = prev_base;
- data->min_offset = min_offset;
- data->max_offset = max_offset;
- data->multi_mem_insn_p = true;
+ data->offset = min_offset;
data->status = AUTOPREF_MULTIPASS_DATA_NORMAL;
-
return;
}
@@ -5638,7 +5619,7 @@ autopref_multipass_init (const rtx_insn *insn, int write)
return;
if (!analyze_set_insn_for_autopref (set, write, &data->base,
- &data->min_offset))
+ &data->offset))
return;
/* This insn is relevant for the auto-prefetcher.
@@ -5647,63 +5628,6 @@ autopref_multipass_init (const rtx_insn *insn, int write)
data->status = AUTOPREF_MULTIPASS_DATA_NORMAL;
}
-
-/* Helper for autopref_rank_for_schedule. Given the data of two
- insns relevant to the auto-prefetcher modelling code DATA1 and DATA2
- return their comparison result. Return 0 if there is no sensible
- ranking order for the two insns. */
-
-static int
-autopref_rank_data (autopref_multipass_data_t data1,
- autopref_multipass_data_t data2)
-{
- /* Simple case when both insns are simple single memory ops. */
- if (!data1->multi_mem_insn_p && !data2->multi_mem_insn_p)
- return data1->min_offset - data2->min_offset;
-
- /* Two load/store multiple insns. Return 0 if the offset ranges
- overlap and the difference between the minimum offsets otherwise. */
- else if (data1->multi_mem_insn_p && data2->multi_mem_insn_p)
- {
- int min1 = data1->min_offset;
- int max1 = data1->max_offset;
- int min2 = data2->min_offset;
- int max2 = data2->max_offset;
-
- if (max1 < min2 || min1 > max2)
- return min1 - min2;
- else
- return 0;
- }
-
- /* The other two cases is a pair of a load/store multiple and
- a simple memory op. Return 0 if the single op's offset is within the
- range of the multi-op insn and the difference between the single offset
- and the minimum offset of the multi-set insn otherwise. */
- else if (data1->multi_mem_insn_p && !data2->multi_mem_insn_p)
- {
- int max1 = data1->max_offset;
- int min1 = data1->min_offset;
-
- if (data2->min_offset >= min1
- && data2->min_offset <= max1)
- return 0;
- else
- return min1 - data2->min_offset;
- }
- else
- {
- int max2 = data2->max_offset;
- int min2 = data2->min_offset;
-
- if (data1->min_offset >= min2
- && data1->min_offset <= max2)
- return 0;
- else
- return data1->min_offset - min2;
- }
-}
-
/* Helper function for rank_for_schedule sorting. */
static int
autopref_rank_for_schedule (const rtx_insn *insn1, const rtx_insn *insn2)
@@ -5726,7 +5650,7 @@ autopref_rank_for_schedule (const rtx_insn *insn1, const rtx_insn *insn2)
int irrel2 = data2->status == AUTOPREF_MULTIPASS_DATA_IRRELEVANT;
if (!irrel1 && !irrel2)
- r = autopref_rank_data (data1, data2);
+ r = data1->offset - data2->offset;
else
r = irrel2 - irrel1;
}
@@ -5754,7 +5678,7 @@ autopref_multipass_dfa_lookahead_guard_1 (const rtx_insn *insn1,
return 0;
if (rtx_equal_p (data1->base, data2->base)
- && autopref_rank_data (data1, data2) > 0)
+ && data1->offset > data2->offset)
{
if (sched_verbose >= 2)
{
@@ -8217,8 +8141,6 @@ init_before_recovery (basic_block *before_recovery_ptr)
single->count = last->count;
empty->count = last->count;
- single->frequency = last->frequency;
- empty->frequency = last->frequency;
BB_COPY_PARTITION (single, last);
BB_COPY_PARTITION (empty, last);
@@ -8311,11 +8233,8 @@ sched_create_recovery_edges (basic_block first_bb, basic_block rec,
'todo_spec' variable in create_check_block_twin and
in sel-sched.c `check_ds' in create_speculation_check. */
e->probability = profile_probability::very_unlikely ();
- e->count = first_bb->count.apply_probability (e->probability);
- rec->count = e->count;
- rec->frequency = EDGE_FREQUENCY (e);
+ rec->count = e->count ();
e2->probability = e->probability.invert ();
- e2->count = first_bb->count - e2->count;
rtx_code_label *label = block_label (second_bb);
rtx_jump_insn *jump = emit_jump_insn_after (targetm.gen_jump (label),
diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 593b92f11c1..a462ae5aa11 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -4231,12 +4231,11 @@ gen_hsa_alloca (gcall *call, hsa_bb *hbb)
built_in_function fn = DECL_FUNCTION_CODE (gimple_call_fndecl (call));
- gcc_checking_assert (fn == BUILT_IN_ALLOCA
- || fn == BUILT_IN_ALLOCA_WITH_ALIGN);
+ gcc_checking_assert (ALLOCA_FUNCTION_CODE_P (fn));
unsigned bit_alignment = 0;
- if (fn == BUILT_IN_ALLOCA_WITH_ALIGN)
+ if (fn != BUILT_IN_ALLOCA)
{
tree alignment_tree = gimple_call_arg (call, 1);
if (TREE_CODE (alignment_tree) != INTEGER_CST)
@@ -5716,8 +5715,7 @@ gen_hsa_insns_for_call (gimple *stmt, hsa_bb *hbb)
break;
}
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
{
gen_hsa_alloca (call, hbb);
break;
@@ -6331,7 +6329,7 @@ convert_switch_statements (void)
tree label = gimple_switch_label (s, i);
basic_block label_bb = label_to_block_fn (func, CASE_LABEL (label));
edge e = find_edge (bb, label_bb);
- edge_counts.safe_push (e->count);
+ edge_counts.safe_push (e->count ());
edge_probabilities.safe_push (e->probability);
gphi_iterator phi_gsi;
@@ -6421,7 +6419,6 @@ convert_switch_statements (void)
if (prob_sum.initialized_p ())
new_edge->probability = edge_probabilities[i] / prob_sum;
- new_edge->count = edge_counts[i];
new_edges.safe_push (new_edge);
if (i < labels - 1)
@@ -6437,10 +6434,7 @@ convert_switch_statements (void)
edge next_edge = make_edge (cur_bb, next_bb, EDGE_FALSE_VALUE);
next_edge->probability = new_edge->probability.invert ();
- next_edge->count = edge_counts[0]
- + sum_slice <profile_count> (edge_counts, i, labels,
- profile_count::zero ());
- next_bb->frequency = EDGE_FREQUENCY (next_edge);
+ next_bb->count = next_edge->count ();
cur_bb = next_bb;
}
else /* Link last IF statement and default label
@@ -6448,7 +6442,6 @@ convert_switch_statements (void)
{
edge e = make_edge (cur_bb, default_label_bb, EDGE_FALSE_VALUE);
e->probability = new_edge->probability.invert ();
- e->count = edge_counts[0];
new_edges.safe_insert (0, e);
}
}
diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 47169270d6b..4041a5ba9ba 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -5283,8 +5283,6 @@ dead_or_predicable (basic_block test_bb, basic_block merge_bb,
redirect_edge_succ (BRANCH_EDGE (test_bb), new_dest);
if (reversep)
{
- std::swap (BRANCH_EDGE (test_bb)->count,
- FALLTHRU_EDGE (test_bb)->count);
std::swap (BRANCH_EDGE (test_bb)->probability,
FALLTHRU_EDGE (test_bb)->probability);
update_br_prob_note (test_bb);
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 4264bb81fe1..c4dcb7fb13e 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -1791,8 +1791,8 @@ expand_mul_overflow (location_t loc, tree lhs, tree arg0, tree arg1,
}
/* At this point hipart{0,1} are both in [-1, 0]. If they are
- the same, overflow happened if res is negative, if they are
- different, overflow happened if res is positive. */
+     the same, overflow happened if res is non-positive; if they
+     are different, overflow happened if res is positive.  */
if (op0_sign != 1 && op1_sign != 1 && op0_sign != op1_sign)
emit_jump (hipart_different);
else if (op0_sign == 1 || op1_sign == 1)
@@ -1800,7 +1800,7 @@ expand_mul_overflow (location_t loc, tree lhs, tree arg0, tree arg1,
NULL_RTX, NULL, hipart_different,
profile_probability::even ());
- do_compare_rtx_and_jump (res, const0_rtx, LT, false, mode,
+ do_compare_rtx_and_jump (res, const0_rtx, LE, false, mode,
NULL_RTX, NULL, do_error,
profile_probability::very_unlikely ());
emit_jump (done_label);
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index a98d35c5b74..43885e7988d 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -97,6 +97,11 @@ along with GCC; see the file COPYING3. If not see
DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
#endif
+#ifndef DEF_INTERNAL_FLT_FLOATN_FN
+#define DEF_INTERNAL_FLT_FLOATN_FN(NAME, FLAGS, OPTAB, TYPE) \
+ DEF_INTERNAL_FLT_FN (NAME, FLAGS, OPTAB, TYPE)
+#endif
+
#ifndef DEF_INTERNAL_INT_FN
#define DEF_INTERNAL_INT_FN(NAME, FLAGS, OPTAB, TYPE) \
DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
@@ -207,7 +212,7 @@ DEF_INTERNAL_FLT_FN (LOG2, ECF_CONST, log2, unary)
DEF_INTERNAL_FLT_FN (LOGB, ECF_CONST, logb, unary)
DEF_INTERNAL_FLT_FN (SIGNIFICAND, ECF_CONST, significand, unary)
DEF_INTERNAL_FLT_FN (SIN, ECF_CONST, sin, unary)
-DEF_INTERNAL_FLT_FN (SQRT, ECF_CONST, sqrt, unary)
+DEF_INTERNAL_FLT_FLOATN_FN (SQRT, ECF_CONST, sqrt, unary)
DEF_INTERNAL_FLT_FN (TAN, ECF_CONST, tan, unary)
/* FP rounding. */
@@ -220,13 +225,13 @@ DEF_INTERNAL_FLT_FN (TRUNC, ECF_CONST, btrunc, unary)
/* Binary math functions. */
DEF_INTERNAL_FLT_FN (ATAN2, ECF_CONST, atan2, binary)
-DEF_INTERNAL_FLT_FN (COPYSIGN, ECF_CONST, copysign, binary)
+DEF_INTERNAL_FLT_FLOATN_FN (COPYSIGN, ECF_CONST, copysign, binary)
DEF_INTERNAL_FLT_FN (FMOD, ECF_CONST, fmod, binary)
DEF_INTERNAL_FLT_FN (POW, ECF_CONST, pow, binary)
DEF_INTERNAL_FLT_FN (REMAINDER, ECF_CONST, remainder, binary)
DEF_INTERNAL_FLT_FN (SCALB, ECF_CONST, scalb, binary)
-DEF_INTERNAL_FLT_FN (FMIN, ECF_CONST, fmin, binary)
-DEF_INTERNAL_FLT_FN (FMAX, ECF_CONST, fmax, binary)
+DEF_INTERNAL_FLT_FLOATN_FN (FMIN, ECF_CONST, fmin, binary)
+DEF_INTERNAL_FLT_FLOATN_FN (FMAX, ECF_CONST, fmax, binary)
DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)
/* FP scales. */
@@ -331,6 +336,7 @@ DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF_LEAF, NULL)
#undef DEF_INTERNAL_INT_FN
#undef DEF_INTERNAL_FLT_FN
+#undef DEF_INTERNAL_FLT_FLOATN_FN
#undef DEF_INTERNAL_COND_OPTAB_FN
#undef DEF_INTERNAL_OPTAB_FN
#undef DEF_INTERNAL_FN
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index d23c1d8ba3e..24d2be79103 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -3257,6 +3257,8 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
if (dump_file)
fprintf (dump_file, "\n Propagating constants:\n\n");
+ max_count = profile_count::uninitialized ();
+
FOR_EACH_DEFINED_FUNCTION (node)
{
struct ipa_node_params *info = IPA_NODE_REF (node);
@@ -3270,8 +3272,7 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
}
if (node->definition && !node->alias)
overall_size += ipa_fn_summaries->get (node)->self_size;
- if (node->count > max_count)
- max_count = node->count;
+ max_count = max_count.max (node->count);
}
max_new_size = overall_size;
@@ -5125,7 +5126,7 @@ make_pass_ipa_cp (gcc::context *ctxt)
void
ipa_cp_c_finalize (void)
{
- max_count = profile_count::zero ();
+ max_count = profile_count::uninitialized ();
overall_size = 0;
max_new_size = 0;
}
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 076ccd40bd7..f6841104a32 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -542,6 +542,7 @@ void
ipa_call_summary::reset ()
{
call_stmt_size = call_stmt_time = 0;
+ is_return_callee_uncaptured = false;
if (predicate)
edge_predicate_pool.remove (predicate);
predicate = NULL;
@@ -1607,7 +1608,7 @@ static basic_block
get_minimal_bb (basic_block init_bb, basic_block use_bb)
{
struct loop *l = find_common_loop (init_bb->loop_father, use_bb->loop_father);
- if (l && l->header->frequency < init_bb->frequency)
+ if (l && l->header->count < init_bb->count)
return l->header;
return init_bb;
}
@@ -1663,20 +1664,21 @@ param_change_prob (gimple *stmt, int i)
{
int init_freq;
- if (!bb->frequency)
+ if (!bb->count.to_frequency (cfun))
return REG_BR_PROB_BASE;
if (SSA_NAME_IS_DEFAULT_DEF (base))
- init_freq = ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency;
+ init_freq = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.to_frequency (cfun);
else
init_freq = get_minimal_bb
(gimple_bb (SSA_NAME_DEF_STMT (base)),
- gimple_bb (stmt))->frequency;
+ gimple_bb (stmt))->count.to_frequency (cfun);
if (!init_freq)
init_freq = 1;
- if (init_freq < bb->frequency)
- return MAX (GCOV_COMPUTE_SCALE (init_freq, bb->frequency), 1);
+ if (init_freq < bb->count.to_frequency (cfun))
+ return MAX (GCOV_COMPUTE_SCALE (init_freq,
+ bb->count.to_frequency (cfun)), 1);
else
return REG_BR_PROB_BASE;
}
@@ -1691,7 +1693,7 @@ param_change_prob (gimple *stmt, int i)
if (init != error_mark_node)
return 0;
- if (!bb->frequency)
+ if (!bb->count.to_frequency (cfun))
return REG_BR_PROB_BASE;
ao_ref_init (&refd, op);
info.stmt = stmt;
@@ -1707,17 +1709,17 @@ param_change_prob (gimple *stmt, int i)
/* Assume that every memory is initialized at entry.
TODO: Can we easilly determine if value is always defined
and thus we may skip entry block? */
- if (ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency)
- max = ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency;
+ if (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.to_frequency (cfun))
+ max = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.to_frequency (cfun);
else
max = 1;
EXECUTE_IF_SET_IN_BITMAP (info.bb_set, 0, index, bi)
- max = MIN (max, BASIC_BLOCK_FOR_FN (cfun, index)->frequency);
+ max = MIN (max, BASIC_BLOCK_FOR_FN (cfun, index)->count.to_frequency (cfun));
BITMAP_FREE (info.bb_set);
- if (max < bb->frequency)
- return MAX (GCOV_COMPUTE_SCALE (max, bb->frequency), 1);
+ if (max < bb->count.to_frequency (cfun))
+ return MAX (GCOV_COMPUTE_SCALE (max, bb->count.to_frequency (cfun)), 1);
else
return REG_BR_PROB_BASE;
}
@@ -3204,6 +3206,10 @@ read_ipa_call_summary (struct lto_input_block *ib, struct cgraph_edge *e)
es->call_stmt_size = streamer_read_uhwi (ib);
es->call_stmt_time = streamer_read_uhwi (ib);
es->loop_depth = streamer_read_uhwi (ib);
+
+ bitpack_d bp = streamer_read_bitpack (ib);
+ es->is_return_callee_uncaptured = bp_unpack_value (&bp, 1);
+
p.stream_in (ib);
edge_set_predicate (e, &p);
length = streamer_read_uhwi (ib);
@@ -3360,6 +3366,11 @@ write_ipa_call_summary (struct output_block *ob, struct cgraph_edge *e)
streamer_write_uhwi (ob, es->call_stmt_size);
streamer_write_uhwi (ob, es->call_stmt_time);
streamer_write_uhwi (ob, es->loop_depth);
+
+ bitpack_d bp = bitpack_create (ob->main_stream);
+ bp_pack_value (&bp, es->is_return_callee_uncaptured, 1);
+ streamer_write_bitpack (&bp);
+
if (es->predicate)
es->predicate->stream_out (ob);
else
diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h
index f50d6806e61..a794bd09318 100644
--- a/gcc/ipa-fnsummary.h
+++ b/gcc/ipa-fnsummary.h
@@ -197,7 +197,9 @@ struct ipa_call_summary
int call_stmt_time;
/* Depth of loop nest, 0 means no nesting. */
unsigned int loop_depth;
-
+ /* Indicates whether the caller returns the value of its callee. */
+ bool is_return_callee_uncaptured;
+
/* Keep all field empty so summary dumping works during its computation.
This is useful for debugging. */
ipa_call_summary ()
diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 7b4cd9d49e8..cb66aa5f7a0 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -1422,6 +1422,7 @@ sem_function::init (void)
}
}
+ hstate.commit_flag ();
gcode_hash = hstate.end ();
bb_sizes.safe_push (nondbg_stmt_count);
@@ -1646,6 +1647,11 @@ sem_function::hash_stmt (gimple *stmt, inchash::hash &hstate)
if (gimple_op (stmt, i))
add_type (TREE_TYPE (gimple_op (stmt, i)), hstate);
}
+ /* Consider the nocf_check attribute in the hash, as it affects code
+ generation. */
+ if (code == GIMPLE_CALL
+ && flag_cf_protection & CF_BRANCH)
+ hstate.add_flag (gimple_call_nocf_check_p (as_a <gcall *> (stmt)));
default:
break;
}
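(A rough illustration, not taken from the patch; the names below are hypothetical.
Under -fcf-protection=branch, two otherwise-identical wrappers whose indirect
calls differ only in the nocf_check attribute now hash differently, since
merging them would add or drop the NOTRACK prefix on the call.)

    typedef void (*fn_t) (void);
    typedef void (*nt_fn_t) (void) __attribute__ ((nocf_check));

    void call_tracked (fn_t f)      { f (); }   /* CET-tracked indirect call */
    void call_untracked (nt_fn_t f) { f (); }   /* emitted with NOTRACK      */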
diff --git a/gcc/ipa-inline-transform.c b/gcc/ipa-inline-transform.c
index dc224f7a394..886e8edd473 100644
--- a/gcc/ipa-inline-transform.c
+++ b/gcc/ipa-inline-transform.c
@@ -676,9 +676,9 @@ inline_transform (struct cgraph_node *node)
{
profile_count num = node->count;
profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
- bool scale = num.initialized_p ()
- && (den > 0 || num == profile_count::zero ())
- && !(num == den);
+ bool scale = num.initialized_p () && den.ipa_p ()
+ && (den.nonzero_p () || num == profile_count::zero ())
+ && !(num == den.ipa ());
if (scale)
{
if (dump_file)
@@ -692,14 +692,7 @@ inline_transform (struct cgraph_node *node)
basic_block bb;
FOR_ALL_BB_FN (bb, cfun)
- {
- bb->count = bb->count.apply_scale (num, den);
-
- edge e;
- edge_iterator ei;
- FOR_EACH_EDGE (e, ei, bb->succs)
- e->count = e->count.apply_scale (num, den);
- }
+ bb->count = bb->count.apply_scale (num, den);
ENTRY_BLOCK_PTR_FOR_FN (cfun)->count = node->count;
}
todo = optimize_inline_calls (current_function_decl);
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index dd46cb61362..687996876ce 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -640,8 +640,8 @@ compute_uninlined_call_time (struct cgraph_edge *edge,
? edge->caller->global.inlined_to
: edge->caller);
- if (edge->count > profile_count::zero ()
- && caller->count > profile_count::zero ())
+ if (edge->count.nonzero_p ()
+ && caller->count.nonzero_p ())
uninlined_call_time *= (sreal)edge->count.to_gcov_type ()
/ caller->count.to_gcov_type ();
if (edge->frequency)
@@ -665,8 +665,8 @@ compute_inlined_call_time (struct cgraph_edge *edge,
: edge->caller);
sreal caller_time = ipa_fn_summaries->get (caller)->time;
- if (edge->count > profile_count::zero ()
- && caller->count > profile_count::zero ())
+ if (edge->count.nonzero_p ()
+ && caller->count.nonzero_p ())
time *= (sreal)edge->count.to_gcov_type () / caller->count.to_gcov_type ();
if (edge->frequency)
time *= cgraph_freq_base_rec * edge->frequency;
@@ -733,7 +733,7 @@ want_inline_small_function_p (struct cgraph_edge *e, bool report)
want_inline = false;
}
else if ((DECL_DECLARED_INLINE_P (callee->decl)
- || e->count > profile_count::zero ())
+ || e->count.nonzero_p ())
&& ipa_fn_summaries->get (callee)->min_size
- ipa_call_summaries->get (e)->call_stmt_size
> 16 * MAX_INLINE_INSNS_SINGLE)
@@ -843,7 +843,7 @@ want_inline_self_recursive_call_p (struct cgraph_edge *edge,
reason = "recursive call is cold";
want_inline = false;
}
- else if (outer_node->count == profile_count::zero ())
+ else if (!outer_node->count.nonzero_p ())
{
reason = "not executed in profile";
want_inline = false;
@@ -881,7 +881,7 @@ want_inline_self_recursive_call_p (struct cgraph_edge *edge,
int i;
for (i = 1; i < depth; i++)
max_prob = max_prob * max_prob / CGRAPH_FREQ_BASE;
- if (max_count > profile_count::zero () && edge->count > profile_count::zero ()
+ if (max_count.nonzero_p () && edge->count.nonzero_p ()
&& (edge->count.to_gcov_type () * CGRAPH_FREQ_BASE
/ outer_node->count.to_gcov_type ()
>= max_prob))
@@ -889,7 +889,7 @@ want_inline_self_recursive_call_p (struct cgraph_edge *edge,
reason = "profile of recursive call is too large";
want_inline = false;
}
- if (max_count == profile_count::zero ()
+ if (!max_count.nonzero_p ()
&& (edge->frequency * CGRAPH_FREQ_BASE / caller_freq
>= max_prob))
{
@@ -915,7 +915,7 @@ want_inline_self_recursive_call_p (struct cgraph_edge *edge,
methods. */
else
{
- if (max_count > profile_count::zero () && edge->count.initialized_p ()
+ if (max_count.nonzero_p () && edge->count.initialized_p ()
&& (edge->count.to_gcov_type () * 100
/ outer_node->count.to_gcov_type ()
<= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
@@ -923,7 +923,7 @@ want_inline_self_recursive_call_p (struct cgraph_edge *edge,
reason = "profile of recursive call is too small";
want_inline = false;
}
- else if ((max_count == profile_count::zero ()
+ else if ((!max_count.nonzero_p ()
|| !edge->count.initialized_p ())
&& (edge->frequency * 100 / caller_freq
<= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
@@ -1070,7 +1070,7 @@ edge_badness (struct cgraph_edge *edge, bool dump)
then calls without.
*/
else if (opt_for_fn (caller->decl, flag_guess_branch_prob)
- || caller->count > profile_count::zero ())
+ || caller->count.nonzero_p ())
{
sreal numerator, denominator;
int overall_growth;
@@ -1080,7 +1080,7 @@ edge_badness (struct cgraph_edge *edge, bool dump)
- inlined_time);
if (numerator == 0)
numerator = ((sreal) 1 >> 8);
- if (caller->count > profile_count::zero ())
+ if (caller->count.nonzero_p ())
numerator *= caller->count.to_gcov_type ();
else if (caller->count.initialized_p ())
numerator = numerator >> 11;
@@ -1521,7 +1521,7 @@ recursive_inlining (struct cgraph_edge *edge,
{
fprintf (dump_file,
" Inlining call of depth %i", depth);
- if (node->count > profile_count::zero ())
+ if (node->count.nonzero_p ())
{
fprintf (dump_file, " called approx. %.2f times per call",
(double)curr->count.to_gcov_type ()
@@ -1684,7 +1684,8 @@ resolve_noninline_speculation (edge_heap_t *edge_heap, struct cgraph_edge *edge)
? node->global.inlined_to : node;
auto_bitmap updated_nodes;
- spec_rem += edge->count;
+ if (edge->count.initialized_p ())
+ spec_rem += edge->count;
edge->resolve_speculation ();
reset_edge_caches (where);
ipa_update_overall_fn_summary (where);
@@ -1789,8 +1790,7 @@ inline_small_functions (void)
}
for (edge = node->callers; edge; edge = edge->next_caller)
- if (!(max_count >= edge->count))
- max_count = edge->count;
+ max_count = max_count.max (edge->count);
}
ipa_free_postorder_info ();
initialize_growth_caches ();
@@ -2049,7 +2049,7 @@ inline_small_functions (void)
update_caller_keys (&edge_heap, where, updated_nodes, NULL);
/* Offline copy count has possibly changed, recompute if profile is
available. */
- if (max_count > profile_count::zero ())
+ if (max_count.nonzero_p ())
{
struct cgraph_node *n = cgraph_node::get (edge->callee->decl);
if (n != edge->callee && n->analyzed)
@@ -2392,6 +2392,7 @@ ipa_inline (void)
ipa_dump_fn_summaries (dump_file);
nnodes = ipa_reverse_postorder (order);
+ spec_rem = profile_count::zero ();
FOR_EACH_FUNCTION (node)
{
@@ -2487,8 +2488,9 @@ ipa_inline (void)
next = edge->next_callee;
if (edge->speculative && !speculation_useful_p (edge, false))
{
+ if (edge->count.initialized_p ())
+ spec_rem += edge->count;
edge->resolve_speculation ();
- spec_rem += edge->count;
update = true;
remove_functions = true;
}
@@ -2526,9 +2528,6 @@ ipa_inline (void)
if (dump_file)
ipa_dump_fn_summaries (dump_file);
- /* In WPA we use inline summaries for partitioning process. */
- if (!flag_wpa)
- ipa_free_fn_summary ();
return remove_functions ? TODO_remove_functions : 0;
}
diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
index f149d0196fa..8eb03dd7c24 100644
--- a/gcc/ipa-profile.c
+++ b/gcc/ipa-profile.c
@@ -179,53 +179,54 @@ ipa_profile_generate_summary (void)
hash_table<histogram_hash> hashtable (10);
FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
- FOR_EACH_BB_FN (bb, DECL_STRUCT_FUNCTION (node->decl))
- {
- int time = 0;
- int size = 0;
- for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
- {
- gimple *stmt = gsi_stmt (gsi);
- if (gimple_code (stmt) == GIMPLE_CALL
- && !gimple_call_fndecl (stmt))
- {
- histogram_value h;
- h = gimple_histogram_value_of_type
- (DECL_STRUCT_FUNCTION (node->decl),
- stmt, HIST_TYPE_INDIR_CALL);
- /* No need to do sanity check: gimple_ic_transform already
- takes away bad histograms. */
- if (h)
- {
- /* counter 0 is target, counter 1 is number of execution we called target,
- counter 2 is total number of executions. */
- if (h->hvalue.counters[2])
- {
- struct cgraph_edge * e = node->get_edge (stmt);
- if (e && !e->indirect_unknown_callee)
- continue;
- e->indirect_info->common_target_id
- = h->hvalue.counters [0];
- e->indirect_info->common_target_probability
- = GCOV_COMPUTE_SCALE (h->hvalue.counters [1], h->hvalue.counters [2]);
- if (e->indirect_info->common_target_probability > REG_BR_PROB_BASE)
- {
- if (dump_file)
- fprintf (dump_file, "Probability capped to 1\n");
- e->indirect_info->common_target_probability = REG_BR_PROB_BASE;
- }
- }
- gimple_remove_histogram_value (DECL_STRUCT_FUNCTION (node->decl),
- stmt, h);
- }
- }
- time += estimate_num_insns (stmt, &eni_time_weights);
- size += estimate_num_insns (stmt, &eni_size_weights);
- }
- if (bb->count.initialized_p ())
- account_time_size (&hashtable, histogram, bb->count.to_gcov_type (),
- time, size);
- }
+ if (ENTRY_BLOCK_PTR_FOR_FN (DECL_STRUCT_FUNCTION (node->decl))->count.ipa_p ())
+ FOR_EACH_BB_FN (bb, DECL_STRUCT_FUNCTION (node->decl))
+ {
+ int time = 0;
+ int size = 0;
+ for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+ {
+ gimple *stmt = gsi_stmt (gsi);
+ if (gimple_code (stmt) == GIMPLE_CALL
+ && !gimple_call_fndecl (stmt))
+ {
+ histogram_value h;
+ h = gimple_histogram_value_of_type
+ (DECL_STRUCT_FUNCTION (node->decl),
+ stmt, HIST_TYPE_INDIR_CALL);
+ /* No need to do sanity check: gimple_ic_transform already
+ takes away bad histograms. */
+ if (h)
+ {
+ /* counter 0 is target, counter 1 is number of execution we called target,
+ counter 2 is total number of executions. */
+ if (h->hvalue.counters[2])
+ {
+ struct cgraph_edge * e = node->get_edge (stmt);
+ if (e && !e->indirect_unknown_callee)
+ continue;
+ e->indirect_info->common_target_id
+ = h->hvalue.counters [0];
+ e->indirect_info->common_target_probability
+ = GCOV_COMPUTE_SCALE (h->hvalue.counters [1], h->hvalue.counters [2]);
+ if (e->indirect_info->common_target_probability > REG_BR_PROB_BASE)
+ {
+ if (dump_file)
+ fprintf (dump_file, "Probability capped to 1\n");
+ e->indirect_info->common_target_probability = REG_BR_PROB_BASE;
+ }
+ }
+ gimple_remove_histogram_value (DECL_STRUCT_FUNCTION (node->decl),
+ stmt, h);
+ }
+ }
+ time += estimate_num_insns (stmt, &eni_time_weights);
+ size += estimate_num_insns (stmt, &eni_size_weights);
+ }
+ if (bb->count.ipa_p () && bb->count.initialized_p ())
+ account_time_size (&hashtable, histogram, bb->count.ipa ().to_gcov_type (),
+ time, size);
+ }
histogram.qsort (cmp_counts);
}
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index 915423559cb..bdc752207b1 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -56,6 +56,11 @@ along with GCC; see the file COPYING3. If not see
#include "tree-scalar-evolution.h"
#include "intl.h"
#include "opts.h"
+#include "ssa.h"
+#include "alloc-pool.h"
+#include "symbol-summary.h"
+#include "ipa-prop.h"
+#include "ipa-fnsummary.h"
/* Lattice values for const and pure functions. Everything starts out
being const, then may drop to pure and then neither depending on
@@ -67,7 +72,16 @@ enum pure_const_state_e
IPA_NEITHER
};
-const char *pure_const_names[3] = {"const", "pure", "neither"};
+static const char *pure_const_names[3] = {"const", "pure", "neither"};
+
+enum malloc_state_e
+{
+ STATE_MALLOC_TOP,
+ STATE_MALLOC,
+ STATE_MALLOC_BOTTOM
+};
+
+static const char *malloc_state_names[] = {"malloc_top", "malloc", "malloc_bottom"};
/* Holder for the const_state. There is one of these per function
decl. */
@@ -92,11 +106,13 @@ struct funct_state_d
/* If function can call free, munmap or otherwise make previously
non-trapping memory accesses trapping. */
bool can_free;
+
+ enum malloc_state_e malloc_state;
};
/* State used when we know nothing about function. */
static struct funct_state_d varying_state
- = { IPA_NEITHER, IPA_NEITHER, true, true, true, true };
+ = { IPA_NEITHER, IPA_NEITHER, true, true, true, true, STATE_MALLOC_BOTTOM };
typedef struct funct_state_d * funct_state;
@@ -216,6 +232,19 @@ warn_function_const (tree decl, bool known_finite)
known_finite, warned_about, "const");
}
+/* Emit suggestion about __attribute__((malloc)) for DECL. */
+
+static void
+warn_function_malloc (tree decl)
+{
+ static hash_set<tree> *warned_about;
+ warned_about
+ = suggest_attribute (OPT_Wsuggest_attribute_malloc, decl,
+ false, warned_about, "malloc");
+}
+
+/* Emit suggestion about __attribute__((noreturn)) for DECL. */
+
static void
warn_function_noreturn (tree decl)
{
@@ -518,8 +547,7 @@ special_builtin_state (enum pure_const_state_e *state, bool *looping,
{
case BUILT_IN_RETURN:
case BUILT_IN_UNREACHABLE:
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
case BUILT_IN_STACK_SAVE:
case BUILT_IN_STACK_RESTORE:
case BUILT_IN_EH_POINTER:
@@ -828,6 +856,149 @@ check_stmt (gimple_stmt_iterator *gsip, funct_state local, bool ipa)
}
}
+/* Check that RETVAL is used only in STMT and in comparisons against 0.
+ RETVAL is the return value of the function and STMT is the return stmt. */
+
+static bool
+check_retval_uses (tree retval, gimple *stmt)
+{
+ imm_use_iterator use_iter;
+ gimple *use_stmt;
+
+ FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, retval)
+ if (gcond *cond = dyn_cast<gcond *> (use_stmt))
+ {
+ tree op2 = gimple_cond_rhs (cond);
+ if (!integer_zerop (op2))
+ RETURN_FROM_IMM_USE_STMT (use_iter, false);
+ }
+ else if (gassign *ga = dyn_cast<gassign *> (use_stmt))
+ {
+ enum tree_code code = gimple_assign_rhs_code (ga);
+ if (TREE_CODE_CLASS (code) != tcc_comparison)
+ RETURN_FROM_IMM_USE_STMT (use_iter, false);
+ if (!integer_zerop (gimple_assign_rhs2 (ga)))
+ RETURN_FROM_IMM_USE_STMT (use_iter, false);
+ }
+ else if (is_gimple_debug (use_stmt))
+ ;
+ else if (use_stmt != stmt)
+ RETURN_FROM_IMM_USE_STMT (use_iter, false);
+
+ return true;
+}
+
+/* malloc_candidate_p() checks if FUN can possibly be annotated with malloc
+ attribute. Currently this function does a very conservative analysis.
+ FUN is considered to be a candidate if
+ 1) It returns a value of pointer type.
+ 2) SSA_NAME_DEF_STMT (return_value) is either a function call or
+ a phi, and element of phi is either NULL or
+ SSA_NAME_DEF_STMT(element) is function call.
+ 3) The return-value has immediate uses only within comparisons (gcond or gassign)
+ and return_stmt (and likewise a phi arg has immediate use only within comparison
+ or the phi stmt). */
+
+static bool
+malloc_candidate_p (function *fun, bool ipa)
+{
+ basic_block exit_block = EXIT_BLOCK_PTR_FOR_FN (fun);
+ edge e;
+ edge_iterator ei;
+ cgraph_node *node = cgraph_node::get_create (fun->decl);
+
+#define DUMP_AND_RETURN(reason) \
+{ \
+ if (dump_file && (dump_flags & TDF_DETAILS)) \
+ fprintf (dump_file, "%s", (reason)); \
+ return false; \
+}
+
+ if (EDGE_COUNT (exit_block->preds) == 0)
+ return false;
+
+ FOR_EACH_EDGE (e, ei, exit_block->preds)
+ {
+ gimple_stmt_iterator gsi = gsi_last_bb (e->src);
+ greturn *ret_stmt = dyn_cast<greturn *> (gsi_stmt (gsi));
+
+ if (!ret_stmt)
+ return false;
+
+ tree retval = gimple_return_retval (ret_stmt);
+ if (!retval)
+ DUMP_AND_RETURN("No return value.")
+
+ if (TREE_CODE (retval) != SSA_NAME
+ || TREE_CODE (TREE_TYPE (retval)) != POINTER_TYPE)
+ DUMP_AND_RETURN("Return value is not SSA_NAME or not a pointer type.")
+
+ if (!check_retval_uses (retval, ret_stmt))
+ DUMP_AND_RETURN("Return value has uses outside return stmt"
+ " and comparisons against 0.")
+
+ gimple *def = SSA_NAME_DEF_STMT (retval);
+ if (gcall *call_stmt = dyn_cast<gcall *> (def))
+ {
+ tree callee_decl = gimple_call_fndecl (call_stmt);
+ if (!callee_decl)
+ return false;
+
+ if (!ipa && !DECL_IS_MALLOC (callee_decl))
+ DUMP_AND_RETURN("callee_decl does not have malloc attribute for"
+ " non-ipa mode.")
+
+ cgraph_edge *cs = node->get_edge (call_stmt);
+ if (cs)
+ {
+ ipa_call_summary *es = ipa_call_summaries->get (cs);
+ gcc_assert (es);
+ es->is_return_callee_uncaptured = true;
+ }
+ }
+
+ else if (gphi *phi = dyn_cast<gphi *> (def))
+ for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
+ {
+ tree arg = gimple_phi_arg_def (phi, i);
+ if (TREE_CODE (arg) != SSA_NAME)
+ DUMP_AND_RETURN("phi arg is not SSA_NAME.")
+ if (!(arg == null_pointer_node || check_retval_uses (arg, phi)))
+ DUMP_AND_RETURN("phi arg has uses outside phi"
+ " and comparisons against 0.")
+
+ gimple *arg_def = SSA_NAME_DEF_STMT (arg);
+ gcall *call_stmt = dyn_cast<gcall *> (arg_def);
+ if (!call_stmt)
+ return false;
+ tree callee_decl = gimple_call_fndecl (call_stmt);
+ if (!callee_decl)
+ return false;
+ if (!ipa && !DECL_IS_MALLOC (callee_decl))
+ DUMP_AND_RETURN("callee_decl does not have malloc attribute for"
+ " non-ipa mode.")
+
+ cgraph_edge *cs = node->get_edge (call_stmt);
+ if (cs)
+ {
+ ipa_call_summary *es = ipa_call_summaries->get (cs);
+ gcc_assert (es);
+ es->is_return_callee_uncaptured = true;
+ }
+ }
+
+ else
+ DUMP_AND_RETURN("def_stmt of return value is not a call or phi-stmt.")
+ }
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, "\nFound %s to be candidate for malloc attribute\n",
+ IDENTIFIER_POINTER (DECL_NAME (fun->decl)));
+ return true;
+
+#undef DUMP_AND_RETURN
+}
+
/* This is the main routine for finding the reference patterns for
global variables within a function FN. */
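(Illustrative sketch only, not part of the patch; my_alloc and fallback_alloc
are hypothetical names.  A function of roughly this shape satisfies the
conditions above: the pointer return value is defined by calls, possibly merged
through a phi, and its only other uses are comparisons against 0.)

    #include <stdlib.h>

    void *my_alloc (size_t n)
    {
      void *p = malloc (n);        /* return value defined by a call       */
      if (p == 0)                  /* extra uses are comparisons against 0 */
        p = fallback_alloc (n);    /* phi argument also defined by a call  */
      return p;                    /* returned without being captured      */
    }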
@@ -937,6 +1108,14 @@ end:
if (TREE_NOTHROW (decl))
l->can_throw = false;
+ l->malloc_state = STATE_MALLOC_BOTTOM;
+ if (DECL_IS_MALLOC (decl))
+ l->malloc_state = STATE_MALLOC;
+ else if (ipa && malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), true))
+ l->malloc_state = STATE_MALLOC_TOP;
+ else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), false))
+ l->malloc_state = STATE_MALLOC;
+
pop_cfun ();
if (dump_file)
{
@@ -950,6 +1129,8 @@ end:
fprintf (dump_file, "Function is locally pure.\n");
if (l->can_free)
fprintf (dump_file, "Function can locally free.\n");
+ if (l->malloc_state == STATE_MALLOC)
+ fprintf (dump_file, "Function is locally malloc.\n");
}
return l;
}
@@ -1083,6 +1264,7 @@ pure_const_write_summary (void)
bp_pack_value (&bp, fs->looping, 1);
bp_pack_value (&bp, fs->can_throw, 1);
bp_pack_value (&bp, fs->can_free, 1);
+ bp_pack_value (&bp, fs->malloc_state, 2);
streamer_write_bitpack (&bp);
}
}
@@ -1143,6 +1325,9 @@ pure_const_read_summary (void)
fs->looping = bp_unpack_value (&bp, 1);
fs->can_throw = bp_unpack_value (&bp, 1);
fs->can_free = bp_unpack_value (&bp, 1);
+ fs->malloc_state
+ = (enum malloc_state_e) bp_unpack_value (&bp, 2);
+
if (dump_file)
{
int flags = flags_from_decl_or_type (node->decl);
@@ -1165,6 +1350,8 @@ pure_const_read_summary (void)
fprintf (dump_file," function is locally throwing\n");
if (fs->can_free)
fprintf (dump_file," function can locally free\n");
+ fprintf (dump_file, "\n malloc state: %s\n",
+ malloc_state_names[fs->malloc_state]);
}
}
@@ -1675,6 +1862,131 @@ propagate_nothrow (void)
free (order);
}
+/* Debugging function to dump state of malloc lattice. */
+
+DEBUG_FUNCTION
+static void
+dump_malloc_lattice (FILE *dump_file, const char *s)
+{
+ if (!dump_file)
+ return;
+
+ fprintf (dump_file, "\n\nMALLOC LATTICE %s:\n", s);
+ cgraph_node *node;
+ FOR_EACH_FUNCTION (node)
+ {
+ funct_state fs = get_function_state (node);
+ malloc_state_e state = fs->malloc_state;
+ fprintf (dump_file, "%s: %s\n", node->name (), malloc_state_names[state]);
+ }
+}
+
+/* Propagate malloc attribute across the callgraph. */
+
+static void
+propagate_malloc (void)
+{
+ cgraph_node *node;
+ FOR_EACH_FUNCTION (node)
+ {
+ if (DECL_IS_MALLOC (node->decl))
+ if (!has_function_state (node))
+ {
+ funct_state l = XCNEW (struct funct_state_d);
+ *l = varying_state;
+ l->malloc_state = STATE_MALLOC;
+ set_function_state (node, l);
+ }
+ }
+
+ dump_malloc_lattice (dump_file, "Initial");
+ struct cgraph_node **order
+ = XNEWVEC (struct cgraph_node *, symtab->cgraph_count);
+ int order_pos = ipa_reverse_postorder (order);
+ bool changed = true;
+
+ while (changed)
+ {
+ changed = false;
+ /* Walk in postorder. */
+ for (int i = order_pos - 1; i >= 0; --i)
+ {
+ cgraph_node *node = order[i];
+ if (node->alias
+ || !node->definition
+ || !has_function_state (node))
+ continue;
+
+ funct_state l = get_function_state (node);
+
+ /* FIXME: add support for indirect-calls. */
+ if (node->indirect_calls)
+ {
+ l->malloc_state = STATE_MALLOC_BOTTOM;
+ continue;
+ }
+
+ if (node->get_availability () <= AVAIL_INTERPOSABLE)
+ {
+ l->malloc_state = STATE_MALLOC_BOTTOM;
+ continue;
+ }
+
+ if (l->malloc_state == STATE_MALLOC_BOTTOM)
+ continue;
+
+ vec<cgraph_node *> callees = vNULL;
+ for (cgraph_edge *cs = node->callees; cs; cs = cs->next_callee)
+ {
+ ipa_call_summary *es = ipa_call_summaries->get (cs);
+ if (es && es->is_return_callee_uncaptured)
+ callees.safe_push (cs->callee);
+ }
+
+ malloc_state_e new_state = l->malloc_state;
+ for (unsigned j = 0; j < callees.length (); j++)
+ {
+ cgraph_node *callee = callees[j];
+ if (!has_function_state (callee))
+ {
+ new_state = STATE_MALLOC_BOTTOM;
+ break;
+ }
+ malloc_state_e callee_state = get_function_state (callee)->malloc_state;
+ if (new_state < callee_state)
+ new_state = callee_state;
+ }
+ if (new_state != l->malloc_state)
+ {
+ changed = true;
+ l->malloc_state = new_state;
+ }
+ }
+ }
+
+ FOR_EACH_DEFINED_FUNCTION (node)
+ if (has_function_state (node))
+ {
+ funct_state l = get_function_state (node);
+ if (!node->alias
+ && l->malloc_state == STATE_MALLOC
+ && !node->global.inlined_to)
+ {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, "Function %s found to be malloc\n",
+ node->name ());
+
+ bool malloc_decl_p = DECL_IS_MALLOC (node->decl);
+ node->set_malloc_flag (true);
+ if (!malloc_decl_p && warn_suggest_attribute_malloc)
+ warn_function_malloc (node->decl);
+ }
+ }
+
+ dump_malloc_lattice (dump_file, "after propagation");
+ ipa_free_postorder_info ();
+ free (order);
+}
/* Produce the global information by preforming a transitive closure
on the local information that was produced by generate_summary. */
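(A hedged example of what the propagation buys, with invented names: assume
xalloc ends up in STATE_MALLOC, e.g. because it carries the malloc attribute.
Its wrapper starts as a candidate, STATE_MALLOC_TOP, and the fixed-point loop
lowers it to STATE_MALLOC, so set_malloc_flag is applied and
-Wsuggest-attribute=malloc can suggest annotating it.)

    #include <stdlib.h>

    void *xalloc (size_t n) __attribute__ ((malloc));

    void *xalloc_checked (size_t n)     /* becomes malloc by propagation */
    {
      void *p = xalloc (n);
      if (!p)
        abort ();
      return p;
    }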
@@ -1693,6 +2005,7 @@ execute (function *)
/* Nothrow makes more function to not lead to return and improve
later analysis. */
propagate_nothrow ();
+ propagate_malloc ();
remove_p = propagate_pure_const ();
/* Cleanup. */
@@ -1700,6 +2013,10 @@ execute (function *)
if (has_function_state (node))
free (get_function_state (node));
funct_state_vec.release ();
+
+ /* In WPA we use inline summaries for partitioning process. */
+ if (!flag_wpa)
+ ipa_free_fn_summary ();
return remove_p ? TODO_remove_functions : 0;
}
@@ -1894,6 +2211,19 @@ pass_local_pure_const::execute (function *fun)
fprintf (dump_file, "Function found to be nothrow: %s\n",
current_function_name ());
}
+
+ if (l->malloc_state == STATE_MALLOC
+ && !DECL_IS_MALLOC (current_function_decl))
+ {
+ node->set_malloc_flag (true);
+ if (warn_suggest_attribute_malloc)
+ warn_function_malloc (node->decl);
+ changed = true;
+ if (dump_file)
+ fprintf (dump_file, "Function found to be malloc: %s\n",
+ node->name ());
+ }
+
free (l);
if (changed)
return execute_fixup_cfg ();
diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
index e3759d6c50e..252ea053e2a 100644
--- a/gcc/ipa-split.c
+++ b/gcc/ipa-split.c
@@ -444,7 +444,7 @@ consider_split (struct split_point *current, bitmap non_ssa_vars,
/* Do not split when we would end up calling function anyway. */
if (incoming_freq
- >= (ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency
+ >= (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.to_frequency (cfun)
* PARAM_VALUE (PARAM_PARTIAL_INLINING_ENTRY_PROBABILITY) / 100))
{
/* When profile is guessed, we can not expect it to give us
@@ -454,13 +454,14 @@ consider_split (struct split_point *current, bitmap non_ssa_vars,
is likely noticeable win. */
if (back_edge
&& profile_status_for_fn (cfun) != PROFILE_READ
- && incoming_freq < ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency)
+ && incoming_freq
+ < ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.to_frequency (cfun))
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file,
" Split before loop, accepting despite low frequencies %i %i.\n",
incoming_freq,
- ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency);
+ ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.to_frequency (cfun));
}
else
{
@@ -714,8 +715,10 @@ consider_split (struct split_point *current, bitmap non_ssa_vars,
out smallest size of header.
In future we might re-consider this heuristics. */
if (!best_split_point.split_bbs
- || best_split_point.entry_bb->frequency > current->entry_bb->frequency
- || (best_split_point.entry_bb->frequency == current->entry_bb->frequency
+ || best_split_point.entry_bb->count.to_frequency (cfun)
+ > current->entry_bb->count.to_frequency (cfun)
+ || (best_split_point.entry_bb->count.to_frequency (cfun)
+ == current->entry_bb->count.to_frequency (cfun)
&& best_split_point.split_size < current->split_size))
{
@@ -1285,8 +1288,7 @@ split_function (basic_block return_bb, struct split_point *split_point,
FOR_EACH_EDGE (e, ei, return_bb->preds)
if (bitmap_bit_p (split_point->split_bbs, e->src->index))
{
- new_return_bb->count += e->count;
- new_return_bb->frequency += EDGE_FREQUENCY (e);
+ new_return_bb->count += e->count ();
redirect_edge_and_branch (e, new_return_bb);
redirected = true;
break;
diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c
index 708710d6135..e9ab78cdabb 100644
--- a/gcc/ipa-utils.c
+++ b/gcc/ipa-utils.c
@@ -524,20 +524,36 @@ ipa_merge_profiles (struct cgraph_node *dst,
unsigned int i;
dstbb = BASIC_BLOCK_FOR_FN (dstcfun, srcbb->index);
- if (dstbb->count.initialized_p ())
- dstbb->count += srcbb->count;
- else
- dstbb->count = srcbb->count;
- for (i = 0; i < EDGE_COUNT (srcbb->succs); i++)
+
+ /* Either sum the profiles if both are IPA and not global0, or
+ pick the more informative one (that is, the nonzero IPA one if the other
+ is uninitialized, guessed or global0). */
+ if (!dstbb->count.ipa ().initialized_p ()
+ || (dstbb->count.ipa () == profile_count::zero ()
+ && (srcbb->count.ipa ().initialized_p ()
+ && !(srcbb->count.ipa () == profile_count::zero ()))))
{
- edge srce = EDGE_SUCC (srcbb, i);
- edge dste = EDGE_SUCC (dstbb, i);
- if (dstbb->count.initialized_p ())
- dste->count += srce->count;
- else
- dste->count = srce->count;
- if (dstbb->count > 0 && dste->count.initialized_p ())
- dste->probability = dste->count.probability_in (dstbb->count);
+ dstbb->count = srcbb->count;
+ for (i = 0; i < EDGE_COUNT (srcbb->succs); i++)
+ {
+ edge srce = EDGE_SUCC (srcbb, i);
+ edge dste = EDGE_SUCC (dstbb, i);
+ if (srce->probability.initialized_p ())
+ dste->probability = srce->probability;
+ }
+ }
+ else if (srcbb->count.ipa ().initialized_p ()
+ && !(srcbb->count.ipa () == profile_count::zero ()))
+ {
+ for (i = 0; i < EDGE_COUNT (srcbb->succs); i++)
+ {
+ edge srce = EDGE_SUCC (srcbb, i);
+ edge dste = EDGE_SUCC (dstbb, i);
+ dste->probability =
+ dste->probability * dstbb->count.probability_in (dstbb->count + srcbb->count)
+ + srce->probability * srcbb->count.probability_in (dstbb->count + srcbb->count);
+ }
+ dstbb->count += srcbb->count;
}
}
push_cfun (dstcfun);
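(Worked example with invented numbers: if dstbb carries an IPA count of 30 and
srcbb one of 10, a successor edge with probability 0.6 on the dst side and 0.2
on the src side is merged as 0.6 * 30/40 + 0.2 * 10/40 = 0.50, after which
dstbb's count becomes 30 + 10 = 40.)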
@@ -548,7 +564,7 @@ ipa_merge_profiles (struct cgraph_node *dst,
{
if (e->speculative)
continue;
- e->count = gimple_bb (e->call_stmt)->count;
+ e->count = gimple_bb (e->call_stmt)->count.ipa ();
e->frequency = compute_call_stmt_bb_frequency
(dst->decl,
gimple_bb (e->call_stmt));
@@ -626,7 +642,7 @@ ipa_merge_profiles (struct cgraph_node *dst,
ipa_ref *ref;
e2->speculative_call_info (direct, indirect, ref);
- e->count = count;
+ e->count = count.ipa ();
e->frequency = freq;
int prob = direct->count.probability_in (e->count)
.to_reg_br_prob_base ();
@@ -635,7 +651,7 @@ ipa_merge_profiles (struct cgraph_node *dst,
}
else
{
- e->count = count;
+ e->count = count.ipa ();
e->frequency = freq;
}
}
diff --git a/gcc/ira-build.c b/gcc/ira-build.c
index 366b83e6df1..67c0305a168 100644
--- a/gcc/ira-build.c
+++ b/gcc/ira-build.c
@@ -2202,7 +2202,8 @@ loop_compare_func (const void *v1p, const void *v2p)
return -1;
if (! l1->to_remove_p && l2->to_remove_p)
return 1;
- if ((diff = l1->loop->header->frequency - l2->loop->header->frequency) != 0)
+ if ((diff = l1->loop->header->count.to_frequency (cfun)
+ - l2->loop->header->count.to_frequency (cfun)) != 0)
return diff;
if ((diff = (int) loop_depth (l1->loop) - (int) loop_depth (l2->loop)) != 0)
return diff;
@@ -2260,7 +2261,7 @@ mark_loops_for_removal (void)
(ira_dump_file,
" Mark loop %d (header %d, freq %d, depth %d) for removal (%s)\n",
sorted_loops[i]->loop_num, sorted_loops[i]->loop->header->index,
- sorted_loops[i]->loop->header->frequency,
+ sorted_loops[i]->loop->header->count.to_frequency (cfun),
loop_depth (sorted_loops[i]->loop),
low_pressure_loop_node_p (sorted_loops[i]->parent)
&& low_pressure_loop_node_p (sorted_loops[i])
@@ -2293,7 +2294,7 @@ mark_all_loops_for_removal (void)
" Mark loop %d (header %d, freq %d, depth %d) for removal\n",
ira_loop_nodes[i].loop_num,
ira_loop_nodes[i].loop->header->index,
- ira_loop_nodes[i].loop->header->frequency,
+ ira_loop_nodes[i].loop->header->count.to_frequency (cfun),
loop_depth (ira_loop_nodes[i].loop));
}
}
diff --git a/gcc/ira-color.c b/gcc/ira-color.c
index 8be7f31c5e9..72f7dd9ba21 100644
--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -3004,14 +3004,13 @@ allocno_priority_compare_func (const void *v1p, const void *v2p)
{
ira_allocno_t a1 = *(const ira_allocno_t *) v1p;
ira_allocno_t a2 = *(const ira_allocno_t *) v2p;
- int pri1, pri2;
+ int pri1, pri2, diff;
/* Assign hard reg to static chain pointer pseudo first when
non-local goto is used. */
- if (non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a1)))
- return 1;
- else if (non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a2)))
- return -1;
+ if ((diff = (non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a2))
+ - non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a1)))) != 0)
+ return diff;
pri1 = allocno_priorities[ALLOCNO_NUM (a1)];
pri2 = allocno_priorities[ALLOCNO_NUM (a2)];
if (pri2 != pri1)
diff --git a/gcc/ira.c b/gcc/ira.c
index 4345f7595d0..93d02093757 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -4419,6 +4419,12 @@ rtx_moveable_p (rtx *loc, enum op_type type)
for a reason. */
return false;
+ case ASM_OPERANDS:
+ /* The same is true for volatile asm: it has unknown side effects, it
+ cannot be moved at will. */
+ if (MEM_VOLATILE_P (x))
+ return false;
+
default:
break;
}
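(Purely as an illustration: an asm such as

    asm volatile ("rdtsc" : "=a" (lo), "=d" (hi));

with lo and hi being whatever variables receive the counter, has side effects
the compiler cannot model, so rtx_moveable_p now rejects it and it stays where
it is, while a non-volatile asm still falls through to the default handling.)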
diff --git a/gcc/jit/ChangeLog b/gcc/jit/ChangeLog
index 24df99057a0..87b5473922a 100644
--- a/gcc/jit/ChangeLog
+++ b/gcc/jit/ChangeLog
@@ -1,3 +1,10 @@
+2017-10-31 David Malcolm <dmalcolm@redhat.com>
+
+ * docs/internals/index.rst (Running the test suite): Document
+ PRESERVE_EXECUTABLES.
+ (Running under valgrind): Add markup to RUN_UNDER_VALGRIND.
+ * docs/_build/texinfo/libgccjit.texi: Regenerate.
+
2017-10-04 David Malcolm <dmalcolm@redhat.com>
* docs/cp/topics/expressions.rst (Vector expressions): New
diff --git a/gcc/jit/docs/_build/texinfo/libgccjit.texi b/gcc/jit/docs/_build/texinfo/libgccjit.texi
index 344c93e4ccf..a3b206f2dd4 100644
--- a/gcc/jit/docs/_build/texinfo/libgccjit.texi
+++ b/gcc/jit/docs/_build/texinfo/libgccjit.texi
@@ -19,7 +19,7 @@
@copying
@quotation
-libgccjit 8.0.0 (experimental 20171004), October 04, 2017
+libgccjit 8.0.0 (experimental 20171031), October 31, 2017
David Malcolm
@@ -15016,7 +15016,12 @@ jit/build/gcc/testsuite/jit/jit.log
@noindent
-The test executables can be seen as:
+The test executables are normally deleted after each test is run. For
+debugging, they can be preserved by setting
+@geindex PRESERVE_EXECUTABLES
+@geindex environment variable; PRESERVE_EXECUTABLES
+@code{PRESERVE_EXECUTABLES}
+in the environment. If so, they can then be seen as:
@example
jit/build/gcc/testsuite/jit/*.exe
@@ -15029,7 +15034,9 @@ which can be run independently.
You can compile and run individual tests by passing "jit.exp=TESTNAME" to RUNTESTFLAGS e.g.:
@example
-[gcc] $ make check-jit RUNTESTFLAGS="-v -v -v jit.exp=test-factorial.c"
+[gcc] $ PRESERVE_EXECUTABLES= \
+ make check-jit \
+ RUNTESTFLAGS="-v -v -v jit.exp=test-factorial.c"
@end example
@noindent
@@ -15056,7 +15063,10 @@ and once a test has been compiled, you can debug it directly:
@subsection Running under valgrind
-The jit testsuite detects if RUN_UNDER_VALGRIND is present in the
+The jit testsuite detects if
+@geindex RUN_UNDER_VALGRIND
+@geindex environment variable; RUN_UNDER_VALGRIND
+@code{RUN_UNDER_VALGRIND} is present in the
environment (with any value). If it is present, it runs the test client
code under valgrind@footnote{http://valgrind.org},
specifcally, the default
diff --git a/gcc/jit/docs/internals/index.rst b/gcc/jit/docs/internals/index.rst
index cadf36283ef..4ad7f61f774 100644
--- a/gcc/jit/docs/internals/index.rst
+++ b/gcc/jit/docs/internals/index.rst
@@ -103,7 +103,9 @@ and detailed logs in:
jit/build/gcc/testsuite/jit/jit.log
-The test executables can be seen as:
+The test executables are normally deleted after each test is run. For
+debugging, they can be preserved by setting :envvar:`PRESERVE_EXECUTABLES`
+in the environment. If so, they can then be seen as:
.. code-block:: console
@@ -115,7 +117,9 @@ You can compile and run individual tests by passing "jit.exp=TESTNAME" to RUNTES
.. code-block:: console
- [gcc] $ make check-jit RUNTESTFLAGS="-v -v -v jit.exp=test-factorial.c"
+ [gcc] $ PRESERVE_EXECUTABLES= \
+ make check-jit \
+ RUNTESTFLAGS="-v -v -v jit.exp=test-factorial.c"
and once a test has been compiled, you can debug it directly:
@@ -130,7 +134,7 @@ and once a test has been compiled, you can debug it directly:
Running under valgrind
**********************
-The jit testsuite detects if RUN_UNDER_VALGRIND is present in the
+The jit testsuite detects if :envvar:`RUN_UNDER_VALGRIND` is present in the
environment (with any value). If it is present, it runs the test client
code under `valgrind <http://valgrind.org>`_,
specifcally, the default
diff --git a/gcc/langhooks.c b/gcc/langhooks.c
index c54b790f0cc..9b3212b90cf 100644
--- a/gcc/langhooks.c
+++ b/gcc/langhooks.c
@@ -266,8 +266,8 @@ lhd_gimplify_expr (tree *expr_p ATTRIBUTE_UNUSED,
}
/* lang_hooks.tree_size: Determine the size of a tree with code C,
- which is a language-specific tree code in category tcc_constant or
- tcc_exceptional. The default expects never to be called. */
+ which is a language-specific tree code in category tcc_constant,
+ tcc_exceptional or tcc_type. The default expects never to be called. */
size_t
lhd_tree_size (enum tree_code c ATTRIBUTE_UNUSED)
{
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index b0c9829a6cd..d1288f1965d 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -307,10 +307,10 @@ struct lang_hooks
/* Remove any parts of the tree that are used only by the FE. */
void (*free_lang_data) (tree);
- /* Determines the size of any language-specific tcc_constant or
- tcc_exceptional nodes. Since it is called from make_node, the
- only information available is the tree code. Expected to die
- on unrecognized codes. */
+ /* Determines the size of any language-specific tcc_constant,
+ tcc_exceptional or tcc_type nodes. Since it is called from
+ make_node, the only information available is the tree code.
+ Expected to die on unrecognized codes. */
size_t (*tree_size) (enum tree_code);
/* Return the language mask used for converting argv into a sequence
diff --git a/gcc/loop-doloop.c b/gcc/loop-doloop.c
index 5769d9deccb..3a1d838affd 100644
--- a/gcc/loop-doloop.c
+++ b/gcc/loop-doloop.c
@@ -393,9 +393,7 @@ add_test (rtx cond, edge *e, basic_block dest)
edge e2 = make_edge (bb, dest, (*e)->flags & ~EDGE_FALLTHRU);
e2->probability = prob;
- e2->count = e2->src->count.apply_probability (prob);
(*e)->probability = prob.invert ();
- (*e)->count = (*e)->count.apply_probability (prob);
update_br_prob_note (e2->src);
return true;
}
@@ -508,7 +506,6 @@ doloop_modify (struct loop *loop, struct niter_desc *desc,
set_immediate_dominator (CDI_DOMINATORS, new_preheader, preheader);
set_zero->count = profile_count::uninitialized ();
- set_zero->frequency = 0;
te = single_succ_edge (preheader);
for (; ass; ass = XEXP (ass, 1))
@@ -524,7 +521,6 @@ doloop_modify (struct loop *loop, struct niter_desc *desc,
also be very hard to show that it is impossible, so we must
handle this case. */
set_zero->count = preheader->count;
- set_zero->frequency = preheader->frequency;
}
if (EDGE_COUNT (set_zero->preds) == 0)
diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c
index 45e822980ff..1d0c66f2b2f 100644
--- a/gcc/loop-iv.c
+++ b/gcc/loop-iv.c
@@ -353,7 +353,7 @@ iv_get_reaching_def (rtx_insn *insn, rtx reg, df_ref *def)
adef = DF_REF_CHAIN (use)->ref;
/* We do not handle setting only part of the register. */
- if (DF_REF_FLAGS (adef) & (DF_REF_READ_WRITE | DF_REF_SUBREG))
+ if (DF_REF_FLAGS (adef) & DF_REF_READ_WRITE)
return GRD_INVALID;
def_insn = DF_REF_INSN (adef);
diff --git a/gcc/loop-unroll.c b/gcc/loop-unroll.c
index 322f151ac5d..91bf5dddeed 100644
--- a/gcc/loop-unroll.c
+++ b/gcc/loop-unroll.c
@@ -863,7 +863,7 @@ unroll_loop_runtime_iterations (struct loop *loop)
unsigned i, j;
profile_probability p;
basic_block preheader, *body, swtch, ezc_swtch = NULL;
- int may_exit_copy, iter_freq, new_freq;
+ int may_exit_copy;
profile_count iter_count, new_count;
unsigned n_peel;
edge e;
@@ -970,14 +970,11 @@ unroll_loop_runtime_iterations (struct loop *loop)
/* Record the place where switch will be built for preconditioning. */
swtch = split_edge (loop_preheader_edge (loop));
- /* Compute frequency/count increments for each switch block and initialize
+ /* Compute count increments for each switch block and initialize
innermost switch block. Switch blocks and peeled loop copies are built
from innermost outward. */
- iter_freq = new_freq = swtch->frequency / (max_unroll + 1);
iter_count = new_count = swtch->count.apply_scale (1, max_unroll + 1);
- swtch->frequency = new_freq;
swtch->count = new_count;
- single_succ_edge (swtch)->count = new_count;
for (i = 0; i < n_peel; i++)
{
@@ -996,10 +993,8 @@ unroll_loop_runtime_iterations (struct loop *loop)
p = profile_probability::always ().apply_scale (1, i + 2);
preheader = split_edge (loop_preheader_edge (loop));
- /* Add in frequency/count of edge from switch block. */
- preheader->frequency += iter_freq;
+ /* Add in count of edge from switch block. */
preheader->count += iter_count;
- single_succ_edge (preheader)->count = preheader->count;
branch_code = compare_and_jump_seq (copy_rtx (niter), GEN_INT (j), EQ,
block_label (preheader), p,
NULL);
@@ -1011,14 +1006,10 @@ unroll_loop_runtime_iterations (struct loop *loop)
swtch = split_edge_and_insert (single_pred_edge (swtch), branch_code);
set_immediate_dominator (CDI_DOMINATORS, preheader, swtch);
single_succ_edge (swtch)->probability = p.invert ();
- single_succ_edge (swtch)->count = new_count;
- new_freq += iter_freq;
new_count += iter_count;
- swtch->frequency = new_freq;
swtch->count = new_count;
e = make_edge (swtch, preheader,
single_succ_edge (swtch)->flags & EDGE_IRREDUCIBLE_LOOP);
- e->count = iter_count;
e->probability = p;
}
@@ -1028,14 +1019,11 @@ unroll_loop_runtime_iterations (struct loop *loop)
p = profile_probability::always ().apply_scale (1, max_unroll + 1);
swtch = ezc_swtch;
preheader = split_edge (loop_preheader_edge (loop));
- /* Recompute frequency/count adjustments since initial peel copy may
+ /* Recompute count adjustments since initial peel copy may
have exited and reduced those values that were computed above. */
- iter_freq = swtch->frequency / (max_unroll + 1);
iter_count = swtch->count.apply_scale (1, max_unroll + 1);
- /* Add in frequency/count of edge from switch block. */
- preheader->frequency += iter_freq;
+ /* Add in count of edge from switch block. */
preheader->count += iter_count;
- single_succ_edge (preheader)->count = preheader->count;
branch_code = compare_and_jump_seq (copy_rtx (niter), const0_rtx, EQ,
block_label (preheader), p,
NULL);
@@ -1044,10 +1032,8 @@ unroll_loop_runtime_iterations (struct loop *loop)
swtch = split_edge_and_insert (single_succ_edge (swtch), branch_code);
set_immediate_dominator (CDI_DOMINATORS, preheader, swtch);
single_succ_edge (swtch)->probability = p.invert ();
- single_succ_edge (swtch)->count -= iter_count;
e = make_edge (swtch, preheader,
single_succ_edge (swtch)->flags & EDGE_IRREDUCIBLE_LOOP);
- e->count = iter_count;
e->probability = p;
}
diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
index dd853d799bc..0e76a718a1e 100644
--- a/gcc/lower-subreg.c
+++ b/gcc/lower-subreg.c
@@ -670,7 +670,7 @@ simplify_gen_subreg_concatn (machine_mode outermode, rtx op,
if (must_eq (GET_MODE_SIZE (GET_MODE (op)),
GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))))
- && known_zero (SUBREG_BYTE (op)))
+ && must_eq (SUBREG_BYTE (op), 0))
return simplify_gen_subreg_concatn (outermode, SUBREG_REG (op),
GET_MODE (SUBREG_REG (op)), byte);
@@ -869,7 +869,7 @@ resolve_simple_move (rtx set, rtx_insn *insn)
if (GET_CODE (src) == SUBREG
&& resolve_reg_p (SUBREG_REG (src))
- && (maybe_nonzero (SUBREG_BYTE (src))
+ && (may_ne (SUBREG_BYTE (src), 0)
|| may_ne (orig_size, GET_MODE_SIZE (GET_MODE (SUBREG_REG (src))))))
{
real_dest = dest;
@@ -883,7 +883,7 @@ resolve_simple_move (rtx set, rtx_insn *insn)
if (GET_CODE (dest) == SUBREG
&& resolve_reg_p (SUBREG_REG (dest))
- && (maybe_nonzero (SUBREG_BYTE (dest))
+ && (may_ne (SUBREG_BYTE (dest), 0)
|| may_ne (orig_size, GET_MODE_SIZE (GET_MODE (SUBREG_REG (dest))))))
{
rtx reg, smove;
diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index ff192733955..c2902fd682f 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -3170,7 +3170,7 @@ equiv_address_substitution (struct address_info *ad)
change_p = true;
}
}
- if (maybe_nonzero (disp))
+ if (may_ne (disp, 0))
{
if (ad->disp != NULL)
*ad->disp = plus_constant (GET_MODE (*ad->inner), *ad->disp, disp);
@@ -4013,7 +4013,7 @@ curr_insn_transform (bool check_only_p)
if (INSN_CODE (curr_insn) >= 0
&& (p = get_insn_name (INSN_CODE (curr_insn))) != NULL)
fprintf (lra_dump_file, " {%s}", p);
- if (maybe_nonzero (curr_id->sp_offset))
+ if (may_ne (curr_id->sp_offset, 0))
{
fprintf (lra_dump_file, " (sp_off=");
print_dec (curr_id->sp_offset, lra_dump_file);
@@ -4224,8 +4224,9 @@ curr_insn_transform (bool check_only_p)
reg = SUBREG_REG (*loc);
poly_int64 byte = SUBREG_BYTE (*loc);
if (REG_P (reg)
- /* Strict_low_part requires reload the register not
- the sub-register. */
+ /* Strict_low_part requires reloading the register and not
+ just the subreg. Likewise for a strict subreg no wider
+ than a word for WORD_REGISTER_OPERATIONS targets. */
&& (curr_static_id->operand[i].strict_low
|| (!paradoxical_subreg_p (mode, GET_MODE (reg))
&& (hard_regno
@@ -4236,7 +4237,11 @@ curr_insn_transform (bool check_only_p)
&& (goal_alt[i] == NO_REGS
|| (simplify_subreg_regno
(ira_class_hard_regs[goal_alt[i]][0],
- GET_MODE (reg), byte, mode) >= 0)))))
+ GET_MODE (reg), byte, mode) >= 0)))
+ || (partial_subreg_p (mode, GET_MODE (reg))
+ && must_le (GET_MODE_SIZE (GET_MODE (reg)),
+ UNITS_PER_WORD)
+ && WORD_REGISTER_OPERATIONS)))
{
/* An OP_INOUT is required when reloading a subreg of a
mode wider than a word to ensure that data beyond the
@@ -4283,7 +4288,13 @@ curr_insn_transform (bool check_only_p)
}
else if (curr_static_id->operand[i].type == OP_IN
&& (curr_static_id->operand[goal_alt_matched[i][0]].type
- == OP_OUT))
+ == OP_OUT
+ || (curr_static_id->operand[goal_alt_matched[i][0]].type
+ == OP_INOUT
+ && (operands_match_p
+ (*curr_id->operand_loc[i],
+ *curr_id->operand_loc[goal_alt_matched[i][0]],
+ -1)))))
{
/* generate reloads for input and matched outputs. */
match_inputs[0] = i;
@@ -4294,9 +4305,14 @@ curr_insn_transform (bool check_only_p)
[goal_alt_number * n_operands + goal_alt_matched[i][0]]
.earlyclobber);
}
- else if (curr_static_id->operand[i].type == OP_OUT
+ else if ((curr_static_id->operand[i].type == OP_OUT
+ || (curr_static_id->operand[i].type == OP_INOUT
+ && (operands_match_p
+ (*curr_id->operand_loc[i],
+ *curr_id->operand_loc[goal_alt_matched[i][0]],
+ -1))))
&& (curr_static_id->operand[goal_alt_matched[i][0]].type
- == OP_IN))
+ == OP_IN))
/* Generate reloads for output and matched inputs. */
match_reload (i, goal_alt_matched[i], outputs, goal_alt[i], &before,
&after, curr_static_id->operand_alternative
diff --git a/gcc/lra-eliminations.c b/gcc/lra-eliminations.c
index b958190c2c7..bea8b023b7c 100644
--- a/gcc/lra-eliminations.c
+++ b/gcc/lra-eliminations.c
@@ -264,7 +264,7 @@ get_elimination (rtx reg)
if ((ep = elimination_map[hard_regno]) != NULL)
return ep->from_rtx != reg ? NULL : ep;
poly_int64 offset = self_elim_offsets[hard_regno];
- if (known_zero (offset))
+ if (must_eq (offset, 0))
return NULL;
/* This is an iteration to restore offsets just after HARD_REGNO
stopped to be eliminable. */
@@ -340,7 +340,7 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
int copied = 0;
lra_assert (!update_p || !full_p);
- lra_assert (known_zero (update_sp_offset)
+ lra_assert (must_eq (update_sp_offset, 0)
|| (!subst_p && update_p && !full_p));
if (! current_function_decl)
return x;
@@ -366,7 +366,7 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
{
rtx to = subst_p ? ep->to_rtx : ep->from_rtx;
- if (maybe_nonzero (update_sp_offset))
+ if (may_ne (update_sp_offset, 0))
{
if (ep->to_rtx == stack_pointer_rtx)
return plus_constant (Pmode, to, update_sp_offset);
@@ -399,7 +399,7 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
if (! update_p && ! full_p)
return gen_rtx_PLUS (Pmode, to, XEXP (x, 1));
- if (maybe_nonzero (update_sp_offset))
+ if (may_ne (update_sp_offset, 0))
offset = ep->to_rtx == stack_pointer_rtx ? update_sp_offset : 0;
else
offset = (update_p
@@ -456,7 +456,7 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
{
rtx to = subst_p ? ep->to_rtx : ep->from_rtx;
- if (maybe_nonzero (update_sp_offset))
+ if (may_ne (update_sp_offset, 0))
{
if (ep->to_rtx == stack_pointer_rtx)
return plus_constant (Pmode,
@@ -952,7 +952,7 @@ eliminate_regs_in_insn (rtx_insn *insn, bool replace_p, bool first_p,
/* We should never process such insn with non-zero
UPDATE_SP_OFFSET. */
- lra_assert (known_zero (update_sp_offset));
+ lra_assert (must_eq (update_sp_offset, 0));
if (remove_reg_equal_offset_note (insn, ep->to_rtx, &offset)
|| strip_offset (src, &offset) == ep->to_rtx)
@@ -1032,7 +1032,7 @@ eliminate_regs_in_insn (rtx_insn *insn, bool replace_p, bool first_p,
if (! replace_p)
{
- if (known_zero (update_sp_offset))
+ if (must_eq (update_sp_offset, 0))
offset += (ep->offset - ep->previous_offset);
if (ep->to_rtx == stack_pointer_rtx)
{
@@ -1051,7 +1051,7 @@ eliminate_regs_in_insn (rtx_insn *insn, bool replace_p, bool first_p,
the cost of the insn by replacing a simple REG with (plus
(reg sp) CST). So try only when we already had a PLUS
before. */
- if (known_zero (offset) || plus_src)
+ if (must_eq (offset, 0) || plus_src)
{
rtx new_src = plus_constant (GET_MODE (to_rtx), to_rtx, offset);
@@ -1239,7 +1239,7 @@ update_reg_eliminate (bitmap insns_with_changed_offsets)
if (lra_dump_file != NULL)
fprintf (lra_dump_file, " Using elimination %d to %d now\n",
ep1->from, ep1->to);
- lra_assert (known_zero (ep1->previous_offset));
+ lra_assert (must_eq (ep1->previous_offset, 0));
ep1->previous_offset = ep->offset;
}
else
@@ -1251,7 +1251,7 @@ update_reg_eliminate (bitmap insns_with_changed_offsets)
fprintf (lra_dump_file, " %d is not eliminable at all\n",
ep->from);
self_elim_offsets[ep->from] = -ep->offset;
- if (maybe_nonzero (ep->offset))
+ if (may_ne (ep->offset, 0))
bitmap_ior_into (insns_with_changed_offsets,
&lra_reg_info[ep->from].insn_bitmap);
}
@@ -1357,13 +1357,13 @@ init_elimination (void)
if (NONDEBUG_INSN_P (insn))
{
mark_not_eliminable (PATTERN (insn), VOIDmode);
- if (maybe_nonzero (curr_sp_change)
+ if (may_ne (curr_sp_change, 0)
&& find_reg_note (insn, REG_LABEL_OPERAND, NULL_RTX))
stop_to_sp_elimination_p = true;
}
}
if (! frame_pointer_needed
- && (maybe_nonzero (curr_sp_change) || stop_to_sp_elimination_p)
+ && (may_ne (curr_sp_change, 0) || stop_to_sp_elimination_p)
&& bb->succs && bb->succs->length () != 0)
for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++)
if (ep->to == STACK_POINTER_REGNUM)
diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
index 4648eca5ace..df7e2537dd0 100644
--- a/gcc/lra-lives.c
+++ b/gcc/lra-lives.c
@@ -220,6 +220,9 @@ lra_intersected_live_ranges_p (lra_live_range_t r1, lra_live_range_t r2)
return false;
}
+/* The corresponding bitmaps of BB currently being processed. */
+static bitmap bb_killed_pseudos, bb_gen_pseudos;
+
/* The function processing birth of hard register REGNO. It updates
living hard regs, START_LIVING, and conflict hard regs for living
pseudos. Conflict hard regs for the pic pseudo is not updated if
@@ -243,6 +246,8 @@ make_hard_regno_born (int regno, bool check_pic_pseudo_p ATTRIBUTE_UNUSED)
|| i != REGNO (pic_offset_table_rtx))
#endif
SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs, regno);
+ if (fixed_regs[regno])
+ bitmap_set_bit (bb_gen_pseudos, regno);
}
/* Process the death of hard register REGNO. This updates
@@ -255,6 +260,11 @@ make_hard_regno_dead (int regno)
return;
sparseset_set_bit (start_dying, regno);
CLEAR_HARD_REG_BIT (hard_regs_live, regno);
+ if (fixed_regs[regno])
+ {
+ bitmap_clear_bit (bb_gen_pseudos, regno);
+ bitmap_set_bit (bb_killed_pseudos, regno);
+ }
}
/* Mark pseudo REGNO as living at program point POINT, update conflicting
@@ -299,9 +309,6 @@ mark_pseudo_dead (int regno, int point)
}
}
-/* The corresponding bitmaps of BB currently being processed. */
-static bitmap bb_killed_pseudos, bb_gen_pseudos;
-
/* Mark register REGNO (pseudo or hard register) in MODE as live at
program point POINT. Update BB_GEN_PSEUDOS.
Return TRUE if the liveness tracking sets were modified, or FALSE
diff --git a/gcc/lra-remat.c b/gcc/lra-remat.c
index 2828d42cd13..d549271128d 100644
--- a/gcc/lra-remat.c
+++ b/gcc/lra-remat.c
@@ -1242,7 +1242,7 @@ do_remat (void)
if (remat_insn != NULL)
{
poly_int64 sp_offset_change = cand_sp_offset - id->sp_offset;
- if (maybe_nonzero (sp_offset_change))
+ if (may_ne (sp_offset_change, 0))
change_sp_offset (remat_insn, sp_offset_change);
update_scratch_ops (remat_insn);
lra_process_new_insns (insn, remat_insn, NULL,
diff --git a/gcc/lra.c b/gcc/lra.c
index 64c5cfbea1c..8d44c75b0b4 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -820,7 +820,8 @@ collect_non_operand_hard_regs (rtx *x, lra_insn_recog_data_t data,
const char *fmt = GET_RTX_FORMAT (code);
for (i = 0; i < data->insn_static_data->n_operands; i++)
- if (x == data->operand_loc[i])
+ if (! data->insn_static_data->operand[i].is_operator
+ && x == data->operand_loc[i])
/* It is an operand loc. Stop here. */
return list;
for (i = 0; i < data->insn_static_data->n_dups; i++)
@@ -2371,7 +2372,7 @@ lra (FILE *f)
bitmap_initialize (&lra_optional_reload_pseudos, &reg_obstack);
bitmap_initialize (&lra_subreg_reload_pseudos, &reg_obstack);
live_p = false;
- if (maybe_nonzero (get_frame_size ()) && crtl->stack_alignment_needed)
+ if (may_ne (get_frame_size (), 0) && crtl->stack_alignment_needed)
/* If we have a stack frame, we must align it now. The stack size
may be a part of the offset computation for register
elimination. */
diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
index babdcfb7cf6..4682be089c4 100644
--- a/gcc/lto-streamer-in.c
+++ b/gcc/lto-streamer-in.c
@@ -715,8 +715,7 @@ make_new_block (struct function *fn, unsigned int index)
static void
input_cfg (struct lto_input_block *ib, struct data_in *data_in,
- struct function *fn,
- int count_materialization_scale)
+ struct function *fn)
{
unsigned int bb_count;
basic_block p_bb;
@@ -756,13 +755,10 @@ input_cfg (struct lto_input_block *ib, struct data_in *data_in,
unsigned int edge_flags;
basic_block dest;
profile_probability probability;
- profile_count count;
edge e;
dest_index = streamer_read_uhwi (ib);
probability = profile_probability::stream_in (ib);
- count = profile_count::stream_in (ib).apply_scale
- (count_materialization_scale, REG_BR_PROB_BASE);
edge_flags = streamer_read_uhwi (ib);
dest = BASIC_BLOCK_FOR_FN (fn, dest_index);
@@ -772,7 +768,6 @@ input_cfg (struct lto_input_block *ib, struct data_in *data_in,
e = make_edge (bb, dest, edge_flags);
e->probability = probability;
- e->count = count;
}
index = streamer_read_hwi (ib);
@@ -1070,7 +1065,7 @@ input_function (tree fn_decl, struct data_in *data_in,
if (!node)
node = cgraph_node::create (fn_decl);
input_struct_function_base (fn, data_in, ib);
- input_cfg (ib_cfg, data_in, fn, node->count_materialization_scale);
+ input_cfg (ib_cfg, data_in, fn);
/* Read all the SSA names. */
input_ssa_names (ib, data_in, fn);
@@ -1197,6 +1192,7 @@ input_function (tree fn_decl, struct data_in *data_in,
gimple_set_body (fn_decl, bb_seq (ei_edge (ei)->dest));
}
+ counts_to_freqs ();
fixup_call_stmt_edges (node, stmts);
execute_all_ipa_stmt_fixups (node, stmts);
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index cf98eed52a3..12a3249debb 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1885,7 +1885,6 @@ output_cfg (struct output_block *ob, struct function *fn)
{
streamer_write_uhwi (ob, e->dest->index);
e->probability.stream_out (ob);
- e->count.stream_out (ob);
streamer_write_uhwi (ob, e->flags);
}
}
diff --git a/gcc/lto/ChangeLog b/gcc/lto/ChangeLog
index 3e6b00bc487..173cde67369 100644
--- a/gcc/lto/ChangeLog
+++ b/gcc/lto/ChangeLog
@@ -1,3 +1,7 @@
+2017-10-13 Jan Hubicka <hubicka@ucw.cz>
+
+ * lto-lang.c (lto_post_options): Clean shlib flag when not doing PIC.
+
2017-10-11 Nathan Sidwell <nathan@acm.org>
* lto.c (mentions_vars_p_decl_with_vis): Use
diff --git a/gcc/lto/lto-lang.c b/gcc/lto/lto-lang.c
index 6062deea5f3..a2f2c931338 100644
--- a/gcc/lto/lto-lang.c
+++ b/gcc/lto/lto-lang.c
@@ -854,11 +854,13 @@ lto_post_options (const char **pfilename ATTRIBUTE_UNUSED)
flag_pie is 2. */
flag_pie = MAX (flag_pie, flag_pic);
flag_pic = flag_pie;
+ flag_shlib = 0;
break;
case LTO_LINKER_OUTPUT_EXEC: /* Normal executable */
flag_pic = 0;
flag_pie = 0;
+ flag_shlib = 0;
break;
case LTO_LINKER_OUTPUT_UNKNOWN:
diff --git a/gcc/machmode.def b/gcc/machmode.def
index afe685195ef..dcf10565958 100644
--- a/gcc/machmode.def
+++ b/gcc/machmode.def
@@ -142,6 +142,12 @@ along with GCC; see the file COPYING3. If not see
than two bytes (if CLASS is FLOAT). CLASS must be INT or
FLOAT. The names follow the same rule as VECTOR_MODE uses.
+ VECTOR_BOOL_MODE (COUNT, BYTESIZE)
+ Create a vector of booleans with COUNT elements and BYTESIZE bytes.
+ Each boolean occupies (BYTESIZE * BITS_PER_UNIT) / COUNT bits,
+ with the element at index 0 occupying the lsb of the first byte
+ in memory. Only the lowest bit of each element is significant.
+
COMPLEX_MODES (CLASS);
For all modes presently declared in class CLASS, construct
corresponding complex modes. Modes smaller than one byte
@@ -163,6 +169,12 @@ along with GCC; see the file COPYING3. If not see
Unlike a FORMAT argument, if you are adjusting a float format
you must put an & in front of the name of each format structure.
+ ADJUST_NUNITS (MODE, EXPR);
+ Like the above, but set the number of units (nunits) of MODE to EXPR.
+ This changes the size and precision of the mode in proportion
+ to the change in the number of units; for example, doubling
+ the number of units doubles the size and precision as well.
+
Note: If a mode is ever made which is more than 255 bytes wide,
machmode.h and genmodes.c will have to be changed to allocate
more space for the mode_size and mode_alignment arrays. */
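To make the VECTOR_BOOL_MODE layout described above concrete, here is a small standalone sketch (ordinary C++, not GCC internals, and assuming the common 1-bit-per-element case): element 0 lands in the least significant bit of the first byte, and only the lowest bit of each element is significant.

#include <cstdint>
#include <cstdio>
#include <vector>

/* Pack COUNT booleans into BYTESIZE bytes, lsb-first within each byte.  */
static std::vector<uint8_t> pack_bool_vector (const std::vector<bool> &elts,
                                              unsigned bytesize)
{
  std::vector<uint8_t> bytes (bytesize, 0);
  for (unsigned i = 0; i < elts.size (); ++i)
    if (elts[i])
      bytes[i / 8] |= uint8_t (1) << (i % 8);
  return bytes;
}

int main ()
{
  /* 16 booleans in 2 bytes, i.e. 1 bit per element.  */
  std::vector<bool> v (16);
  v[0] = v[3] = v[15] = true;
  std::vector<uint8_t> bytes = pack_bool_vector (v, 2);
  printf ("%02x %02x\n", (unsigned) bytes[0], (unsigned) bytes[1]);  /* 09 80 */
  return 0;
}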
diff --git a/gcc/match.pd b/gcc/match.pd
index 11c04dba77d..63566df3205 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1514,12 +1514,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
t1 = TYPE_OVERFLOW_WRAPS (type) ? type : TREE_TYPE (@1);
}
(convert (plus (convert:t1 @0) (convert:t1 @1))))))
- /* -(-A) -> A */
+ /* -(T)(-A) -> (T)A
+ Sign-extension is ok except for INT_MIN, which thankfully cannot
+ happen without overflow. */
(simplify
- (negate (convert? (negate @1)))
- (if (tree_nop_conversion_p (type, TREE_TYPE (@1))
- && !TYPE_OVERFLOW_SANITIZED (type))
+ (negate (convert (negate @1)))
+ (if (INTEGRAL_TYPE_P (type)
+ && (TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@1))
+ || (!TYPE_UNSIGNED (TREE_TYPE (@1))
+ && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1))))
+ && !TYPE_OVERFLOW_SANITIZED (type)
+ && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1)))
(convert @1)))
+ (simplify
+ (negate (convert negate_expr_p@1))
+ (if (SCALAR_FLOAT_TYPE_P (type)
+ && ((DECIMAL_FLOAT_TYPE_P (type)
+ == DECIMAL_FLOAT_TYPE_P (TREE_TYPE (@1))
+ && TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@1)))
+ || !HONOR_SIGN_DEPENDENT_ROUNDING (type)))
+ (convert (negate @1))))
+ (simplify
+ (negate (nop_convert (negate @1)))
+ (if (!TYPE_OVERFLOW_SANITIZED (type)
+ && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1)))
+ (view_convert @1)))
/* We can't reassociate floating-point unless -fassociative-math
or fixed-point plus or minus because of saturation to +-Inf. */
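As a sanity check on the arithmetic identity behind the new -(T)(-A) -> (T)A rule (widening from a signed type with undefined overflow), a quick standalone test, not an exercise of the match.pd machinery itself:

#include <cassert>

int main ()
{
  /* Inner negate in int, convert to the wider long long, outer negate there.
     INT_MIN is excluded because -a would already overflow in the source.  */
  for (int a = -1000; a <= 1000; ++a)
    assert (-(long long) (-a) == (long long) a);
  return 0;
}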
diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c
index f85011e90c2..71b2a616096 100644
--- a/gcc/modulo-sched.c
+++ b/gcc/modulo-sched.c
@@ -1422,15 +1422,15 @@ sms_schedule (void)
get_ebb_head_tail (bb, bb, &head, &tail);
latch_edge = loop_latch_edge (loop);
gcc_assert (single_exit (loop));
- if (single_exit (loop)->count > profile_count::zero ())
- trip_count = latch_edge->count.to_gcov_type ()
- / single_exit (loop)->count.to_gcov_type ();
+ if (single_exit (loop)->count () > profile_count::zero ())
+ trip_count = latch_edge->count ().to_gcov_type ()
+ / single_exit (loop)->count ().to_gcov_type ();
/* Perform SMS only on loops that their average count is above threshold. */
- if ( latch_edge->count > profile_count::zero ()
- && (latch_edge->count
- < single_exit (loop)->count.apply_scale
+ if ( latch_edge->count () > profile_count::zero ()
+ && (latch_edge->count ()
+ < single_exit (loop)->count ().apply_scale
(SMS_LOOP_AVERAGE_COUNT_THRESHOLD, 1)))
{
if (dump_file)
@@ -1552,9 +1552,9 @@ sms_schedule (void)
latch_edge = loop_latch_edge (loop);
gcc_assert (single_exit (loop));
- if (single_exit (loop)->count > profile_count::zero ())
- trip_count = latch_edge->count.to_gcov_type ()
- / single_exit (loop)->count.to_gcov_type ();
+ if (single_exit (loop)->count () > profile_count::zero ())
+ trip_count = latch_edge->count ().to_gcov_type ()
+ / single_exit (loop)->count ().to_gcov_type ();
if (dump_file)
{
diff --git a/gcc/objc/ChangeLog b/gcc/objc/ChangeLog
index 20b0fe44b29..7d865928999 100644
--- a/gcc/objc/ChangeLog
+++ b/gcc/objc/ChangeLog
@@ -1,3 +1,12 @@
+2017-10-31 David Malcolm <dmalcolm@redhat.com>
+
+ * objc-gnu-runtime-abi-01.c (objc_gnu_runtime_abi_01_init): Use
+ UNKNOWN_LOCATION rather than 0.
+
+2017-10-17 Nathan Sidwell <nathan@acm.org>
+
+ * objc-act.c (objc_common_tree_size): Return size of TYPE nodes.
+
2017-10-10 Richard Sandiford <richard.sandiford@linaro.org>
* objc-act.c (objc_decl_method_attributes): Use wi::to_wide when
diff --git a/gcc/objc/objc-act.c b/gcc/objc/objc-act.c
index ce2adcc0ded..765192c82aa 100644
--- a/gcc/objc/objc-act.c
+++ b/gcc/objc/objc-act.c
@@ -10118,11 +10118,14 @@ objc_common_tree_size (enum tree_code code)
case CLASS_METHOD_DECL:
case INSTANCE_METHOD_DECL:
case KEYWORD_DECL:
- case PROPERTY_DECL:
- return sizeof (struct tree_decl_non_common);
+ case PROPERTY_DECL: return sizeof (tree_decl_non_common);
+ case CLASS_INTERFACE_TYPE:
+ case CLASS_IMPLEMENTATION_TYPE:
+ case CATEGORY_INTERFACE_TYPE:
+ case CATEGORY_IMPLEMENTATION_TYPE:
+ case PROTOCOL_INTERFACE_TYPE: return sizeof (tree_type_non_common);
default:
gcc_unreachable ();
-
}
}
diff --git a/gcc/objc/objc-gnu-runtime-abi-01.c b/gcc/objc/objc-gnu-runtime-abi-01.c
index b53d1820db3..4321b365358 100644
--- a/gcc/objc/objc-gnu-runtime-abi-01.c
+++ b/gcc/objc/objc-gnu-runtime-abi-01.c
@@ -130,7 +130,8 @@ objc_gnu_runtime_abi_01_init (objc_runtime_hooks *rthooks)
/* GNU runtime does not need the compiler to change code in order to do GC. */
if (flag_objc_gc)
{
- warning_at (0, 0, "%<-fobjc-gc%> is ignored for %<-fgnu-runtime%>");
+ warning_at (UNKNOWN_LOCATION, 0,
+ "%<-fobjc-gc%> is ignored for %<-fgnu-runtime%>");
flag_objc_gc = 0;
}
diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index 130814e68f2..34a95aa15b6 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -1399,6 +1399,7 @@ expand_omp_taskreg (struct omp_region *region)
if (optimize)
optimize_omp_library_calls (entry_stmt);
+ counts_to_freqs ();
cgraph_edge::rebuild_edges ();
/* Some EH regions might become dead, see PR34608. If
diff --git a/gcc/omp-grid.c b/gcc/omp-grid.c
index a7b6f60aeaf..121c96ebe39 100644
--- a/gcc/omp-grid.c
+++ b/gcc/omp-grid.c
@@ -1315,6 +1315,7 @@ grid_attempt_target_gridification (gomp_target *target,
n1 = fold_convert (itype, n1);
n2 = fold_convert (itype, n2);
+ tree cond = fold_build2 (cond_code, boolean_type_node, n1, n2);
tree step
= omp_get_for_step_from_incr (loc, gimple_omp_for_incr (inner_loop, i));
@@ -1328,6 +1329,7 @@ grid_attempt_target_gridification (gomp_target *target,
fold_build1 (NEGATE_EXPR, itype, step));
else
t = fold_build2 (TRUNC_DIV_EXPR, itype, t, step);
+ t = fold_build3 (COND_EXPR, itype, cond, t, build_zero_cst (itype));
if (grid.tiling)
{
if (cond_code == GT_EXPR)
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index ec838c5a175..c17226c8ffc 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -3466,7 +3466,7 @@ omp_clause_aligned_alignment (tree clause)
machine_mode vmode = targetm.vectorize.preferred_simd_mode (mode);
if (GET_MODE_CLASS (vmode) != classes[i + 1])
continue;
- while (maybe_nonzero (vs)
+ while (may_ne (vs, 0U)
&& must_lt (GET_MODE_SIZE (vmode), vs)
&& GET_MODE_2XWIDER_MODE (vmode).exists ())
vmode = GET_MODE_2XWIDER_MODE (vmode).require ();
@@ -3506,7 +3506,7 @@ static bool
lower_rec_simd_input_clauses (tree new_var, omp_context *ctx,
omplow_simd_context *sctx, tree &ivar, tree &lvar)
{
- if (known_zero (sctx->max_vf))
+ if (must_eq (sctx->max_vf, 0U))
{
sctx->max_vf = sctx->is_simt ? omp_max_simt_vf () : omp_max_vf ();
if (may_gt (sctx->max_vf, 1U))
@@ -4670,7 +4670,7 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
/* If max_vf is non-zero, then we can use only a vectorization factor
up to the max_vf we chose. So stick it into the safelen clause. */
- if (maybe_nonzero (sctx.max_vf))
+ if (may_ne (sctx.max_vf, 0U))
{
tree c = omp_find_clause (gimple_omp_for_clauses (ctx->stmt),
OMP_CLAUSE_SAFELEN);
diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
index 30d60e636d2..9cd66e26c27 100644
--- a/gcc/omp-simd-clone.c
+++ b/gcc/omp-simd-clone.c
@@ -1141,6 +1141,7 @@ simd_clone_adjust (struct cgraph_node *node)
{
basic_block orig_exit = EDGE_PRED (EXIT_BLOCK_PTR_FOR_FN (cfun), 0)->src;
incr_bb = create_empty_bb (orig_exit);
+ incr_bb->count = profile_count::zero ();
add_bb_to_loop (incr_bb, body_bb->loop_father);
/* The succ of orig_exit was EXIT_BLOCK_PTR_FOR_FN (cfun), with an empty
flag. Set it now to be a FALLTHRU_EDGE. */
@@ -1151,11 +1152,13 @@ simd_clone_adjust (struct cgraph_node *node)
{
edge e = EDGE_PRED (EXIT_BLOCK_PTR_FOR_FN (cfun), i);
redirect_edge_succ (e, incr_bb);
+ incr_bb->count += e->count ();
}
}
else if (node->simdclone->inbranch)
{
incr_bb = create_empty_bb (entry_bb);
+ incr_bb->count = profile_count::zero ();
add_bb_to_loop (incr_bb, body_bb->loop_father);
}
@@ -1252,6 +1255,7 @@ simd_clone_adjust (struct cgraph_node *node)
gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING);
edge e = make_edge (loop->header, incr_bb, EDGE_TRUE_VALUE);
e->probability = profile_probability::unlikely ().guessed ();
+ incr_bb->count += e->count ();
edge fallthru = FALLTHRU_EDGE (loop->header);
fallthru->flags = EDGE_FALSE_VALUE;
fallthru->probability = profile_probability::likely ().guessed ();
diff --git a/gcc/optabs.c b/gcc/optabs.c
index a6635e15116..7b8c0f60c99 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6329,10 +6329,10 @@ expand_atomic_compare_and_swap (rtx *ptarget_bool, rtx *ptarget_oval,
return true;
}
-/* Generate asm volatile("" : : : "memory") as the memory barrier. */
+/* Generate asm volatile("" : : : "memory") as the memory blockage. */
static void
-expand_asm_memory_barrier (void)
+expand_asm_memory_blockage (void)
{
rtx asm_op, clob;
@@ -6348,6 +6348,17 @@ expand_asm_memory_barrier (void)
emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, asm_op, clob)));
}
+/* Do not propagate memory accesses across this point. */
+
+static void
+expand_memory_blockage (void)
+{
+ if (targetm.have_memory_blockage ())
+ emit_insn (targetm.gen_memory_blockage ());
+ else
+ expand_asm_memory_blockage ();
+}
+
/* This routine will either emit the mem_thread_fence pattern or issue a
sync_synchronize to generate a fence for memory model MEMMODEL. */
@@ -6359,14 +6370,14 @@ expand_mem_thread_fence (enum memmodel model)
if (targetm.have_mem_thread_fence ())
{
emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
- expand_asm_memory_barrier ();
+ expand_memory_blockage ();
}
else if (targetm.have_memory_barrier ())
emit_insn (targetm.gen_memory_barrier ());
else if (synchronize_libfunc != NULL_RTX)
emit_library_call (synchronize_libfunc, LCT_NORMAL, VOIDmode);
else
- expand_asm_memory_barrier ();
+ expand_memory_blockage ();
}
/* Emit a signal fence with given memory model. */
@@ -6377,7 +6388,7 @@ expand_mem_signal_fence (enum memmodel model)
/* No machine barrier is required to implement a signal fence, but
a compiler memory barrier must be issued, except for relaxed MM. */
if (!is_mm_relaxed (model))
- expand_asm_memory_barrier ();
+ expand_memory_blockage ();
}
/* This function expands the atomic load operation:
@@ -6399,7 +6410,7 @@ expand_atomic_load (rtx target, rtx mem, enum memmodel model)
struct expand_operand ops[3];
rtx_insn *last = get_last_insn ();
if (is_mm_seq_cst (model))
- expand_asm_memory_barrier ();
+ expand_memory_blockage ();
create_output_operand (&ops[0], target, mode);
create_fixed_operand (&ops[1], mem);
@@ -6407,7 +6418,7 @@ expand_atomic_load (rtx target, rtx mem, enum memmodel model)
if (maybe_expand_insn (icode, 3, ops))
{
if (!is_mm_relaxed (model))
- expand_asm_memory_barrier ();
+ expand_memory_blockage ();
return ops[0].value;
}
delete_insns_since (last);
@@ -6457,14 +6468,14 @@ expand_atomic_store (rtx mem, rtx val, enum memmodel model, bool use_release)
{
rtx_insn *last = get_last_insn ();
if (!is_mm_relaxed (model))
- expand_asm_memory_barrier ();
+ expand_memory_blockage ();
create_fixed_operand (&ops[0], mem);
create_input_operand (&ops[1], val, mode);
create_integer_operand (&ops[2], model);
if (maybe_expand_insn (icode, 3, ops))
{
if (is_mm_seq_cst (model))
- expand_asm_memory_barrier ();
+ expand_memory_blockage ();
return const0_rtx;
}
delete_insns_since (last);
@@ -7095,7 +7106,10 @@ maybe_legitimize_operand (enum insn_code icode, unsigned int opno,
if (mode != VOIDmode
&& must_eq (trunc_int_for_mode (op->int_value, mode),
op->int_value))
- goto input;
+ {
+ op->value = gen_int_mode (op->int_value, mode);
+ goto input;
+ }
break;
}
return insn_operand_matches (icode, opno, op->value);
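The fallback kept by this optabs.c change is worth seeing in isolation. Below is a minimal sketch (plain C++, not the GCC expander) of a compiler-only memory blockage: an empty volatile asm with a "memory" clobber emits no hardware fence, but the compiler may not cache or move memory accesses across it.

static int payload;
static volatile int ready;

static inline void
compiler_memory_blockage (void)
{
  /* No instructions emitted; acts purely as a compiler-level barrier.  */
  __asm__ __volatile__ ("" : : : "memory");
}

int
main (void)
{
  payload = 42;
  compiler_memory_blockage ();  /* the store to payload stays above this point */
  ready = 1;
  return 0;
}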
diff --git a/gcc/opts.c b/gcc/opts.c
index adf3d89851d..ac383d48ec1 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -37,7 +37,7 @@ static void set_Wstrict_aliasing (struct gcc_options *opts, int onoff);
/* Indexed by enum debug_info_type. */
const char *const debug_type_names[] =
{
- "none", "stabs", "coff", "dwarf-2", "xcoff", "vms"
+ "none", "stabs", "dwarf-2", "xcoff", "vms"
};
/* Parse the -femit-struct-debug-detailed option value
@@ -1521,6 +1521,7 @@ const struct sanitizer_opts_s sanitizer_opts[] =
SANITIZER_OPT (object-size, SANITIZE_OBJECT_SIZE, true),
SANITIZER_OPT (vptr, SANITIZE_VPTR, true),
SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true),
+ SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true),
SANITIZER_OPT (all, ~0U, true),
#undef SANITIZER_OPT
{ NULL, 0U, 0UL, false }
@@ -2350,10 +2351,6 @@ common_handle_option (struct gcc_options *opts,
loc);
break;
- case OPT_gcoff:
- set_debug_level (SDB_DEBUG, false, arg, opts, opts_set, loc);
- break;
-
case OPT_gdwarf:
if (arg && strlen (arg) != 0)
{
diff --git a/gcc/output.h b/gcc/output.h
index e98a911c647..ede44476a76 100644
--- a/gcc/output.h
+++ b/gcc/output.h
@@ -308,11 +308,6 @@ extern void output_quoted_string (FILE *, const char *);
This variable is defined in final.c. */
extern rtx_sequence *final_sequence;
-/* The line number of the beginning of the current function. Various
- md code needs this so that it can output relative linenumbers. */
-
-extern int sdb_begin_function_line;
-
/* File in which assembler code is being written. */
#ifdef BUFSIZ
diff --git a/gcc/params.def b/gcc/params.def
index bc4e9e40fd6..4a96268c102 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -882,13 +882,6 @@ DEFPARAM (PARAM_GRAPHITE_MAX_ARRAYS_PER_SCOP,
"maximum number of arrays per scop.",
100, 0, 0)
-/* Maximal number of basic blocks in the functions analyzed by Graphite. */
-
-DEFPARAM (PARAM_GRAPHITE_MIN_LOOPS_PER_FUNCTION,
- "graphite-min-loops-per-function",
- "minimal number of loops per function to be analyzed by Graphite.",
- 2, 0, 0)
-
DEFPARAM (PARAM_MAX_ISL_OPERATIONS,
"max-isl-operations",
"maximum number of isl operations, 0 means unlimited",
diff --git a/gcc/poly-int.h b/gcc/poly-int.h
index 61cf3213db8..73c0efd47e4 100644
--- a/gcc/poly-int.h
+++ b/gcc/poly-int.h
@@ -152,20 +152,24 @@ struct if_lossless<T1, T2, T3, true>
/* poly_int_traits<T> describes an integer type T that might be polynomial
or non-polynomial:
- - poly_int_coeffs<T>::is_poly is true if T is a poly_int-based type
+ - poly_int_traits<T>::is_poly is true if T is a poly_int-based type
and false otherwise.
- - poly_int_coeffs<T>::num_coeffs gives the number of coefficients in T
+ - poly_int_traits<T>::num_coeffs gives the number of coefficients in T
if T is a poly_int and 1 otherwise.
- - poly_int_coeffs<T>::coeff_type gives the coefficent type of T if T
- is a poly_int and T itself otherwise. */
+ - poly_int_traits<T>::coeff_type gives the coefficient type of T if T
+ is a poly_int and T itself otherwise.
+
+ - poly_int_traits<T>::int_type is a shorthand for
+ typename poly_coeff_traits<coeff_type>::int_type. */
template<typename T>
struct poly_int_traits
{
static const bool is_poly = false;
static const unsigned int num_coeffs = 1;
typedef T coeff_type;
+ typedef typename poly_coeff_traits<T>::int_type int_type;
};
template<unsigned int N, typename C>
struct poly_int_traits<poly_int_pod<N, C> >
@@ -173,6 +177,7 @@ struct poly_int_traits<poly_int_pod<N, C> >
static const bool is_poly = true;
static const unsigned int num_coeffs = N;
typedef C coeff_type;
+ typedef typename poly_coeff_traits<C>::int_type int_type;
};
template<unsigned int N, typename C>
struct poly_int_traits<poly_int<N, C> > : poly_int_traits<poly_int_pod<N, C> >
@@ -304,8 +309,7 @@ struct poly_result<T1, T2, 2>
/* The type to which an integer constant should be cast before
comparing it with T. */
-#define POLY_INT_TYPE(T) \
- typename poly_coeff_traits<typename poly_int_traits<T>::coeff_type>::int_type
+#define POLY_INT_TYPE(T) typename poly_int_traits<T>::int_type
/* RES is a poly_int result that has coefficients of type C and that
is being built up a coefficient at a time. Set coefficient number I
@@ -1332,52 +1336,6 @@ may_ne (const Ca &a, const Cb &b)
/* Return true if A must be unequal to B. */
#define must_ne(A, B) (!may_eq (A, B))
-/* Return true if A is known to be zero. */
-
-template<typename T>
-inline bool
-known_zero (const T &a)
-{
- typedef POLY_INT_TYPE (T) int_type;
- return must_eq (a, int_type (0));
-}
-
-/* Return true if A is known to be nonzero. */
-
-template<typename T>
-inline bool
-known_nonzero (const T &a)
-{
- typedef POLY_INT_TYPE (T) int_type;
- return must_ne (a, int_type (0));
-}
-
-/* Return true if A might be equal to zero. */
-#define maybe_zero(A) (!known_nonzero (A))
-
-/* Return true if A might not be equal to zero. */
-#define maybe_nonzero(A) (!known_zero (A))
-
-/* Return true if A is known to be equal to 1. */
-
-template<typename T>
-inline bool
-known_one (const T &a)
-{
- typedef POLY_INT_TYPE (T) int_type;
- return must_eq (a, int_type (1));
-}
-
-/* Return true if A is known to be all ones. */
-
-template<typename T>
-inline bool
-known_all_ones (const T &a)
-{
- typedef POLY_INT_TYPE (T) int_type;
- return must_eq (a, int_type (-1));
-}
-
/* Return true if A might be less than or equal to B for some
indeterminate values. */
@@ -1596,8 +1554,7 @@ template<unsigned int N, typename Ca>
inline Ca
constant_lower_bound (const poly_int_pod<N, Ca> &a)
{
- typedef POLY_INT_TYPE (Ca) ICa;
- gcc_checking_assert (must_ge (a, ICa (0)));
+ gcc_checking_assert (must_ge (a, POLY_INT_TYPE (Ca) (0)));
return a.coeffs[0];
}
@@ -2509,7 +2466,12 @@ struct poly_span_traits<T1, T2, T3, HOST_WIDE_INT, unsigned HOST_WIDE_INT>
/* Return true if SIZE represents a known size, assuming that all-ones
indicates an unknown size. */
-#define known_size_p(X) (!known_all_ones (X))
+template<typename T>
+inline bool
+known_size_p (const T &a)
+{
+ return may_ne (a, POLY_INT_TYPE (T) (-1));
+}
/* Return true if range [POS, POS + SIZE) might include VAL.
SIZE can be the special value -1, in which case the range is
@@ -2557,9 +2519,9 @@ ranges_may_overlap_p (const T1 &pos1, const T2 &size1,
const T3 &pos2, const T4 &size2)
{
if (maybe_in_range_p (pos2, pos1, size1))
- return maybe_nonzero (size2);
+ return may_ne (size2, POLY_INT_TYPE (T4) (0));
if (maybe_in_range_p (pos1, pos2, size2))
- return maybe_nonzero (size1);
+ return may_ne (size1, POLY_INT_TYPE (T2) (0));
return false;
}
@@ -2602,10 +2564,9 @@ known_subrange_p (const T1 &pos1, const T2 &size1,
const T3 &pos2, const T4 &size2)
{
typedef typename poly_int_traits<T2>::coeff_type C2;
- typedef POLY_INT_TYPE (C2) IC2;
typedef POLY_BINARY_COEFF (T2, T4) size_diff_type;
typedef poly_span_traits<T1, T3, size_diff_type> span;
- return (must_gt (size1, IC2 (0))
+ return (must_gt (size1, POLY_INT_TYPE (T2) (0))
&& (poly_coeff_traits<C2>::signedness > 0
|| known_size_p (size1))
&& known_size_p (size2)
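A standalone illustration of the size convention these poly-int.h helpers keep (plain integers stand in for poly_ints; maybe_in_range_p here is a simplified stand-in, not GCC's template): an all-ones size means "unknown", so known_size_p is just "not equal to -1", and an unknown size behaves as an open-ended range rather than an empty one.

#include <cassert>

typedef long long poly_like;            /* stand-in for a poly_int */

static bool known_size_p (poly_like size) { return size != -1; }

static bool maybe_in_range_p (poly_like val, poly_like pos, poly_like size)
{
  if (!known_size_p (size))
    return val >= pos;                  /* open-ended range */
  return val >= pos && val < pos + size;
}

int main ()
{
  assert (maybe_in_range_p (10, 8, -1));   /* unknown size: only lower bound */
  assert (!maybe_in_range_p (10, 8, 2));   /* [8, 10) excludes 10 */
  return 0;
}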
diff --git a/gcc/postreload-gcse.c b/gcc/postreload-gcse.c
index a1dcac2600c..15fdb7e0cfe 100644
--- a/gcc/postreload-gcse.c
+++ b/gcc/postreload-gcse.c
@@ -1108,14 +1108,14 @@ eliminate_partially_redundant_load (basic_block bb, rtx_insn *insn,
avail_insn = NULL;
}
- if (EDGE_CRITICAL_P (pred) && pred->count.initialized_p ())
- critical_count += pred->count;
+ if (EDGE_CRITICAL_P (pred) && pred->count ().initialized_p ())
+ critical_count += pred->count ();
if (avail_insn != NULL_RTX)
{
npred_ok++;
- if (pred->count.initialized_p ())
- ok_count = ok_count + pred->count;
+ if (pred->count ().initialized_p ())
+ ok_count = ok_count + pred->count ();
if (! set_noop_p (PATTERN (gen_move_insn (copy_rtx (dest),
copy_rtx (avail_reg)))))
{
@@ -1139,8 +1139,8 @@ eliminate_partially_redundant_load (basic_block bb, rtx_insn *insn,
/* Adding a load on a critical edge will cause a split. */
if (EDGE_CRITICAL_P (pred))
critical_edge_split = true;
- if (pred->count.initialized_p ())
- not_ok_count = not_ok_count + pred->count;
+ if (pred->count ().initialized_p ())
+ not_ok_count = not_ok_count + pred->count ();
unoccr = (struct unoccr *) obstack_alloc (&unoccr_obstack,
sizeof (struct unoccr));
unoccr->insn = NULL;
diff --git a/gcc/postreload.c b/gcc/postreload.c
index f5a26c55c48..a70d11a6c87 100644
--- a/gcc/postreload.c
+++ b/gcc/postreload.c
@@ -1706,7 +1706,7 @@ move2add_valid_value_p (int regno, scalar_int_mode mode)
regno of the lowpart might be different. */
poly_int64 s_off = subreg_lowpart_offset (mode, old_mode);
s_off = subreg_regno_offset (regno, old_mode, s_off, mode);
- if (maybe_nonzero (s_off))
+ if (may_ne (s_off, 0))
/* We could in principle adjust regno, check reg_mode[regno] to be
BLKmode, and return s_off to the caller (vs. -1 for failure),
but we currently have no callers that could make use of this
diff --git a/gcc/predict.c b/gcc/predict.c
index e534502aaf5..cf42ccbd903 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -137,12 +137,12 @@ maybe_hot_frequency_p (struct function *fun, int freq)
if (profile_status_for_fn (fun) == PROFILE_ABSENT)
return true;
if (node->frequency == NODE_FREQUENCY_EXECUTED_ONCE
- && freq < (ENTRY_BLOCK_PTR_FOR_FN (fun)->frequency * 2 / 3))
+ && freq < (ENTRY_BLOCK_PTR_FOR_FN (fun)->count.to_frequency (cfun) * 2 / 3))
return false;
if (PARAM_VALUE (HOT_BB_FREQUENCY_FRACTION) == 0)
return false;
if (freq * PARAM_VALUE (HOT_BB_FREQUENCY_FRACTION)
- < ENTRY_BLOCK_PTR_FOR_FN (fun)->frequency)
+ < ENTRY_BLOCK_PTR_FOR_FN (fun)->count.to_frequency (cfun))
return false;
return true;
}
@@ -175,10 +175,14 @@ set_hot_bb_threshold (gcov_type min)
/* Return TRUE if frequency FREQ is considered to be hot. */
bool
-maybe_hot_count_p (struct function *, profile_count count)
+maybe_hot_count_p (struct function *fun, profile_count count)
{
if (!count.initialized_p ())
return true;
+ if (!count.ipa_p ())
+ return maybe_hot_frequency_p (fun, count.to_frequency (fun));
+ if (count.ipa () == profile_count::zero ())
+ return false;
/* Code executed at most once is not hot. */
if (count <= MAX (profile_info ? profile_info->runs : 1, 1))
return false;
@@ -192,9 +196,7 @@ bool
maybe_hot_bb_p (struct function *fun, const_basic_block bb)
{
gcc_checking_assert (fun);
- if (!maybe_hot_count_p (fun, bb->count))
- return false;
- return maybe_hot_frequency_p (fun, bb->frequency);
+ return maybe_hot_count_p (fun, bb->count);
}
/* Return true in case BB can be CPU intensive and should be optimized
@@ -203,9 +205,7 @@ maybe_hot_bb_p (struct function *fun, const_basic_block bb)
bool
maybe_hot_edge_p (edge e)
{
- if (!maybe_hot_count_p (cfun, e->count))
- return false;
- return maybe_hot_frequency_p (cfun, EDGE_FREQUENCY (e));
+ return maybe_hot_count_p (cfun, e->count ());
}
/* Return true if profile COUNT and FREQUENCY, or function FUN static
@@ -213,7 +213,7 @@ maybe_hot_edge_p (edge e)
static bool
probably_never_executed (struct function *fun,
- profile_count count, int)
+ profile_count count)
{
gcc_checking_assert (fun);
if (count == profile_count::zero ())
@@ -238,7 +238,7 @@ probably_never_executed (struct function *fun,
bool
probably_never_executed_bb_p (struct function *fun, const_basic_block bb)
{
- return probably_never_executed (fun, bb->count, bb->frequency);
+ return probably_never_executed (fun, bb->count);
}
@@ -247,7 +247,7 @@ probably_never_executed_bb_p (struct function *fun, const_basic_block bb)
static bool
unlikely_executed_edge_p (edge e)
{
- return (e->count == profile_count::zero ()
+ return (e->count () == profile_count::zero ()
|| e->probability == profile_probability::never ())
|| (e->flags & (EDGE_EH | EDGE_FAKE));
}
@@ -259,7 +259,7 @@ probably_never_executed_edge_p (struct function *fun, edge e)
{
if (unlikely_executed_edge_p (e))
return true;
- return probably_never_executed (fun, e->count, EDGE_FREQUENCY (e));
+ return probably_never_executed (fun, e->count ());
}
/* Return true when current function should always be optimized for size. */
@@ -746,8 +746,8 @@ dump_prediction (FILE *file, enum br_predictor predictor, int probability,
if (e)
{
fprintf (file, " hit ");
- e->count.dump (file);
- fprintf (file, " (%.1f%%)", e->count.to_gcov_type() * 100.0
+ e->count ().dump (file);
+ fprintf (file, " (%.1f%%)", e->count ().to_gcov_type() * 100.0
/ bb->count.to_gcov_type ());
}
}
@@ -1289,7 +1289,8 @@ combine_predictions_for_bb (basic_block bb, bool dry_run)
}
clear_bb_predictions (bb);
- if (!bb->count.initialized_p () && !dry_run)
+ if ((!bb->count.nonzero_p () || !first->probability.initialized_p ())
+ && !dry_run)
{
first->probability
= profile_probability::from_reg_br_prob_base (combined_probability);
@@ -3014,10 +3015,7 @@ propagate_freq (basic_block head, bitmap tovisit)
BLOCK_INFO (bb)->npredecessors = count;
/* When function never returns, we will never process exit block. */
if (!count && bb == EXIT_BLOCK_PTR_FOR_FN (cfun))
- {
- bb->count = profile_count::zero ();
- bb->frequency = 0;
- }
+ bb->count = profile_count::zero ();
}
BLOCK_INFO (head)->frequency = 1;
@@ -3050,7 +3048,10 @@ propagate_freq (basic_block head, bitmap tovisit)
* BLOCK_INFO (e->src)->frequency /
REG_BR_PROB_BASE); */
- sreal tmp = e->probability.to_reg_br_prob_base ();
+ /* FIXME: Graphite is producing edges with no profile. Once
+ this is fixed, drop this. */
+ sreal tmp = e->probability.initialized_p () ?
+ e->probability.to_reg_br_prob_base () : 0;
tmp *= BLOCK_INFO (e->src)->frequency;
tmp *= real_inv_br_prob_base;
frequency += tmp;
@@ -3082,7 +3083,10 @@ propagate_freq (basic_block head, bitmap tovisit)
= ((e->probability * BLOCK_INFO (bb)->frequency)
/ REG_BR_PROB_BASE); */
- sreal tmp = e->probability.to_reg_br_prob_base ();
+ /* FIXME: Graphite is producing edges with no profile. Once
+ this is fixed, drop this. */
+ sreal tmp = e->probability.initialized_p () ?
+ e->probability.to_reg_br_prob_base () : 0;
tmp *= BLOCK_INFO (bb)->frequency;
EDGE_INFO (e)->back_edge_prob = tmp * real_inv_br_prob_base;
}
@@ -3196,24 +3200,33 @@ drop_profile (struct cgraph_node *node, profile_count call_count)
}
basic_block bb;
- FOR_ALL_BB_FN (bb, fn)
+ push_cfun (DECL_STRUCT_FUNCTION (node->decl));
+ if (flag_guess_branch_prob)
{
- bb->count = profile_count::uninitialized ();
-
- edge_iterator ei;
- edge e;
- FOR_EACH_EDGE (e, ei, bb->preds)
- e->count = profile_count::uninitialized ();
+ bool clear_zeros
+ = ENTRY_BLOCK_PTR_FOR_FN
+ (DECL_STRUCT_FUNCTION (node->decl))->count.nonzero_p ();
+ FOR_ALL_BB_FN (bb, fn)
+ if (clear_zeros || !(bb->count == profile_count::zero ()))
+ bb->count = bb->count.guessed_local ();
+ DECL_STRUCT_FUNCTION (node->decl)->cfg->count_max =
+ DECL_STRUCT_FUNCTION (node->decl)->cfg->count_max.guessed_local ();
}
+ else
+ {
+ FOR_ALL_BB_FN (bb, fn)
+ bb->count = profile_count::uninitialized ();
+ DECL_STRUCT_FUNCTION (node->decl)->cfg->count_max
+ = profile_count::uninitialized ();
+ }
+ pop_cfun ();
struct cgraph_edge *e;
for (e = node->callees; e; e = e->next_caller)
{
- e->count = profile_count::uninitialized ();
e->frequency = compute_call_stmt_bb_frequency (e->caller->decl,
gimple_bb (e->call_stmt));
}
- node->count = profile_count::uninitialized ();
profile_status_for_fn (fn)
= (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT);
@@ -3307,33 +3320,16 @@ handle_missing_profiles (void)
bool
counts_to_freqs (void)
{
- gcov_type count_max;
- profile_count true_count_max = profile_count::zero ();
+ profile_count true_count_max = profile_count::uninitialized ();
basic_block bb;
- /* Don't overwrite the estimated frequencies when the profile for
- the function is missing. We may drop this function PROFILE_GUESSED
- later in drop_profile (). */
- if (!ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.initialized_p ()
- || ENTRY_BLOCK_PTR_FOR_FN (cfun)->count == profile_count::zero ())
- return false;
-
FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), NULL, next_bb)
- if (bb->count > true_count_max)
- true_count_max = bb->count;
+ if (!(bb->count < true_count_max))
+ true_count_max = true_count_max.max (bb->count);
- /* If we have no counts to base frequencies on, keep those that are
- already there. */
- if (!(true_count_max > 0))
- return false;
-
- count_max = true_count_max.to_gcov_type ();
-
- FOR_ALL_BB_FN (bb, cfun)
- if (bb->count.initialized_p ())
- bb->frequency = RDIV (bb->count.to_gcov_type () * BB_FREQ_MAX, count_max);
+ cfun->cfg->count_max = true_count_max;
- return true;
+ return true_count_max.nonzero_p ();
}
/* Return true if function is likely to be expensive, so there is no point to
@@ -3355,11 +3351,11 @@ expensive_function_p (int threshold)
/* Frequencies are out of range. This either means that function contains
internal loop executing more than BB_FREQ_MAX times or profile feedback
is available and function has not been executed at all. */
- if (ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency == 0)
+ if (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.to_frequency (cfun) == 0)
return true;
/* Maximally BB_FREQ_MAX^2 so overflow won't happen. */
- limit = ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency * threshold;
+ limit = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.to_frequency (cfun) * threshold;
FOR_EACH_BB_FN (bb, cfun)
{
rtx_insn *insn;
@@ -3367,7 +3363,7 @@ expensive_function_p (int threshold)
FOR_BB_INSNS (bb, insn)
if (active_insn_p (insn))
{
- sum += bb->frequency;
+ sum += bb->count.to_frequency (cfun);
if (sum > limit)
return true;
}
@@ -3396,7 +3392,7 @@ propagate_unlikely_bbs_forward (void)
{
bb = worklist.pop ();
FOR_EACH_EDGE (e, ei, bb->succs)
- if (!(e->count == profile_count::zero ())
+ if (!(e->count () == profile_count::zero ())
&& !(e->dest->count == profile_count::zero ())
&& !e->dest->aux)
{
@@ -3416,9 +3412,6 @@ propagate_unlikely_bbs_forward (void)
"Basic block %i is marked unlikely by forward prop\n",
bb->index);
bb->count = profile_count::zero ();
- bb->frequency = 0;
- FOR_EACH_EDGE (e, ei, bb->succs)
- e->count = profile_count::zero ();
}
else
bb->aux = NULL;
@@ -3449,21 +3442,14 @@ determine_unlikely_bbs ()
bb->count = profile_count::zero ();
}
- if (bb->count == profile_count::zero ())
- {
- bb->frequency = 0;
- FOR_EACH_EDGE (e, ei, bb->preds)
- e->count = profile_count::zero ();
- }
-
FOR_EACH_EDGE (e, ei, bb->succs)
- if (!(e->count == profile_count::zero ())
+ if (!(e->probability == profile_probability::never ())
&& unlikely_executed_edge_p (e))
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "Edge %i->%i is locally unlikely\n",
bb->index, e->dest->index);
- e->count = profile_count::zero ();
+ e->probability = profile_probability::never ();
}
gcc_checking_assert (!bb->aux);
@@ -3477,7 +3463,8 @@ determine_unlikely_bbs ()
{
nsuccs[bb->index] = 0;
FOR_EACH_EDGE (e, ei, bb->succs)
- if (!(e->count == profile_count::zero ()))
+ if (!(e->probability == profile_probability::never ())
+ && !(e->dest->count == profile_count::zero ()))
nsuccs[bb->index]++;
if (!nsuccs[bb->index])
worklist.safe_push (bb);
@@ -3509,11 +3496,10 @@ determine_unlikely_bbs ()
"Basic block %i is marked unlikely by backward prop\n",
bb->index);
bb->count = profile_count::zero ();
- bb->frequency = 0;
FOR_EACH_EDGE (e, ei, bb->preds)
- if (!(e->count == profile_count::zero ()))
+ if (!(e->probability == profile_probability::never ()))
{
- e->count = profile_count::zero ();
+ e->probability = profile_probability::never ();
if (!(e->src->count == profile_count::zero ()))
{
nsuccs[e->src->index]--;
@@ -3566,8 +3552,13 @@ estimate_bb_frequencies (bool force)
FOR_EACH_EDGE (e, ei, bb->succs)
{
- EDGE_INFO (e)->back_edge_prob
- = e->probability.to_reg_br_prob_base ();
+ /* FIXME: Graphite is producing edges with no profile. Once
+ this is fixed, drop this. */
+ if (e->probability.initialized_p ())
+ EDGE_INFO (e)->back_edge_prob
+ = e->probability.to_reg_br_prob_base ();
+ else
+ EDGE_INFO (e)->back_edge_prob = REG_BR_PROB_BASE / 2;
EDGE_INFO (e)->back_edge_prob *= real_inv_br_prob_base;
}
}
@@ -3576,16 +3567,28 @@ estimate_bb_frequencies (bool force)
to outermost to examine frequencies for back edges. */
estimate_loops ();
+ bool global0 = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.initialized_p ()
+ && ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa_p ();
+
freq_max = 0;
FOR_EACH_BB_FN (bb, cfun)
if (freq_max < BLOCK_INFO (bb)->frequency)
freq_max = BLOCK_INFO (bb)->frequency;
freq_max = real_bb_freq_max / freq_max;
+ cfun->cfg->count_max = profile_count::uninitialized ();
FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), NULL, next_bb)
{
sreal tmp = BLOCK_INFO (bb)->frequency * freq_max + real_one_half;
- bb->frequency = tmp.to_int ();
+ profile_count count = profile_count::from_gcov_type (tmp.to_int ());
+
+ /* If we have profile feedback in which this function was never
+ executed, then preserve this info. */
+ if (global0)
+ bb->count = count.global0 ();
+ else if (!(bb->count == profile_count::zero ()))
+ bb->count = count.guessed_local ();
+ cfun->cfg->count_max = cfun->cfg->count_max.max (bb->count);
}
free_aux_for_blocks ();
@@ -3610,7 +3613,8 @@ compute_function_frequency (void)
if (profile_status_for_fn (cfun) != PROFILE_READ)
{
int flags = flags_from_decl_or_type (current_function_decl);
- if (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count == profile_count::zero ()
+ if ((ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa_p ()
+ && ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa() == profile_count::zero ())
|| lookup_attribute ("cold", DECL_ATTRIBUTES (current_function_decl))
!= NULL)
{
@@ -3729,7 +3733,7 @@ pass_profile::execute (function *fun)
{
struct loop *loop;
FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
- if (loop->header->frequency)
+ if (loop->header->count.initialized_p ())
fprintf (dump_file, "Loop got predicted %d to iterate %i times.\n",
loop->num,
(int)expected_loop_iterations_unbounded (loop));
@@ -3855,15 +3859,12 @@ rebuild_frequencies (void)
which may also lead to frequencies incorrectly reduced to 0. There
is less precision in the probabilities, so we only do this for small
max counts. */
- profile_count count_max = profile_count::zero ();
+ cfun->cfg->count_max = profile_count::uninitialized ();
basic_block bb;
FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), NULL, next_bb)
- if (bb->count > count_max)
- count_max = bb->count;
+ cfun->cfg->count_max = cfun->cfg->count_max.max (bb->count);
- if (profile_status_for_fn (cfun) == PROFILE_GUESSED
- || (!flag_auto_profile && profile_status_for_fn (cfun) == PROFILE_READ
- && count_max < REG_BR_PROB_BASE / 10))
+ if (profile_status_for_fn (cfun) == PROFILE_GUESSED)
{
loop_optimizer_init (0);
add_noreturn_fake_exit_edges ();
@@ -3928,8 +3929,6 @@ force_edge_cold (edge e, bool impossible)
profile_probability prob_sum = profile_probability::never ();
edge_iterator ei;
edge e2;
- profile_count old_count = e->count;
- profile_probability old_probability = e->probability;
bool uninitialized_exit = false;
profile_probability goal = (impossible ? profile_probability::never ()
@@ -3937,13 +3936,13 @@ force_edge_cold (edge e, bool impossible)
/* If edge is already improbably or cold, just return. */
if (e->probability <= goal
- && (!impossible || e->count == profile_count::zero ()))
+ && (!impossible || e->count () == profile_count::zero ()))
return;
FOR_EACH_EDGE (e2, ei, e->src->succs)
if (e2 != e)
{
- if (e2->count.initialized_p ())
- count_sum += e2->count;
+ if (e2->count ().initialized_p ())
+ count_sum += e2->count ();
else
uninitialized_exit = true;
if (e2->probability.initialized_p ())
@@ -3956,13 +3955,6 @@ force_edge_cold (edge e, bool impossible)
{
if (!(e->probability < goal))
e->probability = goal;
- if (impossible)
- e->count = profile_count::zero ();
- else if (old_probability > profile_probability::never ())
- e->count = e->count.apply_probability (e->probability
- / old_probability);
- else
- e->count = e->count.apply_scale (1, REG_BR_PROB_BASE);
profile_probability prob_comp = prob_sum / e->probability.invert ();
@@ -3971,12 +3963,9 @@ force_edge_cold (edge e, bool impossible)
"probability to other edges.\n",
e->src->index, e->dest->index,
impossible ? "impossible" : "cold");
- profile_count count_sum2 = count_sum + old_count - e->count;
FOR_EACH_EDGE (e2, ei, e->src->succs)
if (e2 != e)
{
- if (count_sum > 0)
- e2->count.apply_scale (count_sum2, count_sum);
e2->probability /= prob_comp;
}
if (current_ir_type () != IR_GIMPLE
@@ -4027,7 +4016,6 @@ force_edge_cold (edge e, bool impossible)
fprintf (dump_file,
"Making bb %i impossible and dropping count to 0.\n",
e->src->index);
- e->count = profile_count::zero ();
e->src->count = profile_count::zero ();
FOR_EACH_EDGE (e2, ei, e->src->preds)
force_edge_cold (e2, impossible);
@@ -4042,18 +4030,20 @@ force_edge_cold (edge e, bool impossible)
after loop transforms. */
if (!(prob_sum > profile_probability::never ())
&& count_sum == profile_count::zero ()
- && single_pred_p (e->src) && e->src->frequency > (impossible ? 0 : 1))
+ && single_pred_p (e->src) && e->src->count.to_frequency (cfun)
+ > (impossible ? 0 : 1))
{
- int old_frequency = e->src->frequency;
+ int old_frequency = e->src->count.to_frequency (cfun);
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "Making bb %i %s.\n", e->src->index,
impossible ? "impossible" : "cold");
- e->src->frequency = MIN (e->src->frequency, impossible ? 0 : 1);
+ int new_frequency = MIN (e->src->count.to_frequency (cfun),
+ impossible ? 0 : 1);
if (impossible)
- e->src->count = e->count = profile_count::zero ();
+ e->src->count = profile_count::zero ();
else
- e->src->count = e->count = e->count.apply_scale (e->src->frequency,
- old_frequency);
+ e->src->count = e->count ().apply_scale (new_frequency,
+ old_frequency);
force_edge_cold (single_pred_edge (e->src), impossible);
}
else if (dump_file && (dump_flags & TDF_DETAILS)
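The new control flow in maybe_hot_count_p can be summarised with a simplified standalone sketch; the types, the frequency fallback and the final threshold comparison are stand-ins for this example, not GCC's.

#include <cstdint>

struct count_info
{
  bool initialized;
  bool ipa;          /* comparable across functions (feedback-based)?  */
  uint64_t value;
};

/* Stand-in for the frequency-based heuristic used for purely local counts.  */
static bool hot_by_frequency (uint64_t) { return true; }

static bool
maybe_hot (const count_info &c, uint64_t train_runs, uint64_t hot_threshold)
{
  if (!c.initialized)
    return true;                        /* no information: assume hot */
  if (!c.ipa)
    return hot_by_frequency (c.value);  /* local estimate only */
  if (c.value == 0)
    return false;                       /* globally never executed */
  if (c.value <= (train_runs ? train_runs : 1))
    return false;                       /* executed at most once per run */
  return c.value >= hot_threshold;
}

int main ()
{
  count_info c = { true, true, 50000 };
  return maybe_hot (c, 100, 1000) ? 0 : 1;
}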
diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index a72f9cda188..2ecdbb4299e 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -516,7 +516,7 @@ rtx_writer::print_rtx_operand_code_r (const_rtx in_rtx)
if (REG_EXPR (in_rtx))
print_mem_expr (m_outfile, REG_EXPR (in_rtx));
- if (maybe_nonzero (REG_OFFSET (in_rtx)))
+ if (may_ne (REG_OFFSET (in_rtx), 0))
{
fprintf (m_outfile, "+");
print_poly_int (m_outfile, REG_OFFSET (in_rtx));
diff --git a/gcc/profile-count.c b/gcc/profile-count.c
index 44ceaed2d66..d7031404645 100644
--- a/gcc/profile-count.c
+++ b/gcc/profile-count.c
@@ -42,7 +42,11 @@ profile_count::dump (FILE *f) const
else
{
fprintf (f, "%" PRId64, m_val);
- if (m_quality == profile_adjusted)
+ if (m_quality == profile_guessed_local)
+ fprintf (f, " (estimated locally)");
+ else if (m_quality == profile_guessed_global0)
+ fprintf (f, " (estimated locally, globally 0)");
+ else if (m_quality == profile_adjusted)
fprintf (f, " (adjusted)");
else if (m_quality == profile_afdo)
fprintf (f, " (auto FDO)");
@@ -65,6 +69,7 @@ profile_count::debug () const
bool
profile_count::differs_from_p (profile_count other) const
{
+ gcc_checking_assert (compatible_p (other));
if (!initialized_p () || !other.initialized_p ())
return false;
if ((uint64_t)m_val - (uint64_t)other.m_val < 100
@@ -213,3 +218,40 @@ slow_safe_scale_64bit (uint64_t a, uint64_t b, uint64_t c, uint64_t *res)
*res = (uint64_t) -1;
return false;
}
+
+/* Return count as frequency within FUN scaled in range 0 to REG_FREQ_MAX.
+ Used for legacy code and should not be used anymore. */
+
+int
+profile_count::to_frequency (struct function *fun) const
+{
+ if (!initialized_p ())
+ return BB_FREQ_MAX;
+ if (*this == profile_count::zero ())
+ return 0;
+ gcc_assert (REG_BR_PROB_BASE == BB_FREQ_MAX
+ && fun->cfg->count_max.initialized_p ());
+ profile_probability prob = probability_in (fun->cfg->count_max);
+ if (!prob.initialized_p ())
+ return REG_BR_PROB_BASE;
+ return prob.to_reg_br_prob_base ();
+}
+
+/* Return count as frequency within FUN scaled in range 0 to CGRAPH_FREQ_MAX
+ where CGRAPH_FREQ_BASE means that the count equals the entry block count.
+ Used for legacy code and should not be used anymore. */
+
+int
+profile_count::to_cgraph_frequency (profile_count entry_bb_count) const
+{
+ if (!initialized_p ())
+ return CGRAPH_FREQ_BASE;
+ if (*this == profile_count::zero ())
+ return 0;
+ gcc_checking_assert (entry_bb_count.initialized_p ());
+ uint64_t scale;
+ if (!safe_scale_64bit (!entry_bb_count.m_val ? m_val + 1 : m_val,
+ CGRAPH_FREQ_BASE, MAX (1, entry_bb_count.m_val), &scale))
+ return CGRAPH_FREQ_MAX;
+ return MIN (scale, CGRAPH_FREQ_MAX);
+}
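to_cgraph_frequency boils down to an overflow-safe a * b / c clamped to CGRAPH_FREQ_MAX. A standalone sketch of that kind of scaling, using the __int128 extension and saturation (GCC's safe_scale_64bit takes a different slow path, and the constants below are only assumed for the example):

#include <cstdint>
#include <cstdio>

static uint64_t
scale_saturating (uint64_t a, uint64_t b, uint64_t c)
{
  unsigned __int128 tmp = (unsigned __int128) a * b / (c ? c : 1);
  return tmp > UINT64_MAX ? (uint64_t) UINT64_MAX : (uint64_t) tmp;
}

int
main (void)
{
  /* A block executed 3000 times in a function entered 1000 times scales to
     3 * CGRAPH_FREQ_BASE, assuming CGRAPH_FREQ_BASE is 1000 for the example.  */
  printf ("%llu\n", (unsigned long long) scale_saturating (3000, 1000, 1000));
  return 0;
}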
diff --git a/gcc/profile-count.h b/gcc/profile-count.h
index 4546e199f24..d793d11c830 100644
--- a/gcc/profile-count.h
+++ b/gcc/profile-count.h
@@ -21,21 +21,37 @@ along with GCC; see the file COPYING3. If not see
#ifndef GCC_PROFILE_COUNT_H
#define GCC_PROFILE_COUNT_H
+struct function;
+
/* Quality of the profile count. Because gengtype does not support enums
inside of classes, this is in global namespace. */
enum profile_quality {
+ /* Profile is based on static branch prediction heuristics and may
+ or may not match reality. It is local to the function and cannot be compared
+ inter-procedurally. Never used by probabilities (they are always local).
+ */
+ profile_guessed_local = 0,
+ /* Profile was read from feedback and was 0; we used local heuristics to guess
+ better. This is the case of functions not run during profile feedback.
+ Never used by probabilities. */
+ profile_guessed_global0 = 1,
+
+
/* Profile is based on static branch prediction heuristics. It may or may
- not reflect the reality. */
- profile_guessed = 0,
+ not reflect reality, but it can be compared interprocedurally
+ (for example, we inlined a function without profile feedback into a
+ function with feedback and propagated counts from that).
+ Never used by probabilities. */
+ profile_guessed = 2,
/* Profile was determined by autofdo. */
- profile_afdo = 1,
+ profile_afdo = 3,
/* Profile was originally based on feedback but it was adjusted
by code duplicating optimization. It may not precisely reflect the
particular code path. */
- profile_adjusted = 2,
+ profile_adjusted = 4,
/* Profile was read from profile feedback or determined by accurate static
method. */
- profile_precise = 3
+ profile_precise = 5
};
/* The base value for branch probability notes and edge probabilities. */
@@ -114,15 +130,15 @@ safe_scale_64bit (uint64_t a, uint64_t b, uint64_t c, uint64_t *res)
class GTY((user)) profile_probability
{
- static const int n_bits = 30;
+ static const int n_bits = 29;
/* We can technically use ((uint32_t) 1 << (n_bits - 1)) - 2 but that
will lead to harder multiplication sequences. */
static const uint32_t max_probability = (uint32_t) 1 << (n_bits - 2);
static const uint32_t uninitialized_probability
= ((uint32_t) 1 << (n_bits - 1)) - 1;
- uint32_t m_val : 30;
- enum profile_quality m_quality : 2;
+ uint32_t m_val : 29;
+ enum profile_quality m_quality : 3;
friend class profile_count;
public:
@@ -226,14 +242,14 @@ public:
static profile_probability from_reg_br_prob_note (int v)
{
profile_probability ret;
- ret.m_val = ((unsigned int)v) / 4;
- ret.m_quality = (enum profile_quality)(v & 3);
+ ret.m_val = ((unsigned int)v) / 8;
+ ret.m_quality = (enum profile_quality)(v & 7);
return ret;
}
int to_reg_br_prob_note () const
{
gcc_checking_assert (initialized_p ());
- int ret = m_val * 4 + m_quality;
+ int ret = m_val * 8 + m_quality;
gcc_checking_assert (profile_probability::from_reg_br_prob_note (ret)
== *this);
return ret;
@@ -489,8 +505,9 @@ public:
{
if (m_val == uninitialized_probability)
return m_quality == profile_guessed;
- else
- return m_val <= max_probability;
+ else if (m_quality < profile_guessed)
+ return false;
+ return m_val <= max_probability;
}
/* Comparsions are three-state and conservative. False is returned if
@@ -530,9 +547,32 @@ public:
void stream_out (struct lto_output_stream *);
};
-/* Main data type to hold profile counters in GCC. In most cases profile
- counts originate from profile feedback. They are 64bit integers
- representing number of executions during the train run.
+/* Main data type to hold profile counters in GCC. Profile counts originate
+ either from profile feedback, static profile estimation or both. We do not
+ perform whole program profile propagation and thus profile estimation
+ counters are often local to function, while counters from profile feedback
+ (or special cases of profile estimation) can be used inter-procedurally.
+
+ There are 3 basic types:
+ 1) local counters, which are the result of intra-procedural static profile
+ estimation.
+ 2) ipa counters, which are the result of profile feedback or a special case
+ of static profile estimation (such as in function main).
+ 3) counters which count as 0 inter-procedurally (because the given function
+ was never run in the feedback train run) but hold a local static profile
+ estimate.
+
+ Counters of types 1 and 3 cannot be mixed with counters of a different type
+ within an operation (because the whole function should use one type of
+ counter), with the exception that a global zero can mix into most operations
+ where the outcome is well defined.
+
+ To take a local counter and use it inter-procedurally, use the ipa member
+ function, which strips information irrelevant at the inter-procedural level.
+
+ Counters are 61-bit integers representing the number of executions during
+ the train run, or a normalized frequency within the function.
+
As the profile is maintained during the compilation, many adjustments are
made. Not all transformations can be made precisely, most importantly
when code is being duplicated. It also may happen that part of CFG has
@@ -567,12 +607,25 @@ class GTY(()) profile_count
64bit. Although a counter cannot be negative, we use a signed
type to hold various extra stages. */
- static const int n_bits = 62;
+ static const int n_bits = 61;
static const uint64_t max_count = ((uint64_t) 1 << n_bits) - 2;
static const uint64_t uninitialized_count = ((uint64_t) 1 << n_bits) - 1;
uint64_t m_val : n_bits;
- enum profile_quality m_quality : 2;
+ enum profile_quality m_quality : 3;
+
+ /* Return true if both values can meaningfully appear in a single function
+ body. All counters in a function must be either local or global; otherwise
+ operations between them are not really well defined. */
+ bool compatible_p (const profile_count other) const
+ {
+ if (!initialized_p () || !other.initialized_p ())
+ return true;
+ if (*this == profile_count::zero ()
+ || other == profile_count::zero ())
+ return true;
+ return ipa_p () == other.ipa_p ();
+ }
public:
/* Used for counters which are expected to be never executed. */
@@ -597,7 +650,7 @@ public:
{
profile_count c;
c.m_val = uninitialized_count;
- c.m_quality = profile_guessed;
+ c.m_quality = profile_guessed_local;
return c;
}
@@ -630,6 +683,11 @@ public:
{
return m_quality >= profile_adjusted;
}
+ /* Return true if the value can be used in inter-procedural operations. */
+ bool ipa_p () const
+ {
+ return !initialized_p () || m_quality >= profile_guessed_global0;
+ }
/* When merging basic blocks, the two different profile counts are unified.
Return true if this can be done without losing info about profile.
@@ -671,6 +729,7 @@ public:
return profile_count::uninitialized ();
profile_count ret;
+ gcc_checking_assert (compatible_p (other));
ret.m_val = m_val + other.m_val;
ret.m_quality = MIN (m_quality, other.m_quality);
return ret;
@@ -688,6 +747,7 @@ public:
return *this = profile_count::uninitialized ();
else
{
+ gcc_checking_assert (compatible_p (other));
m_val += other.m_val;
m_quality = MIN (m_quality, other.m_quality);
}
@@ -699,6 +759,7 @@ public:
return *this;
if (!initialized_p () || !other.initialized_p ())
return profile_count::uninitialized ();
+ gcc_checking_assert (compatible_p (other));
profile_count ret;
ret.m_val = m_val >= other.m_val ? m_val - other.m_val : 0;
ret.m_quality = MIN (m_quality, other.m_quality);
@@ -712,6 +773,7 @@ public:
return *this = profile_count::uninitialized ();
else
{
+ gcc_checking_assert (compatible_p (other));
m_val = m_val >= other.m_val ? m_val - other.m_val: 0;
m_quality = MIN (m_quality, other.m_quality);
}
@@ -721,48 +783,115 @@ public:
/* Return false if profile_count is bogus. */
bool verify () const
{
- return m_val != uninitialized_count || m_quality == profile_guessed;
+ return m_val != uninitialized_count || m_quality == profile_guessed_local;
}
/* Comparsions are three-state and conservative. False is returned if
the inequality can not be decided. */
bool operator< (const profile_count &other) const
{
- return initialized_p () && other.initialized_p () && m_val < other.m_val;
+ if (!initialized_p () || !other.initialized_p ())
+ return false;
+ if (*this == profile_count::zero ())
+ return !(other == profile_count::zero ());
+ if (other == profile_count::zero ())
+ return false;
+ gcc_checking_assert (compatible_p (other));
+ return m_val < other.m_val;
}
bool operator> (const profile_count &other) const
{
+ if (!initialized_p () || !other.initialized_p ())
+ return false;
+ if (*this == profile_count::zero ())
+ return false;
+ if (other == profile_count::zero ())
+ return !(*this == profile_count::zero ());
+ gcc_checking_assert (compatible_p (other));
return initialized_p () && other.initialized_p () && m_val > other.m_val;
}
bool operator< (const gcov_type other) const
{
+ gcc_checking_assert (ipa_p ());
gcc_checking_assert (other >= 0);
return initialized_p () && m_val < (uint64_t) other;
}
bool operator> (const gcov_type other) const
{
+ gcc_checking_assert (ipa_p ());
gcc_checking_assert (other >= 0);
return initialized_p () && m_val > (uint64_t) other;
}
bool operator<= (const profile_count &other) const
{
- return initialized_p () && other.initialized_p () && m_val <= other.m_val;
+ if (!initialized_p () || !other.initialized_p ())
+ return false;
+ if (*this == profile_count::zero ())
+ return true;
+ if (other == profile_count::zero ())
+ return (*this == profile_count::zero ());
+ gcc_checking_assert (compatible_p (other));
+ return m_val <= other.m_val;
}
bool operator>= (const profile_count &other) const
{
- return initialized_p () && other.initialized_p () && m_val >= other.m_val;
+ if (!initialized_p () || !other.initialized_p ())
+ return false;
+ if (other == profile_count::zero ())
+ return true;
+ if (*this == profile_count::zero ())
+ return !(other == profile_count::zero ());
+ gcc_checking_assert (compatible_p (other));
+ return m_val >= other.m_val;
}
bool operator<= (const gcov_type other) const
{
+ gcc_checking_assert (ipa_p ());
gcc_checking_assert (other >= 0);
return initialized_p () && m_val <= (uint64_t) other;
}
bool operator>= (const gcov_type other) const
{
+ gcc_checking_assert (ipa_p ());
gcc_checking_assert (other >= 0);
return initialized_p () && m_val >= (uint64_t) other;
}
+ /* Return true when value is not zero and can be used for scaling.
+ This is different from *this > 0 because that requires the counter to
+ be IPA. */
+ bool nonzero_p () const
+ {
+ return initialized_p () && m_val != 0;
+ }
+
+ /* Make the counter forcibly nonzero. */
+ profile_count force_nonzero () const
+ {
+ if (!initialized_p ())
+ return *this;
+ profile_count ret = *this;
+ if (ret.m_val == 0)
+ ret.m_val = 1;
+ return ret;
+ }
+
+ profile_count max (profile_count other) const
+ {
+ if (!initialized_p ())
+ return other;
+ if (!other.initialized_p ())
+ return *this;
+ if (*this == profile_count::zero ())
+ return other;
+ if (other == profile_count::zero ())
+ return *this;
+ gcc_checking_assert (compatible_p (other));
+ if (m_val < other.m_val || (m_val == other.m_val
+ && m_quality < other.m_quality))
+ return other;
+ return *this;
+ }
/* PROB is a probability in scale 0...REG_BR_PROB_BASE. Scale counter
accordingly. */
@@ -814,13 +943,13 @@ public:
}
profile_count apply_scale (profile_count num, profile_count den) const
{
- if (m_val == 0)
+ if (*this == profile_count::zero ())
return *this;
- if (num.m_val == 0)
+ if (num == profile_count::zero ())
return num;
if (!initialized_p () || !num.initialized_p () || !den.initialized_p ())
return profile_count::uninitialized ();
- gcc_checking_assert (den > 0);
+ gcc_checking_assert (den.m_val);
if (num == den)
return *this;
@@ -828,7 +957,30 @@ public:
uint64_t val;
safe_scale_64bit (m_val, num.m_val, den.m_val, &val);
ret.m_val = MIN (val, max_count);
- ret.m_quality = MIN (m_quality, profile_adjusted);
+ ret.m_quality = MIN (MIN (MIN (m_quality, profile_adjusted),
+ num.m_quality), den.m_quality);
+ if (num.ipa_p () && !ret.ipa_p ())
+ ret.m_quality = MIN (num.m_quality, profile_guessed);
+ return ret;
+ }
+
+ /* Return THIS with quality dropped to GUESSED_LOCAL. */
+ profile_count guessed_local () const
+ {
+ profile_count ret = *this;
+ if (!initialized_p ())
+ return *this;
+ ret.m_quality = profile_guessed_local;
+ return ret;
+ }
+
+ /* We know that the profile is globally 0, but keep the local profile if present. */
+ profile_count global0 () const
+ {
+ profile_count ret = *this;
+ if (!initialized_p ())
+ return *this;
+ ret.m_quality = profile_guessed_global0;
return ret;
}
@@ -836,10 +988,21 @@ public:
profile_count guessed () const
{
profile_count ret = *this;
- ret.m_quality = profile_guessed;
+ ret.m_quality = MIN (ret.m_quality, profile_guessed);
return ret;
}
+ /* Return a variant of the profile count which is always safe to compare
+ across functions. */
+ profile_count ipa () const
+ {
+ if (m_quality > profile_guessed_global0)
+ return *this;
+ if (m_quality == profile_guessed_global0)
+ return profile_count::zero ();
+ return profile_count::uninitialized ();
+ }
+
/* Return THIS with quality dropped to AFDO. */
profile_count afdo () const
{
@@ -852,21 +1015,26 @@ public:
OVERALL. */
profile_probability probability_in (const profile_count overall) const
{
- if (!m_val)
+ if (*this == profile_count::zero ())
return profile_probability::never ();
if (!initialized_p () || !overall.initialized_p ()
|| !overall.m_val)
return profile_probability::uninitialized ();
profile_probability ret;
- if (overall < m_val)
+ gcc_checking_assert (compatible_p (overall));
+
+ if (overall.m_val < m_val)
ret.m_val = profile_probability::max_probability;
else
ret.m_val = RDIV (m_val * profile_probability::max_probability,
overall.m_val);
- ret.m_quality = MIN (m_quality, overall.m_quality);
+ ret.m_quality = MAX (MIN (m_quality, overall.m_quality), profile_guessed);
return ret;
}
+ int to_frequency (struct function *fun) const;
+ int to_cgraph_frequency (profile_count entry_bb_count) const;
+
/* Output THIS to F. */
void dump (FILE *f) const;
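The re-encoded REG_BR_PROB note above reserves three low bits for the quality, which is why the probability payload shrinks to 29 bits. A standalone round-trip sketch of that packing (the names here are illustrative, not GCC's):

#include <cassert>
#include <cstdint>

struct encoded_prob { uint32_t val; unsigned quality; };

/* Value in the upper bits, 3-bit quality in the low bits.  */
static int encode_note (uint32_t val, unsigned quality)
{
  return (int) (val * 8 + quality);
}

static encoded_prob decode_note (int v)
{
  return { (uint32_t) (v / 8), (unsigned) (v & 7) };
}

int main ()
{
  encoded_prob p = decode_note (encode_note (1234, 5));
  assert (p.val == 1234 && p.quality == 5);
  return 0;
}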
diff --git a/gcc/profile.c b/gcc/profile.c
index 6d40241a37b..2b30a9e6754 100644
--- a/gcc/profile.c
+++ b/gcc/profile.c
@@ -476,38 +476,6 @@ read_profile_edge_counts (gcov_type *exec_counts)
return num_edges;
}
-#define OVERLAP_BASE 10000
-
-/* Compare the static estimated profile to the actual profile, and
- return the "degree of overlap" measure between them.
-
- Degree of overlap is a number between 0 and OVERLAP_BASE. It is
- the sum of each basic block's minimum relative weights between
- two profiles. And overlap of OVERLAP_BASE means two profiles are
- identical. */
-
-static int
-compute_frequency_overlap (void)
-{
- gcov_type count_total = 0, freq_total = 0;
- int overlap = 0;
- basic_block bb;
-
- FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), NULL, next_bb)
- {
- count_total += bb_gcov_count (bb);
- freq_total += bb->frequency;
- }
-
- if (count_total == 0 || freq_total == 0)
- return 0;
-
- FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), NULL, next_bb)
- overlap += MIN (bb_gcov_count (bb) * OVERLAP_BASE / count_total,
- bb->frequency * OVERLAP_BASE / freq_total);
-
- return overlap;
-}
/* Compute the branch probabilities for the various branches.
Annotate them accordingly.
@@ -676,14 +644,6 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum)
}
}
}
- if (dump_file)
- {
- int overlap = compute_frequency_overlap ();
- gimple_dump_cfg (dump_file, dump_flags);
- fprintf (dump_file, "Static profile overlap: %d.%d%%\n",
- overlap / (OVERLAP_BASE / 100),
- overlap % (OVERLAP_BASE / 100));
- }
total_num_passes += passes;
if (dump_file)
@@ -829,15 +789,18 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum)
}
}
- FOR_ALL_BB_FN (bb, cfun)
- {
- edge e;
- edge_iterator ei;
-
+ /* If we have real data, use them! */
+ if (bb_gcov_count (ENTRY_BLOCK_PTR_FOR_FN (cfun))
+ || !flag_guess_branch_prob)
+ FOR_ALL_BB_FN (bb, cfun)
bb->count = profile_count::from_gcov_type (bb_gcov_count (bb));
- FOR_EACH_EDGE (e, ei, bb->succs)
- e->count = profile_count::from_gcov_type (edge_gcov_count (e));
- }
+ /* If the function was not trained, preserve the local estimates, including
+ statically determined zero counts. */
+ else
+ FOR_ALL_BB_FN (bb, cfun)
+ if (!(bb->count == profile_count::zero ()))
+ bb->count = bb->count.global0 ();
+
bb_gcov_counts.release ();
delete edge_gcov_counts;
edge_gcov_counts = NULL;
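
The branch above decides between trusting the measured counters and keeping the static estimates. The following is a condensed, standalone restatement of that decision only, with plain integers in place of profile_count and illustrative parameter names (gcov, est, trained) that are not part of GCC.

  #include <stdbool.h>

  /* Sketch only: mirrors the trained/untrained decision above.  gcov[] stands
     in for the per-block feedback counters, est[] for the static estimates,
     and index 0 for the entry block.  */
  static void
  assign_counts (long long *out, const long long *gcov, const long long *est,
                 int nblocks, bool guess_branch_prob)
  {
    /* Real data are present (or guessing is disabled): use the counters.  */
    bool trained = gcov[0] != 0 || !guess_branch_prob;

    for (int i = 0; i < nblocks; i++)
      if (trained)
        out[i] = gcov[i];
      /* Untrained: keep the local estimate; in GCC it is additionally
         downgraded to "global0" quality, while a statically proven zero
         count is left untouched.  */
      else
        out[i] = est[i];
  }
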
diff --git a/gcc/recog.c b/gcc/recog.c
index 0ac16b9b87f..05e69134236 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -1007,7 +1007,7 @@ general_operand (rtx op, machine_mode mode)
??? This is a kludge. */
if (!reload_completed
- && maybe_nonzero (SUBREG_BYTE (op))
+ && may_ne (SUBREG_BYTE (op), 0)
&& MEM_P (sub))
return 0;
@@ -1380,7 +1380,7 @@ indirect_operand (rtx op, machine_mode mode)
operand. */
poly_int64 offset;
rtx addr = strip_offset (XEXP (SUBREG_REG (op), 0), &offset);
- return (known_zero (offset + SUBREG_BYTE (op))
+ return (must_eq (offset + SUBREG_BYTE (op), 0)
&& general_operand (addr, Pmode));
}
@@ -1967,7 +1967,7 @@ offsettable_address_addr_space_p (int strictp, machine_mode mode, rtx y,
Clearly that depends on the situation in which it's being used.
However, the current situation in which we test 0xffffffff is
less than ideal. Caveat user. */
- if (known_zero (mode_sz))
+ if (must_eq (mode_sz, 0))
mode_sz = BIGGEST_ALIGNMENT / BITS_PER_UNIT;
/* If the expression contains a constant term,
@@ -3379,6 +3379,7 @@ peep2_attempt (basic_block bb, rtx_insn *insn, int match_len, rtx_insn *attempt)
case REG_NORETURN:
case REG_SETJMP:
case REG_TM:
+ case REG_CALL_NOCF_CHECK:
add_reg_note (new_insn, REG_NOTE_KIND (note),
XEXP (note, 0));
break;
@@ -3860,7 +3861,7 @@ const pass_data pass_data_split_all_insns =
OPTGROUP_NONE, /* optinfo_flags */
TV_NONE, /* tv_id */
0, /* properties_required */
- 0, /* properties_provided */
+ PROP_rtl_split_insns, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
0, /* todo_flags_finish */
diff --git a/gcc/reg-notes.def b/gcc/reg-notes.def
index a542990cde2..d83fc45ef72 100644
--- a/gcc/reg-notes.def
+++ b/gcc/reg-notes.def
@@ -232,3 +232,10 @@ REG_NOTE (STACK_CHECK)
The decl might not be available in the call due to splitting of the call
insn. This note is a SYMBOL_REF. */
REG_NOTE (CALL_DECL)
+
+/* Indicate that a call should not be verified for control-flow consistency.
+ The target address of the call is assumed to be valid and no check to
+ validate a branch to that address is needed. The call is marked when the
+ called function has a 'notrack' attribute. This note is used by the
+ compiler when the option -fcf-protection=branch is specified. */
+REG_NOTE (CALL_NOCF_CHECK)
diff --git a/gcc/reg-stack.c b/gcc/reg-stack.c
index f2381067f5e..83fc4762671 100644
--- a/gcc/reg-stack.c
+++ b/gcc/reg-stack.c
@@ -262,7 +262,7 @@ static bool move_for_stack_reg (rtx_insn *, stack_ptr, rtx);
static bool move_nan_for_stack_reg (rtx_insn *, stack_ptr, rtx);
static int swap_rtx_condition_1 (rtx);
static int swap_rtx_condition (rtx_insn *);
-static void compare_for_stack_reg (rtx_insn *, stack_ptr, rtx);
+static void compare_for_stack_reg (rtx_insn *, stack_ptr, rtx, bool);
static bool subst_stack_regs_pat (rtx_insn *, stack_ptr, rtx);
static void subst_asm_stack_regs (rtx_insn *, stack_ptr);
static bool subst_stack_regs (rtx_insn *, stack_ptr);
@@ -1325,7 +1325,8 @@ swap_rtx_condition (rtx_insn *insn)
set up. */
static void
-compare_for_stack_reg (rtx_insn *insn, stack_ptr regstack, rtx pat_src)
+compare_for_stack_reg (rtx_insn *insn, stack_ptr regstack,
+ rtx pat_src, bool can_pop_second_op)
{
rtx *src1, *src2;
rtx src1_note, src2_note;
@@ -1366,8 +1367,18 @@ compare_for_stack_reg (rtx_insn *insn, stack_ptr regstack, rtx pat_src)
if (src1_note)
{
- pop_stack (regstack, REGNO (XEXP (src1_note, 0)));
- replace_reg (&XEXP (src1_note, 0), FIRST_STACK_REG);
+ if (*src2 == CONST0_RTX (GET_MODE (*src2)))
+ {
+ /* This is the `ftst' insn, which cannot pop the register. */
+ remove_regno_note (insn, REG_DEAD, REGNO (XEXP (src1_note, 0)));
+ emit_pop_insn (insn, regstack, XEXP (src1_note, 0),
+ EMIT_AFTER);
+ }
+ else
+ {
+ pop_stack (regstack, REGNO (XEXP (src1_note, 0)));
+ replace_reg (&XEXP (src1_note, 0), FIRST_STACK_REG);
+ }
}
/* If the second operand dies, handle that. But if the operands are
@@ -1384,7 +1395,7 @@ compare_for_stack_reg (rtx_insn *insn, stack_ptr regstack, rtx pat_src)
at top (FIRST_STACK_REG) now. */
if (get_hard_regnum (regstack, XEXP (src2_note, 0)) == FIRST_STACK_REG
- && src1_note)
+ && src1_note && can_pop_second_op)
{
pop_stack (regstack, REGNO (XEXP (src2_note, 0)));
replace_reg (&XEXP (src2_note, 0), FIRST_STACK_REG + 1);
@@ -1549,10 +1560,6 @@ subst_stack_regs_pat (rtx_insn *insn, stack_ptr regstack, rtx pat)
switch (GET_CODE (pat_src))
{
- case COMPARE:
- compare_for_stack_reg (insn, regstack, pat_src);
- break;
-
case CALL:
{
int count;
@@ -1953,31 +1960,35 @@ subst_stack_regs_pat (rtx_insn *insn, stack_ptr regstack, rtx pat)
replace_reg (src2, FIRST_STACK_REG + 1);
break;
- case UNSPEC_SAHF:
- /* (unspec [(unspec [(compare)] UNSPEC_FNSTSW)] UNSPEC_SAHF)
- The combination matches the PPRO fcomi instruction. */
-
- pat_src = XVECEXP (pat_src, 0, 0);
- gcc_assert (GET_CODE (pat_src) == UNSPEC);
- gcc_assert (XINT (pat_src, 1) == UNSPEC_FNSTSW);
- /* Fall through. */
-
case UNSPEC_FNSTSW:
/* Combined fcomp+fnstsw generated for doing well with
CSE. When optimizing this would have been broken
up before now. */
pat_src = XVECEXP (pat_src, 0, 0);
- gcc_assert (GET_CODE (pat_src) == COMPARE);
+ if (GET_CODE (pat_src) == COMPARE)
+ goto do_compare;
- compare_for_stack_reg (insn, regstack, pat_src);
- break;
+ /* Fall through. */
+
+ case UNSPEC_NOTRAP:
+
+ pat_src = XVECEXP (pat_src, 0, 0);
+ gcc_assert (GET_CODE (pat_src) == COMPARE);
+ goto do_compare;
default:
gcc_unreachable ();
}
break;
+ case COMPARE:
+ do_compare:
+ /* The `fcomi' insn can't pop two regs. */
+ compare_for_stack_reg (insn, regstack, pat_src,
+ REGNO (*dest) != FLAGS_REG);
+ break;
+
case IF_THEN_ELSE:
/* This insn requires the top of stack to be the destination. */
@@ -2948,9 +2959,9 @@ better_edge (edge e1, edge e2)
if (EDGE_FREQUENCY (e1) < EDGE_FREQUENCY (e2))
return e2;
- if (e1->count > e2->count)
+ if (e1->count () > e2->count ())
return e1;
- if (e1->count < e2->count)
+ if (e1->count () < e2->count ())
return e2;
/* Prefer critical edges to minimize inserting compensation code on
diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 55f4ea36a7d..4ca10f58a58 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -345,8 +345,8 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
We can't properly represent the latter case in our tables, so don't
record anything then. */
else if (sn < hard_regno_nregs (sr, vd->e[sr].mode)
- && maybe_nonzero (subreg_lowpart_offset (GET_MODE (dest),
- vd->e[sr].mode)))
+ && may_ne (subreg_lowpart_offset (GET_MODE (dest),
+ vd->e[sr].mode), 0U))
return;
/* If SRC had been assigned a mode narrower than the copy, we can't
@@ -870,8 +870,8 @@ copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd)
/* And likewise, if we are narrowing on big endian the transformation
is also invalid. */
if (REG_NREGS (src) < hard_regno_nregs (regno, vd->e[regno].mode)
- && maybe_nonzero (subreg_lowpart_offset (mode,
- vd->e[regno].mode)))
+ && may_ne (subreg_lowpart_offset (mode,
+ vd->e[regno].mode), 0U))
goto no_move_special_case;
}
diff --git a/gcc/regs.h b/gcc/regs.h
index 8225355e3f3..2dc94a929d3 100644
--- a/gcc/regs.h
+++ b/gcc/regs.h
@@ -130,8 +130,10 @@ extern size_t reg_info_p_size;
frequency. */
#define REG_FREQ_FROM_BB(bb) (optimize_function_for_size_p (cfun) \
? REG_FREQ_MAX \
- : ((bb)->frequency * REG_FREQ_MAX / BB_FREQ_MAX)\
- ? ((bb)->frequency * REG_FREQ_MAX / BB_FREQ_MAX)\
+ : ((bb)->count.to_frequency (cfun) \
+ * REG_FREQ_MAX / BB_FREQ_MAX) \
+ ? ((bb)->count.to_frequency (cfun) \
+ * REG_FREQ_MAX / BB_FREQ_MAX) \
: 1)
/* Indexed by N, gives number of insns in which register N dies.
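
REG_FREQ_FROM_BB above now derives the register frequency from the block's profile count: the value is rescaled from the BB_FREQ_MAX range to the REG_FREQ_MAX range, floored at 1, and forced to REG_FREQ_MAX when optimizing for size. A standalone sketch of that scaling follows; the constants are illustrative, not GCC's actual values.

  /* Sketch of the REG_FREQ_FROM_BB scaling above; illustrative constants.  */
  #define REG_FREQ_MAX 1000
  #define BB_FREQ_MAX  10000

  static int
  reg_freq_from_bb_freq (int bb_freq, int optimize_size)
  {
    if (optimize_size)
      return REG_FREQ_MAX;             /* size: treat every block as hot */
    int scaled = bb_freq * REG_FREQ_MAX / BB_FREQ_MAX;
    return scaled ? scaled : 1;        /* never report a frequency of 0 */
  }
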
diff --git a/gcc/reload.c b/gcc/reload.c
index c09a9c6a3f8..744cd51a2b0 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -1785,7 +1785,7 @@ combine_reloads (void)
&& (ira_reg_class_max_nregs [(int)rld[i].rclass][(int) rld[i].inmode]
== ira_reg_class_max_nregs [(int) rld[output_reload].rclass]
[(int) rld[output_reload].outmode])
- && known_zero (rld[i].inc)
+ && must_eq (rld[i].inc, 0)
&& rld[i].reg_rtx == 0
/* Don't combine two reloads with different secondary
memory locations. */
@@ -6170,7 +6170,7 @@ find_reloads_subreg_address (rtx x, int opnum, enum reload_type type,
XEXP (tem, 0), &XEXP (tem, 0),
opnum, type, ind_levels, insn);
/* ??? Do we need to handle nonzero offsets somehow? */
- if (known_zero (offset) && !rtx_equal_p (tem, orig))
+ if (must_eq (offset, 0) && !rtx_equal_p (tem, orig))
push_reg_equiv_alt_mem (regno, tem);
/* For some processors an address may be valid in the original mode but
@@ -7120,7 +7120,7 @@ find_inc_amount (rtx x, rtx inced)
if (fmt[i] == 'e')
{
poly_int64 tem = find_inc_amount (XEXP (x, i), inced);
- if (maybe_nonzero (tem))
+ if (may_ne (tem, 0))
return tem;
}
if (fmt[i] == 'E')
@@ -7129,7 +7129,7 @@ find_inc_amount (rtx x, rtx inced)
for (j = XVECLEN (x, i) - 1; j >= 0; j--)
{
poly_int64 tem = find_inc_amount (XVECEXP (x, i, j), inced);
- if (maybe_nonzero (tem))
+ if (may_ne (tem, 0))
return tem;
}
}
@@ -7290,7 +7290,7 @@ debug_reload_to_stream (FILE *f)
if (rld[r].nongroup)
fprintf (f, ", nongroup");
- if (maybe_nonzero (rld[r].inc))
+ if (may_ne (rld[r].inc, 0))
{
fprintf (f, ", inc by ");
print_dec (rld[r].inc, f, SIGNED);
diff --git a/gcc/reload1.c b/gcc/reload1.c
index 2ec09c4a7cc..902d940245d 100644
--- a/gcc/reload1.c
+++ b/gcc/reload1.c
@@ -955,7 +955,7 @@ reload (rtx_insn *first, int global)
if (caller_save_needed)
setup_save_areas ();
- if (maybe_nonzero (starting_frame_size) && crtl->stack_alignment_needed)
+ if (may_ne (starting_frame_size, 0) && crtl->stack_alignment_needed)
{
/* If we have a stack frame, we must align it now. The
stack size may be a part of the offset computation for
@@ -2196,7 +2196,7 @@ alter_reg (int i, int from_reg, bool dont_share_p)
if (BYTES_BIG_ENDIAN)
{
adjust = inherent_size - total_size;
- if (maybe_nonzero (adjust))
+ if (may_ne (adjust, 0))
{
poly_uint64 total_bits = total_size * BITS_PER_UNIT;
machine_mode mem_mode
@@ -2254,7 +2254,7 @@ alter_reg (int i, int from_reg, bool dont_share_p)
if (BYTES_BIG_ENDIAN)
{
adjust = GET_MODE_SIZE (mode) - total_size;
- if (maybe_nonzero (adjust))
+ if (may_ne (adjust, 0))
{
poly_uint64 total_bits = total_size * BITS_PER_UNIT;
machine_mode mem_mode
@@ -3383,7 +3383,7 @@ eliminate_regs_in_insn (rtx_insn *insn, int replace)
increase the cost of the insn by replacing a simple REG
with (plus (reg sp) CST). So try only when we already
had a PLUS before. */
- if (known_zero (offset) || plus_src)
+ if (must_eq (offset, 0) || plus_src)
{
rtx new_src = plus_constant (GET_MODE (to_rtx),
to_rtx, offset);
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 3c297eb501f..79a5ae197c1 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -494,7 +494,7 @@ rtx_addr_can_trap_p_1 (const_rtx x, poly_int64 offset, poly_int64 size,
if (may_lt (offset, 0))
return 1;
- if (known_zero (offset))
+ if (must_eq (offset, 0))
return 0;
if (!known_size_p (size))
return 1;
@@ -649,7 +649,7 @@ rtx_addr_can_trap_p_1 (const_rtx x, poly_int64 offset, poly_int64 size,
if (XEXP (x, 0) == pic_offset_table_rtx
&& GET_CODE (XEXP (x, 1)) == CONST
&& GET_CODE (XEXP (XEXP (x, 1), 0)) == UNSPEC
- && known_zero (offset))
+ && must_eq (offset, 0))
return 0;
/* - or it is an address that can't trap plus a constant integer. */
@@ -3641,7 +3641,7 @@ subreg_size_offset_from_lsb (poly_uint64 outer_bytes, poly_uint64 inner_bytes,
gcc_checking_assert (ordered_p (outer_bytes, inner_bytes));
if (may_gt (outer_bytes, inner_bytes))
{
- gcc_checking_assert (known_zero (lsb_shift));
+ gcc_checking_assert (must_eq (lsb_shift, 0U));
return 0;
}
@@ -3745,7 +3745,7 @@ subreg_get_info (unsigned int xregno, machine_mode xmode,
gcc_checking_assert (ordered_p (xsize, ysize));
/* Paradoxical subregs are otherwise valid. */
- if (!rknown && known_zero (offset) && may_gt (ysize, xsize))
+ if (!rknown && must_eq (offset, 0U) && may_gt (ysize, xsize))
{
info->representable_p = true;
/* If this is a big endian paradoxical subreg, which uses more
@@ -3822,7 +3822,7 @@ subreg_get_info (unsigned int xregno, machine_mode xmode,
info->representable_p = true;
rknown = true;
- if (known_zero (offset) || nregs_xmode == nregs_ymode)
+ if (must_eq (offset, 0U) || nregs_xmode == nregs_ymode)
{
info->offset = 0;
info->nregs = nregs_ymode;
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index 9d963f05c21..00e7ae031e6 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -424,8 +424,8 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE,
"__ubsan_handle_vla_bound_not_positive",
BT_FN_VOID_PTR_PTR,
ATTR_COLD_NOTHROW_LEAF_LIST)
-DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH,
- "__ubsan_handle_type_mismatch",
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1,
+ "__ubsan_handle_type_mismatch_v1",
BT_FN_VOID_PTR_PTR,
ATTR_COLD_NOTHROW_LEAF_LIST)
DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_ADD_OVERFLOW,
@@ -464,8 +464,8 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE_ABORT,
"__ubsan_handle_vla_bound_not_positive_abort",
BT_FN_VOID_PTR_PTR,
ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
-DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_ABORT,
- "__ubsan_handle_type_mismatch_abort",
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1_ABORT,
+ "__ubsan_handle_type_mismatch_v1_abort",
BT_FN_VOID_PTR_PTR,
ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_ADD_OVERFLOW_ABORT,
@@ -516,12 +516,20 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_NONNULL_ARG_ABORT,
"__ubsan_handle_nonnull_arg_abort",
BT_FN_VOID_PTR,
ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
-DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN,
- "__ubsan_handle_nonnull_return",
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_V1,
+ "__ubsan_handle_nonnull_return_v1",
+ BT_FN_VOID_PTR_PTR,
+ ATTR_COLD_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_V1_ABORT,
+ "__ubsan_handle_nonnull_return_v1_abort",
+ BT_FN_VOID_PTR_PTR,
+ ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_INVALID_BUILTIN,
+ "__ubsan_handle_invalid_builtin",
BT_FN_VOID_PTR,
ATTR_COLD_NOTHROW_LEAF_LIST)
-DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_ABORT,
- "__ubsan_handle_nonnull_return_abort",
+DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_INVALID_BUILTIN_ABORT,
+ "__ubsan_handle_invalid_builtin_abort",
BT_FN_VOID_PTR,
ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HANDLE_DYNAMIC_TYPE_CACHE_MISS,
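
The new __ubsan_handle_invalid_builtin entry points give UBSan a way to report misuse of builtins whose behaviour is undefined for certain arguments. A hedged example of the kind of code such a check targets; the -fsanitize=builtin option mentioned in the comment is an assumption, since this diff only adds the runtime entry points.

  /* Illustration only.  __builtin_ctz is undefined for a zero argument;
     a builtin-misuse sanitizer (assumed to be enabled with something like
     -fsanitize=builtin, which this diff does not show) would route the
     failure through __ubsan_handle_invalid_builtin.  */
  int
  trailing_zeros (unsigned int x)
  {
    return __builtin_ctz (x);   /* undefined when x == 0 */
  }
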
diff --git a/gcc/sbitmap.c b/gcc/sbitmap.c
index baef4d05f0d..df933f6516c 100644
--- a/gcc/sbitmap.c
+++ b/gcc/sbitmap.c
@@ -180,6 +180,8 @@ sbitmap_vector_alloc (unsigned int n_vecs, unsigned int n_elms)
void
bitmap_copy (sbitmap dst, const_sbitmap src)
{
+ gcc_checking_assert (src->size <= dst->size);
+
memcpy (dst->elms, src->elms, sizeof (SBITMAP_ELT_TYPE) * dst->size);
}
@@ -187,6 +189,8 @@ bitmap_copy (sbitmap dst, const_sbitmap src)
int
bitmap_equal_p (const_sbitmap a, const_sbitmap b)
{
+ bitmap_check_sizes (a, b);
+
return !memcmp (a->elms, b->elms, sizeof (SBITMAP_ELT_TYPE) * a->size);
}
@@ -211,6 +215,8 @@ bitmap_clear_range (sbitmap bmap, unsigned int start, unsigned int count)
if (count == 0)
return;
+ bitmap_check_index (bmap, start + count - 1);
+
unsigned int start_word = start / SBITMAP_ELT_BITS;
unsigned int start_bitno = start % SBITMAP_ELT_BITS;
@@ -267,6 +273,8 @@ bitmap_set_range (sbitmap bmap, unsigned int start, unsigned int count)
if (count == 0)
return;
+ bitmap_check_index (bmap, start + count - 1);
+
unsigned int start_word = start / SBITMAP_ELT_BITS;
unsigned int start_bitno = start % SBITMAP_ELT_BITS;
@@ -324,6 +332,8 @@ bool
bitmap_bit_in_range_p (const_sbitmap bmap, unsigned int start, unsigned int end)
{
gcc_checking_assert (start <= end);
+ bitmap_check_index (bmap, end);
+
unsigned int start_word = start / SBITMAP_ELT_BITS;
unsigned int start_bitno = start % SBITMAP_ELT_BITS;
@@ -467,6 +477,9 @@ bitmap_vector_ones (sbitmap *bmap, unsigned int n_vecs)
bool
bitmap_ior_and_compl (sbitmap dst, const_sbitmap a, const_sbitmap b, const_sbitmap c)
{
+ bitmap_check_sizes (a, b);
+ bitmap_check_sizes (b, c);
+
unsigned int i, n = dst->size;
sbitmap_ptr dstp = dst->elms;
const_sbitmap_ptr ap = a->elms;
@@ -489,6 +502,8 @@ bitmap_ior_and_compl (sbitmap dst, const_sbitmap a, const_sbitmap b, const_sbitm
void
bitmap_not (sbitmap dst, const_sbitmap src)
{
+ bitmap_check_sizes (src, dst);
+
unsigned int i, n = dst->size;
sbitmap_ptr dstp = dst->elms;
const_sbitmap_ptr srcp = src->elms;
@@ -510,6 +525,9 @@ bitmap_not (sbitmap dst, const_sbitmap src)
void
bitmap_and_compl (sbitmap dst, const_sbitmap a, const_sbitmap b)
{
+ bitmap_check_sizes (a, b);
+ bitmap_check_sizes (b, dst);
+
unsigned int i, dst_size = dst->size;
unsigned int min_size = dst->size;
sbitmap_ptr dstp = dst->elms;
@@ -537,6 +555,8 @@ bitmap_and_compl (sbitmap dst, const_sbitmap a, const_sbitmap b)
bool
bitmap_intersect_p (const_sbitmap a, const_sbitmap b)
{
+ bitmap_check_sizes (a, b);
+
const_sbitmap_ptr ap = a->elms;
const_sbitmap_ptr bp = b->elms;
unsigned int i, n;
@@ -555,6 +575,9 @@ bitmap_intersect_p (const_sbitmap a, const_sbitmap b)
bool
bitmap_and (sbitmap dst, const_sbitmap a, const_sbitmap b)
{
+ bitmap_check_sizes (a, b);
+ bitmap_check_sizes (b, dst);
+
unsigned int i, n = dst->size;
sbitmap_ptr dstp = dst->elms;
const_sbitmap_ptr ap = a->elms;
@@ -577,6 +600,9 @@ bitmap_and (sbitmap dst, const_sbitmap a, const_sbitmap b)
bool
bitmap_xor (sbitmap dst, const_sbitmap a, const_sbitmap b)
{
+ bitmap_check_sizes (a, b);
+ bitmap_check_sizes (b, dst);
+
unsigned int i, n = dst->size;
sbitmap_ptr dstp = dst->elms;
const_sbitmap_ptr ap = a->elms;
@@ -599,6 +625,9 @@ bitmap_xor (sbitmap dst, const_sbitmap a, const_sbitmap b)
bool
bitmap_ior (sbitmap dst, const_sbitmap a, const_sbitmap b)
{
+ bitmap_check_sizes (a, b);
+ bitmap_check_sizes (b, dst);
+
unsigned int i, n = dst->size;
sbitmap_ptr dstp = dst->elms;
const_sbitmap_ptr ap = a->elms;
@@ -620,6 +649,8 @@ bitmap_ior (sbitmap dst, const_sbitmap a, const_sbitmap b)
bool
bitmap_subset_p (const_sbitmap a, const_sbitmap b)
{
+ bitmap_check_sizes (a, b);
+
unsigned int i, n = a->size;
const_sbitmap_ptr ap, bp;
@@ -636,6 +667,10 @@ bitmap_subset_p (const_sbitmap a, const_sbitmap b)
bool
bitmap_or_and (sbitmap dst, const_sbitmap a, const_sbitmap b, const_sbitmap c)
{
+ bitmap_check_sizes (a, b);
+ bitmap_check_sizes (b, c);
+ bitmap_check_sizes (c, dst);
+
unsigned int i, n = dst->size;
sbitmap_ptr dstp = dst->elms;
const_sbitmap_ptr ap = a->elms;
@@ -659,6 +694,10 @@ bitmap_or_and (sbitmap dst, const_sbitmap a, const_sbitmap b, const_sbitmap c)
bool
bitmap_and_or (sbitmap dst, const_sbitmap a, const_sbitmap b, const_sbitmap c)
{
+ bitmap_check_sizes (a, b);
+ bitmap_check_sizes (b, c);
+ bitmap_check_sizes (c, dst);
+
unsigned int i, n = dst->size;
sbitmap_ptr dstp = dst->elms;
const_sbitmap_ptr ap = a->elms;
@@ -823,11 +862,64 @@ namespace selftest {
/* Selftests for sbitmaps. */
+/* Checking function that uses both bitmap_bit_in_range_p and a loop of
+ bitmap_bit_p calls and verifies that their results are consistent. */
+
+static bool
+bitmap_bit_in_range_p_checking (sbitmap s, unsigned int start,
+ unsigned end)
+{
+ bool r1 = bitmap_bit_in_range_p (s, start, end);
+ bool r2 = false;
+
+ for (unsigned int i = start; i <= end; i++)
+ if (bitmap_bit_p (s, i))
+ {
+ r2 = true;
+ break;
+ }
+
+ ASSERT_EQ (r1, r2);
+ return r1;
+}
+
+/* Verify the bitmap_set_range function for sbitmap. */
+
+static void
+test_set_range ()
+{
+ sbitmap s = sbitmap_alloc (16);
+ bitmap_clear (s);
+
+ bitmap_set_range (s, 0, 1);
+ ASSERT_TRUE (bitmap_bit_in_range_p_checking (s, 0, 0));
+ ASSERT_FALSE (bitmap_bit_in_range_p_checking (s, 1, 15));
+ bitmap_set_range (s, 15, 1);
+ ASSERT_FALSE (bitmap_bit_in_range_p_checking (s, 1, 14));
+ ASSERT_TRUE (bitmap_bit_in_range_p_checking (s, 15, 15));
+
+ s = sbitmap_alloc (1024);
+ bitmap_clear (s);
+ bitmap_set_range (s, 512, 1);
+ ASSERT_FALSE (bitmap_bit_in_range_p_checking (s, 0, 511));
+ ASSERT_FALSE (bitmap_bit_in_range_p_checking (s, 513, 1023));
+ ASSERT_TRUE (bitmap_bit_in_range_p_checking (s, 512, 512));
+ ASSERT_TRUE (bitmap_bit_in_range_p_checking (s, 508, 512));
+ ASSERT_TRUE (bitmap_bit_in_range_p_checking (s, 508, 513));
+ ASSERT_FALSE (bitmap_bit_in_range_p_checking (s, 508, 511));
+
+ bitmap_clear (s);
+ bitmap_set_range (s, 512, 64);
+ ASSERT_FALSE (bitmap_bit_in_range_p_checking (s, 0, 511));
+ ASSERT_FALSE (bitmap_bit_in_range_p_checking (s, 512 + 64, 1023));
+ ASSERT_TRUE (bitmap_bit_in_range_p_checking (s, 512, 512));
+ ASSERT_TRUE (bitmap_bit_in_range_p_checking (s, 512 + 63, 512 + 63));
+}
-/* Verify range functions for sbitmap. */
+/* Verify the bitmap_bit_in_range_p function for sbitmap. */
static void
-test_range_functions ()
+test_bit_in_range ()
{
sbitmap s = sbitmap_alloc (1024);
bitmap_clear (s);
@@ -900,7 +992,8 @@ test_range_functions ()
void
sbitmap_c_tests ()
{
- test_range_functions ();
+ test_set_range ();
+ test_bit_in_range ();
}
} // namespace selftest
diff --git a/gcc/sbitmap.h b/gcc/sbitmap.h
index ff52e939bf3..a5ff0685e43 100644
--- a/gcc/sbitmap.h
+++ b/gcc/sbitmap.h
@@ -96,10 +96,29 @@ struct simple_bitmap_def
/* Return the number of bits in BITMAP. */
#define SBITMAP_SIZE(BITMAP) ((BITMAP)->n_bits)
+/* Verify that access at INDEX in bitmap MAP is valid. */
+
+static inline void
+bitmap_check_index (const_sbitmap map, int index)
+{
+ gcc_checking_assert (index >= 0);
+ gcc_checking_assert ((unsigned int)index < map->n_bits);
+}
+
+/* Verify that bitmaps A and B have the same size. */
+
+static inline void
+bitmap_check_sizes (const_sbitmap a, const_sbitmap b)
+{
+ gcc_checking_assert (a->n_bits == b->n_bits);
+}
+
/* Test if bit number bitno in the bitmap is set. */
static inline SBITMAP_ELT_TYPE
bitmap_bit_p (const_sbitmap map, int bitno)
{
+ bitmap_check_index (map, bitno);
+
size_t i = bitno / SBITMAP_ELT_BITS;
unsigned int s = bitno % SBITMAP_ELT_BITS;
return (map->elms[i] >> s) & (SBITMAP_ELT_TYPE) 1;
@@ -110,6 +129,8 @@ bitmap_bit_p (const_sbitmap map, int bitno)
static inline void
bitmap_set_bit (sbitmap map, int bitno)
{
+ bitmap_check_index (map, bitno);
+
map->elms[bitno / SBITMAP_ELT_BITS]
|= (SBITMAP_ELT_TYPE) 1 << (bitno) % SBITMAP_ELT_BITS;
}
@@ -119,6 +140,8 @@ bitmap_set_bit (sbitmap map, int bitno)
static inline void
bitmap_clear_bit (sbitmap map, int bitno)
{
+ bitmap_check_index (map, bitno);
+
map->elms[bitno / SBITMAP_ELT_BITS]
&= ~((SBITMAP_ELT_TYPE) 1 << (bitno) % SBITMAP_ELT_BITS);
}
@@ -148,6 +171,8 @@ static inline void
bmp_iter_set_init (sbitmap_iterator *i, const_sbitmap bmp,
unsigned int min, unsigned *bit_no ATTRIBUTE_UNUSED)
{
+ bitmap_check_index (bmp, min);
+
i->word_num = min / (unsigned int) SBITMAP_ELT_BITS;
i->bit_num = min;
i->size = bmp->size;
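
The bitmap_check_index and bitmap_check_sizes helpers added above turn out-of-range accesses and size mismatches into checking-time assertion failures. Below is a standalone sketch of the same bounds-check idea, using plain C assert in place of gcc_checking_assert; the struct and function names are illustrative only and not the sbitmap API.

  #include <assert.h>

  /* Minimal stand-in for simple_bitmap_def: a bit count plus storage.  */
  struct tiny_bitmap { unsigned int n_bits; unsigned char *bits; };

  /* Mirrors bitmap_check_index: reject negative or too-large indices
     before the element array is touched.  */
  static void
  tiny_check_index (const struct tiny_bitmap *map, int index)
  {
    assert (index >= 0);
    assert ((unsigned int) index < map->n_bits);
  }

  /* Mirrors the checked bitmap_bit_p: validate the index, then test the bit.  */
  static int
  tiny_bit_p (const struct tiny_bitmap *map, int bitno)
  {
    tiny_check_index (map, bitno);
    return (map->bits[bitno / 8] >> (bitno % 8)) & 1;
  }
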
diff --git a/gcc/sched-ebb.c b/gcc/sched-ebb.c
index a0422f4b1ba..e51749c60cc 100644
--- a/gcc/sched-ebb.c
+++ b/gcc/sched-ebb.c
@@ -231,11 +231,9 @@ rank (rtx_insn *insn1, rtx_insn *insn2)
basic_block bb1 = BLOCK_FOR_INSN (insn1);
basic_block bb2 = BLOCK_FOR_INSN (insn2);
- if (bb1->count > bb2->count
- || bb1->frequency > bb2->frequency)
+ if (bb1->count > bb2->count)
return -1;
- if (bb1->count < bb2->count
- || bb1->frequency < bb2->frequency)
+ if (bb1->count < bb2->count)
return 1;
return 0;
}
diff --git a/gcc/sched-int.h b/gcc/sched-int.h
index 2af8f9fc32c..6832589e3d0 100644
--- a/gcc/sched-int.h
+++ b/gcc/sched-int.h
@@ -819,15 +819,8 @@ struct autopref_multipass_data_
/* Base part of memory address. */
rtx base;
- /* Memory offsets from the base. For single simple sets
- only min_offset is valid. For multi-set insns min_offset
- and max_offset record the minimum and maximum offsets from the same
- base among the sets inside the PARALLEL. */
- int min_offset;
- int max_offset;
-
- /* True if this is a load/store-multiple instruction. */
- bool multi_mem_insn_p;
+ /* Memory offsets from the base. */
+ int offset;
/* Entry status. */
enum autopref_multipass_data_status status;
diff --git a/gcc/sdbout.c b/gcc/sdbout.c
deleted file mode 100644
index acd25a3c765..00000000000
--- a/gcc/sdbout.c
+++ /dev/null
@@ -1,1661 +0,0 @@
-/* Output sdb-format symbol table information from GNU compiler.
- Copyright (C) 1988-2017 Free Software Foundation, Inc.
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify it under
-the terms of the GNU General Public License as published by the Free
-Software Foundation; either version 3, or (at your option) any later
-version.
-
-GCC is distributed in the hope that it will be useful, but WITHOUT ANY
-WARRANTY; without even the implied warranty of MERCHANTABILITY or
-FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
-for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3. If not see
-<http://www.gnu.org/licenses/>. */
-
-/* mike@tredysvr.Tredydev.Unisys.COM says:
-I modified the struct.c example and have a nm of a .o resulting from the
-AT&T C compiler. From the example below I would conclude the following:
-
-1. All .defs from structures are emitted as scanned. The example below
- clearly shows the symbol table entries for BoxRec2 are after the first
- function.
-
-2. All functions and their locals (including statics) are emitted as scanned.
-
-3. All nested unnamed union and structure .defs must be emitted before
- the structure in which they are nested. The AT&T assembler is a
- one pass beast as far as symbolics are concerned.
-
-4. All structure .defs are emitted before the typedefs that refer to them.
-
-5. All top level static and external variable definitions are moved to the
- end of file with all top level statics occurring first before externs.
-
-6. All undefined references are at the end of the file.
-*/
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "gsyms.h"
-#include "tm.h"
-#include "debug.h"
-#include "tree.h"
-#include "varasm.h"
-#include "stor-layout.h"
-
-static GTY(()) tree anonymous_types;
-
-/* Counter to generate unique "names" for nameless struct members. */
-
-static GTY(()) int unnamed_struct_number;
-
-/* Declarations whose debug info was deferred till end of compilation. */
-
-static GTY(()) vec<tree, va_gc> *deferred_global_decls;
-
-/* The C front end may call sdbout_symbol before sdbout_init runs.
- We save all such decls in this list and output them when we get
- to sdbout_init. */
-
-static GTY(()) tree preinit_symbols;
-static GTY(()) bool sdbout_initialized;
-
-#include "rtl.h"
-#include "regs.h"
-#include "function.h"
-#include "memmodel.h"
-#include "emit-rtl.h"
-#include "flags.h"
-#include "insn-config.h"
-#include "reload.h"
-#include "output.h"
-#include "diagnostic-core.h"
-#include "tm_p.h"
-#include "langhooks.h"
-#include "target.h"
-
-/* 1 if PARM is passed to this function in memory. */
-
-#define PARM_PASSED_IN_MEMORY(PARM) \
- (MEM_P (DECL_INCOMING_RTL (PARM)))
-
-/* A C expression for the integer offset value of an automatic variable
- (C_AUTO) having address X (an RTX). */
-#ifndef DEBUGGER_AUTO_OFFSET
-#define DEBUGGER_AUTO_OFFSET(X) \
- (GET_CODE (X) == PLUS ? INTVAL (XEXP (X, 1)) : 0)
-#endif
-
-/* A C expression for the integer offset value of an argument (C_ARG)
- having address X (an RTX). The nominal offset is OFFSET. */
-#ifndef DEBUGGER_ARG_OFFSET
-#define DEBUGGER_ARG_OFFSET(OFFSET, X) (OFFSET)
-#endif
-
-/* Line number of beginning of current function, minus one.
- Negative means not in a function or not using sdb. */
-
-int sdb_begin_function_line = -1;
-
-
-extern FILE *asm_out_file;
-
-extern tree current_function_decl;
-
-#include "sdbout.h"
-
-static void sdbout_init (const char *);
-static void sdbout_finish (const char *);
-static void sdbout_start_source_file (unsigned int, const char *);
-static void sdbout_end_source_file (unsigned int);
-static void sdbout_begin_block (unsigned int, unsigned int);
-static void sdbout_end_block (unsigned int, unsigned int);
-static void sdbout_source_line (unsigned int, unsigned int,
- const char *, int, bool);
-static void sdbout_end_epilogue (unsigned int, const char *);
-static void sdbout_early_global_decl (tree);
-static void sdbout_late_global_decl (tree);
-static void sdbout_begin_prologue (unsigned int, unsigned int,
- const char *);
-static void sdbout_end_prologue (unsigned int, const char *);
-static void sdbout_begin_function (tree);
-static void sdbout_end_function (unsigned int);
-static void sdbout_toplevel_data (tree);
-static void sdbout_label (rtx_code_label *);
-static char *gen_fake_label (void);
-static int plain_type (tree);
-static int template_name_p (tree);
-static void sdbout_record_type_name (tree);
-static int plain_type_1 (tree, int);
-static void sdbout_block (tree);
-static void sdbout_syms (tree);
-#ifdef SDB_ALLOW_FORWARD_REFERENCES
-static void sdbout_queue_anonymous_type (tree);
-static void sdbout_dequeue_anonymous_types (void);
-#endif
-static void sdbout_type (tree);
-static void sdbout_field_types (tree);
-static void sdbout_one_type (tree);
-static void sdbout_parms (tree);
-static void sdbout_reg_parms (tree);
-
-/* Random macros describing parts of SDB data. */
-
-/* Default value of delimiter is ";". */
-#ifndef SDB_DELIM
-#define SDB_DELIM ";"
-#endif
-
-/* Maximum number of dimensions the assembler will allow. */
-#ifndef SDB_MAX_DIM
-#define SDB_MAX_DIM 4
-#endif
-
-#ifndef PUT_SDB_SCL
-#define PUT_SDB_SCL(a) fprintf (asm_out_file, "\t.scl\t%d%s", (a), SDB_DELIM)
-#endif
-
-#ifndef PUT_SDB_INT_VAL
-#define PUT_SDB_INT_VAL(a) \
- do { \
- fprintf (asm_out_file, "\t.val\t" HOST_WIDE_INT_PRINT_DEC "%s", \
- (HOST_WIDE_INT) (a), SDB_DELIM); \
- } while (0)
-
-#endif
-
-#ifndef PUT_SDB_VAL
-#define PUT_SDB_VAL(a) \
-( fputs ("\t.val\t", asm_out_file), \
- output_addr_const (asm_out_file, (a)), \
- fprintf (asm_out_file, SDB_DELIM))
-#endif
-
-#ifndef PUT_SDB_DEF
-#define PUT_SDB_DEF(a) \
-do { fprintf (asm_out_file, "\t.def\t"); \
- assemble_name (asm_out_file, a); \
- fprintf (asm_out_file, SDB_DELIM); } while (0)
-#endif
-
-#ifndef PUT_SDB_PLAIN_DEF
-#define PUT_SDB_PLAIN_DEF(a) \
- fprintf (asm_out_file, "\t.def\t.%s%s", a, SDB_DELIM)
-#endif
-
-#ifndef PUT_SDB_ENDEF
-#define PUT_SDB_ENDEF fputs ("\t.endef\n", asm_out_file)
-#endif
-
-#ifndef PUT_SDB_TYPE
-#define PUT_SDB_TYPE(a) fprintf (asm_out_file, "\t.type\t0%o%s", a, SDB_DELIM)
-#endif
-
-#ifndef PUT_SDB_SIZE
-#define PUT_SDB_SIZE(a) \
- do { \
- fprintf (asm_out_file, "\t.size\t" HOST_WIDE_INT_PRINT_DEC "%s", \
- (HOST_WIDE_INT) (a), SDB_DELIM); \
- } while (0)
-#endif
-
-#ifndef PUT_SDB_START_DIM
-#define PUT_SDB_START_DIM fprintf (asm_out_file, "\t.dim\t")
-#endif
-
-#ifndef PUT_SDB_NEXT_DIM
-#define PUT_SDB_NEXT_DIM(a) fprintf (asm_out_file, "%d,", a)
-#endif
-
-#ifndef PUT_SDB_LAST_DIM
-#define PUT_SDB_LAST_DIM(a) fprintf (asm_out_file, "%d%s", a, SDB_DELIM)
-#endif
-
-#ifndef PUT_SDB_TAG
-#define PUT_SDB_TAG(a) \
-do { fprintf (asm_out_file, "\t.tag\t"); \
- assemble_name (asm_out_file, a); \
- fprintf (asm_out_file, SDB_DELIM); } while (0)
-#endif
-
-#ifndef PUT_SDB_BLOCK_START
-#define PUT_SDB_BLOCK_START(LINE) \
- fprintf (asm_out_file, \
- "\t.def\t.bb%s\t.val\t.%s\t.scl\t100%s\t.line\t%d%s\t.endef\n", \
- SDB_DELIM, SDB_DELIM, SDB_DELIM, (LINE), SDB_DELIM)
-#endif
-
-#ifndef PUT_SDB_BLOCK_END
-#define PUT_SDB_BLOCK_END(LINE) \
- fprintf (asm_out_file, \
- "\t.def\t.eb%s\t.val\t.%s\t.scl\t100%s\t.line\t%d%s\t.endef\n", \
- SDB_DELIM, SDB_DELIM, SDB_DELIM, (LINE), SDB_DELIM)
-#endif
-
-#ifndef PUT_SDB_FUNCTION_START
-#define PUT_SDB_FUNCTION_START(LINE) \
- fprintf (asm_out_file, \
- "\t.def\t.bf%s\t.val\t.%s\t.scl\t101%s\t.line\t%d%s\t.endef\n", \
- SDB_DELIM, SDB_DELIM, SDB_DELIM, (LINE), SDB_DELIM)
-#endif
-
-#ifndef PUT_SDB_FUNCTION_END
-#define PUT_SDB_FUNCTION_END(LINE) \
- fprintf (asm_out_file, \
- "\t.def\t.ef%s\t.val\t.%s\t.scl\t101%s\t.line\t%d%s\t.endef\n", \
- SDB_DELIM, SDB_DELIM, SDB_DELIM, (LINE), SDB_DELIM)
-#endif
-
-/* Return the sdb tag identifier string for TYPE
- if TYPE has already been defined; otherwise return a null pointer. */
-
-#define KNOWN_TYPE_TAG(type) TYPE_SYMTAB_POINTER (type)
-
-/* Set the sdb tag identifier string for TYPE to NAME. */
-
-#define SET_KNOWN_TYPE_TAG(TYPE, NAME) \
- TYPE_SYMTAB_POINTER (TYPE) = (const char *)(NAME)
-
-/* Return the name (a string) of the struct, union or enum tag
- described by the TREE_LIST node LINK. This is 0 for an anonymous one. */
-
-#define TAG_NAME(link) \
- (((link) && TREE_PURPOSE ((link)) \
- && IDENTIFIER_POINTER (TREE_PURPOSE ((link)))) \
- ? IDENTIFIER_POINTER (TREE_PURPOSE ((link))) : (char *) 0)
-
-/* Ensure we don't output a negative line number. */
-#define MAKE_LINE_SAFE(line) \
- if ((int) line <= sdb_begin_function_line) \
- line = sdb_begin_function_line + 1
-
-/* The debug hooks structure. */
-const struct gcc_debug_hooks sdb_debug_hooks =
-{
- sdbout_init, /* init */
- sdbout_finish, /* finish */
- debug_nothing_charstar, /* early_finish */
- debug_nothing_void, /* assembly_start */
- debug_nothing_int_charstar, /* define */
- debug_nothing_int_charstar, /* undef */
- sdbout_start_source_file, /* start_source_file */
- sdbout_end_source_file, /* end_source_file */
- sdbout_begin_block, /* begin_block */
- sdbout_end_block, /* end_block */
- debug_true_const_tree, /* ignore_block */
- sdbout_source_line, /* source_line */
- sdbout_begin_prologue, /* begin_prologue */
- debug_nothing_int_charstar, /* end_prologue */
- debug_nothing_int_charstar, /* begin_epilogue */
- sdbout_end_epilogue, /* end_epilogue */
- sdbout_begin_function, /* begin_function */
- sdbout_end_function, /* end_function */
- debug_nothing_tree, /* register_main_translation_unit */
- debug_nothing_tree, /* function_decl */
- sdbout_early_global_decl, /* early_global_decl */
- sdbout_late_global_decl, /* late_global_decl */
- sdbout_symbol, /* type_decl */
- debug_nothing_tree_tree_tree_bool_bool,/* imported_module_or_decl */
- debug_false_tree_charstarstar_uhwistar,/* die_ref_for_decl */
- debug_nothing_tree_charstar_uhwi, /* register_external_die */
- debug_nothing_tree, /* deferred_inline_function */
- debug_nothing_tree, /* outlining_inline_function */
- sdbout_label, /* label */
- debug_nothing_int, /* handle_pch */
- debug_nothing_rtx_insn, /* var_location */
- debug_nothing_tree, /* size_function */
- debug_nothing_void, /* switch_text_section */
- debug_nothing_tree_tree, /* set_name */
- 0, /* start_end_main_source_file */
- TYPE_SYMTAB_IS_POINTER /* tree_type_symtab_field */
-};
-
-/* Return a unique string to name an anonymous type. */
-
-static char *
-gen_fake_label (void)
-{
- char label[10];
- char *labelstr;
- sprintf (label, ".%dfake", unnamed_struct_number);
- unnamed_struct_number++;
- labelstr = xstrdup (label);
- return labelstr;
-}
-
-/* Return the number which describes TYPE for SDB.
- For pointers, etc., this function is recursive.
- Each record, union or enumeral type must already have had a
- tag number output. */
-
-/* The number is given by d6d5d4d3d2d1bbbb
- where bbbb is 4 bit basic type, and di indicate one of notype,ptr,fn,array.
- Thus, char *foo () has bbbb=T_CHAR
- d1=D_FCN
- d2=D_PTR
- N_BTMASK= 017 1111 basic type field.
- N_TSHIFT= 2 derived type shift
- N_BTSHFT= 4 Basic type shift */
-
-/* Produce the number that describes a pointer, function or array type.
- PREV is the number describing the target, value or element type.
- DT_type describes how to transform that type. */
-#define PUSH_DERIVED_LEVEL(DT_type,PREV) \
- ((((PREV) & ~(int) N_BTMASK) << (int) N_TSHIFT) \
- | ((int) DT_type << (int) N_BTSHFT) \
- | ((PREV) & (int) N_BTMASK))
-
-/* Number of elements used in sdb_dims. */
-static int sdb_n_dims = 0;
-
-/* Table of array dimensions of current type. */
-static int sdb_dims[SDB_MAX_DIM];
-
-/* Size of outermost array currently being processed. */
-static int sdb_type_size = -1;
-
-static int
-plain_type (tree type)
-{
- int val = plain_type_1 (type, 0);
-
- /* If we have already saved up some array dimensions, print them now. */
- if (sdb_n_dims > 0)
- {
- int i;
- PUT_SDB_START_DIM;
- for (i = sdb_n_dims - 1; i > 0; i--)
- PUT_SDB_NEXT_DIM (sdb_dims[i]);
- PUT_SDB_LAST_DIM (sdb_dims[0]);
- sdb_n_dims = 0;
-
- sdb_type_size = int_size_in_bytes (type);
- /* Don't kill sdb if type is not laid out or has variable size. */
- if (sdb_type_size < 0)
- sdb_type_size = 0;
- }
- /* If we have computed the size of an array containing this type,
- print it now. */
- if (sdb_type_size >= 0)
- {
- PUT_SDB_SIZE (sdb_type_size);
- sdb_type_size = -1;
- }
- return val;
-}
-
-static int
-template_name_p (tree name)
-{
- const char *ptr = IDENTIFIER_POINTER (name);
- while (*ptr && *ptr != '<')
- ptr++;
-
- return *ptr != '\0';
-}
-
-static void
-sdbout_record_type_name (tree type)
-{
- const char *name = 0;
- int no_name;
-
- if (KNOWN_TYPE_TAG (type))
- return;
-
- if (TYPE_NAME (type) != 0)
- {
- tree t = 0;
-
- /* Find the IDENTIFIER_NODE for the type name. */
- if (TREE_CODE (TYPE_NAME (type)) == IDENTIFIER_NODE)
- t = TYPE_NAME (type);
- else if (TREE_CODE (TYPE_NAME (type)) == TYPE_DECL)
- {
- t = DECL_NAME (TYPE_NAME (type));
- /* The DECL_NAME for templates includes "<>", which breaks
- most assemblers. Use its assembler name instead, which
- has been mangled into being safe. */
- if (t && template_name_p (t))
- t = DECL_ASSEMBLER_NAME (TYPE_NAME (type));
- }
-
- /* Now get the name as a string, or invent one. */
- if (t != NULL_TREE)
- name = IDENTIFIER_POINTER (t);
- }
-
- no_name = (name == 0 || *name == 0);
- if (no_name)
- name = gen_fake_label ();
-
- SET_KNOWN_TYPE_TAG (type, name);
-#ifdef SDB_ALLOW_FORWARD_REFERENCES
- if (no_name)
- sdbout_queue_anonymous_type (type);
-#endif
-}
-
-/* Return the .type value for type TYPE.
-
- LEVEL indicates how many levels deep we have recursed into the type.
- The SDB debug format can only represent 6 derived levels of types.
- After that, we must output inaccurate debug info. We deliberately
- stop before the 7th level, so that ADA recursive types will not give an
- infinite loop. */
-
-static int
-plain_type_1 (tree type, int level)
-{
- if (type == 0)
- type = void_type_node;
- else if (type == error_mark_node)
- type = integer_type_node;
- else
- type = TYPE_MAIN_VARIANT (type);
-
- switch (TREE_CODE (type))
- {
- case VOID_TYPE:
- case NULLPTR_TYPE:
- return T_VOID;
- case BOOLEAN_TYPE:
- case INTEGER_TYPE:
- {
- int size = int_size_in_bytes (type) * BITS_PER_UNIT;
-
- /* Carefully distinguish all the standard types of C,
- without messing up if the language is not C.
- Note that we check only for the names that contain spaces;
- other names might occur by coincidence in other languages. */
- if (TYPE_NAME (type) != 0
- && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL
- && DECL_NAME (TYPE_NAME (type)) != 0
- && TREE_CODE (DECL_NAME (TYPE_NAME (type))) == IDENTIFIER_NODE)
- {
- const char *const name
- = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type)));
-
- if (!strcmp (name, "char"))
- return T_CHAR;
- if (!strcmp (name, "unsigned char"))
- return T_UCHAR;
- if (!strcmp (name, "signed char"))
- return T_CHAR;
- if (!strcmp (name, "int"))
- return T_INT;
- if (!strcmp (name, "unsigned int"))
- return T_UINT;
- if (!strcmp (name, "short int"))
- return T_SHORT;
- if (!strcmp (name, "short unsigned int"))
- return T_USHORT;
- if (!strcmp (name, "long int"))
- return T_LONG;
- if (!strcmp (name, "long unsigned int"))
- return T_ULONG;
- }
-
- if (size == INT_TYPE_SIZE)
- return (TYPE_UNSIGNED (type) ? T_UINT : T_INT);
- if (size == CHAR_TYPE_SIZE)
- return (TYPE_UNSIGNED (type) ? T_UCHAR : T_CHAR);
- if (size == SHORT_TYPE_SIZE)
- return (TYPE_UNSIGNED (type) ? T_USHORT : T_SHORT);
- if (size == LONG_TYPE_SIZE)
- return (TYPE_UNSIGNED (type) ? T_ULONG : T_LONG);
- if (size == LONG_LONG_TYPE_SIZE) /* better than nothing */
- return (TYPE_UNSIGNED (type) ? T_ULONG : T_LONG);
- return 0;
- }
-
- case REAL_TYPE:
- {
- int precision = TYPE_PRECISION (type);
- if (precision == FLOAT_TYPE_SIZE)
- return T_FLOAT;
- if (precision == DOUBLE_TYPE_SIZE)
- return T_DOUBLE;
- if (precision == LONG_DOUBLE_TYPE_SIZE)
- return T_DOUBLE; /* better than nothing */
-
- return 0;
- }
-
- case ARRAY_TYPE:
- {
- int m;
- if (level >= 6)
- return T_VOID;
- else
- m = plain_type_1 (TREE_TYPE (type), level+1);
- if (sdb_n_dims < SDB_MAX_DIM)
- sdb_dims[sdb_n_dims++]
- = (TYPE_DOMAIN (type)
- && TYPE_MIN_VALUE (TYPE_DOMAIN (type)) != 0
- && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != 0
- && tree_fits_shwi_p (TYPE_MAX_VALUE (TYPE_DOMAIN (type)))
- && tree_fits_shwi_p (TYPE_MIN_VALUE (TYPE_DOMAIN (type)))
- ? (tree_to_shwi (TYPE_MAX_VALUE (TYPE_DOMAIN (type)))
- - tree_to_shwi (TYPE_MIN_VALUE (TYPE_DOMAIN (type))) + 1)
- : 0);
-
- return PUSH_DERIVED_LEVEL (DT_ARY, m);
- }
-
- case RECORD_TYPE:
- case UNION_TYPE:
- case QUAL_UNION_TYPE:
- case ENUMERAL_TYPE:
- {
- const char *tag;
-#ifdef SDB_ALLOW_FORWARD_REFERENCES
- sdbout_record_type_name (type);
-#endif
-#ifndef SDB_ALLOW_UNKNOWN_REFERENCES
- if ((TREE_ASM_WRITTEN (type) && KNOWN_TYPE_TAG (type) != 0)
-#ifdef SDB_ALLOW_FORWARD_REFERENCES
- || TYPE_MODE (type) != VOIDmode
-#endif
- )
-#endif
- {
- /* Output the referenced structure tag name
- only if the .def has already been finished.
- At least on 386, the Unix assembler
- cannot handle forward references to tags. */
- /* But the 88100, it requires them, sigh... */
- /* And the MIPS requires unknown refs as well... */
- tag = KNOWN_TYPE_TAG (type);
- PUT_SDB_TAG (tag);
- /* These 3 lines used to follow the close brace.
- However, a size of 0 without a tag implies a tag of 0,
- so if we don't know a tag, we can't mention the size. */
- sdb_type_size = int_size_in_bytes (type);
- if (sdb_type_size < 0)
- sdb_type_size = 0;
- }
- return ((TREE_CODE (type) == RECORD_TYPE) ? T_STRUCT
- : (TREE_CODE (type) == UNION_TYPE) ? T_UNION
- : (TREE_CODE (type) == QUAL_UNION_TYPE) ? T_UNION
- : T_ENUM);
- }
- case POINTER_TYPE:
- case REFERENCE_TYPE:
- {
- int m;
- if (level >= 6)
- return T_VOID;
- else
- m = plain_type_1 (TREE_TYPE (type), level+1);
- return PUSH_DERIVED_LEVEL (DT_PTR, m);
- }
- case FUNCTION_TYPE:
- case METHOD_TYPE:
- {
- int m;
- if (level >= 6)
- return T_VOID;
- else
- m = plain_type_1 (TREE_TYPE (type), level+1);
- return PUSH_DERIVED_LEVEL (DT_FCN, m);
- }
- default:
- return 0;
- }
-}
-
-/* Output the symbols defined in block number DO_BLOCK.
-
- This function works by walking the tree structure of blocks,
- counting blocks until it finds the desired block. */
-
-static int do_block = 0;
-
-static void
-sdbout_block (tree block)
-{
- while (block)
- {
- /* Ignore blocks never expanded or otherwise marked as real. */
- if (TREE_USED (block))
- {
- /* When we reach the specified block, output its symbols. */
- if (BLOCK_NUMBER (block) == do_block)
- sdbout_syms (BLOCK_VARS (block));
-
- /* If we are past the specified block, stop the scan. */
- if (BLOCK_NUMBER (block) > do_block)
- return;
-
- /* Scan the blocks within this block. */
- sdbout_block (BLOCK_SUBBLOCKS (block));
- }
-
- block = BLOCK_CHAIN (block);
- }
-}
-
-/* Call sdbout_symbol on each decl in the chain SYMS. */
-
-static void
-sdbout_syms (tree syms)
-{
- while (syms)
- {
- if (TREE_CODE (syms) != LABEL_DECL)
- sdbout_symbol (syms, 1);
- syms = TREE_CHAIN (syms);
- }
-}
-
-/* Output SDB information for a symbol described by DECL.
- LOCAL is nonzero if the symbol is not file-scope. */
-
-void
-sdbout_symbol (tree decl, int local)
-{
- tree type = TREE_TYPE (decl);
- tree context = NULL_TREE;
- rtx value;
- int regno = -1;
- const char *name;
-
- /* If we are called before sdbout_init is run, just save the symbol
- for later. */
- if (!sdbout_initialized)
- {
- preinit_symbols = tree_cons (0, decl, preinit_symbols);
- return;
- }
-
- sdbout_one_type (type);
-
- switch (TREE_CODE (decl))
- {
- case CONST_DECL:
- /* Enum values are defined by defining the enum type. */
- return;
-
- case FUNCTION_DECL:
- /* Don't mention a nested function under its parent. */
- context = decl_function_context (decl);
- if (context == current_function_decl)
- return;
- /* Check DECL_INITIAL to distinguish declarations from definitions.
- Don't output debug info here for declarations; they will have
- a DECL_INITIAL value of 0. */
- if (! DECL_INITIAL (decl))
- return;
- if (!MEM_P (DECL_RTL (decl))
- || GET_CODE (XEXP (DECL_RTL (decl), 0)) != SYMBOL_REF)
- return;
- PUT_SDB_DEF (IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)));
- PUT_SDB_VAL (XEXP (DECL_RTL (decl), 0));
- PUT_SDB_SCL (TREE_PUBLIC (decl) ? C_EXT : C_STAT);
- break;
-
- case TYPE_DECL:
- /* Done with tagged types. */
- if (DECL_NAME (decl) == 0)
- return;
- if (DECL_IGNORED_P (decl))
- return;
- /* Don't output intrinsic types. GAS chokes on SDB .def
- statements that contain identifiers with embedded spaces
- (eg "unsigned long"). */
- if (DECL_IS_BUILTIN (decl))
- return;
-
- /* Output typedef name. */
- if (template_name_p (DECL_NAME (decl)))
- PUT_SDB_DEF (IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)));
- else
- PUT_SDB_DEF (IDENTIFIER_POINTER (DECL_NAME (decl)));
- PUT_SDB_SCL (C_TPDEF);
- break;
-
- case PARM_DECL:
- /* Parm decls go in their own separate chains
- and are output by sdbout_reg_parms and sdbout_parms. */
- gcc_unreachable ();
-
- case VAR_DECL:
- /* Don't mention a variable that is external.
- Let the file that defines it describe it. */
- if (DECL_EXTERNAL (decl))
- return;
-
- /* Ignore __FUNCTION__, etc. */
- if (DECL_IGNORED_P (decl))
- return;
-
- /* If there was an error in the declaration, don't dump core
- if there is no RTL associated with the variable doesn't
- exist. */
- if (!DECL_RTL_SET_P (decl))
- return;
-
- value = DECL_RTL (decl);
-
- if (!is_global_var (decl))
- value = eliminate_regs (value, VOIDmode, NULL_RTX);
-
- SET_DECL_RTL (decl, value);
-#ifdef LEAF_REG_REMAP
- if (crtl->uses_only_leaf_regs)
- leaf_renumber_regs_insn (value);
-#endif
-
- /* Don't mention a variable at all
- if it was completely optimized into nothingness.
-
- If DECL was from an inline function, then its rtl
- is not identically the rtl that was used in this
- particular compilation. */
- if (REG_P (value))
- {
- regno = REGNO (value);
- if (regno >= FIRST_PSEUDO_REGISTER)
- return;
- }
- else if (GET_CODE (value) == SUBREG)
- {
- while (GET_CODE (value) == SUBREG)
- value = SUBREG_REG (value);
- if (REG_P (value))
- {
- if (REGNO (value) >= FIRST_PSEUDO_REGISTER)
- return;
- }
- regno = REGNO (alter_subreg (&value, true));
- SET_DECL_RTL (decl, value);
- }
- /* Don't output anything if an auto variable
- gets RTL that is static.
- GAS version 2.2 can't handle such output. */
- else if (MEM_P (value) && CONSTANT_P (XEXP (value, 0))
- && ! TREE_STATIC (decl))
- return;
-
- /* Emit any structure, union, or enum type that has not been output.
- This occurs for tag-less structs (et al) used to declare variables
- within functions. */
- if (TREE_CODE (type) == ENUMERAL_TYPE
- || TREE_CODE (type) == RECORD_TYPE
- || TREE_CODE (type) == UNION_TYPE
- || TREE_CODE (type) == QUAL_UNION_TYPE)
- {
- if (COMPLETE_TYPE_P (type) /* not a forward reference */
- && KNOWN_TYPE_TAG (type) == 0) /* not yet declared */
- sdbout_one_type (type);
- }
-
- /* Defer SDB information for top-level initialized variables! */
- if (! local
- && MEM_P (value)
- && DECL_INITIAL (decl))
- return;
-
- /* C++ in 2.3 makes nameless symbols. That will be fixed later.
- For now, avoid crashing. */
- if (DECL_NAME (decl) == NULL_TREE)
- return;
-
- /* Record the name for, starting a symtab entry. */
- if (local)
- name = IDENTIFIER_POINTER (DECL_NAME (decl));
- else
- name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
-
- if (MEM_P (value)
- && GET_CODE (XEXP (value, 0)) == SYMBOL_REF)
- {
- PUT_SDB_DEF (name);
- if (TREE_PUBLIC (decl))
- {
- PUT_SDB_VAL (XEXP (value, 0));
- PUT_SDB_SCL (C_EXT);
- }
- else
- {
- PUT_SDB_VAL (XEXP (value, 0));
- PUT_SDB_SCL (C_STAT);
- }
- }
- else if (regno >= 0)
- {
- PUT_SDB_DEF (name);
- PUT_SDB_INT_VAL (DBX_REGISTER_NUMBER (regno));
- PUT_SDB_SCL (C_REG);
- }
- else if (MEM_P (value)
- && (MEM_P (XEXP (value, 0))
- || (REG_P (XEXP (value, 0))
- && REGNO (XEXP (value, 0)) != HARD_FRAME_POINTER_REGNUM
- && REGNO (XEXP (value, 0)) != STACK_POINTER_REGNUM)))
- /* If the value is indirect by memory or by a register
- that isn't the frame pointer
- then it means the object is variable-sized and address through
- that register or stack slot. COFF has no way to represent this
- so all we can do is output the variable as a pointer. */
- {
- PUT_SDB_DEF (name);
- if (REG_P (XEXP (value, 0)))
- {
- PUT_SDB_INT_VAL (DBX_REGISTER_NUMBER (REGNO (XEXP (value, 0))));
- PUT_SDB_SCL (C_REG);
- }
- else
- {
- /* DECL_RTL looks like (MEM (MEM (PLUS (REG...)
- (CONST_INT...)))).
- We want the value of that CONST_INT. */
- /* Encore compiler hates a newline in a macro arg, it seems. */
- PUT_SDB_INT_VAL (DEBUGGER_AUTO_OFFSET
- (XEXP (XEXP (value, 0), 0)));
- PUT_SDB_SCL (C_AUTO);
- }
-
- /* Effectively do build_pointer_type, but don't cache this type,
- since it might be temporary whereas the type it points to
- might have been saved for inlining. */
- /* Don't use REFERENCE_TYPE because dbx can't handle that. */
- type = make_node (POINTER_TYPE);
- TREE_TYPE (type) = TREE_TYPE (decl);
- }
- else if (MEM_P (value)
- && ((GET_CODE (XEXP (value, 0)) == PLUS
- && REG_P (XEXP (XEXP (value, 0), 0))
- && CONST_INT_P (XEXP (XEXP (value, 0), 1)))
- /* This is for variables which are at offset zero from
- the frame pointer. This happens on the Alpha.
- Non-frame pointer registers are excluded above. */
- || (REG_P (XEXP (value, 0)))))
- {
- /* DECL_RTL looks like (MEM (PLUS (REG...) (CONST_INT...)))
- or (MEM (REG...)). We want the value of that CONST_INT
- or zero. */
- PUT_SDB_DEF (name);
- PUT_SDB_INT_VAL (DEBUGGER_AUTO_OFFSET (XEXP (value, 0)));
- PUT_SDB_SCL (C_AUTO);
- }
- else
- {
- /* It is something we don't know how to represent for SDB. */
- return;
- }
- break;
-
- default:
- break;
- }
- PUT_SDB_TYPE (plain_type (type));
- PUT_SDB_ENDEF;
-}
-
-/* Output SDB information for a top-level initialized variable
- that has been delayed. */
-
-static void
-sdbout_toplevel_data (tree decl)
-{
- tree type = TREE_TYPE (decl);
-
- if (DECL_IGNORED_P (decl))
- return;
-
- gcc_assert (VAR_P (decl));
- gcc_assert (MEM_P (DECL_RTL (decl)));
- gcc_assert (DECL_INITIAL (decl));
-
- PUT_SDB_DEF (IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)));
- PUT_SDB_VAL (XEXP (DECL_RTL (decl), 0));
- if (TREE_PUBLIC (decl))
- {
- PUT_SDB_SCL (C_EXT);
- }
- else
- {
- PUT_SDB_SCL (C_STAT);
- }
- PUT_SDB_TYPE (plain_type (type));
- PUT_SDB_ENDEF;
-}
-
-#ifdef SDB_ALLOW_FORWARD_REFERENCES
-
-/* Machinery to record and output anonymous types. */
-
-static void
-sdbout_queue_anonymous_type (tree type)
-{
- anonymous_types = tree_cons (NULL_TREE, type, anonymous_types);
-}
-
-static void
-sdbout_dequeue_anonymous_types (void)
-{
- tree types, link;
-
- while (anonymous_types)
- {
- types = nreverse (anonymous_types);
- anonymous_types = NULL_TREE;
-
- for (link = types; link; link = TREE_CHAIN (link))
- {
- tree type = TREE_VALUE (link);
-
- if (type && ! TREE_ASM_WRITTEN (type))
- sdbout_one_type (type);
- }
- }
-}
-
-#endif
-
-/* Given a chain of ..._TYPE nodes, all of which have names,
- output definitions of those names, as typedefs. */
-
-void
-sdbout_types (tree types)
-{
- tree link;
-
- for (link = types; link; link = TREE_CHAIN (link))
- sdbout_one_type (link);
-
-#ifdef SDB_ALLOW_FORWARD_REFERENCES
- sdbout_dequeue_anonymous_types ();
-#endif
-}
-
-static void
-sdbout_type (tree type)
-{
- if (type == error_mark_node)
- type = integer_type_node;
- PUT_SDB_TYPE (plain_type (type));
-}
-
-/* Output types of the fields of type TYPE, if they are structs.
-
- Formerly did not chase through pointer types, since that could be circular.
- They must come before TYPE, since forward refs are not allowed.
- Now james@bigtex.cactus.org says to try them. */
-
-static void
-sdbout_field_types (tree type)
-{
- tree tail;
-
- for (tail = TYPE_FIELDS (type); tail; tail = TREE_CHAIN (tail))
- /* This condition should match the one for emitting the actual
- members below. */
- if (TREE_CODE (tail) == FIELD_DECL
- && DECL_NAME (tail)
- && DECL_SIZE (tail)
- && tree_fits_uhwi_p (DECL_SIZE (tail))
- && tree_fits_shwi_p (bit_position (tail)))
- {
- if (POINTER_TYPE_P (TREE_TYPE (tail)))
- sdbout_one_type (TREE_TYPE (TREE_TYPE (tail)));
- else
- sdbout_one_type (TREE_TYPE (tail));
- }
-}
-
-/* Use this to put out the top level defined record and union types
- for later reference. If this is a struct with a name, then put that
- name out. Other unnamed structs will have .xxfake labels generated so
- that they may be referred to later.
- The label will be stored in the KNOWN_TYPE_TAG slot of a type.
- It may NOT be called recursively. */
-
-static void
-sdbout_one_type (tree type)
-{
- if (current_function_decl != NULL_TREE
- && DECL_SECTION_NAME (current_function_decl) != NULL)
- ; /* Don't change section amid function. */
- else
- switch_to_section (current_function_section ());
-
- switch (TREE_CODE (type))
- {
- case RECORD_TYPE:
- case UNION_TYPE:
- case QUAL_UNION_TYPE:
- case ENUMERAL_TYPE:
- type = TYPE_MAIN_VARIANT (type);
- /* Don't output a type twice. */
- if (TREE_ASM_WRITTEN (type))
- /* James said test TREE_ASM_BEING_WRITTEN here. */
- return;
-
- /* Output nothing if type is not yet defined. */
- if (!COMPLETE_TYPE_P (type))
- return;
-
- TREE_ASM_WRITTEN (type) = 1;
-
- /* This is reputed to cause trouble with the following case,
- but perhaps checking TYPE_SIZE above will fix it. */
-
- /* Here is a testcase:
-
- struct foo {
- struct badstr *bbb;
- } forwardref;
-
- typedef struct intermediate {
- int aaaa;
- } intermediate_ref;
-
- typedef struct badstr {
- int ccccc;
- } badtype; */
-
- /* This change, which ought to make better output,
- used to make the COFF assembler unhappy.
- Changes involving KNOWN_TYPE_TAG may fix the problem. */
- /* Before really doing anything, output types we want to refer to. */
- /* Note that in version 1 the following two lines
- are not used if forward references are in use. */
- if (TREE_CODE (type) != ENUMERAL_TYPE)
- sdbout_field_types (type);
-
- /* Output a structure type. */
- {
- int size = int_size_in_bytes (type);
- int member_scl = 0;
- tree tem;
-
- /* Record the type tag, but not in its permanent place just yet. */
- sdbout_record_type_name (type);
-
- PUT_SDB_DEF (KNOWN_TYPE_TAG (type));
-
- switch (TREE_CODE (type))
- {
- case UNION_TYPE:
- case QUAL_UNION_TYPE:
- PUT_SDB_SCL (C_UNTAG);
- PUT_SDB_TYPE (T_UNION);
- member_scl = C_MOU;
- break;
-
- case RECORD_TYPE:
- PUT_SDB_SCL (C_STRTAG);
- PUT_SDB_TYPE (T_STRUCT);
- member_scl = C_MOS;
- break;
-
- case ENUMERAL_TYPE:
- PUT_SDB_SCL (C_ENTAG);
- PUT_SDB_TYPE (T_ENUM);
- member_scl = C_MOE;
- break;
-
- default:
- break;
- }
-
- PUT_SDB_SIZE (size);
- PUT_SDB_ENDEF;
-
- /* Print out the base class information with fields
- named after the types they hold. */
- /* This is only relevant to aggregate types. TYPE_BINFO is used
- for other purposes in an ENUMERAL_TYPE, so we must exclude that
- case. */
- if (TREE_CODE (type) != ENUMERAL_TYPE && TYPE_BINFO (type))
- {
- int i;
- tree binfo, child;
-
- for (binfo = TYPE_BINFO (type), i = 0;
- BINFO_BASE_ITERATE (binfo, i, child); i++)
- {
- tree child_type = BINFO_TYPE (child);
- tree child_type_name;
-
- if (TYPE_NAME (child_type) == 0)
- continue;
- if (TREE_CODE (TYPE_NAME (child_type)) == IDENTIFIER_NODE)
- child_type_name = TYPE_NAME (child_type);
- else if (TREE_CODE (TYPE_NAME (child_type)) == TYPE_DECL)
- {
- child_type_name = DECL_NAME (TYPE_NAME (child_type));
- if (child_type_name && template_name_p (child_type_name))
- child_type_name
- = DECL_ASSEMBLER_NAME (TYPE_NAME (child_type));
- }
- else
- continue;
-
- PUT_SDB_DEF (IDENTIFIER_POINTER (child_type_name));
- PUT_SDB_INT_VAL (tree_to_shwi (BINFO_OFFSET (child)));
- PUT_SDB_SCL (member_scl);
- sdbout_type (BINFO_TYPE (child));
- PUT_SDB_ENDEF;
- }
- }
-
- /* Output the individual fields. */
-
- if (TREE_CODE (type) == ENUMERAL_TYPE)
- {
- for (tem = TYPE_VALUES (type); tem; tem = TREE_CHAIN (tem))
- {
- tree value = TREE_VALUE (tem);
-
- if (TREE_CODE (value) == CONST_DECL)
- value = DECL_INITIAL (value);
-
- if (tree_fits_shwi_p (value))
- {
- PUT_SDB_DEF (IDENTIFIER_POINTER (TREE_PURPOSE (tem)));
- PUT_SDB_INT_VAL (tree_to_shwi (value));
- PUT_SDB_SCL (C_MOE);
- PUT_SDB_TYPE (T_MOE);
- PUT_SDB_ENDEF;
- }
- }
- }
- else /* record or union type */
- for (tem = TYPE_FIELDS (type); tem; tem = TREE_CHAIN (tem))
- /* Output the name, type, position (in bits), size (in bits)
- of each field. */
-
- /* Omit here the nameless fields that are used to skip bits.
- Also omit fields with variable size or position.
- Also omit non FIELD_DECL nodes that GNU C++ may put here. */
- if (TREE_CODE (tem) == FIELD_DECL
- && DECL_NAME (tem)
- && DECL_SIZE (tem)
- && tree_fits_uhwi_p (DECL_SIZE (tem))
- && tree_fits_shwi_p (bit_position (tem)))
- {
- const char *name;
-
- name = IDENTIFIER_POINTER (DECL_NAME (tem));
- PUT_SDB_DEF (name);
- if (DECL_BIT_FIELD_TYPE (tem))
- {
- PUT_SDB_INT_VAL (int_bit_position (tem));
- PUT_SDB_SCL (C_FIELD);
- sdbout_type (DECL_BIT_FIELD_TYPE (tem));
- PUT_SDB_SIZE (tree_to_uhwi (DECL_SIZE (tem)));
- }
- else
- {
- PUT_SDB_INT_VAL (int_bit_position (tem) / BITS_PER_UNIT);
- PUT_SDB_SCL (member_scl);
- sdbout_type (TREE_TYPE (tem));
- }
- PUT_SDB_ENDEF;
- }
- /* Output end of a structure, union, or enumeral definition. */
-
- PUT_SDB_PLAIN_DEF ("eos");
- PUT_SDB_INT_VAL (size);
- PUT_SDB_SCL (C_EOS);
- PUT_SDB_TAG (KNOWN_TYPE_TAG (type));
- PUT_SDB_SIZE (size);
- PUT_SDB_ENDEF;
- break;
- }
-
- default:
- break;
- }
-}
-
-/* The following two functions output definitions of function parameters.
- Each parameter gets a definition locating it in the parameter list.
- Each parameter that is a register variable gets a second definition
- locating it in the register.
-
- Printing of argument lists in gdb uses the definitions that
- locate them in the parameter list. But references to the variable in
- expressions preferentially use the definition as a register. */
-
-/* Output definitions, referring to storage in the parmlist,
- of all the parms in PARMS, which is a chain of PARM_DECL nodes. */
-
-static void
-sdbout_parms (tree parms)
-{
- for (; parms; parms = TREE_CHAIN (parms))
- if (DECL_NAME (parms)
- && TREE_TYPE (parms) != error_mark_node
- && DECL_RTL_SET_P (parms)
- && DECL_INCOMING_RTL (parms))
- {
- int current_sym_value = 0;
- const char *name = IDENTIFIER_POINTER (DECL_NAME (parms));
-
- if (name == 0 || *name == 0)
- name = gen_fake_label ();
-
- /* Perform any necessary register eliminations on the parameter's rtl,
- so that the debugging output will be accurate. */
- DECL_INCOMING_RTL (parms)
- = eliminate_regs (DECL_INCOMING_RTL (parms), VOIDmode, NULL_RTX);
- SET_DECL_RTL (parms,
- eliminate_regs (DECL_RTL (parms), VOIDmode, NULL_RTX));
-
- if (PARM_PASSED_IN_MEMORY (parms))
- {
- rtx addr = XEXP (DECL_INCOMING_RTL (parms), 0);
- tree type;
-
- /* ??? Here we assume that the parm address is indexed
- off the frame pointer or arg pointer.
- If that is not true, we produce meaningless results,
- but do not crash. */
- if (GET_CODE (addr) == PLUS
- && CONST_INT_P (XEXP (addr, 1)))
- current_sym_value = INTVAL (XEXP (addr, 1));
- else
- current_sym_value = 0;
-
- if (REG_P (DECL_RTL (parms))
- && REGNO (DECL_RTL (parms)) < FIRST_PSEUDO_REGISTER)
- type = DECL_ARG_TYPE (parms);
- else
- {
- int original_sym_value = current_sym_value;
-
- /* This is the case where the parm is passed as an int or
- double and it is converted to a char, short or float
- and stored back in the parmlist. In this case, describe
- the parm with the variable's declared type, and adjust
- the address if the least significant bytes (which we are
- using) are not the first ones. */
- scalar_mode from_mode, to_mode;
- if (BYTES_BIG_ENDIAN
- && TREE_TYPE (parms) != DECL_ARG_TYPE (parms)
- && is_a <scalar_mode> (TYPE_MODE (DECL_ARG_TYPE (parms)),
- &from_mode)
- && is_a <scalar_mode> (GET_MODE (DECL_RTL (parms)),
- &to_mode))
- current_sym_value += (GET_MODE_SIZE (from_mode)
- - GET_MODE_SIZE (to_mode));
-
- if (MEM_P (DECL_RTL (parms))
- && GET_CODE (XEXP (DECL_RTL (parms), 0)) == PLUS
- && (GET_CODE (XEXP (XEXP (DECL_RTL (parms), 0), 1))
- == CONST_INT)
- && (INTVAL (XEXP (XEXP (DECL_RTL (parms), 0), 1))
- == current_sym_value))
- type = TREE_TYPE (parms);
- else
- {
- current_sym_value = original_sym_value;
- type = DECL_ARG_TYPE (parms);
- }
- }
-
- PUT_SDB_DEF (name);
- PUT_SDB_INT_VAL (DEBUGGER_ARG_OFFSET (current_sym_value, addr));
- PUT_SDB_SCL (C_ARG);
- PUT_SDB_TYPE (plain_type (type));
- PUT_SDB_ENDEF;
- }
- else if (REG_P (DECL_RTL (parms)))
- {
- rtx best_rtl;
- /* Parm passed in registers and lives in registers or nowhere. */
-
- /* If parm lives in a register, use that register;
- pretend the parm was passed there. It would be more consistent
- to describe the register where the parm was passed,
- but in practice that register usually holds something else. */
- if (REGNO (DECL_RTL (parms)) < FIRST_PSEUDO_REGISTER)
- best_rtl = DECL_RTL (parms);
- /* If the parm lives nowhere,
- use the register where it was passed. */
- else
- best_rtl = DECL_INCOMING_RTL (parms);
-
- PUT_SDB_DEF (name);
- PUT_SDB_INT_VAL (DBX_REGISTER_NUMBER (REGNO (best_rtl)));
- PUT_SDB_SCL (C_REGPARM);
- PUT_SDB_TYPE (plain_type (TREE_TYPE (parms)));
- PUT_SDB_ENDEF;
- }
- else if (MEM_P (DECL_RTL (parms))
- && XEXP (DECL_RTL (parms), 0) != const0_rtx)
- {
- /* Parm was passed in registers but lives on the stack. */
-
- /* DECL_RTL looks like (MEM (PLUS (REG...) (CONST_INT...))),
- in which case we want the value of that CONST_INT,
- or (MEM (REG ...)) or (MEM (MEM ...)),
- in which case we use a value of zero. */
- if (REG_P (XEXP (DECL_RTL (parms), 0))
- || MEM_P (XEXP (DECL_RTL (parms), 0)))
- current_sym_value = 0;
- else
- current_sym_value = INTVAL (XEXP (XEXP (DECL_RTL (parms), 0), 1));
-
- /* Again, this assumes the offset is based on the arg pointer. */
- PUT_SDB_DEF (name);
- PUT_SDB_INT_VAL (DEBUGGER_ARG_OFFSET (current_sym_value,
- XEXP (DECL_RTL (parms), 0)));
- PUT_SDB_SCL (C_ARG);
- PUT_SDB_TYPE (plain_type (TREE_TYPE (parms)));
- PUT_SDB_ENDEF;
- }
- }
-}
-
-/* Output definitions for the places where parms live during the function,
- when different from where they were passed, when the parms were passed
- in memory.
-
- It is not useful to do this for parms passed in registers
- that live during the function in different registers, because it is
- impossible to look in the passed register for the passed value,
- so we use the within-the-function register to begin with.
-
- PARMS is a chain of PARM_DECL nodes. */
-
-static void
-sdbout_reg_parms (tree parms)
-{
- for (; parms; parms = TREE_CHAIN (parms))
- if (DECL_NAME (parms)
- && TREE_TYPE (parms) != error_mark_node
- && DECL_RTL_SET_P (parms)
- && DECL_INCOMING_RTL (parms))
- {
- const char *name = IDENTIFIER_POINTER (DECL_NAME (parms));
-
- /* Report parms that live in registers during the function
- but were passed in memory. */
- if (REG_P (DECL_RTL (parms))
- && REGNO (DECL_RTL (parms)) < FIRST_PSEUDO_REGISTER
- && PARM_PASSED_IN_MEMORY (parms))
- {
- if (name == 0 || *name == 0)
- name = gen_fake_label ();
- PUT_SDB_DEF (name);
- PUT_SDB_INT_VAL (DBX_REGISTER_NUMBER (REGNO (DECL_RTL (parms))));
- PUT_SDB_SCL (C_REG);
- PUT_SDB_TYPE (plain_type (TREE_TYPE (parms)));
- PUT_SDB_ENDEF;
- }
- /* Report parms that live in memory but not where they were passed. */
- else if (MEM_P (DECL_RTL (parms))
- && GET_CODE (XEXP (DECL_RTL (parms), 0)) == PLUS
- && CONST_INT_P (XEXP (XEXP (DECL_RTL (parms), 0), 1))
- && PARM_PASSED_IN_MEMORY (parms)
- && ! rtx_equal_p (DECL_RTL (parms), DECL_INCOMING_RTL (parms)))
- {
-#if 0 /* ??? It is not clear yet what should replace this. */
- int offset = DECL_OFFSET (parms) / BITS_PER_UNIT;
- /* A parm declared char is really passed as an int,
- so it occupies the least significant bytes.
- On a big-endian machine those are not the low-numbered ones. */
- if (BYTES_BIG_ENDIAN
- && offset != -1
- && TREE_TYPE (parms) != DECL_ARG_TYPE (parms))
- offset += (GET_MODE_SIZE (TYPE_MODE (DECL_ARG_TYPE (parms)))
- - GET_MODE_SIZE (GET_MODE (DECL_RTL (parms))));
- if (INTVAL (XEXP (XEXP (DECL_RTL (parms), 0), 1)) != offset) {...}
-#endif
- {
- if (name == 0 || *name == 0)
- name = gen_fake_label ();
- PUT_SDB_DEF (name);
- PUT_SDB_INT_VAL (DEBUGGER_AUTO_OFFSET
- (XEXP (DECL_RTL (parms), 0)));
- PUT_SDB_SCL (C_AUTO);
- PUT_SDB_TYPE (plain_type (TREE_TYPE (parms)));
- PUT_SDB_ENDEF;
- }
- }
- }
-}
-
-/* Output early debug information for a global DECL. Called from
- rest_of_decl_compilation during parsing. */
-
-static void
-sdbout_early_global_decl (tree decl ATTRIBUTE_UNUSED)
-{
- /* NYI for non-dwarf. */
-}
-
-/* Output late debug information for a global DECL after location
- information is available. */
-
-static void
-sdbout_late_global_decl (tree decl)
-{
- if (VAR_P (decl) && !DECL_EXTERNAL (decl) && DECL_RTL_SET_P (decl))
- {
- /* The COFF linker can move initialized global vars to the end.
- And that can screw up the symbol ordering. Defer those for
- sdbout_finish (). */
- if (!DECL_INITIAL (decl) || !TREE_PUBLIC (decl))
- sdbout_symbol (decl, 0);
- else
- vec_safe_push (deferred_global_decls, decl);
-
- /* Output COFF information for non-global file-scope initialized
- variables. */
- if (DECL_INITIAL (decl) && MEM_P (DECL_RTL (decl)))
- sdbout_toplevel_data (decl);
- }
-}
-
-/* Output initialized global vars at the end, in the order of
- definition. See comment in sdbout_global_decl. */
-
-static void
-sdbout_finish (const char *main_filename ATTRIBUTE_UNUSED)
-{
- size_t i;
- tree decl;
-
- FOR_EACH_VEC_SAFE_ELT (deferred_global_decls, i, decl)
- sdbout_symbol (decl, 0);
-}
-
-/* Describe the beginning of an internal block within a function.
- Also output descriptions of variables defined in this block.
-
- N is the number of the block, by order of beginning, counting from 1,
- and not counting the outermost (function top-level) block.
- The blocks match the BLOCKs in DECL_INITIAL (current_function_decl),
- if the count starts at 0 for the outermost one. */
-
-static void
-sdbout_begin_block (unsigned int line, unsigned int n)
-{
- tree decl = current_function_decl;
- MAKE_LINE_SAFE (line);
-
- /* The SCO compiler does not emit a separate block for the function level
- scope, so we avoid it here also. */
- PUT_SDB_BLOCK_START (line - sdb_begin_function_line);
-
- if (n == 1)
- {
- /* Include the outermost BLOCK's variables in block 1. */
- do_block = BLOCK_NUMBER (DECL_INITIAL (decl));
- sdbout_block (DECL_INITIAL (decl));
- }
- /* If -g1, suppress all the internal symbols of functions
- except for arguments. */
- if (debug_info_level != DINFO_LEVEL_TERSE)
- {
- do_block = n;
- sdbout_block (DECL_INITIAL (decl));
- }
-
-#ifdef SDB_ALLOW_FORWARD_REFERENCES
- sdbout_dequeue_anonymous_types ();
-#endif
-}
-
-/* Describe the end line-number of an internal block within a function. */
-
-static void
-sdbout_end_block (unsigned int line, unsigned int n ATTRIBUTE_UNUSED)
-{
- MAKE_LINE_SAFE (line);
-
- /* The SCO compiler does not emit a separate block for the function level
- scope, so we avoid it here also. */
- if (n != 1)
- PUT_SDB_BLOCK_END (line - sdb_begin_function_line);
-}
-
-/* Output a line number symbol entry for source file FILENAME and line
- number LINE. */
-
-static void
-sdbout_source_line (unsigned int line, unsigned int column ATTRIBUTE_UNUSED,
- const char *filename ATTRIBUTE_UNUSED,
- int discriminator ATTRIBUTE_UNUSED,
- bool is_stmt ATTRIBUTE_UNUSED)
-{
- /* COFF relative line numbers must be positive. */
- if ((int) line > sdb_begin_function_line)
- {
-#ifdef SDB_OUTPUT_SOURCE_LINE
- SDB_OUTPUT_SOURCE_LINE (asm_out_file, line);
-#else
- fprintf (asm_out_file, "\t.ln\t%d\n",
- ((sdb_begin_function_line > -1)
- ? line - sdb_begin_function_line : 1));
-#endif
- }
-}
-
-/* Output sdb info for the current function name.
- Called from assemble_start_function. */
-
-static void
-sdbout_begin_function (tree decl ATTRIBUTE_UNUSED)
-{
- sdbout_symbol (current_function_decl, 0);
-}
-
-/* Called at beginning of function body after prologue. Record the
- function's starting line number, so we can output relative line numbers
- for the other lines. Describe beginning of outermost block. Also
- describe the parameter list. */
-
-static void
-sdbout_begin_prologue (unsigned int line, unsigned int column ATTRIBUTE_UNUSED,
- const char *file ATTRIBUTE_UNUSED)
-{
- sdbout_end_prologue (line, file);
-}
-
-static void
-sdbout_end_prologue (unsigned int line, const char *file ATTRIBUTE_UNUSED)
-{
- sdb_begin_function_line = line - 1;
- PUT_SDB_FUNCTION_START (line);
- sdbout_parms (DECL_ARGUMENTS (current_function_decl));
- sdbout_reg_parms (DECL_ARGUMENTS (current_function_decl));
-}
-
-/* Called at end of function (before epilogue).
- Describe end of outermost block. */
-
-static void
-sdbout_end_function (unsigned int line)
-{
-#ifdef SDB_ALLOW_FORWARD_REFERENCES
- sdbout_dequeue_anonymous_types ();
-#endif
-
- MAKE_LINE_SAFE (line);
- PUT_SDB_FUNCTION_END (line - sdb_begin_function_line);
-
- /* Indicate we are between functions, for line-number output. */
- sdb_begin_function_line = -1;
-}
-
-/* Output sdb info for the absolute end of a function.
- Called after the epilogue is output. */
-
-static void
-sdbout_end_epilogue (unsigned int line ATTRIBUTE_UNUSED,
- const char *file ATTRIBUTE_UNUSED)
-{
- const char *const name ATTRIBUTE_UNUSED
- = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (current_function_decl));
-
-#ifdef PUT_SDB_EPILOGUE_END
- PUT_SDB_EPILOGUE_END (name);
-#else
- fprintf (asm_out_file, "\t.def\t");
- assemble_name (asm_out_file, name);
- fprintf (asm_out_file, "%s\t.val\t.%s\t.scl\t-1%s\t.endef\n",
- SDB_DELIM, SDB_DELIM, SDB_DELIM);
-#endif
-}
-
-/* Output sdb info for the given label. Called only if LABEL_NAME (insn)
- is present. */
-
-static void
-sdbout_label (rtx_code_label *insn)
-{
- PUT_SDB_DEF (LABEL_NAME (insn));
- PUT_SDB_VAL (insn);
- PUT_SDB_SCL (C_LABEL);
- PUT_SDB_TYPE (T_NULL);
- PUT_SDB_ENDEF;
-}
-
-/* Change to reading from a new source file. */
-
-static void
-sdbout_start_source_file (unsigned int line ATTRIBUTE_UNUSED,
- const char *filename ATTRIBUTE_UNUSED)
-{
-}
-
-/* Revert to reading a previous source file. */
-
-static void
-sdbout_end_source_file (unsigned int line ATTRIBUTE_UNUSED)
-{
-}
-
-/* Set up for SDB output at the start of compilation. */
-
-static void
-sdbout_init (const char *input_file_name ATTRIBUTE_UNUSED)
-{
- tree t;
-
- vec_alloc (deferred_global_decls, 12);
-
- /* Emit debug information which was queued by sdbout_symbol before
- we got here. */
- sdbout_initialized = true;
-
- for (t = nreverse (preinit_symbols); t; t = TREE_CHAIN (t))
- sdbout_symbol (TREE_VALUE (t), 0);
- preinit_symbols = 0;
-}
-
-#include "gt-sdbout.h"
diff --git a/gcc/sdbout.h b/gcc/sdbout.h
deleted file mode 100644
index 204b68790ce..00000000000
--- a/gcc/sdbout.h
+++ /dev/null
@@ -1,26 +0,0 @@
-/* sdbout.h - Various declarations for functions found in sdbout.c
- Copyright (C) 1998-2017 Free Software Foundation, Inc.
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify it under
-the terms of the GNU General Public License as published by the Free
-Software Foundation; either version 3, or (at your option) any later
-version.
-
-GCC is distributed in the hope that it will be useful, but WITHOUT ANY
-WARRANTY; without even the implied warranty of MERCHANTABILITY or
-FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
-for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3. If not see
-<http://www.gnu.org/licenses/>. */
-
-#ifndef GCC_SDBOUT_H
-#define GCC_SDBOUT_H
-
-extern void sdbout_symbol (tree, int);
-extern void sdbout_types (tree);
-
-#endif /* GCC_SDBOUT_H */
diff --git a/gcc/selftest-diagnostic.c b/gcc/selftest-diagnostic.c
new file mode 100644
index 00000000000..201806288e4
--- /dev/null
+++ b/gcc/selftest-diagnostic.c
@@ -0,0 +1,62 @@
+/* Selftest support for diagnostics.
+ Copyright (C) 2016-2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "diagnostic.h"
+#include "selftest.h"
+#include "selftest-diagnostic.h"
+
+/* The selftest code should entirely disappear in a production
+ configuration, hence we guard all of it with #if CHECKING_P. */
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Implementation of class selftest::test_diagnostic_context. */
+
+test_diagnostic_context::test_diagnostic_context ()
+{
+ diagnostic_initialize (this, 0);
+ show_caret = true;
+ show_column = true;
+ start_span = start_span_cb;
+}
+
+test_diagnostic_context::~test_diagnostic_context ()
+{
+ diagnostic_finish (this);
+}
+
+/* Implementation of diagnostic_start_span_fn, hiding the
+ real filename (to avoid printing the names of tempfiles). */
+
+void
+test_diagnostic_context::start_span_cb (diagnostic_context *context,
+ expanded_location exploc)
+{
+ exploc.file = "FILENAME";
+ default_diagnostic_start_span_fn (context, exploc);
+}
+
+} // namespace selftest
+
+#endif /* #if CHECKING_P */
diff --git a/gcc/selftest-diagnostic.h b/gcc/selftest-diagnostic.h
new file mode 100644
index 00000000000..61525dcfd43
--- /dev/null
+++ b/gcc/selftest-diagnostic.h
@@ -0,0 +1,49 @@
+/* Selftest support for diagnostics.
+ Copyright (C) 2016-2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_SELFTEST_DIAGNOSTIC_H
+#define GCC_SELFTEST_DIAGNOSTIC_H
+
+/* The selftest code should entirely disappear in a production
+ configuration, hence we guard all of it with #if CHECKING_P. */
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Convenience subclass of diagnostic_context for testing
+ the diagnostic subsystem. */
+
+class test_diagnostic_context : public diagnostic_context
+{
+ public:
+ test_diagnostic_context ();
+ ~test_diagnostic_context ();
+
+ /* Implementation of diagnostic_start_span_fn, hiding the
+ real filename (to avoid printing the names of tempfiles). */
+ static void
+ start_span_cb (diagnostic_context *context, expanded_location exploc);
+};
+
+} // namespace selftest
+
+#endif /* #if CHECKING_P */
+
+#endif /* GCC_SELFTEST_DIAGNOSTIC_H */
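A minimal sketch of how a selftest might exercise the new test_diagnostic_context class, assuming the usual selftest helpers (rich_location, diagnostic_show_locus, pp_formatted_text); the function name is hypothetical and the code is illustrative only, not part of this patch:

    #if CHECKING_P
    namespace selftest {

    /* Capture locus output through a test_diagnostic_context; the text is
       accumulated in the context's pretty-printer for checking.  */

    static void
    test_show_locus_sketch ()
    {
      test_diagnostic_context dc;
      rich_location richloc (line_table, UNKNOWN_LOCATION);
      diagnostic_show_locus (&dc, &richloc, DK_ERROR);
      /* Real tests build locations from a temporary source file and
	 ASSERT_STREQ against the expected caret output here.  */
      const char *text ATTRIBUTE_UNUSED = pp_formatted_text (dc.printer);
    }

    } // namespace selftest
    #endif /* CHECKING_P */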
diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
index 80ae8f9799b..6030d3b22e7 100644
--- a/gcc/selftest-run-tests.c
+++ b/gcc/selftest-run-tests.c
@@ -67,6 +67,7 @@ selftest::run_tests ()
sreal_c_tests ();
fibonacci_heap_c_tests ();
typed_splay_tree_c_tests ();
+ unique_ptr_tests_cc_tests ();
/* Mid-level data structures. */
input_c_tests ();
diff --git a/gcc/selftest.h b/gcc/selftest.h
index c5135b0cb60..253cfc2d732 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -195,6 +195,7 @@ extern void store_merging_c_tests ();
extern void typed_splay_tree_c_tests ();
extern void tree_c_tests ();
extern void tree_cfg_c_tests ();
+extern void unique_ptr_tests_cc_tests ();
extern void vec_c_tests ();
extern void wide_int_cc_tests ();
extern void predict_c_tests ();
diff --git a/gcc/sese.c b/gcc/sese.c
index 8aa8015290d..89cddf0ec97 100644
--- a/gcc/sese.c
+++ b/gcc/sese.c
@@ -156,12 +156,8 @@ new_sese_info (edge entry, edge exit)
region->liveout = NULL;
region->debug_liveout = NULL;
region->params.create (3);
- region->rename_map = new rename_map_t;
- region->parameter_rename_map = new parameter_rename_map_t;
- region->copied_bb_map = new bb_map_t;
+ region->rename_map = new hash_map <tree, tree>;
region->bbs.create (3);
- region->incomplete_phis.create (3);
-
return region;
}
@@ -175,24 +171,9 @@ free_sese_info (sese_info_p region)
BITMAP_FREE (region->liveout);
BITMAP_FREE (region->debug_liveout);
- for (rename_map_t::iterator it = region->rename_map->begin ();
- it != region->rename_map->end (); ++it)
- (*it).second.release ();
-
- for (bb_map_t::iterator it = region->copied_bb_map->begin ();
- it != region->copied_bb_map->end (); ++it)
- (*it).second.release ();
-
delete region->rename_map;
- delete region->parameter_rename_map;
- delete region->copied_bb_map;
-
region->rename_map = NULL;
- region->parameter_rename_map = NULL;
- region->copied_bb_map = NULL;
-
region->bbs.release ();
- region->incomplete_phis.release ();
XDELETE (region);
}
@@ -459,41 +440,16 @@ scev_analyzable_p (tree def, sese_l &region)
tree
scalar_evolution_in_region (const sese_l &region, loop_p loop, tree t)
{
- gimple *def;
- struct loop *def_loop;
-
/* SCOP parameters. */
if (TREE_CODE (t) == SSA_NAME
&& !defined_in_sese_p (t, region))
return t;
- if (TREE_CODE (t) != SSA_NAME
- || loop_in_sese_p (loop, region))
- /* FIXME: we would need instantiate SCEV to work on a region, and be more
- flexible wrt. memory loads that may be invariant in the region. */
- return instantiate_scev (region.entry, loop,
- analyze_scalar_evolution (loop, t));
-
- def = SSA_NAME_DEF_STMT (t);
- def_loop = loop_containing_stmt (def);
-
- if (loop_in_sese_p (def_loop, region))
- {
- t = analyze_scalar_evolution (def_loop, t);
- def_loop = superloop_at_depth (def_loop, loop_depth (loop) + 1);
- t = compute_overall_effect_of_inner_loop (def_loop, t);
- return t;
- }
-
- bool has_vdefs = false;
- if (invariant_in_sese_p_rec (t, region, &has_vdefs))
- return t;
-
- /* T variates in REGION. */
- if (has_vdefs)
- return chrec_dont_know;
+ if (!loop_in_sese_p (loop, region))
+ loop = NULL;
- return instantiate_scev (region.entry, loop, t);
+ return instantiate_scev (region.entry, loop,
+ analyze_scalar_evolution (loop, t));
}
/* Return true if BB is empty, contains only DEBUG_INSNs. */
diff --git a/gcc/sese.h b/gcc/sese.h
index faefd806d9d..cbc20ab1064 100644
--- a/gcc/sese.h
+++ b/gcc/sese.h
@@ -22,14 +22,7 @@ along with GCC; see the file COPYING3. If not see
#ifndef GCC_SESE_H
#define GCC_SESE_H
-typedef hash_map<tree, tree> parameter_rename_map_t;
-typedef hash_map<basic_block, vec<basic_block> > bb_map_t;
-typedef hash_map<tree, vec<tree> > rename_map_t;
typedef struct ifsese_s *ifsese;
-/* First phi is the new codegenerated phi second one is original phi. */
-typedef std::pair <gphi *, gphi *> phi_rename;
-/* First edge is the init edge and second is the back edge w.r.t. a loop. */
-typedef std::pair<edge, edge> init_back_edge_pair_t;
/* A Single Entry, Single Exit region is a part of the CFG delimited
by two edges. */
@@ -92,24 +85,12 @@ typedef struct sese_info_t
/* Parameters used within the SCOP. */
vec<tree> params;
- /* Maps an old name to one or more new names. When there are several new
- names, one has to select the definition corresponding to the immediate
- dominator. */
- rename_map_t *rename_map;
-
- /* Parameters to be renamed. */
- parameter_rename_map_t *parameter_rename_map;
+ /* Maps an old name to a new decl. */
+ hash_map<tree, tree> *rename_map;
/* Basic blocks contained in this SESE. */
vec<basic_block> bbs;
- /* Copied basic blocks indexed by the original bb. */
- bb_map_t *copied_bb_map;
-
- /* A vector of phi nodes to be updated when all arguments are available. The
- pair contains first the old_phi and second the new_phi. */
- vec<phi_rename> incomplete_phis;
-
/* The condition region generated for this sese. */
ifsese if_region;
diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
index 3cad7760f9c..0e4ff6cd46a 100644
--- a/gcc/shrink-wrap.c
+++ b/gcc/shrink-wrap.c
@@ -561,8 +561,7 @@ handle_simple_exit (edge e)
BB_END (old_bb) = end;
redirect_edge_succ (e, new_bb);
- new_bb->count = e->count;
- new_bb->frequency = EDGE_FREQUENCY (e);
+ new_bb->count = e->count ();
e->flags |= EDGE_FALLTHRU;
e = make_single_succ_edge (new_bb, EXIT_BLOCK_PTR_FOR_FN (cfun), 0);
@@ -888,7 +887,7 @@ try_shrink_wrapping (edge *entry_edge, rtx_insn *prologue_seq)
if (!dominated_by_p (CDI_DOMINATORS, e->src, pro))
{
num += EDGE_FREQUENCY (e);
- den += e->src->frequency;
+ den += e->src->count.to_frequency (cfun);
}
if (den == 0)
@@ -921,8 +920,6 @@ try_shrink_wrapping (edge *entry_edge, rtx_insn *prologue_seq)
if (dump_file)
fprintf (dump_file, "Duplicated %d to %d\n", bb->index, dup->index);
- bb->frequency = RDIV (num * bb->frequency, den);
- dup->frequency -= bb->frequency;
bb->count = bb->count.apply_scale (num, den);
dup->count -= bb->count;
}
@@ -996,8 +993,7 @@ try_shrink_wrapping (edge *entry_edge, rtx_insn *prologue_seq)
continue;
}
- new_bb->count += e->src->count.apply_probability (e->probability);
- new_bb->frequency += EDGE_FREQUENCY (e);
+ new_bb->count += e->count ();
redirect_edge_and_branch_force (e, new_bb);
if (dump_file)
@@ -1182,7 +1178,7 @@ place_prologue_for_one_component (unsigned int which, basic_block head)
work: this does not always add up to the block frequency at
all, and even if it does, rounding error makes for bad
decisions. */
- SW (bb)->own_cost = bb->frequency;
+ SW (bb)->own_cost = bb->count.to_frequency (cfun);
edge e;
edge_iterator ei;
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index b1b4767d8c4..212b5068cd0 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -357,7 +357,7 @@ delegitimize_mem_from_attrs (rtx x)
x = adjust_address_nv (newx, mode, offset);
}
else if (GET_MODE (x) == GET_MODE (newx)
- && known_zero (offset))
+ && must_eq (offset, 0))
x = newx;
}
}
@@ -6210,7 +6210,7 @@ simplify_subreg (machine_mode outermode, rtx op,
if (may_ge (byte, innersize))
return NULL_RTX;
- if (outermode == innermode && known_zero (byte))
+ if (outermode == innermode && must_eq (byte, 0U))
return op;
if (multiple_p (byte, GET_MODE_UNIT_SIZE (innermode)))
@@ -6256,8 +6256,8 @@ simplify_subreg (machine_mode outermode, rtx op,
rtx newx;
if (outermode == innermostmode
- && known_zero (byte)
- && known_zero (SUBREG_BYTE (op)))
+ && must_eq (byte, 0U)
+ && must_eq (SUBREG_BYTE (op), 0))
return SUBREG_REG (op);
/* Work out the memory offset of the final OUTERMODE value relative
@@ -6639,7 +6639,8 @@ test_vector_ops_duplicate (machine_mode mode, rtx scalar_reg)
mode, offset));
machine_mode narrower_mode;
- if (may_gt (nunits, 2U)
+ if (may_ne (nunits, 2U)
+ && multiple_p (nunits, 2)
&& mode_for_vector (inner_mode, 2).exists (&narrower_mode)
&& VECTOR_MODE_P (narrower_mode))
{
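The predicates used in the hunks above (must_eq, may_ne, may_lt) come from the poly_int work on this branch: a "must" form holds for every possible runtime value of the polynomial, a "may" form holds if it could be true for some value, so must_eq (x, 0) is the stricter replacement for the old known_zero (x). A tiny hedged sketch, assuming the poly_int64 API as used elsewhere in this diff; the function name is made up and the code is illustrative only:

    /* Illustrative only: a compile-time-constant poly_int compares
       exactly, so the "must" predicate holds and the "may" one fails.  */
    static void
    poly_int_predicate_sketch ()
    {
      poly_int64 two = 2;
      gcc_checking_assert (must_eq (two, 2));
      gcc_checking_assert (!may_ne (two, 2));
    }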
diff --git a/gcc/ssa-iterators.h b/gcc/ssa-iterators.h
index c8aa77bd4f3..740cbf13cb2 100644
--- a/gcc/ssa-iterators.h
+++ b/gcc/ssa-iterators.h
@@ -93,6 +93,12 @@ struct imm_use_iterator
break; \
}
+/* Similarly for return. */
+#define RETURN_FROM_IMM_USE_STMT(ITER, VAL) \
+ { \
+ end_imm_use_stmt_traverse (&(ITER)); \
+ return (VAL); \
+ }
/* Use this iterator in combination with FOR_EACH_IMM_USE_STMT to
get access to each occurrence of ssavar on the stmt returned by
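A short sketch of the new macro in use: like BREAK_FROM_IMM_USE_STMT it first tears down the iterator state, but it then returns a value from the enclosing function. The helper name below is hypothetical and the code is illustrative only, not part of the patch:

    /* Return true if NAME has at least one debug use.  */
    static bool
    ssa_name_has_debug_use_p (tree name)
    {
      gimple *use_stmt;
      imm_use_iterator iter;

      FOR_EACH_IMM_USE_STMT (use_stmt, iter, name)
	if (is_gimple_debug (use_stmt))
	  RETURN_FROM_IMM_USE_STMT (iter, true);

      return false;
    }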
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 95f0afa994d..bd7d44471d8 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2761,7 +2761,7 @@ bit_field_mode_iterator
m_bitregion_end (bitregion_end), m_align (align),
m_volatilep (volatilep), m_count (0)
{
- if (known_zero (m_bitregion_end))
+ if (must_eq (m_bitregion_end, 0))
{
/* We can assume that any aligned chunk of ALIGN bits that overlaps
the bitfield is mapped and won't trap, provided that ALIGN isn't
@@ -2809,7 +2809,7 @@ bit_field_mode_iterator::next_mode (scalar_int_mode *out_mode)
/* Stop if the mode goes outside the bitregion. */
HOST_WIDE_INT start = m_bitpos - substart;
- if (maybe_nonzero (m_bitregion_start)
+ if (may_ne (m_bitregion_start, 0)
&& may_lt (start, m_bitregion_start))
break;
HOST_WIDE_INT end = start + unit;
diff --git a/gcc/substring-locations.c b/gcc/substring-locations.c
index 433023d9845..7de435b969f 100644
--- a/gcc/substring-locations.c
+++ b/gcc/substring-locations.c
@@ -63,7 +63,7 @@ along with GCC; see the file COPYING3. If not see
printf(fmt, msg);
^~~
- For each of cases 1-3, if param_range is non-NULL, then it is used
+ For each of cases 1-3, if param_loc is not UNKNOWN_LOCATION, then it is used
as a secondary range within the warning. For example, here it
is used with case 1:
@@ -100,7 +100,7 @@ along with GCC; see the file COPYING3. If not see
ATTRIBUTE_GCC_DIAG (5,0)
bool
format_warning_va (const substring_loc &fmt_loc,
- const source_range *param_range,
+ location_t param_loc,
const char *corrected_substring,
int opt, const char *gmsgid, va_list *ap)
{
@@ -136,13 +136,8 @@ format_warning_va (const substring_loc &fmt_loc,
rich_location richloc (line_table, primary_loc);
- if (param_range)
- {
- location_t param_loc = make_location (param_range->m_start,
- param_range->m_start,
- param_range->m_finish);
- richloc.add_range (param_loc, false);
- }
+ if (param_loc != UNKNOWN_LOCATION)
+ richloc.add_range (param_loc, false);
if (!err && corrected_substring && substring_within_range)
richloc.add_fixit_replace (fmt_substring_range, corrected_substring);
@@ -160,8 +155,8 @@ format_warning_va (const substring_loc &fmt_loc,
if (corrected_substring)
substring_richloc.add_fixit_replace (fmt_substring_range,
corrected_substring);
- inform_at_rich_loc (&substring_richloc,
- "format string is defined here");
+ inform (&substring_richloc,
+ "format string is defined here");
}
return warned;
@@ -171,13 +166,13 @@ format_warning_va (const substring_loc &fmt_loc,
bool
format_warning_at_substring (const substring_loc &fmt_loc,
- const source_range *param_range,
+ location_t param_loc,
const char *corrected_substring,
int opt, const char *gmsgid, ...)
{
va_list ap;
va_start (ap, gmsgid);
- bool warned = format_warning_va (fmt_loc, param_range, corrected_substring,
+ bool warned = format_warning_va (fmt_loc, param_loc, corrected_substring,
opt, gmsgid, &ap);
va_end (ap);
diff --git a/gcc/substring-locations.h b/gcc/substring-locations.h
index a91cc6c8b4a..3d7796db3e6 100644
--- a/gcc/substring-locations.h
+++ b/gcc/substring-locations.h
@@ -77,13 +77,13 @@ class substring_loc
/* Functions for emitting a warning about a format string. */
extern bool format_warning_va (const substring_loc &fmt_loc,
- const source_range *param_range,
+ location_t param_loc,
const char *corrected_substring,
int opt, const char *gmsgid, va_list *ap)
ATTRIBUTE_GCC_DIAG (5,0);
extern bool format_warning_at_substring (const substring_loc &fmt_loc,
- const source_range *param_range,
+ location_t param_loc,
const char *corrected_substring,
int opt, const char *gmsgid, ...)
ATTRIBUTE_GCC_DIAG (5,0);
diff --git a/gcc/system.h b/gcc/system.h
index 01bc134d1cc..187193ff485 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -720,6 +720,16 @@ extern int vsnprintf (char *, size_t, const char *, va_list);
#define __builtin_expect(a, b) (a)
#endif
+/* Some of the headers included by <memory> can use "abort" within a
+ namespace, e.g. "_VSTD::abort();", which fails after we use the
+ preprocessor to redefine "abort" as "fancy_abort" below.
+ Given that unique-ptr.h can use "free", we need to do this after "free"
+ is declared but before "abort" is overridden. */
+
+#ifdef INCLUDE_UNIQUE_PTR
+# include "unique-ptr.h"
+#endif
+
/* Redefine abort to report an internal error w/o coredump, and
reporting the location of the error in the source file. */
extern void fancy_abort (const char *, int, const char *)
@@ -1008,7 +1018,8 @@ extern void fancy_abort (const char *, int, const char *)
ROUND_TOWARDS_ZERO SF_SIZE DF_SIZE XF_SIZE TF_SIZE LIBGCC2_TF_CEXT \
LIBGCC2_LONG_DOUBLE_TYPE_SIZE STRUCT_VALUE \
EH_FRAME_IN_DATA_SECTION TARGET_FLT_EVAL_METHOD_NON_DEFAULT \
- JCR_SECTION_NAME TARGET_USE_JCR_SECTION
+ JCR_SECTION_NAME TARGET_USE_JCR_SECTION SDB_DEBUGGING_INFO \
+ SDB_DEBUG
/* Hooks that are no longer used. */
#pragma GCC poison LANG_HOOKS_FUNCTION_MARK LANG_HOOKS_FUNCTION_FREE \
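A sketch of the intended usage of the new INCLUDE_UNIQUE_PTR guard: a source file requests unique-ptr.h before including system.h, so the header is pulled in after "free" is declared but before "abort" is redefined to fancy_abort. Illustrative only, not taken from the patch:

    #include "config.h"
    #define INCLUDE_UNIQUE_PTR
    #include "system.h"
    #include "coretypes.h"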
diff --git a/gcc/target-insns.def b/gcc/target-insns.def
index 4669439c7e1..75976b2f8d9 100644
--- a/gcc/target-insns.def
+++ b/gcc/target-insns.def
@@ -60,6 +60,7 @@ DEF_TARGET_INSN (jump, (rtx x0))
DEF_TARGET_INSN (load_multiple, (rtx x0, rtx x1, rtx x2))
DEF_TARGET_INSN (mem_thread_fence, (rtx x0))
DEF_TARGET_INSN (memory_barrier, (void))
+DEF_TARGET_INSN (memory_blockage, (void))
DEF_TARGET_INSN (movstr, (rtx x0, rtx x1, rtx x2))
DEF_TARGET_INSN (nonlocal_goto, (rtx x0, rtx x1, rtx x2, rtx x3))
DEF_TARGET_INSN (nonlocal_goto_receiver, (void))
diff --git a/gcc/target.def b/gcc/target.def
index 3129c3f3210..6593df14ee0 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -876,9 +876,8 @@ to generate it on the spot.",
DEFHOOK
(output_source_filename,
- "Output COFF information or DWARF debugging information which indicates\
- that filename @var{name} is the current source file to the stdio\
- stream @var{file}.\n\
+ "Output DWARF debugging information which indicates that filename\
+ @var{name} is the current source file to the stdio stream @var{file}.\n\
\n\
This target hook need not be defined if the standard form of output\
for the file format in use is appropriate.",
@@ -2837,7 +2836,7 @@ DEFHOOK
"This hook should return true if @var{x} should not be emitted into\n\
debug sections.",
bool, (rtx x),
- hook_bool_rtx_false)
+ default_const_not_ok_for_debug_p)
/* Given an address RTX, say whether it is valid. */
DEFHOOK
@@ -3489,6 +3488,19 @@ if @var{extended} is false, 16 or greater than 128 and a multiple of 32.",
opt_scalar_float_mode, (int n, bool extended),
default_floatn_mode)
+DEFHOOK
+(floatn_builtin_p,
+ "Define this to return true if the @code{_Float@var{n}} and\n\
+@code{_Float@var{n}x} built-in functions should implicitly enable the\n\
+built-in function without the @code{__builtin_} prefix in addition to the\n\
+normal built-in function with the @code{__builtin_} prefix. The default is\n\
+to only enable built-in functions without the @code{__builtin_} prefix for\n\
+the GNU C language. In strict ANSI/ISO mode, the built-in function without\n\
+the @code{__builtin_} prefix is not enabled. The argument @code{FUNC} is the\n\
+@code{enum built_in_function} id of the function to be enabled.",
+ bool, (int func),
+ default_floatn_builtin_p)
+
/* Compute cost of moving data from a register of class FROM to one of
TO, using MODE. */
DEFHOOK
diff --git a/gcc/target.h b/gcc/target.h
index 1b8decd1b49..e7bdad33d34 100644
--- a/gcc/target.h
+++ b/gcc/target.h
@@ -171,9 +171,11 @@ enum vect_cost_for_stmt
scalar_store,
vector_stmt,
vector_load,
+ vector_gather_load,
unaligned_load,
unaligned_store,
vector_store,
+ vector_scatter_store,
vec_to_scalar,
scalar_to_vec,
cond_branch_not_taken,
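With vector_gather_load and vector_scatter_store added to vect_cost_for_stmt, a target's builtin_vectorization_cost hook can price them separately from ordinary vector loads and stores. A hedged sketch with hypothetical names and costs, illustrative only:

    /* Charge gathers and scatters more than ordinary vector memory ops.  */
    static int
    example_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
					tree vectype ATTRIBUTE_UNUSED,
					int misalign ATTRIBUTE_UNUSED)
    {
      switch (type_of_cost)
	{
	case vector_gather_load:
	case vector_scatter_store:
	  return 4;
	default:
	  return 1;
	}
    }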
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 1251e452ff5..5d8ecd31b8c 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -81,7 +81,7 @@ along with GCC; see the file COPYING3. If not see
#include "predict.h"
#include "params.h"
#include "real.h"
-
+#include "langhooks.h"
bool
default_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
@@ -176,6 +176,14 @@ default_legitimize_address_displacement (rtx *, rtx *, poly_int64,
return false;
}
+bool
+default_const_not_ok_for_debug_p (rtx x)
+{
+ if (GET_CODE (x) == UNSPEC)
+ return true;
+ return false;
+}
+
rtx
default_expand_builtin_saveregs (void)
{
@@ -554,6 +562,28 @@ default_floatn_mode (int n, bool extended)
return opt_scalar_float_mode ();
}
+/* Define this to return true if the _Floatn and _Floatnx built-in functions
+ should implicitly enable the built-in function without the __builtin_ prefix
+ in addition to the normal built-in function with the __builtin_ prefix. The
+ default is to only enable built-in functions without the __builtin_ prefix
+ for the GNU C language. The argument FUNC is the enum built_in_function
+ id of the function to be enabled. */
+
+bool
+default_floatn_builtin_p (int func ATTRIBUTE_UNUSED)
+{
+ static bool first_time_p = true;
+ static bool c_or_objective_c;
+
+ if (first_time_p)
+ {
+ first_time_p = false;
+ c_or_objective_c = lang_GNU_C () || lang_GNU_OBJC ();
+ }
+
+ return c_or_objective_c;
+}
+
/* Make some target macros useable by target-independent code. */
bool
targhook_words_big_endian (void)
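A back end that wants the _FloatN built-ins available in every language could override the new hook instead of relying on the default above; the function name and behaviour below are hypothetical, illustrative only:

    static bool
    example_floatn_builtin_p (int func ATTRIBUTE_UNUSED)
    {
      /* Enable the unprefixed _FloatN built-ins unconditionally.  */
      return true;
    }

    #undef TARGET_FLOATN_BUILTIN_P
    #define TARGET_FLOATN_BUILTIN_P example_floatn_builtin_p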
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index ac315a476e4..917431f17ee 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -26,6 +26,7 @@ extern void default_external_libcall (rtx);
extern rtx default_legitimize_address (rtx, rtx, machine_mode);
extern bool default_legitimize_address_displacement (rtx *, rtx *,
poly_int64, machine_mode);
+extern bool default_const_not_ok_for_debug_p (rtx);
extern int default_unspec_may_trap_p (const_rtx, unsigned);
extern machine_mode default_promote_function_mode (const_tree, machine_mode,
@@ -74,6 +75,7 @@ extern tree default_mangle_assembler_name (const char *);
extern bool default_scalar_mode_supported_p (scalar_mode);
extern bool default_libgcc_floating_mode_supported_p (scalar_float_mode);
extern opt_scalar_float_mode default_floatn_mode (int, bool);
+extern bool default_floatn_builtin_p (int);
extern bool targhook_words_big_endian (void);
extern bool targhook_float_words_big_endian (void);
extern bool default_float_exceptions_rounding_supported_p (void);
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 566864c2183..10331b39929 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,1301 @@
+2017-11-04 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/81735
+ * gfortran.dg/pr81735.f90: New test.
+
+2017-11-03 Steven G. Kargl <kargl@gcc.gnu.org>
+
+ PR fortran/82796
+ * gfortran.dg/equiv_pure.f90: New test.
+
+2017-11-03 Jeff Law <law@redhat.com>
+
+ PR target/82823
+ * g++.dg/torture/pr82823.C: New test.
+
+ * gcc.target/i386/stack-check-12.c: New test.
+
+2017-11-03 Jakub Jelinek <jakub@redhat.com>
+
+ PR tree-optimization/78821
+ * gcc.dg/store_merging_13.c: New test.
+ * gcc.dg/store_merging_14.c: New test.
+
+2017-11-03 Steven G. Kargl <kargl@gcc.gnu.org>
+
+ * gfortran.dg/large_real_kind_2.F90: Test passes on FreeBSD. Remove
+ dg-xfail-if directive.
+
+2017-11-03 Sandra Loosemore <sandra@codesourcery.com>
+
+ * gcc.target/mips/msa.c: Add -fcommon to dg-options.
+
+2017-11-03 Uros Bizjak <ubizjak@gmail.com>
+
+ PR testsuite/82828
+ PR rtl-optimization/70263
+ * gcc.target/i386/pr70263-2.c: Fix invalid testcase.
+
+2017-11-03 Marc Glisse <marc.glisse@inria.fr>
+
+ * gcc.dg/tree-ssa/negneg-1.c: New file.
+ * gcc.dg/tree-ssa/negneg-2.c: Likewise.
+ * gcc.dg/tree-ssa/negneg-3.c: Likewise.
+ * gcc.dg/tree-ssa/negneg-4.c: Likewise.
+
+2017-11-03 Jan Hubicka <hubicka@ucw.cz>
+
+ * gcc.dg/no-strict-overflow-3.c (foo): Update magic
+ value to not clash with frequency.
+ * gcc.dg/strict-overflow-3.c (foo): Likewise.
+ * gcc.dg/tree-ssa/builtin-sprintf-2.c: Update template.
+ * gcc.dg/tree-ssa/dump-2.c: Update template.
+ * gcc.dg/tree-ssa/ifc-10.c: Update template.
+ * gcc.dg/tree-ssa/ifc-11.c: Update template.
+ * gcc.dg/tree-ssa/ifc-12.c: Update template.
+ * gcc.dg/tree-ssa/ifc-20040816-1.c: Update template.
+ * gcc.dg/tree-ssa/ifc-20040816-2.c: Update template.
+ * gcc.dg/tree-ssa/ifc-5.c: Update template.
+ * gcc.dg/tree-ssa/ifc-8.c: Update template.
+ * gcc.dg/tree-ssa/ifc-9.c: Update template.
+ * gcc.dg/tree-ssa/ifc-cd.c: Update template.
+ * gcc.dg/tree-ssa/ifc-pr56541.c: Update template.
+ * gcc.dg/tree-ssa/ifc-pr68583.c: Update template.
+ * gcc.dg/tree-ssa/ifc-pr69489-1.c: Update template.
+ * gcc.dg/tree-ssa/ifc-pr69489-2.c: Update template.
+ * gcc.target/i386/pr61403.c: Update template.
+
+2017-11-03 Nathan Sidwell <nathan@acm.org>
+
+ * lib/scanlang.exp: Fix error message to refer to scan-lang-dump.
+
+ PR c++/82710
+ * g++.dg/warn/pr82710.C: More cases.
+
+2017-11-03 Richard Sandiford <richard.sandiford@linaro.org>
+
+ * gcc.dg/pr82809.c: New test.
+
+2017-11-02 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/81957
+ * g++.dg/cpp0x/variadic-crash5.C: New.
+
+2017-11-02 Steve Ellcey <sellcey@cavium.com>
+
+ PR target/79868
+ * gcc.target/aarch64/spellcheck_1.c: Update dg-error string to match
+ new format.
+ * gcc.target/aarch64/spellcheck_2.c: Ditto.
+ * gcc.target/aarch64/spellcheck_3.c: Ditto.
+ * gcc.target/aarch64/target_attr_11.c: Ditto.
+ * gcc.target/aarch64/target_attr_12.c: Ditto.
+ * gcc.target/aarch64/target_attr_17.c: Ditto.
+
+2017-11-02 Nathan Sidwell <nathan@acm.org>
+
+ PR c++/82710
+ * g++.dg/warn/pr82710.C: New.
+
+ * g++.dg/lang-dump.C: New.
+
+2017-11-02 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82795
+ * gcc.target/i386/pr82795.c: New testcase.
+
+2017-11-02 Claudiu Zissulescu <claziss@synopsys.com>
+
+ * gcc.target/arc/loop-1.c: Add test.
+
+2017-11-02 Tom de Vries <tom@codesourcery.com>
+
+ PR testsuite/82415
+ * gcc.target/i386/naked-1.c: Make scan patterns more precise.
+ * gcc.target/i386/naked-2.c: Same.
+
+2017-11-02 Richard Biener <rguenther@suse.de>
+
+ PR middle-end/82765
+ * gcc.dg/pr82765.c: New testcase.
+
+2017-11-02 Tom de Vries <tom@codesourcery.com>
+
+ * gfortran.dg/implied_do_io_1.f90: Fix scan-tree-dump-times pattern.
+
+2017-11-01 Jakub Jelinek <jakub@redhat.com>
+
+ PR rtl-optimization/82778
+ * g++.dg/opt/pr82778.C: New test.
+
+2017-11-01 Michael Collison <michael.collison@arm.com>
+
+ PR rtl-optimization/82597
+ * gcc.dg/pr82597.c: New test.
+
+2017-11-01 Uros Bizjak <ubizjak@gmail.com>
+
+ * gcc.target/alpha/sqrt.c: New test.
+
+2017-10-31 Daniel Santos <daniel.santos@pobox.com>
+
+ * gcc.target/i386/pr82002-1.c: New test.
+ * gcc.target/i386/pr82002-2a.c: New xfail test.
+ * gcc.target/i386/pr82002-2b.c: New xfail test.
+
+2017-10-31 Martin Jambor <mjambor@suse.cz>
+
+ PR c++/81702
+ * g++.dg/tree-ssa/pr81702.C: New test.
+
+2017-10-31 David Malcolm <dmalcolm@redhat.com>
+
+ * jit.dg/jit.exp (jit-dg-test): If PRESERVE_EXECUTABLES is set in
+ the environment, don't delete the generated executable.
+
+2017-10-31 David Malcolm <dmalcolm@redhat.com>
+
+ * g++.dg/cpp0x/auto21.C: Update dg-error to reflect addition of quotes.
+ * g++.dg/cpp0x/missing-initializer_list-include.C: Likewise.
+
+2017-10-31 David Malcolm <dmalcolm@redhat.com>
+
+ * gcc.dg/plugin/diagnostic_plugin_show_trees.c (show_tree): Update
+ for renaming of error_at_rich_loc and inform_at_rich_loc.
+ * gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
+ (test_show_locus): Likewise for renaming of warning_at_rich_loc.
+
+2017-10-31 Martin Liska <mliska@suse.cz>
+
+ * g++.dg/gcov/loop.C: New test.
+ * lib/gcov.exp: Support human readable format for counts.
+
+2017-10-31 Martin Liska <mliska@suse.cz>
+
+ * g++.dg/gcov/ternary.C: New test.
+ * g++.dg/gcov/gcov-threads-1.C (main): Update expected line count.
+ * lib/gcov.exp: Support new format for intermediate file format.
+
+2017-11-01 Julia Koval <julia.koval@intel.com>
+
+ * gcc.target/i386/avx-1.c: Handle new intrinsics.
+ * gcc.target/i386/avx512-check.h: Check GFNI bit.
+ * gcc.target/i386/avx512f-gf2p8affineinvqb-2.c: Runtime test.
+ * gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c: Runtime test.
+ * gcc.target/i386/gfni-1.c: New.
+ * gcc.target/i386/gfni-2.c: New.
+ * gcc.target/i386/gfni-3.c: New.
+ * gcc.target/i386/gfni-4.c: New.
+ * gcc.target/i386/i386.exp (check_effective_target_gfni): New.
+ * gcc.target/i386/sse-12.c: Handle new intrinsics.
+ * gcc.target/i386/sse-13.c: Ditto.
+ * gcc.target/i386/sse-14.c: Ditto.
+ * gcc.target/i386/sse-22.c: Ditto.
+ * gcc.target/i386/sse-23.c: Ditto.
+ * g++.dg/other/i386-2.C: Ditto.
+ * g++.dg/other/i386-3.C: Ditto.
+
+2017-11-01 Michael Collison <michael.collison@arm.com>
+
+ PR rtl-optimization/82597
+ * gcc.dg/pr82597.c: New test.
+
+2017-10-30 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/67595
+ * g++.dg/concepts/pr67595.C: New.
+
+2017-10-30 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/80850
+ * gfortran.dg/class_64.f90: New test.
+
+2017-10-30 Uros Bizjak <ubizjak@gmail.com>
+
+ * g++.dg/pr82725.C: Move to ...
+ * g++.dg/cpp0x/pr82725.C: ... here. Add c++11 target directive.
+
+2017-10-30 Steven G. Kargl <kargl@gcc.gnu.org>
+
+ * gfortran.dg/dtio_13.f90: Remove TODO comment and dg-error test.
+
+2017-10-30 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82085
+ * g++.dg/cpp1y/var-templ56.C: New.
+
+2017-10-30 Nathan Sidwell <nathan@acm.org>
+
+ * g++.dg/other/operator2.C: Adjust diagnostic.
+ * g++.old-deja/g++.jason/operator.C: Likewise.
+
+2017-10-30 Steven Munroe <munroesj@gcc.gnu.org>
+
+ * sse2-check.h: New file.
+ * sse2-addpd-1.c: New file.
+ * sse2-addsd-1.c: New file.
+ * sse2-andnpd-1.c: New file.
+ * sse2-andpd-1.c: New file.
+ * sse2-cmppd-1.c: New file.
+ * sse2-cmpsd-1.c: New file.
+ * sse2-comisd-1.c: New file.
+ * sse2-comisd-2.c: New file.
+ * sse2-comisd-3.c: New file.
+ * sse2-comisd-4.c: New file.
+ * sse2-comisd-5.c: New file.
+ * sse2-comisd-6.c: New file.
+ * sse2-cvtdq2pd-1.c: New file.
+ * sse2-cvtdq2ps-1.c: New file.
+ * sse2-cvtpd2dq-1.c: New file.
+ * sse2-cvtpd2ps-1.c: New file.
+ * sse2-cvtps2dq-1.c: New file.
+ * sse2-cvtps2pd-1.c: New file.
+ * sse2-cvtsd2si-1.c: New file.
+ * sse2-cvtsd2si-2.c: New file.
+ * sse2-cvtsd2ss-1.c: New file.
+ * sse2-cvtsi2sd-1.c: New file.
+ * sse2-cvtsi2sd-2.c: New file.
+ * sse2-cvtss2sd-1.c: New file.
+ * sse2-cvttpd2dq-1.c: New file.
+ * sse2-cvttps2dq-1.c: New file.
+ * sse2-cvttsd2si-1.c: New file.
+ * sse2-cvttsd2si-2.c: New file.
+ * sse2-divpd-1.c: New file.
+ * sse2-divsd-1.c: New file.
+ * sse2-maxpd-1.c: New file.
+ * sse2-maxsd-1.c: New file.
+ * sse2-minpd-1.c: New file.
+ * sse2-minsd-1.c: New file.
+ * sse2-mmx.c: New file.
+ * sse2-movhpd-1.c: New file.
+ * sse2-movhpd-2.c: New file.
+ * sse2-movlpd-1.c: New file.
+ * sse2-movlpd-2.c: New file.
+ * sse2-movmskpd-1.c: New file.
+ * sse2-movq-1.c: New file.
+ * sse2-movq-2.c: New file.
+ * sse2-movq-3.c: New file.
+ * sse2-movsd-1.c: New file.
+ * sse2-movsd-2.c: New file.
+ * sse2-movsd-3.c: New file.
+ * sse2-mulpd-1.c: New file.
+ * sse2-mulsd-1.c: New file.
+ * sse2-orpd-1.c: New file.
+ * sse2-packssdw-1.c: New file.
+ * sse2-packsswb-1.c: New file.
+ * sse2-packuswb-1.c: New file.
+ * sse2-paddb-1.c: New file.
+ * sse2-paddd-1.c: New file.
+ * sse2-paddq-1.c: New file.
+ * sse2-paddsb-1.c: New file.
+ * sse2-paddsw-1.c: New file.
+ * sse2-paddusb-1.c: New file.
+ * sse2-paddusw-1.c: New file.
+ * sse2-paddw-1.c: New file.
+ * sse2-pavgb-1.c: New file.
+ * sse2-pavgw-1.c: New file.
+ * sse2-pcmpeqb-1.c: New file.
+ * sse2-pcmpeqd-1.c: New file.
+ * sse2-pcmpeqw-1.c: New file.
+ * sse2-pcmpgtb-1.c: New file.
+ * sse2-pcmpgtd-1.c: New file.
+ * sse2-pcmpgtw-1.c: New file.
+ * sse2-pextrw.c: New file.
+ * sse2-pinsrw.c: New file.
+ * sse2-pmaddwd-1.c: New file.
+ * sse2-pmaxsw-1.c: New file.
+ * sse2-pmaxub-1.c: New file.
+ * sse2-pminsw-1.c: New file.
+ * sse2-pminub-1.c: New file.
+ * sse2-pmovmskb-1.c: New file.
+ * sse2-pmulhuw-1.c: New file.
+ * sse2-pmulhw-1.c: New file.
+ * sse2-pmullw-1.c: New file.
+ * sse2-pmuludq-1.c: New file.
+ * sse2-psadbw-1.c: New file.
+ * sse2-pshufd-1.c: New file.
+ * sse2-pshufhw-1.c: New file.
+ * sse2-pshuflw-1.c: New file.
+ * sse2-pslld-1.c: New file.
+ * sse2-pslld-2.c: New file.
+ * sse2-pslldq-1.c: New file.
+ * sse2-psllq-1.c: New file.
+ * sse2-psllq-2.c: New file.
+ * sse2-psllw-1.c: New file.
+ * sse2-psllw-2.c: New file.
+ * sse2-psrad-1.c: New file.
+ * sse2-psrad-2.c: New file.
+ * sse2-psraw-1.c: New file.
+ * sse2-psraw-2.c: New file.
+ * sse2-psrld-1.c: New file.
+ * sse2-psrld-2.c: New file.
+ * sse2-psrldq-1.c: New file.
+ * sse2-psrlq-1.c: New file.
+ * sse2-psrlq-2.c: New file.
+ * sse2-psrlw-1.c: New file.
+ * sse2-psrlw-2.c: New file.
+ * sse2-psubb-1.c: New file.
+ * sse2-psubd-1.c: New file.
+
+2017-10-30 Will Schmidt <will_schmidt@vnet.ibm.com>
+
+ * gcc.target/powerpc/fold-vec-perm-longlong.c: Update to use long long
+ types for testcase arguments.
+
+2017-10-30 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82762
+ * gcc.dg/torture/pr82762.c: New testcase.
+
+2017-10-30 Richard Biener <rguenther@suse.de>
+
+ * gcc.dg/gimplefe-27.c: New testcase.
+
+2017-10-30 Joseph Myers <joseph@codesourcery.com>
+
+ * gcc.dg/c17-version-1.c, gcc.dg/c17-version-2.c: New tests.
+
+2017-10-30 Jakub Jelinek <jakub@redhat.com>
+
+ PR middle-end/22141
+ * gcc.dg/store_merging_10.c: New test.
+ * gcc.dg/store_merging_11.c: New test.
+ * gcc.dg/store_merging_12.c: New test.
+ * g++.dg/pr71694.C: Add -fno-store-merging to dg-options.
+
+2017-10-30 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/82725
+ * g++.dg/pr82725.C: New test.
+
+2017-10-29 Jim Wilson <wilson@tuliptree.org>
+
+ * lib/gcc-dg.exp (gcc-dg-debug-runtest): Delete -gcoff.
+ * lib/gfortran-dg.exp (gfortran-dg-debug-runtest): Delete -gcoff.
+
+2017-10-28 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/70971
+ * g++.dg/torture/pr70971.C: New.
+
+2017-10-28 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/81758
+ * gfortran.dg/class_63.f90: New test.
+
+2017-10-27 Steven G. Kargl <kargl@gcc.gnu.org>
+
+ PR fortran/82620
+ * gfortran.dg/allocate_error_7.f90: New test.
+
+2017-10-27 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82218
+ * g++.dg/cpp1y/constexpr-82218.C: New.
+
+2017-10-27 Eric Botcazou <ebotcazou@adacore.com>
+
+ * gnat.dg/opt68.ad[sb]: New test.
+
+2017-10-27 Daniel Santos <daniel.santos@pobox.com>
+
+ * gcc.target/i386/pr82196-1.c (dg-options): Add -mno-avx.
+
+2017-10-27 Michael Meissner <meissner@linux.vnet.ibm.com>
+
+ * gcc.target/powerpc/float128-hw.c: Add support for all 4 FMA
+ variants. Check various conversions to/from float128. Check
+ negation. Use {\m...\M} in the tests.
+ * gcc.target/powerpc/float128-hw2.c: New test for implicit
+ _Float128 math functions.
+ * gcc.target/powerpc/float128-hw3.c: New test for strict ansi mode
+ not implicitly adding the _Float128 math functions.
+ * gcc.target/powerpc/float128-fma2.c: Delete, test is no longer
+ valid.
+ * gcc.target/powerpc/float128-sqrt2.c: Likewise.
+
+2017-10-27 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/82692
+ * gcc.dg/torture/pr82692.c: New test.
+
+2017-10-27 Will Schmidt <will_schmidt@vnet.ibm.com>
+
+ * gcc.target/powerpc/fold-vec-neg-char.c: New.
+ * gcc.target/powerpc/fold-vec-neg-floatdouble.c: New.
+ * gcc.target/powerpc/fold-vec-neg-int.c: New.
+ * gcc.target/powerpc/fold-vec-neg-longlong.c: New.
+ * gcc.target/powerpc/fold-vec-neg-short.c: New.
+
+2017-10-27 Thomas Koenig <tkoenig@gcc.gnu.org>
+
+ PR fortran/56342
+ * gfortran.dg/matmul_const.f90: New test.
+
+2017-10-25 Jan Hubicka <hubicka@ucw.cz>
+
+ * gcc.target/i386/pr70021.c: Add -mtune=skylake.
+
+2017-10-27 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82703
+ * gcc.dg/pr82703.c: New test.
+
+2017-10-27 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
+
+ * gcc.dg/ipa/propmalloc-1.c: New test-case.
+ * gcc.dg/ipa/propmalloc-2.c: Likewise.
+ * gcc.dg/ipa/propmalloc-3.c: Likewise.
+
+2017-10-27 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/71385
+ * g++.dg/concepts/pr71385.C: New.
+
+2017-10-27 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/80739
+ * g++.dg/cpp1y/constexpr-80739.C: New.
+
+2017-10-27 Richard Biener <rguenther@suse.de>
+
+ PR middle-end/81659
+ * g++.dg/torture/pr81659.C: New testcase.
+
+2017-10-26 Michael Collison <michael.collison@arm.com>
+
+ * gcc.target/aarch64/fix_trunc1.c: New testcase.
+ * gcc.target/aarch64/vect-vcvt.c: Fix scan-assembler
+ directives to allow float or integer destination registers for
+ fcvtz[su].
+
+2017-10-26 Sandra Loosemore <sandra@codesourcery.com>
+
+ * gcc.target/nios2/gpopt-r0rel-sec.c: New.
+
+2017-10-26 Sandra Loosemore <sandra@codesourcery.com>
+
+ * gcc.target/nios2/gpopt-gprel-sec.c: New.
+
+2017-10-26 Olga Makhotina <olga.makhotina@intel.com>
+
+ * gcc.target/i386/avx512f-vcmpps-1.c (_mm512_cmpeq_ps_mask,
+ _mm512_cmple_ps_mask, _mm512_cmplt_ps_mask,
+ _mm512_cmpneq_ps_mask, _mm512_cmpnle_ps_mask,
+ _mm512_cmpnlt_ps_mask, _mm512_cmpord_ps_mask,
+ _mm512_cmpunord_ps_mask, _mm512_mask_cmpeq_ps_mask,
+ _mm512_mask_cmple_ps_mask, _mm512_mask_cmplt_ps_mask,
+ _mm512_mask_cmpneq_ps_mask, _mm512_mask_cmpnle_ps_mask,
+ _mm512_mask_cmpnlt_ps_mask, _mm512_mask_cmpord_ps_mask,
+ _mm512_mask_cmpunord_ps_mask): Test new intrinsics.
+ * gcc.target/i386/avx512f-vcmpps-2.c (_mm512_cmpeq_ps_mask,
+ _mm512_cmple_ps_mask, _mm512_cmplt_ps_mask,
+ _mm512_cmpneq_ps_mask, _mm512_cmpnle_ps_mask,
+ _mm512_cmpnlt_ps_mask, _mm512_cmpord_ps_mask,
+ _mm512_cmpunord_ps_mask, _mm512_mask_cmpeq_ps_mask,
+ _mm512_mask_cmple_ps_mask, _mm512_mask_cmplt_ps_mask,
+ _mm512_mask_cmpneq_ps_mask, _mm512_mask_cmpnle_ps_mask,
+ _mm512_mask_cmpnlt_ps_mask, _mm512_mask_cmpord_ps_mask,
+ _mm512_mask_cmpunord_ps_mask): Test new intrinsics.
+ * gcc.target/i386/avx512f-vcmppd-1.c (_mm512_cmpeq_pd_mask,
+ _mm512_cmple_pd_mask, _mm512_cmplt_pd_mask,
+ _mm512_cmpneq_pd_mask, _mm512_cmpnle_pd_mask,
+ _mm512_cmpnlt_pd_mask, _mm512_cmpord_pd_mask,
+ _mm512_cmpunord_pd_mask, _mm512_mask_cmpeq_pd_mask,
+ _mm512_mask_cmple_pd_mask, _mm512_mask_cmplt_pd_mask,
+ _mm512_mask_cmpneq_pd_mask, _mm512_mask_cmpnle_pd_mask,
+ _mm512_mask_cmpnlt_pd_mask, _mm512_mask_cmpord_pd_mask,
+ _mm512_mask_cmpunord_pd_mask): Test new intrinsics.
+ * gcc.target/i386/avx512f-vcmppd-2.c (_mm512_cmpeq_pd_mask,
+ _mm512_cmple_pd_mask, _mm512_cmplt_pd_mask,
+ _mm512_cmpneq_pd_mask, _mm512_cmpnle_pd_mask,
+ _mm512_cmpnlt_pd_mask, _mm512_cmpord_pd_mask,
+ _mm512_cmpunord_pd_mask, _mm512_mask_cmpeq_pd_mask,
+ _mm512_mask_cmple_pd_mask, _mm512_mask_cmplt_pd_mask,
+ _mm512_mask_cmpneq_pd_mask, _mm512_mask_cmpnle_pd_mask,
+ _mm512_mask_cmpnlt_pd_mask, _mm512_mask_cmpord_pd_mask,
+ _mm512_mask_cmpunord_pd_mask): Test new intrinsics.
+
+2017-10-26 Wilco Dijkstra <wdijkstr@arm.com>
+
+ * gcc.target/aarch64/ldp_stp_unaligned_2.c: New file.
+
+2017-10-26 James Greenhalgh <james.greenhalgh@arm.com>
+
+ * gcc.target/arm/require-pic-register-loc.c: Use wider regex for
+ column information.
+
+2017-10-26 Tamar Christina <tamar.christina@arm.com>
+
+ * gcc.dg/vect/vect-reduc-dot-s8a.c
+ (dg-additional-options, dg-require-effective-target): Add +dotprod.
+ * gcc.dg/vect/vect-reduc-dot-u8a.c
+ (dg-additional-options, dg-require-effective-target): Add +dotprod.
+
+2017-10-26 Tamar Christina <tamar.christina@arm.com>
+
+ * lib/target-supports.exp
+ (check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache): New.
+ (check_effective_target_arm_v8_2a_dotprod_neon_ok): New.
+ (add_options_for_arm_v8_2a_dotprod_neon): New.
+ (check_effective_target_arm_v8_2a_dotprod_neon_hw): New.
+ (check_effective_target_vect_sdot_qi): Add ARM && AArch64.
+ (check_effective_target_vect_udot_qi): Likewise.
+ * gcc.target/arm/simd/vdot-exec.c: New.
+ * gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c: New.
+ * gcc/doc/sourcebuild.texi: Document arm_v8_2a_dotprod_neon.
+
+2017-10-26 Tamar Christina <tamar.christina@arm.com>
+
+ * gcc.dg/vect/vect-multitypes-1.c: Correct target selector.
+
+2017-10-26 Tamar Christina <tamar.christina@arm.com>
+
+ * gcc.target/aarch64/inline-lrint_2.c (dg-options): Add -fno-trapping-math.
+
+2017-10-26 Tamar Christina <tamar.christina@arm.com>
+
+ * gcc.target/aarch64/advsimd-intrinsics/vect-dot-qi.h: New.
+ * gcc.target/aarch64/advsimd-intrinsics/vdot-compile.c: New.
+ * gcc.target/aarch64/advsimd-intrinsics/vect-dot-s8.c: New.
+ * gcc.target/aarch64/advsimd-intrinsics/vect-dot-u8.c: New.
+
+2017-10-25 David Malcolm <dmalcolm@redhat.com>
+
+ PR c/7356
+ PR c/44515
+ * c-c++-common/pr44515.c: New test case.
+ * gcc.dg/pr7356-2.c: New test case.
+ * gcc.dg/pr7356.c: New test case.
+ * gcc.dg/spellcheck-typenames.c: Update the "singed" char "TODO"
+ case to reflect changes to output.
+ * gcc.dg/noncompile/920923-1.c: Add dg-warning to reflect changes
+ to output.
+
+2017-10-25 Eric Botcazou <ebotcazou@adacore.com>
+
+ * gcc.dg/fold-cond_expr-1.c: Rename to...
+ * gcc.dg/fold-cond-2.c: ...this.
+ * gcc.dg/fold-cond-3.c: New test.
+
+2017-10-25 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82436
+ * gcc.dg/torture/pr82436-2.c: New testcase.
+
+2017-10-25 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/71820
+ * g++.dg/ext/typeof12.C: New.
+
+2017-10-25 Tom de Vries <tom@codesourcery.com>
+
+ * gcc.dg/tree-ssa/loop-1.c: Add xfail for nvptx in scan-assembler-times
+ line, and add nvptx-specific version.
+
+2017-10-25 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
+
+ * gcc.target/i386/cet-sjlj-5.c: Allow for empty user label prefix
+ in setjmp call.
+
+2017-10-25 Jakub Jelinek <jakub@redhat.com>
+
+ PR libstdc++/81706
+ * gcc.target/i386/pr81706.c: New test.
+ * g++.dg/ext/pr81706.C: New test.
+
+2017-10-24 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82460
+ * gcc.target/i386/pr82460-1.c: New test.
+ * gcc.target/i386/pr82460-2.c: New test.
+ * gcc.target/i386/avx512f-vpermt2pd-1.c: Adjust scan-assembler*
+ regexps to allow vpermt2* to vpermi2* replacement or vice versa
+ where possible.
+ * gcc.target/i386/avx512vl-vpermt2pd-1.c: Likewise.
+ * gcc.target/i386/avx512f-vpermt2d-1.c: Likewise.
+ * gcc.target/i386/vect-pack-trunc-2.c: Likewise.
+ * gcc.target/i386/avx512vl-vpermt2ps-1.c: Likewise.
+ * gcc.target/i386/avx512vl-vpermt2q-1.c: Likewise.
+ * gcc.target/i386/avx512f-vpermt2ps-1.c: Likewise.
+ * gcc.target/i386/avx512vl-vpermt2d-1.c: Likewise.
+ * gcc.target/i386/avx512bw-vpermt2w-1.c: Likewise.
+ * gcc.target/i386/avx512vbmi-vpermt2b-1.c: Likewise.
+ * gcc.target/i386/avx512f-vpermt2q-1.c: Likewise.
+
+ PR target/82370
+ * gcc.target/i386/pr82370.c: New test.
+
+2017-10-24 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82466
+ * c-c++-common/Wbuiltin-declaration-mismatch-1.c: New.
+ * c-c++-common/Wno-builtin-declaration-mismatch-1.c: Likewise.
+ * g++.dg/warn/Wbuiltin_declaration_mismatch-1.C: Likewise.
+ * g++.dg/parse/builtin2.C: Adjust.
+ * g++.old-deja/g++.mike/p811.C: Likewise.
+
+2017-10-24 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/80991
+ * g++.dg/ext/is_trivially_constructible5.C: New.
+
+2017-10-24 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
+
+ * gcc.target/i386/387-ficom-1.c: Allow for ficomp without s
+ suffix.
+ * gcc.target/i386/387-ficom-2.c: Likewise.
+
+2017-10-24 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
+
+	* gcc.target/i386/cet-sjlj-3.c: Allow for empty user label prefix
+ in setjmp call.
+
+2017-10-24 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82697
+ * gcc.dg/torture/pr82697.c: New testcase.
+
+2017-10-24 Mukesh Kapoor <mukesh.kapoor@oracle.com>
+ Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82307
+ * g++.dg/cpp0x/enum35.C: New.
+ * g++.dg/cpp0x/enum36.C: Likewise.
+
+2017-10-24 H.J. Lu <hongjiu.lu@intel.com>
+
+ PR target/82659
+ * gcc.target/i386/cet-label-2.c: New test.
+ * gcc.target/i386/cet-sjlj-4.c: Likewise.
+ * gcc.target/i386/cet-sjlj-5.c: Likewise.
+ * gcc.target/i386/cet-switch-3.c: Likewise.
+ * gcc.target/i386/pr82659-1.c: Likewise.
+ * gcc.target/i386/pr82659-2.c: Likewise.
+ * gcc.target/i386/pr82659-3.c: Likewise.
+ * gcc.target/i386/pr82659-4.c: Likewise.
+ * gcc.target/i386/pr82659-5.c: Likewise.
+ * gcc.target/i386/pr82659-6.c: Likewise.
+
+2017-10-23 Sandra Loosemore <sandra@codesourcery.com>
+
+ * gcc.target/nios2/cdx-branch.c: Fix broken test.
+ * gcc.target/nios2/lo-addr-bypass.c: New.
+ * gcc.target/nios2/lo-addr-char.c: New.
+ * gcc.target/nios2/lo-addr-int.c: New.
+ * gcc.target/nios2/lo-addr-pic.c: New.
+ * gcc.target/nios2/lo-addr-short.c: New.
+ * gcc.target/nios2/lo-addr-tls.c: New.
+ * gcc.target/nios2/lo-addr-uchar.c: New.
+ * gcc.target/nios2/lo-addr-ushort.c: New.
+ * gcc.target/nios2/lo-addr-volatile.c: New.
+
+2017-10-23 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/80449
+ * g++.dg/cpp1z/class-deduction46.C: New.
+
+2017-10-23 Jakub Jelinek <jakub@redhat.com>
+
+ PR debug/82630
+ * g++.dg/guality/pr82630.C: New test.
+
+2017-10-23 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/82662
+ * gcc.target/i386/pr82662.c: New test.
+
+2017-10-23 Marek Polacek <polacek@redhat.com>
+
+ PR c/82681
+ * gcc.dg/c90-const-expr-11.c: Fix typos in dg-warning.
+ * gcc.dg/overflow-warn-5.c: Likewise.
+ * gcc.dg/overflow-warn-8.c: Likewise.
+
+2017-10-23 H.J. Lu <hongjiu.lu@intel.com>
+
+ PR target/82673
+ * gcc.target/i386/pr82673.c: New test.
+
+2017-10-23 Jakub Jelinek <jakub@redhat.com>
+
+ * lib/scanasm.exp (dg-function-on-line): Accept optional column info.
+ * gcc.dg/debug/dwarf2/pr53948.c: Likewise.
+ * g++.dg/debug/dwarf2/pr77363.C: Likewise.
+ * gcc.dg/debug/dwarf2/asm-line1.c: Add -gno-column-info to dg-options.
+ * gcc.dg/debug/dwarf2/discriminator.c: Likewise.
+ * g++.dg/debug/dwarf2/typedef6.C: Likewise.
+
+2017-10-23 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82672
+ * gfortran.dg/graphite/pr82672.f90: New testcase.
+
+2017-10-23 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/77555
+ * g++.dg/torture/pr77555.C: New.
+
+2017-10-23 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82129
+ * gcc.dg/torture/pr82129.c: New testcase.
+
+2017-10-22 Uros Bizjak <ubizjak@gmail.com>
+
+ PR target/52451
+ * gcc.dg/torture/pr52451.c: New test.
+
+2017-10-22 Uros Bizjak <ubizjak@gmail.com>
+ Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82628
+ * gcc.dg/torture/pr82628.c: New test.
+
+2017-10-22 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
+
+ * c-c++-common/attr-nocf-check-1a.c: Remove test.
+ * c-c++-common/attr-nocf-check-3a.c: Likewise.
+ * gcc.target/i386/attr-nocf-check-1a.c: Add test.
+ * gcc.target/i386/attr-nocf-check-3a.c: Likewise.
+
+2017-10-21 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
+
+	* c-c++-common/attr-nocf-check-1.c: Shorten a checking message.
+ * c-c++-common/attr-nocf-check-3.c: Likewise.
+ * c-c++-common/fcf-protection-1.c: Add x86 specific message.
+ * c-c++-common/fcf-protection-2.c: Likewise.
+ * c-c++-common/fcf-protection-3.c: Likewise.
+ * c-c++-common/fcf-protection-5.c: Likewise.
+ * c-c++-common/attr-nocf-check-1a.c: New test.
+ * c-c++-common/attr-nocf-check-3a.c: Likewise.
+ * g++.dg/cet-notrack-1.C: Likewise.
+ * gcc.target/i386/cet-intrin-1.c: Likewise.
+ * gcc.target/i386/cet-intrin-10.c: Likewise.
+ * gcc.target/i386/cet-intrin-2.c: Likewise.
+ * gcc.target/i386/cet-intrin-3.c: Likewise.
+ * gcc.target/i386/cet-intrin-4.c: Likewise.
+ * gcc.target/i386/cet-intrin-5.c: Likewise.
+ * gcc.target/i386/cet-intrin-6.c: Likewise.
+ * gcc.target/i386/cet-intrin-7.c: Likewise.
+ * gcc.target/i386/cet-intrin-8.c: Likewise.
+ * gcc.target/i386/cet-intrin-9.c: Likewise.
+ * gcc.target/i386/cet-label.c: Likewise.
+ * gcc.target/i386/cet-notrack-1a.c: Likewise.
+ * gcc.target/i386/cet-notrack-1b.c: Likewise.
+ * gcc.target/i386/cet-notrack-2a.c: Likewise.
+ * gcc.target/i386/cet-notrack-2b.c: Likewise.
+ * gcc.target/i386/cet-notrack-3.c: Likewise.
+ * gcc.target/i386/cet-notrack-4a.c: Likewise.
+ * gcc.target/i386/cet-notrack-4b.c: Likewise.
+ * gcc.target/i386/cet-notrack-5a.c: Likewise.
+ * gcc.target/i386/cet-notrack-5b.c: Likewise.
+ * gcc.target/i386/cet-notrack-6a.c: Likewise.
+ * gcc.target/i386/cet-notrack-6b.c: Likewise.
+ * gcc.target/i386/cet-notrack-7.c: Likewise.
+ * gcc.target/i386/cet-property-1.c: Likewise.
+ * gcc.target/i386/cet-property-2.c: Likewise.
+ * gcc.target/i386/cet-rdssp-1.c: Likewise.
+ * gcc.target/i386/cet-sjlj-1.c: Likewise.
+ * gcc.target/i386/cet-sjlj-2.c: Likewise.
+ * gcc.target/i386/cet-sjlj-3.c: Likewise.
+ * gcc.target/i386/cet-switch-1.c: Likewise.
+ * gcc.target/i386/cet-switch-2.c: Likewise.
+ * lib/target-supports.exp (check_effective_target_cet): New proc.
+
+2017-10-20 Jan Hubicka <hubicka@ucw.cz>
+
+ * gcc.target/i386/pr79683.c: Disable costmodel.
+
+2017-10-21 Eric Botcazou <ebotcazou@adacore.com>
+
+ * gnat.dg/specs/discr_private.ads: Rename into ...
+ * gnat.dg/specs/discr2.ads: ...this.
+ * gnat.dg/specs/discr_record_constant.ads: Rename into...
+ * gnat.dg/specs/discr3.ads: ...this.
+ * gnat.dg/specs/discr4.ads: New test.
+ * gnat.dg/specs/discr4_pkg.ads: New helper.
+
+2017-10-21 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/82586
+ * gfortran.dg/pdt_16.f03 : New test.
+ * gfortran.dg/pdt_4.f03 : Catch the changed messages.
+ * gfortran.dg/pdt_8.f03 : Ditto.
+
+ PR fortran/82587
+ * gfortran.dg/pdt_17.f03 : New test.
+
+ PR fortran/82589
+ * gfortran.dg/pdt_18.f03 : New test.
+
+2017-10-20 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
+
+ * c-c++-common/fcf-protection-1.c: New test.
+ * c-c++-common/fcf-protection-2.c: Likewise.
+ * c-c++-common/fcf-protection-3.c: Likewise.
+ * c-c++-common/fcf-protection-4.c: Likewise.
+ * c-c++-common/fcf-protection-5.c: Likewise.
+ * c-c++-common/attr-nocf-check-1.c: Likewise.
+ * c-c++-common/attr-nocf-check-2.c: Likewise.
+ * c-c++-common/attr-nocf-check-3.c: Likewise.
+
+2017-10-20 Ed Schonberg <schonberg@adacore.com>
+
+ * gnat.dg/sync_iface_call.adb, gnat.dg/sync_iface_call_pkg.ads,
+ gnat.dg/sync_iface_call_pkg2.adb, gnat.dg/sync_iface_call_pkg2.ads:
+ New testcase.
+
+2017-10-20 Justin Squirek <squirek@adacore.com>
+
+ * gnat.dg/default_pkg_actual.adb, gnat.dg/default_pkg_actual2.adb: New
+ testcases.
+
+2017-10-20 Ed Schonberg <schonberg@adacore.com>
+
+ * gnat.dg/dimensions.adb, gnat.dg/dimensions.ads: New testcase.
+
+2017-10-20 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82473
+ * gcc.dg/torture/pr82473.c: New testcase.
+
+2017-10-20 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82603
+ * gcc.dg/torture/pr82603.c: New testcase.
+
+2017-10-20 Tom de Vries <tom@codesourcery.com>
+
+ * gcc.dg/tree-ssa/ldist-27.c: Remove dg-require-stack-size.
+ (main): Move s ...
+ (s): ... here.
+
+2017-10-20 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82158
+ * gcc.dg/tree-ssa/noreturn-1.c: New test.
+
+ PR target/82370
+ * gcc.target/i386/avx-pr82370.c: New test.
+ * gcc.target/i386/avx2-pr82370.c: New test.
+ * gcc.target/i386/avx512f-pr82370.c: New test.
+ * gcc.target/i386/avx512bw-pr82370.c: New test.
+ * gcc.target/i386/avx512vl-pr82370.c: New test.
+ * gcc.target/i386/avx512vlbw-pr82370.c: New test.
+
+2017-10-20 Orlando Arias <oarias@knights.ucf.edu>
+
+ * lib/target-supports.exp (check_effective_target_keeps_null_pointer_checks):
+ Add msp430 to the list.
+
+2017-10-19 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82308
+ * g++.dg/cpp1z/class-deduction45.C: New.
+
+2017-10-19 Uros Bizjak <ubizjak@gmail.com>
+ Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82618
+ * gcc.target/i386/pr82618.c: New test.
+
+2017-10-19 Martin Sebor <msebor@redhat.com>
+
+ PR tree-optimization/82596
+	* gcc.dg/pr82596.c: New test.
+
+2017-10-19 Eric Botcazou <ebotcazou@adacore.com>
+
+ * gcc.dg/Walloca-15.c: New test.
+ * gnat.dg/stack_usage4.adb: Likewise.
+ * gnat.dg/stack_usage4_pkg.ads: New helper.
+
+2017-10-19 Jakub Jelinek <jakub@redhat.com>
+
+ PR c++/82600
+ * g++.dg/warn/Wreturn-local-addr-4.C: New test.
+
+2017-10-19 Eric Botcazou <ebotcazou@adacore.com>
+
+ * gcc.dg/debug/dwarf2/sso.c: Rename into...
+ * gcc.dg/debug/dwarf2/sso-1.c: ...this.
+ * gcc.dg/debug/dwarf2/sso-2.c: New test.
+ * gcc.dg/debug/dwarf2/sso-3.c: Likewise.
+
+2017-10-19 Richard Earnshaw <rearnsha@arm.com>
+
+ PR target/82445
+ * gcc.target/arm/peep-ldrd-1.c: Tighten test scan pattern.
+ * gcc.target/arm/peep-strd-1.c: Likewise.
+ * gcc.target/arm/peep-ldrd-2.c: New test.
+ * gcc.target/arm/peep-strd-2.c: New test.
+
+2017-10-19 Jakub Jelinek <jakub@redhat.com>
+
+ * c-c++-common/ubsan/builtin-1.c: New test.
+
+ * c-c++-common/ubsan/float-cast-overflow-1.c: Drop value keyword
+ from expected output regexps.
+ * c-c++-common/ubsan/float-cast-overflow-2.c: Likewise.
+ * c-c++-common/ubsan/float-cast-overflow-3.c: Likewise.
+ * c-c++-common/ubsan/float-cast-overflow-4.c: Likewise.
+ * c-c++-common/ubsan/float-cast-overflow-5.c: Likewise.
+ * c-c++-common/ubsan/float-cast-overflow-6.c: Likewise.
+ * c-c++-common/ubsan/float-cast-overflow-8.c: Likewise.
+ * c-c++-common/ubsan/float-cast-overflow-9.c: Likewise.
+ * c-c++-common/ubsan/float-cast-overflow-10.c: Likewise.
+ * g++.dg/ubsan/float-cast-overflow-bf.C: Likewise.
+ * gcc.dg/ubsan/float-cast-overflow-bf.c: Likewise.
+ * g++.dg/asan/default-options-1.C (__asan_default_options): Add
+ used attribute.
+ * g++.dg/asan/asan_test.C: Run with ASAN_OPTIONS=handle_segv=2
+ in the environment.
+
+ PR target/82580
+ * gcc.target/i386/pr82580.c: Use {\msbb} instead of "sbb" in
+ scan-assembler-times. Check that there are no movzb* instructions
+ if lp64.
+
+2017-10-19 Tom de Vries <tom@codesourcery.com>
+
+ * gcc.dg/tree-ssa/ldist-27.c: Use dg-require-stack-size.
+
+2017-10-19 Tom de Vries <tom@codesourcery.com>
+
+ * lib/target-supports-dg.exp (dg-require-stack-size): New proc.
+ * gcc.c-torture/execute/20030209-1.c: Use dg-require-stack-size.
+ * gcc.c-torture/execute/20040805-1.c: Same.
+ * gcc.c-torture/execute/920410-1.c: Same.
+ * gcc.c-torture/execute/921113-1.c: Same.
+ * gcc.c-torture/execute/921208-2.c: Same.
+ * gcc.c-torture/execute/comp-goto-1.c: Same.
+ * gcc.c-torture/execute/pr20621-1.c: Same.
+ * gcc.c-torture/execute/pr28982b.c: Same.
+ * gcc.dg/tree-prof/comp-goto-1.c: Same.
+
+2017-10-19 Martin Liska <mliska@suse.cz>
+
+ PR sanitizer/82517
+ * gcc.dg/asan/pr82517.c: New test.
+
+2017-10-19 Jakub Jelinek <jakub@redhat.com>
+
+ PR fortran/82568
+ * gfortran.dg/gomp/pr82568.f90: New test.
+
+2017-10-19 Bernhard Reutner-Fischer <aldot@gcc.gnu.org>
+
+ * gfortran.dg/spellcheck-operator.f90: New testcase.
+ * gfortran.dg/spellcheck-procedure_1.f90: New testcase.
+ * gfortran.dg/spellcheck-procedure_2.f90: New testcase.
+ * gfortran.dg/spellcheck-structure.f90: New testcase.
+ * gfortran.dg/spellcheck-parameter.f90: New testcase.
+
+2017-10-18 Thomas Koenig <tkoenig@gcc.gnu.org>
+
+ PR fortran/82567
+ * gfortran.dg/array_constructor_51.f90: New test.
+
+2017-10-18 Thomas Koenig <tkoenig@gcc.gnu.org>
+
+ PR fortran/79795
+ * gfortran.dg/assumed_size_2.f90: New test.
+
+2017-10-18 Uros Bizjak <ubizjak@gmail.com>
+ Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82580
+ * gcc.target/i386/pr82580.c: New test.
+
+2017-10-18 Thomas Koenig <tkoenig@gcc.gnu.org>
+
+ PR libfortran/82233
+ * gfortran.dg/execute_command_line_3.f90: Remove unneeded output.
+ Move test with wait=.false. before the last test.
+
+2017-10-18 Vladimir Makarov <vmakarov@redhat.com>
+
+ PR middle-end/82556
+ * gcc.target/i386/pr82556.c: New.
+
+2017-10-18 Bin Cheng <bin.cheng@arm.com>
+
+ * gcc.dg/tree-ssa/ldist-17.c: Adjust test string.
+ * gcc.dg/tree-ssa/ldist-32.c: New test.
+ * gcc.dg/tree-ssa/ldist-35.c: New test.
+ * gcc.dg/tree-ssa/ldist-36.c: New test.
+
+2017-10-18 Bin Cheng <bin.cheng@arm.com>
+
+ PR tree-optimization/82574
+ * gcc.dg/tree-ssa/pr82574.c: New test.
+
+2017-10-18 Martin Liska <mliska@suse.cz>
+
+ * gcc.dg/tree-prof/switch-case-2.c: Scan IPA profile dump
+ file instead of expand. Reason is that switch statement is
+ not yet expanded as decision tree, which also contains a BB
+ with count == 2000.
+
+2017-10-18  Paul Thomas  <pault@gcc.gnu.org>
+
+ PR fortran/82550
+ * gfortran.dg/submodule_30.f08 : New test.
+
+2017-10-18 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
+
+ * gcc.target/s390/zvector/vec-cmp-2.c
+ (all_eq_double, all_ne_double, all_gt_double)
+ (all_lt_double, all_ge_double, all_le_double)
+ (any_eq_double, any_ne_double, any_gt_double)
+ (any_lt_double, any_ge_double, any_le_double)
+ (all_eq_int, all_ne_int, all_gt_int)
+ (all_lt_int, all_ge_int, all_le_int)
+ (any_eq_int, any_ne_int, any_gt_int)
+ (any_lt_int, any_ge_int, any_le_int): Set global variable instead
+ of calling foo(). Fix return type.
+
+2017-10-18 Martin Liska <mliska@suse.cz>
+
+ PR sanitizer/82545
+ * gcc.dg/asan/pr82545.c: New test.
+
+2017-10-18 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/69057
+ * g++.dg/cpp1y/auto-fn45.C: New.
+
+2017-10-18 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/68884
+ * g++.dg/cpp0x/variadic-crash4.C: New.
+
+2017-10-18 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/79474
+ * g++.dg/cpp1y/auto-fn44.C: New.
+
+2017-10-17 Eric Botcazou <ebotcazou@adacore.com>
+
+ * gcc.dg/attr-alloc_size-11.c: UnXFAIL for visium-*-*.
+
+2017-10-17 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/71821
+ * g++.dg/cpp0x/alignas12.C: New.
+
+2017-10-17 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/71368
+ * g++.dg/concepts/pr71368.C: New.
+
+2017-10-17 Nathan Sidwell <nathan@acm.org>
+
+ PR c++/82560
+ * g++.dg/cpp0x/pr82560.C: New.
+
+ PR middle-end/82577
+ * g++.dg/opt/pr82577.C: New.
+
+2017-10-17 Qing Zhao <qing.zhao@oracle.com>
+ Wilco Dijkstra <wilco.dijkstra@arm.com>
+
+ PR middle-end/80295
+ * gcc.target/aarch64/pr80295.c: New test.
+
+2017-10-17 Richard Biener <rguenther@suse.de>
+
+ PR tree-optimization/82563
+ * gcc.dg/graphite/pr82563.c: New testcase.
+
+2017-10-17 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/67831
+ * g++.dg/cpp0x/constexpr-ice18.C: New.
+
+2017-10-17 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/82570
+ * g++.dg/cpp1z/constexpr-lambda18.C: New.
+
+2017-10-17 Jakub Jelinek <jakub@redhat.com>
+
+ PR tree-optimization/82549
+ * gcc.c-torture/compile/pr82549.c: New test.
+
+2017-10-17 Martin Liska <mliska@suse.cz>
+
+ * lib/scanasm.exp: Print how many times a regex pattern is
+ found.
+ * lib/scandump.exp: Likewise.
+
+2017-10-17 Olga Makhotina <olga.makhotina@intel.com>
+
+ * gcc.target/i386/avx512dq-vreducesd-1.c (_mm_mask_reduce_sd,
+ _mm_maskz_reduce_sd): Test new intrinsics.
+ * gcc.target/i386/avx512dq-vreducesd-2.c: New.
+ * gcc.target/i386/avx512dq-vreducess-1.c (_mm_mask_reduce_ss,
+ _mm_maskz_reduce_ss): Test new intrinsics.
+ * gcc.target/i386/avx512dq-vreducess-2.c: New.
+ * gcc.target/i386/avx-1.c (__builtin_ia32_reducesd,
+ __builtin_ia32_reducess): Remove builtin.
+ (__builtin_ia32_reducesd_mask,
+ __builtin_ia32_reducess_mask): Test new builtin.
+ * gcc.target/i386/sse-13.c: Ditto.
+ * gcc.target/i386/sse-23.c: Ditto.
+
+2017-10-16 Martin Liska <mliska@suse.cz>
+
+ * c-c++-common/ubsan/attrib-5.c (float_cast2): Fix warning scan
+ so that it will work for both C and C++ FEs.
+
+2017-10-16 Fritz Reese <fritzoreese@gmail.com>
+
+ PR fortran/82511
+ * gfortran.dg/dec_structure_22.f90: New testcase.
+
+2017-10-16 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/64931
+ * g++.dg/cpp1y/auto-fn43.C: New.
+
+2017-10-16 Wilco Dijkstra <wdijkstr@arm.com>
+
+ PR target/82442
+ * gcc.dg/vect/pr31699.c: Fix testcase.
+
+2017-10-16 Tamar Christina <tamar.christina@arm.com>
+
+ * gcc.target/aarch64/advsimd-intrinsics/vect-dot-qi.h: New.
+ * gcc.target/aarch64/advsimd-intrinsics/vdot-compile.c: New.
+ * gcc.target/aarch64/advsimd-intrinsics/vect-dot-s8.c: New.
+ * gcc.target/aarch64/advsimd-intrinsics/vect-dot-u8.c: New.
+
+2017-10-16 Jakub Jelinek <jakub@redhat.com>
+
+ PR c++/53574
+ * g++.dg/other/pr53574.C: New test.
+
+2017-10-16 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/61323
+ * g++.dg/cpp0x/constexpr-61323.C: New.
+
+2017-10-15 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/54090
+ * g++.dg/template/crash128.C: New.
+
+2017-10-15 Thomas Koenig <tkoenig@gcc.gnu.org>
+
+ PR fortran/82372
+ * gfortran.dg/illegal_char.f90: New test.
+
+2017-10-14 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
+ Michael Collison <michael.collison@arm.com>
+
+ * gcc.target/aarch64/cmpelim_mult_uses_1.c: New test.
+
+2017-10-14 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/80908
+ * g++.dg/cpp1z/noexcept-type18.C: New.
+
+2017-10-14 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/81016
+ * g++.dg/cpp1z/pr81016.C: New.
+
+2017-10-14 Jakub Jelinek <jakub@redhat.com>
+
+ PR middle-end/62263
+ PR middle-end/82498
+ * c-c++-common/rotate-8.c: Expect no PHIs in optimized dump.
+
+ PR middle-end/62263
+ PR middle-end/82498
+ * c-c++-common/rotate-5.c (f2): New function. Move old
+ function to ...
+ (f4): ... this. Use 127 instead of 128.
+ (f3, f5, f6): New functions.
+ (main): Test all f[1-6] functions, with both 0 and 1 as
+ second arguments.
+ * c-c++-common/rotate-6.c: New test.
+ * c-c++-common/rotate-6a.c: New test.
+ * c-c++-common/rotate-7.c: New test.
+ * c-c++-common/rotate-7a.c: New test.
+ * c-c++-common/rotate-8.c: New test.
+
+2017-10-14 Hristian Kirtchev <kirtchev@adacore.com>
+
+ * gnat.dg/remote_call_iface.ads, gnat.dg/remote_call_iface.adb: New
+ testcase.
+
+2017-10-14 Jakub Jelinek <jakub@redhat.com>
+
+ PR rtl-optimization/81423
+ * gcc.c-torture/execute/pr81423.c (foo): Add missing cast. Change L
+ suffixes to LL.
+ (main): Punt if either long long isn't 64-bit or int isn't 32-bit.
+
+2017-10-13 Jakub Jelinek <jakub@redhat.com>
+
+ PR sanitizer/82353
+ * g++.dg/ubsan/pr82353-2.C: New test.
+ * g++.dg/ubsan/pr82353-2-aux.cc: New file.
+ * g++.dg/ubsan/pr82353-2.h: New file.
+
+2017-10-13 Paul Thomas <pault@gcc.gnu.org>
+
+ PR fortran/81048
+ * gfortran.dg/derived_init_4.f90 : New test.
+
+2017-10-13 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/69078
+ * g++.dg/cpp1y/lambda-generic-69078-1.C: New.
+ * g++.dg/cpp1y/lambda-generic-69078-2.C: Likewise.
+
+2017-10-13 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/82274
+ * gcc.dg/pr82274-1.c: New test.
+ * gcc.dg/pr82274-2.c: New test.
+
+2017-10-13 Paolo Carlini <paolo.carlini@oracle.com>
+
+ PR c++/80873
+ * g++.dg/cpp1y/auto-fn41.C: New.
+ * g++.dg/cpp1y/auto-fn42.C: Likewise.
+
+2017-10-13 David Malcolm <dmalcolm@redhat.com>
+
+ * g++.dg/cpp0x/udlit-extern-c.C: New test case.
+ * g++.dg/diagnostic/unclosed-extern-c.C: Add example of a template
+ erroneously covered by an unclosed extern "C".
+ * g++.dg/template/extern-c.C: New test case.
+
+2017-10-13 Richard Biener <rguenther@suse.de>
+
+ * gcc.dg/graphite/pr35356-3.c: XFAIL again.
+ * gcc.dg/graphite/pr81373-2.c: Copy from gcc.dg/graphite/pr81373.c
+ with alternate flags.
+
+2017-10-13 Richard Biener <rguenther@suse.de>
+
+ * gcc.dg/graphite/scop-10.c: Enlarge array to avoid undefined
+ behavior.
+ * gcc.dg/graphite/scop-7.c: Likewise.
+ * gcc.dg/graphite/scop-8.c: Likewise.
+
+2017-10-13 H.J. Lu <hongjiu.lu@intel.com>
+
+ PR target/82499
+ * gcc.target/i386/pr82499-1.c: New file.
+ * gcc.target/i386/pr82499-2.c: Likewise.
+ * gcc.target/i386/pr82499-3.c: Likewise.
+
2017-10-13 Jakub Jelinek <jakub@redhat.com>
PR target/82524
@@ -11631,6 +12929,11 @@
PR lto/79587
* gcc.dg/tree-prof/pr79587.c: New test.
+2017-02-22 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
+
+ PR tree-optimization/68644
+ * gcc.dg/tree-ssa/ivopts-lt-2.c: Skip for powerpc*-*-*.
+
2017-02-21 Marek Polacek <polacek@redhat.com>
PR c++/79535
diff --git a/gcc/testsuite/c-c++-common/Wbuiltin-declaration-mismatch-1.c b/gcc/testsuite/c-c++-common/Wbuiltin-declaration-mismatch-1.c
new file mode 100644
index 00000000000..63343b8bfee
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wbuiltin-declaration-mismatch-1.c
@@ -0,0 +1,4 @@
+/* PR c++/82466 */
+/* { dg-options "-Wbuiltin-declaration-mismatch" } */
+
+int printf; /* { dg-warning "declared as non-function" } */
diff --git a/gcc/testsuite/c-c++-common/Wno-builtin-declaration-mismatch-1.c b/gcc/testsuite/c-c++-common/Wno-builtin-declaration-mismatch-1.c
new file mode 100644
index 00000000000..6409412ac6a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wno-builtin-declaration-mismatch-1.c
@@ -0,0 +1,4 @@
+/* PR c++/82466 */
+/* { dg-options "-Wno-builtin-declaration-mismatch" } */
+
+int printf;
diff --git a/gcc/testsuite/c-c++-common/attr-nocf-check-1.c b/gcc/testsuite/c-c++-common/attr-nocf-check-1.c
new file mode 100644
index 00000000000..15f69731b91
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-nocf-check-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+
+int func (int) __attribute__ ((nocf_check)); /* { dg-warning "'nocf_check' attribute ignored" } */
+int (*fptr) (int) __attribute__ ((nocf_check)); /* { dg-warning "'nocf_check' attribute ignored" } */
+typedef void (*nocf_check_t) (void) __attribute__ ((nocf_check)); /* { dg-warning "'nocf_check' attribute ignored" } */
+
+int
+foo1 (int arg)
+{
+ return func (arg) + fptr (arg);
+}
+
+void
+foo2 (void (*foo) (void))
+{
+ void (*func) (void) __attribute__((nocf_check)) = foo; /* { dg-warning "'nocf_check' attribute ignored" } */
+ func ();
+}
+
+void
+foo3 (nocf_check_t foo)
+{
+ foo ();
+}
+
+void
+foo4 (void (*foo) (void) __attribute__((nocf_check))) /* { dg-warning "'nocf_check' attribute ignored" } */
+{
+ foo ();
+}
diff --git a/gcc/testsuite/c-c++-common/attr-nocf-check-2.c b/gcc/testsuite/c-c++-common/attr-nocf-check-2.c
new file mode 100644
index 00000000000..9ab01804782
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-nocf-check-2.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+
+int var1 __attribute__((nocf_check)); /* { dg-warning "'nocf_check' attribute only applies to function types" } */
+int *var2 __attribute__((nocf_check)); /* { dg-warning "'nocf_check' attribute only applies to function types" } */
+void (**var3) (void) __attribute__((nocf_check)); /* { dg-warning "'nocf_check' attribute only applies to function types" } */
diff --git a/gcc/testsuite/c-c++-common/attr-nocf-check-3.c b/gcc/testsuite/c-c++-common/attr-nocf-check-3.c
new file mode 100644
index 00000000000..ad1ca7eec9b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-nocf-check-3.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+
+int foo (void) __attribute__ ((nocf_check)); /* { dg-warning "'nocf_check' attribute ignored" } */
+void (*foo1) (void) __attribute__((nocf_check)); /* { dg-warning "'nocf_check' attribute ignored" } */
+void (*foo2) (void);
+
+int
+foo (void) /* The function's address is not tracked. */
+{
+ /* This call site is not tracked for
+ control-flow instrumentation. */
+ (*foo1)();
+
+ foo1 = foo2;
+ /* This call site is still not tracked for
+ control-flow instrumentation. */
+ (*foo1)();
+
+ /* This call site is tracked for
+ control-flow instrumentation. */
+ (*foo2)();
+
+ foo2 = foo1;
+ /* This call site is still tracked for
+ control-flow instrumentation. */
+ (*foo2)();
+
+ return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/fcf-protection-1.c b/gcc/testsuite/c-c++-common/fcf-protection-1.c
new file mode 100644
index 00000000000..2e9337c3051
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/fcf-protection-1.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options "-fcf-protection=full" } */
+/* { dg-error "'-fcf-protection=full' requires CET support on this target" "" { target { "i?86-*-* x86_64-*-*" } } 0 } */
+/* { dg-error "'-fcf-protection=full' is not supported for this target" "" { target { ! "i?86-*-* x86_64-*-*" } } 0 } */
diff --git a/gcc/testsuite/c-c++-common/fcf-protection-2.c b/gcc/testsuite/c-c++-common/fcf-protection-2.c
new file mode 100644
index 00000000000..aa0d2a04645
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/fcf-protection-2.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options "-fcf-protection=branch" } */
+/* { dg-error "'-fcf-protection=branch' requires CET support on this target" "" { target { "i?86-*-* x86_64-*-*" } } 0 } */
+/* { dg-error "'-fcf-protection=branch' is not supported for this target" "" { target { ! "i?86-*-* x86_64-*-*" } } 0 } */
diff --git a/gcc/testsuite/c-c++-common/fcf-protection-3.c b/gcc/testsuite/c-c++-common/fcf-protection-3.c
new file mode 100644
index 00000000000..028775adc35
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/fcf-protection-3.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options "-fcf-protection=return" } */
+/* { dg-error "'-fcf-protection=return' requires CET support on this target" "" { target { "i?86-*-* x86_64-*-*" } } 0 } */
+/* { dg-error "'-fcf-protection=return' is not supported for this target" "" { target { ! "i?86-*-* x86_64-*-*" } } 0 } */
diff --git a/gcc/testsuite/c-c++-common/fcf-protection-4.c b/gcc/testsuite/c-c++-common/fcf-protection-4.c
new file mode 100644
index 00000000000..af4fc0b2812
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/fcf-protection-4.c
@@ -0,0 +1,2 @@
+/* { dg-do compile } */
+/* { dg-options "-fcf-protection=none" } */
diff --git a/gcc/testsuite/c-c++-common/fcf-protection-5.c b/gcc/testsuite/c-c++-common/fcf-protection-5.c
new file mode 100644
index 00000000000..a5f8e116992
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/fcf-protection-5.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options "-fcf-protection" } */
+/* { dg-error "'-fcf-protection=full' requires CET support on this target" "" { target { "i?86-*-* x86_64-*-*" } } 0 } */
+/* { dg-error "'-fcf-protection=full' is not supported for this target" "" { target { ! "i?86-*-* x86_64-*-*" } } 0 } */
diff --git a/gcc/testsuite/c-c++-common/pr44515.c b/gcc/testsuite/c-c++-common/pr44515.c
new file mode 100644
index 00000000000..dbb7750907c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pr44515.c
@@ -0,0 +1,14 @@
+/* { dg-options "-fdiagnostics-show-caret" } */
+
+void bar(void);
+void foo(void)
+{
+ bar() /* { dg-error "expected ';' before '.' token" } */
+}
+/* { dg-begin-multiline-output "" }
+ bar()
+ ^
+ ;
+ }
+ ~
+ { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/rotate-5.c b/gcc/testsuite/c-c++-common/rotate-5.c
index 35b14b86c3a..629ab2f7274 100644
--- a/gcc/testsuite/c-c++-common/rotate-5.c
+++ b/gcc/testsuite/c-c++-common/rotate-5.c
@@ -15,12 +15,40 @@ f1 (unsigned long long x, unsigned int y)
return (x << y) | (x >> ((-y) & 63));
}
+__attribute__((noinline, noclone))
+unsigned long long
+f2 (unsigned long long x, unsigned int y)
+{
+ return (x << y) + (x >> ((-y) & 63));
+}
+
+__attribute__((noinline, noclone))
+unsigned long long
+f3 (unsigned long long x, unsigned int y)
+{
+ return (x << y) ^ (x >> ((-y) & 63));
+}
+
#if __CHAR_BIT__ * __SIZEOF_INT128__ == 128
__attribute__((noinline, noclone))
unsigned __int128
-f2 (unsigned __int128 x, unsigned int y)
+f4 (unsigned __int128 x, unsigned int y)
+{
+ return (x << y) | (x >> ((-y) & 127));
+}
+
+__attribute__((noinline, noclone))
+unsigned __int128
+f5 (unsigned __int128 x, unsigned int y)
{
- return (x << y) | (x >> ((-y) & 128));
+ return (x << y) + (x >> ((-y) & 127));
+}
+
+__attribute__((noinline, noclone))
+unsigned __int128
+f6 (unsigned __int128 x, unsigned int y)
+{
+ return (x << y) ^ (x >> ((-y) & 127));
}
#endif
#endif
@@ -31,12 +59,45 @@ main ()
#if __CHAR_BIT__ * __SIZEOF_LONG_LONG__ == 64
if (f1 (0x123456789abcdef0ULL, 0) != 0x123456789abcdef0ULL)
abort ();
+ if (f2 (0x123456789abcdef0ULL, 0) != 0x2468acf13579bde0ULL)
+ abort ();
+ if (f3 (0x123456789abcdef0ULL, 0) != 0)
+ abort ();
+ if (f1 (0x123456789abcdef0ULL, 1) != 0x2468acf13579bde0ULL)
+ abort ();
+ if (f2 (0x123456789abcdef0ULL, 1) != 0x2468acf13579bde0ULL)
+ abort ();
+ if (f3 (0x123456789abcdef0ULL, 1) != 0x2468acf13579bde0ULL)
+ abort ();
#if __CHAR_BIT__ * __SIZEOF_INT128__ == 128
- if (f2 ((((unsigned __int128) 0x123456789abcdef0ULL) << 64)
+ if (f4 ((((unsigned __int128) 0x123456789abcdef0ULL) << 64)
| 0x0fedcba987654321ULL, 0)
!= ((((unsigned __int128) 0x123456789abcdef0ULL) << 64)
| 0x0fedcba987654321ULL))
abort ();
+ if (f5 ((((unsigned __int128) 0x123456789abcdef0ULL) << 64)
+ | 0x0fedcba987654321ULL, 0)
+ != ((((unsigned __int128) 0x2468acf13579bde0ULL) << 64)
+ | 0x1fdb97530eca8642ULL))
+ abort ();
+ if (f6 ((((unsigned __int128) 0x123456789abcdef0ULL) << 64)
+ | 0x0fedcba987654321ULL, 0) != 0)
+ abort ();
+ if (f4 ((((unsigned __int128) 0x123456789abcdef0ULL) << 64)
+ | 0x0fedcba987654321ULL, 1)
+ != ((((unsigned __int128) 0x2468acf13579bde0ULL) << 64)
+ | 0x1fdb97530eca8642ULL))
+ abort ();
+ if (f5 ((((unsigned __int128) 0x123456789abcdef0ULL) << 64)
+ | 0x0fedcba987654321ULL, 1)
+ != ((((unsigned __int128) 0x2468acf13579bde0ULL) << 64)
+ | 0x1fdb97530eca8642ULL))
+ abort ();
+ if (f6 ((((unsigned __int128) 0x123456789abcdef0ULL) << 64)
+ | 0x0fedcba987654321ULL, 1)
+ != ((((unsigned __int128) 0x2468acf13579bde0ULL) << 64)
+ | 0x1fdb97530eca8642ULL))
+ abort ();
#endif
#endif
return 0;
diff --git a/gcc/testsuite/c-c++-common/rotate-6.c b/gcc/testsuite/c-c++-common/rotate-6.c
new file mode 100644
index 00000000000..715f8a48c93
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/rotate-6.c
@@ -0,0 +1,582 @@
+/* Check rotate pattern detection. */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-ipa-icf -fdump-tree-optimized" } */
+/* Rotates should be recognized only in functions with | instead of + or ^,
+ or in functions that have constant shift counts (unused attribute on y). */
+/* { dg-final { scan-tree-dump-times "r\[<>]\[<>]" 48 "optimized" } } */
+
+unsigned int
+f1 (unsigned int x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f2 (unsigned int x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f3 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) | (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f4 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x >> 1);
+}
+
+unsigned short int
+f5 (unsigned short int x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f6 (unsigned short int x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f7 (unsigned char x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f8 (unsigned char x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f9 (unsigned int x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f10 (unsigned int x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f11 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) | (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f12 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x >> 1);
+}
+
+unsigned short int
+f13 (unsigned short int x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f14 (unsigned short int x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f15 (unsigned char x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f16 (unsigned char x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned int
+f17 (unsigned int x, unsigned int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f18 (unsigned int x, unsigned long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f19 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x << 1);
+}
+
+unsigned int
+f20 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> 1) ^ (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned short int
+f21 (unsigned short int x, unsigned int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) ^ (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f22 (unsigned short int x, unsigned long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) ^ (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f23 (unsigned char x, unsigned int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ - 1))) ^ (x << (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f24 (unsigned char x, unsigned long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ - 1))) ^ (x << (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f25 (unsigned int x, unsigned int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f26 (unsigned int x, unsigned long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f27 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x << 1);
+}
+
+unsigned int
+f28 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> 1) ^ (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned short int
+f29 (unsigned short int x, unsigned int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f30 (unsigned short int x, unsigned long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f31 (unsigned char x, unsigned int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f32 (unsigned char x, unsigned long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned int
+f33 (unsigned int x, unsigned int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f34 (unsigned int x, unsigned long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f35 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> 1) | (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f36 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x << 1);
+}
+
+unsigned short int
+f37 (unsigned short int x, unsigned int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f38 (unsigned short int x, unsigned long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f39 (unsigned char x, unsigned int y)
+{
+ return (x >> (y & (__CHAR_BIT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f40 (unsigned char x, unsigned long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f41 (unsigned int x, unsigned int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f42 (unsigned int x, unsigned long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f43 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> 1) | (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f44 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x << 1);
+}
+
+unsigned short int
+f45 (unsigned short int x, unsigned int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f46 (unsigned short int x, unsigned long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f47 (unsigned char x, unsigned int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f48 (unsigned char x, unsigned long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned int
+f49 (unsigned int x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f50 (unsigned int x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f51 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x >> 1);
+}
+
+unsigned int
+f52 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) ^ (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned short int
+f53 (unsigned short int x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f54 (unsigned short int x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f55 (unsigned char x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f56 (unsigned char x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f57 (unsigned int x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f58 (unsigned int x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f59 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x >> 1);
+}
+
+unsigned int
+f60 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) ^ (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned short int
+f61 (unsigned short int x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f62 (unsigned short int x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f63 (unsigned char x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f64 (unsigned char x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned int
+f65 (unsigned int x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f66 (unsigned int x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f67 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) + (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f68 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> 1);
+}
+
+unsigned short int
+f69 (unsigned short int x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f70 (unsigned short int x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f71 (unsigned char x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f72 (unsigned char x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f73 (unsigned int x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f74 (unsigned int x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f75 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) + (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f76 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> 1);
+}
+
+unsigned short int
+f77 (unsigned short int x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f78 (unsigned short int x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f79 (unsigned char x, unsigned int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f80 (unsigned char x, unsigned long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned int
+f81 (unsigned int x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f82 (unsigned int x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f83 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> 1);
+}
+
+unsigned int
+f84 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) + (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned short int
+f85 (unsigned short int x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) + (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f86 (unsigned short int x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) + (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f87 (unsigned char x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ - 1))) + (x >> (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f88 (unsigned char x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ - 1))) + (x >> (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f89 (unsigned int x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f90 (unsigned int x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f91 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> 1);
+}
+
+unsigned int
+f92 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) + (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned short int
+f93 (unsigned short int x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f94 (unsigned short int x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f95 (unsigned char x, unsigned int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f96 (unsigned char x, unsigned long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
diff --git a/gcc/testsuite/c-c++-common/rotate-6a.c b/gcc/testsuite/c-c++-common/rotate-6a.c
new file mode 100644
index 00000000000..06ba56a5dde
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/rotate-6a.c
@@ -0,0 +1,6 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -Wno-overflow" } */
+
+#define ROTATE_N "rotate-6.c"
+
+#include "rotate-1a.c"
diff --git a/gcc/testsuite/c-c++-common/rotate-7.c b/gcc/testsuite/c-c++-common/rotate-7.c
new file mode 100644
index 00000000000..390cef680d9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/rotate-7.c
@@ -0,0 +1,582 @@
+/* Check rotate pattern detection. */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-ipa-icf -fdump-tree-optimized" } */
+/* Rotates should be recognized only in functions with | instead of + or ^,
+ or in functions that have constant shift counts (unused attribute on y). */
+/* { dg-final { scan-tree-dump-times "r\[<>]\[<>]" 48 "optimized" } } */
+
+unsigned int
+f1 (unsigned int x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f2 (unsigned int x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f3 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) | (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f4 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x >> 1);
+}
+
+unsigned short int
+f5 (unsigned short int x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f6 (unsigned short int x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f7 (unsigned char x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f8 (unsigned char x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f9 (unsigned int x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f10 (unsigned int x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f11 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) | (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f12 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x >> 1);
+}
+
+unsigned short int
+f13 (unsigned short int x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f14 (unsigned short int x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f15 (unsigned char x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f16 (unsigned char x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned int
+f17 (unsigned int x, int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f18 (unsigned int x, long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f19 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x << 1);
+}
+
+unsigned int
+f20 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> 1) ^ (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned short int
+f21 (unsigned short int x, int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) ^ (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f22 (unsigned short int x, long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) ^ (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f23 (unsigned char x, int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ - 1))) ^ (x << (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f24 (unsigned char x, long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ - 1))) ^ (x << (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f25 (unsigned int x, int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f26 (unsigned int x, long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f27 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x << 1);
+}
+
+unsigned int
+f28 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> 1) ^ (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned short int
+f29 (unsigned short int x, int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f30 (unsigned short int x, long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f31 (unsigned char x, int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f32 (unsigned char x, long int y)
+{
+ return (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) ^ (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned int
+f33 (unsigned int x, int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f34 (unsigned int x, long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f35 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> 1) | (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f36 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x << 1);
+}
+
+unsigned short int
+f37 (unsigned short int x, int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f38 (unsigned short int x, long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f39 (unsigned char x, int y)
+{
+ return (x >> (y & (__CHAR_BIT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f40 (unsigned char x, long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ - 1))) | (x << ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f41 (unsigned int x, int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f42 (unsigned int x, long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f43 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> 1) | (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f44 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) | (x << 1);
+}
+
+unsigned short int
+f45 (unsigned short int x, int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f46 (unsigned short int x, long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f47 (unsigned char x, int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f48 (unsigned char x, long int y)
+{
+ return (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned int
+f49 (unsigned int x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f50 (unsigned int x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f51 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) ^ (x >> 1);
+}
+
+unsigned int
+f52 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) ^ (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned short int
+f53 (unsigned short int x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f54 (unsigned short int x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f55 (unsigned char x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f56 (unsigned char x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ - 1))) ^ (x >> (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f57 (unsigned int x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f58 (unsigned int x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f59 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) ^ (x >> 1);
+}
+
+unsigned int
+f60 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) ^ (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned short int
+f61 (unsigned short int x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f62 (unsigned short int x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f63 (unsigned char x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f64 (unsigned char x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) ^ (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned int
+f65 (unsigned int x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f66 (unsigned int x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f67 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) + (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f68 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> 1);
+}
+
+unsigned short int
+f69 (unsigned short int x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f70 (unsigned short int x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f71 (unsigned char x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f72 (unsigned char x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ - 1))) + (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f73 (unsigned int x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f74 (unsigned int x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f75 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) + (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f76 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> 1);
+}
+
+unsigned short int
+f77 (unsigned short int x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f78 (unsigned short int x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f79 (unsigned char x, int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f80 (unsigned char x, long int y)
+{
+ return (x << (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) + (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned int
+f81 (unsigned int x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f82 (unsigned int x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f83 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) + (x >> 1);
+}
+
+unsigned int
+f84 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) + (x >> ((-1) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned short int
+f85 (unsigned short int x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) + (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f86 (unsigned short int x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1))) + (x >> (y & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f87 (unsigned char x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ - 1))) + (x >> (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f88 (unsigned char x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ - 1))) + (x >> (y & (__CHAR_BIT__ - 1)));
+}
+
+unsigned int
+f89 (unsigned int x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f90 (unsigned int x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned int
+f91 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1))) + (x >> 1);
+}
+
+unsigned int
+f92 (unsigned int x, int y __attribute__((unused)))
+{
+ return (x << 1) + (x >> ((-1) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
+}
+
+unsigned short int
+f93 (unsigned short int x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f94 (unsigned short int x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f95 (unsigned char x, int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f96 (unsigned char x, long int y)
+{
+ return (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1))) + (x >> (y & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
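The f25..f96 variants above all use the same masked-shift pattern, combined with |, ^ or +. A minimal standalone sketch of that idiom, illustrative only (the rotl helper below is not part of the patch):

unsigned int
rotl (unsigned int x, unsigned int y)
{
  /* Masking both shift counts with (bit-width - 1) keeps them in range even
     when y is a multiple of the width, so the expression is fully defined
     and has the shape GCC folds into a single rotate.  */
  return (x << (y & (__CHAR_BIT__ * sizeof (unsigned int) - 1)))
         | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned int) - 1)));
}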
diff --git a/gcc/testsuite/c-c++-common/rotate-7a.c b/gcc/testsuite/c-c++-common/rotate-7a.c
new file mode 100644
index 00000000000..4fb08465403
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/rotate-7a.c
@@ -0,0 +1,6 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -Wno-overflow" } */
+
+#define ROTATE_N "rotate-7.c"
+
+#include "rotate-1a.c"
diff --git a/gcc/testsuite/c-c++-common/rotate-8.c b/gcc/testsuite/c-c++-common/rotate-8.c
new file mode 100644
index 00000000000..9ba3e940930
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/rotate-8.c
@@ -0,0 +1,171 @@
+/* PR middle-end/62263 */
+/* PR middle-end/82498 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-ipa-icf -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times "r\[<>]\[<>]" 23 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "PHI <" "optimized" } } */
+
+unsigned int
+f1 (unsigned int x, unsigned char y)
+{
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ return (x << y) | (x >> (__CHAR_BIT__ * __SIZEOF_INT__ - y));
+}
+
+unsigned int
+f2 (unsigned int x, signed char y)
+{
+ y &= __CHAR_BIT__ * __SIZEOF_INT__ - 1;
+ return (x << y) | (x >> (__CHAR_BIT__ * __SIZEOF_INT__ - y));
+}
+
+unsigned int
+f3 (unsigned int x, unsigned char y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x >> (__CHAR_BIT__ * __SIZEOF_INT__ - (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))));
+}
+
+unsigned int
+f4 (unsigned int x, unsigned char y)
+{
+ y = y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1);
+ return y ? (x << y) | (x >> (__CHAR_BIT__ * __SIZEOF_INT__ - y)) : x;
+}
+
+unsigned int
+f5 (unsigned int x, unsigned char y)
+{
+ y = y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1);
+ return (x << y) | (x >> ((__CHAR_BIT__ * __SIZEOF_INT__ - y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f6 (unsigned int x, unsigned char y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x >> ((__CHAR_BIT__ * __SIZEOF_INT__ - (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f7 (unsigned int x, unsigned char y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x >> ((__CHAR_BIT__ * __SIZEOF_INT__ - y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f8 (unsigned int x, unsigned char y)
+{
+ return (x << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f9 (unsigned int x, int y)
+{
+ return (0x12345678U << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (0x12345678U >> (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f10 (unsigned int x, int y)
+{
+ return (0x12345678U >> (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (0x12345678U << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f11 (unsigned int x, int y)
+{
+ return (0x12345678U >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (0x12345678U << (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned int
+f12 (unsigned int x, int y)
+{
+ return (0x12345678U << (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (0x12345678U >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+}
+
+unsigned
+f13 (unsigned x, unsigned char y)
+{
+ if (y == 0)
+ return x;
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ return (x << y) | (x >> (__CHAR_BIT__ * __SIZEOF_INT__ - y));
+}
+
+unsigned
+f14 (unsigned x, unsigned y)
+{
+ if (y == 0)
+ return x;
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ return (x << y) | (x >> (__CHAR_BIT__ * __SIZEOF_INT__ - y));
+}
+
+unsigned
+f15 (unsigned x, unsigned short y)
+{
+ if (y == 0)
+ return x;
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ return (x << y) | (x >> (__CHAR_BIT__ * __SIZEOF_INT__ - y));
+}
+
+unsigned
+f16 (unsigned x, unsigned char y)
+{
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ if (y == 0)
+ return x;
+ return (x << y) | (x >> (__CHAR_BIT__ * __SIZEOF_INT__ - y));
+}
+
+unsigned
+f17 (unsigned x, unsigned y)
+{
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ if (y == 0)
+ return x;
+ return (x << y) | (x >> (__CHAR_BIT__ * __SIZEOF_INT__ - y));
+}
+
+unsigned
+f18 (unsigned x, unsigned short y)
+{
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ if (y == 0)
+ return x;
+ return (x << y) | (x >> (__CHAR_BIT__ * __SIZEOF_INT__ - y));
+}
+
+unsigned
+f19 (unsigned x, unsigned char y)
+{
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ return (x << y) | (x >> (((unsigned char) -y) % (__CHAR_BIT__ * __SIZEOF_INT__)));
+}
+
+unsigned
+f20 (unsigned x, unsigned int y)
+{
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ return (x << y) | (x >> (-y % (__CHAR_BIT__ * __SIZEOF_INT__)));
+}
+
+unsigned
+f21 (unsigned x, unsigned short y)
+{
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ return (x << y) | (x >> (((unsigned short) -y) % (__CHAR_BIT__ * __SIZEOF_INT__)));
+}
+
+unsigned
+f22 (unsigned x, unsigned char y)
+{
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ return (x << y) | (x >> (-y & ((__CHAR_BIT__ * __SIZEOF_INT__) - 1)));
+}
+
+unsigned
+f23 (unsigned x, unsigned short y)
+{
+ y %= __CHAR_BIT__ * __SIZEOF_INT__;
+ return (x << y) | (x >> (-y & ((__CHAR_BIT__ * __SIZEOF_INT__) - 1)));
+}
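The dg-final directives in rotate-8.c above check the folding through the optimized tree dump: 23 rotate operators (printed as r<< / r>>) and no remaining PHI nodes. A rough way to reproduce that check by hand on a single function, illustrative only (file names are hypothetical):

/* Compile and inspect the dump, e.g.:
     gcc -O2 -fdump-tree-optimized -c rot.c
     grep -E "r(<<|>>)" rot.c.*.optimized
   The dump should contain something roughly like
     _1 = x_2(D) r>> _3;
   rather than two separate shifts and a bitwise OR.  */
unsigned int
rotr (unsigned int x, int y)
{
  return (x >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)))
         | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
}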
diff --git a/gcc/testsuite/c-c++-common/ubsan/attrib-5.c b/gcc/testsuite/c-c++-common/ubsan/attrib-5.c
index fee1df1c433..209b5dd7d2b 100644
--- a/gcc/testsuite/c-c++-common/ubsan/attrib-5.c
+++ b/gcc/testsuite/c-c++-common/ubsan/attrib-5.c
@@ -3,8 +3,7 @@

__attribute__((no_sanitize("foobar")))
static void
-float_cast2 (void)
-{ /* { dg-warning "attribute directive ignored" } */
+float_cast2 (void) { /* { dg-warning "attribute directive ignored" } */
volatile double d = 300;
volatile signed char c;
c = d;
diff --git a/gcc/testsuite/c-c++-common/ubsan/builtin-1.c b/gcc/testsuite/c-c++-common/ubsan/builtin-1.c
new file mode 100644
index 00000000000..2f340e3e70f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/builtin-1.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-options "-fsanitize=undefined" } */
+
+#include <stdio.h>
+
+__attribute__((noinline, noclone)) unsigned long long
+foo (unsigned int x, unsigned long int y, unsigned long long int z, __UINTMAX_TYPE__ w)
+{
+ unsigned long long ret = 0;
+ fprintf (stderr, "FOO MARKER1\n");
+ ret += __builtin_ctz (x);
+ ret += __builtin_ctzl (y);
+ ret += __builtin_ctzll (z);
+ ret += __builtin_ctzimax (w);
+ fprintf (stderr, "FOO MARKER2\n");
+ ret += __builtin_clz (x);
+ ret += __builtin_clzl (y);
+ ret += __builtin_clzll (z);
+ ret += __builtin_clzimax (w);
+ fprintf (stderr, "FOO MARKER3\n");
+ return ret;
+}
+
+int
+main ()
+{
+ volatile __UINTMAX_TYPE__ t = 0;
+ t = foo (t, t, t, t);
+ return 0;
+}
+
+/* { dg-output "FOO MARKER1(\n|\r\n|\r)" } */
+/* { dg-output "(\[^\n\r]*runtime error: passing zero to ctz\\\(\\\), which is not a valid argument\[^\n\r]*(\n|\r\n|\r)){4}" } */
+/* { dg-output "FOO MARKER2(\n|\r\n|\r)" } */
+/* { dg-output "(\[^\n\r]*runtime error: passing zero to clz\\\(\\\), which is not a valid argument\[^\n\r]*(\n|\r\n|\r)){4}" } */
+/* { dg-output "FOO MARKER3" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c
index aae88aa3180..8139cc1723f 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c
@@ -91,115 +91,115 @@ main (void)
return 0;
}

-/* { dg-output "value -133 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -129.5 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -129 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 128 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 128.5 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 132 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 256 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 256.5 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 260 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -5 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -32773 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -32769.5 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -32769 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 32768 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 32768.5 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 32772 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 65536 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 65536.5 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 65540 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -5 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 4.29497e\\\+09 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 4.29497e\\\+09 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 4.29497e\\\+09 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -5 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -5 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -5 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'long long unsigned int'" } */
+/* { dg-output " -133 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -129.5 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -129 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 128 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 128.5 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 132 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 256 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 256.5 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 260 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -5 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -32773 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -32769.5 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -32769 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 32768 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 32768.5 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 32772 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 65536 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 65536.5 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 65540 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -5 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 4.29497e\\\+09 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 4.29497e\\\+09 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 4.29497e\\\+09 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -5 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -5 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 9.22337e\\\+18 is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.84467e\\\+19 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -5 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type 'long long unsigned int'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-10.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-10.c
index a54a838870b..a4e8ec457b5 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-10.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-10.c
@@ -9,38 +9,38 @@
#include "float-cast-overflow-8.c"
/* _Decimal32 */
-/* { dg-output "value <unknown> is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output " <unknown> is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
/* _Decimal64 */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
/* _Decimal128 */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-2.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-2.c
index b25e312b61b..426c625fc6b 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-2.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-2.c
@@ -30,44 +30,44 @@ main (void)
return 0;
}

-/* { dg-output "runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value nan is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value -?nan is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value inf is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value -inf is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value -5 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value -1.5 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value -1 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value nan is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value -?nan is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value inf is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*runtime error: value -inf is outside the range of representable values of type '__int128 unsigned'" } */
+/* { dg-output "runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 1.70141e\\\+38 is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: nan is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: -?nan is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: inf is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: -inf is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: 3.40282e\\\+38 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: -5 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: -1.5 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: -1 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: nan is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: -?nan is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: inf is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*runtime error: -inf is outside the range of representable values of type '__int128 unsigned'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-3.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-3.c
index ba82111a4df..6567ca9a444 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-3.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-3.c
@@ -26,15 +26,15 @@ main (void)
return 0;
}
-/* { dg-output "value -133* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -129.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -129 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 128 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 128.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 132 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 256 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 256.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 260 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type" } */
+/* { dg-output " -133* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -129.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -129 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 128 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 128.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 132 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 256 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 256.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 260 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-4.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-4.c
index af76e4a3343..48ad257c641 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-4.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-4.c
@@ -30,23 +30,23 @@ main (void)
return 0;
}
-/* { dg-output "value -2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 4.29497e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 4.29497e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 4.29497e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type" } */
+/* { dg-output " -2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 4.29497e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 4.29497e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 4.29497e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* nan is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -?nan is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* inf is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -inf is outside the range of representable values of type" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-5.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-5.c
index 4c2fbb4d9ea..25a94950970 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-5.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-5.c
@@ -26,15 +26,15 @@ main (void)
return 0;
}
-/* { dg-output "value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[^\n\r]* is outside the range of representable values of type" } */
+/* { dg-output " \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[^\n\r]* is outside the range of representable values of type" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-6.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-6.c
index a2b5f9a28ce..90ec26838f8 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-6.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-6.c
@@ -26,15 +26,15 @@ main (void)
return 0;
}
-/* { dg-output "value -133 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -129.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -129 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 128 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 128.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 132 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 256 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 256.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 260 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type" } */
+/* { dg-output " -133 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -129.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -129 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 128 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 128.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 132 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 256 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 256.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 260 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-8.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-8.c
index 4adb22ae3b4..4e7beeb08db 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-8.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-8.c
@@ -99,45 +99,45 @@ main ()
}
/* float */
-/* { dg-output "value -129 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
-/* { dg-output "\[^\n\r]*value (-129|-1) is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -32769 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
+/* { dg-output " -129 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
+/* { dg-output "\[^\n\r]* (-129|-1) is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -32769 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
/* No error for float and __int128 unsigned max value, as ui128_MAX is +Inf in float. */
/* double */
-/* { dg-output "\[^\n\r]*value -129 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
-/* { dg-output "\[^\n\r]*value (-129|-1) is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -32769 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
+/* { dg-output "\[^\n\r]* -129 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
+/* { dg-output "\[^\n\r]* (-129|-1) is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -32769 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
/* long double */
-/* { dg-output "\[^\n\r]*value -129 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
-/* { dg-output "\[^\n\r]*value (-129|-1) is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -32769 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
+/* { dg-output "\[^\n\r]* -129 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
+/* { dg-output "\[^\n\r]* (-129|-1) is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -32769 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" { target { ilp32 || lp64 } } } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" { target { int128 } } } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-9.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-9.c
index f2d71f6a533..ca9b425d23e 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-9.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-9.c
@@ -6,30 +6,30 @@
#include "float-cast-overflow-8.c"
/* __float80 */
-/* { dg-output "value -129 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value (-129|-1) is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -32769 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value \[0-9.e+-]* is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" { target int128 } } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" { target int128 } } */
+/* { dg-output " -129 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* (-129|-1) is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -32769 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* \[0-9.e+-]* is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" { target int128 } } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" { target int128 } } */
/* __float128 */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" { target int128 } } */
-/* { dg-output "\[^\n\r]*value <unknown> is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" { target int128 } } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type '__int128'\[^\n\r]*(\n|\r\n|\r)" { target int128 } } */
+/* { dg-output "\[^\n\r]* <unknown> is outside the range of representable values of type '__int128 unsigned'\[^\n\r]*(\n|\r\n|\r)" { target int128 } } */
diff --git a/gcc/testsuite/g++.dg/asan/asan_test.C b/gcc/testsuite/g++.dg/asan/asan_test.C
index 410e4ce72d4..f3f7626ef3b 100644
--- a/gcc/testsuite/g++.dg/asan/asan_test.C
+++ b/gcc/testsuite/g++.dg/asan/asan_test.C
@@ -8,6 +8,7 @@
// { dg-additional-options "-DASAN_AVOID_EXPENSIVE_TESTS=1" { target { ! run_expensive_tests } } }
// { dg-additional-options "-msse2" { target { i?86-*-linux* x86_64-*-linux* } } }
// { dg-additional-options "-D__NO_INLINE__" { target { *-*-linux-gnu } } }
+// { dg-set-target-env-var ASAN_OPTIONS "handle_segv=2" }
// { dg-final { asan-gtest } }
#include "asan_test.cc"
diff --git a/gcc/testsuite/g++.dg/asan/default-options-1.C b/gcc/testsuite/g++.dg/asan/default-options-1.C
index dc818917ddc..98abdfbd3ff 100644
--- a/gcc/testsuite/g++.dg/asan/default-options-1.C
+++ b/gcc/testsuite/g++.dg/asan/default-options-1.C
@@ -3,7 +3,7 @@
const char *kAsanDefaultOptions="verbosity=1 foo=bar";
extern "C"
-__attribute__((no_sanitize_address))
+__attribute__((no_sanitize_address, used))
const char *__asan_default_options() {
return kAsanDefaultOptions;
}
diff --git a/gcc/testsuite/g++.dg/cet-notrack-1.C b/gcc/testsuite/g++.dg/cet-notrack-1.C
new file mode 100644
index 00000000000..43dbbd6a7f3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cet-notrack-1.C
@@ -0,0 +1,25 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-fcf-protection -mcet" } */
+/* { dg-final { scan-assembler "endbr32|endbr64" } } */
+/* { dg-final { scan-assembler-times "\tcall\[ \t]+puts" 2 } } */
+/* { dg-final { scan-assembler-times "notrack call\[ \t]+" 1 } } */
+#include <stdio.h>
+
+struct A {
+virtual int foo() __attribute__((nocf_check)) { return 42; }
+};
+
+struct B : A {
+int foo() __attribute__((nocf_check)) { return 73; }
+};
+
+int main() {
+B b;
+A& a = b;
+int (A::*amem) () __attribute__((nocf_check)) = &A::foo; // take address
+if ((a.*amem)() == 73) // use the address
+ printf("pass\n");
+else
+ printf("fail\n");
+return 0;
+}
diff --git a/gcc/testsuite/g++.dg/concepts/pr67595.C b/gcc/testsuite/g++.dg/concepts/pr67595.C
new file mode 100644
index 00000000000..63162fb4c72
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/pr67595.C
@@ -0,0 +1,13 @@
+// { dg-options "-std=c++17 -fconcepts" }
+
+template <class X> concept bool allocatable = requires{{new X}->X * };
+template <class X> concept bool semiregular = allocatable<X>;
+template <class X> concept bool readable = requires{requires semiregular<X>};
+template <class> int weak_input_iterator = requires{{0}->readable};
+template <class X> bool input_iterator{weak_input_iterator<X>};
+template <class X> bool forward_iterator{input_iterator<X>};
+template <class X> bool bidirectional_iterator{forward_iterator<X>};
+template <class X>
+concept bool random_access_iterator{bidirectional_iterator<X>};
+void fn1(random_access_iterator);
+int main() { fn1(0); } // { dg-error "" }
diff --git a/gcc/testsuite/g++.dg/concepts/pr71368.C b/gcc/testsuite/g++.dg/concepts/pr71368.C
new file mode 100644
index 00000000000..f0e0a956366
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/pr71368.C
@@ -0,0 +1,25 @@
+// { dg-options "-std=c++17 -fconcepts" }
+
+struct inner;
+
+template<typename X> concept bool CompoundReq = requires {
+ // fine with concrete type in trailing type, i.e. inner& instead of X&
+ { X::inner_member() } -> X&;
+};
+
+template<typename X> concept bool Concept = requires {
+ { X::outer_member() } -> CompoundReq;
+};
+
+struct inner { static inner& inner_member(); };
+struct outer { static inner outer_member(); };
+
+int main()
+{
+ // fine
+ static_assert( CompoundReq<inner> );
+ static_assert( CompoundReq<decltype( outer::outer_member() )> );
+
+ // ICE
+ static_assert( Concept<outer> );
+}
diff --git a/gcc/testsuite/g++.dg/concepts/pr71385.C b/gcc/testsuite/g++.dg/concepts/pr71385.C
new file mode 100644
index 00000000000..bd5d08cb6f0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/pr71385.C
@@ -0,0 +1,12 @@
+// { dg-options "-std=c++17 -fconcepts" }
+
+template<class T>
+concept bool Addable(){
+ return requires(T x){
+ {x + x} -> T;
+ };
+}
+
+int main(){
+ Addable t = 0;
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/alignas12.C b/gcc/testsuite/g++.dg/cpp0x/alignas12.C
new file mode 100644
index 00000000000..bc163441529
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alignas12.C
@@ -0,0 +1,6 @@
+// PR c++/71821
+// { dg-do compile { target c++11 } }
+
+template < typename > constexpr int f () { return 4; }
+
+alignas (f < int >) char c; // { dg-error "non-integral type" }
diff --git a/gcc/testsuite/g++.dg/cpp0x/auto21.C b/gcc/testsuite/g++.dg/cpp0x/auto21.C
index a827b3df853..346e98c254e 100644
--- a/gcc/testsuite/g++.dg/cpp0x/auto21.C
+++ b/gcc/testsuite/g++.dg/cpp0x/auto21.C
@@ -1,5 +1,5 @@
// Origin PR c++/47208
// { dg-do compile { target c++11 } }
-constexpr auto list = { }; // { dg-error "deducing from brace-enclosed initializer list requires #include <initializer_list>" }
+constexpr auto list = { }; // { dg-error "deducing from brace-enclosed initializer list requires '#include <initializer_list>'" }
static const int l = list.size();
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-61323.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-61323.C
new file mode 100644
index 00000000000..f194bb8be82
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-61323.C
@@ -0,0 +1,26 @@
+// PR c++/61323
+// { dg-do compile { target c++11 } }
+
+char* table1[10];
+template<unsigned size, char*(&table)[size]> void test1() { }
+void tester1() { test1<10,table1>(); }
+
+static char* table2[10];
+template<unsigned size, char*(&table)[size]> void test2() { }
+void tester2() { test2<10,table2>(); }
+
+const char* table3[10];
+template<unsigned size, const char*(&table)[size]> void test3() { }
+void tester3() { test3<10,table3>(); }
+
+const char* const table4[10] = {};
+template<unsigned size, const char*const (&table)[size]> void test4() { }
+void tester4() { test4<10,table4>(); }
+
+const char* volatile table5[10] = {};
+template<unsigned size, const char* volatile (&table)[size]> void test5() { }
+void tester5() { test5<10,table5>(); }
+
+const char* const table6[10] = {};
+template<unsigned size, const char*const (&table)[size]> void test6() { }
+void tester6() { test6<10,table6>(); }
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-ice18.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-ice18.C
new file mode 100644
index 00000000000..0b5ff701306
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-ice18.C
@@ -0,0 +1,11 @@
+// PR c++/67831
+// { dg-do compile { target c++11 } }
+
+struct Task {
+ struct TaskStaticData {
+ constexpr TaskStaticData() {}
+ } const &tsd;
+ constexpr Task() : tsd(TaskStaticData()) {}
+};
+
+Task tasks{Task()};
diff --git a/gcc/testsuite/g++.dg/cpp0x/enum35.C b/gcc/testsuite/g++.dg/cpp0x/enum35.C
new file mode 100644
index 00000000000..bcc1b26b390
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/enum35.C
@@ -0,0 +1,14 @@
+// PR c++/82307
+// { dg-do run { target c++11 } }
+
+#include <cassert>
+
+enum : unsigned long long { VAL };
+
+bool foo (unsigned long long) { return true; }
+bool foo (int) { return false; }
+
+int main()
+{
+ assert (foo(VAL));
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/enum36.C b/gcc/testsuite/g++.dg/cpp0x/enum36.C
new file mode 100644
index 00000000000..4859670309f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/enum36.C
@@ -0,0 +1,14 @@
+// PR c++/82307
+// { dg-do run { target c++11 } }
+
+#include <cassert>
+
+enum : short { VAL };
+
+bool foo (int) { return true; }
+bool foo (unsigned long long) { return false; }
+
+int main()
+{
+ assert (foo (VAL));
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/missing-initializer_list-include.C b/gcc/testsuite/g++.dg/cpp0x/missing-initializer_list-include.C
index 8e803c82f24..7d72ec45de4 100644
--- a/gcc/testsuite/g++.dg/cpp0x/missing-initializer_list-include.C
+++ b/gcc/testsuite/g++.dg/cpp0x/missing-initializer_list-include.C
@@ -7,7 +7,7 @@
void test (int i)
{
- auto a = { &i }; // { dg-error "deducing from brace-enclosed initializer list requires #include <initializer_list>" }
+ auto a = { &i }; // { dg-error "deducing from brace-enclosed initializer list requires '#include <initializer_list>'" }
}
/* Verify the output from -fdiagnostics-generate-patch.
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept31.C b/gcc/testsuite/g++.dg/cpp0x/noexcept31.C
new file mode 100644
index 00000000000..c4c0e7dd466
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept31.C
@@ -0,0 +1,12 @@
+// PR c++/77369
+// { dg-do compile { target c++11 } }
+
+template<typename F> int caller(F f) noexcept(noexcept(f())) { f(); return 0; }
+
+void func1() noexcept { }
+
+void func2() { throw 1; }
+
+int instantiate_caller_with_func1 = caller(func1);
+
+static_assert( !noexcept(caller(func2)), "" );
diff --git a/gcc/testsuite/g++.dg/cpp0x/pr82560.C b/gcc/testsuite/g++.dg/cpp0x/pr82560.C
new file mode 100644
index 00000000000..3408bae518e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/pr82560.C
@@ -0,0 +1,28 @@
+// { dg-do run { target c++11 } }
+// PR82560, failed to destruct default arg inside new
+
+static int liveness = 0;
+
+struct Foo {
+
+ Foo (int) {
+ liveness++;
+ }
+
+ ~Foo() {
+ liveness--;
+ }
+
+};
+
+struct Bar {
+ Bar (Foo = 0) { }
+ ~Bar() { }
+};
+
+int main()
+{
+ delete new Bar();
+
+ return liveness != 0;;
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/pr82725.C b/gcc/testsuite/g++.dg/cpp0x/pr82725.C
new file mode 100644
index 00000000000..14cb6d897c9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/pr82725.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target { { i?86-*-* x86_64-*-* } && c++11 } } }
+// { dg-require-effective-target pie }
+// { dg-options "-O2 -fpie -mtls-direct-seg-refs" }
+
+struct string
+{
+ __SIZE_TYPE__ length;
+ const char *ptr;
+};
+
+string
+tempDir ()
+{
+ thread_local string cache;
+ return cache;
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-extern-c.C b/gcc/testsuite/g++.dg/cpp0x/udlit-extern-c.C
new file mode 100644
index 00000000000..d47a49c3fa8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-extern-c.C
@@ -0,0 +1,7 @@
+// { dg-do compile { target c++11 } }
+
+extern "C" { // { dg-message "1: 'extern .C.' linkage started here" }
+
+constexpr double operator"" _deg ( double degrees ); // { dg-error "literal operator with C linkage" }
+
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-crash4.C b/gcc/testsuite/g++.dg/cpp0x/variadic-crash4.C
new file mode 100644
index 00000000000..2974fe933e1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic-crash4.C
@@ -0,0 +1,14 @@
+// PR c++/68884
+// { dg-do compile { target c++11 } }
+
+namespace std {
+ template <typename _Tp, _Tp __v> struct A { static constexpr _Tp value = __v; };
+typedef A<bool, true> true_type;
+}
+template <int> struct VsA;
+template <class ValueType> struct ValueTemplate {
+ template <template <ValueType> class, class> struct IsInstanceOf;
+ template <template <ValueType> class TemplateA, ValueType... TypesA>
+ struct IsInstanceOf<TemplateA, TemplateA<TypesA...>> : std::true_type {};
+};
+bool foo = ValueTemplate<int>::IsInstanceOf<VsA, VsA<0>>::value;
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-crash5.C b/gcc/testsuite/g++.dg/cpp0x/variadic-crash5.C
new file mode 100644
index 00000000000..6866f39975a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic-crash5.C
@@ -0,0 +1,28 @@
+// PR c++/81957
+// { dg-do compile { target c++11 } }
+
+template <class T, T v>
+struct integral_constant { };
+
+struct f {
+ template<bool b, typename Int>
+ void operator()(integral_constant<bool,b>, Int i) {
+ }
+};
+
+template<bool...Bs, typename F, typename ...T>
+auto dispatch(F f, T...t) -> decltype(f(integral_constant<bool,Bs>()..., t...)) {
+ return f(integral_constant<bool,Bs>()..., t...);
+}
+
+template<bool...Bs, typename F, typename ...T>
+auto dispatch(F f, bool b, T...t) -> decltype(dispatch<Bs..., true>(f, t...)) {
+ if (b)
+ return dispatch<Bs..., true>(f, t...);
+ else
+ return dispatch<Bs..., false>(f, t...);
+}
+
+int main() {
+ dispatch(f(), true, 5);
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn41.C b/gcc/testsuite/g++.dg/cpp1y/auto-fn41.C
new file mode 100644
index 00000000000..25a879da118
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn41.C
@@ -0,0 +1,23 @@
+// PR c++/80873
+// { dg-do compile { target c++14 } }
+
+struct S {};
+
+auto overloaded(S &);
+
+template <typename T>
+int overloaded(T &) {
+ return 0;
+}
+
+template <typename T>
+auto returns_lambda(T &param) {
+ return [&] {
+ overloaded(param); // { dg-error "before deduction" }
+ };
+}
+
+int main() {
+ S s;
+ returns_lambda(s);
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn42.C b/gcc/testsuite/g++.dg/cpp1y/auto-fn42.C
new file mode 100644
index 00000000000..0f2b68efa42
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn42.C
@@ -0,0 +1,21 @@
+// PR c++/80873
+// { dg-do compile { target c++14 } }
+
+struct Buffer {};
+
+auto parse(Buffer b);
+template <typename T> void parse(T target);
+
+template <typename T>
+auto field(T target) {
+ return [&] {
+ parse(target);
+ };
+}
+
+template <typename T>
+void parse(T target) {}
+
+auto parse(Buffer b) {
+ field(0);
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn43.C b/gcc/testsuite/g++.dg/cpp1y/auto-fn43.C
new file mode 100644
index 00000000000..7256ecb0d01
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn43.C
@@ -0,0 +1,13 @@
+// PR c++/64931
+// { dg-do compile { target c++14 } }
+
+template<typename T>
+struct S {
+ T data[32];
+};
+
+auto
+foo (S<int> & x)
+{
+ return x;
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn44.C b/gcc/testsuite/g++.dg/cpp1y/auto-fn44.C
new file mode 100644
index 00000000000..e35215d64c7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn44.C
@@ -0,0 +1,12 @@
+// PR c++/79474
+// { dg-do compile { target c++14 } }
+
+struct Funject
+{
+ operator auto() { return +[](bool b) {return b;}; }
+ operator auto() { return +[](bool b, bool, bool) {return b;}; } // { dg-error "cannot be overloaded" }
+};
+
+Funject fun;
+auto bbb = fun(true);
+auto bbbb = fun(true, false, true); // { dg-error "no match" }
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn45.C b/gcc/testsuite/g++.dg/cpp1y/auto-fn45.C
new file mode 100644
index 00000000000..a9c163dd736
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn45.C
@@ -0,0 +1,27 @@
+// PR c++/69057
+// { dg-do compile { target c++14 } }
+
+#include <cassert>
+
+using GLenum = unsigned int;
+
+template <typename T>
+inline constexpr auto from_enum(const T& x) noexcept
+{
+ // Comment this line to prevent segmentation fault:
+ assert(true);
+ // ------------------------------------------------
+
+ return (GLenum)x;
+}
+
+enum class buffer_target : GLenum
+{
+ array
+};
+
+struct vbo
+{
+ static constexpr GLenum target_value{from_enum(buffer_target::array)};
+ GLenum x{target_value};
+};
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-80739.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-80739.C
new file mode 100644
index 00000000000..5bfa082866a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-80739.C
@@ -0,0 +1,20 @@
+// PR c++/80739
+// { dg-do compile { target c++14 } }
+
+using size_t = decltype(sizeof(0));
+template <class T> struct element {
+ constexpr element() noexcept: x0(0), x1(0), x2(0), x3(0) {}
+ T x0; int x1, x2, x3;
+};
+template <class T> struct container {
+ constexpr container() noexcept: data() {data = element<T>();}
+ element<T> data;
+};
+template <class T> constexpr bool test() {
+ return (container<T>(), true);
+}
+int main() {
+ constexpr bool tmp0 = test<int>();
+ constexpr bool tmp1 = test<size_t>();
+ return tmp0 && tmp1;
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-82218.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-82218.C
new file mode 100644
index 00000000000..06507a9f437
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-82218.C
@@ -0,0 +1,128 @@
+// PR c++/82218
+// { dg-do compile { target c++14 } }
+
+template<typename _Tp>
+struct identity
+{
+ typedef _Tp type;
+};
+
+template<typename _Tp>
+inline _Tp&&
+forward(typename identity<_Tp>::type&& __t)
+{ return __t; }
+
+template < typename T >
+class delegate;
+
+template < typename R, typename... Params >
+class delegate< R(Params...) > final
+{
+private:
+ using CallbackType = R (*)(void*, Params...);
+
+ using FunctionPtr = R (*)(Params...);
+
+ template < typename Object >
+ using MethodPtr = R (Object::*)(Params...);
+
+ template < typename Object >
+ using ConstMethodPtr = R (Object::*)(Params...) const;
+
+ void* obj_;
+ CallbackType cb_;
+
+ template < typename Object, MethodPtr< Object > Mptr >
+ constexpr static R invoke_method(void* obj, Params... params) noexcept(
+ noexcept((static_cast< Object* >(obj)->*Mptr)(params...)))
+ {
+ return (static_cast< Object* >(obj)->*Mptr)(params...);
+ }
+
+ template < typename Object, ConstMethodPtr< Object > Mptr >
+ constexpr static R invoke_method(void* obj, Params... params) noexcept(
+ noexcept((static_cast< Object* >(obj)->*Mptr)(params...)))
+ {
+ return (static_cast< Object* >(obj)->*Mptr)(params...);
+ }
+
+ template < FunctionPtr Fptr >
+ constexpr static R invoke_function(void*, Params... params) noexcept(
+ noexcept((*Fptr)(params...)))
+ {
+ return (*Fptr)(params...);
+ }
+
+ constexpr delegate(void* obj, CallbackType callback) noexcept : obj_(obj),
+ cb_(callback)
+ {
+ }
+
+ constexpr static R error_function(Params...)
+ {
+ while(1);
+ }
+
+public:
+ using base_type = delegate< R(Params...) >;
+
+ delegate()
+ {
+ *this = from< error_function >();
+ }
+
+ delegate(const base_type&) = default;
+ delegate(base_type&&) = default;
+
+ base_type& operator=(const base_type&) = default;
+ base_type& operator=(base_type&&) = default;
+
+ template < typename Object, MethodPtr< Object > Mptr >
+ constexpr static auto from(Object& obj) noexcept
+ {
+ return delegate(&obj, &invoke_method< Object, Mptr >);
+ }
+
+ template < typename Object, ConstMethodPtr< Object > Mptr >
+ constexpr static auto from(Object& obj) noexcept
+ {
+ return delegate(&obj, &invoke_method< Object, Mptr >);
+ }
+
+ template < FunctionPtr Fptr >
+ constexpr static auto from() noexcept
+ {
+ static_assert(Fptr != nullptr, "Function pointer must not be null");
+
+ return delegate(nullptr, &invoke_function< Fptr >);
+ }
+
+ template < typename... Args >
+ constexpr auto operator()(Args&&... params) const
+ noexcept(noexcept((*cb_)(obj_, forward< Args >(params)...)))
+ {
+ return (*cb_)(obj_, forward< Args >(params)...);
+ }
+
+ constexpr bool valid() const noexcept
+ {
+ return (cb_ != &invoke_function< error_function >);
+ }
+
+ constexpr bool operator==(const delegate& other) const noexcept
+ {
+ return (obj_ == other.obj_) && (cb_ == other.cb_);
+ }
+
+ constexpr bool operator!=(const delegate& other) const noexcept
+ {
+ return (obj_ != other.obj_) || (cb_ != other.cb_);
+ }
+};
+
+delegate< void(void) > a;
+
+void test()
+{
+ a();
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-69078-1.C b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-69078-1.C
new file mode 100644
index 00000000000..3f10f82672d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-69078-1.C
@@ -0,0 +1,26 @@
+// PR c++/69078
+// { dg-do run { target c++14 } }
+// { dg-options "-Wall" }
+
+struct Class {
+ Class(void (*_param)()) : data(_param) {}
+ void (*data)();
+};
+
+void funUser(void (*test)(int)) {
+ test(60);
+}
+
+void user(Class& c, int i) {
+ (void)i;
+ if (!c.data) __builtin_abort();
+}
+
+void probe() {}
+
+int main() {
+ static Class instance = { probe };
+ funUser([](auto... p) {
+ user(instance, p...);
+ });
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-69078-2.C b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-69078-2.C
new file mode 100644
index 00000000000..318e0967250
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-69078-2.C
@@ -0,0 +1,21 @@
+// PR c++/69078
+// { dg-do run { target c++14 } }
+
+#include <cassert>
+
+template<typename F>
+void run( F &&f ) {
+ f(nullptr);
+}
+
+struct V {
+ int i;
+};
+
+int main() {
+ static V const s={2};
+ assert (s.i == 2);
+ run([](auto){
+ assert (s.i == 2);
+ });
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ56.C b/gcc/testsuite/g++.dg/cpp1y/var-templ56.C
new file mode 100644
index 00000000000..d0f762b8e11
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ56.C
@@ -0,0 +1,11 @@
+// PR c++/82085
+// { dg-do compile { target c++14 } }
+
+template <const char& V>
+using char_sequence_t = int;
+
+template <typename T>
+constexpr char name_of_v = 'x';
+
+template <typename T>
+using type = char_sequence_t<name_of_v<T>>;
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction45.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction45.C
new file mode 100644
index 00000000000..3fe8dd33b79
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction45.C
@@ -0,0 +1,24 @@
+// PR c++/82308
+// { dg-options -std=c++17 }
+
+template<typename, unsigned>
+struct array {};
+
+template <unsigned R>
+class X {
+public:
+ using T = array<int, R>;
+
+ enum class C : char { A, B };
+ X(T bounds, C c = C::B) : t(bounds) {}
+
+private:
+ T t;
+};
+
+int main()
+{
+ array<int, 2> a;
+ X d{a};
+ X<2> e{a};
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction46.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction46.C
new file mode 100644
index 00000000000..cf38ed65fa8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction46.C
@@ -0,0 +1,6 @@
+// PR c++/80449
+// { dg-options -std=c++17 }
+
+template<class S> struct C;
+template<> struct C<int> { C(int, int) {} };
+auto k = C{0, 0}; // { dg-error "cannot deduce" }
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda18.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda18.C
new file mode 100644
index 00000000000..639018ba945
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda18.C
@@ -0,0 +1,30 @@
+// PR c++/82570
+// { dg-options "-std=c++17" }
+
+template< typename Body >
+inline void iterate(Body body)
+{
+ body(10);
+}
+
+template< typename Pred >
+inline void foo(Pred pred)
+{
+ iterate([&](int param)
+ {
+ if (pred(param))
+ {
+ unsigned char buf[4];
+ buf[0] = 0;
+ buf[1] = 1;
+ buf[2] = 2;
+ buf[3] = 3;
+ }
+ });
+}
+
+int main()
+{
+ foo([](int x) { return x > 0; });
+ return 0;
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/noexcept-type13.C b/gcc/testsuite/g++.dg/cpp1z/noexcept-type13.C
index 8eb3be0bd61..b51d7af2b11 100644
--- a/gcc/testsuite/g++.dg/cpp1z/noexcept-type13.C
+++ b/gcc/testsuite/g++.dg/cpp1z/noexcept-type13.C
@@ -5,7 +5,7 @@
void foo () throw () {} // { dg-bogus "mangled name" }
template <class T>
-T bar (T x) { return x; } // { dg-warning "mangled name" "" { target c++14_down } }
+T bar (T x) { return x; }
void baz () { // { dg-bogus "mangled name" }
return (bar (foo)) ();
diff --git a/gcc/testsuite/g++.dg/cpp1z/noexcept-type18.C b/gcc/testsuite/g++.dg/cpp1z/noexcept-type18.C
new file mode 100644
index 00000000000..e01fd0a2030
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/noexcept-type18.C
@@ -0,0 +1,15 @@
+// { dg-options "-std=c++17" }
+
+template<typename T>
+struct S;
+
+template<bool IsNoexcept>
+struct S<void(*)() noexcept(IsNoexcept)> {
+ S() {}
+};
+
+void f() {}
+
+int main() {
+ S<decltype(&f)> {};
+}
diff --git a/gcc/testsuite/g++.dg/cpp1z/pr81016.C b/gcc/testsuite/g++.dg/cpp1z/pr81016.C
new file mode 100644
index 00000000000..4826fbfb775
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/pr81016.C
@@ -0,0 +1,4 @@
+// { dg-options "-std=c++17" }
+
+template <typename a, a> struct b;
+template <typename c> struct b<bool, c::d>; // { dg-error "template parameter" }
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/pr77363.C b/gcc/testsuite/g++.dg/debug/dwarf2/pr77363.C
index 47b71433815..cd06c360a98 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2/pr77363.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/pr77363.C
@@ -1,9 +1,9 @@
// PR debug/77363
// { dg-options "-gdwarf-2 -dA -fno-merge-debug-strings" }
-// { dg-final { scan-assembler "DIE \\(\[^\n\r\]*\\) DW_TAG_typedef\[^\n\r\]*\[\n\r]*\[^\n\r\]*type2\[^\n\r\]* DW_AT_name\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_file\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_line\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_type" } }
-// { dg-final { scan-assembler "DIE \\(\[^\n\r\]*\\) DW_TAG_typedef\[^\n\r\]*\[\n\r]*\[^\n\r\]*type3\[^\n\r\]* DW_AT_name\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_file\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_line\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_type" } }
-// { dg-final { scan-assembler "DIE \\(\[^\n\r\]*\\) DW_TAG_typedef\[^\n\r\]*\[\n\r]*\[^\n\r\]*type4\[^\n\r\]* DW_AT_name\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_file\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_line\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_type" } }
-// { dg-final { scan-assembler "DIE \\(\[^\n\r\]*\\) DW_TAG_typedef\[^\n\r\]*\[\n\r]*\[^\n\r\]*type5\[^\n\r\]* DW_AT_name\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_file\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_line\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_type" } }
+// { dg-final { scan-assembler "DIE \\(\[^\n\r\]*\\) DW_TAG_typedef\[^\n\r\]*\[\n\r]*\[^\n\r\]*type2\[^\n\r\]* DW_AT_name\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_file\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_line\[^\n\r\]*\[\n\r]*(\[^\n\r\]* DW_AT_decl_column\[^\n\r\]*\[\n\r]*)?\[^\n\r\]* DW_AT_type" } }
+// { dg-final { scan-assembler "DIE \\(\[^\n\r\]*\\) DW_TAG_typedef\[^\n\r\]*\[\n\r]*\[^\n\r\]*type3\[^\n\r\]* DW_AT_name\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_file\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_line\[^\n\r\]*\[\n\r]*(\[^\n\r\]* DW_AT_decl_column\[^\n\r\]*\[\n\r]*)?\[^\n\r\]* DW_AT_type" } }
+// { dg-final { scan-assembler "DIE \\(\[^\n\r\]*\\) DW_TAG_typedef\[^\n\r\]*\[\n\r]*\[^\n\r\]*type4\[^\n\r\]* DW_AT_name\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_file\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_line\[^\n\r\]*\[\n\r]*(\[^\n\r\]* DW_AT_decl_column\[^\n\r\]*\[\n\r]*)?\[^\n\r\]* DW_AT_type" } }
+// { dg-final { scan-assembler "DIE \\(\[^\n\r\]*\\) DW_TAG_typedef\[^\n\r\]*\[\n\r]*\[^\n\r\]*type5\[^\n\r\]* DW_AT_name\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_file\[^\n\r\]*\[\n\r]*\[^\n\r\]* DW_AT_decl_line\[^\n\r\]*\[\n\r]*(\[^\n\r\]* DW_AT_decl_column\[^\n\r\]*\[\n\r]*)?\[^\n\r\]* DW_AT_type" } }
typedef unsigned short type1;
typedef unsigned char type2;
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/typedef6.C b/gcc/testsuite/g++.dg/debug/dwarf2/typedef6.C
index 7945deadaa2..654eba023da 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2/typedef6.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/typedef6.C
@@ -1,5 +1,5 @@
// Origin PR debug/
-// { dg-options "-gdwarf-2 -dA" }
+// { dg-options "-gdwarf-2 -dA -gno-column-info" }
class C {
public:
diff --git a/gcc/testsuite/g++.dg/diagnostic/unclosed-extern-c.C b/gcc/testsuite/g++.dg/diagnostic/unclosed-extern-c.C
index fda3532266d..44f538e33ec 100644
--- a/gcc/testsuite/g++.dg/diagnostic/unclosed-extern-c.C
+++ b/gcc/testsuite/g++.dg/diagnostic/unclosed-extern-c.C
@@ -1,3 +1,12 @@
-extern "C" { /* { dg-message "12: to match this '.'" } */
+extern "C" { // { dg-line open_extern_c }
+
+ int foo (void);
+
+/* Missing close-brace for the extern "C" here. */
+
+template <typename T> // { dg-error "template with C linkage" }
+void bar (void);
+// { dg-message "1: 'extern .C.' linkage started here" "" { target *-*-* } open_extern_c }
void test (void); /* { dg-error "17: expected '.' at end of input" } */
+// { message "12: to match this '.'" "" { target *-*-* } open_extern_c }
diff --git a/gcc/testsuite/g++.dg/ext/is_trivially_constructible5.C b/gcc/testsuite/g++.dg/ext/is_trivially_constructible5.C
new file mode 100644
index 00000000000..15ea33675ed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_trivially_constructible5.C
@@ -0,0 +1,12 @@
+// PR c++/80991
+// { dg-do compile { target c++11 } }
+
+template<bool> void foo()
+{
+ static_assert(__is_trivially_constructible(int, int), "");
+}
+
+void bar()
+{
+ foo<true>();
+}
diff --git a/gcc/testsuite/g++.dg/ext/pr81706.C b/gcc/testsuite/g++.dg/ext/pr81706.C
new file mode 100644
index 00000000000..f0ed8ab6d71
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/pr81706.C
@@ -0,0 +1,32 @@
+// PR libstdc++/81706
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options "-O3 -mavx2 -mno-avx512f" }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } }
+// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } }
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+ int i;
+ for (i = 0; i < 1024; i++)
+ p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+ int i;
+ for (i = 0; i < 1024; i++)
+ p[i] = __builtin_sin (q[i]);
+}
diff --git a/gcc/testsuite/g++.dg/ext/typeof12.C b/gcc/testsuite/g++.dg/ext/typeof12.C
new file mode 100644
index 00000000000..4ba75732db1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/typeof12.C
@@ -0,0 +1,11 @@
+// PR c++/71820
+
+void f (void (*) (int, int)) {}
+
+template < typename T > void g (T x, __typeof__ x) {} // { dg-message "sorry, unimplemented: mangling" }
+
+int main ()
+{
+ f (g < int >);
+ return 0;
+}
diff --git a/gcc/testsuite/g++.dg/gcov/gcov-threads-1.C b/gcc/testsuite/g++.dg/gcov/gcov-threads-1.C
index cc9266ab8ea..cc912f9ddf4 100644
--- a/gcc/testsuite/g++.dg/gcov/gcov-threads-1.C
+++ b/gcc/testsuite/g++.dg/gcov/gcov-threads-1.C
@@ -31,14 +31,14 @@ int main(int argc, char **argv) {
{
ids[i] = i;
int r = pthread_create (&t[i], NULL, ContentionNoDeadlock_thread, &ids[i]);
- assert (r == 0); /* count(5) */
+ assert (r == 0); /* count(5*) */
}
int ret;
for (int i = 0; i < NR; i++)
{
int r = pthread_join (t[i], (void**)&ret);
- assert (r == 0); /* count(5) */
+ assert (r == 0); /* count(5*) */
}
return 0; /* count(1) */
diff --git a/gcc/testsuite/g++.dg/gcov/loop.C b/gcc/testsuite/g++.dg/gcov/loop.C
new file mode 100644
index 00000000000..7f3be5587af
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gcov/loop.C
@@ -0,0 +1,27 @@
+/* { dg-options "-fprofile-arcs -ftest-coverage" } */
+/* { dg-do run { target native } } */
+
+unsigned
+loop (unsigned n, int value) /* count(14k) */
+{
+ for (unsigned i = 0; i < n - 1; i++)
+ {
+ value += i; /* count(21M) */
+ }
+
+ return value;
+}
+
+int main(int argc, char **argv)
+{
+ unsigned sum = 0;
+ for (unsigned i = 0; i < 7 * 1000; i++)
+ {
+ sum += loop (1000, sum);
+ sum += loop (2000, sum); /* count(7k) */
+ }
+
+ return 0; /* count(1) */
+}
+
+/* { dg-final { run-gcov branches { -abj loop.C } } } */
diff --git a/gcc/testsuite/g++.dg/gcov/ternary.C b/gcc/testsuite/g++.dg/gcov/ternary.C
new file mode 100644
index 00000000000..d055928c295
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gcov/ternary.C
@@ -0,0 +1,12 @@
+// { dg-options "-fprofile-arcs -ftest-coverage" }
+// { dg-do run { target native } }
+
+int b, c, d, e;
+
+int main()
+{
+ int a = b < 1 ? (c < 3 ? d : c) : e; /* count(1*) */
+ return a;
+}
+
+// { dg-final { run-gcov remove-gcda ternary.C } }
diff --git a/gcc/testsuite/g++.dg/guality/pr82630.C b/gcc/testsuite/g++.dg/guality/pr82630.C
new file mode 100644
index 00000000000..71d11acf5e2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/guality/pr82630.C
@@ -0,0 +1,58 @@
+// PR debug/82630
+// { dg-do run }
+// { dg-additional-options "-fPIC" { target fpic } }
+
+struct C
+{
+ int &c;
+ long d;
+ __attribute__((always_inline)) C (int &x) : c(x), d() {}
+};
+int v;
+
+__attribute__((noipa)) void
+fn1 (const void *x)
+{
+ asm volatile ("" : : "g" (x) : "memory");
+}
+
+__attribute__((noipa)) void
+fn2 (C x)
+{
+ int a = x.c + x.d;
+ asm volatile ("" : : "g" (a) : "memory");
+}
+
+__attribute__((noipa)) void
+fn3 (void)
+{
+ asm volatile ("" : : : "memory");
+}
+
+__attribute__((noipa))
+#ifdef __i386__
+__attribute__((regparm (2)))
+#endif
+static void
+fn4 (int *x, const char *y, C z)
+{
+ fn2 (C (*x));
+ fn1 ("baz");
+ fn2 (z); // { dg-final { gdb-test 41 "y\[0\]" "'f'" } }
+ fn1 ("baz"); // { dg-final { gdb-test 41 "y\[1\]" "'o'" } }
+}
+
+__attribute__((noipa)) void
+fn5 (int *x)
+{
+ fn4 (x, "foo", C (*x));
+ fn3 ();
+}
+
+int
+main ()
+{
+ int a = 10;
+ fn5 (&a);
+ return 0;
+}
diff --git a/gcc/testsuite/g++.dg/lang-dump.C b/gcc/testsuite/g++.dg/lang-dump.C
new file mode 100644
index 00000000000..b2eddafa79e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lang-dump.C
@@ -0,0 +1,21 @@
+// { dg-additional-options "-fdump-lang-all" }
+// Just check we don't explode when asking for language dumps. Does
+// not necessarily mean any particular language dump is useful.
+
+struct X
+{
+ int m;
+ virtual ~X ();
+};
+
+X::~X () {}
+
+struct Y : X
+{
+};
+
+int frob (int a)
+{
+ return 2 * a;
+}
+
diff --git a/gcc/testsuite/g++.dg/opt/pr82577.C b/gcc/testsuite/g++.dg/opt/pr82577.C
new file mode 100644
index 00000000000..1a06897a403
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr82577.C
@@ -0,0 +1,22 @@
+// { dg-additional-options "-O2" }
+// PR c++/82577 ICE when optimizing
+
+#if __cplusplus > 201500L
+// register is no longer a keyword in C++17.
+#define register
+#endif
+
+class a {
+public:
+ int *b();
+};
+struct c {
+ int d;
+ a e;
+} f;
+void fn1(register c *g) {
+ register int *h;
+ do
+ (h) = g->e.b() + (g)->d;
+ while (&f);
+}
diff --git a/gcc/testsuite/g++.dg/opt/pr82778.C b/gcc/testsuite/g++.dg/opt/pr82778.C
new file mode 100644
index 00000000000..eeac0c5f38b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr82778.C
@@ -0,0 +1,37 @@
+// PR rtl-optimization/82778
+// { dg-do compile }
+// { dg-options "-O2" }
+
+template <typename a, int b> struct c {
+ typedef a d[b];
+ static a e(d f, int g) { return f[g]; }
+};
+template <typename a, int b> struct B {
+ typedef c<a, b> h;
+ typename h::d i;
+ long j;
+ a at() { return h::e(i, j); }
+};
+int k, m, r, s, t;
+char l, n, q;
+short o, p, w;
+struct C {
+ int u;
+};
+B<C, 4> v;
+void x() {
+ if (((p > (q ? v.at().u : k)) >> l - 226) + !(n ^ r * m))
+ s = ((-(((p > (q ? v.at().u : k)) >> l - 226) + !(n ^ r * m)) < 0) /
+ (-(((p > (q ? v.at().u : k)) >> l - 226) + !(n ^ r * m)) ^
+ -25 & o) &&
+ p) >>
+ (0 <= 0
+ ? 0 ||
+ (-(((p > (q ? v.at().u : k)) >> l - 226) + !(n ^ r * m)) <
+ 0) /
+ (-(((p > (q ? v.at().u : k)) >> l - 226) +
+ !(n ^ r * m)) ^ -25 & o)
+ : 0);
+ w = (p > (q ? v.at().u : k)) >> l - 226;
+ t = !(n ^ r * m);
+}
diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/other/i386-2.C
index 63c5f738baa..7e35e686cff 100644
--- a/gcc/testsuite/g++.dg/other/i386-2.C
+++ b/gcc/testsuite/g++.dg/other/i386-2.C
@@ -1,12 +1,12 @@
/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid" } */
+/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni" } */
/* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
popcntintrin.h, fmaintrin.h, pkuintrin.h, avx5124fmapsintrin.h,
- avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and mm_malloc.h.h are usable
- with -O -pedantic-errors. */
+ avx5124vnniwintrin.h, avx512vpopcntdqintrin.h, gfniintrin.h
+ and mm_malloc.h are usable with -O -pedantic-errors. */
#include <x86intrin.h>
diff --git a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/other/i386-3.C
index 16a96efe2a5..7e44d47a93c 100644
--- a/gcc/testsuite/g++.dg/other/i386-3.C
+++ b/gcc/testsuite/g++.dg/other/i386-3.C
@@ -1,10 +1,10 @@
/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid" } */
+/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni" } */
/* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
popcntintrin.h, fmaintrin.h, pkuintrin.h, avx5124fmapsintrin.h,
- avx5124vnniwintrin.h, avx512vpopcntdqintrin.h and mm_malloc.h are
- usable with -O -fkeep-inline-functions. */
+ avx5124vnniwintrin.h, avx512vpopcntdqintrin.h, gfniintrin.h and
+ mm_malloc.h are usable with -O -fkeep-inline-functions. */
#include <x86intrin.h>
diff --git a/gcc/testsuite/g++.dg/other/operator2.C b/gcc/testsuite/g++.dg/other/operator2.C
index 4b952bf11eb..cc68d53354e 100644
--- a/gcc/testsuite/g++.dg/other/operator2.C
+++ b/gcc/testsuite/g++.dg/other/operator2.C
@@ -3,7 +3,7 @@
struct A
{
- operator int&(int); // { dg-error "void" }
+ operator int&(int); // { dg-error "no arguments" }
};
A a;
diff --git a/gcc/testsuite/g++.dg/other/pr53574.C b/gcc/testsuite/g++.dg/other/pr53574.C
new file mode 100644
index 00000000000..cc899a552c8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/other/pr53574.C
@@ -0,0 +1,48 @@
+// PR c++/53574
+// { dg-do compile { target c++11 } }
+// { dg-options "-fstack-usage" }
+
+template <typename> struct A { typedef int type; };
+struct B {
+ typedef __SIZE_TYPE__ H;
+};
+template <typename> class allocator : B {};
+template <typename _Alloc> struct C {
+ template <typename T>
+ static typename T::H foo(T *);
+ typedef decltype(foo((_Alloc *)0)) H;
+ template <typename U>
+ static typename A<H>::type bar(U) { return typename A<H>::type (); }
+ static int baz(_Alloc p1) { bar(p1); return 0; }
+};
+template <typename _Alloc> struct I : C<_Alloc> {};
+template <typename, typename> struct J {
+ typedef I<allocator<int>> K;
+ K k;
+};
+struct D : J<int, allocator<int>> {
+ void fn(int, int) {
+ K m;
+ I<K>::baz(m);
+ }
+};
+template <class Ch, class = int, class = int> struct F {
+ F();
+ F(const Ch *);
+ F test();
+ D d;
+};
+int l;
+struct G {
+ G(F<char>);
+};
+char n;
+template <class Ch, class Tr, class Alloc> F<Ch, Tr, Alloc>::F(const Ch *) {
+ test();
+}
+template <class Ch, class Tr, class Alloc>
+F<Ch, Tr, Alloc> F<Ch, Tr, Alloc>::test() {
+ d.fn(l, 0);
+ return F<Ch, Tr, Alloc> ();
+}
+G fn1() { return G(&n); }
diff --git a/gcc/testsuite/g++.dg/parse/builtin2.C b/gcc/testsuite/g++.dg/parse/builtin2.C
index c524ea68416..daa80bb11b0 100644
--- a/gcc/testsuite/g++.dg/parse/builtin2.C
+++ b/gcc/testsuite/g++.dg/parse/builtin2.C
@@ -1,5 +1,5 @@
// PR c++/14432
-// { dg-options "" }
+// { dg-options "-Wno-builtin-declaration-mismatch" }
struct Y {};
Y y1;
diff --git a/gcc/testsuite/g++.dg/pr71694.C b/gcc/testsuite/g++.dg/pr71694.C
index e79f62aeb13..0a8baf230bf 100644
--- a/gcc/testsuite/g++.dg/pr71694.C
+++ b/gcc/testsuite/g++.dg/pr71694.C
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fno-store-merging" } */
struct B {
B() {}
diff --git a/gcc/testsuite/g++.dg/template/bitfield4.C b/gcc/testsuite/g++.dg/template/bitfield4.C
new file mode 100644
index 00000000000..4927b7ab144
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/bitfield4.C
@@ -0,0 +1,6 @@
+// PR c++/82357
+
+template <typename> struct A {
+ A() { x |= 0; }
+ int x : 8;
+};
diff --git a/gcc/testsuite/g++.dg/template/cast4.C b/gcc/testsuite/g++.dg/template/cast4.C
new file mode 100644
index 00000000000..2f46c7189eb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/cast4.C
@@ -0,0 +1,4 @@
+template <class T> void f()
+{
+ static_cast<int&>(42); // { dg-error "static_cast" }
+}
diff --git a/gcc/testsuite/g++.dg/template/crash128.C b/gcc/testsuite/g++.dg/template/crash128.C
new file mode 100644
index 00000000000..2682e3dc3ce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/crash128.C
@@ -0,0 +1,19 @@
+// PR c++/54090
+
+template <int n>
+struct X {
+
+ template <int N, bool = (n >= N), typename T = void> struct Y;
+
+ template <int N, typename T>
+ struct Y<N, true, T> {};
+
+ static const int M = n / 2;
+
+ template <typename T>
+ struct Y<X::M, true, T> {};
+};
+
+void foo() {
+ X<10>::Y<10/2> y;
+}
diff --git a/gcc/testsuite/g++.dg/template/extern-c.C b/gcc/testsuite/g++.dg/template/extern-c.C
new file mode 100644
index 00000000000..c0dd7cb66d5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/extern-c.C
@@ -0,0 +1,66 @@
+template <typename T> void specializable (T);
+
+/* Invalid template: within "extern C". */
+
+extern "C" { // { dg-message "1: 'extern .C.' linkage started here" }
+
+template <typename T> // { dg-error "template with C linkage" }
+void within_extern_c_braces (void);
+
+}
+
+/* Valid template: not within "extern C". */
+
+template <typename T>
+void not_within_extern_c (void);
+
+
+/* Invalid specialization: within "extern C". */
+
+extern "C" { // { dg-message "1: 'extern .C.' linkage started here" }
+
+template <> // { dg-error "template specialization with C linkage" }
+void specializable (int);
+
+}
+
+
+/* Valid specialization: not within "extern C". */
+template <>
+void specializable (char);
+
+
+/* Example of extern C without braces. */
+
+extern "C" template <typename T> // { dg-line open_extern_c_no_braces }
+void within_extern_c_no_braces (void);
+// { dg-error "12: template with C linkage" "" { target *-*-* } open_extern_c_no_braces }
+// { dg-message "1: 'extern .C.' linkage started here" "" { target *-*-* } open_extern_c_no_braces }
+
+
+/* Nested extern "C" specifications.
+ We should report within the innermost extern "C" that's still open. */
+
+extern "C" {
+ extern "C" { // { dg-line middle_open_extern_c }
+ extern "C" {
+ }
+
+ template <typename T> // { dg-error "template with C linkage" }
+ void within_nested_extern_c (void);
+ // { dg-message "3: 'extern .C.' linkage started here" "" { target *-*-* } middle_open_extern_c }
+
+ extern "C++" {
+ /* Valid template: within extern "C++". */
+ template <typename T>
+ void within_nested_extern_cpp (void);
+
+ extern "C" { // { dg-line last_open_extern_c }
+ /* Invalid template: within "extern C". */
+ template <typename T> // { dg-error "template with C linkage" }
+ void within_extern_c_within_extern_cpp (void);
+ // { dg-message "7: 'extern .C.' linkage started here" "" { target *-*-* } last_open_extern_c }
+ }
+ }
+ }
+}
diff --git a/gcc/testsuite/g++.dg/torture/pr70971.C b/gcc/testsuite/g++.dg/torture/pr70971.C
new file mode 100644
index 00000000000..23f33aafaba
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr70971.C
@@ -0,0 +1,48 @@
+// { dg-additional-options "-std=c++14" }
+
+template<typename Signature>
+class function;
+
+template<typename R, typename... Args>
+class invoker_base
+{
+ public:
+ virtual ~invoker_base() { }
+};
+
+template<typename F, typename R, typename... Args>
+class functor_invoker : public invoker_base<R, Args...>
+{
+ public:
+ explicit functor_invoker(const F& f) : f(f) { }
+ private:
+ F f;
+};
+
+template<typename R, typename... Args>
+class function<R (Args...)> {
+ public:
+ template<typename F>
+ function(const F& f) : invoker(0) {
+ invoker = new functor_invoker<F, R, Args...>(f);
+ }
+ ~function() {
+ if (invoker)
+ delete invoker;
+ }
+ private:
+ invoker_base<R, Args...>* invoker;
+};
+
+template<typename>
+struct unique_ptr { };
+
+struct A {};
+template <class...> struct typelist {};
+template <class... Cs> unique_ptr<A> chooseB(typelist<Cs...>);
+template <class... Cs, class Idx, class... Rest>
+unique_ptr<A> chooseB(typelist<Cs...> choices, Idx, Rest... rest) {
+ auto f = [=](auto) { return [=] { return chooseB(choices, rest...); }; };
+ function<unique_ptr<A>()> fs[]{f(Cs{})...};
+}
+main() { chooseB(typelist<double, char>{}, 0, 1, 2); }
diff --git a/gcc/testsuite/g++.dg/torture/pr77555.C b/gcc/testsuite/g++.dg/torture/pr77555.C
new file mode 100644
index 00000000000..540d1a09a5f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr77555.C
@@ -0,0 +1,20 @@
+// { dg-do link }
+// { dg-options "-std=c++11" }
+
+extern "C" int printf(const char*, ...);
+struct A {
+ A(int, char *p2) { printf(p2); }
+};
+template <int, typename> struct B { static A static_var; };
+template <int LINE, typename GETTER>
+A B<LINE, GETTER>::static_var{0, GETTER::get()};
+struct C {
+ void unused() {
+ static char function_static;
+ struct D {
+ static char *get() { return &function_static; }
+ };
+ auto addr = B<0, D>::static_var;
+ }
+};
+int main() {}
diff --git a/gcc/testsuite/g++.dg/torture/pr81659.C b/gcc/testsuite/g++.dg/torture/pr81659.C
new file mode 100644
index 00000000000..3696957532e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr81659.C
@@ -0,0 +1,19 @@
+// { dg-do compile }
+
+void
+a (int b)
+{
+ if (b)
+ throw;
+ try
+ {
+ a (3);
+ }
+ catch (int)
+ {
+ }
+ catch (int)
+ {
+ }
+}
+
diff --git a/gcc/testsuite/g++.dg/torture/pr82823.C b/gcc/testsuite/g++.dg/torture/pr82823.C
new file mode 100644
index 00000000000..dab369e7ad3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr82823.C
@@ -0,0 +1,26 @@
+// { dg-do compile }
+// { dg-additional-options "-fstack-clash-protection" }
+// { dg-require-effective-target supports_stack_clash_protection }
+
+
+class a
+{
+public:
+ ~a ();
+ int b;
+};
+class c
+{
+public:
+ a m_fn1 ();
+};
+class d
+{
+ int e ();
+ c f;
+};
+int
+d::e ()
+{
+ return f.m_fn1 ().b;
+}
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr81702.C b/gcc/testsuite/g++.dg/tree-ssa/pr81702.C
new file mode 100644
index 00000000000..85acd857e67
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr81702.C
@@ -0,0 +1,110 @@
+// { dg-do compile }
+// { dg-options "-O2" }
+
+namespace std {
+ struct type_info
+ {
+ virtual bool __do_catch(const type_info *__thr_type, void **__thr_obj,
+ unsigned __outer) const;
+ };
+}
+
+template< typename VALUE_T, typename TYPE >
+struct List_policy
+{
+ typedef VALUE_T *Value_type;
+ typedef TYPE **Type;
+ typedef TYPE *Head_type;
+ typedef TYPE Item_type;
+};
+
+template< typename POLICY >
+class List
+{
+public:
+ typedef typename POLICY::Value_type Value_type;
+ class Iterator
+ {
+ typedef typename POLICY::Type Internal_type;
+ public:
+ typedef typename POLICY::Value_type value_type;
+ typedef typename POLICY::Value_type Value_type;
+ Value_type operator -> () const { return static_cast<Value_type>(*_c); }
+ Internal_type _c;
+ };
+ Iterator begin() { return Iterator(); }
+ Iterator end() { return Iterator(); }
+ typename POLICY::Head_type _f;
+};
+
+template<typename ELEM_TYPE> class H_list_item_t { };
+
+template< typename T, typename POLICY >
+class H_list : public List<POLICY>
+{
+public:
+ typedef typename POLICY::Item_type Item;
+ typedef List<POLICY> Base;
+ typedef typename Base::Iterator Iterator;
+ Iterator insert(T *e, Iterator const &pred)
+ {
+ Item **x = &this->_f;
+ *x = static_cast<Item*>(e);
+ return Iterator();
+ }
+};
+
+template< typename T >
+struct H_list_t : H_list<T, List_policy< T, H_list_item_t<T> > >
+{
+ H_list_t(bool b) : H_list<T, List_policy< T, H_list_item_t<T> > >(b) {}
+};
+
+template< typename BASE, typename MATCH_RESULT >
+struct Type_matcher : H_list_item_t<BASE>
+{
+ explicit Type_matcher(std::type_info const *type);
+ typedef MATCH_RESULT Match_result;
+
+private:
+ std::type_info *_type;
+ typedef H_list_t<BASE> List;
+ typedef typename List::Iterator Iterator;
+ static List _for_type;
+};
+
+template< typename BASE, typename MR >
+Type_matcher<BASE, MR>::Type_matcher(std::type_info const *t)
+{
+ Iterator c = _for_type.begin();
+ t->__do_catch(c->_type, 0, 0);
+ _for_type.insert(static_cast<BASE*>(this), _for_type.begin());
+}
+
+template< typename VI, typename HW >
+class Fa : public Type_matcher<Fa<VI, HW>, VI*>
+{
+public:
+ typedef Fa<VI, HW> Self;
+ virtual VI *do_match(HW *f) = 0;
+ explicit Fa(std::type_info const *type) : Type_matcher<Self, VI*>(type) {}
+};
+
+class Res {};
+typedef Fa<Res, Res> R_fac;
+
+template< typename VI, typename HW_BASE, typename HW, typename BASE >
+class Fa_t : public BASE
+{
+public:
+ Fa_t() : BASE(&typeid(HW)) {}
+ VI *do_match(HW_BASE *) { return 0; }
+};
+
+template< typename VI, typename HW >
+class Resource_factory_t : public Fa_t<VI, Res, HW, R_fac > {};
+
+class Foo {};
+class Foo2;
+class Foo3 : public Res {};
+Resource_factory_t<Foo3, Foo> _x;
diff --git a/gcc/testsuite/g++.dg/ubsan/float-cast-overflow-bf.C b/gcc/testsuite/g++.dg/ubsan/float-cast-overflow-bf.C
index f01c576c3db..385a109c359 100644
--- a/gcc/testsuite/g++.dg/ubsan/float-cast-overflow-bf.C
+++ b/gcc/testsuite/g++.dg/ubsan/float-cast-overflow-bf.C
@@ -52,11 +52,11 @@ main (void)
return 0;
}
-/* { dg-output "value -2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 4.29497e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 4.29497e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type" } */
+/* { dg-output " -2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2.14748e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 4.29497e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 4.29497e\\\+09 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type" } */
diff --git a/gcc/testsuite/g++.dg/ubsan/pr82353-2-aux.cc b/gcc/testsuite/g++.dg/ubsan/pr82353-2-aux.cc
new file mode 100644
index 00000000000..75d466b39bb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ubsan/pr82353-2-aux.cc
@@ -0,0 +1,32 @@
+// PR sanitizer/82353
+
+#include "pr82353-2.h"
+
+B a;
+E b;
+B C::c0;
+unsigned D::d0;
+
+void
+foo ()
+{
+ a.b1 = p.f2.e2.b1 = 5;
+}
+
+void
+bar ()
+{
+ int c = p.f2.e4.d1.a0 - -~p.f4 * 89;
+ q.c0.b0 = i > g * a.b0 * h - k % a.b1;
+ if ((~(m * j) && -~p.f4 * 90284000534361) % ~m * j)
+ b.e2.b0 << l << f;
+ o = -~p.f4 * 89;
+ int d = p.f4;
+ if (b.e2.b0)
+ b.e2.b1 = c;
+ bool e = ~-~p.f4;
+ a.b1 % e;
+ if (k / p.f2.e2.b1)
+ b.e4.d0 = g * a.b0 * h;
+ n = j;
+}
diff --git a/gcc/testsuite/g++.dg/ubsan/pr82353-2.C b/gcc/testsuite/g++.dg/ubsan/pr82353-2.C
new file mode 100644
index 00000000000..31a35ac3a02
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ubsan/pr82353-2.C
@@ -0,0 +1,20 @@
+// PR sanitizer/82353
+// { dg-do run }
+// { dg-options "-fsanitize=undefined -fno-sanitize-recover=undefined -std=c++11 -O2 -w" }
+// { dg-additional-sources "pr82353-2-aux.cc" }
+
+#include "pr82353-2.h"
+
+unsigned long f, g;
+bool h, k, j, i;
+unsigned char l, m;
+short n;
+unsigned o;
+F p;
+
+int
+main ()
+{
+ foo ();
+ bar ();
+}
diff --git a/gcc/testsuite/g++.dg/ubsan/pr82353-2.h b/gcc/testsuite/g++.dg/ubsan/pr82353-2.h
new file mode 100644
index 00000000000..4693d2299f2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ubsan/pr82353-2.h
@@ -0,0 +1,31 @@
+extern unsigned long f, g;
+extern bool h, i, j, k;
+extern unsigned char l, m;
+extern short n;
+extern unsigned o;
+struct B {
+ short b0 : 27;
+ long b1 : 10;
+};
+struct A {
+ int a0 : 5;
+};
+struct C {
+ static B c0;
+};
+struct D {
+ static unsigned d0;
+ A d1;
+};
+struct E {
+ B e2;
+ D e4;
+};
+struct F {
+ E f2;
+ short f4;
+};
+extern F p;
+extern C q;
+void foo ();
+void bar ();
diff --git a/gcc/testsuite/g++.dg/vect/slp-pr56812.cc b/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
index 53032330142..8b24b337efa 100644
--- a/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
+++ b/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
@@ -17,7 +17,6 @@ void mydata::Set (float x)
data[i] = x;
}
-/* 256-bit vectors will be handled by loop vectorisation instead, since there
- is no prologue or epilogue that would raise the cost. SLP isn't yet
- possible with variable-length vectors. */
-/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp1" { xfail { { vect256 || vect_variable_length } || aarch64*-*-* } } } } */
+/* For targets without vector loop peeling the loop becomes cheap
+ enough to be vectorized. */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp1" { xfail { ! vect_peeling_profitable } } } } */
diff --git a/gcc/testsuite/g++.dg/warn/Wbuiltin_declaration_mismatch-1.C b/gcc/testsuite/g++.dg/warn/Wbuiltin_declaration_mismatch-1.C
new file mode 100644
index 00000000000..713073cb421
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wbuiltin_declaration_mismatch-1.C
@@ -0,0 +1,7 @@
+// PR c++/82466
+// { dg-options "-Wbuiltin-declaration-mismatch" }
+
+namespace N
+{
+ int printf;
+}
diff --git a/gcc/testsuite/g++.dg/warn/Wreturn-local-addr-4.C b/gcc/testsuite/g++.dg/warn/Wreturn-local-addr-4.C
new file mode 100644
index 00000000000..492dcb9e76f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wreturn-local-addr-4.C
@@ -0,0 +1,18 @@
+// PR c++/82600
+// { dg-do compile }
+
+void *b[10];
+
+template <int N>
+void **
+foo (int x)
+{
+ void **a = b; // { dg-bogus "address of local variable 'a' returned" }
+ return &a[x];
+}
+
+void **
+bar (int x)
+{
+ return foo <0> (x);
+}
diff --git a/gcc/testsuite/g++.dg/warn/pr82710.C b/gcc/testsuite/g++.dg/warn/pr82710.C
new file mode 100644
index 00000000000..93585eacf99
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/pr82710.C
@@ -0,0 +1,48 @@
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-Wparentheses -Wno-non-template-friend" }
+
+// The MVP warning triggered on a friend decl.
+class X;
+enum class Q {}; // C++ 11ness
+enum R {};
+
+namespace here
+{
+ // these friends
+ X friendFunc1();
+ X *friendFunc2 ();
+ int friendFunc3 ();
+ int bob ();
+ Q bill ();
+ R ben ();
+}
+
+namespace nm
+{
+ namespace here
+ {
+ // Not these friends
+ void friendFunc1 ();
+ void friendFunc2 ();
+ void friendFunc3 ();
+ int bob ();
+ Q bill ();
+ R ben ();
+ }
+
+ class TestClass
+ {
+ friend X (::here::friendFunc1 ()); // parens are needed
+ friend X *(::here::friendFunc2 ()); // { dg-warning "" }
+ friend X *::here::friendFunc2 ();
+ friend int (::here::friendFunc3 ()); // { dg-warning "" }
+ };
+
+ template <typename T> class X
+ {
+ friend typename T::frob (::here::bob ());
+ friend Q (::here::bill ());
+ friend R (::here::ben ());
+ };
+}
+
diff --git a/gcc/testsuite/g++.old-deja/g++.jason/operator.C b/gcc/testsuite/g++.old-deja/g++.jason/operator.C
index 339e6a447b4..bdcd5493a97 100644
--- a/gcc/testsuite/g++.old-deja/g++.jason/operator.C
+++ b/gcc/testsuite/g++.old-deja/g++.jason/operator.C
@@ -9,7 +9,7 @@ struct A {
static int operator()(int a); // { dg-error "must be a nonstatic member" }
static int operator+(A,A); // { dg-error "either a non-static member" }
int operator+(int a, int b = 1); // { dg-error "either zero or one" }
- int operator++(char); // { dg-error "must take 'int'" }
+ int operator++(char); // { dg-error "must have 'int'" }
void operator delete (void *);
void operator delete (void *, unsigned long);
};
diff --git a/gcc/testsuite/g++.old-deja/g++.mike/p811.C b/gcc/testsuite/g++.old-deja/g++.mike/p811.C
index 5c8260aa1f8..2ca04abdcba 100644
--- a/gcc/testsuite/g++.old-deja/g++.mike/p811.C
+++ b/gcc/testsuite/g++.old-deja/g++.mike/p811.C
@@ -1,5 +1,5 @@
// { dg-do assemble }
-// { dg-options "" }
+// { dg-options "-Wno-builtin-declaration-mismatch" }
// This test case caused the compiler to abort at one point in time.
// prms-id: 811
diff --git a/gcc/testsuite/g++.target/aarch64/aarch64.exp b/gcc/testsuite/g++.target/aarch64/aarch64.exp
new file mode 100644
index 00000000000..5eaa8725c9d
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/aarch64.exp
@@ -0,0 +1,38 @@
+# Specific regression driver for AArch64.
+# Copyright (C) 2009-2017 Free Software Foundation, Inc.
+# Contributed by ARM Ltd.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3. If not see
+# <http://www.gnu.org/licenses/>. */
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an AArch64 target.
+if {![istarget aarch64*-*-*] } then {
+ return
+}
+
+# Load support procs.
+load_lib g++-dg.exp
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C]] "" ""
+
+# All done.
+dg-finish
diff --git a/gcc/testsuite/g++.target/aarch64/sve_catch_1.C b/gcc/testsuite/g++.target/aarch64/sve_catch_1.C
new file mode 100644
index 00000000000..48b007fc1be
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve_catch_1.C
@@ -0,0 +1,70 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+/* Invoke X (P##n) for n in [0, 7]. */
+#define REPEAT8(X, P) \
+ X (P##0) X (P##1) X (P##2) X (P##3) X (P##4) X (P##5) X (P##6) X (P##7)
+
+/* Invoke X (n) for all octal n in [0, 39]. */
+#define REPEAT40(X) \
+ REPEAT8 (X, 0) REPEAT8 (X, 1) REPEAT8 (X, 2) REPEAT8 (X, 3) REPEAT8 (X, 4)
+
+volatile int testi;
+
+/* Throw to f3. */
+void __attribute__ ((weak))
+f1 (int x[40][100], int *y)
+{
+ /* A wild write to x and y. */
+ asm volatile ("" ::: "memory");
+ if (y[testi] == x[testi][testi])
+ throw 100;
+}
+
+/* Expect vector work to be done, with spilling of vector registers. */
+void __attribute__ ((weak))
+f2 (int x[40][100], int *y)
+{
+ /* Try to force some spilling. */
+#define DECLARE(N) int y##N = y[N];
+ REPEAT40 (DECLARE);
+ for (int j = 0; j < 20; ++j)
+ {
+ f1 (x, y);
+#pragma omp simd
+ for (int i = 0; i < 100; ++i)
+ {
+#define INC(N) x[N][i] += y##N;
+ REPEAT40 (INC);
+ }
+ }
+}
+
+/* Catch an exception thrown from f1, via f2. */
+void __attribute__ ((weak))
+f3 (int x[40][100], int *y, int *z)
+{
+ volatile int extra = 111;
+ try
+ {
+ f2 (x, y);
+ }
+ catch (int val)
+ {
+ *z = val + extra;
+ }
+}
+
+static int x[40][100];
+static int y[40];
+static int z;
+
+int
+main (void)
+{
+ f3 (x, y, &z);
+ if (z != 211)
+ __builtin_abort ();
+ return 0;
+}
diff --git a/gcc/testsuite/g++.target/aarch64/sve_catch_2.C b/gcc/testsuite/g++.target/aarch64/sve_catch_2.C
new file mode 100644
index 00000000000..4acdefd235a
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve_catch_2.C
@@ -0,0 +1,5 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+#include "sve_catch_1.C"
diff --git a/gcc/testsuite/g++.target/aarch64/sve_catch_3.C b/gcc/testsuite/g++.target/aarch64/sve_catch_3.C
new file mode 100644
index 00000000000..b7e701668e5
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve_catch_3.C
@@ -0,0 +1,79 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+/* Invoke X (P##n) for n in [0, 7]. */
+#define REPEAT8(X, P) \
+ X (P##0) X (P##1) X (P##2) X (P##3) X (P##4) X (P##5) X (P##6) X (P##7)
+
+/* Invoke X (n) for all octal n in [0, 39]. */
+#define REPEAT40(X) \
+ REPEAT8 (X, 0) REPEAT8 (X, 1) REPEAT8 (X, 2) REPEAT8 (X, 3) REPEAT8 (X, 4)
+
+volatile int testi, sink;
+
+/* Take 2 stack arguments and throw to f3. */
+void __attribute__ ((weak))
+f1 (int x[40][100], int *y, int z1, int z2, int z3, int z4,
+ int z5, int z6, int z7, int z8)
+{
+ /* A wild write to x and y. */
+ sink = z1;
+ sink = z2;
+ sink = z3;
+ sink = z4;
+ sink = z5;
+ sink = z6;
+ sink = z7;
+ sink = z8;
+ asm volatile ("" ::: "memory");
+ if (y[testi] == x[testi][testi])
+ throw 100;
+}
+
+/* Expect vector work to be done, with spilling of vector registers. */
+void __attribute__ ((weak))
+f2 (int x[40][100], int *y)
+{
+ /* Try to force some spilling. */
+#define DECLARE(N) int y##N = y[N];
+ REPEAT40 (DECLARE);
+ for (int j = 0; j < 20; ++j)
+ {
+ f1 (x, y, 1, 2, 3, 4, 5, 6, 7, 8);
+#pragma omp simd
+ for (int i = 0; i < 100; ++i)
+ {
+#define INC(N) x[N][i] += y##N;
+ REPEAT40 (INC);
+ }
+ }
+}
+
+/* Catch an exception thrown from f1, via f2. */
+void __attribute__ ((weak))
+f3 (int x[40][100], int *y, int *z)
+{
+ volatile int extra = 111;
+ try
+ {
+ f2 (x, y);
+ }
+ catch (int val)
+ {
+ *z = val + extra;
+ }
+}
+
+static int x[40][100];
+static int y[40];
+static int z;
+
+int
+main (void)
+{
+ f3 (x, y, &z);
+ if (z != 211)
+ __builtin_abort ();
+ return 0;
+}
diff --git a/gcc/testsuite/g++.target/aarch64/sve_catch_4.C b/gcc/testsuite/g++.target/aarch64/sve_catch_4.C
new file mode 100644
index 00000000000..cb75672e6b6
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve_catch_4.C
@@ -0,0 +1,5 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+#include "sve_catch_3.C"
diff --git a/gcc/testsuite/g++.target/aarch64/sve_catch_5.C b/gcc/testsuite/g++.target/aarch64/sve_catch_5.C
new file mode 100644
index 00000000000..7d0d430fd91
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve_catch_5.C
@@ -0,0 +1,82 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+/* Invoke X (P##n) for n in [0, 7]. */
+#define REPEAT8(X, P) \
+ X (P##0) X (P##1) X (P##2) X (P##3) X (P##4) X (P##5) X (P##6) X (P##7)
+
+/* Invoke X (n) for all octal n in [0, 39]. */
+#define REPEAT40(X) \
+ REPEAT8 (X, 0) REPEAT8 (X, 1) REPEAT8 (X, 2) REPEAT8 (X, 3) REPEAT8 (X, 4)
+
+volatile int testi, sink;
+volatile void *ptr;
+
+/* Take 2 stack arguments and throw to f3. */
+void __attribute__ ((weak))
+f1 (int x[40][100], int *y, int z1, int z2, int z3, int z4,
+ int z5, int z6, int z7, int z8)
+{
+ /* A wild write to x and y. */
+ sink = z1;
+ sink = z2;
+ sink = z3;
+ sink = z4;
+ sink = z5;
+ sink = z6;
+ sink = z7;
+ sink = z8;
+ asm volatile ("" ::: "memory");
+ if (y[testi] == x[testi][testi])
+ throw 100;
+}
+
+/* Expect vector work to be done, with spilling of vector registers. */
+void __attribute__ ((weak))
+f2 (int x[40][100], int *y)
+{
+ /* Create a true variable-sized frame. */
+ ptr = __builtin_alloca (testi + 40);
+ /* Try to force some spilling. */
+#define DECLARE(N) int y##N = y[N];
+ REPEAT40 (DECLARE);
+ for (int j = 0; j < 20; ++j)
+ {
+ f1 (x, y, 1, 2, 3, 4, 5, 6, 7, 8);
+#pragma omp simd
+ for (int i = 0; i < 100; ++i)
+ {
+#define INC(N) x[N][i] += y##N;
+ REPEAT40 (INC);
+ }
+ }
+}
+
+/* Catch an exception thrown from f1, via f2. */
+void __attribute__ ((weak))
+f3 (int x[40][100], int *y, int *z)
+{
+ volatile int extra = 111;
+ try
+ {
+ f2 (x, y);
+ }
+ catch (int val)
+ {
+ *z = val + extra;
+ }
+}
+
+static int x[40][100];
+static int y[40];
+static int z;
+
+int
+main (void)
+{
+ f3 (x, y, &z);
+ if (z != 211)
+ __builtin_abort ();
+ return 0;
+}
diff --git a/gcc/testsuite/g++.target/aarch64/sve_catch_6.C b/gcc/testsuite/g++.target/aarch64/sve_catch_6.C
new file mode 100644
index 00000000000..184d7ee111e
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve_catch_6.C
@@ -0,0 +1,5 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+#include "sve_catch_5.C"
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr82549.c b/gcc/testsuite/gcc.c-torture/compile/pr82549.c
new file mode 100644
index 00000000000..11525cde032
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr82549.c
@@ -0,0 +1,9 @@
+/* PR tree-optimization/82549 */
+
+int a, b[1];
+
+int
+main ()
+{
+ return !a || b[-2] || b[-2];
+}
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr82816.c b/gcc/testsuite/gcc.c-torture/compile/pr82816.c
new file mode 100644
index 00000000000..8e9bd001bac
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr82816.c
@@ -0,0 +1,12 @@
+struct A
+{
+ int b:3;
+} d, e;
+
+int c;
+
+void f ()
+{
+ char g = d.b * e.b;
+ c = g;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/20030209-1.c b/gcc/testsuite/gcc.c-torture/execute/20030209-1.c
index 8f076ecb0c7..52f71ec3543 100644
--- a/gcc/testsuite/gcc.c-torture/execute/20030209-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20030209-1.c
@@ -1,12 +1,5 @@
-/* { dg-add-options stack_size } */
+/* { dg-require-stack-size "8*100*100" } */
-#ifdef STACK_SIZE
-#if STACK_SIZE < 8*100*100
-#define SKIP
-#endif
-#endif
-
-#ifndef SKIP
double x[100][100];
int main ()
{
@@ -18,10 +11,3 @@ int main ()
abort ();
exit (0);
}
-#else
-int
-main ()
-{
- exit (0);
-}
-#endif
diff --git a/gcc/testsuite/gcc.c-torture/execute/20040805-1.c b/gcc/testsuite/gcc.c-torture/execute/20040805-1.c
index d3208d69f9d..f31109266b1 100644
--- a/gcc/testsuite/gcc.c-torture/execute/20040805-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20040805-1.c
@@ -1,6 +1,6 @@
-/* { dg-add-options stack_size } */
+/* { dg-require-stack-size "0x12000" } */
-#if __INT_MAX__ < 32768 || (defined(STACK_SIZE) && STACK_SIZE < 0x12000)
+#if __INT_MAX__ < 32768
int main () { exit (0); }
#else
int a[2] = { 2, 3 };
diff --git a/gcc/testsuite/gcc.c-torture/execute/920410-1.c b/gcc/testsuite/gcc.c-torture/execute/920410-1.c
index 44a72bd7bb5..daeff5e3990 100644
--- a/gcc/testsuite/gcc.c-torture/execute/920410-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/920410-1.c
@@ -1,8 +1,4 @@
-/* { dg-add-options stack_size } */
+/* { dg-require-stack-size "40000 * 4 + 256" } */
-#define STACK_REQUIREMENT (40000 * 4 + 256)
-#if defined (STACK_SIZE) && STACK_SIZE < STACK_REQUIREMENT
-main () { exit (0); }
-#else
main(){int d[40000];d[0]=0;exit(0);}
-#endif
+
diff --git a/gcc/testsuite/gcc.c-torture/execute/921113-1.c b/gcc/testsuite/gcc.c-torture/execute/921113-1.c
index d3e44e358d2..824e69f04c4 100644
--- a/gcc/testsuite/gcc.c-torture/execute/921113-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/921113-1.c
@@ -1,9 +1,4 @@
-/* { dg-add-options stack_size } */
-
-#define STACK_REQUIREMENT (128 * 128 * 4 + 1024)
-#if defined (STACK_SIZE) && STACK_SIZE < STACK_REQUIREMENT
-main () { exit (0); }
-#else
+/* { dg-require-stack-size "128 * 128 * 4 + 1024" } */
typedef struct {
float wsx;
@@ -62,4 +57,3 @@ main()
exit(0);
}
-#endif
diff --git a/gcc/testsuite/gcc.c-torture/execute/921208-2.c b/gcc/testsuite/gcc.c-torture/execute/921208-2.c
index da9ee524924..01e14f8cffe 100644
--- a/gcc/testsuite/gcc.c-torture/execute/921208-2.c
+++ b/gcc/testsuite/gcc.c-torture/execute/921208-2.c
@@ -1,10 +1,5 @@
/* { dg-require-effective-target untyped_assembly } */
-/* { dg-add-options stack_size } */
-
-#define STACK_REQUIREMENT (100000 * 4 + 1024)
-#if defined (STACK_SIZE) && STACK_SIZE < STACK_REQUIREMENT
-main () { exit (0); }
-#else
+/* { dg-require-stack-size "100000 * 4 + 1024" } */
g(){}
@@ -25,5 +20,3 @@ main ()
f();
exit(0);
}
-
-#endif
diff --git a/gcc/testsuite/gcc.c-torture/execute/comp-goto-1.c b/gcc/testsuite/gcc.c-torture/execute/comp-goto-1.c
index 2a840521487..4379fe70e9c 100644
--- a/gcc/testsuite/gcc.c-torture/execute/comp-goto-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/comp-goto-1.c
@@ -1,9 +1,9 @@
/* { dg-require-effective-target label_values } */
-/* { dg-add-options stack_size } */
+/* { dg-require-stack-size "4000" } */
#include <stdlib.h>
-#if (!defined(STACK_SIZE) || STACK_SIZE >= 4000) && __INT_MAX__ >= 2147483647
+#if __INT_MAX__ >= 2147483647
typedef unsigned int uint32;
typedef signed int sint32;
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr20621-1.c b/gcc/testsuite/gcc.c-torture/execute/pr20621-1.c
index 9d0119b9689..b2a9785cd6f 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr20621-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr20621-1.c
@@ -1,12 +1,9 @@
-/* { dg-add-options stack_size } */
+/* { dg-require-stack-size "0x10000" } */
/* When generating o32 MIPS PIC, main's $gp save slot was out of range
of a single load instruction. */
struct big { int i[sizeof (int) >= 4 && sizeof (void *) >= 4 ? 0x4000 : 4]; };
struct big gb;
int foo (struct big b, int x) { return b.i[x]; }
-#if defined(STACK_SIZE) && STACK_SIZE <= 0x10000
-int main (void) { return 0; }
-#else
int main (void) { return foo (gb, 0) + foo (gb, 1); }
-#endif
+
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr28982b.c b/gcc/testsuite/gcc.c-torture/execute/pr28982b.c
index f28425e8fd7..b68fa9a7051 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr28982b.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr28982b.c
@@ -1,11 +1,8 @@
-/* { dg-add-options stack_size } */
+/* { dg-require-stack-size "0x80100" } */
/* Like pr28982a.c, but with the spill slots outside the range of
a single sp-based load on ARM. This test tests for cases where
the addresses in the base and index reloads require further reloads. */
-#if defined(STACK_SIZE) && STACK_SIZE <= 0x80100
-int main (void) { return 0; }
-#else
#define NITER 4
#define NVARS 20
#define MULTI(X) \
@@ -57,4 +54,3 @@ main (void)
return 1;
return 0;
}
-#endif
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr81423.c b/gcc/testsuite/gcc.c-torture/execute/pr81423.c
index 731aa8f1c65..be7413be334 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr81423.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr81423.c
@@ -1,3 +1,5 @@
+/* PR rtl-optimization/81423 */
+
extern void abort (void);
unsigned long long int ll = 0;
@@ -10,11 +12,11 @@ foo (void)
{
ll = -5597998501375493990LL;
- ll = (5677365550390624949L - ll) - (ull1 > 0);
+ ll = (unsigned int) (5677365550390624949LL - ll) - (ull1 > 0);
unsigned long long int ull3;
ull3 = (unsigned int)
- (2067854353L <<
- (((ll + -2129105131L) ^ 10280750144413668236ULL) -
+ (2067854353LL <<
+ (((ll + -2129105131LL) ^ 10280750144413668236ULL) -
10280750143997242009ULL)) >> ((2873442921854271231ULL | ull2)
- 12098357307243495419ULL);
@@ -24,9 +26,10 @@ foo (void)
int
main (void)
{
- /* We need a long long of exactly 64 bits for this test. */
- ll--;
- if (ll != 0xffffffffffffffffULL)
+ /* We need a long long of exactly 64 bits and int of exactly 32 bits
+ for this test. */
+ if (__SIZEOF_LONG_LONG__ * __CHAR_BIT__ != 64
+ || __SIZEOF_INT__ * __CHAR_BIT__ != 32)
return 0;
ull3 = foo ();
diff --git a/gcc/testsuite/gcc.dg/Walloca-15.c b/gcc/testsuite/gcc.dg/Walloca-15.c
new file mode 100644
index 00000000000..f34ffd98b61
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Walloca-15.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target alloca } */
+/* { dg-options "-Walloca-larger-than=128 -O2" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+void bar (void*);
+
+void foo1 (size_t len)
+{
+ bar (__builtin_alloca_with_align_and_max (len, 8, 128));
+}
+
+void foo2 (size_t len)
+{
+ bar (__builtin_alloca_with_align_and_max (len, 8, 256)); /* { dg-warning "may be too large" } */
+}
diff --git a/gcc/testsuite/gcc.dg/asan/pr82517.c b/gcc/testsuite/gcc.dg/asan/pr82517.c
new file mode 100644
index 00000000000..c7743ecb8b1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asan/pr82517.c
@@ -0,0 +1,43 @@
+/* PR sanitizer/82517. */
+
+static int *pp;
+
+void
+baz ()
+{
+ return;
+}
+
+void
+bar (int *p)
+{
+ *p = 1;
+}
+
+void
+foo (int a)
+{
+ if (a == 2)
+ {
+ lab:
+ baz ();
+ return;
+ }
+ if (a > 1)
+ {
+ int x __attribute__ ((aligned (256)));
+ pp = &x;
+ bar (&x);
+ if (!x)
+ goto lab;
+ }
+}
+
+int
+main (int argc, char **argv)
+{
+ foo (4);
+ foo (3);
+
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/asan/pr82545.c b/gcc/testsuite/gcc.dg/asan/pr82545.c
new file mode 100644
index 00000000000..8870db3653f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asan/pr82545.c
@@ -0,0 +1,17 @@
+/* PR sanitizer/82545. */
+/* { dg-do compile } */
+
+extern void c(int);
+extern void d(void);
+
+void *buf[5];
+
+void a(void) {
+ {
+ int b;
+ &b;
+ __builtin_setjmp(buf);
+ c(b);
+ }
+ d();
+}
diff --git a/gcc/testsuite/gcc.dg/attr-alloc_size-11.c b/gcc/testsuite/gcc.dg/attr-alloc_size-11.c
index fe6154a0d77..6e109955183 100644
--- a/gcc/testsuite/gcc.dg/attr-alloc_size-11.c
+++ b/gcc/testsuite/gcc.dg/attr-alloc_size-11.c
@@ -47,8 +47,8 @@ typedef __SIZE_TYPE__ size_t;
/* The following tests fail because of missing range information. The xfail
exclusions are PR79356. */
-TEST (signed char, SCHAR_MIN + 2, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" "missing range info for signed char" { xfail { ! { aarch64*-*-* arm*-*-* alpha*-*-* ia64-*-* mips*-*-* powerpc*-*-* sparc*-*-* s390*-*-* } } } } */
-TEST (short, SHRT_MIN + 2, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" "missing range info for short" { xfail { ! { aarch64*-*-* arm*-*-* alpha*-*-* ia64-*-* mips*-*-* powerpc*-*-* sparc*-*-* s390x-*-* } } } } */
+TEST (signed char, SCHAR_MIN + 2, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" "missing range info for signed char" { xfail { ! { aarch64*-*-* arm*-*-* alpha*-*-* ia64-*-* mips*-*-* powerpc*-*-* sparc*-*-* s390*-*-* visium-*-* } } } } */
+TEST (short, SHRT_MIN + 2, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" "missing range info for short" { xfail { ! { aarch64*-*-* arm*-*-* alpha*-*-* ia64-*-* mips*-*-* powerpc*-*-* sparc*-*-* s390x-*-* visium-*-* } } } } */
TEST (int, INT_MIN + 2, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" } */
TEST (int, -3, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" } */
TEST (int, -2, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" } */
diff --git a/gcc/testsuite/gcc.dg/c17-version-1.c b/gcc/testsuite/gcc.dg/c17-version-1.c
new file mode 100644
index 00000000000..4e69a6eec11
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c17-version-1.c
@@ -0,0 +1,9 @@
+/* Test __STDC_VERSION__ for C17. Test -std=c17. */
+/* { dg-do compile } */
+/* { dg-options "-std=c17 -pedantic-errors" } */
+
+#if __STDC_VERSION__ == 201710L
+int i;
+#else
+#error "Bad __STDC_VERSION__."
+#endif
diff --git a/gcc/testsuite/gcc.dg/c17-version-2.c b/gcc/testsuite/gcc.dg/c17-version-2.c
new file mode 100644
index 00000000000..3f367204094
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c17-version-2.c
@@ -0,0 +1,9 @@
+/* Test __STDC_VERSION__ for C17. Test -std=iso9899:2017. */
+/* { dg-do compile } */
+/* { dg-options "-std=iso9899:2017 -pedantic-errors" } */
+
+#if __STDC_VERSION__ == 201710L
+int i;
+#else
+#error "Bad __STDC_VERSION__."
+#endif
diff --git a/gcc/testsuite/gcc.dg/c90-const-expr-11.c b/gcc/testsuite/gcc.dg/c90-const-expr-11.c
index e4f2aff7874..a2720c47bf4 100644
--- a/gcc/testsuite/gcc.dg/c90-const-expr-11.c
+++ b/gcc/testsuite/gcc.dg/c90-const-expr-11.c
@@ -20,7 +20,7 @@ f (void)
/* Overflow. */
struct t b = { INT_MAX + 1 }; /* { dg-warning "integer overflow in expression" } */
/* { dg-error "overflow in constant expression" "constant" { target *-*-* } .-1 } */
- struct t c = { DBL_MAX }; /* { dg-warning "overflow in conversion from .double. to .int. chages value " } */
+ struct t c = { DBL_MAX }; /* { dg-warning "overflow in conversion from .double. to .int. changes value " } */
/* { dg-error "overflow in constant expression" "constant" { target *-*-* } .-1 } */
/* Bad operator outside sizeof. */
struct s d = { 1 ? 1.0 : atan (a.d) }; /* { dg-error "is not a constant expression|near initialization" } */
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/asm-line1.c b/gcc/testsuite/gcc.dg/debug/dwarf2/asm-line1.c
index 3773e1c83c3..aebfcad6008 100644
--- a/gcc/testsuite/gcc.dg/debug/dwarf2/asm-line1.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/asm-line1.c
@@ -1,6 +1,6 @@
/* PR debug/50983 */
/* { dg-do compile { target *-*-gnu* } } */
-/* { dg-options "-O0 -gdwarf" } */
+/* { dg-options "-O0 -gdwarf -gno-column-info" } */
/* { dg-final { scan-assembler "is_stmt 1" } } */
int i;
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/discriminator.c b/gcc/testsuite/gcc.dg/debug/dwarf2/discriminator.c
index b77f7b1bfff..fa24de8d7d4 100644
--- a/gcc/testsuite/gcc.dg/debug/dwarf2/discriminator.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/discriminator.c
@@ -1,7 +1,7 @@
/* HAVE_AS_DWARF2_DEBUG_LINE macro needs to be defined to pass the unittest.
However, dg cannot access it, so we restrict to GNU targets. */
/* { dg-do compile { target *-*-gnu* } } */
-/* { dg-options "-O0 -gdwarf" } */
+/* { dg-options "-O0 -gdwarf -gno-column-info" } */
/* { dg-final { scan-assembler "loc \[0-9] 11 \[0-9]( is_stmt \[0-9])?\n" } } */
/* { dg-final { scan-assembler "loc \[0-9] 11 \[0-9]( is_stmt \[0-9])? discriminator 2\n" } } */
/* { dg-final { scan-assembler "loc \[0-9] 11 \[0-9]( is_stmt \[0-9])? discriminator 1\n" } } */
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/pr53948.c b/gcc/testsuite/gcc.dg/debug/dwarf2/pr53948.c
index 0ec3e84d704..4485e19c1cd 100644
--- a/gcc/testsuite/gcc.dg/debug/dwarf2/pr53948.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/pr53948.c
@@ -1,7 +1,7 @@
/* Test that we have line information for the line
with local variable initializations. */
/* { dg-options "-O0 -gdwarf -dA" } */
-/* { dg-final { scan-assembler ".loc 1 8 0|\[#/!\]\[ \t\]+line 8" } } */
+/* { dg-final { scan-assembler ".loc 1 8 \[0-9\]|\[#/!\]\[ \t\]+line 8" } } */
int f (register int a, register int b) {
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/sso.c b/gcc/testsuite/gcc.dg/debug/dwarf2/sso-1.c
index 698c636a130..698c636a130 100644
--- a/gcc/testsuite/gcc.dg/debug/dwarf2/sso.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/sso-1.c
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/sso-2.c b/gcc/testsuite/gcc.dg/debug/dwarf2/sso-2.c
new file mode 100644
index 00000000000..0965084d260
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/sso-2.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-gdwarf-3 -dA" } */
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+#define REVERSE_SSO __attribute__((scalar_storage_order("big-endian")));
+#else
+#define REVERSE_SSO __attribute__((scalar_storage_order("little-endian")));
+#endif
+
+struct reverse
+{
+ int i;
+ short a[4];
+} REVERSE_SSO;
+
+struct native
+{
+ int i;
+ short a[4];
+};
+
+struct reverse R;
+struct native N;
+
+/* Verify that we have endianity on the common base type of 'i' and the
+ * element of 'a' in the first 2 structures. */
+/* { dg-final { scan-assembler-times " DW_AT_endianity" 2 } } */
+/* { dg-final { scan-assembler-times "DIE \\(\[0-9a-z\]*\\) DW_TAG_base_type" 5 } } */
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/sso-3.c b/gcc/testsuite/gcc.dg/debug/dwarf2/sso-3.c
new file mode 100644
index 00000000000..004327c78ad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/sso-3.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-gdwarf-3 -dA" } */
+
+typedef int int_t;
+typedef short short_t;
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+#define REVERSE_SSO __attribute__((scalar_storage_order("big-endian")));
+#else
+#define REVERSE_SSO __attribute__((scalar_storage_order("little-endian")));
+#endif
+
+struct reverse
+{
+ int_t i;
+ short_t a[4];
+} REVERSE_SSO;
+
+struct native
+{
+ int_t i;
+ short_t a[4];
+};
+
+struct reverse R;
+struct native N;
+
+/* Verify that we have endianity on the common base type of 'i' and the
+ * element of 'a' in the first 2 structures. */
+/* { dg-final { scan-assembler-times " DW_AT_endianity" 2 } } */
+/* { dg-final { scan-assembler-times "DIE \\(\[0-9a-z\]*\\) DW_TAG_base_type" 5 } } */
diff --git a/gcc/testsuite/gcc.dg/fold-cond_expr-1.c b/gcc/testsuite/gcc.dg/fold-cond-2.c
index 68ec75480ad..68ec75480ad 100644
--- a/gcc/testsuite/gcc.dg/fold-cond_expr-1.c
+++ b/gcc/testsuite/gcc.dg/fold-cond-2.c
diff --git a/gcc/testsuite/gcc.dg/fold-cond-3.c b/gcc/testsuite/gcc.dg/fold-cond-3.c
new file mode 100644
index 00000000000..fe0ba65ebac
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-cond-3.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-fdump-tree-original" } */
+
+unsigned long f1 (int x)
+{
+ return x > 0 ? (unsigned long) x : 0;
+}
+
+unsigned long f2 (int x, int y)
+{
+ return x > y ? (unsigned long) x : (unsigned long) y;
+}
+
+unsigned long f3 (int x)
+{
+ return x < 0 ? (unsigned long) x : 0;
+}
+
+unsigned long f4 (int x, int y)
+{
+ return x < y ? (unsigned long) x : (unsigned long) y;
+}
+
+unsigned long f5 (unsigned int x, unsigned int y)
+{
+ return x > y ? (unsigned long) x : (unsigned long) y;
+}
+
+unsigned long f6 (unsigned int x, unsigned int y)
+{
+ return x < y ? (unsigned long) x : (unsigned long) y;
+}
+
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 3 "original"} } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "original"} } */
diff --git a/gcc/testsuite/gcc.dg/gimplefe-27.c b/gcc/testsuite/gcc.dg/gimplefe-27.c
new file mode 100644
index 00000000000..604a2cc2fcc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gimplefe-27.c
@@ -0,0 +1,9 @@
+/* { dg-options "-O -fgimple" } */
+
+int __GIMPLE ()
+p (int n)
+{
+ int _2;
+ _2 = n_1(D) != 0 ? 2 : 0;
+ return _2;
+}
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-3.c b/gcc/testsuite/gcc.dg/graphite/interchange-3.c
index 4aec824183a..cb93f5d0920 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-3.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-3.c
@@ -47,4 +47,4 @@ main (void)
return 0;
}
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
+/* { dg-final { scan-tree-dump "tiled" "graphite" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-7.c b/gcc/testsuite/gcc.dg/graphite/interchange-7.c
index 81a6d832327..81a0a4daf55 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-7.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-7.c
@@ -46,4 +46,4 @@ main (void)
return 0;
}
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
+/* { dg-final { scan-tree-dump "tiled" "graphite" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-9.c b/gcc/testsuite/gcc.dg/graphite/interchange-9.c
index 88a357893e9..75d269e4527 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-9.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-9.c
@@ -44,4 +44,4 @@ main (void)
return 0;
}
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
+/* { dg-final { scan-tree-dump "tiled" "graphite" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/pr35356-3.c b/gcc/testsuite/gcc.dg/graphite/pr35356-3.c
index f2827a2bb6d..8db042ffc6f 100644
--- a/gcc/testsuite/gcc.dg/graphite/pr35356-3.c
+++ b/gcc/testsuite/gcc.dg/graphite/pr35356-3.c
@@ -36,4 +36,5 @@ match (void)
"Y[winner].y > 0". This could be fixed when we will use predicates
for such cases. */
-/* { dg-final { scan-tree-dump-times "loop_1" 0 "graphite" } } */
+/* { dg-final { scan-tree-dump-times "loop_1" 0 "graphite" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "number of SCoPs: 0" "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/pr81373-2.c b/gcc/testsuite/gcc.dg/graphite/pr81373-2.c
new file mode 100644
index 00000000000..6a654bec977
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/pr81373-2.c
@@ -0,0 +1,40 @@
+/* { dg-options "-fno-tree-scev-cprop -floop-nest-optimize -fgraphite-identity -O -fdump-tree-graphite-all" } */
+
+void bar (void);
+
+int toto()
+{
+ int i, j, k;
+ int a[101][100];
+ int b[100];
+
+ for (i = 1; i < 100; i++)
+ {
+ for (j = 1; j < 100; j++)
+ for (k = 1; k < 100; k++)
+ a[j][k] = a[j+1][i-1] + 2;
+
+ b[i] = b[i-1] + 2;
+
+ bar ();
+
+ for (j = 1; j < 100; j++)
+ a[j][i] = a[j+1][i-1] + 2;
+
+ b[i] = b[i-1] + 2;
+
+ bar ();
+
+ for (j = 1; j < 100; j++)
+ a[j][i] = a[j+1][i-1] + 2;
+
+ b[i] = a[i-1][i] + 2;
+
+ for (j = 1; j < 100; j++)
+ a[j][i] = a[j+1][i-1] + 2;
+ }
+
+ return a[3][5] + b[1];
+}
+
+/* { dg-final { scan-tree-dump-times "number of SCoPs: 2" 1 "graphite"} } */
diff --git a/gcc/testsuite/gcc.dg/graphite/pr82563.c b/gcc/testsuite/gcc.dg/graphite/pr82563.c
new file mode 100644
index 00000000000..cd492fa79c8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/pr82563.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -floop-nest-optimize" } */
+
+int tj, cw, xf;
+
+void
+zp (int *ei)
+{
+ for (;;)
+ {
+ int hd = 0;
+
+ if (cw != 0 && xf != 0)
+ {
+ for (hd = 0; hd < 3; ++hd)
+ cw = (tj != 0) ? 0 : *ei;
+ for (;;)
+ ;
+ }
+
+ while (tj != 0)
+ tj = (__UINTPTR_TYPE__)&hd;
+ }
+}
diff --git a/gcc/testsuite/gcc.dg/graphite/scop-10.c b/gcc/testsuite/gcc.dg/graphite/scop-10.c
index 39ed5d7ea7b..20d53510b4e 100644
--- a/gcc/testsuite/gcc.dg/graphite/scop-10.c
+++ b/gcc/testsuite/gcc.dg/graphite/scop-10.c
@@ -4,7 +4,7 @@ int toto()
{
int i, j, k;
int a[100][100];
- int b[100];
+ int b[200];
for (i = 1; i < 100; i++)
{
diff --git a/gcc/testsuite/gcc.dg/graphite/scop-7.c b/gcc/testsuite/gcc.dg/graphite/scop-7.c
index 3e337d0c603..2f0a50470e9 100644
--- a/gcc/testsuite/gcc.dg/graphite/scop-7.c
+++ b/gcc/testsuite/gcc.dg/graphite/scop-7.c
@@ -4,7 +4,7 @@ int toto()
{
int i, j, k;
int a[100][100];
- int b[100];
+ int b[200];
for (i = 1; i < 100; i++)
{
diff --git a/gcc/testsuite/gcc.dg/graphite/scop-8.c b/gcc/testsuite/gcc.dg/graphite/scop-8.c
index 71d5c531fb8..3ceb5d874d6 100644
--- a/gcc/testsuite/gcc.dg/graphite/scop-8.c
+++ b/gcc/testsuite/gcc.dg/graphite/scop-8.c
@@ -4,7 +4,7 @@ int toto()
{
int i, j, k;
int a[100][100];
- int b[100];
+ int b[200];
for (i = 1; i < 100; i++)
{
diff --git a/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c b/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c
index cc108c2bbc3..fb36afe003e 100644
--- a/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c
+++ b/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c
@@ -45,4 +45,4 @@ main (void)
return 0;
}
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
+/* { dg-final { scan-tree-dump "tiled" "graphite" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/propmalloc-1.c b/gcc/testsuite/gcc.dg/ipa/propmalloc-1.c
new file mode 100644
index 00000000000..9a95f817079
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/propmalloc-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-ipa-pure-const-details" } */
+
+__attribute__((noinline, no_icf, used))
+static void *f(__SIZE_TYPE__ n)
+{
+ void *p = __builtin_malloc (n);
+ if (p == 0)
+ __builtin_abort ();
+ return p;
+}
+
+__attribute__((noinline, no_icf, used))
+static void *bar(__SIZE_TYPE__ n)
+{
+ void *p = f (n);
+ return p;
+}
+
+/* { dg-final { scan-ipa-dump "Function f found to be malloc" "pure-const" } } */
+/* { dg-final { scan-ipa-dump "Function bar found to be malloc" "pure-const" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/propmalloc-2.c b/gcc/testsuite/gcc.dg/ipa/propmalloc-2.c
new file mode 100644
index 00000000000..95b2fd74a7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/propmalloc-2.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-ipa-pure-const-details" } */
+
+__attribute__((noinline, used, no_icf))
+static void *foo (__SIZE_TYPE__ n)
+{
+ return __builtin_malloc (n * 10);
+}
+
+__attribute__((noinline, used, no_icf))
+static void *bar(__SIZE_TYPE__ n, int cond)
+{
+ void *p;
+ if (cond)
+ p = foo (n);
+ else
+ p = __builtin_malloc (n);
+
+ return p;
+}
+
+/* { dg-final { scan-ipa-dump "Function foo found to be malloc" "pure-const" } } */
+/* { dg-final { scan-ipa-dump "Function bar found to be malloc" "pure-const" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/propmalloc-3.c b/gcc/testsuite/gcc.dg/ipa/propmalloc-3.c
new file mode 100644
index 00000000000..13558ddd07d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/propmalloc-3.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-ipa-pure-const-details" } */
+
+static void *foo(__SIZE_TYPE__, int) __attribute__((noinline, no_icf, used));
+
+__attribute__((noinline, used, no_icf))
+static void *bar(__SIZE_TYPE__ n, int m)
+{
+ return foo (n, m);
+}
+
+static void *foo(__SIZE_TYPE__ n, int m)
+{
+ void *p;
+ if (m > 0)
+ p = bar (n, --m);
+ else
+ p = __builtin_malloc (n);
+
+ return p;
+}
+
+/* { dg-final { scan-ipa-dump "Function foo found to be malloc" "pure-const" } } */
+/* { dg-final { scan-ipa-dump "Function bar found to be malloc" "pure-const" } } */
diff --git a/gcc/testsuite/gcc.dg/no-strict-overflow-3.c b/gcc/testsuite/gcc.dg/no-strict-overflow-3.c
index fd4defbd447..d68008a3dde 100644
--- a/gcc/testsuite/gcc.dg/no-strict-overflow-3.c
+++ b/gcc/testsuite/gcc.dg/no-strict-overflow-3.c
@@ -9,7 +9,7 @@
int
foo (int i, int j)
{
- return i + 100 < j + 1000;
+ return i + 100 < j + 1234;
}
-/* { dg-final { scan-tree-dump "1000" "optimized" } } */
+/* { dg-final { scan-tree-dump "1234" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/noncompile/920923-1.c b/gcc/testsuite/gcc.dg/noncompile/920923-1.c
index 1cb140ebabc..006a07131f9 100644
--- a/gcc/testsuite/gcc.dg/noncompile/920923-1.c
+++ b/gcc/testsuite/gcc.dg/noncompile/920923-1.c
@@ -1,5 +1,6 @@
/* { dg-message "undeclared identifier is reported only once" "reminder for mmu_base" { target *-*-* } 0 } */
typedef BYTE unsigned char; /* { dg-error "expected" } */
+/* { dg-warning "useless type name in empty declaration" "" { target *-*-* } .-1 } */
typedef int item_n;
typedef int perm_set;
struct PENT { caddr_t v_addr; };/* { dg-error "unknown type name" } */
diff --git a/gcc/testsuite/gcc.dg/overflow-warn-5.c b/gcc/testsuite/gcc.dg/overflow-warn-5.c
index b2c8dc31d95..1a5aa0c6059 100644
--- a/gcc/testsuite/gcc.dg/overflow-warn-5.c
+++ b/gcc/testsuite/gcc.dg/overflow-warn-5.c
@@ -3,5 +3,5 @@
/* { dg-options "-Woverflow" } */
unsigned char rx_async(unsigned char p) {
- return p & 512; /* { dg-warning "overflow in conversion from .int. to .unsigned char. chages value" } */
+ return p & 512; /* { dg-warning "overflow in conversion from .int. to .unsigned char. changes value" } */
}
diff --git a/gcc/testsuite/gcc.dg/overflow-warn-8.c b/gcc/testsuite/gcc.dg/overflow-warn-8.c
index ace605517dc..e76bcac5e07 100644
--- a/gcc/testsuite/gcc.dg/overflow-warn-8.c
+++ b/gcc/testsuite/gcc.dg/overflow-warn-8.c
@@ -7,7 +7,7 @@ void foo (int j)
int i3 = 1 + INT_MAX; /* { dg-warning "integer overflow" } */
int i4 = +1 + INT_MAX; /* { dg-warning "integer overflow" } */
int i5 = (int)((double)1.0 + INT_MAX);
- int i6 = (double)1.0 + INT_MAX; /* { dg-warning "overflow in conversion from .double. to .int. chages value" } */
+ int i6 = (double)1.0 + INT_MAX; /* { dg-warning "overflow in conversion from .double. to .int. changes value" } */
int i7 = 0 ? (int)(double)1.0 + INT_MAX : 1;
int i8 = 1 ? 1 : (int)(double)1.0 + INT_MAX;
int i9 = j ? (int)(double)1.0 + INT_MAX : 1; /* { dg-warning "integer overflow" } */
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
index f025f963e69..0bdd877dbd5 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
@@ -45,14 +45,14 @@ show_tree (tree node)
if (richloc.get_num_locations () < 2)
{
- error_at_rich_loc (&richloc, "range not found");
+ error_at (&richloc, "range not found");
return;
}
enum tree_code code = TREE_CODE (node);
location_range *range = richloc.get_range (1);
- inform_at_rich_loc (&richloc, "%s", get_tree_code_name (code));
+ inform (&richloc, "%s", get_tree_code_name (code));
/* Recurse. */
int min_idx = 0;
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
index 0a8eeba1846..9751e1cd25e 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -176,7 +176,7 @@ test_show_locus (function *fun)
rich_location richloc (line_table, get_loc (line, 15));
add_range (&richloc, get_loc (line, 10), get_loc (line, 14), false);
add_range (&richloc, get_loc (line, 16), get_loc (line, 16), false);
- warning_at_rich_loc (&richloc, 0, "test");
+ warning_at (&richloc, 0, "test");
}
if (0 == strcmp (fnname, "test_simple_2"))
@@ -185,7 +185,7 @@ test_show_locus (function *fun)
rich_location richloc (line_table, get_loc (line, 24));
add_range (&richloc, get_loc (line, 6), get_loc (line, 22), false);
add_range (&richloc, get_loc (line, 26), get_loc (line, 43), false);
- warning_at_rich_loc (&richloc, 0, "test");
+ warning_at (&richloc, 0, "test");
}
if (0 == strcmp (fnname, "test_multiline"))
@@ -195,7 +195,7 @@ test_show_locus (function *fun)
add_range (&richloc, get_loc (line, 7), get_loc (line, 23), false);
add_range (&richloc, get_loc (line + 1, 9), get_loc (line + 1, 26),
false);
- warning_at_rich_loc (&richloc, 0, "test");
+ warning_at (&richloc, 0, "test");
}
if (0 == strcmp (fnname, "test_many_lines"))
@@ -205,7 +205,7 @@ test_show_locus (function *fun)
add_range (&richloc, get_loc (line, 7), get_loc (line + 4, 65), false);
add_range (&richloc, get_loc (line + 5, 9), get_loc (line + 10, 61),
false);
- warning_at_rich_loc (&richloc, 0, "test");
+ warning_at (&richloc, 0, "test");
}
/* Example of a rich_location where the range is larger than
@@ -216,7 +216,7 @@ test_show_locus (function *fun)
location_t start = get_loc (line, 12);
location_t finish = get_loc (line, 16);
rich_location richloc (line_table, make_location (start, start, finish));
- warning_at_rich_loc (&richloc, 0, "test");
+ warning_at (&richloc, 0, "test");
}
/* Example of a single-range location where the range starts
@@ -251,7 +251,7 @@ test_show_locus (function *fun)
add_range (&richloc, caret_b, caret_b, true);
global_dc->caret_chars[0] = 'A';
global_dc->caret_chars[1] = 'B';
- warning_at_rich_loc (&richloc, 0, "test");
+ warning_at (&richloc, 0, "test");
global_dc->caret_chars[0] = '^';
global_dc->caret_chars[1] = '^';
}
@@ -265,7 +265,7 @@ test_show_locus (function *fun)
rich_location richloc (line_table, make_location (start, start, finish));
richloc.add_fixit_insert_before ("{");
richloc.add_fixit_insert_after ("}");
- warning_at_rich_loc (&richloc, 0, "example of insertion hints");
+ warning_at (&richloc, 0, "example of insertion hints");
}
if (0 == strcmp (fnname, "test_fixit_insert_newline"))
@@ -277,7 +277,7 @@ test_show_locus (function *fun)
location_t case_loc = make_location (case_start, case_start, case_finish);
rich_location richloc (line_table, case_loc);
richloc.add_fixit_insert_before (line_start, " break;\n");
- warning_at_rich_loc (&richloc, 0, "example of newline insertion hint");
+ warning_at (&richloc, 0, "example of newline insertion hint");
}
if (0 == strcmp (fnname, "test_fixit_remove"))
@@ -290,7 +290,7 @@ test_show_locus (function *fun)
src_range.m_start = start;
src_range.m_finish = finish;
richloc.add_fixit_remove (src_range);
- warning_at_rich_loc (&richloc, 0, "example of a removal hint");
+ warning_at (&richloc, 0, "example of a removal hint");
}
if (0 == strcmp (fnname, "test_fixit_replace"))
@@ -303,7 +303,7 @@ test_show_locus (function *fun)
src_range.m_start = start;
src_range.m_finish = finish;
richloc.add_fixit_replace (src_range, "gtk_widget_show_all");
- warning_at_rich_loc (&richloc, 0, "example of a replacement hint");
+ warning_at (&richloc, 0, "example of a replacement hint");
}
if (0 == strcmp (fnname, "test_mutually_exclusive_suggestions"))
@@ -319,14 +319,14 @@ test_show_locus (function *fun)
rich_location richloc (line_table, make_location (start, start, finish));
richloc.add_fixit_replace (src_range, "replacement_1");
richloc.fixits_cannot_be_auto_applied ();
- warning_at_rich_loc (&richloc, 0, "warning 1");
+ warning_at (&richloc, 0, "warning 1");
}
{
rich_location richloc (line_table, make_location (start, start, finish));
richloc.add_fixit_replace (src_range, "replacement_2");
richloc.fixits_cannot_be_auto_applied ();
- warning_at_rich_loc (&richloc, 0, "warning 2");
+ warning_at (&richloc, 0, "warning 2");
}
}
@@ -346,7 +346,7 @@ test_show_locus (function *fun)
richloc.add_range (caret_b, true);
global_dc->caret_chars[0] = '1';
global_dc->caret_chars[1] = '2';
- warning_at_rich_loc (&richloc, 0, "test");
+ warning_at (&richloc, 0, "test");
global_dc->caret_chars[0] = '^';
global_dc->caret_chars[1] = '^';
}
@@ -411,8 +411,8 @@ test_show_locus (function *fun)
statically-allocated buffer in class rich_location,
and then trigger a reallocation of the dynamic buffer. */
gcc_assert (richloc.get_num_locations () > 3 + (2 * 16));
- warning_at_rich_loc (&richloc, 0, "test of %i locations",
- richloc.get_num_locations ());
+ warning_at (&richloc, 0, "test of %i locations",
+ richloc.get_num_locations ());
}
}
diff --git a/gcc/testsuite/gcc.dg/plugin/poly-int-tests.h b/gcc/testsuite/gcc.dg/plugin/poly-int-tests.h
index 9409ec7bc0f..b7a93856003 100644
--- a/gcc/testsuite/gcc.dg/plugin/poly-int-tests.h
+++ b/gcc/testsuite/gcc.dg/plugin/poly-int-tests.h
@@ -437,64 +437,6 @@ test_must_eq ()
ph::make (0, 3, 5)));
}
-/* Test known_zero. */
-
-template<unsigned int N, typename C, typename T>
-static void
-test_known_zero ()
-{
- typedef poly_helper<T> ph;
-
- ASSERT_EQ (known_zero (ph::make (0, 0, 1)), N <= 2);
- ASSERT_EQ (known_zero (ph::make (0, 1, 0)), N == 1);
- ASSERT_TRUE (known_zero (ph::make (0, 0, 0)));
- ASSERT_FALSE (known_zero (ph::make (1, 0, 0)));
-}
-
-/* Test maybe_nonzero. */
-
-template<unsigned int N, typename C, typename T>
-static void
-test_maybe_nonzero ()
-{
- typedef poly_helper<T> ph;
-
- ASSERT_EQ (maybe_nonzero (ph::make (0, 0, 1)), N == 3);
- ASSERT_EQ (maybe_nonzero (ph::make (0, 1, 0)), N >= 2);
- ASSERT_FALSE (maybe_nonzero (ph::make (0, 0, 0)));
- ASSERT_TRUE (maybe_nonzero (ph::make (1, 0, 0)));
-}
-
-/* Test known_one. */
-
-template<unsigned int N, typename C, typename T>
-static void
-test_known_one ()
-{
- typedef poly_helper<T> ph;
-
- ASSERT_EQ (known_one (ph::make (1, 0, 1)), N <= 2);
- ASSERT_EQ (known_one (ph::make (1, 1, 0)), N == 1);
- ASSERT_TRUE (known_one (ph::make (1, 0, 0)));
- ASSERT_FALSE (known_one (ph::make (0, 0, 0)));
-}
-
-/* Test known_all_ones. */
-
-template<unsigned int N, typename C, typename T>
-static void
-test_known_all_ones ()
-{
- typedef poly_helper<T> ph;
-
- ASSERT_EQ (known_all_ones (ph::make (-1, 0, -1)), N <= 2);
- ASSERT_EQ (known_all_ones (ph::make (-1, -1, 0)), N == 1);
- ASSERT_EQ (known_all_ones (ph::make (-1, -1, -1)), N == 1);
- ASSERT_TRUE (known_all_ones (ph::make (-1, 0, 0)));
- ASSERT_FALSE (known_all_ones (ph::make (0, 0, 0)));
- ASSERT_FALSE (known_all_ones (ph::make (1, 0, 0)));
-}
-
/* Test can_align_p. */
template<unsigned int N, typename C, typename T>
@@ -903,44 +845,6 @@ test_must_ne_2 ()
ASSERT_TRUE (must_ne (T (11, 0), T (4, 2)));
}
-/* Test maybe_zero for poly_int<2, C>. */
-
-template<typename C>
-static void
-test_maybe_zero_2 ()
-{
- typedef poly_int<2, C> T;
-
- ASSERT_TRUE (maybe_zero (T (0, 0)));
- ASSERT_TRUE (maybe_zero (T (0, 1)));
- ASSERT_TRUE (maybe_zero (T (0, -1)));
- ASSERT_FALSE (maybe_zero (T (1, 0)));
- ASSERT_FALSE (maybe_zero (T (1, 2)));
- ASSERT_FALSE (maybe_zero (T (1, -2)));
- ASSERT_FALSE (maybe_zero (T (-1, 0)));
- ASSERT_FALSE (maybe_zero (T (-1, 2)));
- ASSERT_FALSE (maybe_zero (T (-1, -2)));
-}
-
-/* Test known_nonzero for poly_int<2, C>. */
-
-template<typename C>
-static void
-test_known_nonzero_2 ()
-{
- typedef poly_int<2, C> T;
-
- ASSERT_FALSE (known_nonzero (T (0, 0)));
- ASSERT_FALSE (known_nonzero (T (0, 1)));
- ASSERT_FALSE (known_nonzero (T (0, -1)));
- ASSERT_TRUE (known_nonzero (T (1, 0)));
- ASSERT_TRUE (known_nonzero (T (1, 2)));
- ASSERT_TRUE (known_nonzero (T (1, -2)));
- ASSERT_TRUE (known_nonzero (T (-1, 0)));
- ASSERT_TRUE (known_nonzero (T (-1, 2)));
- ASSERT_TRUE (known_nonzero (T (-1, -2)));
-}
-
/* Test may_le for both signed and unsigned C. */
template<unsigned int N, typename C, typename T>
@@ -2235,6 +2139,22 @@ test_can_div_away_from_zero_p ()
ASSERT_EQ (const_quot, C (0));
}
+/* Test known_size_p. */
+
+template<unsigned int N, typename C, typename T>
+static void
+test_known_size_p ()
+{
+ typedef poly_helper<T> ph;
+
+ ASSERT_EQ (known_size_p (ph::make (-1, 0, -1)), N == 3);
+ ASSERT_EQ (known_size_p (ph::make (-1, -1, 0)), N >= 2);
+ ASSERT_EQ (known_size_p (ph::make (-1, -1, -1)), N >= 2);
+ ASSERT_FALSE (known_size_p (ph::make (-1, 0, 0)));
+ ASSERT_TRUE (known_size_p (ph::make (0, 0, 0)));
+ ASSERT_TRUE (known_size_p (ph::make (1, 0, 0)));
+}
+
/* Test maybe_in_range_p for both signed and unsigned C. */
template<unsigned int N, typename C, typename T>
@@ -2633,44 +2553,6 @@ test_signed_must_ne_2 ()
ASSERT_TRUE (must_ne (T (-3, 4), T (6, -1)));
}
-/* Test maybe_zero for poly_int<2, C>, given that C is signed. */
-
-template<typename C>
-static void
-test_signed_maybe_zero_2 ()
-{
- typedef poly_int<2, C> T;
-
- ASSERT_TRUE (maybe_zero (T (3, -3)));
- ASSERT_TRUE (maybe_zero (T (16, -4)));
- ASSERT_TRUE (maybe_zero (T (-15, 5)));
- ASSERT_FALSE (maybe_zero (T (3, -4)));
- ASSERT_FALSE (maybe_zero (T (3, -6)));
- ASSERT_FALSE (maybe_zero (T (15, -4)));
- ASSERT_FALSE (maybe_zero (T (17, -4)));
- ASSERT_FALSE (maybe_zero (T (-14, 5)));
- ASSERT_FALSE (maybe_zero (T (-16, 5)));
-}
-
-/* Test known_nonzero for poly_int<2, C>, given that C is signed. */
-
-template<typename C>
-static void
-test_signed_known_nonzero_2 ()
-{
- typedef poly_int<2, C> T;
-
- ASSERT_FALSE (known_nonzero (T (3, -3)));
- ASSERT_FALSE (known_nonzero (T (16, -4)));
- ASSERT_FALSE (known_nonzero (T (-15, 5)));
- ASSERT_TRUE (known_nonzero (T (3, -4)));
- ASSERT_TRUE (known_nonzero (T (3, -6)));
- ASSERT_TRUE (known_nonzero (T (15, -4)));
- ASSERT_TRUE (known_nonzero (T (17, -4)));
- ASSERT_TRUE (known_nonzero (T (-14, 5)));
- ASSERT_TRUE (known_nonzero (T (-16, 5)));
-}
-
/* Test negation for signed C, both via operators and wi::. */
template<unsigned int N, typename C, typename RC, typename T>
@@ -4623,76 +4505,16 @@ test_uhwi ()
wi::uhwi (210, 16)));
}
-/* Test known_zero for non-polynomial T. */
-
-template<typename T>
-static void
-test_nonpoly_known_zero ()
-{
- ASSERT_TRUE (known_zero (T (0)));
- ASSERT_FALSE (known_zero (T (1)));
- ASSERT_FALSE (known_zero (T (2)));
- ASSERT_FALSE (known_zero (T (-1)));
-}
-
-/* Test maybe_zero for non-polynomial T. */
-
-template<typename T>
-static void
-test_nonpoly_maybe_zero ()
-{
- ASSERT_TRUE (maybe_zero (T (0)));
- ASSERT_FALSE (maybe_zero (T (1)));
- ASSERT_FALSE (maybe_zero (T (2)));
- ASSERT_FALSE (maybe_zero (T (-1)));
-}
-
-/* Test known_nonzero for non-polynomial T. */
-
-template<typename T>
-static void
-test_nonpoly_known_nonzero ()
-{
- ASSERT_FALSE (known_nonzero (T (0)));
- ASSERT_TRUE (known_nonzero (T (1)));
- ASSERT_TRUE (known_nonzero (T (2)));
- ASSERT_TRUE (known_nonzero (T (-1)));
-}
-
-/* Test maybe_nonzero for non-polynomial T. */
-
-template<typename T>
-static void
-test_nonpoly_maybe_nonzero ()
-{
- ASSERT_FALSE (maybe_nonzero (T (0)));
- ASSERT_TRUE (maybe_nonzero (T (1)));
- ASSERT_TRUE (maybe_nonzero (T (2)));
- ASSERT_TRUE (maybe_nonzero (T (-1)));
-}
-
-/* Test known_one for non-polynomial T. */
-
-template<typename T>
-static void
-test_nonpoly_known_one ()
-{
- ASSERT_FALSE (known_one (T (0)));
- ASSERT_TRUE (known_one (T (1)));
- ASSERT_FALSE (known_one (T (2)));
- ASSERT_FALSE (known_one (T (-1)));
-}
-
-/* Test known_all_ones for non-polynomial T. */
+/* Test known_size_p for non-polynomial T. */
template<typename T>
static void
-test_nonpoly_known_all_ones ()
+test_nonpoly_known_size_p ()
{
- ASSERT_FALSE (known_all_ones (T (0)));
- ASSERT_FALSE (known_all_ones (T (1)));
- ASSERT_FALSE (known_all_ones (T (2)));
- ASSERT_TRUE (known_all_ones (T (-1)));
+ ASSERT_TRUE (known_size_p (T (0)));
+ ASSERT_TRUE (known_size_p (T (1)));
+ ASSERT_TRUE (known_size_p (T (2)));
+ ASSERT_FALSE (known_size_p (T (-1)));
}
/* Test poly-int.h operations on non-polynomial type T. */
@@ -4701,12 +4523,7 @@ template<typename T>
static void
test_nonpoly_type ()
{
- test_nonpoly_known_zero<T> ();
- test_nonpoly_maybe_zero<T> ();
- test_nonpoly_known_nonzero<T> ();
- test_nonpoly_maybe_nonzero<T> ();
- test_nonpoly_known_one<T> ();
- test_nonpoly_known_all_ones<T> ();
+ test_nonpoly_known_size_p<T> ();
}
/* Test poly-int.h operations on non-polynomial values. */
@@ -4747,10 +4564,6 @@ test_general ()
test_shift_left<N, C, T> ();
test_may_ne<N, C, T> ();
test_must_eq<N, C, T> ();
- test_known_zero<N, C, T> ();
- test_maybe_nonzero<N, C, T> ();
- test_known_one<N, C, T> ();
- test_known_all_ones<N, C, T> ();
test_can_align_p<N, C, T> ();
test_can_align_up<N, C, T> ();
test_can_align_down<N, C, T> ();
@@ -4764,6 +4577,7 @@ test_general ()
test_force_get_misalignment<N, C, T> ();
test_known_alignment<N, C, T> ();
test_can_ior_p<N, C, T> ();
+ test_known_size_p<N, C, T> ();
}
/* Test things that work for poly_int<2, C>, given that C is signed. */
@@ -4774,8 +4588,6 @@ test_ordered_2 ()
{
test_may_eq_2<C> ();
test_must_ne_2<C> ();
- test_maybe_zero_2<C> ();
- test_known_nonzero_2<C> ();
}
/* Test things that work for poly_int-based types T, given that the
@@ -4829,8 +4641,6 @@ test_signed_2 ()
test_ordered_2<C> ();
test_signed_may_eq_2<C> ();
test_signed_must_ne_2<C> ();
- test_signed_maybe_zero_2<C> ();
- test_signed_known_nonzero_2<C> ();
}
/* Test things that work for poly_int-based types T, given that the
diff --git a/gcc/testsuite/gcc.dg/pr7356-2.c b/gcc/testsuite/gcc.dg/pr7356-2.c
new file mode 100644
index 00000000000..ad679756978
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr7356-2.c
@@ -0,0 +1,33 @@
+/* { dg-options "-fdiagnostics-show-caret" } */
+
+int i /* { dg-error "6: expected ';' before 'int'" } */
+int j;
+/* { dg-begin-multiline-output "" }
+ int i
+ ^
+ ;
+ int j;
+ ~~~
+ { dg-end-multiline-output "" } */
+
+
+void test (void)
+{
+ int i /* { dg-error "8: expected ';' before 'int'" } */
+ int j;
+
+ /* { dg-begin-multiline-output "" }
+ int i
+ ^
+ ;
+ int j;
+ ~~~
+ { dg-end-multiline-output "" } */
+}
+
+int old_style_params (first, second)
+ int first;
+ int second;
+{
+ return first + second;
+}
diff --git a/gcc/testsuite/gcc.dg/pr7356.c b/gcc/testsuite/gcc.dg/pr7356.c
new file mode 100644
index 00000000000..84baf078b96
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr7356.c
@@ -0,0 +1,17 @@
+/* { dg-options "-fdiagnostics-show-caret" } */
+
+a /* { dg-line stray_token } */
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+int main(int argc, char** argv)
+{
+ return 0;
+}
+
+/* { dg-error "expected ';' before '.*'" "" { target *-*-* } stray_token } */
+/* { dg-begin-multiline-output "" }
+ a
+ ^
+ ;
+ { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/gcc.dg/pr82274-1.c b/gcc/testsuite/gcc.dg/pr82274-1.c
new file mode 100644
index 00000000000..f96b7338fc4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr82274-1.c
@@ -0,0 +1,16 @@
+/* PR target/82274 */
+/* { dg-do run } */
+/* { dg-shouldfail "trapv" } */
+/* { dg-options "-ftrapv" } */
+
+int
+main ()
+{
+#ifdef __SIZEOF_INT128__
+ volatile __int128 m = -(((__int128) 1) << (__CHAR_BIT__ * __SIZEOF_INT128__ / 2));
+#else
+ volatile long long m = -(1LL << (__CHAR_BIT__ * __SIZEOF_LONG_LONG__ / 2));
+#endif
+ m = m * m;
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr82274-2.c b/gcc/testsuite/gcc.dg/pr82274-2.c
new file mode 100644
index 00000000000..a9643b5a923
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr82274-2.c
@@ -0,0 +1,26 @@
+/* PR target/82274 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+int
+main ()
+{
+#ifdef __SIZEOF_INT128__
+ __int128 m = -(((__int128) 1) << (__CHAR_BIT__ * __SIZEOF_INT128__ / 2));
+ volatile __int128 mv = m;
+ __int128 r;
+#else
+ long long m = -(1LL << (__CHAR_BIT__ * __SIZEOF_LONG_LONG__ / 2));
+ volatile long long mv = m;
+ long long r;
+#endif
+ if (!__builtin_mul_overflow (mv, mv, &r))
+ __builtin_abort ();
+ if (!__builtin_mul_overflow_p (mv, mv, r))
+ __builtin_abort ();
+ if (!__builtin_mul_overflow (m, m, &r))
+ __builtin_abort ();
+ if (!__builtin_mul_overflow_p (m, m, r))
+ __builtin_abort ();
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr82596.c b/gcc/testsuite/gcc.dg/pr82596.c
new file mode 100644
index 00000000000..5dc67c28e8c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr82596.c
@@ -0,0 +1,27 @@
+/* PR tree-optimization/82596 - missing -Warray-bounds on an out-of-bounds
+ index into string literal
+ { dg-do compile }
+ { dg-options "-O2 -Warray-bounds" } */
+
+#define SIZE_MAX __SIZE_MAX__
+#define SSIZE_MAX __PTRDIFF_MAX__
+#define SSIZE_MIN (-SSIZE_MAX - 1)
+
+void sink (int, ...);
+
+#define T(arg) sink (arg)
+
+void test_cststring (int i)
+{
+ T (""[SSIZE_MIN]); /* { dg-warning "below array bounds" "string" { xfail lp64 } } */
+ T (""[SSIZE_MIN + 1]); /* { dg-warning "below array bounds" "string" } */
+ T (""[-1]); /* { dg-warning "below array bounds" "string" } */
+ T (""[0]);
+ T (""[1]); /* { dg-warning "above array bounds" "string" } */
+ T ("0"[2]); /* { dg-warning "above array bounds" "string" } */
+ T ("012"[2]);
+ T ("012"[3]);
+ T ("012"[4]); /* { dg-warning "above array bounds" "string" } */
+ T ("0123"[SSIZE_MAX]); /* { dg-warning "above array bounds" "string" } */
+ T ("0123"[SIZE_MAX]); /* { dg-warning "above array bounds" "string" } */
+}
diff --git a/gcc/testsuite/gcc.dg/pr82597.c b/gcc/testsuite/gcc.dg/pr82597.c
new file mode 100644
index 00000000000..98ae264d1c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr82597.c
@@ -0,0 +1,40 @@
+/* PR rtl-optimization/82597 */
+/* { dg-do compile }*/
+/* { dg-options "-O2 -funroll-loops" } */
+
+int pb;
+
+void
+ch (unsigned char np, char fc)
+{
+ unsigned char *y6 = &np;
+
+ if (fc != 0)
+ {
+ unsigned char *z1 = &np;
+
+ for (;;)
+ if (*y6 != 0)
+ for (fc = 0; fc < 12; ++fc)
+ {
+ int hh;
+ int tp;
+
+ if (fc != 0)
+ hh = (*z1 != 0) ? fc : 0;
+ else
+ hh = pb;
+
+ tp = fc > 0;
+ if (hh == tp)
+ *y6 = 1;
+ }
+ }
+
+ if (np != 0)
+ y6 = (unsigned char *)&fc;
+ if (pb != 0 && *y6 != 0)
+ for (;;)
+ {
+ }
+}
diff --git a/gcc/testsuite/gcc.dg/pr82703.c b/gcc/testsuite/gcc.dg/pr82703.c
new file mode 100644
index 00000000000..0bd2f91eea4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr82703.c
@@ -0,0 +1,28 @@
+/* PR target/82703 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-tree-sra -ftree-vectorize" } */
+
+__attribute__((noinline, noclone)) void
+compare (const double *p, const double *q)
+{
+ for (int i = 0; i < 3; ++i)
+ if (p[i] != q[i])
+ __builtin_abort ();
+}
+
+double vr[3] = { 4, 4, 4 };
+
+int
+main ()
+{
+ double v1[3] = { 1, 2, 3 };
+ double v2[3] = { 3, 2, 1 };
+ double v3[3];
+ __builtin_memcpy (v3, v1, sizeof (v1));
+ for (int i = 0; i < 3; ++i)
+ v3[i] += v2[i];
+ for (int i = 0; i < 3; ++i)
+ v1[i] += v2[i];
+ compare (v3, vr);
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr82765.c b/gcc/testsuite/gcc.dg/pr82765.c
new file mode 100644
index 00000000000..dde0aeba7ef
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr82765.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -w" } */
+
+int a[1][1];
+int main() { int *b[] = {a, a[1820408606019012862278468], a, a, a}; }
diff --git a/gcc/testsuite/gcc.dg/pr82809.c b/gcc/testsuite/gcc.dg/pr82809.c
new file mode 100644
index 00000000000..9f74ee86534
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr82809.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fno-tree-dominator-opts" } */
+
+struct locale_time_t
+{
+ const char *abday[7];
+ const unsigned int *wabday[7];
+};
+
+static const unsigned int empty_wstr[1] = { 0 };
+
+void
+time_read (struct locale_time_t *time)
+{
+ int cnt;
+
+ for (cnt=0; cnt < 7; cnt++)
+ {
+ time->abday[cnt] = "";
+ time->wabday[cnt] = empty_wstr;
+ }
+}
diff --git a/gcc/testsuite/gcc.dg/spellcheck-typenames.c b/gcc/testsuite/gcc.dg/spellcheck-typenames.c
index f3b8102d5a4..3717ad89f1b 100644
--- a/gcc/testsuite/gcc.dg/spellcheck-typenames.c
+++ b/gcc/testsuite/gcc.dg/spellcheck-typenames.c
@@ -100,8 +100,9 @@ baz value; /* { dg-error "1: unknown type name .baz.; use .enum. keyword to refe
{ dg-end-multiline-output "" } */
/* TODO: it would be better to detect the "singed" vs "signed" typo here. */
-singed char ch; /* { dg-error "8: before .char." } */
+singed char ch; /* { dg-error "7: before .char." } */
/* { dg-begin-multiline-output "" }
singed char ch;
- ^~~~
+ ^~~~~
+ ;
{ dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/gcc.dg/store_merging_10.c b/gcc/testsuite/gcc.dg/store_merging_10.c
new file mode 100644
index 00000000000..440f6e1f6c3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/store_merging_10.c
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target store_merge } */
+/* { dg-options "-O2 -fdump-tree-store-merging" } */
+
+struct S {
+ unsigned int b1:1;
+ unsigned int b2:1;
+ unsigned int b3:1;
+ unsigned int b4:1;
+ unsigned int b5:1;
+ unsigned int b6:27;
+};
+
+struct T {
+ unsigned int b1:1;
+ unsigned int b2:16;
+ unsigned int b3:14;
+ unsigned int b4:1;
+};
+
+__attribute__((noipa)) void
+foo (struct S *x)
+{
+ x->b1 = 1;
+ x->b2 = 0;
+ x->b3 = 1;
+ x->b4 = 1;
+ x->b5 = 0;
+}
+
+__attribute__((noipa)) void
+bar (struct T *x)
+{
+ x->b1 = 1;
+ x->b2 = 0;
+ x->b4 = 0;
+}
+
+struct S s = { 0, 1, 0, 0, 1, 0x3a5f05a };
+struct T t = { 0, 0xf5af, 0x3a5a, 1 };
+
+int
+main ()
+{
+ asm volatile ("" : : : "memory");
+ foo (&s);
+ bar (&t);
+ asm volatile ("" : : : "memory");
+ if (s.b1 != 1 || s.b2 != 0 || s.b3 != 1 || s.b4 != 1 || s.b5 != 0 || s.b6 != 0x3a5f05a)
+ __builtin_abort ();
+ if (t.b1 != 1 || t.b2 != 0 || t.b3 != 0x3a5a || t.b4 != 0)
+ __builtin_abort ();
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Merging successful" 2 "store-merging" } } */
diff --git a/gcc/testsuite/gcc.dg/store_merging_11.c b/gcc/testsuite/gcc.dg/store_merging_11.c
new file mode 100644
index 00000000000..399538e522e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/store_merging_11.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target store_merge } */
+/* { dg-options "-O2 -fdump-tree-store-merging" } */
+
+struct S { unsigned char b[2]; unsigned short c; unsigned char d[4]; unsigned long e; };
+
+__attribute__((noipa)) void
+foo (struct S *p)
+{
+ p->b[1] = 1;
+ p->c = 23;
+ p->d[0] = 4;
+ p->d[1] = 5;
+ p->d[2] = 6;
+ p->d[3] = 7;
+ p->e = 8;
+}
+
+__attribute__((noipa)) void
+bar (struct S *p)
+{
+ p->b[1] = 9;
+ p->c = 112;
+ p->d[0] = 10;
+ p->d[1] = 11;
+}
+
+struct S s = { { 30, 31 }, 32, { 33, 34, 35, 36 }, 37 };
+
+int
+main ()
+{
+ asm volatile ("" : : : "memory");
+ foo (&s);
+ asm volatile ("" : : : "memory");
+ if (s.b[0] != 30 || s.b[1] != 1 || s.c != 23 || s.d[0] != 4 || s.d[1] != 5
+ || s.d[2] != 6 || s.d[3] != 7 || s.e != 8)
+ __builtin_abort ();
+ bar (&s);
+ asm volatile ("" : : : "memory");
+ if (s.b[0] != 30 || s.b[1] != 9 || s.c != 112 || s.d[0] != 10 || s.d[1] != 11
+ || s.d[2] != 6 || s.d[3] != 7 || s.e != 8)
+ __builtin_abort ();
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Merging successful" 2 "store-merging" } } */
diff --git a/gcc/testsuite/gcc.dg/store_merging_12.c b/gcc/testsuite/gcc.dg/store_merging_12.c
new file mode 100644
index 00000000000..67f23449e93
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/store_merging_12.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wall" } */
+
+struct S { unsigned int b1:1, b2:1, b3:1, b4:1, b5:1, b6:27; };
+void bar (struct S *);
+void foo (int x)
+{
+ struct S s;
+ s.b2 = 1; s.b3 = 0; s.b4 = 1; s.b5 = 0; s.b1 = x; s.b6 = x; /* { dg-bogus "is used uninitialized in this function" } */
+ bar (&s);
+}
diff --git a/gcc/testsuite/gcc.dg/store_merging_13.c b/gcc/testsuite/gcc.dg/store_merging_13.c
new file mode 100644
index 00000000000..d4e9ad2d260
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/store_merging_13.c
@@ -0,0 +1,157 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target store_merge } */
+/* { dg-options "-O2 -fdump-tree-store-merging" } */
+
+struct S { unsigned char a, b; unsigned short c; unsigned char d, e, f, g; unsigned long long h; };
+
+__attribute__((noipa)) void
+f1 (struct S *p)
+{
+ p->a = 1;
+ p->b = 2;
+ p->c = 3;
+ p->d = 4;
+ p->e = 5;
+ p->f = 6;
+ p->g = 7;
+}
+
+__attribute__((noipa)) void
+f2 (struct S *__restrict p, struct S *__restrict q)
+{
+ p->a = q->a;
+ p->b = q->b;
+ p->c = q->c;
+ p->d = q->d;
+ p->e = q->e;
+ p->f = q->f;
+ p->g = q->g;
+}
+
+__attribute__((noipa)) void
+f3 (struct S *p, struct S *q)
+{
+ unsigned char pa = q->a;
+ unsigned char pb = q->b;
+ unsigned short pc = q->c;
+ unsigned char pd = q->d;
+ unsigned char pe = q->e;
+ unsigned char pf = q->f;
+ unsigned char pg = q->g;
+ p->a = pa;
+ p->b = pb;
+ p->c = pc;
+ p->d = pd;
+ p->e = pe;
+ p->f = pf;
+ p->g = pg;
+}
+
+__attribute__((noipa)) void
+f4 (struct S *p, struct S *q)
+{
+ unsigned char pa = p->a | q->a;
+ unsigned char pb = p->b | q->b;
+ unsigned short pc = p->c | q->c;
+ unsigned char pd = p->d | q->d;
+ unsigned char pe = p->e | q->e;
+ unsigned char pf = p->f | q->f;
+ unsigned char pg = p->g | q->g;
+ p->a = pa;
+ p->b = pb;
+ p->c = pc;
+ p->d = pd;
+ p->e = pe;
+ p->f = pf;
+ p->g = pg;
+}
+
+__attribute__((noipa)) void
+f5 (struct S *p, struct S *q)
+{
+ unsigned char pa = p->a & q->a;
+ unsigned char pb = p->b & q->b;
+ unsigned short pc = p->c & q->c;
+ unsigned char pd = p->d & q->d;
+ unsigned char pe = p->e & q->e;
+ unsigned char pf = p->f & q->f;
+ unsigned char pg = p->g & q->g;
+ p->a = pa;
+ p->b = pb;
+ p->c = pc;
+ p->d = pd;
+ p->e = pe;
+ p->f = pf;
+ p->g = pg;
+}
+
+__attribute__((noipa)) void
+f6 (struct S *p, struct S *q)
+{
+ unsigned char pa = p->a ^ q->a;
+ unsigned char pb = p->b ^ q->b;
+ unsigned short pc = p->c ^ q->c;
+ unsigned char pd = p->d ^ q->d;
+ unsigned char pe = p->e ^ q->e;
+ unsigned char pf = p->f ^ q->f;
+ unsigned char pg = p->g ^ q->g;
+ p->a = pa;
+ p->b = pb;
+ p->c = pc;
+ p->d = pd;
+ p->e = pe;
+ p->f = pf;
+ p->g = pg;
+}
+
+struct S s = { 20, 21, 22, 23, 24, 25, 26, 27 };
+struct S t = { 0x71, 0x72, 0x7f04, 0x78, 0x31, 0x32, 0x34, 0xf1f2f3f4f5f6f7f8ULL };
+struct S u = { 28, 29, 30, 31, 32, 33, 34, 35 };
+struct S v = { 36, 37, 38, 39, 40, 41, 42, 43 };
+
+int
+main ()
+{
+ asm volatile ("" : : : "memory");
+ f1 (&s);
+ asm volatile ("" : : : "memory");
+ if (s.a != 1 || s.b != 2 || s.c != 3 || s.d != 4
+ || s.e != 5 || s.f != 6 || s.g != 7 || s.h != 27)
+ __builtin_abort ();
+ f2 (&s, &u);
+ asm volatile ("" : : : "memory");
+ if (s.a != 28 || s.b != 29 || s.c != 30 || s.d != 31
+ || s.e != 32 || s.f != 33 || s.g != 34 || s.h != 27)
+ __builtin_abort ();
+ f3 (&s, &v);
+ asm volatile ("" : : : "memory");
+ if (s.a != 36 || s.b != 37 || s.c != 38 || s.d != 39
+ || s.e != 40 || s.f != 41 || s.g != 42 || s.h != 27)
+ __builtin_abort ();
+ f4 (&s, &t);
+ asm volatile ("" : : : "memory");
+ if (s.a != (36 | 0x71) || s.b != (37 | 0x72)
+ || s.c != (38 | 0x7f04) || s.d != (39 | 0x78)
+ || s.e != (40 | 0x31) || s.f != (41 | 0x32)
+ || s.g != (42 | 0x34) || s.h != 27)
+ __builtin_abort ();
+ f3 (&s, &u);
+ f5 (&s, &t);
+ asm volatile ("" : : : "memory");
+ if (s.a != (28 & 0x71) || s.b != (29 & 0x72)
+ || s.c != (30 & 0x7f04) || s.d != (31 & 0x78)
+ || s.e != (32 & 0x31) || s.f != (33 & 0x32)
+ || s.g != (34 & 0x34) || s.h != 27)
+ __builtin_abort ();
+ f2 (&s, &v);
+ f6 (&s, &t);
+ asm volatile ("" : : : "memory");
+ if (s.a != (36 ^ 0x71) || s.b != (37 ^ 0x72)
+ || s.c != (38 ^ 0x7f04) || s.d != (39 ^ 0x78)
+ || s.e != (40 ^ 0x31) || s.f != (41 ^ 0x32)
+ || s.g != (42 ^ 0x34) || s.h != 27)
+ __builtin_abort ();
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Merging successful" 6 "store-merging" } } */
diff --git a/gcc/testsuite/gcc.dg/store_merging_14.c b/gcc/testsuite/gcc.dg/store_merging_14.c
new file mode 100644
index 00000000000..49af24951cb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/store_merging_14.c
@@ -0,0 +1,157 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target store_merge } */
+/* { dg-options "-O2 -fdump-tree-store-merging" } */
+
+struct S { unsigned int i : 8, a : 7, b : 7, j : 10, c : 15, d : 7, e : 10, f : 7, g : 9, k : 16; unsigned long long h; };
+
+__attribute__((noipa)) void
+f1 (struct S *p)
+{
+ p->a = 1;
+ p->b = 2;
+ p->c = 3;
+ p->d = 4;
+ p->e = 5;
+ p->f = 6;
+ p->g = 7;
+}
+
+__attribute__((noipa)) void
+f2 (struct S *__restrict p, struct S *__restrict q)
+{
+ p->a = q->a;
+ p->b = q->b;
+ p->c = q->c;
+ p->d = q->d;
+ p->e = q->e;
+ p->f = q->f;
+ p->g = q->g;
+}
+
+__attribute__((noipa)) void
+f3 (struct S *p, struct S *q)
+{
+ unsigned char pa = q->a;
+ unsigned char pb = q->b;
+ unsigned short pc = q->c;
+ unsigned char pd = q->d;
+ unsigned short pe = q->e;
+ unsigned char pf = q->f;
+ unsigned short pg = q->g;
+ p->a = pa;
+ p->b = pb;
+ p->c = pc;
+ p->d = pd;
+ p->e = pe;
+ p->f = pf;
+ p->g = pg;
+}
+
+__attribute__((noipa)) void
+f4 (struct S *p, struct S *q)
+{
+ unsigned char pa = p->a | q->a;
+ unsigned char pb = p->b | q->b;
+ unsigned short pc = p->c | q->c;
+ unsigned char pd = p->d | q->d;
+ unsigned short pe = p->e | q->e;
+ unsigned char pf = p->f | q->f;
+ unsigned short pg = p->g | q->g;
+ p->a = pa;
+ p->b = pb;
+ p->c = pc;
+ p->d = pd;
+ p->e = pe;
+ p->f = pf;
+ p->g = pg;
+}
+
+__attribute__((noipa)) void
+f5 (struct S *p, struct S *q)
+{
+ unsigned char pa = p->a & q->a;
+ unsigned char pb = p->b & q->b;
+ unsigned short pc = p->c & q->c;
+ unsigned char pd = p->d & q->d;
+ unsigned short pe = p->e & q->e;
+ unsigned char pf = p->f & q->f;
+ unsigned short pg = p->g & q->g;
+ p->a = pa;
+ p->b = pb;
+ p->c = pc;
+ p->d = pd;
+ p->e = pe;
+ p->f = pf;
+ p->g = pg;
+}
+
+__attribute__((noipa)) void
+f6 (struct S *p, struct S *q)
+{
+ unsigned char pa = p->a ^ q->a;
+ unsigned char pb = p->b ^ q->b;
+ unsigned short pc = p->c ^ q->c;
+ unsigned char pd = p->d ^ q->d;
+ unsigned short pe = p->e ^ q->e;
+ unsigned char pf = p->f ^ q->f;
+ unsigned short pg = p->g ^ q->g;
+ p->a = pa;
+ p->b = pb;
+ p->c = pc;
+ p->d = pd;
+ p->e = pe;
+ p->f = pf;
+ p->g = pg;
+}
+
+struct S s = { 72, 20, 21, 73, 22, 23, 24, 25, 26, 74, 27 };
+struct S t = { 75, 0x71, 0x72, 76, 0x7f04, 0x78, 0x31, 0x32, 0x34, 77, 0xf1f2f3f4f5f6f7f8ULL };
+struct S u = { 78, 28, 29, 79, 30, 31, 32, 33, 34, 80, 35 };
+struct S v = { 81, 36, 37, 82, 38, 39, 40, 41, 42, 83, 43 };
+
+int
+main ()
+{
+ asm volatile ("" : : : "memory");
+ f1 (&s);
+ asm volatile ("" : : : "memory");
+ if (s.i != 72 || s.a != 1 || s.b != 2 || s.j != 73 || s.c != 3 || s.d != 4
+ || s.e != 5 || s.f != 6 || s.g != 7 || s.k != 74 || s.h != 27)
+ __builtin_abort ();
+ f2 (&s, &u);
+ asm volatile ("" : : : "memory");
+ if (s.i != 72 || s.a != 28 || s.b != 29 || s.j != 73 || s.c != 30 || s.d != 31
+ || s.e != 32 || s.f != 33 || s.g != 34 || s.k != 74 || s.h != 27)
+ __builtin_abort ();
+ f3 (&s, &v);
+ asm volatile ("" : : : "memory");
+ if (s.i != 72 || s.a != 36 || s.b != 37 || s.j != 73 || s.c != 38 || s.d != 39
+ || s.e != 40 || s.f != 41 || s.g != 42 || s.k != 74 || s.h != 27)
+ __builtin_abort ();
+ f4 (&s, &t);
+ asm volatile ("" : : : "memory");
+ if (s.i != 72 || s.a != (36 | 0x71) || s.b != (37 | 0x72) || s.j != 73
+ || s.c != (38 | 0x7f04) || s.d != (39 | 0x78)
+ || s.e != (40 | 0x31) || s.f != (41 | 0x32)
+ || s.g != (42 | 0x34) || s.k != 74 || s.h != 27)
+ __builtin_abort ();
+ f3 (&s, &u);
+ f5 (&s, &t);
+ asm volatile ("" : : : "memory");
+ if (s.i != 72 || s.a != (28 & 0x71) || s.b != (29 & 0x72) || s.j != 73
+ || s.c != (30 & 0x7f04) || s.d != (31 & 0x78)
+ || s.e != (32 & 0x31) || s.f != (33 & 0x32)
+ || s.g != (34 & 0x34) || s.k != 74 || s.h != 27)
+ __builtin_abort ();
+ f2 (&s, &v);
+ f6 (&s, &t);
+ asm volatile ("" : : : "memory");
+ if (s.i != 72 || s.a != (36 ^ 0x71) || s.b != (37 ^ 0x72) || s.j != 73
+ || s.c != (38 ^ 0x7f04) || s.d != (39 ^ 0x78)
+ || s.e != (40 ^ 0x31) || s.f != (41 ^ 0x32)
+ || s.g != (42 ^ 0x34) || s.k != 74 || s.h != 27)
+ __builtin_abort ();
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Merging successful" 6 "store-merging" } } */
diff --git a/gcc/testsuite/gcc.dg/strict-overflow-3.c b/gcc/testsuite/gcc.dg/strict-overflow-3.c
index 6215a501a72..8ef91476200 100644
--- a/gcc/testsuite/gcc.dg/strict-overflow-3.c
+++ b/gcc/testsuite/gcc.dg/strict-overflow-3.c
@@ -9,7 +9,7 @@
int
foo (int i, int j)
{
- return i + 100 < j + 1000;
+ return i + 100 < j + 1234;
}
-/* { dg-final { scan-tree-dump-not "1000" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "1234" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/torture/pr52451.c b/gcc/testsuite/gcc.dg/torture/pr52451.c
new file mode 100644
index 00000000000..81a3d4d158d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr52451.c
@@ -0,0 +1,55 @@
+/* { dg-do run } */
+/* { dg-add-options ieee } */
+/* { dg-require-effective-target fenv_exceptions } */
+
+#include <fenv.h>
+
+#define TEST_C_NOEX(CMP, S) \
+ r = nan##S CMP arg##S; \
+ if (fetestexcept (FE_INVALID)) \
+ __builtin_abort ()
+
+#define TEST_B_NOEX(FN, S) \
+ r = __builtin_##FN (nan##S, arg##S); \
+ if (fetestexcept (FE_INVALID)) \
+ __builtin_abort ()
+
+#define TEST_C_EX(CMP, S) \
+ r = nan##S CMP arg##S; \
+ if (!fetestexcept (FE_INVALID)) \
+ __builtin_abort (); \
+ feclearexcept (FE_INVALID)
+
+#define TEST(TYPE, S) \
+ volatile TYPE nan##S = __builtin_nan##S (""); \
+ volatile TYPE arg##S = 1.0##S; \
+ \
+ TEST_C_NOEX (==, S); \
+ TEST_C_NOEX (!=, S); \
+ \
+ TEST_B_NOEX (isgreater, S); \
+ TEST_B_NOEX (isless, S); \
+ TEST_B_NOEX (isgreaterequal, S); \
+ TEST_B_NOEX (islessequal, S); \
+ \
+ TEST_B_NOEX (islessgreater, S); \
+ TEST_B_NOEX (isunordered, S); \
+ \
+ TEST_C_EX (>, S); \
+ TEST_C_EX (<, S); \
+ TEST_C_EX (>=, S); \
+ TEST_C_EX (<=, S)
+
+int
+main (void)
+{
+ volatile int r;
+
+ feclearexcept (FE_INVALID);
+
+ TEST (float, f);
+ TEST (double, );
+ TEST (long double, l);
+
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr82129.c b/gcc/testsuite/gcc.dg/torture/pr82129.c
new file mode 100644
index 00000000000..b1161491fe6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr82129.c
@@ -0,0 +1,52 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-pre" } */
+
+int pj;
+
+void
+g4 (unsigned long int *bc, unsigned long int *h5)
+{
+ if (pj != 0)
+ {
+ int ib = 0;
+
+ while (bc != 0)
+ {
+m6:
+ for (pj = 0; pj < 2; ++pj)
+ pj = 0;
+
+ while (pj != 0)
+ {
+ for (;;)
+ {
+ }
+
+ while (ib != 0)
+ {
+ unsigned long int tv = *bc;
+ unsigned long int n7;
+
+ *bc = 1;
+ while (*bc != 0)
+ {
+ }
+
+ut:
+ if (pj == 0)
+ n7 = *h5 > 0;
+ else
+ {
+ *h5 = tv;
+ n7 = *h5;
+ }
+ ib += n7;
+ }
+ }
+ }
+
+ goto ut;
+ }
+
+ goto m6;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr82436-2.c b/gcc/testsuite/gcc.dg/torture/pr82436-2.c
new file mode 100644
index 00000000000..32eda186ff0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr82436-2.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+
+enum
+{
+ a, b, c, d, e, f, g, h, j, k
+};
+
+int l;
+void m (short *s)
+{
+ short n, o, p;
+ float(*q)[k];
+ int r, i;
+ while (l > 0)
+ r = l;
+ for (;;)
+ {
+ i = 0;
+ for (; i < r; i++)
+ {
+ {
+ float ab = q[i][a];
+ int i = ab;
+ p = i;
+ }
+ ((short *) s)[0] = p;
+ {
+ float ab = q[i][b];
+ int i = ab;
+ o = i;
+ }
+ ((short *) s)[1] = o;
+ {
+ float ab = q[i][f];
+ int i = ab;
+ n = i;
+ }
+ ((short *) s)[2] = n;
+ float ab = q[i][g];
+ int i = ab;
+ ((short *) s)[3] = i;
+ s = (short *) s + 4;
+ }
+ }
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr82473.c b/gcc/testsuite/gcc.dg/torture/pr82473.c
new file mode 100644
index 00000000000..b12de21d7db
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr82473.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-vectorize" } */
+
+void
+zz (int x9, short int gt)
+{
+ if (0)
+ {
+ while (gt < 1)
+ {
+ int pz;
+
+k6:
+ for (pz = 0; pz < 3; ++pz)
+ x9 += gt;
+ ++gt;
+ }
+ }
+
+ if (x9 != 0)
+ goto k6;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr82603.c b/gcc/testsuite/gcc.dg/torture/pr82603.c
new file mode 100644
index 00000000000..960a48bbd3a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr82603.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-loop-vectorize" } */
+
+int
+mr (unsigned int lf, int ms)
+{
+ unsigned int sw = 0;
+ char *cu = (char *)&ms;
+
+ while (ms < 1)
+ {
+ if (lf == 0)
+ ms = 0;
+ else
+ ms = 0;
+ ms += ((lf > 0) && ((lf > sw) ? 1 : ++*cu));
+ }
+
+ if (lf != 0)
+ cu = (char *)&sw;
+ *cu = lf;
+
+ return ms;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr82692.c b/gcc/testsuite/gcc.dg/torture/pr82692.c
new file mode 100644
index 00000000000..254ace15ada
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr82692.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-add-options ieee } */
+/* { dg-require-effective-target fenv_exceptions } */
+
+#include <fenv.h>
+
+extern void abort (void);
+extern void exit (int);
+
+double __attribute__ ((noinline, noclone))
+foo (double x)
+{
+ if (__builtin_islessequal (x, 0.0) || __builtin_isgreater (x, 1.0))
+ return x + x;
+ return x * x;
+}
+
+int
+main (void)
+{
+ volatile double x = foo (__builtin_nan (""));
+ if (fetestexcept (FE_INVALID))
+ abort ();
+ exit (0);
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr82697.c b/gcc/testsuite/gcc.dg/torture/pr82697.c
new file mode 100644
index 00000000000..57da8a264b9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr82697.c
@@ -0,0 +1,23 @@
+/* { dg-do run } */
+
+__attribute__((noinline,noclone))
+void test(int *pi, long *pl, int f)
+{
+ *pl = 0;
+
+ *pi = 1;
+
+ if (f)
+ *pl = 2;
+}
+
+int main()
+{
+ void *p = __builtin_malloc(sizeof (long));
+
+ test(p, p, 0);
+
+ if (*(int *)p != 1)
+ __builtin_abort ();
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr82762.c b/gcc/testsuite/gcc.dg/torture/pr82762.c
new file mode 100644
index 00000000000..d4f57bc55f7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr82762.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+
+int printf (const char *, ...);
+
+int b, c, d, e, f, g, j, k;
+char h, i;
+volatile int l;
+
+int m (int n, int o)
+{
+ return o < 0 || o > 1 ? n : o;
+}
+
+int p (int n, unsigned o)
+{
+ return n - o;
+}
+
+int q ()
+{
+ char r;
+ int a, s, t, u, v, w;
+L:
+ if (t)
+ printf ("%d", d);
+ u = v;
+ while (j)
+ {
+ while (e)
+ for (w = 0; w != 54; w += 6)
+ {
+ l;
+ s = p (u < 1, i || c);
+ r = s < 0 || s > 1 ? 0 : 1 >> s;
+ v = r;
+ g = h;
+ }
+ if (h)
+ return f;
+ if (u)
+ for (a = 0; a != 54; a += 6)
+ f = m (2, -(k || b));
+ }
+ d = t;
+ goto L;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-prof/comp-goto-1.c b/gcc/testsuite/gcc.dg/tree-prof/comp-goto-1.c
index fe768f9a98d..baed1e3fa78 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/comp-goto-1.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/comp-goto-1.c
@@ -1,11 +1,11 @@
/* { dg-require-effective-target freorder } */
/* { dg-require-effective-target label_values } */
/* { dg-options "-O2 -freorder-blocks-and-partition" } */
-/* { dg-add-options stack_size } */
+/* { dg-require-stack-size "4000" } */
#include <stdlib.h>
-#if (!defined(STACK_SIZE) || STACK_SIZE >= 4000) && __INT_MAX__ >= 2147483647
+#if __INT_MAX__ >= 2147483647
typedef unsigned int uint32;
typedef signed int sint32;
diff --git a/gcc/testsuite/gcc.dg/tree-prof/switch-case-2.c b/gcc/testsuite/gcc.dg/tree-prof/switch-case-2.c
index dcd50241eb9..9b0dfc2dbb5 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/switch-case-2.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/switch-case-2.c
@@ -1,4 +1,4 @@
-/* { dg-options "-O2 -fdump-rtl-expand-all" } */
+/* { dg-options "-O2 -fdump-ipa-profile-all" } */
int g;
__attribute__((noinline)) void foo (int n)
@@ -36,5 +36,5 @@ int main ()
return 0;
}
/* autofdo cannot produce such precise execution counts: */
-/* { dg-final-use-not-autofdo { scan-rtl-dump-times ";; basic block\[^\\n\]*count 4000" 2 "expand"} } */
-/* { dg-final-use-not-autofdo { scan-rtl-dump-times ";; basic block\[^\\n\]*count 2000" 1 "expand" { xfail *-*-* } } } */
+/* { dg-final-use-not-autofdo { scan-ipa-dump-times ";; basic block\[^\\n\]*count 4000" 2 "profile"} } */
+/* { dg-final-use-not-autofdo { scan-ipa-dump-times ";; basic block\[^\\n\]*count 2000" 1 "profile"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c
index 2323b7fa3e9..75d3db37ade 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c
@@ -290,7 +290,7 @@ RNG (0, 6, 8, "%s%ls", "1", L"2");
/* Only conditional calls to must_not_eliminate must be made (with
any probability):
- { dg-final { scan-tree-dump-times "> \\\[\[0-9.\]+%\\\] \\\[count: \[0-9INV\]*\\\]:\n *must_not_eliminate" 127 "optimized" { target { ilp32 || lp64 } } } }
- { dg-final { scan-tree-dump-times "> \\\[\[0-9.\]+%\\\] \\\[count: \[0-9INV\]*\\\]:\n *must_not_eliminate" 96 "optimized" { target { { ! ilp32 } && { ! lp64 } } } } }
+ { dg-final { scan-tree-dump-times "> \\\[local count: \[0-9INV\]*\\\]:\n *must_not_eliminate" 127 "optimized" { target { ilp32 || lp64 } } } }
+ { dg-final { scan-tree-dump-times "> \\\[local count: \[0-9INV\]*\\\]:\n *must_not_eliminate" 96 "optimized" { target { { ! ilp32 } && { ! lp64 } } } } }
No unconditional calls to abort should be made:
{ dg-final { scan-tree-dump-not ";\n *must_not_eliminate" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dump-2.c b/gcc/testsuite/gcc.dg/tree-ssa/dump-2.c
index 6ae2ef5bf39..20f99c2df12 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/dump-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/dump-2.c
@@ -6,4 +6,4 @@ int f(void)
return 0;
}
-/* { dg-final { scan-tree-dump "<bb \[0-9\]> \\\[100\\\.00%\\\] \\\[count: INV\\\]:" "optimized" } } */
+/* { dg-final { scan-tree-dump "<bb \[0-9\]> \\\[local count: 10000\\\]:" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c
index 4097145eba6..75a8ab9b1d5 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c
@@ -26,5 +26,5 @@ int foo (int x, int n)
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 1 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c
index a0333fbb28c..10f3d534adc 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-11.c
@@ -24,5 +24,4 @@ int foo (float *x)
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* Sum is wrong here, but not enough for error to be reported. */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 0 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
index 535c1f0eb6c..9468c070489 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
@@ -29,6 +29,5 @@ int foo (int x)
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* Sum is wrong here, but not enough for error to be reported. */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 0 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-1.c
index 8badc762267..b55a533e374 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-1.c
@@ -39,4 +39,4 @@ int main1 ()
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 1 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-2.c
index a517f6552e6..9249f3020b7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-20040816-2.c
@@ -43,5 +43,4 @@ void foo(const int * __restrict__ zr_in,
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* Sum is wrong here, but not enough for error to be reported. */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 0 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
index 58260dd878b..35595aa98e3 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
@@ -27,4 +27,4 @@ dct_unquantize_h263_inter_c (short *block, int n, int qscale, int nCoeffs)
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 1 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-8.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-8.c
index 6c26c209212..c2007486500 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-8.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-8.c
@@ -22,5 +22,4 @@ void test ()
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* Sum is wrong here, but not enough for error to be reported. */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 0 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c
index 789cb6ae23a..fce181e2fee 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c
@@ -26,4 +26,4 @@ int foo (int x, int n)
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 1 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-cd.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-cd.c
index 11e142af321..4932cd75a13 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-cd.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-cd.c
@@ -32,5 +32,4 @@ void foo (int *x1, int *x2, int *x3, int *x4, int *y)
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* Sum is wrong here, but not enough for error to be reported. */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 0 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr56541.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr56541.c
index 9682fbc15df..71d6398897a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr56541.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr56541.c
@@ -29,5 +29,4 @@ void foo()
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* Sum is wrong here, but not enough for error to be reported. */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 0 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr68583.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr68583.c
index b128deb4a21..6739fad9f6c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr68583.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr68583.c
@@ -26,5 +26,5 @@ void foo (long *a)
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 1 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
index 3ba7de5e6a5..a9f4ff669f8 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
@@ -20,5 +20,4 @@ void foo (int a[], int b[])
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* Sum is wrong here, but not enough for error to be reported. */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 0 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-2.c
index 07589fd7928..c9e7c1b96ea 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-2.c
@@ -21,4 +21,4 @@ foo (const char *u, const char *v, long n)
which is folded by vectorizer. Both outgoing edges must have probability
100% so the resulting profile match after folding. */
/* { dg-final { scan-tree-dump-times "Invalid sum of outgoing probabilities 200.0" 1 "ifcvt" } } */
-/* { dg-final { scan-tree-dump-times "Invalid sum of incoming frequencies" 1 "ifcvt" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum of incoming counts" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c
index 4efc0a4a696..b3617f685a1 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c
@@ -45,5 +45,5 @@ mad_synth_mute (struct mad_synth *synth)
return;
}
-/* { dg-final { scan-tree-dump "distributed: split to 0 loops and 4 library calls" "ldist" } } */
-/* { dg-final { scan-tree-dump-times "generated memset zero" 4 "ldist" } } */
+/* { dg-final { scan-tree-dump "Loop nest . distributed: split to 0 loops and 1 library calls" "ldist" } } */
+/* { dg-final { scan-tree-dump-times "generated memset zero" 1 "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c
index 3580c65f09b..b1fd024a942 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c
@@ -11,7 +11,8 @@ struct st
double c[M][N];
};
-int __attribute__ ((noinline)) foo (struct st *s)
+int __attribute__ ((noinline))
+foo (struct st *s)
{
int i, j;
for (i = 0; i != M;)
@@ -29,9 +30,11 @@ L2:
return 0;
}
-int main (void)
+struct st s;
+
+int
+main (void)
{
- struct st s;
return foo (&s);
}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-32.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-32.c
new file mode 100644
index 00000000000..477d222fb3b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-32.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+
+struct st
+{
+ int a[M][N];
+ int c[M];
+ int b[M][N];
+};
+
+void
+foo (struct st *p)
+{
+ for (unsigned i = 0; i < M; ++i)
+ {
+ p->c[i] = 0;
+ for (unsigned j = N; j > 0; --j)
+ {
+ p->a[i][j - 1] = 0;
+ p->b[i][j - 1] = 0;
+ }
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "Loop nest . distributed: split to 0 loops and 1 library" 1 "ldist" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_memset \\(.*, 0, 1049600\\);" 1 "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-35.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-35.c
new file mode 100644
index 00000000000..445d23d114b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-35.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+
+struct st
+{
+ int a[M][N];
+ int c[M];
+ int b[M][N];
+};
+
+void
+foo (struct st * restrict p, struct st * restrict q)
+{
+ for (unsigned i = 0; i < M; ++i)
+ {
+ p->c[i] = 0;
+ for (unsigned j = N; j > 0; --j)
+ {
+ p->a[i][j - 1] = q->a[i][j - 1];
+ p->b[i][j - 1] = 0;
+ }
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "Loop nest . distributed: split to 0 loops and 1 library" 1 "ldist" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-36.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-36.c
new file mode 100644
index 00000000000..0e843f4dd55
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-36.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+
+struct st
+{
+ int a[M][N];
+ int c[M];
+ int b[M][N];
+};
+
+void
+foo (struct st * restrict p)
+{
+ for (unsigned i = 0; i < M; ++i)
+ {
+ p->c[i] = 0;
+ for (unsigned j = N; j > 0; --j)
+ {
+ p->b[i][j - 1] = p->a[i][j - 1];
+ p->a[i][j - 1] = 0;
+ }
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "Loop nest . distributed: split to 0 loops and 3 library" 1 "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-1.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-1.c
index 0193c6e52fc..01c37a56671 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/loop-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-1.c
@@ -46,7 +46,7 @@ int xxx(void)
/* CRIS keeps the address in a register. */
/* m68k sometimes puts the address in a register, depending on CPU and PIC. */
-/* { dg-final { scan-assembler-times "foo" 5 { xfail hppa*-*-* ia64*-*-* sh*-*-* cris-*-* crisv32-*-* fido-*-* m68k-*-* i?86-*-mingw* i?86-*-cygwin* x86_64-*-mingw* visium-*-* } } } */
+/* { dg-final { scan-assembler-times "foo" 5 { xfail hppa*-*-* ia64*-*-* sh*-*-* cris-*-* crisv32-*-* fido-*-* m68k-*-* i?86-*-mingw* i?86-*-cygwin* x86_64-*-mingw* visium-*-* nvptx*-*-* } } } */
/* { dg-final { scan-assembler-times "foo,%r" 5 { target hppa*-*-* } } } */
/* { dg-final { scan-assembler-times "= foo" 5 { target ia64*-*-* } } } */
/* { dg-final { scan-assembler-times "call\[ \t\]*_foo" 5 { target i?86-*-mingw* i?86-*-cygwin* } } } */
@@ -55,3 +55,4 @@ int xxx(void)
/* { dg-final { scan-assembler-times "Jsr \\\$r" 5 { target cris-*-* } } } */
/* { dg-final { scan-assembler-times "\[jb\]sr" 5 { target fido-*-* m68k-*-* } } } */
/* { dg-final { scan-assembler-times "bra *tr,r\[1-9\]*,r21" 5 { target visium-*-* } } } */
+/* { dg-final { scan-assembler-times "(?n)\[ \t\]call\[ \t\].*\[ \t\]foo," 5 { target nvptx*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/negneg-1.c b/gcc/testsuite/gcc.dg/tree-ssa/negneg-1.c
new file mode 100644
index 00000000000..9c6c36998e5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/negneg-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O -frounding-math -fdump-tree-optimized-raw -Wno-psabi" } */
+
+#define DEF(num, T1, T2) T2 f##num(T1 x) { \
+ T1 y = -x; \
+ T2 z = (T2)y; \
+ return -z; \
+}
+DEF(0, int, long long)
+DEF(1, int, unsigned long long)
+DEF(2, long long, int)
+DEF(3, unsigned long long, int)
+DEF(4, long long, unsigned)
+DEF(5, unsigned long long, unsigned)
+DEF(6, float, double)
+
+typedef int vec __attribute__((vector_size(4*sizeof(int))));
+typedef unsigned uvec __attribute__((vector_size(4*sizeof(int))));
+void h(vec*p,uvec*q){
+ vec a = -*p;
+ *q = -(uvec)a;
+}
+
+/* { dg-final { scan-tree-dump-not "negate_expr" "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/negneg-2.c b/gcc/testsuite/gcc.dg/tree-ssa/negneg-2.c
new file mode 100644
index 00000000000..bd6198e633b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/negneg-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fno-rounding-math -fdump-tree-optimized-raw" } */
+
+#define DEF(num, T1, T2) T2 f##num(T1 x) { \
+ T1 y = -x; \
+ T2 z = (T2)y; \
+ return -z; \
+}
+DEF(0, double, float)
+
+/* { dg-final { scan-tree-dump-not "negate_expr" "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/negneg-3.c b/gcc/testsuite/gcc.dg/tree-ssa/negneg-3.c
new file mode 100644
index 00000000000..9deb9f6f320
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/negneg-3.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -frounding-math -fdump-tree-optimized-raw" } */
+
+// This assumes that long long is strictly larger than int
+
+#define DEF(num, T1, T2) T2 f##num(T1 x) { \
+ T1 y = -x; \
+ T2 z = (T2)y; \
+ return -z; \
+}
+DEF(0, unsigned, long long)
+DEF(1, unsigned, unsigned long long)
+DEF(2, double, float)
+
+/* { dg-final { scan-tree-dump-times "negate_expr" 6 "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/negneg-4.c b/gcc/testsuite/gcc.dg/tree-ssa/negneg-4.c
new file mode 100644
index 00000000000..e1131d06f64
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/negneg-4.c
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+/* { dg-options "-O -fwrapv" } */
+
+#define DEF(num, T1, T2) T2 f##num(T1 x) { \
+ T1 y = -x; \
+ T2 z = (T2)y; \
+ return -z; \
+}
+DEF(0, int, long long)
+
+int main(){
+ volatile int a = -1 - __INT_MAX__;
+ volatile long long b = f0 (a);
+ volatile long long c = a;
+ volatile long long d = -c;
+ if (b != d)
+ __builtin_abort();
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/noreturn-1.c b/gcc/testsuite/gcc.dg/tree-ssa/noreturn-1.c
new file mode 100644
index 00000000000..ae7ee42fabc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/noreturn-1.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ssa -std=gnu11" } */
+/* { dg-final { scan-tree-dump-times "__builtin_unreachable" 4 "ssa" } } */
+
+void bar1 (void);
+void bar2 (void);
+void bar3 (void);
+void bar4 (void);
+
+_Noreturn void
+foo1 (int *p, int y)
+{
+ bar1 ();
+ *p = y;
+ return; /* { dg-warning "function declared 'noreturn' has a 'return' statement" } */
+} /* { dg-warning "'noreturn' function does return" "" { target *-*-* } .-1 } */
+
+_Noreturn void
+foo2 (int *p, int y)
+{
+ bar2 ();
+ *p = y;
+} /* { dg-warning "'noreturn' function does return" } */
+
+_Noreturn void
+foo3 (int *p, int y)
+{
+ if (y > 10)
+ return; /* { dg-warning "function declared 'noreturn' has a 'return' statement" } */
+ bar3 ();
+ *p = y;
+ return; /* { dg-warning "function declared 'noreturn' has a 'return' statement" } */
+} /* { dg-warning "'noreturn' function does return" } */
+
+_Noreturn void
+foo4 (int *p, int y)
+{
+ if (y > 10)
+ return; /* { dg-warning "function declared 'noreturn' has a 'return' statement" } */
+ bar4 ();
+ *p = y;
+} /* { dg-warning "'noreturn' function does return" } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr82574.c b/gcc/testsuite/gcc.dg/tree-ssa/pr82574.c
new file mode 100644
index 00000000000..8fc459631ef
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr82574.c
@@ -0,0 +1,19 @@
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+unsigned char a, b, c, d[200][200];
+
+void abort (void);
+
+int main ()
+{
+ for (; a < 200; a++)
+ for (b = 0; b < 200; b++)
+ if (c)
+ d[a][b] = 1;
+
+ if ((c && d[0][0] != 1) || (!c && d[0][0] != 0))
+ abort ();
+
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp101.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp101.c
index c9feb256857..aad41f91f47 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp101.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp101.c
@@ -10,4 +10,4 @@ int main ()
return 0;
}
-/* { dg-final { scan-tree-dump "<bb 2> \\\[\[0-9.\]+%\\\] \\\[count: \[0-9INV\]*\\\]:\[\n\r \]*return 0;" "optimized" } } */
+/* { dg-final { scan-tree-dump "<bb 2> \\\[\[0-9.\]+%\\\] \\\[count: \[0-9INV\]*\\\]:\[\n\r \]*return 0;" "optimized" { xfail aarch64*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/ubsan/float-cast-overflow-bf.c b/gcc/testsuite/gcc.dg/ubsan/float-cast-overflow-bf.c
index 16268603375..538d900b0ab 100644
--- a/gcc/testsuite/gcc.dg/ubsan/float-cast-overflow-bf.c
+++ b/gcc/testsuite/gcc.dg/ubsan/float-cast-overflow-bf.c
@@ -48,25 +48,25 @@ main (void)
return 0;
}
-/* { dg-output "value -2.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -2 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -2.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -2 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value 2.5 is outside the range of representable values of type" } */
+/* { dg-output " -2.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -2 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -2.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -2 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1.5 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* -1 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2 is outside the range of representable values of type\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]* 2.5 is outside the range of representable values of type" } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-101.c b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-101.c
index cc04959d187..91eb28218bd 100644
--- a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-101.c
+++ b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-101.c
@@ -45,6 +45,6 @@ int main (void)
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 1 "vect" { target vect_1_size } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { target vect_2_sizes } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 3 "vect" { target vect_3_sizes } } } */
+/* { dg-final { scan-tree-dump-times "can't determine dependence" 1 "vect" { target { ! vect_multiple_sizes } } } } */
+/* { dg-final { scan-tree-dump "can't determine dependence" "vect" { target vect_multiple_sizes } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-102.c b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-102.c
index f32561dc4fd..51f62788dbf 100644
--- a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-102.c
+++ b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-102.c
@@ -50,6 +50,6 @@ int main (void)
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 "vect" { target vect_1_size } } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 2 "vect" { target vect_2_sizes } } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 3 "vect" { target vect_3_sizes } } } */
+/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 "vect" { target { ! vect_multiple_sizes } } } } */
+/* { dg-final { scan-tree-dump "possible dependence between data-refs" "vect" { target vect_multiple_sizes } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-102a.c b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-102a.c
index 79a6ee45f7a..581438823fd 100644
--- a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-102a.c
+++ b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-102a.c
@@ -50,6 +50,6 @@ int main (void)
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 "vect" { target vect_1_size } } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 2 "vect" { target vect_2_sizes } } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 3 "vect" { target vect_3_sizes } } } */
+/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 "vect" { target { ! vect_multiple_sizes } } } } */
+/* { dg-final { scan-tree-dump "possible dependence between data-refs" "vect" { target vect_multiple_sizes } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-37.c b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-37.c
index d0673e93fb2..6f4c84b4cd2 100644
--- a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-37.c
+++ b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-37.c
@@ -58,6 +58,5 @@ int main (void)
If/when the aliasing problems are resolved, unalignment may
prevent vectorization on some targets. */
/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { target vect_1_size } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 4 "vect" { target vect_2_sizes } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 6 "vect" { target vect_3_sizes } } } */
+/* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { target { ! vect_multiple_sizes } } } } */
+/* { dg-final { scan-tree-dump "can't determine dependence" "vect" { target vect_multiple_sizes } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-79.c b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-79.c
index 232d8e526ed..6e9ddcfa5ce 100644
--- a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-79.c
+++ b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-79.c
@@ -46,6 +46,5 @@ int main (void)
If/when the aliasing problems are resolved, unalignment may
prevent vectorization on some targets. */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 1 "vect" { target vect_1_size } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { target vect_2_sizes } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 3 "vect" { target vect_3_sizes } } } */
+/* { dg-final { scan-tree-dump-times "can't determine dependence" 1 "vect" { target { ! vect_multiple_sizes } } } } */
+/* { dg-final { scan-tree-dump "can't determine dependence" "vect" { target vect_multiple_sizes } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr25413a.c b/gcc/testsuite/gcc.dg/vect/pr25413a.c
index 36e786f4f7d..a80ca868112 100644
--- a/gcc/testsuite/gcc.dg/vect/pr25413a.c
+++ b/gcc/testsuite/gcc.dg/vect/pr25413a.c
@@ -124,5 +124,5 @@ int main (void)
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { ! vector_alignment_reachable } xfail { vect_element_align_preferred } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { ! vector_alignment_reachable } xfail { vect_element_align_preferred } } } } */
+/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { ! vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { ! vector_alignment_reachable } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr31699.c b/gcc/testsuite/gcc.dg/vect/pr31699.c
index c73d11a224e..b0b9971fcfc 100644
--- a/gcc/testsuite/gcc.dg/vect/pr31699.c
+++ b/gcc/testsuite/gcc.dg/vect/pr31699.c
@@ -7,9 +7,9 @@
float x[256];
__attribute__ ((noinline))
-double *foo(void)
+float *foo(void)
{
- double *z = malloc (sizeof(double) * 256);
+ float *z = malloc (sizeof(float) * 256);
int i;
for (i=0; i<256; ++i)
@@ -34,5 +34,5 @@ int main()
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_intfloat_cvt } } } */
-/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { ! vector_alignment_reachable } xfail { vect_element_align_preferred } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { ! vector_alignment_reachable } xfail { vect_element_align_preferred } } } } */
+/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { ! vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { ! vector_alignment_reachable } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr45752.c b/gcc/testsuite/gcc.dg/vect/pr45752.c
index 22a398f0acc..755205b275a 100644
--- a/gcc/testsuite/gcc.dg/vect/pr45752.c
+++ b/gcc/testsuite/gcc.dg/vect/pr45752.c
@@ -103,8 +103,8 @@ int main (int argc, const char* argv[])
26776, 9542, 363804, 169059, 25853, 36596, 12962, 503404, 224634,
35463 };
#else
- unsigned int check_results[N];
- unsigned int check_results2[N];
+ volatile unsigned int check_results[N];
+ volatile unsigned int check_results2[N];
for (i = 0; i < N / 5; i++)
{
@@ -140,7 +140,7 @@ int main (int argc, const char* argv[])
check_results2[i * 5 + 4] = (M40 * a + M41 * b + M42 * c
+ M43 * d + M44 * e);
- asm volatile ("");
+ asm volatile ("" ::: "memory");
}
#endif
diff --git a/gcc/testsuite/gcc.dg/vect/pr65947-5.c b/gcc/testsuite/gcc.dg/vect/pr65947-5.c
index 3e34b7a2644..709f17f80a4 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65947-5.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-5.c
@@ -37,7 +37,7 @@ main (void)
for (int i = 32; i < N; ++i)
{
a[i] = 70 + (i & 3);
- asm volatile ("");
+ asm volatile ("" ::: "memory");
}
check_vect ();
diff --git a/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c b/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
index b5aaa924bb2..e3466d0da1d 100644
--- a/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
+++ b/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
@@ -120,4 +120,4 @@ int main (void)
/* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" { target vect_int } } } */
/* Alignment forced using versioning until the pass that increases alignment
is extended to handle structs. */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target { vect_int && { ! vector_alignment_reachable } } xfail { vect_element_align_preferred } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target {vect_int && {! vector_alignment_reachable} } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-19c.c b/gcc/testsuite/gcc.dg/vect/slp-19c.c
index de47f7760c4..cda6a096332 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-19c.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-19c.c
@@ -21,7 +21,7 @@ main1 ()
for (unsigned int i = 0; i < N * 8; ++i)
{
in[i] = i & 63;
- asm volatile ("");
+ asm volatile ("" ::: "memory");
}
#endif
unsigned int ia[N*2], a0, a1, a2, a3;
diff --git a/gcc/testsuite/gcc.dg/vect/slp-23.c b/gcc/testsuite/gcc.dg/vect/slp-23.c
index 8dd95528cae..88708e645d6 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-23.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-23.c
@@ -107,6 +107,8 @@ int main (void)
/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target { vect_strided8 && { ! { vect_no_align} } } } } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! { vect_strided8 || vect_no_align } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect_any_perm } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target vect_any_perm xfail vect_variable_length } } } */
+/* We fail to vectorize the second loop with variable-length SVE but
+ fall back to 128-bit vectors, which does use SLP. */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect_perm } xfail aarch64_sve } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target vect_perm } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-28.c b/gcc/testsuite/gcc.dg/vect/slp-28.c
index d9f50d1c097..4211b94ad7f 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-28.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-28.c
@@ -57,7 +57,8 @@ main1 ()
abort ();
}
- /* Vectorizable with a fully-masked loop or if VF==8. */
+ /* Not vectorizable because of data dependencies: distance 3 is greater than
+ the actual VF with SLP (2), but the analysis fails to detect that for now. */
for (i = 3; i < N/4; i++)
{
in3[i*4] = in3[(i-3)*4] + 5;
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-1.c b/gcc/testsuite/gcc.dg/vect/slp-perm-1.c
index 9a0835575d3..6bd16ef43b0 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-1.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-1.c
@@ -51,7 +51,7 @@ int main (int argc, const char* argv[])
#if N == 16
unsigned int check_results[N] = {1470, 395, 28271, 5958, 1655, 111653, 10446, 2915, 195035, 14934, 4175, 278417, 19422, 5435, 361799, 0};
#else
- unsigned int check_results[N] = {};
+ volatile unsigned int check_results[N] = {};
for (unsigned int i = 0; i < N / 3; i++)
{
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-4.c b/gcc/testsuite/gcc.dg/vect/slp-perm-4.c
index a706f1792f5..3a4420c53e4 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-4.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-4.c
@@ -70,15 +70,15 @@ int main (int argc, const char* argv[])
{
input[i] = i%256;
output[i] = 0;
- __asm__ volatile ("");
+ asm volatile ("" ::: "memory");
}
#if N == 20
unsigned int check_results[N]
- = {3208, 1334, 28764, 35679, 2789, 13028, 4754, 168364, 91254, 12399,
+ = {3208, 1334, 28764, 35679, 2789, 13028, 4754, 168364, 91254, 12399,
22848, 8174, 307964, 146829, 22009, 32668, 11594, 447564, 202404, 31619};
#else
- unsigned int check_results[N];
+ volatile unsigned int check_results[N];
for (i = 0; i < N / 5; i++)
{
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-5.c b/gcc/testsuite/gcc.dg/vect/slp-perm-5.c
index edd36ab5255..52939133ca8 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-5.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-5.c
@@ -72,8 +72,8 @@ int main (int argc, const char* argv[])
int check_results2[N] = { 4322, 135, 13776, 629, 23230, 1123, 32684, 1617,
42138, 2111, 0, 0, 0, 0, 0, 0 };
#else
- int check_results[N] = {};
- int check_results2[N] = {};
+ volatile int check_results[N] = {};
+ volatile int check_results2[N] = {};
for (int i = 0; i < N / 3; i++)
{
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-6.c b/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
index e44ab555d98..4eb648ac71b 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
@@ -71,8 +71,8 @@ int main (int argc, const char* argv[])
int check_results2[N] = { 0, 112, 810, 336, 1620, 560, 2430, 784, 3240, 1008,
0, 0, 0, 0, 0, 0 };
#else
- int check_results[N] = {};
- int check_results2[N] = {};
+ volatile int check_results[N] = {};
+ volatile int check_results2[N] = {};
for (int i = 0; i < N / 3; i++)
{
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-7.c b/gcc/testsuite/gcc.dg/vect/slp-perm-7.c
index bc06a78d2e5..baf7f7888a3 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-7.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-7.c
@@ -66,8 +66,8 @@ int main (int argc, const char* argv[])
int check_results[N] = {1470, 395, 28271, 5958, 1655, 111653, 10446, 2915, 195035, 14934, 4175, 278417, 19422, 5435, 361799, 0};
int check_results2[N] = {0, 405, 810, 1215, 1620, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
#else
- int check_results[N] = {};
- int check_results2[N] = {};
+ volatile int check_results[N] = {};
+ volatile int check_results2[N] = {};
for (int i = 0; i < N / 3; i++)
{
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-8.c b/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
index c94125f5fb4..94d4455dfd9 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-8.c
@@ -31,11 +31,7 @@ int main (int argc, const char* argv[])
{
unsigned char input[N], output[N];
unsigned char check_results[N];
-#if N < 256
- unsigned char i;
-#else
unsigned int i;
-#endif
check_vect ();
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
index ad0832348dc..b01d493b6e7 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
@@ -37,7 +37,7 @@ int main (int argc, const char* argv[])
{
input[i] = i;
output[i] = 0;
- asm volatile ("");
+ asm volatile ("" ::: "memory");
}
for (i = 0; i < N / 3; i++)
diff --git a/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c b/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c
index d68687930de..5121414260b 100644
--- a/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c
+++ b/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c
@@ -47,10 +47,8 @@ int main (void)
}
/* 2 for the first loop. */
-/* { dg-final { scan-tree-dump-times "Detected reduction\\." 3 "vect" { target { vect_1_size } } } } */
-/* { dg-final { scan-tree-dump-times "Detected reduction\\." 4 "vect" { target { vect_2_sizes } } } } */
-/* { dg-final { scan-tree-dump-times "Detected reduction\\." 5 "vect" { target { vect_3_sizes } } } } */
-/* { dg-final { scan-tree-dump-times "not vectorized" 1 "vect" { target vect_1_size } } } */
-/* { dg-final { scan-tree-dump-times "not vectorized" 2 "vect" { target vect_2_sizes } } } */
-/* { dg-final { scan-tree-dump-times "not vectorized" 3 "vect" { target vect_3_sizes } } } */
+/* { dg-final { scan-tree-dump-times "Detected reduction\\." 3 "vect" { target { ! vect_multiple_sizes } } } } */
+/* { dg-final { scan-tree-dump "Detected reduction\\." "vect" { target vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized" 1 "vect" { target { ! vect_multiple_sizes } } } } */
+/* { dg-final { scan-tree-dump "not vectorized" "vect" { target vect_multiple_sizes } } } */
/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target { ! vect_no_int_min_max } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-104.c b/gcc/testsuite/gcc.dg/vect/vect-104.c
index f86043e6231..a77c98735eb 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-104.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-104.c
@@ -62,6 +62,6 @@ int main (void)
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 "vect" { target vect_1_size } } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 2 "vect" { target vect_2_sizes } } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 3 "vect" { target vect_3_sizes } } } */
+/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 "vect" { target { ! vect_multiple_sizes } } } } */
+/* { dg-final { scan-tree-dump "possible dependence between data-refs" "vect" { target vect_multiple_sizes } } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/vect-109.c b/gcc/testsuite/gcc.dg/vect/vect-109.c
index 566cc8f7ab3..9a507105899 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-109.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-109.c
@@ -76,5 +76,5 @@ int main (void)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_element_align } } } */
/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_element_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target { vect_element_align } xfail { ! vect_unaligned_possible } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target vect_element_align xfail { ! vect_unaligned_possible } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-33.c b/gcc/testsuite/gcc.dg/vect/vect-33.c
index af5aefa2455..e215052ff77 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-33.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-33.c
@@ -37,6 +37,6 @@ int main (void)
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" "vect" { target { { { ! powerpc*-*-* } && vect_hw_misalign } && { ! { vect64 && vect_1_size } } } xfail { ! vect_unaligned_possible } } } } */
-/* { dg-final { scan-tree-dump "Alignment of access forced using peeling" "vect" { target { vector_alignment_reachable && { vect64 && vect_1_size } } } } } */
+/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" "vect" { target { { { ! powerpc*-*-* } && vect_hw_misalign } && { { ! vect64 } || vect_multiple_sizes } } xfail { ! vect_unaligned_possible } } } } */
+/* { dg-final { scan-tree-dump "Alignment of access forced using peeling" "vect" { target { vector_alignment_reachable && { vect64 && {! vect_multiple_sizes} } } } } } */
/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { {! vector_alignment_reachable} || {! vect64} } && {! vect_hw_misalign} } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-42.c b/gcc/testsuite/gcc.dg/vect/vect-42.c
index 55adfc93df3..a65b4a62276 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-42.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-42.c
@@ -67,5 +67,5 @@ int main (void)
/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target { vect_no_align && { ! vect_hw_misalign } } } } } */
/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_element_align } } } } } */
/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { { ! vector_alignment_reachable } || vect_element_align } } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target { vect_element_align } xfail { ! { vect_unaligned_possible && vect_align_stack_vars } } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target vect_element_align xfail { ! { vect_unaligned_possible && vect_align_stack_vars } } } } } */
/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { { ! vector_alignment_reachable } || vect_element_align } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-44.c b/gcc/testsuite/gcc.dg/vect/vect-44.c
index 96dab0ac03e..03ef2c0f671 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-44.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-44.c
@@ -66,6 +66,6 @@ int main (void)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { xfail { ! vect_unaligned_possible } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { { vect_no_align && { ! vect_hw_misalign } } || { { ! vector_alignment_reachable } && { ! vect_element_align_preferred } } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { { vect_no_align && { ! vect_hw_misalign } } || {! vector_alignment_reachable} } } } } */
/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target { vect_no_align && { ! vect_hw_misalign } } } } } */
/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_hw_misalign} } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-50.c b/gcc/testsuite/gcc.dg/vect/vect-50.c
index e11036bf130..c9500ca91e5 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-50.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-50.c
@@ -62,6 +62,6 @@ int main (void)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { xfail { ! vect_unaligned_possible } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { { vect_no_align && { ! vect_hw_misalign } } || { { ! vector_alignment_reachable } && { ! vect_element_align_preferred } } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { { vect_no_align && { ! vect_hw_misalign } } || {! vector_alignment_reachable} } } } } */
/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target { vect_no_align && { ! vect_hw_misalign } } } } } */
/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_hw_misalign } } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-56.c b/gcc/testsuite/gcc.dg/vect/vect-56.c
index 673ab23a2b4..8060b05e781 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-56.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-56.c
@@ -70,5 +70,5 @@ int main (void)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { ! vect_element_align } xfail { ! vect_unaligned_possible } } } } */
/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { vect_element_align } xfail { ! vect_unaligned_possible } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } xfail { vect_element_align_preferred } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_element_align } xfail { vect_element_align_preferred } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { { ! vect_element_align } || vect_element_align_preferred} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_element_align && { ! vect_element_align_preferred } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-60.c b/gcc/testsuite/gcc.dg/vect/vect-60.c
index 9dcfd85ce9c..3b7477c96ab 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-60.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-60.c
@@ -71,5 +71,5 @@ int main (void)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { ! vect_element_align } xfail { ! vect_unaligned_possible } } } } */
/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { vect_element_align } xfail { ! vect_unaligned_possible } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } xfail { vect_element_align_preferred } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_element_align } xfail { vect_element_align_preferred } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { { ! vect_element_align } || vect_element_align_preferred } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_element_align && { ! vect_element_align_preferred } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-70.c b/gcc/testsuite/gcc.dg/vect/vect-70.c
index 8f212571693..793dbfb7481 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-70.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-70.c
@@ -12,6 +12,7 @@
#define N (NINTS * 6)
+/* Keep execution time down. */
#if N <= 24
#define OUTERN N
#else
@@ -30,6 +31,7 @@ struct test1{
struct s e[N]; /* array e.n is aligned */
};
+/* Avoid big local temporaries. */
#if NINTS > 8
struct test1 tmp1;
#endif
diff --git a/gcc/testsuite/gcc.dg/vect/vect-91.c b/gcc/testsuite/gcc.dg/vect/vect-91.c
index ffa95b71d24..9430da3290a 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-91.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-91.c
@@ -7,13 +7,14 @@
#define N 256
+/* Pick a value greater than the vector length. */
#if VECTOR_BITS > 128
#define OFF (VECTOR_BITS * 5 / 32)
#else
#define OFF 20
#endif
-extern int a[N+OFF];
+extern int a[N + OFF];
/* The alignment of 'pa' is unknown.
Yet we do know that both the read access and write access have
@@ -58,7 +59,7 @@ main3 ()
for (i = 0; i < N; i++)
{
- a[i] = a[i+OFF];
+ a[i] = a[i + OFF];
}
return 0;
diff --git a/gcc/testsuite/gcc.dg/vect/vect-96.c b/gcc/testsuite/gcc.dg/vect/vect-96.c
index c72595f97b8..0cb935b9f16 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-96.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-96.c
@@ -48,7 +48,7 @@ int main (void)
For targets that don't support unaligned loads, version for the store. */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { { {! vect_no_align} && vector_alignment_reachable } && { ! vect_align_stack_vars } } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { { {! vect_no_align} && vector_alignment_reachable } && vect_align_stack_vars } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { { {! vect_no_align} && vector_alignment_reachable } && { ! vect_align_stack_vars } } xfail { ! vect_unaligned_possible } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { { {! vect_no_align} && vector_alignment_reachable } && vect_align_stack_vars } xfail { ! vect_unaligned_possible } } } } */
/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || { { ! vector_alignment_reachable} || vect_element_align } } } } } */
/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { { vect_no_align && { ! vect_hw_misalign } } || { {! vector_alignment_reachable} && {! vect_element_align} } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c b/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
index 9f96601e4a7..378a5fe642a 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
@@ -90,5 +90,5 @@ int main (void)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { { ! vect_unaligned_possible } || vect_sizes_32B_16B } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { { ! vect_unaligned_possible } || vect_sizes_32B_16B } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { target { vect_no_align && { { ! vect_hw_misalign } && vect_sizes_32B_16B } } }} } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-peel-3.c b/gcc/testsuite/gcc.dg/vect/vect-peel-3.c
index 231c13dfc28..d5c0cf10ce1 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-peel-3.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-peel-3.c
@@ -6,7 +6,7 @@
#if VECTOR_BITS > 128
#define NINTS (VECTOR_BITS / 32)
-#define EXTRA NINTS * 2
+#define EXTRA (NINTS * 2)
#else
#define NINTS 4
#define EXTRA 10
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c
index dc4f52019d5..ac674749b6f 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c
@@ -1,4 +1,7 @@
/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_dotprod_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-additional-options "-march=armv8.2-a+dotprod" { target { aarch64*-*-* } } } */
+/* { dg-add-options arm_v8_2a_dotprod_neon } */
#include <stdarg.h>
#include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c
index f3cc6c78c25..d020f643bb8 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c
@@ -1,4 +1,7 @@
/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target arm_v8_2a_dotprod_neon_hw { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-additional-options "-march=armv8.2-a+dotprod" { target { aarch64*-*-* } } } */
+/* { dg-add-options arm_v8_2a_dotprod_neon } */
#include <stdarg.h>
#include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-compile.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-compile.c
new file mode 100644
index 00000000000..b7378adf8ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-compile.c
@@ -0,0 +1,73 @@
+/* { dg-do compile { target { aarch64*-*-* } } } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" } */
+
+#include <arm_neon.h>
+
+/* Unsigned Dot Product instructions. */
+
+uint32x2_t ufoo (uint32x2_t r, uint8x8_t x, uint8x8_t y)
+{
+ return vdot_u32 (r, x, y);
+}
+
+uint32x4_t ufooq (uint32x4_t r, uint8x16_t x, uint8x16_t y)
+{
+ return vdotq_u32 (r, x, y);
+}
+
+uint32x2_t ufoo_lane (uint32x2_t r, uint8x8_t x, uint8x8_t y)
+{
+ return vdot_lane_u32 (r, x, y, 0);
+}
+
+uint32x2_t ufoo_laneq (uint32x2_t r, uint8x8_t x, uint8x16_t y)
+{
+ return vdot_laneq_u32 (r, x, y, 0);
+}
+
+uint32x4_t ufooq_lane (uint32x4_t r, uint8x16_t x, uint8x8_t y)
+{
+ return vdotq_lane_u32 (r, x, y, 0);
+}
+
+uint32x4_t ufooq_laneq (uint32x4_t r, uint8x16_t x, uint8x16_t y)
+{
+ return vdotq_laneq_u32 (r, x, y, 0);
+}
+
+/* Signed Dot Product instructions. */
+
+int32x2_t sfoo (int32x2_t r, int8x8_t x, int8x8_t y)
+{
+ return vdot_s32 (r, x, y);
+}
+
+int32x4_t sfooq (int32x4_t r, int8x16_t x, int8x16_t y)
+{
+ return vdotq_s32 (r, x, y);
+}
+
+int32x2_t sfoo_lane (int32x2_t r, int8x8_t x, int8x8_t y)
+{
+ return vdot_lane_s32 (r, x, y, 0);
+}
+
+int32x2_t sfoo_laneq (int32x2_t r, int8x8_t x, int8x16_t y)
+{
+ return vdot_laneq_s32 (r, x, y, 0);
+}
+
+int32x4_t sfooq_lane (int32x4_t r, int8x16_t x, int8x8_t y)
+{
+ return vdotq_lane_s32 (r, x, y, 0);
+}
+
+int32x4_t sfooq_laneq (int32x4_t r, int8x16_t x, int8x16_t y)
+{
+ return vdotq_laneq_s32 (r, x, y, 0);
+}
+
+/* { dg-final { scan-assembler-times {[us]dot\tv[0-9]+\.2s, v[0-9]+\.8b, v[0-9]+\.8b} 2 } } */
+/* { dg-final { scan-assembler-times {[us]dot\tv[0-9]+\.2s, v[0-9]+\.8b, v[0-9]+\.4b\[[0-9]+\]} 4 } } */
+/* { dg-final { scan-assembler-times {[us]dot\tv[0-9]+\.4s, v[0-9]+\.16b, v[0-9]+\.16b} 2 } } */
+/* { dg-final { scan-assembler-times {[us]dot\tv[0-9]+\.4s, v[0-9]+\.16b, v[0-9]+\.4b\[[0-9]+\]} 4 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c
new file mode 100644
index 00000000000..3e7cd6c2fc2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c
@@ -0,0 +1,81 @@
+/* { dg-skip-if "can't compile on arm." { arm*-*-* } } */
+/* { dg-do run { target { aarch64*-*-* } } } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" } */
+/* { dg-require-effective-target arm_v8_2a_dotprod_neon_hw } */
+
+#include <arm_neon.h>
+
+extern void abort();
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+# define ORDER(x, y) y
+#else
+# define ORDER(x, y) x - y
+#endif
+
+#define P(n1,n2) n1,n1,n1,n1,n2,n2,n2,n2
+#define ARR(nm, p, ty, ...) ty nm##_##p = { __VA_ARGS__ }
+#define TEST(t1, t2, t3, f, r1, r2, n1, n2) \
+ ARR(f, x, t1, r1); \
+ ARR(f, y, t2, r2); \
+ t3 f##_##r = {0}; \
+ f##_##r = f (f##_##r, f##_##x, f##_##y); \
+ if (f##_##r[0] != n1 || f##_##r[1] != n2) \
+ abort ();
+
+#define TEST_LANE(t1, t2, t3, f, r1, r2, n1, n2, n3, n4) \
+ ARR(f, x, t1, r1); \
+ ARR(f, y, t2, r2); \
+ t3 f##_##rx = {0}; \
+ f##_##rx = f (f##_##rx, f##_##x, f##_##y, ORDER (1, 0)); \
+ if (f##_##rx[0] != n1 || f##_##rx[1] != n2) \
+ abort (); \
+ t3 f##_##rx1 = {0}; \
+ f##_##rx1 = f (f##_##rx1, f##_##x, f##_##y, ORDER (1, 1)); \
+ if (f##_##rx1[0] != n3 || f##_##rx1[1] != n4) \
+ abort ();
+
+#define Px(n1,n2,n3,n4) P(n1,n2),P(n3,n4)
+#define TEST_LANEQ(t1, t2, t3, f, r1, r2, n1, n2, n3, n4, n5, n6, n7, n8) \
+ ARR(f, x, t1, r1); \
+ ARR(f, y, t2, r2); \
+ t3 f##_##rx = {0}; \
+ f##_##rx = f (f##_##rx, f##_##x, f##_##y, ORDER (3, 0)); \
+ if (f##_##rx[0] != n1 || f##_##rx[1] != n2) \
+ abort (); \
+ t3 f##_##rx1 = {0}; \
+ f##_##rx1 = f (f##_##rx1, f##_##x, f##_##y, ORDER (3, 1)); \
+ if (f##_##rx1[0] != n3 || f##_##rx1[1] != n4) \
+ abort (); \
+ t3 f##_##rx2 = {0}; \
+ f##_##rx2 = f (f##_##rx2, f##_##x, f##_##y, ORDER (3, 2)); \
+ if (f##_##rx2[0] != n5 || f##_##rx2[1] != n6) \
+ abort (); \
+ t3 f##_##rx3 = {0}; \
+ f##_##rx3 = f (f##_##rx3, f##_##x, f##_##y, ORDER (3, 3)); \
+ if (f##_##rx3[0] != n7 || f##_##rx3[1] != n8) \
+ abort ();
+
+int
+main()
+{
+ TEST (uint8x8_t, uint8x8_t, uint32x2_t, vdot_u32, P(1,2), P(2,3), 8, 24);
+ TEST (int8x8_t, int8x8_t, int32x2_t, vdot_s32, P(1,2), P(-2,-3), -8, -24);
+
+ TEST (uint8x16_t, uint8x16_t, uint32x4_t, vdotq_u32, P(1,2), P(2,3), 8, 24);
+ TEST (int8x16_t, int8x16_t, int32x4_t, vdotq_s32, P(1,2), P(-2,-3), -8, -24);
+
+ TEST_LANE (uint8x8_t, uint8x8_t, uint32x2_t, vdot_lane_u32, P(1,2), P(2,3), 8, 16, 12, 24);
+ TEST_LANE (int8x8_t, int8x8_t, int32x2_t, vdot_lane_s32, P(1,2), P(-2,-3), -8, -16, -12, -24);
+
+ TEST_LANE (uint8x16_t, uint8x8_t, uint32x4_t, vdotq_lane_u32, P(1,2), P(2,3), 8, 16, 12, 24);
+ TEST_LANE (int8x16_t, int8x8_t, int32x4_t, vdotq_lane_s32, P(1,2), P(-2,-3), -8, -16, -12, -24);
+
+ TEST_LANEQ (uint8x8_t, uint8x16_t, uint32x2_t, vdot_laneq_u32, P(1,2), Px(2,3,1,4), 8, 16, 12, 24, 4, 8, 16, 32);
+ TEST_LANEQ (int8x8_t, int8x16_t, int32x2_t, vdot_laneq_s32, P(1,2), Px(-2,-3,-1,-4), -8, -16, -12, -24, -4, -8, -16, -32);
+
+ TEST_LANEQ (uint8x16_t, uint8x16_t, uint32x4_t, vdotq_laneq_u32, Px(1,2,2,1), Px(2,3,1,4), 8, 16, 12, 24, 4, 8, 16, 32);
+ TEST_LANEQ (int8x16_t, int8x16_t, int32x4_t, vdotq_laneq_s32, Px(1,2,2,1), Px(-2,-3,-1,-4), -8, -16, -12, -24, -4, -8, -16, -32);
+
+ return 0;
+}
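The expected values in the TEST invocations above can be checked by hand: with P(1,2) the x operand is {1,1,1,1,2,2,2,2} and with P(2,3) the y operand is {2,2,2,2,3,3,3,3}; each 32-bit lane of a dot product accumulates four byte products, giving 4 * 1 * 2 = 8 in lane 0 and 4 * 2 * 3 = 24 in lane 1. Below is a minimal hand-expanded sketch of the first TEST line, not taken from the patch itself, assuming the same <arm_neon.h> dot-product intrinsics and -march=armv8.2-a+dotprod options used above.

#include <arm_neon.h>

extern void abort (void);

/* Hand-expanded equivalent of
   TEST (uint8x8_t, uint8x8_t, uint32x2_t, vdot_u32, P(1,2), P(2,3), 8, 24).  */
void
check_vdot_u32 (void)
{
  uint8x8_t x = { 1, 1, 1, 1, 2, 2, 2, 2 };
  uint8x8_t y = { 2, 2, 2, 2, 3, 3, 3, 3 };
  uint32x2_t r = { 0, 0 };
  /* Lane 0 accumulates 4 * (1 * 2) = 8; lane 1 accumulates 4 * (2 * 3) = 24.  */
  r = vdot_u32 (r, x, y);
  if (r[0] != 8 || r[1] != 24)
    abort ();
}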
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-qi.h b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-qi.h
new file mode 100644
index 00000000000..90b00aff95c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-qi.h
@@ -0,0 +1,15 @@
+TYPE char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
+TYPE char Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
+
+__attribute__ ((noinline)) int
+foo1(int len) {
+ int i;
+ TYPE int result = 0;
+ TYPE short prod;
+
+ for (i=0; i<len; i++) {
+ prod = X[i] * Y[i];
+ result += prod;
+ }
+ return result;
+}
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-s8.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-s8.c
new file mode 100644
index 00000000000..57b5ef82f85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-s8.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target { aarch64*-*-* } } } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" } */
+
+#define N 64
+#define TYPE signed
+
+#include "vect-dot-qi.h"
+
+/* { dg-final { scan-assembler-times {sdot\tv[0-9]+\.4s, v[0-9]+\.16b, v[0-9]+\.16b} 4 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-u8.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-u8.c
new file mode 100644
index 00000000000..b2cef318500
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vect-dot-u8.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target { aarch64*-*-* } } } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" } */
+
+#define N 64
+#define TYPE unsigned
+
+#include "vect-dot-qi.h"
+
+/* { dg-final { scan-assembler-times {udot\tv[0-9]+\.4s, v[0-9]+\.16b, v[0-9]+\.16b} 4 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/cmpelim_mult_uses_1.c b/gcc/testsuite/gcc.target/aarch64/cmpelim_mult_uses_1.c
new file mode 100644
index 00000000000..953c388037f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cmpelim_mult_uses_1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* X is both compared against zero and used. Make sure we can still
+ generate an ADDS and avoid an explicit comparison against zero. */
+
+int
+foo (int x, int y)
+{
+ x += y;
+ if (x != 0)
+ x = x + 2;
+ return x;
+}
+
+/* { dg-final { scan-assembler-times "adds\\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-not "cmp\\tw\[0-9\]+, 0" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/fix_trunc1.c b/gcc/testsuite/gcc.target/aarch64/fix_trunc1.c
new file mode 100644
index 00000000000..0441458f635
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fix_trunc1.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+float
+f1 (float x)
+{
+ int y = x;
+
+ return (float) y;
+}
+
+double
+f2 (double x)
+{
+ long y = x;
+
+ return (double) y;
+}
+
+/* { dg-final { scan-assembler "fcvtzs\\ts\[0-9\]+, s\[0-9\]+" } } */
+/* { dg-final { scan-assembler "scvtf\\ts\[0-9\]+, s\[0-9\]+" } } */
+/* { dg-final { scan-assembler "fcvtzs\\td\[0-9\]+, d\[0-9\]+" } } */
+/* { dg-final { scan-assembler "scvtf\\td\[0-9\]+, d\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/inline-lrint_2.c b/gcc/testsuite/gcc.target/aarch64/inline-lrint_2.c
index 6080e186d8f..bd0c73c8d34 100644
--- a/gcc/testsuite/gcc.target/aarch64/inline-lrint_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/inline-lrint_2.c
@@ -1,6 +1,6 @@
/* { dg-do compile } */
/* { dg-require-effective-target ilp32 } */
-/* { dg-options "-O3 -fno-math-errno" } */
+/* { dg-options "-O3 -fno-math-errno -fno-trapping-math" } */
#include "lrint-matherr.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_unaligned_2.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_unaligned_2.c
new file mode 100644
index 00000000000..1e46755a39a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_unaligned_2.c
@@ -0,0 +1,18 @@
+/* { dg-options "-O2 -fomit-frame-pointer" } */
+
+/* Check that we split unaligned LDP/STP into base and aligned offset. */
+
+typedef struct
+{
+ int a, b, c, d, e;
+} S;
+
+void foo (S *);
+
+void test (int x)
+{
+ S s = { .a = x };
+ foo (&s);
+}
+
+/* { dg-final { scan-assembler-not "mov\tx\[0-9\]+, sp" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/pr78733.c b/gcc/testsuite/gcc.target/aarch64/pr78733.c
index ce462cedf9f..3cdb3ba7373 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr78733.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr78733.c
@@ -7,4 +7,5 @@ t (void)
return (__int128)1 << 80;
}
-/* { dg-final { scan-assembler "adr" } } */
+/* { dg-final { scan-assembler "\tmov\tx0, 0" } } */
+/* { dg-final { scan-assembler "\tmov\tx1, 65536" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/pr79041-2.c b/gcc/testsuite/gcc.target/aarch64/pr79041-2.c
index a889dfdd895..62856f10438 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr79041-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr79041-2.c
@@ -8,5 +8,6 @@ t (void)
return (__int128)1 << 80;
}
-/* { dg-final { scan-assembler "adr" } } */
+/* { dg-final { scan-assembler "\tmov\tx0, 0" } } */
+/* { dg-final { scan-assembler "\tmov\tx1, 65536" } } */
/* { dg-final { scan-assembler-not "adrp" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/pr80295.c b/gcc/testsuite/gcc.target/aarch64/pr80295.c
new file mode 100644
index 00000000000..b3866d8d6a9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr80295.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=ilp32" } */
+
+void f (void *b)
+{
+ __builtin_update_setjmp_buf (b);
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/spellcheck_1.c b/gcc/testsuite/gcc.target/aarch64/spellcheck_1.c
index ccfe417e644..f57e0c54632 100644
--- a/gcc/testsuite/gcc.target/aarch64/spellcheck_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/spellcheck_1.c
@@ -4,6 +4,6 @@ __attribute__((target ("arch=armv8-a-typo"))) void
foo ()
{
/* { dg-message "valid arguments are: \[^\n\r]*; did you mean 'armv8-a'?" "" { target *-*-* } .-1 } */
- /* { dg-error "unknown value 'armv8-a-typo' for 'arch' target attribute" "" { target *-*-* } .-2 } */
- /* { dg-error "target attribute 'arch=armv8-a-typo' is invalid" "" { target *-*-* } .-3 } */
+ /* { dg-error "invalid name \\(\"armv8-a-typo\"\\) in 'target\\(\"arch=\"\\)' pragma or attribute" "" { target *-*-* } .-2 } */
+ /* { dg-error "pragma or attribute 'target\\(\"arch=armv8-a-typo\"\\)' is not valid" "" { target *-*-* } .-3 } */
}
diff --git a/gcc/testsuite/gcc.target/aarch64/spellcheck_2.c b/gcc/testsuite/gcc.target/aarch64/spellcheck_2.c
index 42ba51a7226..70096f89e0b 100644
--- a/gcc/testsuite/gcc.target/aarch64/spellcheck_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/spellcheck_2.c
@@ -3,7 +3,7 @@
__attribute__((target ("cpu=cortex-a57-typo"))) void
foo ()
{
- /* { dg-message "valid arguments are: \[^\n\r]*; did you mean 'cortex-a57?" "" { target *-*-* } .-1 } */
- /* { dg-error "unknown value 'cortex-a57-typo' for 'cpu' target attribute" "" { target *-*-* } .-2 } */
- /* { dg-error "target attribute 'cpu=cortex-a57-typo' is invalid" "" { target *-*-* } .-3 } */
+ /* { dg-message "valid arguments are: \[^\n\r]*; did you mean 'cortex-a57'?" "" { target *-*-* } .-1 } */
+ /* { dg-error "invalid name \\(\"cortex-a57-typo\"\\) in 'target\\(\"cpu=\"\\)' pragma or attribute" "" { target *-*-* } .-2 } */
+ /* { dg-error "pragma or attribute 'target\\(\"cpu=cortex-a57-typo\"\\)' is not valid" "" { target *-*-* } .-3 } */
}
diff --git a/gcc/testsuite/gcc.target/aarch64/spellcheck_3.c b/gcc/testsuite/gcc.target/aarch64/spellcheck_3.c
index 03d2bbf14a0..20dff2b6e45 100644
--- a/gcc/testsuite/gcc.target/aarch64/spellcheck_3.c
+++ b/gcc/testsuite/gcc.target/aarch64/spellcheck_3.c
@@ -3,7 +3,7 @@
__attribute__((target ("tune=cortex-a57-typo"))) void
foo ()
{
- /* { dg-message "valid arguments are: \[^\n\r]*; did you mean 'cortex-a57?" "" { target *-*-* } .-1 } */
- /* { dg-error "unknown value 'cortex-a57-typo' for 'tune' target attribute" "" { target *-*-* } .-2 } */
- /* { dg-error "target attribute 'tune=cortex-a57-typo' is invalid" "" { target *-*-* } .-3 } */
+ /* { dg-message "valid arguments are: \[^\n\r]*; did you mean 'cortex-a57'?" "" { target *-*-* } .-1 } */
+ /* { dg-error "invalid name \\(\"cortex-a57-typo\"\\) in 'target\\(\"tune=\"\\)' pragma or attribute" "" { target *-*-* } .-2 } */
+ /* { dg-error "pragma or attribute 'target\\(\"tune=cortex-a57-typo\"\\)' is not valid" "" { target *-*-* } .-3 } */
}
diff --git a/gcc/testsuite/gcc.target/aarch64/stack-check-12.c b/gcc/testsuite/gcc.target/aarch64/stack-check-12.c
new file mode 100644
index 00000000000..2ce38483b6b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/stack-check-12.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=12" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+extern void arf (unsigned long int *, unsigned long int *);
+void
+frob ()
+{
+ unsigned long int num[1000];
+ unsigned long int den[1000];
+ arf (den, num);
+}
+
+/* This verifies that the scheduler did not break the dependencies
+ by adjusting the offsets within the probe and that the scheduler
+ did not reorder around the stack probes. */
+/* { dg-final { scan-assembler-times "sub\\tsp, sp, #4096\\n\\tstr\\txzr, .sp, 4088." 3 } } */
+
+
+
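For readers following the probe count: with --param stack-clash-protection-guard-size=12 the guard is 2^12 = 4096 bytes, and frob's two 1000-element unsigned long arrays need roughly 16000 bytes of stack, i.e. three full guard-sized steps plus a residual, which is why the sub/str pair is scanned for exactly three times. A small sketch of that arithmetic, not part of the patch, assuming an 8-byte unsigned long:

/* Frame-size arithmetic behind the scan count above (a sketch).  */
enum
{
  FRAME_BYTES = 2 * 1000 * sizeof (unsigned long),  /* 16000 on LP64.  */
  GUARD_BYTES = 1 << 12,                            /* 4096, from the dg-options.  */
  FULL_STEPS = FRAME_BYTES / GUARD_BYTES            /* 3 probed 4096-byte steps.  */
};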
diff --git a/gcc/testsuite/gcc.target/aarch64/stack-check-13.c b/gcc/testsuite/gcc.target/aarch64/stack-check-13.c
new file mode 100644
index 00000000000..d8886835989
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/stack-check-13.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=12" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+#define ARG32(X) X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X
+#define ARG192(X) ARG32(X),ARG32(X),ARG32(X),ARG32(X),ARG32(X),ARG32(X)
+void out1(ARG192(__int128));
+int t1(int);
+
+int t3(int x)
+{
+ if (x < 1000)
+ return t1 (x) + 1;
+
+ out1 (ARG192(1));
+ return 0;
+}
+
+
+
+/* This test creates a large (> 1k) outgoing argument area that needs
+ to be probed. We don't test the exact size of the space or the
+ exact offset to make the test a little less sensitive to trivial
+ output changes. */
+/* { dg-final { scan-assembler-times "sub\\tsp, sp, #....\\n\\tstr\\txzr, \\\[sp" 1 } } */
+
+
+
diff --git a/gcc/testsuite/gcc.target/aarch64/stack-check-14.c b/gcc/testsuite/gcc.target/aarch64/stack-check-14.c
new file mode 100644
index 00000000000..59ffe01376d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/stack-check-14.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=12" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+int t1(int);
+
+int t2(int x)
+{
+ char *p = __builtin_alloca (4050);
+ x = t1 (x);
+ return p[x];
+}
+
+
+/* This test has a constant-sized alloca that is smaller than the
+   probe interval.  But it actually requires two probes instead
+   of one because of the optimistic assumptions we made in the
+   aarch64 prologue code with respect to probing state.

+
+ The form can change quite a bit so we just check for two
+ probes without looking at the actual address. */
+/* { dg-final { scan-assembler-times "str\\txzr," 2 } } */
+
+
+
diff --git a/gcc/testsuite/gcc.target/aarch64/stack-check-15.c b/gcc/testsuite/gcc.target/aarch64/stack-check-15.c
new file mode 100644
index 00000000000..e06db6dc2f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/stack-check-15.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=12" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+int t1(int);
+
+int t2(int x)
+{
+ char *p = __builtin_alloca (x);
+ x = t1 (x);
+ return p[x];
+}
+
+
+/* This test has a variable-sized alloca.  It requires three probes:
+   one in the loop, one for the residual and one at the end of the
+   alloca area.
+
+   The form can change quite a bit so we just check for three
+   probes without looking at the actual address. */
+/* { dg-final { scan-assembler-times "str\\txzr," 3 } } */
+
+
+
diff --git a/gcc/testsuite/gcc.target/aarch64/subs_compare_1.c b/gcc/testsuite/gcc.target/aarch64/subs_compare_1.c
index 95c8f696fee..2691250f79e 100644
--- a/gcc/testsuite/gcc.target/aarch64/subs_compare_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/subs_compare_1.c
@@ -11,5 +11,5 @@ foo (int a, int b)
return 0;
}
-/* { dg-final { scan-assembler-times "subs\\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */
-/* { dg-final { scan-assembler-not "cmp\\tw\[0-9\]+, w\[0-9\]+" } } */
+/* { dg-final { scan-assembler-times "subs\\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not "cmp\\tw\[0-9\]+, w\[0-9\]+" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/subs_compare_2.c b/gcc/testsuite/gcc.target/aarch64/subs_compare_2.c
index 60c6d9e5ccd..d343acc1195 100644
--- a/gcc/testsuite/gcc.target/aarch64/subs_compare_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/subs_compare_2.c
@@ -11,5 +11,5 @@ foo (int a, int b)
return 0;
}
-/* { dg-final { scan-assembler-times "subs\\tw\[0-9\]+, w\[0-9\]+, #4" 1 } } */
+/* { dg-final { scan-assembler-times "subs\\tw\[0-9\]+, w\[0-9\]+, #4" 1 { xfail *-*-* } } } */
/* { dg-final { scan-assembler-not "cmp\\tw\[0-9\]+, w\[0-9\]+" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_arith_1.c b/gcc/testsuite/gcc.target/aarch64/sve_arith_1.c
index b3c4cb9d8a7..1a61d6a7f40 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_arith_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_arith_1.c
@@ -1,40 +1,41 @@
/* { dg-do assemble } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve --save-temps" } */
+/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
+
+#include <stdint.h>
#define DO_REGREG_OPS(TYPE, OP, NAME) \
-void varith_##TYPE##_##NAME (TYPE* dst, TYPE* src, int count) \
+void varith_##TYPE##_##NAME (TYPE *dst, TYPE *src, int count) \
{ \
for (int i = 0; i < count; ++i) \
dst[i] = dst[i] OP src[i]; \
}
-
#define DO_IMMEDIATE_OPS(VALUE, TYPE, OP, NAME) \
-void varithimm_##NAME##_##TYPE (TYPE* dst, int count) \
+void varithimm_##NAME##_##TYPE (TYPE *dst, int count) \
{ \
for (int i = 0; i < count; ++i) \
dst[i] = dst[i] OP VALUE; \
}
#define DO_ARITH_OPS(TYPE, OP, NAME) \
-DO_REGREG_OPS (TYPE, OP, NAME); \
-DO_IMMEDIATE_OPS (0, TYPE, OP, NAME ## 0); \
-DO_IMMEDIATE_OPS (5, TYPE, OP, NAME ## 5); \
-DO_IMMEDIATE_OPS (255, TYPE, OP, NAME ## 255); \
-DO_IMMEDIATE_OPS (256, TYPE, OP, NAME ## 256); \
-DO_IMMEDIATE_OPS (257, TYPE, OP, NAME ## 257); \
-DO_IMMEDIATE_OPS (65280, TYPE, OP, NAME ## 65280); \
-DO_IMMEDIATE_OPS (65281, TYPE, OP, NAME ## 65281); \
-DO_IMMEDIATE_OPS (-1, TYPE, OP, NAME ## minus1);
+ DO_REGREG_OPS (TYPE, OP, NAME); \
+ DO_IMMEDIATE_OPS (0, TYPE, OP, NAME ## 0); \
+ DO_IMMEDIATE_OPS (5, TYPE, OP, NAME ## 5); \
+ DO_IMMEDIATE_OPS (255, TYPE, OP, NAME ## 255); \
+ DO_IMMEDIATE_OPS (256, TYPE, OP, NAME ## 256); \
+ DO_IMMEDIATE_OPS (257, TYPE, OP, NAME ## 257); \
+ DO_IMMEDIATE_OPS (65280, TYPE, OP, NAME ## 65280); \
+ DO_IMMEDIATE_OPS (65281, TYPE, OP, NAME ## 65281); \
+ DO_IMMEDIATE_OPS (-1, TYPE, OP, NAME ## minus1);
-DO_ARITH_OPS (char, +, add)
-DO_ARITH_OPS (short, +, add)
-DO_ARITH_OPS (int, +, add)
-DO_ARITH_OPS (long, +, add)
-DO_ARITH_OPS (char, -, minus)
-DO_ARITH_OPS (short, -, minus)
-DO_ARITH_OPS (int, -, minus)
-DO_ARITH_OPS (long, -, minus)
+DO_ARITH_OPS (int8_t, +, add)
+DO_ARITH_OPS (int16_t, +, add)
+DO_ARITH_OPS (int32_t, +, add)
+DO_ARITH_OPS (int64_t, +, add)
+DO_ARITH_OPS (int8_t, -, minus)
+DO_ARITH_OPS (int16_t, -, minus)
+DO_ARITH_OPS (int32_t, -, minus)
+DO_ARITH_OPS (int64_t, -, minus)
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
@@ -47,20 +48,21 @@ DO_ARITH_OPS (long, -, minus)
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #1\n} 4 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #5\n} 1 } } */
-/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #255\n} } } */
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #251\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #255\n} 4 } } */
/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #256\n} } } */
/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #257\n} } } */
/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #65280\n} } } */
/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #65281\n} } } */
/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #-1\n} } } */
-/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #1\n} 4 } } */
+/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #1\n} } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #1\n} 1 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #255\n} 2 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #256\n} 2 } } */
/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #257\n} } } */
-/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #65280\n} } } */
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #65280\n} 2 } } */
/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #65281\n} } } */
/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #-1\n} } } */
/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #1\n} 1 } } */
@@ -83,8 +85,8 @@ DO_ARITH_OPS (long, -, minus)
/* { dg-final { scan-assembler-not {\tadd\tz[0-9]+\.d, z[0-9]+\.d, #-1\n} } } */
/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, z[0-9]+\.d, #1\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #1\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #5\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #1\n} } } */
+/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #5\n} } } */
/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #255\n} } } */
/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #256\n} } } */
/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #257\n} } } */
@@ -94,12 +96,11 @@ DO_ARITH_OPS (long, -, minus)
/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #255\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #256\n} 2 } } */
+/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #256\n} } } */
/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #257\n} } } */
/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #65280\n} } } */
/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #65281\n} } } */
/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #-1\n} } } */
-/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #1\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, z[0-9]+\.s, #5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, z[0-9]+\.s, #255\n} 1 } } */
@@ -118,4 +119,3 @@ DO_ARITH_OPS (long, -, minus)
/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.d, z[0-9]+\.d, #65281\n} } } */
/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.d, z[0-9]+\.d, #-1\n} } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.d, z[0-9]+\.d, #1\n} 1 } } */
-
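The reshuffled .b immediate counts follow from modular arithmetic on 8-bit lanes: subtracting an immediate is the same as adding its negation modulo 256, so x - 5 becomes add #251 and x - 1 becomes add #255, which is why the byte-sized sub patterns turn into scan-assembler-not and the add #251/#255 counts appear instead. A small illustration, not part of the patch, relying on GCC's modulo behaviour when narrowing to int8_t:

#include <stdint.h>

/* For 8-bit lanes, subtracting a constant equals adding its
   two's-complement negation modulo 256 (GCC-defined narrowing).  */
int
sub_equals_add_mod_256 (int8_t x)
{
  return (int8_t) (x - 5) == (int8_t) (x + 251)
         && (int8_t) (x - 1) == (int8_t) (x + 255);  /* Always 1.  */
}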
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_cvtf_signed_1.c b/gcc/testsuite/gcc.target/aarch64/sve_cvtf_signed_1.c
index d97a501512b..86d3930e476 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_cvtf_signed_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_cvtf_signed_1.c
@@ -1,17 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
-void vcvtf_32 (float *dst, signed int *src1, int size)
+#include <stdint.h>
+
+void __attribute__ ((noinline, noclone))
+vcvtf_16 (_Float16 *dst, int16_t *src1, int size)
+{
+ for (int i = 0; i < size; i++)
+ dst[i] = (_Float16) src1[i];
+}
+
+void __attribute__ ((noinline, noclone))
+vcvtf_32 (float *dst, int32_t *src1, int size)
{
for (int i = 0; i < size; i++)
dst[i] = (float) src1[i];
}
-void vcvtf_64 (double *dst, signed long *src1, int size)
+void __attribute__ ((noinline, noclone))
+vcvtf_64 (double *dst, int64_t *src1, int size)
{
for (int i = 0; i < size; i++)
dst[i] = (double) src1[i];
}
+/* { dg-final { scan-assembler-times {\tscvtf\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tscvtf\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tscvtf\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_cvtf_signed_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_cvtf_signed_1_run.c
index b0aa05c055e..9b431ad0ed4 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_cvtf_signed_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_cvtf_signed_1_run.c
@@ -1,47 +1,47 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include "sve_cvtf_signed_1.c"
#define ARRAY_SIZE 47
-#define VAL1 ((i * 3) - (15 * 3))
-#define VAL2 ((i * 0xffdfffef) - (11 * 0xffdfffef))
+#define VAL1 (i ^ 3)
+#define VAL2 ((i * 3) - (15 * 3))
+#define VAL3 ((i * 0xffdfffef) - (11 * 0xffdfffef))
int __attribute__ ((optimize (1)))
main (void)
{
- static float array_destf[ARRAY_SIZE];
- static double array_destd[ARRAY_SIZE];
+ static _Float16 array_dest16[ARRAY_SIZE];
+ static float array_dest32[ARRAY_SIZE];
+ static double array_dest64[ARRAY_SIZE];
- signed int array_source_i[ARRAY_SIZE];
- signed long array_source_l[ARRAY_SIZE];
+ int16_t array_source16[ARRAY_SIZE];
+ int32_t array_source32[ARRAY_SIZE];
+ int64_t array_source64[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++)
{
- array_source_i[i] = VAL1;
- array_source_l[i] = VAL2;
+ array_source16[i] = VAL1;
+ array_source32[i] = VAL2;
+ array_source64[i] = VAL3;
+ asm volatile ("" ::: "memory");
}
- vcvtf_32 (array_destf, array_source_i, ARRAY_SIZE);
+ vcvtf_16 (array_dest16, array_source16, ARRAY_SIZE);
+ for (int i = 0; i < ARRAY_SIZE; i++)
+ if (array_dest16[i] != (_Float16) VAL1)
+ __builtin_abort ();
+
+ vcvtf_32 (array_dest32, array_source32, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_destf[i] != (float) VAL1)
- {
- fprintf (stderr,"%d: %f != %f\n", i, array_destf[i], (float) VAL1);
- exit (1);
- }
+ if (array_dest32[i] != (float) VAL2)
+ __builtin_abort ();
- vcvtf_64 (array_destd, array_source_l, ARRAY_SIZE);
+ vcvtf_64 (array_dest64, array_source64, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_destd[i] != (double) VAL2)
- {
- fprintf (stderr,"%d: %lf != %f\n", i, array_destd[i], (double) VAL2);
- exit (1);
- }
+ if (array_dest64[i] != (double) VAL3)
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_cvtf_unsigned_1.c b/gcc/testsuite/gcc.target/aarch64/sve_cvtf_unsigned_1.c
index bd8cf6f6cf5..0605307d1e3 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_cvtf_unsigned_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_cvtf_unsigned_1.c
@@ -1,17 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
-void vcvtf_32 (float *dst, unsigned int *src1, int size)
+#include <stdint.h>
+
+void __attribute__ ((noinline, noclone))
+vcvtf_16 (_Float16 *dst, uint16_t *src1, int size)
+{
+ for (int i = 0; i < size; i++)
+ dst[i] = (_Float16) src1[i];
+}
+
+void __attribute__ ((noinline, noclone))
+vcvtf_32 (float *dst, uint32_t *src1, int size)
{
for (int i = 0; i < size; i++)
dst[i] = (float) src1[i];
}
-void vcvtf_64 (double *dst, unsigned long *src1, int size)
+void __attribute__ ((noinline, noclone))
+vcvtf_64 (double *dst, uint64_t *src1, int size)
{
for (int i = 0; i < size; i++)
dst[i] = (double) src1[i];
}
+/* { dg-final { scan-assembler-times {\tucvtf\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tucvtf\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tucvtf\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_cvtf_unsigned_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_cvtf_unsigned_1_run.c
index 5b9291ca2c2..a4434cbf478 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_cvtf_unsigned_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_cvtf_unsigned_1_run.c
@@ -1,47 +1,47 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include "sve_cvtf_unsigned_1.c"
#define ARRAY_SIZE 65
-#define VAL1 (i * 9456)
-#define VAL2 (i * 0xfddff13f)
+#define VAL1 (i * 109)
+#define VAL2 (i * 9456)
+#define VAL3 (i * 0xfddff13f)
int __attribute__ ((optimize (1)))
main (void)
{
- static float array_destf[ARRAY_SIZE];
- static double array_destd[ARRAY_SIZE];
+ static _Float16 array_dest16[ARRAY_SIZE];
+ static float array_dest32[ARRAY_SIZE];
+ static double array_dest64[ARRAY_SIZE];
- unsigned int array_source_i[ARRAY_SIZE];
- unsigned long array_source_l[ARRAY_SIZE];
+ uint16_t array_source16[ARRAY_SIZE];
+ uint32_t array_source32[ARRAY_SIZE];
+ uint64_t array_source64[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++)
{
- array_source_i[i] = VAL1;
- array_source_l[i] = VAL2;
+ array_source16[i] = VAL1;
+ array_source32[i] = VAL2;
+ array_source64[i] = VAL3;
+ asm volatile ("" ::: "memory");
}
- vcvtf_32 (array_destf, array_source_i, ARRAY_SIZE);
+ vcvtf_16 (array_dest16, array_source16, ARRAY_SIZE);
+ for (int i = 0; i < ARRAY_SIZE; i++)
+ if (array_dest16[i] != (_Float16) VAL1)
+ __builtin_abort ();
+
+ vcvtf_32 (array_dest32, array_source32, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_destf[i] != (float) VAL1)
- {
- fprintf (stderr,"%d: %f != %f\n", i, array_destf[i], (float) VAL1);
- exit (1);
- }
+ if (array_dest32[i] != (float) VAL2)
+ __builtin_abort ();
- vcvtf_64 (array_destd, array_source_l, ARRAY_SIZE);
+ vcvtf_64 (array_dest64, array_source64, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_destd[i] != (double) VAL2)
- {
- fprintf (stderr,"%d: %lf != %f\n", i, array_destd[i], (double) VAL2);
- exit (1);
- }
+ if (array_dest64[i] != (double) VAL3)
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1.C b/gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1.c
index 1f7d8a4a9ba..9fed379607b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1.c
@@ -1,12 +1,14 @@
/* { dg-do compile } */
-/* { dg-options "-std=c++11 -O3 -fno-inline -march=armv8-a+sve -fno-tree-loop-distribute-patterns" } */
+/* -fno-tree-loop-distribute-patterns prevents conversion to memset. */
+/* { dg-options "-O3 -march=armv8-a+sve -fno-tree-loop-distribute-patterns" } */
#include <stdint.h>
#define NUM_ELEMS(TYPE) (1024 / sizeof (TYPE))
-#define DEF_SET_IMM(TYPE,IMM,SUFFIX) \
-void set_##TYPE##SUFFIX (TYPE *__restrict__ a) \
+#define DEF_SET_IMM(TYPE, IMM, SUFFIX) \
+void __attribute__ ((noinline, noclone)) \
+set_##TYPE##_##SUFFIX (TYPE *a) \
{ \
for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
a[i] = IMM; \
@@ -93,7 +95,7 @@ DEF_SET_IMM (int64_t, 0xFE00FE00FE00FE00LL, imm_FE00_pat)
// shouldn't assert!
DEF_SET_IMM (int32_t, 129, imm_m129)
DEF_SET_IMM (int32_t, 32513, imm_32513)
-DEF_SET_IMM (int32_t, -32767, imm_m32767)
+DEF_SET_IMM (int32_t, -32763, imm_m32763)
/* { dg-final { scan-assembler {\tmov\tz[0-9]+\.b, #-1\n} } } */
@@ -130,3 +132,7 @@ DEF_SET_IMM (int32_t, -32767, imm_m32767)
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, #-2\n} 2 } } */
/* { dg-final { scan-assembler {\tmov\tz[0-9]+\.h, #-512\n} } } */
+
+/* { dg-final { scan-assembler-not {#129\n} } } */
+/* { dg-final { scan-assembler-not {#32513\n} } } */
+/* { dg-final { scan-assembler-not {#-32763\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1_run.C b/gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1_run.c
index cbc16e8e2bb..237f44947ab 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1_run.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_dup_imm_1_run.c
@@ -1,23 +1,20 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-std=c++11 -O3 -fno-inline -march=armv8-a+sve -fno-tree-loop-distribute-patterns" } */
+/* { dg-options "-O3 -march=armv8-a+sve -fno-tree-loop-distribute-patterns" } */
-#include "sve_dup_imm_1.C"
+#include "sve_dup_imm_1.c"
-#include <stdlib.h>
-
-#define TEST_SET_IMM(TYPE,IMM,SUFFIX) \
+#define TEST_SET_IMM(TYPE, IMM, SUFFIX) \
{ \
TYPE v[NUM_ELEMS (TYPE)]; \
- set_##TYPE##SUFFIX (v); \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++ ) \
- if (v[i] != IMM) \
- result++; \
+ set_##TYPE##_##SUFFIX (v); \
+ for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
+ if (v[i] != (TYPE) IMM) \
+ __builtin_abort (); \
}
-int main (int argc, char **argv)
+int __attribute__ ((optimize (1)))
+main (int argc, char **argv)
{
- int result = 0;
-
TEST_SET_IMM (int8_t, 0, imm_0)
TEST_SET_IMM (int16_t, 0, imm_0)
TEST_SET_IMM (int32_t, 0, imm_0)
@@ -62,15 +59,12 @@ int main (int argc, char **argv)
TEST_SET_IMM (int32_t, 0x00010001, imm_0001_pat)
TEST_SET_IMM (int64_t, 0x0001000100010001LL, imm_0001_pat)
- TEST_SET_IMM (int16_t, int16_t (0xFEFE), imm_FE_pat)
- TEST_SET_IMM (int32_t, int32_t (0xFEFEFEFE), imm_FE_pat)
- TEST_SET_IMM (int64_t, int64_t (0xFEFEFEFEFEFEFEFE), imm_FE_pat)
-
- TEST_SET_IMM (int32_t, int32_t (0xFE00FE00), imm_FE00_pat)
- TEST_SET_IMM (int64_t, int64_t (0xFE00FE00FE00FE00), imm_FE00_pat)
+ TEST_SET_IMM (int16_t, 0xFEFE, imm_FE_pat)
+ TEST_SET_IMM (int32_t, 0xFEFEFEFE, imm_FE_pat)
+ TEST_SET_IMM (int64_t, 0xFEFEFEFEFEFEFEFE, imm_FE_pat)
- if (result != 0)
- abort ();
+ TEST_SET_IMM (int32_t, 0xFE00FE00, imm_FE00_pat)
+ TEST_SET_IMM (int64_t, 0xFE00FE00FE00FE00, imm_FE00_pat)
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_dup_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve_dup_lane_1.c
index d4de247b05e..ea977207226 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_dup_lane_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_dup_lane_1.c
@@ -1,12 +1,15 @@
/* { dg-do compile } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" } */
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define MASK_2(X) X, X
#define MASK_4(X) MASK_2 (X), MASK_2 (X)
@@ -44,7 +47,10 @@ typedef float v8sf __attribute__((vector_size (32)));
T (v4df, 4, 3) \
T (v8sf, 8, 0) \
T (v8sf, 8, 5) \
- T (v8sf, 8, 7)
+ T (v8sf, 8, 7) \
+ T (v16hf, 16, 0) \
+ T (v16hf, 16, 6) \
+ T (v16hf, 16, 15) \
TEST_ALL (DUP_LANE)
@@ -56,9 +62,9 @@ TEST_ALL (DUP_LANE)
/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.s, z[0-9]+\.s\[0\]} 2 } } */
/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.s, z[0-9]+\.s\[5\]} 2 } } */
/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.s, z[0-9]+\.s\[7\]} 2 } } */
-/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[0\]} 1 } } */
-/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[6\]} 1 } } */
-/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[15\]} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[6\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[15\]} 2 } } */
/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.b, z[0-9]+\.b\[0\]} 1 } } */
/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.b, z[0-9]+\.b\[19\]} 1 } } */
/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.b, z[0-9]+\.b\[31\]} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_ext_1.c b/gcc/testsuite/gcc.target/aarch64/sve_ext_1.c
index 3056c60eee7..1ec51aa2eaf 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_ext_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_ext_1.c
@@ -1,12 +1,15 @@
/* { dg-do compile } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" } */
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define MASK_2(X) X, X + 1
#define MASK_4(X) MASK_2 (X), MASK_2 (X + 2)
@@ -44,21 +47,24 @@ typedef float v8sf __attribute__((vector_size (32)));
T (v4df, 4, 3) \
T (v8sf, 8, 1) \
T (v8sf, 8, 5) \
- T (v8sf, 8, 7)
+ T (v8sf, 8, 7) \
+ T (v16hf, 16, 1) \
+ T (v16hf, 16, 6) \
+ T (v16hf, 16, 15) \
TEST_ALL (DUP_LANE)
/* { dg-final { scan-assembler-not {\ttbl\t} } } */
/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #1\n} 1 } } */
-/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #2\n} 1 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #2\n} 2 } } */
/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #4\n} 2 } } */
/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #8\n} 2 } } */
-/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #12\n} 1 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #12\n} 2 } } */
/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #16\n} 2 } } */
/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #19\n} 1 } } */
/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #20\n} 2 } } */
/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #24\n} 2 } } */
/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #28\n} 2 } } */
-/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #30\n} 1 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #30\n} 2 } } */
/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #31\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_extract_1.c b/gcc/testsuite/gcc.target/aarch64/sve_extract_1.c
new file mode 100644
index 00000000000..1ba277ffa6d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_extract_1.c
@@ -0,0 +1,93 @@
+/* { dg-do assemble } */
+/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
+typedef double v4df __attribute__((vector_size (32)));
+typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
+
+#define EXTRACT(ELT_TYPE, TYPE, INDEX) \
+ ELT_TYPE permute_##TYPE##_##INDEX (void) \
+ { \
+ TYPE values; \
+ asm ("" : "=w" (values)); \
+ return values[INDEX]; \
+ }
+
+#define TEST_ALL(T) \
+ T (int64_t, v4di, 0) \
+ T (int64_t, v4di, 1) \
+ T (int64_t, v4di, 2) \
+ T (int64_t, v4di, 3) \
+ T (int32_t, v8si, 0) \
+ T (int32_t, v8si, 1) \
+ T (int32_t, v8si, 3) \
+ T (int32_t, v8si, 4) \
+ T (int32_t, v8si, 7) \
+ T (int16_t, v16hi, 0) \
+ T (int16_t, v16hi, 1) \
+ T (int16_t, v16hi, 7) \
+ T (int16_t, v16hi, 8) \
+ T (int16_t, v16hi, 15) \
+ T (int8_t, v32qi, 0) \
+ T (int8_t, v32qi, 1) \
+ T (int8_t, v32qi, 15) \
+ T (int8_t, v32qi, 16) \
+ T (int8_t, v32qi, 31) \
+ T (double, v4df, 0) \
+ T (double, v4df, 1) \
+ T (double, v4df, 2) \
+ T (double, v4df, 3) \
+ T (float, v8sf, 0) \
+ T (float, v8sf, 1) \
+ T (float, v8sf, 3) \
+ T (float, v8sf, 4) \
+ T (float, v8sf, 7) \
+ T (_Float16, v16hf, 0) \
+ T (_Float16, v16hf, 1) \
+ T (_Float16, v16hf, 7) \
+ T (_Float16, v16hf, 8) \
+ T (_Float16, v16hf, 15)
+
+TEST_ALL (EXTRACT)
+
+/* { dg-final { scan-assembler-times {\tumov\tx[0-9]+, v[0-9]+\.d\[0\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tx[0-9]+, v[0-9]+\.d\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\td[0-9]+, v[0-9]+\.d\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\td[0-9]+, v[0-9]+\.d\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.d, z[0-9]+\.d\[2\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tx[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[0\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[3\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\ts[0-9]+, v[0-9]+\.s\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\ts[0-9]+, v[0-9]+\.s\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\ts[0-9]+, v[0-9]+\.s\[3\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.s, z[0-9]+\.s\[4\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
+
+/* Also used to move the result of a non-Advanced SIMD extract. */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[0\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[7\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\th[0-9]+, v[0-9]+\.h\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\th[0-9]+, v[0-9]+\.h\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\th[0-9]+, v[0-9]+\.h\[7\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[8\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
+
+/* Also used to move the result of a non-Advanced SIMD extract. */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[0\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[15\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.b, z[0-9]+\.b\[16\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.b\n} 1 } } */
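The EXTRACT macro above uses an empty asm with a "=w" output so that each function starts with an unspecified value already in a vector register and only the lane extraction remains; the scan patterns then distinguish lanes reachable with Advanced SIMD umov/dup from those that need an SVE dup or lastb. Roughly what one instantiation expands to, shown as a reading aid rather than patch content, assuming the same -march=armv8-a+sve -msve-vector-bits=256 options:

#include <stdint.h>

typedef int64_t v4di __attribute__((vector_size (32)));

/* Approximate expansion of EXTRACT (int64_t, v4di, 2): lane 2 lies
   beyond the low 128 bits, so it cannot be reached with a plain
   Advanced SIMD umov and is expected to use an SVE dup instead.  */
int64_t
permute_v4di_2 (void)
{
  v4di values;
  asm ("" : "=w" (values));
  return values[2];
}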
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_extract_2.c b/gcc/testsuite/gcc.target/aarch64/sve_extract_2.c
new file mode 100644
index 00000000000..b163f28ef28
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_extract_2.c
@@ -0,0 +1,93 @@
+/* { dg-do assemble } */
+/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=512 --save-temps" } */
+
+#include <stdint.h>
+
+typedef int64_t v8di __attribute__((vector_size (64)));
+typedef int32_t v16si __attribute__((vector_size (64)));
+typedef int16_t v32hi __attribute__((vector_size (64)));
+typedef int8_t v64qi __attribute__((vector_size (64)));
+typedef double v8df __attribute__((vector_size (64)));
+typedef float v16sf __attribute__((vector_size (64)));
+typedef _Float16 v32hf __attribute__((vector_size (64)));
+
+#define EXTRACT(ELT_TYPE, TYPE, INDEX) \
+ ELT_TYPE permute_##TYPE##_##INDEX (void) \
+ { \
+ TYPE values; \
+ asm ("" : "=w" (values)); \
+ return values[INDEX]; \
+ }
+
+#define TEST_ALL(T) \
+ T (int64_t, v8di, 0) \
+ T (int64_t, v8di, 1) \
+ T (int64_t, v8di, 2) \
+ T (int64_t, v8di, 7) \
+ T (int32_t, v16si, 0) \
+ T (int32_t, v16si, 1) \
+ T (int32_t, v16si, 3) \
+ T (int32_t, v16si, 4) \
+ T (int32_t, v16si, 15) \
+ T (int16_t, v32hi, 0) \
+ T (int16_t, v32hi, 1) \
+ T (int16_t, v32hi, 7) \
+ T (int16_t, v32hi, 8) \
+ T (int16_t, v32hi, 31) \
+ T (int8_t, v64qi, 0) \
+ T (int8_t, v64qi, 1) \
+ T (int8_t, v64qi, 15) \
+ T (int8_t, v64qi, 16) \
+ T (int8_t, v64qi, 63) \
+ T (double, v8df, 0) \
+ T (double, v8df, 1) \
+ T (double, v8df, 2) \
+ T (double, v8df, 7) \
+ T (float, v16sf, 0) \
+ T (float, v16sf, 1) \
+ T (float, v16sf, 3) \
+ T (float, v16sf, 4) \
+ T (float, v16sf, 15) \
+ T (_Float16, v32hf, 0) \
+ T (_Float16, v32hf, 1) \
+ T (_Float16, v32hf, 7) \
+ T (_Float16, v32hf, 8) \
+ T (_Float16, v32hf, 31)
+
+TEST_ALL (EXTRACT)
+
+/* { dg-final { scan-assembler-times {\tumov\tx[0-9]+, v[0-9]+\.d\[0\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tx[0-9]+, v[0-9]+\.d\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\td[0-9]+, v[0-9]+\.d\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\td[0-9]+, v[0-9]+\.d\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.d, z[0-9]+\.d\[2\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tx[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[0\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[3\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\ts[0-9]+, v[0-9]+\.s\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\ts[0-9]+, v[0-9]+\.s\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\ts[0-9]+, v[0-9]+\.s\[3\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.s, z[0-9]+\.s\[4\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
+
+/* Also used to move the result of a non-Advanced SIMD extract. */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[0\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[7\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\th[0-9]+, v[0-9]+\.h\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\th[0-9]+, v[0-9]+\.h\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\th[0-9]+, v[0-9]+\.h\[7\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[8\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
+
+/* Also used to move the result of a non-Advanced SIMD extract. */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[0\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[15\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.b, z[0-9]+\.b\[16\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.b\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_extract_3.c b/gcc/testsuite/gcc.target/aarch64/sve_extract_3.c
new file mode 100644
index 00000000000..87ac2351768
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_extract_3.c
@@ -0,0 +1,124 @@
+/* { dg-do assemble } */
+/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=1024 --save-temps" } */
+
+#include <stdint.h>
+
+typedef int64_t v16di __attribute__((vector_size (128)));
+typedef int32_t v32si __attribute__((vector_size (128)));
+typedef int16_t v64hi __attribute__((vector_size (128)));
+typedef int8_t v128qi __attribute__((vector_size (128)));
+typedef double v16df __attribute__((vector_size (128)));
+typedef float v32sf __attribute__((vector_size (128)));
+typedef _Float16 v64hf __attribute__((vector_size (128)));
+
+#define EXTRACT(ELT_TYPE, TYPE, INDEX) \
+ ELT_TYPE permute_##TYPE##_##INDEX (void) \
+ { \
+ TYPE values; \
+ asm ("" : "=w" (values)); \
+ return values[INDEX]; \
+ }
+
+#define TEST_ALL(T) \
+ T (int64_t, v16di, 0) \
+ T (int64_t, v16di, 1) \
+ T (int64_t, v16di, 2) \
+ T (int64_t, v16di, 7) \
+ T (int64_t, v16di, 8) \
+ T (int64_t, v16di, 9) \
+ T (int64_t, v16di, 15) \
+ T (int32_t, v32si, 0) \
+ T (int32_t, v32si, 1) \
+ T (int32_t, v32si, 3) \
+ T (int32_t, v32si, 4) \
+ T (int32_t, v32si, 15) \
+ T (int32_t, v32si, 16) \
+ T (int32_t, v32si, 21) \
+ T (int32_t, v32si, 31) \
+ T (int16_t, v64hi, 0) \
+ T (int16_t, v64hi, 1) \
+ T (int16_t, v64hi, 7) \
+ T (int16_t, v64hi, 8) \
+ T (int16_t, v64hi, 31) \
+ T (int16_t, v64hi, 32) \
+ T (int16_t, v64hi, 47) \
+ T (int16_t, v64hi, 63) \
+ T (int8_t, v128qi, 0) \
+ T (int8_t, v128qi, 1) \
+ T (int8_t, v128qi, 15) \
+ T (int8_t, v128qi, 16) \
+ T (int8_t, v128qi, 63) \
+ T (int8_t, v128qi, 64) \
+ T (int8_t, v128qi, 100) \
+ T (int8_t, v128qi, 127) \
+ T (double, v16df, 0) \
+ T (double, v16df, 1) \
+ T (double, v16df, 2) \
+ T (double, v16df, 7) \
+ T (double, v16df, 8) \
+ T (double, v16df, 9) \
+ T (double, v16df, 15) \
+ T (float, v32sf, 0) \
+ T (float, v32sf, 1) \
+ T (float, v32sf, 3) \
+ T (float, v32sf, 4) \
+ T (float, v32sf, 15) \
+ T (float, v32sf, 16) \
+ T (float, v32sf, 21) \
+ T (float, v32sf, 31) \
+ T (_Float16, v64hf, 0) \
+ T (_Float16, v64hf, 1) \
+ T (_Float16, v64hf, 7) \
+ T (_Float16, v64hf, 8) \
+ T (_Float16, v64hf, 31) \
+ T (_Float16, v64hf, 32) \
+ T (_Float16, v64hf, 47) \
+ T (_Float16, v64hf, 63)
+
+TEST_ALL (EXTRACT)
+
+/* { dg-final { scan-assembler-times {\tumov\tx[0-9]+, v[0-9]+\.d\[0\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tx[0-9]+, v[0-9]+\.d\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\td[0-9]+, v[0-9]+\.d\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\td[0-9]+, v[0-9]+\.d\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.d, z[0-9]+\.d\[2\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.d, z[0-9]+\.d\[7\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tx[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[0\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[3\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\ts[0-9]+, v[0-9]+\.s\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\ts[0-9]+, v[0-9]+\.s\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\ts[0-9]+, v[0-9]+\.s\[3\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.s, z[0-9]+\.s\[4\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.s, z[0-9]+\.s\[15\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
+
+/* Also used to move the result of a non-Advanced SIMD extract. */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[0\]\n} 5 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[7\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\th[0-9]+, v[0-9]+\.h\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\th[0-9]+, v[0-9]+\.h\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\th[0-9]+, v[0-9]+\.h\[7\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[8\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[31\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
+
+/* Also used to move the result of a non-Advanced SIMD extract. */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[0\]\n} 5 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[15\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.b, z[0-9]+\.b\[16\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.b, z[0-9]+\.b\[63\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.b\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #64\n} 7 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #72\n} 2 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #84\n} 2 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #94\n} 2 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #100\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_extract_4.c b/gcc/testsuite/gcc.target/aarch64/sve_extract_4.c
new file mode 100644
index 00000000000..e61a2fa94e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_extract_4.c
@@ -0,0 +1,135 @@
+/* { dg-do assemble } */
+/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=2048 --save-temps" } */
+
+#include <stdint.h>
+
+typedef int64_t v32di __attribute__((vector_size (256)));
+typedef int32_t v64si __attribute__((vector_size (256)));
+typedef int16_t v128hi __attribute__((vector_size (256)));
+typedef int8_t v256qi __attribute__((vector_size (256)));
+typedef double v32df __attribute__((vector_size (256)));
+typedef float v64sf __attribute__((vector_size (256)));
+typedef _Float16 v128hf __attribute__((vector_size (256)));
+
+#define EXTRACT(ELT_TYPE, TYPE, INDEX) \
+ ELT_TYPE permute_##TYPE##_##INDEX (void) \
+ { \
+ TYPE values; \
+ asm ("" : "=w" (values)); \
+ return values[INDEX]; \
+ }
+
+#define TEST_ALL(T) \
+ T (int64_t, v32di, 0) \
+ T (int64_t, v32di, 1) \
+ T (int64_t, v32di, 2) \
+ T (int64_t, v32di, 7) \
+ T (int64_t, v32di, 8) \
+ T (int64_t, v32di, 9) \
+ T (int64_t, v32di, 15) \
+ T (int64_t, v32di, 31) \
+ T (int32_t, v64si, 0) \
+ T (int32_t, v64si, 1) \
+ T (int32_t, v64si, 3) \
+ T (int32_t, v64si, 4) \
+ T (int32_t, v64si, 15) \
+ T (int32_t, v64si, 16) \
+ T (int32_t, v64si, 21) \
+ T (int32_t, v64si, 31) \
+ T (int32_t, v64si, 63) \
+ T (int16_t, v128hi, 0) \
+ T (int16_t, v128hi, 1) \
+ T (int16_t, v128hi, 7) \
+ T (int16_t, v128hi, 8) \
+ T (int16_t, v128hi, 31) \
+ T (int16_t, v128hi, 32) \
+ T (int16_t, v128hi, 47) \
+ T (int16_t, v128hi, 63) \
+ T (int16_t, v128hi, 127) \
+ T (int8_t, v256qi, 0) \
+ T (int8_t, v256qi, 1) \
+ T (int8_t, v256qi, 15) \
+ T (int8_t, v256qi, 16) \
+ T (int8_t, v256qi, 63) \
+ T (int8_t, v256qi, 64) \
+ T (int8_t, v256qi, 100) \
+ T (int8_t, v256qi, 127) \
+ T (int8_t, v256qi, 255) \
+ T (double, v32df, 0) \
+ T (double, v32df, 1) \
+ T (double, v32df, 2) \
+ T (double, v32df, 7) \
+ T (double, v32df, 8) \
+ T (double, v32df, 9) \
+ T (double, v32df, 15) \
+ T (double, v32df, 31) \
+ T (float, v64sf, 0) \
+ T (float, v64sf, 1) \
+ T (float, v64sf, 3) \
+ T (float, v64sf, 4) \
+ T (float, v64sf, 15) \
+ T (float, v64sf, 16) \
+ T (float, v64sf, 21) \
+ T (float, v64sf, 31) \
+ T (float, v64sf, 63) \
+ T (_Float16, v128hf, 0) \
+ T (_Float16, v128hf, 1) \
+ T (_Float16, v128hf, 7) \
+ T (_Float16, v128hf, 8) \
+ T (_Float16, v128hf, 31) \
+ T (_Float16, v128hf, 32) \
+ T (_Float16, v128hf, 47) \
+ T (_Float16, v128hf, 63) \
+ T (_Float16, v128hf, 127)
+
+TEST_ALL (EXTRACT)
+
+/* { dg-final { scan-assembler-times {\tumov\tx[0-9]+, v[0-9]+\.d\[0\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tx[0-9]+, v[0-9]+\.d\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\td[0-9]+, v[0-9]+\.d\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\td[0-9]+, v[0-9]+\.d\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.d, z[0-9]+\.d\[2\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.d, z[0-9]+\.d\[7\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tx[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[0\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.s\[3\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\ts[0-9]+, v[0-9]+\.s\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\ts[0-9]+, v[0-9]+\.s\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\ts[0-9]+, v[0-9]+\.s\[3\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.s, z[0-9]+\.s\[4\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.s, z[0-9]+\.s\[15\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
+
+/* Also used to move the result of a non-Advanced SIMD extract. */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[0\]\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.h\[7\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tdup\th[0-9]+, v[0-9]+\.h\[0\]\n} } } */
+/* { dg-final { scan-assembler-times {\tdup\th[0-9]+, v[0-9]+\.h\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\th[0-9]+, v[0-9]+\.h\[7\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[8\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.h, z[0-9]+\.h\[31\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
+
+/* Also used to move the result of a non-Advanced SIMD extract. */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[0\]\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[1\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumov\tw[0-9]+, v[0-9]+\.b\[15\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.b, z[0-9]+\.b\[16\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.b, z[0-9]+\.b\[63\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tlastb\tw[0-9]+, p[0-7], z[0-9]+\.b\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #64\n} 7 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #72\n} 2 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #84\n} 2 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #94\n} 2 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #100\n} 1 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #120\n} 2 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #124\n} 2 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #126\n} 2 } } */
+/* { dg-final { scan-assembler-times {\text\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b, #127\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fabs_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fabs_1.c
index 61ec667363a..33e1db5d1df 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fabs_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fabs_1.c
@@ -9,9 +9,10 @@ vsqrt_##TYPE (TYPE *dst, TYPE *src, int count) \
dst[i] = __builtin_##OP (src[i]); \
}
-
+DO_OPS (_Float16, fabsf)
DO_OPS (float, fabsf)
DO_OPS (double, fabs)
+/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_signed_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_signed_1.c
index 8fd41db0a1f..7c5f6ddc996 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_signed_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_signed_1.c
@@ -1,17 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
-void vfcvtz_32 (signed int *dst, float *src1, int size)
+#include <stdint.h>
+
+void __attribute__ ((noinline, noclone))
+vfcvtz_16 (int16_t *dst, _Float16 *src1, int size)
+{
+ for (int i = 0; i < size; i++)
+ dst[i] = (int16_t) src1[i];
+}
+
+void __attribute__ ((noinline, noclone))
+vfcvtz_32 (int32_t *dst, float *src1, int size)
{
for (int i = 0; i < size; i++)
- dst[i] = (signed int) src1[i];
+ dst[i] = (int32_t) src1[i];
}
-void vfcvtz_64 (signed long *dst, double *src1, int size)
+void __attribute__ ((noinline, noclone))
+vfcvtz_64 (int64_t *dst, double *src1, int size)
{
for (int i = 0; i < size; i++)
- dst[i] = (signed long) src1[i];
+ dst[i] = (int64_t) src1[i];
}
+/* { dg-final { scan-assembler-times {\tfcvtzs\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfcvtzs\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfcvtzs\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_signed_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_signed_1_run.c
index 58ae7737a89..48968f8ce19 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_signed_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_signed_1_run.c
@@ -1,47 +1,47 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
+/* { dg-options "-O3 -march=armv8-a+sve" } */
#include "sve_fcvtz_signed_1.c"
#define ARRAY_SIZE 81
-#define VAL1 ((i * 237.86) - (29 * 237.86))
-#define VAL2 ((double) ((i * 0xf8dfef2f) - (11 * 0xf8dfef2f)))
+#define VAL1 ((i * 17) - 180)
+#define VAL2 ((i * 237.86) - (29 * 237.86))
+#define VAL3 ((double) ((i * 0xf8dfef2f) - (11 * 0xf8dfef2f)))
int __attribute__ ((optimize (1)))
main (void)
{
- static signed int array_desti[ARRAY_SIZE];
- static signed long array_destl[ARRAY_SIZE];
+ static int16_t array_dest16[ARRAY_SIZE];
+ static int32_t array_dest32[ARRAY_SIZE];
+ static int64_t array_dest64[ARRAY_SIZE];
- float array_source_f[ARRAY_SIZE];
- double array_source_d[ARRAY_SIZE];
+ _Float16 array_source16[ARRAY_SIZE];
+ float array_source32[ARRAY_SIZE];
+ double array_source64[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++)
{
- array_source_f[i] = VAL1;
- array_source_d[i] = VAL2;
+ array_source16[i] = VAL1;
+ array_source32[i] = VAL2;
+ array_source64[i] = VAL3;
+ asm volatile ("" ::: "memory");
}
- vfcvtz_32 (array_desti, array_source_f, ARRAY_SIZE);
+ vfcvtz_16 (array_dest16, array_source16, ARRAY_SIZE);
+ for (int i = 0; i < ARRAY_SIZE; i++)
+ if (array_dest16[i] != (int16_t) VAL1)
+ __builtin_abort ();
+
+ vfcvtz_32 (array_dest32, array_source32, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_desti[i] != (int) VAL1)
- {
- fprintf (stderr,"%d: %d != %d\n", i, array_desti[i], (int) VAL1);
- exit (1);
- }
+ if (array_dest32[i] != (int32_t) VAL2)
+ __builtin_abort ();
- vfcvtz_64 (array_destl, array_source_d, ARRAY_SIZE);
+ vfcvtz_64 (array_dest64, array_source64, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_destl[i] != (long) VAL2)
- {
- fprintf (stderr,"%d: %ld != %ld\n", i, array_destl[i], (long) VAL2);
- exit (1);
- }
+ if (array_dest64[i] != (int64_t) VAL3)
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_unsigned_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_unsigned_1.c
index b4dcd26cfd0..2691cf0bc17 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_unsigned_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_unsigned_1.c
@@ -1,17 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
-void vfcvtz_32 (unsigned int *dst, float *src1, int size)
+#include <stdint.h>
+
+void __attribute__ ((noinline, noclone))
+vfcvtz_16 (uint16_t *dst, _Float16 *src1, int size)
+{
+ for (int i = 0; i < size; i++)
+ dst[i] = (uint16_t) src1[i];
+}
+
+void __attribute__ ((noinline, noclone))
+vfcvtz_32 (uint32_t *dst, float *src1, int size)
{
for (int i = 0; i < size; i++)
- dst[i] = (unsigned int) src1[i];
+ dst[i] = (uint32_t) src1[i];
}
-void vfcvtz_64 (unsigned long *dst, double *src1, int size)
+void __attribute__ ((noinline, noclone))
+vfcvtz_64 (uint64_t *dst, double *src1, int size)
{
for (int i = 0; i < size; i++)
- dst[i] = (unsigned long) src1[i];
+ dst[i] = (uint64_t) src1[i];
}
+/* { dg-final { scan-assembler-times {\tfcvtzu\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfcvtzu\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfcvtzu\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_unsigned_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_unsigned_1_run.c
index e196d174c66..9c1be7c8a6f 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_unsigned_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fcvtz_unsigned_1_run.c
@@ -1,47 +1,47 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include "sve_fcvtz_unsigned_1.c"
#define ARRAY_SIZE 75
-#define VAL1 (i * 2574.33)
-#define VAL2 ((double) (i * 0xff23efef))
+#define VAL1 (i * 19)
+#define VAL2 (i * 2574.33)
+#define VAL3 ((double) (i * 0xff23efef))
int __attribute__ ((optimize (1)))
main (void)
{
- static unsigned int array_desti[ARRAY_SIZE];
- static unsigned long array_destl[ARRAY_SIZE];
+ static uint16_t array_dest16[ARRAY_SIZE];
+ static uint32_t array_dest32[ARRAY_SIZE];
+ static uint64_t array_dest64[ARRAY_SIZE];
- float array_source_f[ARRAY_SIZE];
- double array_source_d[ARRAY_SIZE];
+ _Float16 array_source16[ARRAY_SIZE];
+ float array_source32[ARRAY_SIZE];
+ double array_source64[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++)
{
- array_source_f[i] = VAL1;
- array_source_d[i] = VAL2;
+ array_source16[i] = VAL1;
+ array_source32[i] = VAL2;
+ array_source64[i] = VAL3;
+ asm volatile ("" ::: "memory");
}
- vfcvtz_32 (array_desti, array_source_f, ARRAY_SIZE);
+ vfcvtz_16 (array_dest16, array_source16, ARRAY_SIZE);
+ for (int i = 0; i < ARRAY_SIZE; i++)
+ if (array_dest16[i] != (uint16_t) VAL1)
+ __builtin_abort ();
+
+ vfcvtz_32 (array_dest32, array_source32, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_desti[i] != (int) VAL1)
- {
- fprintf (stderr,"%d: %d != %d\n", i, array_desti[i], (int) VAL1);
- exit (1);
- }
+ if (array_dest32[i] != (uint32_t) VAL2)
+ __builtin_abort ();
- vfcvtz_64 (array_destl, array_source_d, ARRAY_SIZE);
+ vfcvtz_64 (array_dest64, array_source64, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_destl[i] != (long) VAL2)
- {
- fprintf (stderr,"%d: %ld != %ld\n", i, array_destl[i], (long) VAL2);
- exit (1);
- }
+ if (array_dest64[i] != (uint64_t) VAL3)
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fdiv_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fdiv_1.c
index d0becaf25f1..b193726ea0a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fdiv_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fdiv_1.c
@@ -1,30 +1,41 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+typedef _Float16 v16hf __attribute__((vector_size(32)));
typedef float v8sf __attribute__((vector_size(32)));
typedef double v4df __attribute__((vector_size(32)));
-#define DO_OP(TYPE) \
-void vdiv##TYPE (TYPE *dst, TYPE src1) \
-{ \
- *dst = *dst / src1; \
-} \
-void vdivr##TYPE (TYPE *_dst, TYPE _src1) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- dst = src1 / dst; \
- *_dst = dst; \
+#define DO_OP(TYPE) \
+void vdiv_##TYPE (TYPE *x, TYPE y) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src asm("z2"); \
+ dst = *x; \
+ src = y; \
+ asm volatile ("" :: "w" (dst), "w" (src)); \
+ dst = dst / src; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
+} \
+void vdivr_##TYPE (TYPE *x, TYPE y) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src asm("z2"); \
+ dst = *x; \
+ src = y; \
+ asm volatile ("" :: "w" (dst), "w" (src)); \
+ dst = src / dst; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
+DO_OP (v16hf)
DO_OP (v8sf)
DO_OP (v4df)
+/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+
/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fdup_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fdup_1.c
index 9ed825b9d35..148e0f9bd89 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fdup_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fdup_1.c
@@ -1,22 +1,24 @@
-/* { dg-do compile } */
-/* { dg-options "-O3 -fno-inline -march=armv8-a+sve" } */
+/* { dg-do assemble } */
+/* -fno-tree-loop-distribute-patterns prevents conversion to memset. */
+/* { dg-options "-O3 -march=armv8-a+sve -fno-tree-loop-distribute-patterns --save-temps" } */
#include <stdint.h>
#define NUM_ELEMS(TYPE) (1024 / sizeof (TYPE))
-#define DEF_SET_IMM(TYPE,IMM,SUFFIX) \
-void set_##TYPE##SUFFIX (TYPE *restrict a) \
+#define DEF_SET_IMM(TYPE, IMM, SUFFIX) \
+void __attribute__ ((noinline, noclone)) \
+set_##TYPE##_##SUFFIX (TYPE *a) \
{ \
for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
a[i] = IMM; \
}
#define DEF_SET_IMM_FP(IMM, SUFFIX) \
-DEF_SET_IMM (float, IMM, SUFFIX) \
-DEF_SET_IMM (double, IMM, SUFFIX)
+ DEF_SET_IMM (float, IMM, SUFFIX) \
+ DEF_SET_IMM (double, IMM, SUFFIX)
-//Valid
+/* Valid. */
DEF_SET_IMM_FP (1, imm1)
DEF_SET_IMM_FP (0x1.1p0, imm1p0)
DEF_SET_IMM_FP (0x1.fp0, immfp0)
@@ -25,8 +27,10 @@ DEF_SET_IMM_FP (0x1.1p-3, imm1pm3)
DEF_SET_IMM_FP (0x1.fp4, immfp4)
DEF_SET_IMM_FP (0x1.fp-3, immfpm3)
-//Invalid
+/* Should use MOV instead. */
DEF_SET_IMM_FP (0, imm0)
+
+/* Invalid. */
DEF_SET_IMM_FP (0x1.1fp0, imm1fp0)
DEF_SET_IMM_FP (0x1.1p5, imm1p5)
DEF_SET_IMM_FP (0x1.1p-4, imm1pm4)
@@ -43,6 +47,8 @@ DEF_SET_IMM_FP (0x1.1fp-4, imm1fpm4)
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #3.1e\+1\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2.421875e-1\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, #0\n} 1 } } */
+
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d,} 7 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #1.0e\+0\n} 1 } } */
@@ -52,3 +58,5 @@ DEF_SET_IMM_FP (0x1.1fp-4, imm1fpm4)
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #1.328125e-1\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #3.1e\+1\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2.421875e-1\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #0\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fdup_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_fdup_1_run.c
index bcd5180bbc6..f4cb1a0bf71 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fdup_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fdup_1_run.c
@@ -1,28 +1,24 @@
/* { dg-do run { target { aarch64_sve_hw } } } */
-/* { dg-options "-O3 -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O3 -march=armv8-a+sve -fno-tree-loop-distribute-patterns" } */
#include "sve_fdup_1.c"
-#include <stdlib.h>
-
#define TEST_SET_IMM(TYPE,IMM,SUFFIX) \
{ \
TYPE v[NUM_ELEMS (TYPE)]; \
- set_##TYPE##SUFFIX (v); \
+ set_##TYPE##_##SUFFIX (v); \
for (int i = 0; i < NUM_ELEMS (TYPE); i++ ) \
if (v[i] != IMM) \
- result++; \
+ __builtin_abort (); \
}
#define TEST_SET_IMM_FP(IMM, SUFFIX) \
-TEST_SET_IMM (float, IMM, SUFFIX) \
-TEST_SET_IMM (double, IMM, SUFFIX)
-
+ TEST_SET_IMM (float, IMM, SUFFIX) \
+ TEST_SET_IMM (double, IMM, SUFFIX)
-int main (int argc, char **argv)
+int __attribute__ ((optimize (1)))
+main (int argc, char **argv)
{
- int result = 0;
-
TEST_SET_IMM_FP (1, imm1)
TEST_SET_IMM_FP (0x1.1p0, imm1p0)
TEST_SET_IMM_FP (0x1.fp0, immfp0)
@@ -38,8 +34,5 @@ int main (int argc, char **argv)
TEST_SET_IMM_FP (0x1.1fp5, imm1fp5)
TEST_SET_IMM_FP (0x1.1fp-4, imm1fpm4)
- if (result != 0)
- abort ();
-
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fmad_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fmad_1.c
index 75e39c4e3e4..2b1dbb087bc 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fmad_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fmad_1.c
@@ -1,28 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+typedef _Float16 v16hf __attribute__((vector_size(32)));
typedef float v8sf __attribute__((vector_size(32)));
typedef double v4df __attribute__((vector_size(32)));
-#define DO_OP(TYPE) \
-void vmad##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = (dst * src1) + src2; \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+#define DO_OP(TYPE) \
+void vmad##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = (dst * src1) + src2; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
+DO_OP (v16hf)
DO_OP (v8sf)
DO_OP (v4df)
-/* { dg-final { scan-assembler-times {\tfmad\tz0.s, p[0-7]/m, z2.s, z4.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmad\tz0.d, p[0-7]/m, z2.d, z4.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmad\tz0\.h, p[0-7]/m, z2\.h, z4\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmad\tz0\.s, p[0-7]/m, z2\.s, z4\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmad\tz0\.d, p[0-7]/m, z2\.d, z4\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fmla_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fmla_1.c
index 657773eada1..d5e4df266bf 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fmla_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fmla_1.c
@@ -1,28 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+typedef _Float16 v16hf __attribute__((vector_size(32)));
typedef float v8sf __attribute__((vector_size(32)));
typedef double v4df __attribute__((vector_size(32)));
-#define DO_OP(TYPE) \
-void vmla##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = (src1 * src2) + dst; \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+#define DO_OP(TYPE) \
+void vmad##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = (src1 * src2) + dst; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
+DO_OP (v16hf)
DO_OP (v8sf)
DO_OP (v4df)
-/* { dg-final { scan-assembler-times {\tfmla\tz0.s, p[0-7]/m, z2.s, z4.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmla\tz0.d, p[0-7]/m, z2.d, z4.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmla\tz0\.h, p[0-7]/m, z2\.h, z4\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmla\tz0\.s, p[0-7]/m, z2\.s, z4\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmla\tz0\.d, p[0-7]/m, z2\.d, z4\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fmls_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fmls_1.c
index 5aca3b145a9..c3f2c8a5823 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fmls_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fmls_1.c
@@ -1,28 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+typedef _Float16 v16hf __attribute__((vector_size(32)));
typedef float v8sf __attribute__((vector_size(32)));
typedef double v4df __attribute__((vector_size(32)));
-#define DO_OP(TYPE) \
-void vmls##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = (-src1 * src2) + dst; \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+#define DO_OP(TYPE) \
+void vmad##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = (-src1 * src2) + dst; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
+DO_OP (v16hf)
DO_OP (v8sf)
DO_OP (v4df)
-/* { dg-final { scan-assembler-times {\tfmls\tz0.s, p[0-7]/m, z2.s, z4.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmls\tz0.d, p[0-7]/m, z2.d, z4.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmls\tz0\.h, p[0-7]/m, z2\.h, z4\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmls\tz0\.s, p[0-7]/m, z2\.s, z4\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmls\tz0\.d, p[0-7]/m, z2\.d, z4\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fmsb_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fmsb_1.c
index 5f4143fc5da..30e1895c8d5 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fmsb_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fmsb_1.c
@@ -1,28 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+typedef _Float16 v16hf __attribute__((vector_size(32)));
typedef float v8sf __attribute__((vector_size(32)));
typedef double v4df __attribute__((vector_size(32)));
-#define DO_OP(TYPE) \
-void vmsb##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = (-dst * src1) + src2; \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+#define DO_OP(TYPE) \
+void vmad##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = (-dst * src1) + src2; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
+DO_OP (v16hf)
DO_OP (v8sf)
DO_OP (v4df)
-/* { dg-final { scan-assembler-times {\tfmsb\tz0.s, p[0-7]/m, z2.s, z4.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmsb\tz0.d, p[0-7]/m, z2.d, z4.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmsb\tz0\.h, p[0-7]/m, z2\.h, z4\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmsb\tz0\.s, p[0-7]/m, z2\.s, z4\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmsb\tz0\.d, p[0-7]/m, z2\.d, z4\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fmul_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fmul_1.c
index f4fb574beac..3b648297963 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fmul_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fmul_1.c
@@ -1,31 +1,38 @@
/* { dg-do assemble } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve --save-temps" } */
+/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
#define DO_REGREG_OPS(TYPE, OP, NAME) \
-void varith_##TYPE##_##NAME (TYPE* dst, TYPE* src, int count) \
+void varith_##TYPE##_##NAME (TYPE *dst, TYPE *src, int count) \
{ \
for (int i = 0; i < count; ++i) \
dst[i] = dst[i] OP src[i]; \
}
#define DO_IMMEDIATE_OPS(VALUE, TYPE, OP, NAME) \
-void varithimm_##NAME##_##TYPE (TYPE* dst, int count) \
+void varithimm_##NAME##_##TYPE (TYPE *dst, int count) \
{ \
for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] OP VALUE; \
+ dst[i] = dst[i] OP (TYPE) VALUE; \
}
#define DO_ARITH_OPS(TYPE, OP, NAME) \
-DO_REGREG_OPS (TYPE, OP, NAME); \
-DO_IMMEDIATE_OPS (0.5, TYPE, OP, NAME ## 0point5); \
-DO_IMMEDIATE_OPS (2, TYPE, OP, NAME ## 2); \
-DO_IMMEDIATE_OPS (5, TYPE, OP, NAME ## 5); \
-DO_IMMEDIATE_OPS (-0.5, TYPE, OP, NAME ## minus0point5); \
-DO_IMMEDIATE_OPS (-2, TYPE, OP, NAME ## minus2);
+ DO_REGREG_OPS (TYPE, OP, NAME); \
+ DO_IMMEDIATE_OPS (0.5, TYPE, OP, NAME ## 0point5); \
+ DO_IMMEDIATE_OPS (2, TYPE, OP, NAME ## 2); \
+ DO_IMMEDIATE_OPS (5, TYPE, OP, NAME ## 5); \
+ DO_IMMEDIATE_OPS (-0.5, TYPE, OP, NAME ## minus0point5); \
+ DO_IMMEDIATE_OPS (-2, TYPE, OP, NAME ## minus2);
+DO_ARITH_OPS (_Float16, *, mul)
DO_ARITH_OPS (float, *, mul)
DO_ARITH_OPS (double, *, mul)
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0.5\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2} } } */
+/* { dg-final { scan-assembler-not {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #5} } } */
+/* { dg-final { scan-assembler-not {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #-} } } */
+
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0.5\n} 1 } } */
/* { dg-final { scan-assembler-not {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fneg_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fneg_1.c
index ddb703bf875..7af81662fb9 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fneg_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fneg_1.c
@@ -1,15 +1,17 @@
/* { dg-do assemble } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve --save-temps" } */
+/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
#define DO_OPS(TYPE) \
-void vneg_##TYPE (TYPE* dst, TYPE* src, int count) \
+void vneg_##TYPE (TYPE *dst, TYPE *src, int count) \
{ \
for (int i = 0; i < count; ++i) \
dst[i] = -src[i]; \
}
+DO_OPS (_Float16)
DO_OPS (float)
DO_OPS (double)
+/* { dg-final { scan-assembler-times {\tfneg\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfneg\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfneg\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fnmad_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fnmad_1.c
index 877261029db..84a95187314 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fnmad_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fnmad_1.c
@@ -1,28 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+typedef _Float16 v16hf __attribute__((vector_size(32)));
typedef float v8sf __attribute__((vector_size(32)));
typedef double v4df __attribute__((vector_size(32)));
-#define DO_OP(TYPE) \
-void vfnmad##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = (-src2) + (-dst * src1); \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+#define DO_OP(TYPE) \
+void vmad##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = (-dst * src1) - src2; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
+DO_OP (v16hf)
DO_OP (v8sf)
DO_OP (v4df)
-/* { dg-final { scan-assembler-times {\tfnmad\tz0.s, p[0-7]/m, z2.s, z4.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfnmad\tz0.d, p[0-7]/m, z2.d, z4.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmad\tz0\.h, p[0-7]/m, z2\.h, z4\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmad\tz0\.s, p[0-7]/m, z2\.s, z4\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmad\tz0\.d, p[0-7]/m, z2\.d, z4\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fnmla_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fnmla_1.c
index 463c90550d6..dcc4811f1d8 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fnmla_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fnmla_1.c
@@ -1,28 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+typedef _Float16 v16hf __attribute__((vector_size(32)));
typedef float v8sf __attribute__((vector_size(32)));
typedef double v4df __attribute__((vector_size(32)));
-#define DO_OP(TYPE) \
-void vfnmla##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = (-dst) + (-src1 * src2); \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+#define DO_OP(TYPE) \
+void vmad##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = (-src1 * src2) - dst; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
+DO_OP (v16hf)
DO_OP (v8sf)
DO_OP (v4df)
-/* { dg-final { scan-assembler-times {\tfnmla\tz0.s, p[0-7]/m, z2.s, z4.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfnmla\tz0.d, p[0-7]/m, z2.d, z4.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmla\tz0\.h, p[0-7]/m, z2\.h, z4\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmla\tz0\.s, p[0-7]/m, z2\.s, z4\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmla\tz0\.d, p[0-7]/m, z2\.d, z4\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fnmls_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fnmls_1.c
index 312600fea20..7a89399f4be 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fnmls_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fnmls_1.c
@@ -1,28 +1,29 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+typedef _Float16 v16hf __attribute__((vector_size(32)));
typedef float v8sf __attribute__((vector_size(32)));
typedef double v4df __attribute__((vector_size(32)));
-#define DO_OP(TYPE) \
-void vfnmls##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = (-dst) + (src1 * src2); \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+#define DO_OP(TYPE) \
+void vmad##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = (src1 * src2) - dst; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
+DO_OP (v16hf)
DO_OP (v8sf)
DO_OP (v4df)
-/* { dg-final { scan-assembler-times {\tfnmls\tz0.s, p[0-7]/m, z2.s, z4.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfnmls\tz0.d, p[0-7]/m, z2.d, z4.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmls\tz0\.h, p[0-7]/m, z2\.h, z4\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmls\tz0\.s, p[0-7]/m, z2\.s, z4\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmls\tz0\.d, p[0-7]/m, z2\.d, z4\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fnmsb_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fnmsb_1.c
index 71e36b0028f..6c95b0abc8e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fnmsb_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fnmsb_1.c
@@ -1,28 +1,29 @@
/* { dg-do assemble } */
/* { dg-options " -O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+typedef _Float16 v16hf __attribute__((vector_size(32)));
typedef float v8sf __attribute__((vector_size(32)));
typedef double v4df __attribute__((vector_size(32)));
-#define DO_OP(TYPE) \
-void vfnmsb##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = (-src2) + (dst * src1); \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+#define DO_OP(TYPE) \
+void vmad##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = (dst * src1) - src2; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
+DO_OP (v16hf)
DO_OP (v8sf)
DO_OP (v4df)
-/* { dg-final { scan-assembler-times {\tfnmsb\tz0.s, p[0-7]/m, z2.s, z4.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfnmsb\tz0.d, p[0-7]/m, z2.d, z4.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmsb\tz0\.h, p[0-7]/m, z2\.h, z4\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmsb\tz0\.s, p[0-7]/m, z2\.s, z4\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfnmsb\tz0\.d, p[0-7]/m, z2\.d, z4\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fp_arith_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fp_arith_1.c
index a4c09074d43..06fea806038 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fp_arith_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fp_arith_1.c
@@ -1,36 +1,51 @@
/* { dg-do assemble } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve --save-temps" } */
+/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
#define DO_REGREG_OPS(TYPE, OP, NAME) \
-void varith_##TYPE##_##NAME (TYPE* dst, TYPE* src, int count) \
+void varith_##TYPE##_##NAME (TYPE *dst, TYPE *src, int count) \
{ \
for (int i = 0; i < count; ++i) \
dst[i] = dst[i] OP src[i]; \
}
#define DO_IMMEDIATE_OPS(VALUE, TYPE, OP, NAME) \
- void varithimm_##NAME##_##TYPE (TYPE* dst, int count) \
+void varithimm_##NAME##_##TYPE (TYPE *dst, int count) \
{ \
for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] OP VALUE; \
+ dst[i] = dst[i] OP (TYPE) VALUE; \
}
#define DO_ARITH_OPS(TYPE, OP, NAME) \
-DO_REGREG_OPS (TYPE, OP, NAME); \
-DO_IMMEDIATE_OPS (1, TYPE, OP, NAME ## 1); \
-DO_IMMEDIATE_OPS (0.5, TYPE, OP, NAME ## pointfive); \
-DO_IMMEDIATE_OPS (2, TYPE, OP, NAME ## 2); \
-DO_IMMEDIATE_OPS (2.5, TYPE, OP, NAME ## twopoint5); \
-DO_IMMEDIATE_OPS (-0.5, TYPE, OP, NAME ## minuspointfive); \
-DO_IMMEDIATE_OPS (-1, TYPE, OP, NAME ## minus1);
+ DO_REGREG_OPS (TYPE, OP, NAME); \
+ DO_IMMEDIATE_OPS (1, TYPE, OP, NAME ## 1); \
+ DO_IMMEDIATE_OPS (0.5, TYPE, OP, NAME ## pointfive); \
+ DO_IMMEDIATE_OPS (2, TYPE, OP, NAME ## 2); \
+ DO_IMMEDIATE_OPS (2.5, TYPE, OP, NAME ## twopoint5); \
+ DO_IMMEDIATE_OPS (-0.5, TYPE, OP, NAME ## minuspointfive); \
+ DO_IMMEDIATE_OPS (-1, TYPE, OP, NAME ## minus1);
+DO_ARITH_OPS (_Float16, +, add)
DO_ARITH_OPS (float, +, add)
DO_ARITH_OPS (double, +, add)
+
+DO_ARITH_OPS (_Float16, -, minus)
DO_ARITH_OPS (float, -, minus)
DO_ARITH_OPS (double, -, minus)
/* No specific count because it's valid to use fadd or fsub for the
out-of-range constants. */
+/* { dg-final { scan-assembler {\tfadd\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1.0\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0.5\n} 2 } } */
+/* { dg-final { scan-assembler-not {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2} } } */
+/* { dg-final { scan-assembler-not {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #-} } } */
+
+/* { dg-final { scan-assembler {\tfsub\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1.0\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0.5\n} 2 } } */
+/* { dg-final { scan-assembler-not {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2} } } */
+/* { dg-final { scan-assembler-not {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #-} } } */
+
/* { dg-final { scan-assembler {\tfadd\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1.0\n} 2 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0.5\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_frinta_1.c b/gcc/testsuite/gcc.target/aarch64/sve_frinta_1.c
index 37f26fc8203..bad2be4ed33 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_frinta_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_frinta_1.c
@@ -9,7 +9,6 @@ vsqrt_##TYPE (TYPE *dst, TYPE *src, int count) \
dst[i] = __builtin_##OP (src[i]); \
}
-
DO_OPS (float, roundf)
DO_OPS (double, round)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_frinti_1.c b/gcc/testsuite/gcc.target/aarch64/sve_frinti_1.c
index 9faf2a3f81b..4407fb56caa 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_frinti_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_frinti_1.c
@@ -9,7 +9,6 @@ vsqrt_##TYPE (TYPE *dst, TYPE *src, int count) \
dst[i] = __builtin_##OP (src[i]); \
}
-
DO_OPS (float, nearbyintf)
DO_OPS (double, nearbyint)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_frintm_1.c b/gcc/testsuite/gcc.target/aarch64/sve_frintm_1.c
index b59d21ff0c7..01bf65db343 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_frintm_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_frintm_1.c
@@ -9,7 +9,6 @@ vsqrt_##TYPE (TYPE *dst, TYPE *src, int count) \
dst[i] = __builtin_##OP (src[i]); \
}
-
DO_OPS (float, floorf)
DO_OPS (double, floor)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_frintp_1.c b/gcc/testsuite/gcc.target/aarch64/sve_frintp_1.c
index d9a55e3ade5..f8b2c08ac63 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_frintp_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_frintp_1.c
@@ -9,7 +9,6 @@ vsqrt_##TYPE (TYPE *dst, TYPE *src, int count) \
dst[i] = __builtin_##OP (src[i]); \
}
-
DO_OPS (float, ceilf)
DO_OPS (double, ceil)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_frintx_1.c b/gcc/testsuite/gcc.target/aarch64/sve_frintx_1.c
index 012d9cb9de5..a062295011a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_frintx_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_frintx_1.c
@@ -9,7 +9,6 @@ vsqrt_##TYPE (TYPE *dst, TYPE *src, int count) \
dst[i] = __builtin_##OP (src[i]); \
}
-
DO_OPS (float, rintf)
DO_OPS (double, rint)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_frintz_1.c b/gcc/testsuite/gcc.target/aarch64/sve_frintz_1.c
index 2ae8f0026a7..207814f5506 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_frintz_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_frintz_1.c
@@ -9,7 +9,6 @@ vsqrt_##TYPE (TYPE *dst, TYPE *src, int count) \
dst[i] = __builtin_##OP (src[i]); \
}
-
DO_OPS (float, truncf)
DO_OPS (double, trunc)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fsqrt_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fsqrt_1.c
index 224c6ccfe6f..55081c3bf4f 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fsqrt_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fsqrt_1.c
@@ -9,7 +9,6 @@ vsqrt_##TYPE (TYPE *dst, TYPE *src, int count) \
dst[i] = __builtin_##OP (src[i]); \
}
-
DO_OPS (float, sqrtf)
DO_OPS (double, sqrt)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_fsubr_1.c b/gcc/testsuite/gcc.target/aarch64/sve_fsubr_1.c
index e664bf38c29..b252ef059ce 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_fsubr_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_fsubr_1.c
@@ -1,23 +1,30 @@
/* { dg-do assemble } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve --save-temps" } */
+/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
#define DO_IMMEDIATE_OPS(VALUE, TYPE, NAME) \
-void vsubrarithimm_##NAME##_##TYPE (TYPE* dst, int count) \
+void vsubrarithimm_##NAME##_##TYPE (TYPE *dst, int count) \
{ \
for (int i = 0; i < count; ++i) \
- dst[i] = VALUE - dst[i]; \
+ dst[i] = (TYPE) VALUE - dst[i]; \
}
#define DO_ARITH_OPS(TYPE) \
-DO_IMMEDIATE_OPS (0, TYPE, 0); \
-DO_IMMEDIATE_OPS (1, TYPE, 1); \
-DO_IMMEDIATE_OPS (0.5, TYPE, 0point5); \
-DO_IMMEDIATE_OPS (2, TYPE, 2); \
-DO_IMMEDIATE_OPS (3.5, TYPE, 3point5);
+ DO_IMMEDIATE_OPS (0, TYPE, 0); \
+ DO_IMMEDIATE_OPS (1, TYPE, 1); \
+ DO_IMMEDIATE_OPS (0.5, TYPE, 0point5); \
+ DO_IMMEDIATE_OPS (2, TYPE, 2); \
+ DO_IMMEDIATE_OPS (3.5, TYPE, 3point5);
+DO_ARITH_OPS (_Float16)
DO_ARITH_OPS (float)
DO_ARITH_OPS (double)
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0.5\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2} } } */
+/* { dg-final { scan-assembler-not {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #3} } } */
+
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0.5\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_index_1.C b/gcc/testsuite/gcc.target/aarch64/sve_index_1.c
index b7ae2d19f1e..09e65cf0fc3 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_index_1.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_index_1.c
@@ -1,50 +1,57 @@
-/* { dg-do compile } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -fno-inline -march=armv8-a+sve -msve-vector-bits=256" } */
+/* { dg-do assemble } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
#include <stdint.h>
#define NUM_ELEMS(TYPE) (32 / sizeof (TYPE))
-#define DEF_LOOP(TYPE,BASE,STEP,SUFFIX) \
-void loop_##TYPE##SUFFIX (TYPE *__restrict__ a) \
+#define DEF_LOOP(TYPE, BASE, STEP, SUFFIX) \
+void __attribute__ ((noinline, noclone)) \
+loop_##TYPE##_##SUFFIX (TYPE *a) \
{ \
- for (TYPE i = 0; i < NUM_ELEMS (TYPE); ++i) \
- a[i] = TYPE (BASE) + TYPE (i * (STEP)); \
+ for (int i = 0; i < NUM_ELEMS (TYPE); ++i) \
+ a[i] = (BASE) + i * (STEP); \
}
-#define DEF_LOOPS_ALL_UNSIGNED_TYPES(BASE,STEP,SUFFIX) \
-DEF_LOOP (uint8_t, BASE, STEP, SUFFIX) \
-DEF_LOOP (uint16_t, BASE, STEP, SUFFIX) \
-DEF_LOOP (uint32_t, BASE, STEP, SUFFIX) \
-DEF_LOOP (uint64_t, BASE, STEP, SUFFIX)
+#define TEST_ALL_UNSIGNED_TYPES(T, BASE, STEP, SUFFIX) \
+ T (uint8_t, BASE, STEP, SUFFIX) \
+ T (uint16_t, BASE, STEP, SUFFIX) \
+ T (uint32_t, BASE, STEP, SUFFIX) \
+ T (uint64_t, BASE, STEP, SUFFIX)
-#define DEF_LOOPS_ALL_SIGNED_TYPES(BASE,STEP,SUFFIX) \
-DEF_LOOP (int8_t, BASE, STEP, SUFFIX) \
-DEF_LOOP (int16_t, BASE, STEP, SUFFIX) \
-DEF_LOOP (int32_t, BASE, STEP, SUFFIX) \
-DEF_LOOP (int64_t, BASE, STEP, SUFFIX)
+#define TEST_ALL_SIGNED_TYPES(T, BASE, STEP, SUFFIX) \
+ T (int8_t, BASE, STEP, SUFFIX) \
+ T (int16_t, BASE, STEP, SUFFIX) \
+ T (int32_t, BASE, STEP, SUFFIX) \
+ T (int64_t, BASE, STEP, SUFFIX)
-/* Immediate Loops. */
-DEF_LOOPS_ALL_UNSIGNED_TYPES (0, 1, b0s1)
-DEF_LOOPS_ALL_SIGNED_TYPES (0, 1, b0s1)
-DEF_LOOPS_ALL_UNSIGNED_TYPES (0, 15, b0s15)
-DEF_LOOPS_ALL_SIGNED_TYPES (0, 15, b0s15)
-DEF_LOOPS_ALL_SIGNED_TYPES (0, -1, b0sm1)
-DEF_LOOPS_ALL_SIGNED_TYPES (0, -16, b0sm16)
-DEF_LOOPS_ALL_SIGNED_TYPES (-16, 1, bm16s1)
-DEF_LOOPS_ALL_UNSIGNED_TYPES (15, 1, b15s1)
-DEF_LOOPS_ALL_SIGNED_TYPES (15, 1, b15s1)
+/* Immediate loops. */
+#define TEST_IMMEDIATE(T) \
+ TEST_ALL_UNSIGNED_TYPES (T, 0, 1, b0s1) \
+ TEST_ALL_SIGNED_TYPES (T, 0, 1, b0s1) \
+ TEST_ALL_UNSIGNED_TYPES (T, 0, 15, b0s15) \
+ TEST_ALL_SIGNED_TYPES (T, 0, 15, b0s15) \
+ TEST_ALL_SIGNED_TYPES (T, 0, -1, b0sm1) \
+ TEST_ALL_SIGNED_TYPES (T, 0, -16, b0sm16) \
+ TEST_ALL_SIGNED_TYPES (T, -16, 1, bm16s1) \
+ TEST_ALL_UNSIGNED_TYPES (T, 15, 1, b15s1) \
+ TEST_ALL_SIGNED_TYPES (T, 15, 1, b15s1)
-/* Non Immediate Loops. */
-DEF_LOOPS_ALL_UNSIGNED_TYPES (0, 16, b0s16)
-DEF_LOOPS_ALL_SIGNED_TYPES (0, 16, b0s16)
-DEF_LOOPS_ALL_SIGNED_TYPES (0, -17, b0sm17)
-DEF_LOOPS_ALL_SIGNED_TYPES (-17, 1, bm17s1)
-DEF_LOOPS_ALL_UNSIGNED_TYPES (16, 1, b16s1)
-DEF_LOOPS_ALL_SIGNED_TYPES (16, 1, b16s1)
-DEF_LOOPS_ALL_UNSIGNED_TYPES (16, 16, b16s16)
-DEF_LOOPS_ALL_SIGNED_TYPES (16, 16, b16s16)
-DEF_LOOPS_ALL_SIGNED_TYPES (-17, -17, bm17sm17)
+/* Non-immediate loops. */
+#define TEST_NONIMMEDIATE(T) \
+ TEST_ALL_UNSIGNED_TYPES (T, 0, 16, b0s16) \
+ TEST_ALL_SIGNED_TYPES (T, 0, 16, b0s16) \
+ TEST_ALL_SIGNED_TYPES (T, 0, -17, b0sm17) \
+ TEST_ALL_SIGNED_TYPES (T, -17, 1, bm17s1) \
+ TEST_ALL_UNSIGNED_TYPES (T, 16, 1, b16s1) \
+ TEST_ALL_SIGNED_TYPES (T, 16, 1, b16s1) \
+ TEST_ALL_UNSIGNED_TYPES (T, 16, 16, b16s16) \
+ TEST_ALL_SIGNED_TYPES (T, 16, 16, b16s16) \
+ TEST_ALL_SIGNED_TYPES (T, -17, -17, bm17sm17)
+
+#define TEST_ALL(T) TEST_IMMEDIATE (T) TEST_NONIMMEDIATE (T)
+
+TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tindex\tz[0-9]+\.b, #0, #1\n} 2 } } */
/* { dg-final { scan-assembler-times {\tindex\tz[0-9]+\.b, #0, #15\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_index_1_run.C b/gcc/testsuite/gcc.target/aarch64/sve_index_1_run.C
deleted file mode 100644
index 0698eaba6eb..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_index_1_run.C
+++ /dev/null
@@ -1,79 +0,0 @@
-/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include "sve_index_1.C"
-
-#include <stdlib.h>
-#include <stdio.h>
-
-#define SUM_VECTOR(TYPE) \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++ ) \
- { \
- result += r_##TYPE[i]; \
- }
-
-#define TEST_LOOPS_ALL_UNSIGNED_TYPES(SUFFIX) \
-loop_uint8_t##SUFFIX (r_uint8_t); \
-loop_uint16_t##SUFFIX (r_uint16_t); \
-loop_uint32_t##SUFFIX (r_uint32_t); \
-loop_uint64_t##SUFFIX (r_uint64_t); \
-SUM_VECTOR (uint8_t); \
-SUM_VECTOR (uint16_t); \
-SUM_VECTOR (uint32_t); \
-SUM_VECTOR (uint64_t);
-
-#define TEST_LOOPS_ALL_SIGNED_TYPES(SUFFIX) \
-loop_int8_t##SUFFIX (r_int8_t); \
-loop_int16_t##SUFFIX (r_int16_t); \
-loop_int32_t##SUFFIX (r_int32_t); \
-loop_int64_t##SUFFIX (r_int64_t); \
-SUM_VECTOR (int8_t); \
-SUM_VECTOR (int16_t); \
-SUM_VECTOR (int32_t); \
-SUM_VECTOR (int64_t);
-
-
-#define DEF_INIT_VECTOR(TYPE) \
- TYPE r_##TYPE[NUM_ELEMS (TYPE)]; \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
- r_##TYPE[i] = 0;
-
-int main ()
-{
- int result = 0;
- DEF_INIT_VECTOR (int8_t)
- DEF_INIT_VECTOR (int16_t)
- DEF_INIT_VECTOR (int32_t)
- DEF_INIT_VECTOR (int64_t)
- DEF_INIT_VECTOR (uint8_t)
- DEF_INIT_VECTOR (uint16_t)
- DEF_INIT_VECTOR (uint32_t)
- DEF_INIT_VECTOR (uint64_t)
-
- TEST_LOOPS_ALL_UNSIGNED_TYPES (b0s1)
- TEST_LOOPS_ALL_SIGNED_TYPES (b0s1)
- TEST_LOOPS_ALL_UNSIGNED_TYPES (b0s15)
- TEST_LOOPS_ALL_SIGNED_TYPES (b0s15)
- TEST_LOOPS_ALL_SIGNED_TYPES (b0sm1)
- TEST_LOOPS_ALL_SIGNED_TYPES (b0sm16)
- TEST_LOOPS_ALL_SIGNED_TYPES (bm16s1)
- TEST_LOOPS_ALL_UNSIGNED_TYPES (b15s1)
- TEST_LOOPS_ALL_SIGNED_TYPES (b15s1)
-
- TEST_LOOPS_ALL_UNSIGNED_TYPES (b0s16)
- TEST_LOOPS_ALL_SIGNED_TYPES (b0s16)
- TEST_LOOPS_ALL_SIGNED_TYPES (b0sm17)
- TEST_LOOPS_ALL_SIGNED_TYPES (bm17s1)
- TEST_LOOPS_ALL_UNSIGNED_TYPES (b16s1)
- TEST_LOOPS_ALL_SIGNED_TYPES (b16s1)
- TEST_LOOPS_ALL_UNSIGNED_TYPES (b16s16)
- TEST_LOOPS_ALL_SIGNED_TYPES (b16s16)
- TEST_LOOPS_ALL_SIGNED_TYPES (bm17sm17)
-
- if (result != 24270)
- {
- fprintf (stderr, "result = %d\n", result);
- abort ();
- }
- return 0;
-}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_index_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_index_1_run.c
new file mode 100644
index 00000000000..7492ed3f756
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_index_1_run.c
@@ -0,0 +1,20 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve -msve-vector-bits=256" } */
+
+#include "sve_index_1.c"
+
+#define TEST_LOOP(TYPE, BASE, STEP, SUFFIX) \
+ { \
+ TYPE array[NUM_ELEMS (TYPE)] = {}; \
+ loop_##TYPE##_##SUFFIX (array); \
+ for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
+ if (array[i] != (TYPE) (BASE + i * STEP)) \
+ __builtin_abort (); \
+ }
+
+int __attribute__ ((optimize (1)))
+main ()
+{
+ TEST_ALL (TEST_LOOP)
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_indexoffsetlarge_1.c b/gcc/testsuite/gcc.target/aarch64/sve_indexoffsetlarge_1.c
deleted file mode 100644
index 4c9aab4aada..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_indexoffsetlarge_1.c
+++ /dev/null
@@ -1,31 +0,0 @@
-/* { dg-do assemble } */
-/* { dg-options "-std=c99 -ftree-vectorize -O2 -fno-inline -march=armv8-a+sve --save-temps" } */
-
-//Test sizes that are too big for an index register
-#define SIZE 4294967297
-
-#define INDEX_OFFSET_TEST(SIGNED, TYPE)\
-void set_##SIGNED##TYPE (SIGNED TYPE *out, SIGNED TYPE *in)\
-{\
- unsigned long i;\
- for (i = 0; i < SIZE; i++)\
- {\
- out[i] = in[i];\
- }\
-}
-
-INDEX_OFFSET_TEST (signed, int)
-INDEX_OFFSET_TEST (unsigned, int)
-INDEX_OFFSET_TEST (signed, short)
-INDEX_OFFSET_TEST (unsigned, short)
-INDEX_OFFSET_TEST (signed, char)
-INDEX_OFFSET_TEST (unsigned, char)
-
-/* { dg-final { scan-assembler-not "ld1d\\tz\[0-9\]+.d, p\[0-9\]+/z, \\\[x\[0-9\]+, w\[0-9\]+, .xtw 3\\\]" } } */
-/* { dg-final { scan-assembler-not "st1d\\tz\[0-9\]+.d, p\[0-9\]+, \\\[x\[0-9\]+, w\[0-9\]+, .xtw 3\\\]" } } */
-/* { dg-final { scan-assembler-not "ld1w\\tz\[0-9\]+.s, p\[0-9\]+/z, \\\[x\[0-9\]+, w\[0-9\]+, .xtw 2\\\]" } } */
-/* { dg-final { scan-assembler-not "st1w\\tz\[0-9\]+.s, p\[0-9\]+, \\\[x\[0-9\]+, w\[0-9\]+, .xtw 2\\\]" } } */
-/* { dg-final { scan-assembler-not "ld1h\\tz\[0-9\]+.h, p\[0-9\]+/z, \\\[x\[0-9\]+, w\[0-9\]+, .xtw 1\\\]" } } */
-/* { dg-final { scan-assembler-not "st1h\\tz\[0-9\]+.h, p\[0-9\]+, \\\[x\[0-9\]+, w\[0-9\]+, .xtw 1\\\]" } } */
-/* { dg-final { scan-assembler-not "ld1b\\tz\[0-9\]+.b, p\[0-9\]+/z, \\\[x\[0-9\]+, w\[0-9\]+, .xtw\\\]" } } */
-/* { dg-final { scan-assembler-not "st1b\\tz\[0-9\]+.b, p\[0-9\]+, \\\[x\[0-9\]+, w\[0-9\]+, .xtw\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_infloop_1.c b/gcc/testsuite/gcc.target/aarch64/sve_infloop_1.c
deleted file mode 100644
index 11681c05409..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_infloop_1.c
+++ /dev/null
@@ -1,64 +0,0 @@
-/* { dg-do run { target { aarch64_sve_hw } } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve -fno-tree-loop-distribute-patterns" } */
-/* { dg-timeout 60 } */
-
-#include <stdint.h>
-#include <stdio.h>
-#include <string.h>
-#include <stdlib.h>
-#include <limits.h>
-
-/* Make sure that in cases where
- n <= TYPE_MAX && n > (TYPE_MAX + 1 - (sizeof (SVE_VEC) / sizeof (TYPE)))
-   we don't iterate more times than we should due to overflow in the last
-   iteration of the loop.  If n == TYPE_MAX we could spin forever. */
-
-#define SIMPLE_LOOP(TYPE) \
-TYPE foo_##TYPE (TYPE n, TYPE * __restrict__ a) \
-{ \
- TYPE i; \
- TYPE v = 0; \
- for (i = 0; i < n; i++) \
- v += a[i]; \
- return v; \
-}
-
-SIMPLE_LOOP (uint8_t)
-SIMPLE_LOOP (uint16_t)
-
-/* Minimum architected SVE vector = 128 bits, i.e. 16 bytes. Just choose
-   something that meets the criteria shown above. */
-#define N_uint8_t (UCHAR_MAX - 1)
-#define N_uint16_t (USHRT_MAX - 1)
-
-#define N_MAX 1024
-#define DEF_VAR(TYPE) \
- TYPE *a_##TYPE = (TYPE *) malloc (N_##TYPE * sizeof (TYPE)); \
- for (i = 0; i < N_##TYPE; i++) \
- a_##TYPE[i] = 1; \
- TYPE r_##TYPE;
-
-#define TEST_SIMPLE_LOOP(TYPE) r_##TYPE = foo_##TYPE (N_##TYPE, a_##TYPE);
-
-#define VERIFY(TYPE) \
- if (r_##TYPE != N_##TYPE) \
- { \
- fprintf (stderr, "r_" #TYPE " = %ld\n", (uint64_t) r_##TYPE); \
- abort (); \
- }
-
-int main ()
-{
- int i;
- DEF_VAR (uint8_t)
- DEF_VAR (uint16_t)
-
- /* We only test 8 and 16 bit as others take too long. */
- TEST_SIMPLE_LOOP (uint8_t)
- TEST_SIMPLE_LOOP (uint16_t)
-
- VERIFY (uint8_t)
- VERIFY (uint16_t)
-
- return 0;
-}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_ld1r_1.c b/gcc/testsuite/gcc.target/aarch64/sve_ld1r_1.c
new file mode 100644
index 00000000000..314c2b89624
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_ld1r_1.c
@@ -0,0 +1,53 @@
+/* { dg-do assemble } */
+/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
+
+#include <stdint.h>
+
+#define DUP4(X) X, X, X, X
+#define DUP8(X) DUP4 (X), DUP4 (X)
+#define DUP16(X) DUP8 (X), DUP8 (X)
+#define DUP32(X) DUP16 (X), DUP16 (X)
+
+typedef uint8_t vuint8_t __attribute__ ((vector_size (32)));
+typedef uint16_t vuint16_t __attribute__ ((vector_size (32)));
+typedef uint32_t vuint32_t __attribute__ ((vector_size (32)));
+typedef uint64_t vuint64_t __attribute__ ((vector_size (32)));
+
+#define TEST(TYPE, NAME, INIT) \
+ void \
+ NAME##_##TYPE (TYPE *dest, __typeof__(dest[0][0]) *ptr) \
+ { \
+ TYPE x = { INIT }; \
+ *dest = x; \
+ }
+
+#define TEST_GROUP(TYPE, NAME, DUP) \
+ TEST (TYPE, NAME_##m1, DUP (ptr[-1])) \
+ TEST (TYPE, NAME_##0, DUP (ptr[0])) \
+ TEST (TYPE, NAME_##63, DUP (ptr[63])) \
+ TEST (TYPE, NAME_##64, DUP (ptr[64]))
+
+TEST_GROUP (vuint8_t, t8, DUP32)
+TEST_GROUP (vuint16_t, t16, DUP16)
+TEST_GROUP (vuint32_t, t16, DUP8)
+TEST_GROUP (vuint64_t, t16, DUP4)
+
+/* { dg-final { scan-assembler-not {\tld1rb\tz[0-9]+\.b, p[0-7]/z, \[x1, -1\]\n} } } */
+/* { dg-final { scan-assembler {\tld1rb\tz[0-9]+\.b, p[0-7]/z, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld1rb\tz[0-9]+\.b, p[0-7]/z, \[x1, 63\]\n} } } */
+/* { dg-final { scan-assembler-not {\tld1rb\tz[0-9]+\.b, p[0-7]/z, \[x1, 64\]\n} } } */
+
+/* { dg-final { scan-assembler-not {\tld1rh\tz[0-9]+\.h, p[0-7]/z, \[x1, -1\]\n} } } */
+/* { dg-final { scan-assembler {\tld1rh\tz[0-9]+\.h, p[0-7]/z, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld1rh\tz[0-9]+\.h, p[0-7]/z, \[x1, 126\]\n} } } */
+/* { dg-final { scan-assembler-not {\tld1rh\tz[0-9]+\.h, p[0-7]/z, \[x1, 128\]\n} } } */
+
+/* { dg-final { scan-assembler-not {\tld1rw\tz[0-9]+\.s, p[0-7]/z, \[x1, -1\]\n} } } */
+/* { dg-final { scan-assembler {\tld1rw\tz[0-9]+\.s, p[0-7]/z, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld1rw\tz[0-9]+\.s, p[0-7]/z, \[x1, 252\]\n} } } */
+/* { dg-final { scan-assembler-not {\tld1rw\tz[0-9]+\.s, p[0-7]/z, \[x1, 256\]\n} } } */
+
+/* { dg-final { scan-assembler-not {\tld1rd\tz[0-9]+\.d, p[0-7]/z, \[x1, -1\]\n} } } */
+/* { dg-final { scan-assembler {\tld1rd\tz[0-9]+\.d, p[0-7]/z, \[x1\]\n} } } */
+/* { dg-final { scan-assembler {\tld1rd\tz[0-9]+\.d, p[0-7]/z, \[x1, 504\]\n} } } */
+/* { dg-final { scan-assembler-not {\tld1rd\tz[0-9]+\.d, p[0-7]/z, \[x1, 512\]\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_ld1r_2.C b/gcc/testsuite/gcc.target/aarch64/sve_ld1r_2.C
deleted file mode 100644
index d209b48d249..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_ld1r_2.C
+++ /dev/null
@@ -1,51 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256" } */
-
-#define DUP4(X) X, X, X, X
-#define DUP8(X) DUP4 (X), DUP4 (X)
-#define DUP16(X) DUP8 (X), DUP8 (X)
-#define DUP32(X) DUP16 (X), DUP16 (X)
-
-typedef unsigned char vuint8_t __attribute__ ((vector_size (32)));
-typedef unsigned short vuint16_t __attribute__ ((vector_size (32)));
-typedef unsigned int vuint32_t __attribute__ ((vector_size (32)));
-typedef unsigned long vuint64_t __attribute__ ((vector_size (32)));
-
-#define TEST(TYPE, NAME, INIT) \
- void \
- NAME (TYPE *dest, __typeof__(dest[0][0]) *ptr) \
- { \
- TYPE x = { INIT }; \
- *dest = x; \
- }
-
-#define TEST_GROUP(TYPE, NAME, DUP) \
- TEST (TYPE, NAME_##m1, DUP (ptr[-1])) \
- TEST (TYPE, NAME_##0, DUP (ptr[0])) \
- TEST (TYPE, NAME_##63, DUP (ptr[63])) \
- TEST (TYPE, NAME_##64, DUP (ptr[64]))
-
-TEST_GROUP (vuint8_t, t8, DUP32)
-TEST_GROUP (vuint16_t, t16, DUP16)
-TEST_GROUP (vuint32_t, t16, DUP8)
-TEST_GROUP (vuint64_t, t16, DUP4)
-
-/* { dg-final { scan-assembler-not {\tld1rb\tz[0-9]*.b, p[0-7]/z, \[x1, -1\]\n} } } */
-/* { dg-final { scan-assembler {\tld1rb\tz[0-9]*.b, p[0-7]/z, \[x1\]\n} } } */
-/* { dg-final { scan-assembler {\tld1rb\tz[0-9]*.b, p[0-7]/z, \[x1, 63\]\n} } } */
-/* { dg-final { scan-assembler-not {\tld1rb\tz[0-9]*.b, p[0-7]/z, \[x1, 64\]\n} } } */
-
-/* { dg-final { scan-assembler-not {\tld1rh\tz[0-9]*.h, p[0-7]/z, \[x1, -1\]\n} } } */
-/* { dg-final { scan-assembler {\tld1rh\tz[0-9]*.h, p[0-7]/z, \[x1\]\n} } } */
-/* { dg-final { scan-assembler {\tld1rh\tz[0-9]*.h, p[0-7]/z, \[x1, 126\]\n} } } */
-/* { dg-final { scan-assembler-not {\tld1rh\tz[0-9]*.h, p[0-7]/z, \[x1, 128\]\n} } } */
-
-/* { dg-final { scan-assembler-not {\tld1rw\tz[0-9]*.s, p[0-7]/z, \[x1, -1\]\n} } } */
-/* { dg-final { scan-assembler {\tld1rw\tz[0-9]*.s, p[0-7]/z, \[x1\]\n} } } */
-/* { dg-final { scan-assembler {\tld1rw\tz[0-9]*.s, p[0-7]/z, \[x1, 252\]\n} } } */
-/* { dg-final { scan-assembler-not {\tld1rw\tz[0-9]*.s, p[0-7]/z, \[x1, 256\]\n} } } */
-
-/* { dg-final { scan-assembler-not {\tld1rd\tz[0-9]*.d, p[0-7]/z, \[x1, -1\]\n} } } */
-/* { dg-final { scan-assembler {\tld1rd\tz[0-9]*.d, p[0-7]/z, \[x1\]\n} } } */
-/* { dg-final { scan-assembler {\tld1rd\tz[0-9]*.d, p[0-7]/z, \[x1, 504\]\n} } } */
-/* { dg-final { scan-assembler-not {\tld1rd\tz[0-9]*.d, p[0-7]/z, \[x1, 512\]\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_load_const_offset_1.c b/gcc/testsuite/gcc.target/aarch64/sve_load_const_offset_1.c
index 3bbae95f332..0bc757907cf 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_load_const_offset_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_load_const_offset_1.c
@@ -1,10 +1,12 @@
/* { dg-do assemble } */
-/* { dg-options "-O -march=armv8-a+sve -save-temps -msve-vector-bits=256" } */
+/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-typedef long v4di __attribute__ ((vector_size (32)));
-typedef int v8si __attribute__ ((vector_size (32)));
-typedef short v16hi __attribute__ ((vector_size (32)));
-typedef char v32qi __attribute__ ((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__ ((vector_size (32)));
+typedef int32_t v8si __attribute__ ((vector_size (32)));
+typedef int16_t v16hi __attribute__ ((vector_size (32)));
+typedef int8_t v32qi __attribute__ ((vector_size (32)));
#define TEST_TYPE(TYPE) \
void sve_load_##TYPE##_neg9 (TYPE *a) \
@@ -52,26 +54,26 @@ TEST_TYPE (v32qi)
/* { dg-final { scan-assembler-times {\tadd\tx[0-9]+, x0, 16\n} 4 } } */
/* { dg-final { scan-assembler-times {\tadd\tx[0-9]+, x0, 256\n} 4 } } */
-/* { dg-final { scan-assembler-not {\tld1d\tz0.d, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */
-/* { dg-final { scan-assembler-times {\tld1d\tz0.d, p[0-7]/z, \[x0, #-8, mul vl\]\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tld1d\tz0.d, p[0-7]/z, \[x0\]\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tld1d\tz0.d, p[0-7]/z, \[x0, #7, mul vl\]\n} 1 } } */
-/* { dg-final { scan-assembler-not {\tld1d\tz0.d, p[0-7]/z, \[x0, #8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler-not {\tld1d\tz0\.d, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0, #-8, mul vl\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0, #7, mul vl\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tld1d\tz0\.d, p[0-7]/z, \[x0, #8, mul vl\]\n} } } */
-/* { dg-final { scan-assembler-not {\tld1w\tz0.s, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */
-/* { dg-final { scan-assembler-times {\tld1w\tz0.s, p[0-7]/z, \[x0, #-8, mul vl\]\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tld1w\tz0.s, p[0-7]/z, \[x0\]\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tld1w\tz0.s, p[0-7]/z, \[x0, #7, mul vl\]\n} 1 } } */
-/* { dg-final { scan-assembler-not {\tld1w\tz0.s, p[0-7]/z, \[x0, #8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler-not {\tld1w\tz0\.s, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0, #-8, mul vl\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0, #7, mul vl\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tld1w\tz0\.s, p[0-7]/z, \[x0, #8, mul vl\]\n} } } */
-/* { dg-final { scan-assembler-not {\tld1h\tz0.h, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */
-/* { dg-final { scan-assembler-times {\tld1h\tz0.h, p[0-7]/z, \[x0, #-8, mul vl\]\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tld1h\tz0.h, p[0-7]/z, \[x0\]\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tld1h\tz0.h, p[0-7]/z, \[x0, #7, mul vl\]\n} 1 } } */
-/* { dg-final { scan-assembler-not {\tld1h\tz0.h, p[0-7]/z, \[x0, #8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler-not {\tld1h\tz0\.h, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0, #-8, mul vl\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0, #7, mul vl\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tld1h\tz0\.h, p[0-7]/z, \[x0, #8, mul vl\]\n} } } */
-/* { dg-final { scan-assembler-not {\tld1b\tz0.b, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */
-/* { dg-final { scan-assembler-times {\tld1b\tz0.b, p[0-7]/z, \[x0, #-8, mul vl\]\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tld1b\tz0.b, p[0-7]/z, \[x0\]\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tld1b\tz0.b, p[0-7]/z, \[x0, #7, mul vl\]\n} 1 } } */
-/* { dg-final { scan-assembler-not {\tld1b\tz0.b, p[0-7]/z, \[x0, #8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler-not {\tld1b\tz0\.b, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */
+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0, #-8, mul vl\]\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0, #7, mul vl\]\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tld1b\tz0\.b, p[0-7]/z, \[x0, #8, mul vl\]\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_load_scalar_offset_1.c b/gcc/testsuite/gcc.target/aarch64/sve_load_scalar_offset_1.c
index 77a3ef82f62..9163702db1d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_load_scalar_offset_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_load_scalar_offset_1.c
@@ -1,68 +1,70 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-typedef long v4di __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef char v32qi __attribute__((vector_size(32)));
+#include <stdint.h>
-void sve_load_64_u_lsl (unsigned long *a)
+typedef int64_t v4di __attribute__ ((vector_size (32)));
+typedef int32_t v8si __attribute__ ((vector_size (32)));
+typedef int16_t v16hi __attribute__ ((vector_size (32)));
+typedef int8_t v32qi __attribute__ ((vector_size (32)));
+
+void sve_load_64_u_lsl (uint64_t *a)
{
register unsigned long i asm("x1");
asm volatile ("" : "=r" (i));
asm volatile ("" :: "w" (*(v4di *)&a[i]));
}
-void sve_load_64_s_lsl (signed long *a)
+void sve_load_64_s_lsl (int64_t *a)
{
register long i asm("x1");
asm volatile ("" : "=r" (i));
asm volatile ("" :: "w" (*(v4di *)&a[i]));
}
-void sve_load_32_u_lsl (unsigned int *a)
+void sve_load_32_u_lsl (uint32_t *a)
{
register unsigned long i asm("x1");
asm volatile ("" : "=r" (i));
asm volatile ("" :: "w" (*(v8si *)&a[i]));
}
-void sve_load_32_s_lsl (signed int *a)
+void sve_load_32_s_lsl (int32_t *a)
{
register long i asm("x1");
asm volatile ("" : "=r" (i));
asm volatile ("" :: "w" (*(v8si *)&a[i]));
}
-void sve_load_16_z_lsl (unsigned short *a)
+void sve_load_16_z_lsl (uint16_t *a)
{
register unsigned long i asm("x1");
asm volatile ("" : "=r" (i));
asm volatile ("" :: "w" (*(v16hi *)&a[i]));
}
-void sve_load_16_s_lsl (signed short *a)
+void sve_load_16_s_lsl (int16_t *a)
{
register long i asm("x1");
asm volatile ("" : "=r" (i));
asm volatile ("" :: "w" (*(v16hi *)&a[i]));
}
-void sve_load_8_z (unsigned char *a)
+void sve_load_8_z (uint8_t *a)
{
register unsigned long i asm("x1");
asm volatile ("" : "=r" (i));
asm volatile ("" :: "w" (*(v32qi *)&a[i]));
}
-void sve_load_8_s (signed char *a)
+void sve_load_8_s (int8_t *a)
{
register long i asm("x1");
asm volatile ("" : "=r" (i));
asm volatile ("" :: "w" (*(v32qi *)&a[i]));
}
-/* { dg-final { scan-assembler-times {\tld1d\tz0.d, p[0-7]/z, \[x0, x1, lsl 3\]\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tld1w\tz0.s, p[0-7]/z, \[x0, x1, lsl 2\]\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tld1h\tz0.h, p[0-7]/z, \[x0, x1, lsl 1\]\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tld1b\tz0.b, p[0-7]/z, \[x0, x1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0, x1, lsl 3\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0, x1, lsl 2\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0, x1, lsl 1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0, x1\]\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_load_scalar_offset_2.c b/gcc/testsuite/gcc.target/aarch64/sve_load_scalar_offset_2.c
deleted file mode 100644
index 7a36cce95cd..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_load_scalar_offset_2.c
+++ /dev/null
@@ -1,68 +0,0 @@
-/* { dg-do assemble } */
-/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-
-typedef long v4di __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef char v32qi __attribute__((vector_size(32)));
-
-void sve_load_64_u_lsl (unsigned long *a)
-{
- register unsigned long i asm("x1");
- asm volatile ("" : "=r" (i));
- asm volatile ("" :: "w" (*(v4di *)&a[i]));
-}
-
-void sve_load_64_s_lsl (signed long *a)
-{
- register long i asm("x1");
- asm volatile ("" : "=r" (i));
- asm volatile ("" :: "w" (*(v4di *)&a[i]));
-}
-
-void sve_load_32_u_lsl (unsigned int *a)
-{
- register unsigned long i asm("x1");
- asm volatile ("" : "=r" (i));
- asm volatile ("" :: "w" (*(v8si *)&a[i]));
-}
-
-void sve_load_32_s_lsl (signed int *a)
-{
- register long i asm("x1");
- asm volatile ("" : "=r" (i));
- asm volatile ("" :: "w" (*(v8si *)&a[i]));
-}
-
-void sve_load_16_z_lsl (unsigned short *a)
-{
- register unsigned long i asm("x1");
- asm volatile ("" : "=r" (i));
- asm volatile ("" :: "w" (*(v16hi *)&a[i]));
-}
-
-void sve_load_16_s_lsl (signed short *a)
-{
- register long i asm("x1");
- asm volatile ("" : "=r" (i));
- asm volatile ("" :: "w" (*(v16hi *)&a[i]));
-}
-
-void sve_load_8_z (unsigned char *a)
-{
- register unsigned long i asm("x1");
- asm volatile ("" : "=r" (i));
- asm volatile ("" :: "w" (*(v32qi *)&a[i]));
-}
-
-void sve_load_8_s (signed char *a)
-{
- register long i asm("x1");
- asm volatile ("" : "=r" (i));
- asm volatile ("" :: "w" (*(v32qi *)&a[i]));
-}
-
-/* { dg-final { scan-assembler-times "ld1d\\tz0.d, p\[0-9\]+/z, \\\[x0, x1, lsl 3\\\]" 2 } } */
-/* { dg-final { scan-assembler-times "ld1w\\tz0.s, p\[0-9\]+/z, \\\[x0, x1, lsl 2\\\]" 2 } } */
-/* { dg-final { scan-assembler-times "ld1h\\tz0.h, p\[0-9\]+/z, \\\[x0, x1, lsl 1\\\]" 2 } } */
-/* { dg-final { scan-assembler-times "ld1b\\tz0.b, p\[0-9\]+/z, \\\[x0, x1\\\]" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_logical_1.c b/gcc/testsuite/gcc.target/aarch64/sve_logical_1.c
index 02f92b95733..aa39adf85f8 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_logical_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_logical_1.c
@@ -1,82 +1,82 @@
/* { dg-do assemble } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve --save-temps" } */
+/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
-#define DO_CONSTANT(VALUE, TYPE, OP, NAME) \
-void vlogical_imm_##NAME##_##TYPE (TYPE* dst, unsigned long count) \
-{ \
- for (int i = 0; i < count; i++) \
- dst[i] = dst[i] OP VALUE; \
+#define DO_CONSTANT(VALUE, TYPE, OP, NAME) \
+void vlogical_imm_##NAME##_##TYPE (TYPE *dst, int count) \
+{ \
+ for (int i = 0; i < count; i++) \
+ dst[i] = dst[i] OP VALUE; \
}
#define DO_LOGICAL_OPS_BRIEF(TYPE, OP, NAME) \
-DO_CONSTANT (1, TYPE, OP, NAME ## 1) \
-DO_CONSTANT (2, TYPE, OP, NAME ## 2) \
-DO_CONSTANT (5, TYPE, OP, NAME ## 5) \
-DO_CONSTANT (6, TYPE, OP, NAME ## 6) \
-DO_CONSTANT (8, TYPE, OP, NAME ## 8) \
-DO_CONSTANT (9, TYPE, OP, NAME ## 9) \
-DO_CONSTANT (-1, TYPE, OP, NAME ## minus1) \
-DO_CONSTANT (-2, TYPE, OP, NAME ## minus2) \
-DO_CONSTANT (-5, TYPE, OP, NAME ## minus5) \
-DO_CONSTANT (-6, TYPE, OP, NAME ## minus6)
+ DO_CONSTANT (1, TYPE, OP, NAME ## 1) \
+ DO_CONSTANT (2, TYPE, OP, NAME ## 2) \
+ DO_CONSTANT (5, TYPE, OP, NAME ## 5) \
+ DO_CONSTANT (6, TYPE, OP, NAME ## 6) \
+ DO_CONSTANT (8, TYPE, OP, NAME ## 8) \
+ DO_CONSTANT (9, TYPE, OP, NAME ## 9) \
+ DO_CONSTANT (-1, TYPE, OP, NAME ## minus1) \
+ DO_CONSTANT (-2, TYPE, OP, NAME ## minus2) \
+ DO_CONSTANT (-5, TYPE, OP, NAME ## minus5) \
+ DO_CONSTANT (-6, TYPE, OP, NAME ## minus6)
-#define DO_LOGICAL_OPS(TYPE, OP, NAME) \
-DO_CONSTANT (1, TYPE, OP, NAME ## 1) \
-DO_CONSTANT (2, TYPE, OP, NAME ## 2) \
-DO_CONSTANT (3, TYPE, OP, NAME ## 3) \
-DO_CONSTANT (4, TYPE, OP, NAME ## 4) \
-DO_CONSTANT (5, TYPE, OP, NAME ## 5) \
-DO_CONSTANT (6, TYPE, OP, NAME ## 6) \
-DO_CONSTANT (7, TYPE, OP, NAME ## 7) \
-DO_CONSTANT (8, TYPE, OP, NAME ## 8) \
-DO_CONSTANT (9, TYPE, OP, NAME ## 9) \
-DO_CONSTANT (10, TYPE, OP, NAME ## 10) \
-DO_CONSTANT (11, TYPE, OP, NAME ## 11) \
-DO_CONSTANT (12, TYPE, OP, NAME ## 12) \
-DO_CONSTANT (13, TYPE, OP, NAME ## 13) \
-DO_CONSTANT (14, TYPE, OP, NAME ## 14) \
-DO_CONSTANT (15, TYPE, OP, NAME ## 15) \
-DO_CONSTANT (16, TYPE, OP, NAME ## 16) \
-DO_CONSTANT (17, TYPE, OP, NAME ## 17) \
-DO_CONSTANT (18, TYPE, OP, NAME ## 18) \
-DO_CONSTANT (19, TYPE, OP, NAME ## 19) \
-DO_CONSTANT (20, TYPE, OP, NAME ## 20) \
-DO_CONSTANT (21, TYPE, OP, NAME ## 21) \
-DO_CONSTANT (22, TYPE, OP, NAME ## 22) \
-DO_CONSTANT (23, TYPE, OP, NAME ## 23) \
-DO_CONSTANT (24, TYPE, OP, NAME ## 24) \
-DO_CONSTANT (25, TYPE, OP, NAME ## 25) \
-DO_CONSTANT (26, TYPE, OP, NAME ## 26) \
-DO_CONSTANT (27, TYPE, OP, NAME ## 27) \
-DO_CONSTANT (28, TYPE, OP, NAME ## 28) \
-DO_CONSTANT (29, TYPE, OP, NAME ## 29) \
-DO_CONSTANT (30, TYPE, OP, NAME ## 30) \
-DO_CONSTANT (31, TYPE, OP, NAME ## 31) \
-DO_CONSTANT (32, TYPE, OP, NAME ## 32) \
-DO_CONSTANT (33, TYPE, OP, NAME ## 33) \
-DO_CONSTANT (34, TYPE, OP, NAME ## 34) \
-DO_CONSTANT (35, TYPE, OP, NAME ## 35) \
-DO_CONSTANT (252, TYPE, OP, NAME ## 252) \
-DO_CONSTANT (253, TYPE, OP, NAME ## 253) \
-DO_CONSTANT (254, TYPE, OP, NAME ## 254) \
-DO_CONSTANT (255, TYPE, OP, NAME ## 255) \
-DO_CONSTANT (256, TYPE, OP, NAME ## 256) \
-DO_CONSTANT (257, TYPE, OP, NAME ## 257) \
-DO_CONSTANT (65535, TYPE, OP, NAME ## 65535) \
-DO_CONSTANT (65536, TYPE, OP, NAME ## 65536) \
-DO_CONSTANT (65537, TYPE, OP, NAME ## 65537) \
-DO_CONSTANT (2147483646, TYPE, OP, NAME ## 2147483646) \
-DO_CONSTANT (2147483647, TYPE, OP, NAME ## 2147483647) \
-DO_CONSTANT (2147483648, TYPE, OP, NAME ## 2147483648) \
-DO_CONSTANT (-1, TYPE, OP, NAME ## minus1) \
-DO_CONSTANT (-2, TYPE, OP, NAME ## minus2) \
-DO_CONSTANT (-3, TYPE, OP, NAME ## minus3) \
-DO_CONSTANT (-4, TYPE, OP, NAME ## minus4) \
-DO_CONSTANT (-5, TYPE, OP, NAME ## minus5) \
-DO_CONSTANT (-6, TYPE, OP, NAME ## minus6) \
-DO_CONSTANT (-7, TYPE, OP, NAME ## minus7) \
-DO_CONSTANT (-8, TYPE, OP, NAME ## minus8) \
-DO_CONSTANT (-9, TYPE, OP, NAME ## minus9)
+#define DO_LOGICAL_OPS(TYPE, OP, NAME) \
+ DO_CONSTANT (1, TYPE, OP, NAME ## 1) \
+ DO_CONSTANT (2, TYPE, OP, NAME ## 2) \
+ DO_CONSTANT (3, TYPE, OP, NAME ## 3) \
+ DO_CONSTANT (4, TYPE, OP, NAME ## 4) \
+ DO_CONSTANT (5, TYPE, OP, NAME ## 5) \
+ DO_CONSTANT (6, TYPE, OP, NAME ## 6) \
+ DO_CONSTANT (7, TYPE, OP, NAME ## 7) \
+ DO_CONSTANT (8, TYPE, OP, NAME ## 8) \
+ DO_CONSTANT (9, TYPE, OP, NAME ## 9) \
+ DO_CONSTANT (10, TYPE, OP, NAME ## 10) \
+ DO_CONSTANT (11, TYPE, OP, NAME ## 11) \
+ DO_CONSTANT (12, TYPE, OP, NAME ## 12) \
+ DO_CONSTANT (13, TYPE, OP, NAME ## 13) \
+ DO_CONSTANT (14, TYPE, OP, NAME ## 14) \
+ DO_CONSTANT (15, TYPE, OP, NAME ## 15) \
+ DO_CONSTANT (16, TYPE, OP, NAME ## 16) \
+ DO_CONSTANT (17, TYPE, OP, NAME ## 17) \
+ DO_CONSTANT (18, TYPE, OP, NAME ## 18) \
+ DO_CONSTANT (19, TYPE, OP, NAME ## 19) \
+ DO_CONSTANT (20, TYPE, OP, NAME ## 20) \
+ DO_CONSTANT (21, TYPE, OP, NAME ## 21) \
+ DO_CONSTANT (22, TYPE, OP, NAME ## 22) \
+ DO_CONSTANT (23, TYPE, OP, NAME ## 23) \
+ DO_CONSTANT (24, TYPE, OP, NAME ## 24) \
+ DO_CONSTANT (25, TYPE, OP, NAME ## 25) \
+ DO_CONSTANT (26, TYPE, OP, NAME ## 26) \
+ DO_CONSTANT (27, TYPE, OP, NAME ## 27) \
+ DO_CONSTANT (28, TYPE, OP, NAME ## 28) \
+ DO_CONSTANT (29, TYPE, OP, NAME ## 29) \
+ DO_CONSTANT (30, TYPE, OP, NAME ## 30) \
+ DO_CONSTANT (31, TYPE, OP, NAME ## 31) \
+ DO_CONSTANT (32, TYPE, OP, NAME ## 32) \
+ DO_CONSTANT (33, TYPE, OP, NAME ## 33) \
+ DO_CONSTANT (34, TYPE, OP, NAME ## 34) \
+ DO_CONSTANT (35, TYPE, OP, NAME ## 35) \
+ DO_CONSTANT (252, TYPE, OP, NAME ## 252) \
+ DO_CONSTANT (253, TYPE, OP, NAME ## 253) \
+ DO_CONSTANT (254, TYPE, OP, NAME ## 254) \
+ DO_CONSTANT (255, TYPE, OP, NAME ## 255) \
+ DO_CONSTANT (256, TYPE, OP, NAME ## 256) \
+ DO_CONSTANT (257, TYPE, OP, NAME ## 257) \
+ DO_CONSTANT (65535, TYPE, OP, NAME ## 65535) \
+ DO_CONSTANT (65536, TYPE, OP, NAME ## 65536) \
+ DO_CONSTANT (65537, TYPE, OP, NAME ## 65537) \
+ DO_CONSTANT (2147483646, TYPE, OP, NAME ## 2147483646) \
+ DO_CONSTANT (2147483647, TYPE, OP, NAME ## 2147483647) \
+ DO_CONSTANT (2147483648, TYPE, OP, NAME ## 2147483648) \
+ DO_CONSTANT (-1, TYPE, OP, NAME ## minus1) \
+ DO_CONSTANT (-2, TYPE, OP, NAME ## minus2) \
+ DO_CONSTANT (-3, TYPE, OP, NAME ## minus3) \
+ DO_CONSTANT (-4, TYPE, OP, NAME ## minus4) \
+ DO_CONSTANT (-5, TYPE, OP, NAME ## minus5) \
+ DO_CONSTANT (-6, TYPE, OP, NAME ## minus6) \
+ DO_CONSTANT (-7, TYPE, OP, NAME ## minus7) \
+ DO_CONSTANT (-8, TYPE, OP, NAME ## minus8) \
+ DO_CONSTANT (-9, TYPE, OP, NAME ## minus9)
DO_LOGICAL_OPS_BRIEF (char, &, and)
DO_LOGICAL_OPS_BRIEF (long, &, and)
@@ -215,8 +215,7 @@ DO_LOGICAL_OPS (int, ^, xor)
/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.s, z[0-9]+\.s, #0xfffffff8\n} 1 } } */
/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.s, z[0-9]+\.s, #0xfffffff7\n} 1 } } */
-/* No specific number because this also doubles as a move. */
-/* { dg-final { scan-assembler {\torr\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} } } */
+/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 22 } } */
/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.s, z[0-9]+\.s, #0x1\n} 1 } } */
/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.s, z[0-9]+\.s, #0x2\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_loop_add_1.c b/gcc/testsuite/gcc.target/aarch64/sve_loop_add_1.c
index 73d78bdc8be..5546cefe686 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_loop_add_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_loop_add_1.c
@@ -1,13 +1,13 @@
/* { dg-do compile } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve" } */
+/* { dg-options "-O3 -march=armv8-a+sve" } */
-__attribute__((noinline, noclone))
-void vadd (int *dst, int *op1, int *op2, int count)
+void __attribute__((noinline, noclone))
+vadd (int *dst, int *op1, int *op2, int count)
{
for (int i = 0; i < count; ++i)
dst[i] = op1[i] + op2[i];
}
-/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+.s, p[0-7]/z,} 2 } } */
-/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+.s, p[0-7],} 1 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z,} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7],} 1 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_loop_add_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_loop_add_1_run.c
index 6f06ce6e8a6..c7d0352e273 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_loop_add_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_loop_add_1_run.c
@@ -1,11 +1,12 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve" } */
+/* { dg-options "-O3 -march=armv8-a+sve" } */
#include "sve_loop_add_1.c"
#define ELEMS 10
-int main (void)
+int __attribute__ ((optimize (1)))
+main (void)
{
int in1[ELEMS] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int in2[ELEMS] = { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };
@@ -16,7 +17,7 @@ int main (void)
for (int i = 0; i < ELEMS; ++i)
if (out[i] != check[i])
- return 1;
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_loop_add_5.c b/gcc/testsuite/gcc.target/aarch64/sve_loop_add_5.c
index 30891703a63..a27bde6f9da 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_loop_add_5.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_loop_add_5.c
@@ -12,8 +12,8 @@
/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7]+, \[x[0-9]+, x[0-9]+\]} 8 } } */
/* The induction vector is invariant for steps of -16 and 16. */
-/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #} 3 } } */
-/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #} 3 } } */
+/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.b, z[0-9]+\.b, #} } } */
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, z[0-9]+\.b, #} 6 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 8 } } */
/* { dg-final { scan-assembler-times {\tindex\tz[0-9]+\.h, w[0-9]+, #-16\n} 1 { xfail *-*-* } } } */
@@ -25,8 +25,8 @@
/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 1\]} 8 } } */
/* The (-)17 * 16 is out of range. */
-/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #} 3 } } */
-/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #} 3 } } */
+/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.h, z[0-9]+\.h, #} 2 } } */
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, #} 4 } } */
/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 10 } } */
/* { dg-final { scan-assembler-times {\tindex\tz[0-9]+\.s, w[0-9]+, #-16\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_mad_1.c b/gcc/testsuite/gcc.target/aarch64/sve_mad_1.c
index 6da2e115782..ccb20b4191f 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_mad_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_mad_1.c
@@ -1,34 +1,34 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-typedef char v32qi __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef long v4di __attribute__((vector_size(32)));
+#include <stdint.h>
-#define DO_OP(TYPE) \
-void vmla##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = (dst * src1) + src2; \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+typedef int8_t v32qi __attribute__((vector_size(32)));
+typedef int16_t v16hi __attribute__((vector_size(32)));
+typedef int32_t v8si __attribute__((vector_size(32)));
+typedef int64_t v4di __attribute__((vector_size(32)));
+
+#define DO_OP(TYPE) \
+void vmla_##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = (dst * src1) + src2; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
-DO_OP(v32qi)
-DO_OP(v16hi)
-DO_OP(v8si)
-DO_OP(v4di)
+DO_OP (v32qi)
+DO_OP (v16hi)
+DO_OP (v8si)
+DO_OP (v4di)
-/* { dg-final { scan-assembler-times {\tmad\tz0.b, p[0-7]/m, z2.b, z4.b} 1 } } */
-/* { dg-final { scan-assembler-times {\tmad\tz0.h, p[0-7]/m, z2.h, z4.h} 1 } } */
-/* { dg-final { scan-assembler-times {\tmad\tz0.s, p[0-7]/m, z2.s, z4.s} 1 } } */
-/* { dg-final { scan-assembler-times {\tmad\tz0.d, p[0-7]/m, z2.d, z4.d} 1 } } */
+/* { dg-final { scan-assembler-times {\tmad\tz0\.b, p[0-7]/m, z2\.b, z4\.b} 1 } } */
+/* { dg-final { scan-assembler-times {\tmad\tz0\.h, p[0-7]/m, z2\.h, z4\.h} 1 } } */
+/* { dg-final { scan-assembler-times {\tmad\tz0\.s, p[0-7]/m, z2\.s, z4\.s} 1 } } */
+/* { dg-final { scan-assembler-times {\tmad\tz0\.d, p[0-7]/m, z2\.d, z4\.d} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_maxmin_1.C b/gcc/testsuite/gcc.target/aarch64/sve_maxmin_1.c
index f42eb6e9edf..733ffd1b765 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_maxmin_1.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_maxmin_1.c
@@ -1,39 +1,45 @@
/* { dg-do compile } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -ffast-math -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math -march=armv8-a+sve" } */
#include <stdint.h>
#define NUM_ELEMS(TYPE) (320 / sizeof (TYPE))
-#define DEF_MAXMIN(TYPE,NAME,CMP_OP) \
-void fun_##NAME##TYPE (TYPE *__restrict__ r, TYPE *__restrict__ a, \
- TYPE *__restrict__ b) \
-{ \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
- r[i] = a[i] CMP_OP b[i] ? a[i] : b[i]; \
+#define DEF_MAXMIN(TYPE, NAME, CMP_OP) \
+void __attribute__ ((noinline, noclone)) \
+fun_##NAME##_##TYPE (TYPE *restrict r, TYPE *restrict a, \
+ TYPE *restrict b) \
+{ \
+ for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
+ r[i] = a[i] CMP_OP b[i] ? a[i] : b[i]; \
}
-DEF_MAXMIN (int8_t, max, >)
-DEF_MAXMIN (int16_t, max, >)
-DEF_MAXMIN (int32_t, max, >)
-DEF_MAXMIN (int64_t, max, >)
-DEF_MAXMIN (uint8_t, max, >)
-DEF_MAXMIN (uint16_t, max, >)
-DEF_MAXMIN (uint32_t, max, >)
-DEF_MAXMIN (uint64_t, max, >)
-DEF_MAXMIN (float, max, >)
-DEF_MAXMIN (double, max, >)
-
-DEF_MAXMIN (int8_t, min, <)
-DEF_MAXMIN (int16_t, min, <)
-DEF_MAXMIN (int32_t, min, <)
-DEF_MAXMIN (int64_t, min, <)
-DEF_MAXMIN (uint8_t, min, <)
-DEF_MAXMIN (uint16_t, min, <)
-DEF_MAXMIN (uint32_t, min, <)
-DEF_MAXMIN (uint64_t, min, <)
-DEF_MAXMIN (float, min, <)
-DEF_MAXMIN (double, min, <)
+#define TEST_ALL(T) \
+ T (int8_t, max, >) \
+ T (int16_t, max, >) \
+ T (int32_t, max, >) \
+ T (int64_t, max, >) \
+ T (uint8_t, max, >) \
+ T (uint16_t, max, >) \
+ T (uint32_t, max, >) \
+ T (uint64_t, max, >) \
+ T (_Float16, max, >) \
+ T (float, max, >) \
+ T (double, max, >) \
+ \
+ T (int8_t, min, <) \
+ T (int16_t, min, <) \
+ T (int32_t, min, <) \
+ T (int64_t, min, <) \
+ T (uint8_t, min, <) \
+ T (uint16_t, min, <) \
+ T (uint32_t, min, <) \
+ T (uint64_t, min, <) \
+ T (_Float16, min, <) \
+ T (float, min, <) \
+ T (double, min, <)
+
+TEST_ALL (DEF_MAXMIN)
/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
@@ -45,6 +51,7 @@ DEF_MAXMIN (double, min, <)
/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
@@ -58,5 +65,6 @@ DEF_MAXMIN (double, min, <)
/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_maxmin_1_run.C b/gcc/testsuite/gcc.target/aarch64/sve_maxmin_1_run.C
deleted file mode 100644
index 37dc9a4cdec..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_maxmin_1_run.C
+++ /dev/null
@@ -1,88 +0,0 @@
-/* { dg-do run { target { aarch64_sve_hw } } } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -ffast-math -fno-inline -march=armv8-a+sve" } */
-
-#include "sve_maxmin_1.C"
-
-#include <stdlib.h>
-#include <stdio.h>
-
-#define DEF_INIT_VECTOR(TYPE) \
- TYPE a_##TYPE[NUM_ELEMS (TYPE)]; \
- TYPE b_##TYPE[NUM_ELEMS (TYPE)]; \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++ ) \
- { \
- a_##TYPE[i] = ((i * 2) % 3) * (i & 1 ? 1 : -1); \
- b_##TYPE[i] = (1 + (i % 4)) * (i & 1 ? -1 : 1); \
- }
-
-#define TEST_MAX(RES,TYPE) \
-{ \
- TYPE r_##TYPE[NUM_ELEMS (TYPE)]; \
- fun_max##TYPE (r_##TYPE, a_##TYPE, b_##TYPE); \
- TYPE tmp = 0; \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
- tmp += r_##TYPE[i]; \
- (RES) += tmp; \
-}
-
-#define TEST_MIN(RES,TYPE) \
-{ \
- TYPE r_##TYPE[NUM_ELEMS (TYPE)]; \
- fun_max##TYPE (r_##TYPE, a_##TYPE, b_##TYPE); \
- TYPE tmp = 0; \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
- tmp += r_##TYPE[i]; \
- (RES) += tmp; \
-}
-
-int main ()
-{
- int result = 0;
- double resultF = 0.0;
- DEF_INIT_VECTOR (int8_t)
- DEF_INIT_VECTOR (int16_t)
- DEF_INIT_VECTOR (int32_t)
- DEF_INIT_VECTOR (int64_t)
- DEF_INIT_VECTOR (uint8_t)
- DEF_INIT_VECTOR (uint16_t)
- DEF_INIT_VECTOR (uint32_t)
- DEF_INIT_VECTOR (uint64_t)
- DEF_INIT_VECTOR (float)
- DEF_INIT_VECTOR (double)
-
- TEST_MIN (result, int8_t)
- TEST_MIN (result, int16_t)
- TEST_MIN (result, int32_t)
- TEST_MIN (result, int64_t)
- TEST_MIN (result, uint8_t)
- TEST_MIN (result, uint16_t)
- TEST_MIN (result, uint32_t)
- TEST_MIN (result, uint64_t)
- TEST_MIN (resultF, float)
- TEST_MIN (resultF, double)
-
- TEST_MAX (result, int8_t)
- TEST_MAX (result, int16_t)
- TEST_MAX (result, int32_t)
- TEST_MAX (result, int64_t)
- TEST_MAX (result, uint8_t)
- TEST_MAX (result, uint16_t)
- TEST_MAX (result, uint32_t)
- TEST_MAX (result, uint64_t)
- TEST_MAX (resultF, float)
- TEST_MAX (resultF, double)
-
- if (result != 131400)
- {
- fprintf (stderr, "result = %d\n", result);
- abort ();
- }
-
- if (resultF != 362)
- {
- fprintf (stderr, "resultF = %1.16lf\n", resultF);
- abort ();
- }
-
- return 0;
-}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_maxmin_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_maxmin_1_run.c
new file mode 100644
index 00000000000..d3130bff8fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_maxmin_1_run.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { aarch64_sve_hw } } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math -march=armv8-a+sve" } */
+
+#include "sve_maxmin_1.c"
+
+#define TEST_LOOP(TYPE, NAME, CMP_OP) \
+ { \
+ TYPE a[NUM_ELEMS (TYPE)]; \
+ TYPE b[NUM_ELEMS (TYPE)]; \
+ TYPE r[NUM_ELEMS (TYPE)]; \
+ for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
+ { \
+ a[i] = ((i * 2) % 3) * (i & 1 ? 1 : -1); \
+ b[i] = (1 + (i % 4)) * (i & 1 ? -1 : 1); \
+ asm volatile ("" ::: "memory"); \
+ } \
+ fun_##NAME##_##TYPE (r, a, b); \
+ for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
+ if (r[i] != (a[i] CMP_OP b[i] ? a[i] : b[i])) \
+ __builtin_abort (); \
+ }
+
+int main ()
+{
+ TEST_ALL (TEST_LOOP)
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1.C b/gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1.c
index 8a5ce725bf1..27561d19694 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1.c
@@ -1,23 +1,27 @@
/* { dg-do compile } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include <math.h>
#define NUM_ELEMS(TYPE) (320 / sizeof (TYPE))
-#define DEF_MAXMIN(TYPE,FUN) \
-void test_##FUN (TYPE *__restrict__ r, TYPE *__restrict__ a, \
- TYPE *__restrict__ b) \
+#define DEF_MAXMIN(TYPE, FUN) \
+void __attribute__ ((noinline, noclone)) \
+test_##FUN##_##TYPE (TYPE *restrict r, TYPE *restrict a, \
+ TYPE *restrict b) \
{ \
for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
r[i] = FUN (a[i], b[i]); \
}
-DEF_MAXMIN (float, fmaxf)
-DEF_MAXMIN (double, fmax)
+#define TEST_ALL(T) \
+ T (float, fmaxf) \
+ T (double, fmax) \
+ \
+ T (float, fminf) \
+ T (double, fmin)
-DEF_MAXMIN (float, fminf)
-DEF_MAXMIN (double, fmin)
+TEST_ALL (DEF_MAXMIN)
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1_run.C b/gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1_run.C
deleted file mode 100644
index 06c868638e9..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1_run.C
+++ /dev/null
@@ -1,56 +0,0 @@
-/* { dg-do run { target { aarch64_sve_hw } } } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <stdlib.h>
-#include <stdio.h>
-#include "sve_maxmin_strict_1.C"
-
-#define DEF_INIT_VECTOR(TYPE) \
- TYPE a_##TYPE[NUM_ELEMS (TYPE)]; \
- TYPE b_##TYPE[NUM_ELEMS (TYPE)]; \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++ ) \
- { \
- a_##TYPE[i] = ((i * 2) % 3) * (i & 1 ? 1 : -1); \
- b_##TYPE[i] = (1 + (i % 4)) * (i & 1 ? -1 : 1); \
- }
-
-#define TEST_MAX(RES,FUN,TYPE) \
-{ \
- TYPE r_##TYPE[NUM_ELEMS (TYPE)]; \
- test_##FUN (r_##TYPE, a_##TYPE, b_##TYPE); \
- TYPE tmp = 0; \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
- tmp += r_##TYPE[i]; \
- (RES) += tmp; \
-}
-
-#define TEST_MIN(RES,FUN,TYPE) \
-{ \
- TYPE r_##TYPE[NUM_ELEMS (TYPE)]; \
- test_##FUN (r_##TYPE, a_##TYPE, b_##TYPE); \
- TYPE tmp = 0; \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
- tmp += r_##TYPE[i]; \
- (RES) += tmp; \
-}
-
-int main ()
-{
- double resultF = 0.0;
- DEF_INIT_VECTOR (float)
- DEF_INIT_VECTOR (double)
-
- TEST_MIN (resultF, fminf, float)
- TEST_MIN (resultF, fmin, double)
-
- TEST_MAX (resultF, fmaxf, float)
- TEST_MAX (resultF, fmax, double)
-
- if (resultF != -57)
- {
- fprintf (stderr, "resultF = %1.16lf\n", resultF);
- abort ();
- }
-
- return 0;
-}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1_run.c
new file mode 100644
index 00000000000..2b869c62a5d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_maxmin_strict_1_run.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { aarch64_sve_hw } } } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
+
+#include "sve_maxmin_strict_1.c"
+
+#define TEST_LOOP(TYPE, FUN) \
+ { \
+ TYPE a[NUM_ELEMS (TYPE)]; \
+ TYPE b[NUM_ELEMS (TYPE)]; \
+ TYPE r[NUM_ELEMS (TYPE)]; \
+ for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
+ { \
+ a[i] = ((i * 2) % 3) * (i & 1 ? 1 : -1); \
+ b[i] = (1 + (i % 4)) * (i & 1 ? -1 : 1); \
+ asm volatile ("" ::: "memory"); \
+ } \
+ test_##FUN##_##TYPE (r, a, b); \
+ for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
+ if (r[i] != FUN (a[i], b[i])) \
+ __builtin_abort (); \
+ }
+
+int main ()
+{
+ TEST_ALL (TEST_LOOP)
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_mla_1.c b/gcc/testsuite/gcc.target/aarch64/sve_mla_1.c
index 329cba68ffb..a4d705e38ba 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_mla_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_mla_1.c
@@ -1,26 +1,26 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-typedef char v32qi __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef long v4di __attribute__((vector_size(32)));
+#include <stdint.h>
-#define DO_OP(TYPE) \
-void vmla##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = (src1 * src2) + dst; \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+typedef int8_t v32qi __attribute__((vector_size(32)));
+typedef int16_t v16hi __attribute__((vector_size(32)));
+typedef int32_t v8si __attribute__((vector_size(32)));
+typedef int64_t v4di __attribute__((vector_size(32)));
+
+#define DO_OP(TYPE) \
+void vmla_##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = (src1 * src2) + dst; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
DO_OP (v32qi)
@@ -28,7 +28,7 @@ DO_OP (v16hi)
DO_OP (v8si)
DO_OP (v4di)
-/* { dg-final { scan-assembler-times {\tmla\tz0.b, p[0-7]/m, z2.b, z4.b\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tmla\tz0.h, p[0-7]/m, z2.h, z4.h\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tmla\tz0.s, p[0-7]/m, z2.s, z4.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tmla\tz0.d, p[0-7]/m, z2.d, z4.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmla\tz0\.b, p[0-7]/m, z2\.b, z4\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmla\tz0\.h, p[0-7]/m, z2\.h, z4\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmla\tz0\.s, p[0-7]/m, z2\.s, z4\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmla\tz0\.d, p[0-7]/m, z2\.d, z4\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_mls_1.c b/gcc/testsuite/gcc.target/aarch64/sve_mls_1.c
index abcfc5f40e9..b7cc1dba087 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_mls_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_mls_1.c
@@ -1,26 +1,26 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-typedef char v32qi __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef long v4di __attribute__((vector_size(32)));
+#include <stdint.h>
-#define DO_OP(TYPE) \
-void vmls##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
- dst = dst - (src1 * src2); \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+typedef int8_t v32qi __attribute__((vector_size(32)));
+typedef int16_t v16hi __attribute__((vector_size(32)));
+typedef int32_t v8si __attribute__((vector_size(32)));
+typedef int64_t v4di __attribute__((vector_size(32)));
+
+#define DO_OP(TYPE) \
+void vmla_##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
+ dst = dst - (src1 * src2); \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
DO_OP (v32qi)
@@ -28,7 +28,7 @@ DO_OP (v16hi)
DO_OP (v8si)
DO_OP (v4di)
-/* { dg-final { scan-assembler-times {\tmls\tz0.b, p[0-7]/m, z2.b, z4.b\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tmls\tz0.h, p[0-7]/m, z2.h, z4.h\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tmls\tz0.s, p[0-7]/m, z2.s, z4.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tmls\tz0.d, p[0-7]/m, z2.d, z4.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmls\tz0\.b, p[0-7]/m, z2\.b, z4\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmls\tz0\.h, p[0-7]/m, z2\.h, z4\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmls\tz0\.s, p[0-7]/m, z2\.s, z4\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmls\tz0\.d, p[0-7]/m, z2\.d, z4\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_mov_rr_1.c b/gcc/testsuite/gcc.target/aarch64/sve_mov_rr_1.c
index d5a3a38442b..a38375af017 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_mov_rr_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_mov_rr_1.c
@@ -11,4 +11,4 @@ void sve_copy_rr (void)
asm volatile ("#foo" :: "w" (y));
}
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+.d, z[0-9]+.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_msb_1.c b/gcc/testsuite/gcc.target/aarch64/sve_msb_1.c
index 132740b0866..fc05837a920 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_msb_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_msb_1.c
@@ -1,34 +1,34 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-typedef char v32qi __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef long v4di __attribute__((vector_size(32)));
+#include <stdint.h>
-#define DO_OP(TYPE) \
-void vmla##TYPE (TYPE *_dst, TYPE _src1, TYPE _src2) \
-{ \
- register TYPE dst asm("z0"); \
- register TYPE src1 asm("z2"); \
- register TYPE src2 asm("z4"); \
- dst = *_dst; \
- asm volatile ("" :: "w" (dst)); \
- src1 = _src1; \
- asm volatile ("" :: "w" (src1)); \
- src2 = _src2; \
- asm volatile ("" :: "w" (src2)); \
+typedef int8_t v32qi __attribute__((vector_size(32)));
+typedef int16_t v16hi __attribute__((vector_size(32)));
+typedef int32_t v8si __attribute__((vector_size(32)));
+typedef int64_t v4di __attribute__((vector_size(32)));
+
+#define DO_OP(TYPE) \
+void vmla_##TYPE (TYPE *x, TYPE y, TYPE z) \
+{ \
+ register TYPE dst asm("z0"); \
+ register TYPE src1 asm("z2"); \
+ register TYPE src2 asm("z4"); \
+ dst = *x; \
+ src1 = y; \
+ src2 = z; \
+ asm volatile ("" :: "w" (dst), "w" (src1), "w" (src2)); \
dst = src2 - (dst * src1); \
- asm volatile ("" :: "w" (dst)); \
- *_dst = dst; \
+ asm volatile ("" :: "w" (dst)); \
+ *x = dst; \
}
-DO_OP(v32qi)
-DO_OP(v16hi)
-DO_OP(v8si)
-DO_OP(v4di)
+DO_OP (v32qi)
+DO_OP (v16hi)
+DO_OP (v8si)
+DO_OP (v4di)
-/* { dg-final { scan-assembler-times {\tmsb\tz0.b, p[0-7]/m, z2.b, z4.b} 1 } } */
-/* { dg-final { scan-assembler-times {\tmsb\tz0.h, p[0-7]/m, z2.h, z4.h} 1 } } */
-/* { dg-final { scan-assembler-times {\tmsb\tz0.s, p[0-7]/m, z2.s, z4.s} 1 } } */
-/* { dg-final { scan-assembler-times {\tmsb\tz0.d, p[0-7]/m, z2.d, z4.d} 1 } } */
+/* { dg-final { scan-assembler-times {\tmsb\tz0\.b, p[0-7]/m, z2\.b, z4\.b} 1 } } */
+/* { dg-final { scan-assembler-times {\tmsb\tz0\.h, p[0-7]/m, z2\.h, z4\.h} 1 } } */
+/* { dg-final { scan-assembler-times {\tmsb\tz0\.s, p[0-7]/m, z2\.s, z4\.s} 1 } } */
+/* { dg-final { scan-assembler-times {\tmsb\tz0\.d, p[0-7]/m, z2\.d, z4\.d} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_mul_1.c b/gcc/testsuite/gcc.target/aarch64/sve_mul_1.c
index ae6f8688c58..2b1cd4a7a93 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_mul_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_mul_1.c
@@ -1,34 +1,36 @@
/* { dg-do assemble } */
-/* { dg-options {-std=c99 -O3 -march=armv8-a+sve --save-temps} } */
+/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
+
+#include <stdint.h>
#define DO_REGREG_OPS(TYPE, OP, NAME) \
-void varith_##TYPE##_##NAME (TYPE* dst, TYPE* src, int count) \
+void varith_##TYPE##_##NAME (TYPE *dst, TYPE *src, int count) \
{ \
for (int i = 0; i < count; ++i) \
dst[i] = dst[i] OP src[i]; \
}
#define DO_IMMEDIATE_OPS(VALUE, TYPE, OP, NAME) \
-void varithimm_##NAME##_##TYPE (TYPE* dst, int count) \
+void varithimm_##NAME##_##TYPE (TYPE *dst, int count) \
{ \
for (int i = 0; i < count; ++i) \
dst[i] = dst[i] OP VALUE; \
}
#define DO_ARITH_OPS(TYPE, OP, NAME) \
-DO_REGREG_OPS (TYPE, OP, NAME); \
-DO_IMMEDIATE_OPS (0, TYPE, OP, NAME ## 0); \
-DO_IMMEDIATE_OPS (86, TYPE, OP, NAME ## 86); \
-DO_IMMEDIATE_OPS (109, TYPE, OP, NAME ## 109); \
-DO_IMMEDIATE_OPS (141, TYPE, OP, NAME ## 141); \
-DO_IMMEDIATE_OPS (-1, TYPE, OP, NAME ## minus1); \
-DO_IMMEDIATE_OPS (-110, TYPE, OP, NAME ## minus110); \
-DO_IMMEDIATE_OPS (-141, TYPE, OP, NAME ## minus141);
-
-DO_ARITH_OPS (char, *, mul)
-DO_ARITH_OPS (short, *, mul)
-DO_ARITH_OPS (int, *, mul)
-DO_ARITH_OPS (long, *, mul)
+ DO_REGREG_OPS (TYPE, OP, NAME); \
+ DO_IMMEDIATE_OPS (0, TYPE, OP, NAME ## 0); \
+ DO_IMMEDIATE_OPS (86, TYPE, OP, NAME ## 86); \
+ DO_IMMEDIATE_OPS (109, TYPE, OP, NAME ## 109); \
+ DO_IMMEDIATE_OPS (141, TYPE, OP, NAME ## 141); \
+ DO_IMMEDIATE_OPS (-1, TYPE, OP, NAME ## minus1); \
+ DO_IMMEDIATE_OPS (-110, TYPE, OP, NAME ## minus110); \
+ DO_IMMEDIATE_OPS (-141, TYPE, OP, NAME ## minus141);
+
+DO_ARITH_OPS (int8_t, *, mul)
+DO_ARITH_OPS (int16_t, *, mul)
+DO_ARITH_OPS (int32_t, *, mul)
+DO_ARITH_OPS (int64_t, *, mul)
/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, z[0-9]+\.b, #86\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_neg_1.c b/gcc/testsuite/gcc.target/aarch64/sve_neg_1.c
index 8e5e8e58b07..b463c2c0580 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_neg_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_neg_1.c
@@ -1,17 +1,21 @@
/* { dg-do assemble } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve --save-temps" } */
+/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
+
+#include <stdint.h>
#define DO_OPS(TYPE) \
-void vneg_##TYPE (TYPE* dst, TYPE* src, int count) \
+void vneg_##TYPE (TYPE *dst, TYPE *src, int count) \
{ \
for (int i = 0; i < count; ++i) \
dst[i] = -src[i]; \
}
-DO_OPS (char)
-DO_OPS (int)
-DO_OPS (long)
+DO_OPS (int8_t)
+DO_OPS (int16_t)
+DO_OPS (int32_t)
+DO_OPS (int64_t)
/* { dg-final { scan-assembler-times {\tneg\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tneg\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tneg\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tneg\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_nlogical_1.c b/gcc/testsuite/gcc.target/aarch64/sve_nlogical_1.c
index 8f54a2a3143..3871451bc1d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_nlogical_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_nlogical_1.c
@@ -1,25 +1,30 @@
/* { dg-do assemble } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve --save-temps" } */
+/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
-#define DO_VNLOGICAL(TYPE) \
-void __attribute__ ((weak)) \
-vnlogical_not_##TYPE (TYPE *dst, unsigned long count) \
-{ \
- for (int i = 0; i < count; i++) \
- dst[i] = ~dst[i]; \
-} \
- \
-void __attribute__ ((weak)) \
-vnlogical_bic_##TYPE (TYPE *dst, TYPE *src, unsigned long count) \
-{ \
- for (int i = 0; i < count; i++) \
- dst[i] = dst[i] & ~src[i]; \
+#include <stdint.h>
+
+#define DO_VNLOGICAL(TYPE) \
+void __attribute__ ((noinline, noclone)) \
+vnlogical_not_##TYPE (TYPE *dst, int count) \
+{ \
+ for (int i = 0; i < count; i++) \
+ dst[i] = ~dst[i]; \
+} \
+ \
+void __attribute__ ((noinline, noclone)) \
+vnlogical_bic_##TYPE (TYPE *dst, TYPE *src, int count) \
+{ \
+ for (int i = 0; i < count; i++) \
+ dst[i] = dst[i] & ~src[i]; \
}
-DO_VNLOGICAL (char)
-DO_VNLOGICAL (short)
-DO_VNLOGICAL (int)
-DO_VNLOGICAL (long)
+#define TEST_ALL(T) \
+ T (int8_t) \
+ T (int16_t) \
+ T (int32_t) \
+ T (int64_t)
+
+TEST_ALL (DO_VNLOGICAL)
/* { dg-final { scan-assembler-times {\tnot\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tnot\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 1 } } */
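For reference, the new TEST_ALL/DO_VNLOGICAL pairing expands mechanically; the int8_t instantiation, for example, becomes the following (the other three types differ only in the type name):

void __attribute__ ((noinline, noclone))
vnlogical_not_int8_t (int8_t *dst, int count)
{
  for (int i = 0; i < count; i++)
    dst[i] = ~dst[i];           /* expected to vectorize to predicated NOT  */
}

void __attribute__ ((noinline, noclone))
vnlogical_bic_int8_t (int8_t *dst, int8_t *src, int count)
{
  for (int i = 0; i < count; i++)
    dst[i] = dst[i] & ~src[i];  /* expected to vectorize to BIC  */
}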
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_nlogical_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_nlogical_1_run.c
index ca3f47134fa..905d44b8265 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_nlogical_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_nlogical_1_run.c
@@ -9,7 +9,10 @@
{ \
TYPE dst[N], src[N]; \
for (int i = 0; i < N; ++i) \
- dst[i] = i ^ 42; \
+ { \
+ dst[i] = i ^ 42; \
+ asm volatile ("" ::: "memory"); \
+ } \
vnlogical_not_##TYPE (dst, N); \
for (int i = 0; i < N; ++i) \
if (dst[i] != (TYPE) ~(i ^ 42)) \
@@ -18,6 +21,7 @@
{ \
dst[i] = i ^ 42; \
src[i] = i % 5; \
+ asm volatile ("" ::: "memory"); \
} \
vnlogical_bic_##TYPE (dst, src, N); \
for (int i = 0; i < N; ++i) \
@@ -25,12 +29,9 @@
__builtin_abort (); \
}
-int
+int __attribute__ ((optimize (1)))
main (void)
{
- TEST_VNLOGICAL (char);
- TEST_VNLOGICAL (short);
- TEST_VNLOGICAL (int);
- TEST_VNLOGICAL (long);
+ TEST_ALL (TEST_VNLOGICAL)
return 0;
}
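Two idioms recur throughout these _run tests. The empty asm with a "memory" clobber after each element store acts as a compiler barrier, so the initialization cannot be folded straight into the later checks and the routines under test operate on real memory contents. The optimize (1) attribute on main drops main itself back to -O1, so the scalar reference loops are not vectorized and only the noinline functions exercise the SVE paths. A minimal sketch of the pattern, with illustrative names that are not part of the patch:

#include <stdint.h>

static void __attribute__ ((noinline))
under_test (int8_t *dst, int count)   /* stand-in for a routine from the included test file  */
{
  for (int i = 0; i < count; ++i)
    dst[i] = ~dst[i];
}

int __attribute__ ((optimize (1)))
main (void)
{
  int8_t buf[64];
  for (int i = 0; i < 64; ++i)
    {
      buf[i] = i ^ 42;
      asm volatile ("" ::: "memory");   /* barrier: keep the stores from being optimized away  */
    }
  under_test (buf, 64);
  for (int i = 0; i < 64; ++i)
    if (buf[i] != (int8_t) ~(i ^ 42))   /* scalar reference check  */
      __builtin_abort ();
  return 0;
}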
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_pack_1.c b/gcc/testsuite/gcc.target/aarch64/sve_pack_1.c
index 12fa945a794..723b4e3433b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_pack_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_pack_1.c
@@ -1,20 +1,25 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
-#define PACK(TYPED, TYPES, SIGN) \
-void pack_##TYPED##_##TYPES##_##SIGN (SIGN TYPED *d, \
- SIGN TYPES *s, int size) \
-{ \
- for (int i = 0; i < size; i++) \
- d[i] = s[i] + 1; \
+#include <stdint.h>
+
+#define PACK(TYPED, TYPES) \
+void __attribute__ ((noinline, noclone)) \
+pack_##TYPED##_##TYPES (TYPED *d, TYPES *s, int size) \
+{ \
+ for (int i = 0; i < size; i++) \
+ d[i] = s[i] + 1; \
}
-PACK (int, long, signed) \
-PACK (short, int, signed) \
-PACK (char, short, signed) \
-PACK (int, long, unsigned) \
-PACK (short, int, unsigned) \
-PACK (char, short, unsigned)
+#define TEST_ALL(T) \
+ T (int32_t, int64_t) \
+ T (int16_t, int32_t) \
+ T (int8_t, int16_t) \
+ T (uint32_t, uint64_t) \
+ T (uint16_t, uint32_t) \
+ T (uint8_t, uint16_t)
+
+TEST_ALL (PACK)
/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_pack_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_pack_1_run.c
index 208889f86b8..cb7876cb135 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_pack_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_pack_1_run.c
@@ -1,46 +1,28 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include "sve_pack_1.c"
#define ARRAY_SIZE 57
-#define RUN_AND_CHECK_LOOP(TYPED, TYPES, VALUED, VALUES) \
-{ \
- int value = 0; \
- TYPED arrayd[ARRAY_SIZE]; \
- TYPES arrays[ARRAY_SIZE]; \
- memset (arrayd, 67, ARRAY_SIZE * sizeof (TYPED)); \
- memset (arrays, VALUES, ARRAY_SIZE * sizeof (TYPES)); \
- pack_##TYPED##_##TYPES##_signed (arrayd, arrays, ARRAY_SIZE); \
- for (int i = 0; i < ARRAY_SIZE; i++) \
- if (arrayd[i] != VALUED) \
- { \
- fprintf (stderr,"%d: %d != %d\n", i, arrayd[i], VALUED); \
- exit (1); \
- } \
- memset (arrayd, 74, ARRAY_SIZE*sizeof (TYPED)); \
- pack_##TYPED##_##TYPES##_unsigned (arrayd, arrays, ARRAY_SIZE); \
- for (int i = 0; i < ARRAY_SIZE; i++) \
- if (arrayd[i] != VALUED) \
- { \
- fprintf (stderr,"%d: %d != %d\n", i, arrayd[i], VALUED); \
- exit (1); \
- } \
-}
+#define TEST_LOOP(TYPED, TYPES) \
+ { \
+ TYPED arrayd[ARRAY_SIZE]; \
+ TYPES arrays[ARRAY_SIZE]; \
+ for (int i = 0; i < ARRAY_SIZE; i++) \
+ { \
+ arrays[i] = (i - 10) * 3; \
+ asm volatile ("" ::: "memory"); \
+ } \
+ pack_##TYPED##_##TYPES (arrayd, arrays, ARRAY_SIZE); \
+ for (int i = 0; i < ARRAY_SIZE; i++) \
+ if (arrayd[i] != (TYPED) ((TYPES) ((i - 10) * 3) + 1)) \
+ __builtin_abort (); \
+ }
-int main (void)
+int __attribute__ ((optimize (1)))
+main (void)
{
- int total = 5;
- RUN_AND_CHECK_LOOP (char, short, total + 1, total);
- total = (total << 8) + 5;
- RUN_AND_CHECK_LOOP (short, int, total + 1, total);
- total = (total << 8) + 5;
- total = (total << 8) + 5;
- RUN_AND_CHECK_LOOP (int, long, total + 1, total);
+ TEST_ALL (TEST_LOOP)
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_signed_1.c b/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_signed_1.c
index 2d1918cc2cd..a99d227e4c8 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_signed_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_signed_1.c
@@ -1,7 +1,10 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
-void pack_int_double_plus_3 (signed int *d, double *s, int size)
+#include <stdint.h>
+
+void __attribute__ ((noinline, noclone))
+pack_int_double_plus_3 (int32_t *d, double *s, int size)
{
for (int i = 0; i < size; i++)
d[i] = s[i] + 3;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_signed_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_signed_1_run.c
index 11d85fc8eb0..2a45bb5b1e8 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_signed_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_signed_1_run.c
@@ -1,9 +1,5 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include "sve_pack_fcvt_signed_1.c"
@@ -14,19 +10,19 @@
int __attribute__ ((optimize (1)))
main (void)
{
- static signed int array_dest[ARRAY_SIZE];
+ static int32_t array_dest[ARRAY_SIZE];
double array_source[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++)
- array_source[i] = VAL1;
+ {
+ array_source[i] = VAL1;
+ asm volatile ("" ::: "memory");
+ }
pack_int_double_plus_3 (array_dest, array_source, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_dest[i] != (int) VAL1 + 3)
- {
- fprintf (stderr,"%d: %d != %d\n", i, array_dest[i], (int) VAL1 + 3);
- exit (1);
- }
+ if (array_dest[i] != (int32_t) VAL1 + 3)
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_unsigned_1.c b/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_unsigned_1.c
index f7692989a71..a039d6fdd66 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_unsigned_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_unsigned_1.c
@@ -1,7 +1,10 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
-void pack_int_double_plus_7 (unsigned int *d, double *s, int size)
+#include <stdint.h>
+
+void __attribute__ ((noinline, noclone))
+pack_int_double_plus_7 (uint32_t *d, double *s, int size)
{
for (int i = 0; i < size; i++)
d[i] = s[i] + 7;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_unsigned_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_unsigned_1_run.c
index 196b6de358a..8a1e72485ad 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_unsigned_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_pack_fcvt_unsigned_1_run.c
@@ -1,9 +1,5 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include "sve_pack_fcvt_unsigned_1.c"
@@ -14,19 +10,19 @@
int __attribute__ ((optimize (1)))
main (void)
{
- static unsigned int array_dest[ARRAY_SIZE];
+ static uint32_t array_dest[ARRAY_SIZE];
double array_source[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++)
- array_source[i] = VAL1;
+ {
+ array_source[i] = VAL1;
+ asm volatile ("" ::: "memory");
+ }
pack_int_double_plus_7 (array_dest, array_source, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_dest[i] != (int) VAL1 + 7)
- {
- fprintf (stderr,"%d: %d != %d\n", i, array_dest[i], (int) VAL1 + 7);
- exit (1);
- }
+ if (array_dest[i] != (uint32_t) VAL1 + 7)
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_pack_float_1.c b/gcc/testsuite/gcc.target/aarch64/sve_pack_float_1.c
index 7faf7652e75..746154e530d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_pack_float_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_pack_float_1.c
@@ -1,7 +1,8 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
-void pack_float_plus_1point1 (float *d, double *s, int size)
+void __attribute__ ((noinline, noclone))
+pack_float_plus_1point1 (float *d, double *s, int size)
{
for (int i = 0; i < size; i++)
d[i] = s[i] + 1.1;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_pack_float_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_pack_float_1_run.c
index 85a7eca9173..91e8a699f0b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_pack_float_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_pack_float_1_run.c
@@ -1,9 +1,5 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include "sve_pack_float_1.c"
@@ -18,16 +14,15 @@ main (void)
double array_source[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++)
- array_source[i] = VAL1;
+ {
+ array_source[i] = VAL1;
+ asm volatile ("" ::: "memory");
+ }
pack_float_plus_1point1 (array_dest, array_source, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
if (array_dest[i] != (float) (VAL1 + 1.1))
- {
- fprintf (stderr, "%d: %f != %f\n", i, array_dest[i],
- (float) (VAL1 + 1.1));
- exit (1);
- }
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_popcount_1.c b/gcc/testsuite/gcc.target/aarch64/sve_popcount_1.c
index 0e640dab810..c3bb2756b2a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_popcount_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_popcount_1.c
@@ -1,16 +1,17 @@
/* { dg-do assemble } */
/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve --save-temps" } */
-void
-popcount_32 (unsigned int *restrict dst, unsigned int *restrict src, int size)
+#include <stdint.h>
+
+void __attribute__ ((noinline, noclone))
+popcount_32 (unsigned int *restrict dst, uint32_t *restrict src, int size)
{
for (int i = 0; i < size; ++i)
dst[i] = __builtin_popcount (src[i]);
}
-void
-popcount_64 (unsigned int *restrict dst, unsigned long *restrict src,
- int size)
+void __attribute__ ((noinline, noclone))
+popcount_64 (unsigned int *restrict dst, uint64_t *restrict src, int size)
{
for (int i = 0; i < size; ++i)
dst[i] = __builtin_popcountl (src[i]);
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_popcount_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_popcount_1_run.c
index 9ef47bcbf2c..6be828fa81a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_popcount_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_popcount_1_run.c
@@ -1,5 +1,5 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include "sve_popcount_1.c"
@@ -16,24 +16,31 @@ unsigned int data[] = {
0x0, 0
};
-int
+int __attribute__ ((optimize (1)))
main (void)
{
unsigned int count = sizeof (data) / sizeof (data[0]) / 2;
- unsigned int in32[count], out32[count];
+ uint32_t in32[count];
+ unsigned int out32[count];
for (unsigned int i = 0; i < count; ++i)
- in32[i] = data[i * 2];
+ {
+ in32[i] = data[i * 2];
+ asm volatile ("" ::: "memory");
+ }
popcount_32 (out32, in32, count);
for (unsigned int i = 0; i < count; ++i)
if (out32[i] != data[i * 2 + 1])
abort ();
count /= 2;
- unsigned long in64[count];
+ uint64_t in64[count];
unsigned int out64[count];
for (unsigned int i = 0; i < count; ++i)
- in64[i] = ((unsigned long) data[i * 4] << 32) | data[i * 4 + 2];
+ {
+ in64[i] = ((uint64_t) data[i * 4] << 32) | data[i * 4 + 2];
+ asm volatile ("" ::: "memory");
+ }
popcount_64 (out64, in64, count);
for (unsigned int i = 0; i < count; ++i)
if (out64[i] != data[i * 4 + 1] + data[i * 4 + 3])
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_reduc_1.C b/gcc/testsuite/gcc.target/aarch64/sve_reduc_1.c
index da3b5fa1963..4c26e78fae8 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_reduc_1.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_reduc_1.c
@@ -1,10 +1,11 @@
/* { dg-do compile } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -ffast-math -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math -march=armv8-a+sve" } */
#include <stdint.h>
#define DEF_REDUC_PLUS(TYPE) \
-TYPE reduc_plus_##TYPE (TYPE *a, int n) \
+TYPE __attribute__ ((noinline, noclone)) \
+reduc_plus_##TYPE (TYPE *a, int n) \
{ \
TYPE r = 0; \
for (int i = 0; i < n; ++i) \
@@ -12,19 +13,24 @@ TYPE reduc_plus_##TYPE (TYPE *a, int n) \
return r; \
}
-DEF_REDUC_PLUS (int8_t)
-DEF_REDUC_PLUS (int16_t)
-DEF_REDUC_PLUS (int32_t)
-DEF_REDUC_PLUS (int64_t)
-DEF_REDUC_PLUS (uint8_t)
-DEF_REDUC_PLUS (uint16_t)
-DEF_REDUC_PLUS (uint32_t)
-DEF_REDUC_PLUS (uint64_t)
-DEF_REDUC_PLUS (float)
-DEF_REDUC_PLUS (double)
-
-#define DEF_REDUC_MAXMIN(TYPE,NAME,CMP_OP) \
-TYPE reduc_##NAME##TYPE (TYPE *a, int n) \
+#define TEST_PLUS(T) \
+ T (int8_t) \
+ T (int16_t) \
+ T (int32_t) \
+ T (int64_t) \
+ T (uint8_t) \
+ T (uint16_t) \
+ T (uint32_t) \
+ T (uint64_t) \
+ T (_Float16) \
+ T (float) \
+ T (double)
+
+TEST_PLUS (DEF_REDUC_PLUS)
+
+#define DEF_REDUC_MAXMIN(TYPE, NAME, CMP_OP) \
+TYPE __attribute__ ((noinline, noclone)) \
+reduc_##NAME##_##TYPE (TYPE *a, int n) \
{ \
TYPE r = 13; \
for (int i = 0; i < n; ++i) \
@@ -32,30 +38,36 @@ TYPE reduc_##NAME##TYPE (TYPE *a, int n) \
return r; \
}
-DEF_REDUC_MAXMIN (int8_t, max, >)
-DEF_REDUC_MAXMIN (int16_t, max, >)
-DEF_REDUC_MAXMIN (int32_t, max, >)
-DEF_REDUC_MAXMIN (int64_t, max, >)
-DEF_REDUC_MAXMIN (uint8_t, max, >)
-DEF_REDUC_MAXMIN (uint16_t, max, >)
-DEF_REDUC_MAXMIN (uint32_t, max, >)
-DEF_REDUC_MAXMIN (uint64_t, max, >)
-DEF_REDUC_MAXMIN (float, max, >)
-DEF_REDUC_MAXMIN (double, max, >)
-
-DEF_REDUC_MAXMIN (int8_t, min, <)
-DEF_REDUC_MAXMIN (int16_t, min, <)
-DEF_REDUC_MAXMIN (int32_t, min, <)
-DEF_REDUC_MAXMIN (int64_t, min, <)
-DEF_REDUC_MAXMIN (uint8_t, min, <)
-DEF_REDUC_MAXMIN (uint16_t, min, <)
-DEF_REDUC_MAXMIN (uint32_t, min, <)
-DEF_REDUC_MAXMIN (uint64_t, min, <)
-DEF_REDUC_MAXMIN (float, min, <)
-DEF_REDUC_MAXMIN (double, min, <)
-
-#define DEF_REDUC_BITWISE(TYPE,NAME,BIT_OP) \
-TYPE reduc_##NAME##TYPE (TYPE *a, int n) \
+#define TEST_MAXMIN(T) \
+ T (int8_t, max, >) \
+ T (int16_t, max, >) \
+ T (int32_t, max, >) \
+ T (int64_t, max, >) \
+ T (uint8_t, max, >) \
+ T (uint16_t, max, >) \
+ T (uint32_t, max, >) \
+ T (uint64_t, max, >) \
+ T (_Float16, max, >) \
+ T (float, max, >) \
+ T (double, max, >) \
+ \
+ T (int8_t, min, <) \
+ T (int16_t, min, <) \
+ T (int32_t, min, <) \
+ T (int64_t, min, <) \
+ T (uint8_t, min, <) \
+ T (uint16_t, min, <) \
+ T (uint32_t, min, <) \
+ T (uint64_t, min, <) \
+ T (_Float16, min, <) \
+ T (float, min, <) \
+ T (double, min, <)
+
+TEST_MAXMIN (DEF_REDUC_MAXMIN)
+
+#define DEF_REDUC_BITWISE(TYPE, NAME, BIT_OP) \
+TYPE __attribute__ ((noinline, noclone)) \
+reduc_##NAME##_##TYPE (TYPE *a, int n) \
{ \
TYPE r = 13; \
for (int i = 0; i < n; ++i) \
@@ -63,80 +75,93 @@ TYPE reduc_##NAME##TYPE (TYPE *a, int n) \
return r; \
}
-DEF_REDUC_BITWISE (int8_t, and, &=)
-DEF_REDUC_BITWISE (int16_t, and, &=)
-DEF_REDUC_BITWISE (int32_t, and, &=)
-DEF_REDUC_BITWISE (int64_t, and, &=)
-DEF_REDUC_BITWISE (uint8_t, and, &=)
-DEF_REDUC_BITWISE (uint16_t, and, &=)
-DEF_REDUC_BITWISE (uint32_t, and, &=)
-DEF_REDUC_BITWISE (uint64_t, and, &=)
-
-DEF_REDUC_BITWISE (int8_t, ior, |=)
-DEF_REDUC_BITWISE (int16_t, ior, |=)
-DEF_REDUC_BITWISE (int32_t, ior, |=)
-DEF_REDUC_BITWISE (int64_t, ior, |=)
-DEF_REDUC_BITWISE (uint8_t, ior, |=)
-DEF_REDUC_BITWISE (uint16_t, ior, |=)
-DEF_REDUC_BITWISE (uint32_t, ior, |=)
-DEF_REDUC_BITWISE (uint64_t, ior, |=)
-
-DEF_REDUC_BITWISE (int8_t, xor, ^=)
-DEF_REDUC_BITWISE (int16_t, xor, ^=)
-DEF_REDUC_BITWISE (int32_t, xor, ^=)
-DEF_REDUC_BITWISE (int64_t, xor, ^=)
-DEF_REDUC_BITWISE (uint8_t, xor, ^=)
-DEF_REDUC_BITWISE (uint16_t, xor, ^=)
-DEF_REDUC_BITWISE (uint32_t, xor, ^=)
-DEF_REDUC_BITWISE (uint64_t, xor, ^=)
-
-/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, p[0-7]/m, z[0-9]\.b, z[0-9]\.b\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]\.h, z[0-9]\.h\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]\.s, z[0-9]\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]\.d, z[0-9]\.d\n} 2 } } */
-
-/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.b, p[0-7]/m, z[0-9]\.b, z[0-9]\.b\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.h, p[0-7]/m, z[0-9]\.h, z[0-9]\.h\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.s, p[0-7]/m, z[0-9]\.s, z[0-9]\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.d, p[0-7]/m, z[0-9]\.d, z[0-9]\.d\n} 1 } } */
-
-/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m, z[0-9]\.b, z[0-9]\.b\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m, z[0-9]\.h, z[0-9]\.h\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m, z[0-9]\.s, z[0-9]\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.d, p[0-7]/m, z[0-9]\.d, z[0-9]\.d\n} 1 } } */
-
-/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.b, p[0-7]/m, z[0-9]\.b, z[0-9]\.b\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.h, p[0-7]/m, z[0-9]\.h, z[0-9]\.h\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.s, p[0-7]/m, z[0-9]\.s, z[0-9]\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.d, p[0-7]/m, z[0-9]\.d, z[0-9]\.d\n} 1 } } */
-
-/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.b, p[0-7]/m, z[0-9]\.b, z[0-9]\.b\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.h, p[0-7]/m, z[0-9]\.h, z[0-9]\.h\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.s, p[0-7]/m, z[0-9]\.s, z[0-9]\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.d, p[0-7]/m, z[0-9]\.d, z[0-9]\.d\n} 1 } } */
-
-/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]\.s, z[0-9]\.s\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]\.d, z[0-9]\.d\n} 1 } } */
-
-/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.b, p[0-7]/m, z[0-9]\.b, z[0-9]\.b\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.h, p[0-7]/m, z[0-9]\.h, z[0-9]\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.s, p[0-7]/m, z[0-9]\.s, z[0-9]\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d, p[0-7]/m, z[0-9]\.d, z[0-9]\.d\n} 2 } } */
-
-/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.b, p[0-7]/m, z[0-9]\.b, z[0-9]\.b\n} 2 } } */
-/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.h, p[0-7]/m, z[0-9]\.h, z[0-9]\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.s, p[0-7]/m, z[0-9]\.s, z[0-9]\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.d, p[0-7]/m, z[0-9]\.d, z[0-9]\.d\n} 2 } } */
-
-/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.b, p[0-7]/m, z[0-9]\.b, z[0-9]\.b\n} 2 } } */
-/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.h, p[0-7]/m, z[0-9]\.h, z[0-9]\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.s, p[0-7]/m, z[0-9]\.s, z[0-9]\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.d, p[0-7]/m, z[0-9]\.d, z[0-9]\.d\n} 2 } } */
+#define TEST_BITWISE(T) \
+ T (int8_t, and, &=) \
+ T (int16_t, and, &=) \
+ T (int32_t, and, &=) \
+ T (int64_t, and, &=) \
+ T (uint8_t, and, &=) \
+ T (uint16_t, and, &=) \
+ T (uint32_t, and, &=) \
+ T (uint64_t, and, &=) \
+ \
+ T (int8_t, ior, |=) \
+ T (int16_t, ior, |=) \
+ T (int32_t, ior, |=) \
+ T (int64_t, ior, |=) \
+ T (uint8_t, ior, |=) \
+ T (uint16_t, ior, |=) \
+ T (uint32_t, ior, |=) \
+ T (uint64_t, ior, |=) \
+ \
+ T (int8_t, xor, ^=) \
+ T (int16_t, xor, ^=) \
+ T (int32_t, xor, ^=) \
+ T (int64_t, xor, ^=) \
+ T (uint8_t, xor, ^=) \
+ T (uint16_t, xor, ^=) \
+ T (uint32_t, xor, ^=) \
+ T (uint64_t, xor, ^=)
+
+TEST_BITWISE (DEF_REDUC_BITWISE)
+
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tsmin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tumax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.b, p[0-7]/m, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfaddv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfaddv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfaddv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
@@ -148,6 +173,7 @@ DEF_REDUC_BITWISE (uint64_t, xor, ^=)
/* { dg-final { scan-assembler-times {\tumaxv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tumaxv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tumaxv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnmv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnmv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnmv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
@@ -159,6 +185,7 @@ DEF_REDUC_BITWISE (uint64_t, xor, ^=)
/* { dg-final { scan-assembler-times {\tuminv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuminv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuminv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnmv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
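The substantive change in sve_reduc_1.c is the addition of _Float16 to the PLUS and MAX/MIN lists, matched by the new half-precision scan patterns (fadd/fmaxnm/fminnm on .h and faddv/fmaxnmv/fminnmv on h registers). For reference, the _Float16 max instantiation produced by DEF_REDUC_MAXMIN should look roughly like this after expansion:

_Float16 __attribute__ ((noinline, noclone))
reduc_max__Float16 (_Float16 *a, int n)
{
  _Float16 r = 13;
  for (int i = 0; i < n; ++i)
    r = a[i] > r ? a[i] : r;   /* with -ffast-math, expected to become FMAXNM + FMAXNMV  */
  return r;
}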
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.C b/gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.C
deleted file mode 100644
index 17c978de7f7..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.C
+++ /dev/null
@@ -1,117 +0,0 @@
-/* { dg-do run { target { aarch64_sve_hw } } } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -ffast-math -fno-inline -march=armv8-a+sve" } */
-
-#include "sve_reduc_1.C"
-
-#include <stdlib.h>
-#include <stdio.h>
-
-#define NUM_ELEMS(TYPE) (73 + sizeof (TYPE))
-
-#define DEF_INIT_VECTOR(TYPE) \
- TYPE r_##TYPE[NUM_ELEMS (TYPE) + 1]; \
- for (int i = 0; i < NUM_ELEMS (TYPE) + 1; i++) \
- r_##TYPE[i] = (i * 2) * (i & 1 ? 1 : -1);
-
-#define TEST_REDUC_PLUS(RES,TYPE) \
- (RES) += reduc_plus_##TYPE (r_##TYPE, NUM_ELEMS (TYPE));
-#define TEST_REDUC_MAX(RES,TYPE) \
- (RES) += reduc_max##TYPE (r_##TYPE, NUM_ELEMS (TYPE));
-#define TEST_REDUC_MIN(RES,TYPE) \
- (RES) += reduc_min##TYPE (r_##TYPE, NUM_ELEMS (TYPE));
-#define TEST_REDUC_AND(RES,TYPE) \
- (RES) += reduc_and##TYPE (r_##TYPE, NUM_ELEMS (TYPE));
-#define TEST_REDUC_IOR(RES,TYPE) \
- (RES) += reduc_ior##TYPE (r_##TYPE, NUM_ELEMS (TYPE));
-#define TEST_REDUC_XOR(RES,TYPE) \
- (RES) += reduc_xor##TYPE (r_##TYPE, NUM_ELEMS (TYPE));
-
-int main ()
-{
- int result = 0;
- double resultF = 0.0;
- DEF_INIT_VECTOR (int8_t)
- DEF_INIT_VECTOR (int16_t)
- DEF_INIT_VECTOR (int32_t)
- DEF_INIT_VECTOR (int64_t)
- DEF_INIT_VECTOR (uint8_t)
- DEF_INIT_VECTOR (uint16_t)
- DEF_INIT_VECTOR (uint32_t)
- DEF_INIT_VECTOR (uint64_t)
- DEF_INIT_VECTOR (float)
- DEF_INIT_VECTOR (double)
-
- TEST_REDUC_PLUS (result, int8_t)
- TEST_REDUC_PLUS (result, int16_t)
- TEST_REDUC_PLUS (result, int32_t)
- TEST_REDUC_PLUS (result, int64_t)
- TEST_REDUC_PLUS (result, uint8_t)
- TEST_REDUC_PLUS (result, uint16_t)
- TEST_REDUC_PLUS (result, uint32_t)
- TEST_REDUC_PLUS (result, uint64_t)
- TEST_REDUC_PLUS (resultF, float)
- TEST_REDUC_PLUS (resultF, double)
-
- TEST_REDUC_MIN (result, int8_t)
- TEST_REDUC_MIN (result, int16_t)
- TEST_REDUC_MIN (result, int32_t)
- TEST_REDUC_MIN (result, int64_t)
- TEST_REDUC_MIN (result, uint8_t)
- TEST_REDUC_MIN (result, uint16_t)
- TEST_REDUC_MIN (result, uint32_t)
- TEST_REDUC_MIN (result, uint64_t)
- TEST_REDUC_MIN (resultF, float)
- TEST_REDUC_MIN (resultF, double)
-
- TEST_REDUC_MAX (result, int8_t)
- TEST_REDUC_MAX (result, int16_t)
- TEST_REDUC_MAX (result, int32_t)
- TEST_REDUC_MAX (result, int64_t)
- TEST_REDUC_MAX (result, uint8_t)
- TEST_REDUC_MAX (result, uint16_t)
- TEST_REDUC_MAX (result, uint32_t)
- TEST_REDUC_MAX (result, uint64_t)
- TEST_REDUC_MAX (resultF, float)
- TEST_REDUC_MAX (resultF, double)
-
- TEST_REDUC_AND (result, int8_t)
- TEST_REDUC_AND (result, int16_t)
- TEST_REDUC_AND (result, int32_t)
- TEST_REDUC_AND (result, int64_t)
- TEST_REDUC_AND (result, uint8_t)
- TEST_REDUC_AND (result, uint16_t)
- TEST_REDUC_AND (result, uint32_t)
- TEST_REDUC_AND (result, uint64_t)
-
- TEST_REDUC_IOR (result, int8_t)
- TEST_REDUC_IOR (result, int16_t)
- TEST_REDUC_IOR (result, int32_t)
- TEST_REDUC_IOR (result, int64_t)
- TEST_REDUC_IOR (result, uint8_t)
- TEST_REDUC_IOR (result, uint16_t)
- TEST_REDUC_IOR (result, uint32_t)
- TEST_REDUC_IOR (result, uint64_t)
-
- TEST_REDUC_XOR (result, int8_t)
- TEST_REDUC_XOR (result, int16_t)
- TEST_REDUC_XOR (result, int32_t)
- TEST_REDUC_XOR (result, int64_t)
- TEST_REDUC_XOR (result, uint8_t)
- TEST_REDUC_XOR (result, uint16_t)
- TEST_REDUC_XOR (result, uint32_t)
- TEST_REDUC_XOR (result, uint64_t)
-
- if (result != 262400)
- {
- fprintf (stderr, "result = %d\n", result);
- abort ();
- }
-
- if (resultF != -160)
- {
- fprintf (stderr, "resultF = %1.16lf\n", resultF);
- abort ();
- }
-
- return 0;
-}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.c
new file mode 100644
index 00000000000..9f4afbcf3a7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_reduc_1_run.c
@@ -0,0 +1,56 @@
+/* { dg-do run { target { aarch64_sve_hw } } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math -march=armv8-a+sve" } */
+
+#include "sve_reduc_1.c"
+
+#define NUM_ELEMS(TYPE) (73 + sizeof (TYPE))
+
+#define INIT_VECTOR(TYPE) \
+ TYPE a[NUM_ELEMS (TYPE) + 1]; \
+ for (int i = 0; i < NUM_ELEMS (TYPE) + 1; i++) \
+ { \
+ a[i] = ((i * 2) * (i & 1 ? 1 : -1) | 3); \
+ asm volatile ("" ::: "memory"); \
+ }
+
+#define TEST_REDUC_PLUS(TYPE) \
+ { \
+ INIT_VECTOR (TYPE); \
+ TYPE r1 = reduc_plus_##TYPE (a, NUM_ELEMS (TYPE)); \
+ volatile TYPE r2 = 0; \
+ for (int i = 0; i < NUM_ELEMS (TYPE); ++i) \
+ r2 += a[i]; \
+ if (r1 != r2) \
+ __builtin_abort (); \
+ }
+
+#define TEST_REDUC_MAXMIN(TYPE, NAME, CMP_OP) \
+ { \
+ INIT_VECTOR (TYPE); \
+ TYPE r1 = reduc_##NAME##_##TYPE (a, NUM_ELEMS (TYPE)); \
+ volatile TYPE r2 = 13; \
+ for (int i = 0; i < NUM_ELEMS (TYPE); ++i) \
+ r2 = a[i] CMP_OP r2 ? a[i] : r2; \
+ if (r1 != r2) \
+ __builtin_abort (); \
+ }
+
+#define TEST_REDUC_BITWISE(TYPE, NAME, BIT_OP) \
+ { \
+ INIT_VECTOR (TYPE); \
+ TYPE r1 = reduc_##NAME##_##TYPE (a, NUM_ELEMS (TYPE)); \
+ volatile TYPE r2 = 13; \
+ for (int i = 0; i < NUM_ELEMS (TYPE); ++i) \
+ r2 BIT_OP a[i]; \
+ if (r1 != r2) \
+ __builtin_abort (); \
+ }
+
+int main ()
+{
+ TEST_PLUS (TEST_REDUC_PLUS)
+ TEST_MAXMIN (TEST_REDUC_MAXMIN)
+ TEST_BITWISE (TEST_REDUC_BITWISE)
+
+ return 0;
+}
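The rewritten run test checks each reduction against a freshly computed scalar reference instead of a hard-coded total, which keeps it valid for any vector length. The volatile qualifier on the reference accumulator is intended to keep that loop scalar and evaluated in source order even under -ffast-math. Expanded for one type (uint32_t shown), TEST_REDUC_PLUS is simply:

{
  INIT_VECTOR (uint32_t);                                 /* declares and fills a[]  */
  uint32_t r1 = reduc_plus_uint32_t (a, NUM_ELEMS (uint32_t));
  volatile uint32_t r2 = 0;
  for (int i = 0; i < NUM_ELEMS (uint32_t); ++i)
    r2 += a[i];                                           /* scalar reference sum  */
  if (r1 != r2)
    __builtin_abort ();
}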
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_reduc_2.C b/gcc/testsuite/gcc.target/aarch64/sve_reduc_2.c
index 6ac37570164..669306549d3 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_reduc_2.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_reduc_2.c
@@ -1,109 +1,126 @@
/* { dg-do compile } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -ffast-math -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math -march=armv8-a+sve" } */
#include <stdint.h>
#define NUM_ELEMS(TYPE) (1024 / sizeof (TYPE))
-#define DEF_REDUC_PLUS(TYPE) \
-void reduc_plus_##TYPE (TYPE (*__restrict__ a)[NUM_ELEMS (TYPE)], \
- TYPE *__restrict__ r, int n) \
-{ \
- for (int i = 0; i < n; i++) \
- { \
- r[i] = 0; \
- for (int j = 0; j < NUM_ELEMS (TYPE); j++) \
- r[i] += a[i][j]; \
- } \
+#define DEF_REDUC_PLUS(TYPE) \
+void __attribute__ ((noinline, noclone)) \
+reduc_plus_##TYPE (TYPE (*restrict a)[NUM_ELEMS (TYPE)], \
+ TYPE *restrict r, int n) \
+{ \
+ for (int i = 0; i < n; i++) \
+ { \
+ r[i] = 0; \
+ for (int j = 0; j < NUM_ELEMS (TYPE); j++) \
+ r[i] += a[i][j]; \
+ } \
}
-DEF_REDUC_PLUS (int8_t)
-DEF_REDUC_PLUS (int16_t)
-DEF_REDUC_PLUS (int32_t)
-DEF_REDUC_PLUS (int64_t)
-DEF_REDUC_PLUS (uint8_t)
-DEF_REDUC_PLUS (uint16_t)
-DEF_REDUC_PLUS (uint32_t)
-DEF_REDUC_PLUS (uint64_t)
-DEF_REDUC_PLUS (float)
-DEF_REDUC_PLUS (double)
-
-#define DEF_REDUC_MAXMIN(TYPE,NAME,CMP_OP) \
-void reduc_##NAME##TYPE (TYPE (*__restrict__ a)[NUM_ELEMS (TYPE)], \
- TYPE *__restrict__ r, int n) \
-{ \
- for (int i = 0; i < n; i++) \
- { \
- r[i] = a[i][0]; \
- for (int j = 0; j < NUM_ELEMS (TYPE); j++) \
- r[i] = a[i][j] CMP_OP r[i] ? a[i][j] : r[i]; \
- } \
+#define TEST_PLUS(T) \
+ T (int8_t) \
+ T (int16_t) \
+ T (int32_t) \
+ T (int64_t) \
+ T (uint8_t) \
+ T (uint16_t) \
+ T (uint32_t) \
+ T (uint64_t) \
+ T (_Float16) \
+ T (float) \
+ T (double)
+
+TEST_PLUS (DEF_REDUC_PLUS)
+
+#define DEF_REDUC_MAXMIN(TYPE, NAME, CMP_OP) \
+void __attribute__ ((noinline, noclone)) \
+reduc_##NAME##_##TYPE (TYPE (*restrict a)[NUM_ELEMS (TYPE)], \
+ TYPE *restrict r, int n) \
+{ \
+ for (int i = 0; i < n; i++) \
+ { \
+ r[i] = a[i][0]; \
+ for (int j = 0; j < NUM_ELEMS (TYPE); j++) \
+ r[i] = a[i][j] CMP_OP r[i] ? a[i][j] : r[i]; \
+ } \
}
-DEF_REDUC_MAXMIN (int8_t, max, >)
-DEF_REDUC_MAXMIN (int16_t, max, >)
-DEF_REDUC_MAXMIN (int32_t, max, >)
-DEF_REDUC_MAXMIN (int64_t, max, >)
-DEF_REDUC_MAXMIN (uint8_t, max, >)
-DEF_REDUC_MAXMIN (uint16_t, max, >)
-DEF_REDUC_MAXMIN (uint32_t, max, >)
-DEF_REDUC_MAXMIN (uint64_t, max, >)
-DEF_REDUC_MAXMIN (float, max, >)
-DEF_REDUC_MAXMIN (double, max, >)
-
-DEF_REDUC_MAXMIN (int8_t, min, <)
-DEF_REDUC_MAXMIN (int16_t, min, <)
-DEF_REDUC_MAXMIN (int32_t, min, <)
-DEF_REDUC_MAXMIN (int64_t, min, <)
-DEF_REDUC_MAXMIN (uint8_t, min, <)
-DEF_REDUC_MAXMIN (uint16_t, min, <)
-DEF_REDUC_MAXMIN (uint32_t, min, <)
-DEF_REDUC_MAXMIN (uint64_t, min, <)
-DEF_REDUC_MAXMIN (float, min, <)
-DEF_REDUC_MAXMIN (double, min, <)
-
-#define DEF_REDUC_BITWISE(TYPE,NAME,BIT_OP)\
-void reduc_##NAME##TYPE (TYPE (*__restrict__ a)[NUM_ELEMS(TYPE)], TYPE *__restrict__ r, int n)\
-{\
- for (int i = 0; i < n; i++)\
- {\
- r[i] = a[i][0];\
- for (int j = 0; j < NUM_ELEMS(TYPE); j++)\
- r[i] BIT_OP a[i][j];\
- }\
-}\
-
-DEF_REDUC_BITWISE (int8_t, and, &=)
-DEF_REDUC_BITWISE (int16_t, and, &=)
-DEF_REDUC_BITWISE (int32_t, and, &=)
-DEF_REDUC_BITWISE (int64_t, and, &=)
-DEF_REDUC_BITWISE (uint8_t, and, &=)
-DEF_REDUC_BITWISE (uint16_t, and, &=)
-DEF_REDUC_BITWISE (uint32_t, and, &=)
-DEF_REDUC_BITWISE (uint64_t, and, &=)
-
-DEF_REDUC_BITWISE (int8_t, ior, |=)
-DEF_REDUC_BITWISE (int16_t, ior, |=)
-DEF_REDUC_BITWISE (int32_t, ior, |=)
-DEF_REDUC_BITWISE (int64_t, ior, |=)
-DEF_REDUC_BITWISE (uint8_t, ior, |=)
-DEF_REDUC_BITWISE (uint16_t, ior, |=)
-DEF_REDUC_BITWISE (uint32_t, ior, |=)
-DEF_REDUC_BITWISE (uint64_t, ior, |=)
-
-DEF_REDUC_BITWISE (int8_t, xor, ^=)
-DEF_REDUC_BITWISE (int16_t, xor, ^=)
-DEF_REDUC_BITWISE (int32_t, xor, ^=)
-DEF_REDUC_BITWISE (int64_t, xor, ^=)
-DEF_REDUC_BITWISE (uint8_t, xor, ^=)
-DEF_REDUC_BITWISE (uint16_t, xor, ^=)
-DEF_REDUC_BITWISE (uint32_t, xor, ^=)
-DEF_REDUC_BITWISE (uint64_t, xor, ^=)
+#define TEST_MAXMIN(T) \
+ T (int8_t, max, >) \
+ T (int16_t, max, >) \
+ T (int32_t, max, >) \
+ T (int64_t, max, >) \
+ T (uint8_t, max, >) \
+ T (uint16_t, max, >) \
+ T (uint32_t, max, >) \
+ T (uint64_t, max, >) \
+ T (_Float16, max, >) \
+ T (float, max, >) \
+ T (double, max, >) \
+ \
+ T (int8_t, min, <) \
+ T (int16_t, min, <) \
+ T (int32_t, min, <) \
+ T (int64_t, min, <) \
+ T (uint8_t, min, <) \
+ T (uint16_t, min, <) \
+ T (uint32_t, min, <) \
+ T (uint64_t, min, <) \
+ T (_Float16, min, <) \
+ T (float, min, <) \
+ T (double, min, <)
+
+TEST_MAXMIN (DEF_REDUC_MAXMIN)
+
+#define DEF_REDUC_BITWISE(TYPE,NAME,BIT_OP) \
+void __attribute__ ((noinline, noclone)) \
+reduc_##NAME##TYPE (TYPE (*restrict a)[NUM_ELEMS(TYPE)], \
+ TYPE *restrict r, int n) \
+{ \
+ for (int i = 0; i < n; i++) \
+ { \
+ r[i] = a[i][0]; \
+ for (int j = 0; j < NUM_ELEMS(TYPE); j++) \
+ r[i] BIT_OP a[i][j]; \
+ } \
+}
+
+#define TEST_BITWISE(T) \
+ T (int8_t, and, &=) \
+ T (int16_t, and, &=) \
+ T (int32_t, and, &=) \
+ T (int64_t, and, &=) \
+ T (uint8_t, and, &=) \
+ T (uint16_t, and, &=) \
+ T (uint32_t, and, &=) \
+ T (uint64_t, and, &=) \
+ \
+ T (int8_t, ior, |=) \
+ T (int16_t, ior, |=) \
+ T (int32_t, ior, |=) \
+ T (int64_t, ior, |=) \
+ T (uint8_t, ior, |=) \
+ T (uint16_t, ior, |=) \
+ T (uint32_t, ior, |=) \
+ T (uint64_t, ior, |=) \
+ \
+ T (int8_t, xor, ^=) \
+ T (int16_t, xor, ^=) \
+ T (int32_t, xor, ^=) \
+ T (int64_t, xor, ^=) \
+ T (uint8_t, xor, ^=) \
+ T (uint16_t, xor, ^=) \
+ T (uint32_t, xor, ^=) \
+ T (uint64_t, xor, ^=)
+
+TEST_BITWISE (DEF_REDUC_BITWISE)
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.b\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfaddv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfaddv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfaddv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
@@ -115,6 +132,7 @@ DEF_REDUC_BITWISE (uint64_t, xor, ^=)
/* { dg-final { scan-assembler-times {\tumaxv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tumaxv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tumaxv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnmv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnmv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnmv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
@@ -126,6 +144,7 @@ DEF_REDUC_BITWISE (uint64_t, xor, ^=)
/* { dg-final { scan-assembler-times {\tuminv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuminv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tuminv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnmv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnmv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.C b/gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.C
deleted file mode 100644
index 6f170fb0de6..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.C
+++ /dev/null
@@ -1,135 +0,0 @@
-/* { dg-do run { target { aarch64_sve_hw } } } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -ffast-math -fno-inline -march=armv8-a+sve" } */
-
-#include "sve_reduc_2.C"
-
-#include <stdlib.h>
-#include <stdio.h>
-#include <math.h>
-
-#define NROWS 5
-
-#define DEF_INIT_VECTOR(TYPE) \
- TYPE mat_##TYPE[NROWS][NUM_ELEMS (TYPE)]; \
- TYPE r_##TYPE[NROWS]; \
- for (int i = 0; i < NROWS; i++) \
- for (int j = 0; j < NUM_ELEMS (TYPE); j++ ) \
- mat_##TYPE[i][j] = i + (j * 2) * (j & 1 ? 1 : -1);
-
-#define TEST_REDUC_PLUS(TYPE) reduc_plus_##TYPE (mat_##TYPE, r_##TYPE, NROWS);
-#define TEST_REDUC_MAX(TYPE) reduc_max##TYPE (mat_##TYPE, r_##TYPE, NROWS);
-#define TEST_REDUC_MIN(TYPE) reduc_min##TYPE (mat_##TYPE, r_##TYPE, NROWS);
-#define TEST_REDUC_AND(TYPE) reduc_and##TYPE (mat_##TYPE, r_##TYPE, NROWS);
-#define TEST_REDUC_IOR(TYPE) reduc_ior##TYPE (mat_##TYPE, r_##TYPE, NROWS);
-#define TEST_REDUC_XOR(TYPE) reduc_xor##TYPE (mat_##TYPE, r_##TYPE, NROWS);
-
-#define SUM_VECTOR(RES, TYPE)\
- for (int i = 0; i < NROWS; i++)\
- (RES) += r_##TYPE[i];
-
-#define SUM_INT_RESULT(RES)\
- SUM_VECTOR (RES, int8_t);\
- SUM_VECTOR (RES, int16_t);\
- SUM_VECTOR (RES, int32_t);\
- SUM_VECTOR (RES, int64_t);\
- SUM_VECTOR (RES, uint8_t);\
- SUM_VECTOR (RES, uint16_t);\
- SUM_VECTOR (RES, uint32_t);\
- SUM_VECTOR (RES, uint64_t);\
-
-#define SUM_FLOAT_RESULT(RES)\
- SUM_VECTOR (RES, float);\
- SUM_VECTOR (RES, double);\
-
-int main ()
-{
- int result = 0;
- double resultF = 0.0;
- DEF_INIT_VECTOR (int8_t)
- DEF_INIT_VECTOR (int16_t)
- DEF_INIT_VECTOR (int32_t)
- DEF_INIT_VECTOR (int64_t)
- DEF_INIT_VECTOR (uint8_t)
- DEF_INIT_VECTOR (uint16_t)
- DEF_INIT_VECTOR (uint32_t)
- DEF_INIT_VECTOR (uint64_t)
- DEF_INIT_VECTOR (float)
- DEF_INIT_VECTOR (double)
-
- TEST_REDUC_PLUS (int8_t)
- TEST_REDUC_PLUS (int16_t)
- TEST_REDUC_PLUS (int32_t)
- TEST_REDUC_PLUS (int64_t)
- TEST_REDUC_PLUS (uint8_t)
- TEST_REDUC_PLUS (uint16_t)
- TEST_REDUC_PLUS (uint32_t)
- TEST_REDUC_PLUS (uint64_t)
- TEST_REDUC_PLUS (float)
- TEST_REDUC_PLUS (double)
-
- SUM_INT_RESULT (result);
- SUM_FLOAT_RESULT (resultF);
-
- TEST_REDUC_MIN (int8_t)
- TEST_REDUC_MIN (int16_t)
- TEST_REDUC_MIN (int32_t)
- TEST_REDUC_MIN (int64_t)
- TEST_REDUC_MIN (uint8_t)
- TEST_REDUC_MIN (uint16_t)
- TEST_REDUC_MIN (uint32_t)
- TEST_REDUC_MIN (uint64_t)
- TEST_REDUC_MIN (float)
- TEST_REDUC_MIN (double)
-
- TEST_REDUC_MAX (int8_t)
- TEST_REDUC_MAX (int16_t)
- TEST_REDUC_MAX (int32_t)
- TEST_REDUC_MAX (int64_t)
- TEST_REDUC_MAX (uint8_t)
- TEST_REDUC_MAX (uint16_t)
- TEST_REDUC_MAX (uint32_t)
- TEST_REDUC_MAX (uint64_t)
- TEST_REDUC_MAX (float)
- TEST_REDUC_MAX (double)
-
- TEST_REDUC_AND (int8_t)
- TEST_REDUC_AND (int16_t)
- TEST_REDUC_AND (int32_t)
- TEST_REDUC_AND (int64_t)
- TEST_REDUC_AND (uint8_t)
- TEST_REDUC_AND (uint16_t)
- TEST_REDUC_AND (uint32_t)
- TEST_REDUC_AND (uint64_t)
-
- TEST_REDUC_IOR (int8_t)
- TEST_REDUC_IOR (int16_t)
- TEST_REDUC_IOR (int32_t)
- TEST_REDUC_IOR (int64_t)
- TEST_REDUC_IOR (uint8_t)
- TEST_REDUC_IOR (uint16_t)
- TEST_REDUC_IOR (uint32_t)
- TEST_REDUC_IOR (uint64_t)
-
- TEST_REDUC_XOR (int8_t)
- TEST_REDUC_XOR (int16_t)
- TEST_REDUC_XOR (int32_t)
- TEST_REDUC_XOR (int64_t)
- TEST_REDUC_XOR (uint8_t)
- TEST_REDUC_XOR (uint16_t)
- TEST_REDUC_XOR (uint32_t)
- TEST_REDUC_XOR (uint64_t)
-
- if (result != 26880)
- {
- fprintf (stderr, "result = %d\n", result);
- abort ();
- }
-
- if (resultF != double (5760))
- {
- fprintf (stderr, "resultF = %1.16lf\n", resultF);
- abort ();
- }
-
- return 0;
-}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.c
new file mode 100644
index 00000000000..041db66c8cf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_reduc_2_run.c
@@ -0,0 +1,79 @@
+/* { dg-do run { target { aarch64_sve_hw } } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math -march=armv8-a+sve" } */
+
+#include "sve_reduc_2.c"
+
+#define NROWS 53
+
+/* -ffast-math fuzz for PLUS. */
+#define CMP__Float16(X, Y) ((X) >= (Y) * 0.875 && (X) <= (Y) * 1.125)
+#define CMP_float(X, Y) ((X) == (Y))
+#define CMP_double(X, Y) ((X) == (Y))
+#define CMP_int8_t(X, Y) ((X) == (Y))
+#define CMP_int16_t(X, Y) ((X) == (Y))
+#define CMP_int32_t(X, Y) ((X) == (Y))
+#define CMP_int64_t(X, Y) ((X) == (Y))
+#define CMP_uint8_t(X, Y) ((X) == (Y))
+#define CMP_uint16_t(X, Y) ((X) == (Y))
+#define CMP_uint32_t(X, Y) ((X) == (Y))
+#define CMP_uint64_t(X, Y) ((X) == (Y))
+
+#define INIT_MATRIX(TYPE) \
+ TYPE mat[NROWS][NUM_ELEMS (TYPE)]; \
+ TYPE r[NROWS]; \
+ for (int i = 0; i < NROWS; i++) \
+ for (int j = 0; j < NUM_ELEMS (TYPE); j++) \
+ { \
+ mat[i][j] = i + (j * 2) * (j & 1 ? 1 : -1); \
+ asm volatile ("" ::: "memory"); \
+ }
+
+#define TEST_REDUC_PLUS(TYPE) \
+ { \
+ INIT_MATRIX (TYPE); \
+ reduc_plus_##TYPE (mat, r, NROWS); \
+ for (int i = 0; i < NROWS; i++) \
+ { \
+ volatile TYPE r2 = 0; \
+ for (int j = 0; j < NUM_ELEMS (TYPE); ++j) \
+ r2 += mat[i][j]; \
+ if (!CMP_##TYPE (r[i], r2)) \
+ __builtin_abort (); \
+ } \
+ }
+
+#define TEST_REDUC_MAXMIN(TYPE, NAME, CMP_OP) \
+ { \
+ INIT_MATRIX (TYPE); \
+ reduc_##NAME##_##TYPE (mat, r, NROWS); \
+ for (int i = 0; i < NROWS; i++) \
+ { \
+ volatile TYPE r2 = mat[i][0]; \
+ for (int j = 0; j < NUM_ELEMS (TYPE); ++j) \
+ r2 = mat[i][j] CMP_OP r2 ? mat[i][j] : r2; \
+ if (r[i] != r2) \
+ __builtin_abort (); \
+ } \
+ }
+
+#define TEST_REDUC_BITWISE(TYPE, NAME, BIT_OP) \
+ { \
+ INIT_MATRIX (TYPE); \
+ reduc_##NAME##_##TYPE (mat, r, NROWS); \
+ for (int i = 0; i < NROWS; i++) \
+ { \
+ volatile TYPE r2 = mat[i][0]; \
+ for (int j = 0; j < NUM_ELEMS (TYPE); ++j) \
+ r2 BIT_OP mat[i][j]; \
+ if (r[i] != r2) \
+ __builtin_abort (); \
+ } \
+ }
+
+int main ()
+{
+ TEST_PLUS (TEST_REDUC_PLUS)
+ TEST_MAXMIN (TEST_REDUC_MAXMIN)
+
+ return 0;
+}
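A note on the CMP_* family above: with -ffast-math the vectorizer is free to reassociate the floating-point sums, and for _Float16 the rounding differences between the vector and scalar orderings can be noticeable, so CMP__Float16 accepts a band of roughly +/-12.5% around the reference value while every other type still demands exact equality. Concretely, the _Float16 PLUS check expands to something like:

if (!(r[i] >= r2 * 0.875 && r[i] <= r2 * 1.125))   /* tolerance band for the half-precision sum  */
  __builtin_abort ();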
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_reduc_3.c b/gcc/testsuite/gcc.target/aarch64/sve_reduc_3.c
index 9e997adedca..7daf3ae130e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_reduc_3.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_reduc_3.c
@@ -1,18 +1,52 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math -march=armv8-a+sve" } */
-double
-f (double *restrict a, double *restrict b, int *lookup)
-{
- double res = 0.0;
- for (int i = 0; i < 512; ++i)
- res += a[lookup[i]] * b[i];
- return res;
+#include <stdint.h>
+
+#define NUM_ELEMS(TYPE) (32 / sizeof (TYPE))
+
+#define REDUC_PTR(DSTTYPE, SRCTYPE) \
+void reduc_ptr_##DSTTYPE##_##SRCTYPE (DSTTYPE *restrict sum, \
+ SRCTYPE *restrict array, \
+ int count) \
+{ \
+ *sum = 0; \
+ for (int i = 0; i < count; ++i) \
+ *sum += array[i]; \
}
-/* { dg-final { scan-assembler-times {\tfmla\tz[0-9]+.d, p[0-7]/m, } 2 } } */
-/* Check that the vector instructions are the only instructions. */
-/* { dg-final { scan-assembler-times {\tfmla\t} 2 } } */
-/* { dg-final { scan-assembler-not {\tfadd\t} } } */
-/* { dg-final { scan-assembler-times {\tfaddv\td0,} 1 } } */
-/* { dg-final { scan-assembler-not {\tsel\t} } } */
+REDUC_PTR (int8_t, int8_t)
+REDUC_PTR (int16_t, int16_t)
+
+REDUC_PTR (int32_t, int32_t)
+REDUC_PTR (int64_t, int64_t)
+
+REDUC_PTR (_Float16, _Float16)
+REDUC_PTR (float, float)
+REDUC_PTR (double, double)
+
+/* Widening reductions. */
+REDUC_PTR (int32_t, int8_t)
+REDUC_PTR (int32_t, int16_t)
+
+REDUC_PTR (int64_t, int8_t)
+REDUC_PTR (int64_t, int16_t)
+REDUC_PTR (int64_t, int32_t)
+
+REDUC_PTR (float, _Float16)
+REDUC_PTR (double, float)
+
+/* Float<>Int conversions */
+REDUC_PTR (_Float16, int16_t)
+REDUC_PTR (float, int32_t)
+REDUC_PTR (double, int64_t)
+
+REDUC_PTR (int16_t, _Float16)
+REDUC_PTR (int32_t, float)
+REDUC_PTR (int64_t, double)
+
+/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tfaddv\th[0-9]+, p[0-7], z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfaddv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tfaddv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 3 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_reduc_4.c b/gcc/testsuite/gcc.target/aarch64/sve_reduc_4.c
index 2ba09b14851..9e997adedca 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_reduc_4.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_reduc_4.c
@@ -1,47 +1,18 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -ffast-math -fno-inline -march=armv8-a+sve" } */
-
-#include <stdint.h>
-
-#define NUM_ELEMS(TYPE) (32 / sizeof (TYPE))
-
-#define REDUC_PTR(DSTTYPE, SRCTYPE) \
-void reduc_ptr_##DSTTYPE##_##SRCTYPE (DSTTYPE *restrict sum, \
- SRCTYPE *restrict array, \
- int count) \
-{ \
- *sum = 0; \
- for (int i = 0; i < count; ++i) \
- *sum += array[i]; \
+/* { dg-options "-O2 -ftree-vectorize -ffast-math -march=armv8-a+sve" } */
+
+double
+f (double *restrict a, double *restrict b, int *lookup)
+{
+ double res = 0.0;
+ for (int i = 0; i < 512; ++i)
+ res += a[lookup[i]] * b[i];
+ return res;
}
-REDUC_PTR (int8_t, int8_t)
-REDUC_PTR (int16_t, int16_t)
-
-REDUC_PTR (int32_t, int32_t)
-REDUC_PTR (int64_t, int64_t)
-
-REDUC_PTR (float, float)
-REDUC_PTR (double, double)
-
-/* Widening reductions. */
-REDUC_PTR (int32_t, int8_t)
-REDUC_PTR (int32_t, int16_t)
-
-REDUC_PTR (int64_t, int8_t)
-REDUC_PTR (int64_t, int16_t)
-REDUC_PTR (int64_t, int32_t)
-
-REDUC_PTR (double, float)
-
-/* Float<>Int conversions */
-REDUC_PTR (float, int32_t)
-REDUC_PTR (double, int64_t)
-
-REDUC_PTR (int32_t, float)
-REDUC_PTR (int64_t, double)
-
-/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.s\n} 3 } } */
-/* { dg-final { scan-assembler-times {\tuaddv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tfaddv\ts[0-9]+, p[0-7], z[0-9]+\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tfaddv\td[0-9]+, p[0-7], z[0-9]+\.d\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tfmla\tz[0-9]+.d, p[0-7]/m, } 2 } } */
+/* Check that the vector instructions are the only instructions. */
+/* { dg-final { scan-assembler-times {\tfmla\t} 2 } } */
+/* { dg-final { scan-assembler-not {\tfadd\t} } } */
+/* { dg-final { scan-assembler-times {\tfaddv\td0,} 1 } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_revb_1.c b/gcc/testsuite/gcc.target/aarch64/sve_revb_1.c
index 2b8c6e523ca..9307200fb05 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_revb_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_revb_1.c
@@ -1,7 +1,9 @@
/* { dg-do assemble } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int8_t v32qi __attribute__((vector_size (32)));
#define MASK_2(X, Y) (X) ^ (Y), (X + 1) ^ (Y)
#define MASK_4(X, Y) MASK_2 (X, Y), MASK_2 (X + 2, Y)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_revh_1.c b/gcc/testsuite/gcc.target/aarch64/sve_revh_1.c
index aaa08dc03e2..fb238373c4e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_revh_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_revh_1.c
@@ -1,7 +1,10 @@
/* { dg-do assemble } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-typedef unsigned short v16hi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef uint16_t v16hi __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define MASK_2(X, Y) (X) ^ (Y), (X + 1) ^ (Y)
#define MASK_4(X, Y) MASK_2 (X, Y), MASK_2 (X + 2, Y)
@@ -21,11 +24,13 @@ typedef unsigned short v16hi __attribute__((vector_size (32)));
#define TEST_ALL(T) \
T (v16hi, 16, 2) \
- T (v16hi, 16, 4)
+ T (v16hi, 16, 4) \
+ T (v16hf, 16, 2) \
+ T (v16hf, 16, 4)
TEST_ALL (PERMUTE)
/* { dg-final { scan-assembler-not {\ttbl\t} } } */
-/* { dg-final { scan-assembler-times {\trevh\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d} 1 } } */
-/* { dg-final { scan-assembler-times {\trevh\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s} 1 } } */
+/* { dg-final { scan-assembler-times {\trevh\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d} 2 } } */
+/* { dg-final { scan-assembler-times {\trevh\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_revw_1.c b/gcc/testsuite/gcc.target/aarch64/sve_revw_1.c
index ac7ef0ef267..4834e2c2b01 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_revw_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_revw_1.c
@@ -1,7 +1,9 @@
/* { dg-do assemble } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-typedef unsigned int v8si __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef uint32_t v8si __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
#define MASK_2(X, Y) (X) ^ (Y), (X + 1) ^ (Y)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_shift_1.c b/gcc/testsuite/gcc.target/aarch64/sve_shift_1.c
index 24aa47488ef..b19cd7a3161 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_shift_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_shift_1.c
@@ -1,77 +1,84 @@
/* { dg-do assemble } */
-/* { dg-options "-std=c99 -O2 -ftree-vectorize -march=armv8-a+sve --save-temps" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve --save-temps" } */
-#define DO_REG_OPS(TYPE) \
-void ashiftr_##TYPE (signed TYPE* dst, signed TYPE src, int count) \
-{ \
- for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] >> src; \
-} \
-void lshiftr_##TYPE (unsigned TYPE* dst, unsigned TYPE src, int count) \
-{ \
- for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] >> src; \
-} \
-void lshiftl_##TYPE (unsigned TYPE* dst, unsigned TYPE src, int count) \
-{ \
- for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] << src; \
-} \
-void vashiftr_##TYPE (signed TYPE* dst, signed TYPE* src, int count) \
-{ \
- for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] >> src[i]; \
-} \
-void vlshiftr_##TYPE (unsigned TYPE* dst, unsigned TYPE* src, int count) \
-{ \
- for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] >> src[i]; \
-} \
-void vlshiftl_##TYPE (unsigned TYPE* dst, unsigned TYPE* src, int count) \
-{ \
- for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] << src[i]; \
+#include <stdint.h>
+
+#define DO_REG_OPS(TYPE) \
+void ashiftr_##TYPE (TYPE *dst, TYPE src, int count) \
+{ \
+ for (int i = 0; i < count; ++i) \
+ dst[i] = dst[i] >> src; \
+} \
+void lshiftr_##TYPE (u##TYPE *dst, u##TYPE src, int count) \
+{ \
+ for (int i = 0; i < count; ++i) \
+ dst[i] = dst[i] >> src; \
+} \
+void lshiftl_##TYPE (u##TYPE *dst, u##TYPE src, int count) \
+{ \
+ for (int i = 0; i < count; ++i) \
+ dst[i] = dst[i] << src; \
+} \
+void vashiftr_##TYPE (TYPE *dst, TYPE *src, int count) \
+{ \
+ for (int i = 0; i < count; ++i) \
+ dst[i] = dst[i] >> src[i]; \
+} \
+void vlshiftr_##TYPE (u##TYPE *dst, u##TYPE *src, int count) \
+{ \
+ for (int i = 0; i < count; ++i) \
+ dst[i] = dst[i] >> src[i]; \
+} \
+void vlshiftl_##TYPE (u##TYPE *dst, u##TYPE *src, int count) \
+{ \
+ for (int i = 0; i < count; ++i) \
+ dst[i] = dst[i] << src[i]; \
}
-#define DO_IMMEDIATE_OPS(VALUE, TYPE, NAME) \
-void vashiftr_imm_##NAME##_##TYPE (signed TYPE* dst, int count) \
-{ \
- for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] >> VALUE; \
-} \
-void vlshiftr_imm_##NAME##_##TYPE (unsigned TYPE* dst, int count) \
-{ \
- for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] >> VALUE; \
-} \
-void vlshiftl_imm_##NAME##_##TYPE (unsigned TYPE* dst, int count) \
-{ \
- for (int i = 0; i < count; ++i) \
- dst[i] = dst[i] << VALUE; \
+#define DO_IMMEDIATE_OPS(VALUE, TYPE, NAME) \
+void vashiftr_imm_##NAME##_##TYPE (TYPE *dst, int count) \
+{ \
+ for (int i = 0; i < count; ++i) \
+ dst[i] = dst[i] >> VALUE; \
+} \
+void vlshiftr_imm_##NAME##_##TYPE (u##TYPE *dst, int count) \
+{ \
+ for (int i = 0; i < count; ++i) \
+ dst[i] = dst[i] >> VALUE; \
+} \
+void vlshiftl_imm_##NAME##_##TYPE (u##TYPE *dst, int count) \
+{ \
+ for (int i = 0; i < count; ++i) \
+ dst[i] = dst[i] << VALUE; \
}
-DO_REG_OPS (int);
+DO_REG_OPS (int32_t);
+DO_REG_OPS (int64_t);
-DO_IMMEDIATE_OPS (0, char, 0);
-DO_IMMEDIATE_OPS (5, char, 5);
-DO_IMMEDIATE_OPS (7, char, 7);
+DO_IMMEDIATE_OPS (0, int8_t, 0);
+DO_IMMEDIATE_OPS (5, int8_t, 5);
+DO_IMMEDIATE_OPS (7, int8_t, 7);
-DO_IMMEDIATE_OPS (0, short, 0);
-DO_IMMEDIATE_OPS (5, short, 5);
-DO_IMMEDIATE_OPS (15, short, 15);
+DO_IMMEDIATE_OPS (0, int16_t, 0);
+DO_IMMEDIATE_OPS (5, int16_t, 5);
+DO_IMMEDIATE_OPS (15, int16_t, 15);
-DO_IMMEDIATE_OPS (0, int, 0);
-DO_IMMEDIATE_OPS (5, int, 5);
-DO_IMMEDIATE_OPS (31, int, 31);
+DO_IMMEDIATE_OPS (0, int32_t, 0);
+DO_IMMEDIATE_OPS (5, int32_t, 5);
+DO_IMMEDIATE_OPS (31, int32_t, 31);
-DO_IMMEDIATE_OPS (0, long, 0);
-DO_IMMEDIATE_OPS (5, long, 5);
-DO_IMMEDIATE_OPS (63, long, 63);
+DO_IMMEDIATE_OPS (0, int64_t, 0);
+DO_IMMEDIATE_OPS (5, int64_t, 5);
+DO_IMMEDIATE_OPS (63, int64_t, 63);
/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+
/* { dg-final { scan-assembler-times {\tasr\tz[0-9]+\.b, z[0-9]+\.b, #5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.b, z[0-9]+\.b, #5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tlsl\tz[0-9]+\.b, z[0-9]+\.b, #5\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_single_1.c b/gcc/testsuite/gcc.target/aarch64/sve_single_1.c
index e50d7064858..f7aeed06907 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_single_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_single_1.c
@@ -5,25 +5,28 @@
#define N 32
#endif
-#define TEST_LOOP(NAME, TYPE, VALUE) \
+#include <stdint.h>
+
+#define TEST_LOOP(TYPE, VALUE) \
void \
- NAME (TYPE *data) \
+ test_##TYPE (TYPE *data) \
{ \
_Pragma ("omp simd") \
for (int i = 0; i < N / sizeof (TYPE); ++i) \
data[i] = VALUE; \
}
-TEST_LOOP (uc, unsigned char, 1)
-TEST_LOOP (sc, signed char, 2)
-TEST_LOOP (us, unsigned short, 3)
-TEST_LOOP (ss, signed short, 4)
-TEST_LOOP (ui, unsigned int, 5)
-TEST_LOOP (si, signed int, 6)
-TEST_LOOP (ul, unsigned long, 7)
-TEST_LOOP (sl, signed long, 8)
-TEST_LOOP (f, float, 1.0f)
-TEST_LOOP (d, double, 2.0)
+TEST_LOOP (uint8_t, 1)
+TEST_LOOP (int8_t, 2)
+TEST_LOOP (uint16_t, 3)
+TEST_LOOP (int16_t, 4)
+TEST_LOOP (uint32_t, 5)
+TEST_LOOP (int32_t, 6)
+TEST_LOOP (uint64_t, 7)
+TEST_LOOP (int64_t, 8)
+TEST_LOOP (_Float16, 1.0f)
+TEST_LOOP (float, 2.0f)
+TEST_LOOP (double, 3.0)
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.b, #1\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.b, #2\n} 1 } } */
@@ -33,16 +36,17 @@ TEST_LOOP (d, double, 2.0)
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, #6\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #7\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #8\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #1\.0e\+0\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0e\+0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, #15360\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0e\+0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #3\.0e\+0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.b, vl32\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.h, vl16\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.h, vl16\n} 3 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.s, vl8\n} 3 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.d, vl4\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b,} 2 } } */
-/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h,} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h,} 3 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s,} 3 } } */
/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 3 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_single_2.c b/gcc/testsuite/gcc.target/aarch64/sve_single_2.c
index e167782323b..7daea6262d6 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_single_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_single_2.c
@@ -12,16 +12,17 @@
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, #6\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #7\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #8\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #1\.0e\+0\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0e\+0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, #15360\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0e\+0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #3\.0e\+0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.b, vl64\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.h, vl32\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.h, vl32\n} 3 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.s, vl16\n} 3 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.d, vl8\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b,} 2 } } */
-/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h,} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h,} 3 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s,} 3 } } */
/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 3 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_single_3.c b/gcc/testsuite/gcc.target/aarch64/sve_single_3.c
index 8967586bf50..e779d6c50d9 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_single_3.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_single_3.c
@@ -12,16 +12,17 @@
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, #6\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #7\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #8\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #1\.0e\+0\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0e\+0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, #15360\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0e\+0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #3\.0e\+0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.b, vl128\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.h, vl64\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.h, vl64\n} 3 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.s, vl32\n} 3 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.d, vl16\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b,} 2 } } */
-/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h,} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h,} 3 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s,} 3 } } */
/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 3 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_single_4.c b/gcc/testsuite/gcc.target/aarch64/sve_single_4.c
index 99e2284b164..7c8b3015551 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_single_4.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_single_4.c
@@ -12,16 +12,17 @@
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, #6\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #7\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #8\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #1\.0e\+0\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0e\+0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, #15360\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0e\+0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #3\.0e\+0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.b, vl256\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.h, vl128\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.h, vl128\n} 3 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.s, vl64\n} 3 } } */
/* { dg-final { scan-assembler-times {\tptrue\tp[0-7]\.d, vl32\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b,} 2 } } */
-/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h,} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h,} 3 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s,} 3 } } */
/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 3 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_store_scalar_offset_1.c b/gcc/testsuite/gcc.target/aarch64/sve_store_scalar_offset_1.c
index 3fa4f187fa4..3e7367cd9fa 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_store_scalar_offset_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_store_scalar_offset_1.c
@@ -1,53 +1,55 @@
/* { dg-do assemble } */
/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-typedef long v4di __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef char v32qi __attribute__((vector_size(32)));
+#include <stdint.h>
-void sve_store_64_z_lsl (unsigned long *a, unsigned long i)
+typedef int64_t v4di __attribute__((vector_size(32)));
+typedef int32_t v8si __attribute__((vector_size(32)));
+typedef int16_t v16hi __attribute__((vector_size(32)));
+typedef int8_t v32qi __attribute__((vector_size(32)));
+
+void sve_store_64_z_lsl (uint64_t *a, unsigned long i)
{
asm volatile ("" : "=w" (*(v4di *) &a[i]));
}
-void sve_store_64_s_lsl (signed long *a, signed long i)
+void sve_store_64_s_lsl (int64_t *a, signed long i)
{
asm volatile ("" : "=w" (*(v4di *) &a[i]));
}
-void sve_store_32_z_lsl (unsigned int *a, unsigned long i)
+void sve_store_32_z_lsl (uint32_t *a, unsigned long i)
{
asm volatile ("" : "=w" (*(v8si *) &a[i]));
}
-void sve_store_32_s_lsl (signed int *a, signed long i)
+void sve_store_32_s_lsl (int32_t *a, signed long i)
{
asm volatile ("" : "=w" (*(v8si *) &a[i]));
}
-void sve_store_16_z_lsl (unsigned short *a, unsigned long i)
+void sve_store_16_z_lsl (uint16_t *a, unsigned long i)
{
asm volatile ("" : "=w" (*(v16hi *) &a[i]));
}
-void sve_store_16_s_lsl (signed short *a, signed long i)
+void sve_store_16_s_lsl (int16_t *a, signed long i)
{
asm volatile ("" : "=w" (*(v16hi *) &a[i]));
}
/* ??? The other argument order leads to a redundant move. */
-void sve_store_8_z (unsigned long i, unsigned char *a)
+void sve_store_8_z (unsigned long i, uint8_t *a)
{
asm volatile ("" : "=w" (*(v32qi *) &a[i]));
}
-void sve_store_8_s (signed long i, signed char *a)
+void sve_store_8_s (signed long i, int8_t *a)
{
asm volatile ("" : "=w" (*(v32qi *) &a[i]));
}
-/* { dg-final { scan-assembler-times {\tst1d\tz0.d, p[0-7], \[x0, x1, lsl 3\]\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tst1w\tz0.s, p[0-7], \[x0, x1, lsl 2\]\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tst1h\tz0.h, p[0-7], \[x0, x1, lsl 1\]\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tst1b\tz0.b, p[0-7], \[x1, x0\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz0\.d, p[0-7], \[x0, x1, lsl 3\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz0\.s, p[0-7], \[x0, x1, lsl 2\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz0\.h, p[0-7], \[x0, x1, lsl 1\]\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1b\tz0\.b, p[0-7], \[x1, x0\]\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_store_scalar_offset_2.c b/gcc/testsuite/gcc.target/aarch64/sve_store_scalar_offset_2.c
deleted file mode 100644
index 586e9726396..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_store_scalar_offset_2.c
+++ /dev/null
@@ -1,53 +0,0 @@
-/* { dg-do assemble } */
-/* { dg-options "-O3 -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
-
-typedef long v4di __attribute__((vector_size(32)));
-typedef int v8si __attribute__((vector_size(32)));
-typedef short v16hi __attribute__((vector_size(32)));
-typedef char v32qi __attribute__((vector_size(32)));
-
-void sve_store_64_z_lsl (unsigned long *a, unsigned long i)
-{
- asm volatile ("" : "=w" (*(v4di *)&a[i]));
-}
-
-void sve_store_64_s_lsl (signed long *a, signed long i)
-{
- asm volatile ("" : "=w" (*(v4di *)&a[i]));
-}
-
-void sve_store_32_z_lsl (unsigned int *a, unsigned long i)
-{
- asm volatile ("" : "=w" (*(v8si *)&a[i]));
-}
-
-void sve_store_32_s_lsl (signed int *a, signed long i)
-{
- asm volatile ("" : "=w" (*(v8si *)&a[i]));
-}
-
-void sve_store_16_z_lsl (unsigned short *a, unsigned long i)
-{
- asm volatile ("" : "=w" (*(v16hi *)&a[i]));
-}
-
-void sve_store_16_s_lsl (signed short *a, signed long i)
-{
- asm volatile ("" : "=w" (*(v16hi *)&a[i]));
-}
-
-/* ??? The other argument order leads to a redundant move. */
-void sve_store_8_z (unsigned long i, unsigned char *a)
-{
- asm volatile ("" : "=w" (*(v32qi *)&a[i]));
-}
-
-void sve_store_8_s (signed long i, signed char *a)
-{
- asm volatile ("" : "=w" (*(v32qi *)&a[i]));
-}
-
-/* { dg-final { scan-assembler-times "st1d\\tz0.d, p\[0-9\]+, \\\[x0, x1, lsl 3\\\]" 2 } } */
-/* { dg-final { scan-assembler-times "st1w\\tz0.s, p\[0-9\]+, \\\[x0, x1, lsl 2\\\]" 2 } } */
-/* { dg-final { scan-assembler-times "st1h\\tz0.h, p\[0-9\]+, \\\[x0, x1, lsl 1\\\]" 2 } } */
-/* { dg-final { scan-assembler-times "st1b\\tz0.b, p\[0-9\]+, \\\[x1, x0\\\]" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_subr_1.c b/gcc/testsuite/gcc.target/aarch64/sve_subr_1.c
index de4dbe8c6cc..1d8dc76719d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_subr_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_subr_1.c
@@ -1,28 +1,32 @@
/* { dg-do assemble } */
-/* { dg-options "-std=c99 -O3 -march=armv8-a+sve --save-temps" } */
+/* { dg-options "-O3 -march=armv8-a+sve --save-temps" } */
+
+#include <stdint.h>
#define DO_IMMEDIATE_OPS(VALUE, TYPE, NAME) \
-void vsubrarithimm_##NAME##_##TYPE (TYPE* dst, int count) \
+void vsubr_arithimm_##NAME##_##TYPE (TYPE *dst, int count) \
{ \
for (int i = 0; i < count; ++i) \
dst[i] = VALUE - dst[i]; \
}
#define DO_ARITH_OPS(TYPE) \
-DO_IMMEDIATE_OPS (0, TYPE, 0); \
-DO_IMMEDIATE_OPS (5, TYPE, 5); \
-DO_IMMEDIATE_OPS (255, TYPE, 255); \
-DO_IMMEDIATE_OPS (256, TYPE, 256); \
-DO_IMMEDIATE_OPS (257, TYPE, 257); \
-DO_IMMEDIATE_OPS (65280, TYPE, 65280); \
-DO_IMMEDIATE_OPS (65281, TYPE, 65281); \
-DO_IMMEDIATE_OPS (-1, TYPE, minus1);
-
-DO_ARITH_OPS (char)
-DO_ARITH_OPS (int)
-DO_ARITH_OPS (long)
+ DO_IMMEDIATE_OPS (0, TYPE, 0); \
+ DO_IMMEDIATE_OPS (5, TYPE, 5); \
+ DO_IMMEDIATE_OPS (255, TYPE, 255); \
+ DO_IMMEDIATE_OPS (256, TYPE, 256); \
+ DO_IMMEDIATE_OPS (257, TYPE, 257); \
+ DO_IMMEDIATE_OPS (65280, TYPE, 65280); \
+ DO_IMMEDIATE_OPS (65281, TYPE, 65281); \
+ DO_IMMEDIATE_OPS (-1, TYPE, minus1);
+
+DO_ARITH_OPS (int8_t)
+DO_ARITH_OPS (int16_t)
+DO_ARITH_OPS (int32_t)
+DO_ARITH_OPS (int64_t)
/* { dg-final { scan-assembler-not {\tsub\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} } } */
+/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
@@ -35,6 +39,14 @@ DO_ARITH_OPS (long)
/* { dg-final { scan-assembler-not {\tsubr\tz[0-9]+\.b, z[0-9]+\.b, #65281\n} } } */
/* { dg-final { scan-assembler-not {\tsubr\tz[0-9]+\.b, z[0-9]+\.b, #-1\n} } } */
+/* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.h, z[0-9]+\.h, #5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.h, z[0-9]+\.h, #255\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.h, z[0-9]+\.h, #256\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tsubr\tz[0-9]+\.h, z[0-9]+\.h, #257\n} } } */
+/* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.h, z[0-9]+\.h, #65280\n} 1 } } */
+/* { dg-final { scan-assembler-not {\tsubr\tz[0-9]+\.h, z[0-9]+\.h, #65281\n} } } */
+/* { dg-final { scan-assembler-not {\tsubr\tz[0-9]+\.h, z[0-9]+\.h, #-1\n} } } */
+
/* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.s, z[0-9]+\.s, #5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.s, z[0-9]+\.s, #255\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.s, z[0-9]+\.s, #256\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_trn1_1.c b/gcc/testsuite/gcc.target/aarch64/sve_trn1_1.c
index c82f30e9578..0c7b887d232 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_trn1_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_trn1_1.c
@@ -5,12 +5,15 @@
#define BIAS 0
#endif
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define MASK_2(X, Y) X, Y + X
#define MASK_4(X, Y) MASK_2 (X, Y), MASK_2 (X + 2, Y)
@@ -37,7 +40,8 @@ typedef float v8sf __attribute__((vector_size (32)));
T (v16hi, 16) \
T (v32qi, 32) \
T (v4df, 4) \
- T (v8sf, 8)
+ T (v8sf, 8) \
+ T (v16hf, 16)
TEST_ALL (PERMUTE)
@@ -45,5 +49,5 @@ TEST_ALL (PERMUTE)
/* { dg-final { scan-assembler-times {\ttrn1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d} 2 } } */
/* { dg-final { scan-assembler-times {\ttrn1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s} 2 } } */
-/* { dg-final { scan-assembler-times {\ttrn1\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h} 1 } } */
+/* { dg-final { scan-assembler-times {\ttrn1\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h} 2 } } */
/* { dg-final { scan-assembler-times {\ttrn1\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_trn2_1.c b/gcc/testsuite/gcc.target/aarch64/sve_trn2_1.c
index a4b3ea40a21..6654781bbd5 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_trn2_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_trn2_1.c
@@ -8,5 +8,5 @@
/* { dg-final { scan-assembler-times {\ttrn2\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d} 2 } } */
/* { dg-final { scan-assembler-times {\ttrn2\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s} 2 } } */
-/* { dg-final { scan-assembler-times {\ttrn2\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h} 1 } } */
+/* { dg-final { scan-assembler-times {\ttrn2\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h} 2 } } */
/* { dg-final { scan-assembler-times {\ttrn2\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_signed_1.c b/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_signed_1.c
index de010318fd1..c415c4bf5d1 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_signed_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_signed_1.c
@@ -1,7 +1,10 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-void unpack_double_int_plus8 (double *d, signed int *s, int size)
+#include <stdint.h>
+
+void __attribute__ ((noinline, noclone))
+unpack_double_int_plus8 (double *d, int32_t *s, int size)
{
for (int i = 0; i < size; i++)
d[i] = s[i] + 8;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_signed_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_signed_1_run.c
index 083f1b346d3..f8d9cc2b2ca 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_signed_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_signed_1_run.c
@@ -1,9 +1,5 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include "sve_unpack_fcvt_signed_1.c"
@@ -15,19 +11,18 @@ int __attribute__ ((optimize (1)))
main (void)
{
double array_dest[ARRAY_SIZE];
- signed int array_source[ARRAY_SIZE];
+ int32_t array_source[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++)
- array_source[i] = VAL1;
+ {
+ array_source[i] = VAL1;
+ asm volatile ("" ::: "memory");
+ }
unpack_double_int_plus8 (array_dest, array_source, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
- if (array_dest[i] != (float) (VAL1 + 8))
- {
- fprintf (stderr,"%d: %f != %f\n", i, array_dest[i],
- (float) (VAL1 + 8));
- exit (1);
- }
+ if (array_dest[i] != (double) (VAL1 + 8))
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_unsigned_1.c b/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_unsigned_1.c
index cc1b5e576f4..fb9fe810cf9 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_unsigned_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_unsigned_1.c
@@ -1,7 +1,10 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
-void unpack_double_int_plus9 (double *d, unsigned int *s, int size)
+#include <stdint.h>
+
+void __attribute__ ((noinline, noclone))
+unpack_double_int_plus9 (double *d, uint32_t *s, int size)
{
for (int i = 0; i < size; i++)
d[i] = (double) (s[i] + 9);
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_unsigned_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_unsigned_1_run.c
index 1c31e18c410..93788a342ce 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_unsigned_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_unpack_fcvt_unsigned_1_run.c
@@ -1,9 +1,5 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include "sve_unpack_fcvt_unsigned_1.c"
@@ -15,19 +11,18 @@ int __attribute__ ((optimize (1)))
main (void)
{
double array_dest[ARRAY_SIZE];
- unsigned int array_source[ARRAY_SIZE];
+ uint32_t array_source[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++)
- array_source[i] = VAL1;
+ {
+ array_source[i] = VAL1;
+ asm volatile ("" ::: "memory");
+ }
unpack_double_int_plus9 (array_dest, array_source, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
if (array_dest[i] != (double) (VAL1 + 9))
- {
- fprintf (stderr,"%d: %lf != %lf\n", i, array_dest[i],
- (double) (VAL1 + 9));
- exit (1);
- }
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_unpack_float_1.c b/gcc/testsuite/gcc.target/aarch64/sve_unpack_float_1.c
index 86bd60918e2..73c7a815e36 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_unpack_float_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_unpack_float_1.c
@@ -1,7 +1,8 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
-void unpack_float_plus_7point9 (double *d, float *s, int size)
+void __attribute__ ((noinline, noclone))
+unpack_float_plus_7point9 (double *d, float *s, int size)
{
for (int i = 0; i < size; i++)
d[i] = s[i] + 7.9;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_unpack_float_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_unpack_float_1_run.c
index 4e280dd10f9..2a645b33d4b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_unpack_float_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_unpack_float_1_run.c
@@ -1,10 +1,6 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
-
#include "sve_unpack_float_1.c"
#define ARRAY_SIZE 199
@@ -18,16 +14,15 @@ main (void)
float array_source[ARRAY_SIZE];
for (int i = 0; i < ARRAY_SIZE; i++)
- array_source[i] = VAL1;
+ {
+ array_source[i] = VAL1;
+ asm volatile ("" ::: "memory");
+ }
unpack_float_plus_7point9 (array_dest, array_source, ARRAY_SIZE);
for (int i = 0; i < ARRAY_SIZE; i++)
if (array_dest[i] != (double) (VAL1 + 7.9))
- {
- fprintf (stderr,"%d: %f != %f\n", i, array_dest[i],
- (double) (VAL1 + 7.9));
- exit (1);
- }
+ __builtin_abort ();
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_unpack_signed_1.c b/gcc/testsuite/gcc.target/aarch64/sve_unpack_signed_1.c
index e4d0393b047..4d345cf81e9 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_unpack_signed_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_unpack_signed_1.c
@@ -1,20 +1,25 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-#define UNPACK(TYPED, TYPES, SIGND) \
-void unpack_##TYPED##_##TYPES##_##SIGND (SIGND TYPED *d, signed TYPES *s, \
- int size) \
-{ \
- for (int i = 0; i < size; i++) \
- d[i] = s[i] + 1; \
+#include <stdint.h>
+
+#define UNPACK(TYPED, TYPES) \
+void __attribute__ ((noinline, noclone)) \
+unpack_##TYPED##_##TYPES (TYPED *d, TYPES *s, int size) \
+{ \
+ for (int i = 0; i < size; i++) \
+ d[i] = s[i] + 1; \
}
-UNPACK (long, int, signed)
-UNPACK (int, short, signed)
-UNPACK (short, char, signed)
-UNPACK (long, int, unsigned)
-UNPACK (int, short, unsigned)
-UNPACK (short, char, unsigned)
+#define TEST_ALL(T) \
+ T (int64_t, int32_t) \
+ T (int32_t, int16_t) \
+ T (int16_t, int8_t) \
+ T (uint64_t, int32_t) \
+ T (uint32_t, int16_t) \
+ T (uint16_t, int8_t)
+
+TEST_ALL (UNPACK)
/* { dg-final { scan-assembler-times {\tsunpkhi\tz[0-9]+\.d, z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tsunpkhi\tz[0-9]+\.s, z[0-9]+\.h\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_unpack_signed_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_unpack_signed_1_run.c
index b63aa5b7d1d..d183408d124 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_unpack_signed_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_unpack_signed_1_run.c
@@ -1,46 +1,28 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
-
#include "sve_unpack_signed_1.c"
#define ARRAY_SIZE 33
-#define RUN_AND_CHECK_LOOP(TYPED, TYPES, VALUED, VALUES) \
-{ \
- int value = 0; \
- TYPED arrayd[ARRAY_SIZE]; \
- TYPES arrays[ARRAY_SIZE]; \
- memset (arrayd, 67, ARRAY_SIZE * sizeof (TYPED)); \
- memset (arrays, VALUES, ARRAY_SIZE * sizeof (TYPES)); \
- unpack_##TYPED##_##TYPES##_signed (arrayd, arrays, ARRAY_SIZE); \
- for (int i = 0; i < ARRAY_SIZE; i++) \
- if (arrayd[i] != VALUED) \
- { \
- fprintf (stderr,"%d: %d != %d\n", i, arrayd[i], VALUED); \
- exit (1); \
- } \
- memset (arrayd, 74, ARRAY_SIZE * sizeof (TYPED)); \
- unpack_##TYPED##_##TYPES##_unsigned (arrayd, arrays, ARRAY_SIZE); \
- for (int i = 0; i < ARRAY_SIZE; i++) \
- if (arrayd[i] != VALUED) \
- { \
- fprintf (stderr,"%d: %d != %d\n", i, arrayd[i], VALUED); \
- exit (1); \
- } \
-}
+#define TEST_LOOP(TYPED, TYPES) \
+ { \
+ TYPED arrayd[ARRAY_SIZE]; \
+ TYPES arrays[ARRAY_SIZE]; \
+ for (int i = 0; i < ARRAY_SIZE; i++) \
+ { \
+ arrays[i] = (i - 10) * 3; \
+ asm volatile ("" ::: "memory"); \
+ } \
+ unpack_##TYPED##_##TYPES (arrayd, arrays, ARRAY_SIZE); \
+ for (int i = 0; i < ARRAY_SIZE; i++) \
+ if (arrayd[i] != (TYPED) ((TYPES) ((i - 10) * 3) + 1)) \
+ __builtin_abort (); \
+ }
-int main (void)
+int __attribute__ ((optimize (1)))
+main (void)
{
- int total = 5;
- RUN_AND_CHECK_LOOP (short, char, total+1, total);
- total = (total << 8) + 5;
- RUN_AND_CHECK_LOOP (int, short, total+1, total);
- total = (total << 8) + 5;
- total = (total << 8) + 5;
- RUN_AND_CHECK_LOOP (long, int, total+1, total);
+ TEST_ALL (TEST_LOOP)
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_unpack_unsigned_1.c b/gcc/testsuite/gcc.target/aarch64/sve_unpack_unsigned_1.c
index 94192d977f4..fa8de963264 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_unpack_unsigned_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_unpack_unsigned_1.c
@@ -1,20 +1,25 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-#define UNPACK(TYPED, TYPES, SIGND) \
-void unpack_##TYPED##_##TYPES##_##SIGND (SIGND TYPED *d, unsigned TYPES *s, \
- int size) \
-{ \
- for (int i = 0; i < size; i++) \
- d[i] = s[i] + 1; \
+#include <stdint.h>
+
+#define UNPACK(TYPED, TYPES) \
+void __attribute__ ((noinline, noclone)) \
+unpack_##TYPED##_##TYPES (TYPED *d, TYPES *s, int size) \
+{ \
+ for (int i = 0; i < size; i++) \
+ d[i] = s[i] + 1; \
}
-UNPACK (long, int, signed) \
-UNPACK (int, short, signed) \
-UNPACK (short, char, signed) \
-UNPACK (long, int, unsigned) \
-UNPACK (int, short, unsigned) \
-UNPACK (short, char, unsigned)
+#define TEST_ALL(T) \
+ T (int64_t, uint32_t) \
+ T (int32_t, uint16_t) \
+ T (int16_t, uint8_t) \
+ T (uint64_t, uint32_t) \
+ T (uint32_t, uint16_t) \
+ T (uint16_t, uint8_t)
+
+TEST_ALL (UNPACK)
/* { dg-final { scan-assembler-times {\tuunpkhi\tz[0-9]+\.d, z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tuunpkhi\tz[0-9]+\.s, z[0-9]+\.h\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_unpack_unsigned_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_unpack_unsigned_1_run.c
index 33f5f939c84..3fa66220f17 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_unpack_unsigned_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_unpack_unsigned_1_run.c
@@ -1,46 +1,28 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
-#include <string.h>
-#include <stdio.h>
-#include <stdlib.h>
-
#include "sve_unpack_unsigned_1.c"
#define ARRAY_SIZE 85
-#define RUN_AND_CHECK_LOOP(TYPED, TYPES, VALUED, VALUES) \
-{ \
- int value = 0; \
- TYPED arrayd[ARRAY_SIZE]; \
- TYPES arrays[ARRAY_SIZE]; \
- memset (arrayd, 67, ARRAY_SIZE * sizeof (TYPED)); \
- memset (arrays, VALUES, ARRAY_SIZE * sizeof (TYPES)); \
- unpack_##TYPED##_##TYPES##_signed (arrayd, arrays, ARRAY_SIZE); \
- for (int i = 0; i < ARRAY_SIZE; i++) \
- if (arrayd[i] != VALUED) \
- { \
- fprintf (stderr,"%d: %d != %d\n", i, arrayd[i], VALUED); \
- exit (1); \
- } \
- memset (arrayd, 74, ARRAY_SIZE * sizeof (TYPED)); \
- unpack_##TYPED##_##TYPES##_unsigned (arrayd, arrays, ARRAY_SIZE); \
- for (int i = 0; i < ARRAY_SIZE; i++) \
- if (arrayd[i] != VALUED) \
- { \
- fprintf (stderr,"%d: %d != %d\n", i, arrayd[i], VALUED); \
- exit (1); \
- } \
-}
+#define TEST_LOOP(TYPED, TYPES) \
+ { \
+ TYPED arrayd[ARRAY_SIZE]; \
+ TYPES arrays[ARRAY_SIZE]; \
+ for (int i = 0; i < ARRAY_SIZE; i++) \
+ { \
+ arrays[i] = (i - 10) * 3; \
+ asm volatile ("" ::: "memory"); \
+ } \
+ unpack_##TYPED##_##TYPES (arrayd, arrays, ARRAY_SIZE); \
+ for (int i = 0; i < ARRAY_SIZE; i++) \
+ if (arrayd[i] != (TYPED) ((TYPES) ((i - 10) * 3) + 1)) \
+ __builtin_abort (); \
+ }
-int main (void)
+int __attribute__ ((optimize (1)))
+main (void)
{
- int total = 5;
- RUN_AND_CHECK_LOOP (short, char, total + 1, total);
- total = (total << 8) + 5;
- RUN_AND_CHECK_LOOP (int, short, total + 1, total);
- total = (total << 8) + 5;
- total = (total << 8) + 5;
- RUN_AND_CHECK_LOOP (long, int, total + 1, total);
+ TEST_ALL (TEST_LOOP)
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_uzp1_1.c b/gcc/testsuite/gcc.target/aarch64/sve_uzp1_1.c
index 22fc84f066c..aaa4fdccbf0 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_uzp1_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_uzp1_1.c
@@ -1,12 +1,15 @@
/* { dg-do compile } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" } */
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define UZP1(TYPE, MASK) \
TYPE uzp1_##TYPE (TYPE values1, TYPE values2) \
@@ -25,6 +28,8 @@ UZP1 (v32qi, ((v32qi) { 0, 2, 4, 6, 8, 10, 12, 14,
48, 50, 52, 54, 56, 58, 60, 62 }));
UZP1 (v4df, ((v4di) { 0, 2, 4, 6 }));
UZP1 (v8sf, ((v8si) { 0, 2, 4, 6, 8, 10, 12, 14 }));
+UZP1 (v16hf, ((v16hi) { 0, 2, 4, 6, 8, 10, 12, 14,
+ 16, 18, 20, 22, 24, 26, 28, 30 }));
/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} } } */
/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} } } */
@@ -33,5 +38,5 @@ UZP1 (v8sf, ((v8si) { 0, 2, 4, 6, 8, 10, 12, 14 }));
/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_uzp1_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_uzp1_1_run.c
index 338670c03af..d35dad0ffca 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_uzp1_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_uzp1_1_run.c
@@ -2,7 +2,6 @@
/* { dg-options "-O -march=armv8-a+sve" } */
#include "sve_uzp1_1.c"
-extern void abort (void);
#define TEST_UZP1(TYPE, EXPECTED_RESULT, VALUES1, VALUES2) \
{ \
@@ -12,7 +11,7 @@ extern void abort (void);
TYPE dest; \
dest = uzp1_##TYPE (values1, values2); \
if (__builtin_memcmp (&dest, &expected_result, sizeof (TYPE)) != 0) \
- abort (); \
+ __builtin_abort (); \
}
int main (void)
@@ -53,5 +52,12 @@ int main (void)
((v8sf) { 3.0, 5.0, 7.0, 9.0, 33.0, 35.0, 37.0, 39.0 }),
((v8sf) { 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0 }),
((v8sf) { 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0 }));
+ TEST_UZP1 (v16hf,
+ ((v16hf) { 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0,
+ 33.0, 35.0, 37.0, 39.0, 41.0, 43.0, 45.0, 47.0 }),
+ ((v16hf) { 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0,
+ 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0 }),
+ ((v16hf) { 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0,
+ 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0 }));
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_uzp2_1.c b/gcc/testsuite/gcc.target/aarch64/sve_uzp2_1.c
index 39c8ff43368..1bb84d80eb0 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_uzp2_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_uzp2_1.c
@@ -1,12 +1,15 @@
/* { dg-do compile } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" } */
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define UZP2(TYPE, MASK) \
TYPE uzp2_##TYPE (TYPE values1, TYPE values2) \
@@ -24,6 +27,8 @@ UZP2 (v32qi, ((v32qi) { 1, 3, 5, 7, 9, 11, 13, 15,
49, 51, 53, 55, 57, 59, 61, 63 }));
UZP2 (v4df, ((v4di) { 1, 3, 5, 7 }));
UZP2 (v8sf, ((v8si) { 1, 3, 5, 7, 9, 11, 13, 15 }));
+UZP2 (v16hf, ((v16hi) { 1, 3, 5, 7, 9, 11, 13, 15,
+ 17, 19, 21, 23, 25, 27, 29, 31 }));
/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} } } */
/* { dg-final { scan-assembler-not {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} } } */
@@ -32,5 +37,5 @@ UZP2 (v8sf, ((v8si) { 1, 3, 5, 7, 9, 11, 13, 15 }));
/* { dg-final { scan-assembler-times {\tuzp2\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
/* { dg-final { scan-assembler-times {\tuzp2\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tuzp2\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tuzp2\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
/* { dg-final { scan-assembler-times {\tuzp2\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_uzp2_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_uzp2_1_run.c
index b9b8cccfafe..d7a241c1258 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_uzp2_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_uzp2_1_run.c
@@ -2,7 +2,6 @@
/* { dg-options "-O -march=armv8-a+sve" } */
#include "sve_uzp2_1.c"
-extern void abort (void);
#define TEST_UZP2(TYPE, EXPECTED_RESULT, VALUES1, VALUES2) \
{ \
@@ -12,7 +11,7 @@ extern void abort (void);
TYPE dest; \
dest = uzp2_##TYPE (values1, values2); \
if (__builtin_memcmp (&dest, &expected_result, sizeof (TYPE)) != 0) \
- abort (); \
+ __builtin_abort (); \
}
int main (void)
@@ -53,5 +52,12 @@ int main (void)
((v8sf) { 4.0, 6.0, 8.0, 10.0, 34.0, 36.0, 38.0, 40.0 }),
((v8sf) { 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0 }),
((v8sf) { 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0 }));
+ TEST_UZP2 (v16hf,
+ ((v16hf) { 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0,
+ 34.0, 36.0, 38.0, 40.0, 42.0, 44.0, 46.0, 48.0 }),
+ ((v16hf) { 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0,
+ 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0 }),
+ ((v16hf) { 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0,
+ 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0 }));
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vcond_1.C b/gcc/testsuite/gcc.target/aarch64/sve_vcond_1.C
index 48ad92d0ab7..9be09546c80 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vcond_1.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vcond_1.C
@@ -1,5 +1,5 @@
-/* { dg-do compile { target { ! *-*-* } } } */
-/* { dg-options "-std=c++11 -O3 -fno-inline -march=armv8-a+sve" } */
+/* { dg-do assemble } */
+/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256 --save-temps" } */
#include <stdint.h>
@@ -13,231 +13,233 @@ typedef uint16_t v16hu __attribute__((vector_size(32)));
typedef uint32_t v8su __attribute__((vector_size(32)));
typedef uint64_t v4du __attribute__((vector_size(32)));
-#define NUM_ELEMS(TYPE) (sizeof (r_##TYPE) / sizeof (r_##TYPE[0]))
-
-#define DEF_VCOND(TYPE,COND,SUFFIX) \
-TYPE vcond_##TYPE##SUFFIX (TYPE x, TYPE y, TYPE a, TYPE b) \
+#define DEF_VCOND_VAR(TYPE, COND, SUFFIX) \
+TYPE vcond_##TYPE##_##SUFFIX (TYPE x, TYPE y, TYPE a, TYPE b) \
{ \
TYPE r; \
r = a COND b ? x : y; \
return r; \
}
-#define DEF_VCOND_IMM(TYPE,COND,IMM,SUFFIX) \
-TYPE vcond_imm_##TYPE##SUFFIX (TYPE x, TYPE y, TYPE a) \
-{ \
- TYPE r; \
- r = a COND IMM ? x : y; \
- return r; \
+#define DEF_VCOND_IMM(TYPE, COND, IMM, SUFFIX) \
+TYPE vcond_imm_##TYPE##_##SUFFIX (TYPE x, TYPE y, TYPE a) \
+{ \
+ TYPE r; \
+ r = a COND IMM ? x : y; \
+ return r; \
}
-#define DEF_VCOND_SIGNED_ALL(COND,SUFFIX) \
-DEF_VCOND (v32qi,COND,SUFFIX) \
-DEF_VCOND (v16hi,COND,SUFFIX) \
-DEF_VCOND (v8si,COND,SUFFIX) \
-DEF_VCOND (v4di,COND,SUFFIX)
-
-#define DEF_VCOND_UNSIGNED_ALL(COND,SUFFIX) \
-DEF_VCOND (v32qu,COND,SUFFIX) \
-DEF_VCOND (v16hu,COND,SUFFIX) \
-DEF_VCOND (v8su,COND,SUFFIX) \
-DEF_VCOND (v4du,COND,SUFFIX)
-
-#define DEF_VCOND_ALL(COND,SUFFIX) \
-DEF_VCOND_SIGNED_ALL (COND,SUFFIX) \
-DEF_VCOND_UNSIGNED_ALL (COND,SUFFIX)
-
-#define DEF_VCOND_IMM_SIGNED_ALL(COND,IMM,SUFFIX) \
-DEF_VCOND_IMM (v32qi,COND,IMM,SUFFIX) \
-DEF_VCOND_IMM (v16hi,COND,IMM,SUFFIX) \
-DEF_VCOND_IMM (v8si,COND,IMM,SUFFIX) \
-DEF_VCOND_IMM (v4di,COND,IMM,SUFFIX)
-
-#define DEF_VCOND_IMM_UNSIGNED_ALL(COND,IMM,SUFFIX) \
-DEF_VCOND_IMM (v32qu,COND,IMM,SUFFIX) \
-DEF_VCOND_IMM (v16hu,COND,IMM,SUFFIX) \
-DEF_VCOND_IMM (v8su,COND,IMM,SUFFIX) \
-DEF_VCOND_IMM (v4du,COND,IMM,SUFFIX)
-
-#define DEF_VCOND_IMM_ALL(COND,IMM,SUFFIX) \
-DEF_VCOND_IMM_SIGNED_ALL (COND,IMM,SUFFIX) \
-DEF_VCOND_IMM_UNSIGNED_ALL (COND,IMM,SUFFIX)
-
-DEF_VCOND_ALL (>, _gt)
-DEF_VCOND_ALL (<, _lt)
-DEF_VCOND_ALL (>=, _ge)
-DEF_VCOND_ALL (<=, _le)
-DEF_VCOND_ALL (==, _eq)
-DEF_VCOND_ALL (!=, _ne)
-
-/* == Expect immediates to make it into the encoding == */
-
-DEF_VCOND_IMM_ALL (>, 5, _gt)
-DEF_VCOND_IMM_ALL (<, 5, _lt)
-DEF_VCOND_IMM_ALL (>=, 5, _ge)
-DEF_VCOND_IMM_ALL (<=, 5, _le)
-DEF_VCOND_IMM_ALL (==, 5, _eq)
-DEF_VCOND_IMM_ALL (!=, 5, _ne)
-
-DEF_VCOND_IMM_SIGNED_ALL (>, 15, _gt2)
-DEF_VCOND_IMM_SIGNED_ALL (<, 15, _lt2)
-DEF_VCOND_IMM_SIGNED_ALL (>=, 15, _ge2)
-DEF_VCOND_IMM_SIGNED_ALL (<=, 15, _le2)
-DEF_VCOND_IMM_SIGNED_ALL (==, 15, _eq2)
-DEF_VCOND_IMM_SIGNED_ALL (!=, 15, _ne2)
-
-DEF_VCOND_IMM_SIGNED_ALL (>, -16, _gt3)
-DEF_VCOND_IMM_SIGNED_ALL (<, -16, _lt3)
-DEF_VCOND_IMM_SIGNED_ALL (>=, -16, _ge3)
-DEF_VCOND_IMM_SIGNED_ALL (<=, -16, _le3)
-DEF_VCOND_IMM_SIGNED_ALL (==, -16, _eq3)
-DEF_VCOND_IMM_SIGNED_ALL (!=, -16, _ne3)
-
-DEF_VCOND_IMM_UNSIGNED_ALL (>, 0, _gt4)
-/* Testing if an unsigned value >= 0 or < 0 is pointless as it will
- get folded away by the compiler. */
-DEF_VCOND_IMM_UNSIGNED_ALL (<=, 0, _le4)
-
-DEF_VCOND_IMM_UNSIGNED_ALL (>, 31, _gt5)
-DEF_VCOND_IMM_UNSIGNED_ALL (<, 31, _lt5)
-DEF_VCOND_IMM_UNSIGNED_ALL (>=, 31, _ge5)
-DEF_VCOND_IMM_UNSIGNED_ALL (<=, 31, _le5)
-
-/* Expect immediates to NOT make it into the encoding, and instead be
- forced into a register. == */
-DEF_VCOND_IMM_ALL (>, 32, _gt6)
-DEF_VCOND_IMM_ALL (<, 32, _lt6)
-DEF_VCOND_IMM_ALL (>=, 32, _ge6)
-DEF_VCOND_IMM_ALL (<=, 32, _le6)
-DEF_VCOND_IMM_ALL (==, 32, _eq6)
-DEF_VCOND_IMM_ALL (!=, 32, _ne6)
-
-/* { dg-final { scan-assembler {\tsel\tz[0-9]+.b, p[0-7], z[0-9]+\.b, z[0-9]+\.b\n} } } */
-/* { dg-final { scan-assembler {\tsel\tz[0-9]+.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} } } */
-/* { dg-final { scan-assembler {\tsel\tz[0-9]+.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} } } */
-/* { dg-final { scan-assembler {\tsel\tz[0-9]+.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
-
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
-
-/* { dg-final { scan-assembler {\tcmphs\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
-/* { dg-final { scan-assembler {\tcmphs\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
-/* { dg-final { scan-assembler {\tcmphs\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
-/* { dg-final { scan-assembler {\tcmphs\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
-
-
-
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
-
-/* { dg-final { scan-assembler {\tcmplt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmplt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmplt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmplt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
-
-/* { dg-final { scan-assembler {\tcmple\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmple\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmple\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmple\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpgt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
-
-/* { dg-final { scan-assembler {\tcmplt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmplt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmplt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmplt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpge\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
-
-/* { dg-final { scan-assembler {\tcmple\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmple\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmple\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmple\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpeq\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
-/* { dg-final { scan-assembler {\tcmpne\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
-
-
-
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #0\n} } } */
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #0\n} } } */
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #0\n} } } */
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #0\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpls\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #0\n} } } */
-/* { dg-final { scan-assembler {\tcmpls\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #0\n} } } */
-/* { dg-final { scan-assembler {\tcmpls\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #0\n} } } */
-/* { dg-final { scan-assembler {\tcmpls\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #0\n} } } */
-
-
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmphi\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #31\n} } } */
-
-/* { dg-final { scan-assembler {\tcmplo\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmplo\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmplo\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmplo\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #31\n} } } */
-
-/* { dg-final { scan-assembler {\tcmphs\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmphs\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmphs\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmphs\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #31\n} } } */
-
-/* { dg-final { scan-assembler {\tcmpls\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmpls\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmpls\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #31\n} } } */
-/* { dg-final { scan-assembler {\tcmpls\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #31\n} } } */
+#define TEST_COND_VAR_SIGNED_ALL(T, COND, SUFFIX) \
+ T (v32qi, COND, SUFFIX) \
+ T (v16hi, COND, SUFFIX) \
+ T (v8si, COND, SUFFIX) \
+ T (v4di, COND, SUFFIX)
+
+#define TEST_COND_VAR_UNSIGNED_ALL(T, COND, SUFFIX) \
+ T (v32qu, COND, SUFFIX) \
+ T (v16hu, COND, SUFFIX) \
+ T (v8su, COND, SUFFIX) \
+ T (v4du, COND, SUFFIX)
+
+#define TEST_COND_VAR_ALL(T, COND, SUFFIX) \
+ TEST_COND_VAR_SIGNED_ALL (T, COND, SUFFIX) \
+ TEST_COND_VAR_UNSIGNED_ALL (T, COND, SUFFIX)
+
+#define TEST_VAR_ALL(T) \
+ TEST_COND_VAR_ALL (T, >, gt) \
+ TEST_COND_VAR_ALL (T, <, lt) \
+ TEST_COND_VAR_ALL (T, >=, ge) \
+ TEST_COND_VAR_ALL (T, <=, le) \
+ TEST_COND_VAR_ALL (T, ==, eq) \
+ TEST_COND_VAR_ALL (T, !=, ne)
+
+#define TEST_COND_IMM_SIGNED_ALL(T, COND, IMM, SUFFIX) \
+ T (v32qi, COND, IMM, SUFFIX) \
+ T (v16hi, COND, IMM, SUFFIX) \
+ T (v8si, COND, IMM, SUFFIX) \
+ T (v4di, COND, IMM, SUFFIX)
+
+#define TEST_COND_IMM_UNSIGNED_ALL(T, COND, IMM, SUFFIX) \
+ T (v32qu, COND, IMM, SUFFIX) \
+ T (v16hu, COND, IMM, SUFFIX) \
+ T (v8su, COND, IMM, SUFFIX) \
+ T (v4du, COND, IMM, SUFFIX)
+
+#define TEST_COND_IMM_ALL(T, COND, IMM, SUFFIX) \
+ TEST_COND_IMM_SIGNED_ALL (T, COND, IMM, SUFFIX) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, COND, IMM, SUFFIX)
+
+#define TEST_IMM_ALL(T) \
+ /* Expect immediates to make it into the encoding. */ \
+ TEST_COND_IMM_ALL (T, >, 5, gt) \
+ TEST_COND_IMM_ALL (T, <, 5, lt) \
+ TEST_COND_IMM_ALL (T, >=, 5, ge) \
+ TEST_COND_IMM_ALL (T, <=, 5, le) \
+ TEST_COND_IMM_ALL (T, ==, 5, eq) \
+ TEST_COND_IMM_ALL (T, !=, 5, ne) \
+ \
+ TEST_COND_IMM_SIGNED_ALL (T, >, 15, gt2) \
+ TEST_COND_IMM_SIGNED_ALL (T, <, 15, lt2) \
+ TEST_COND_IMM_SIGNED_ALL (T, >=, 15, ge2) \
+ TEST_COND_IMM_SIGNED_ALL (T, <=, 15, le2) \
+ TEST_COND_IMM_SIGNED_ALL (T, ==, 15, eq2) \
+ TEST_COND_IMM_SIGNED_ALL (T, !=, 15, ne2) \
+ \
+ TEST_COND_IMM_SIGNED_ALL (T, >, -16, gt3) \
+ TEST_COND_IMM_SIGNED_ALL (T, <, -16, lt3) \
+ TEST_COND_IMM_SIGNED_ALL (T, >=, -16, ge3) \
+ TEST_COND_IMM_SIGNED_ALL (T, <=, -16, le3) \
+ TEST_COND_IMM_SIGNED_ALL (T, ==, -16, eq3) \
+ TEST_COND_IMM_SIGNED_ALL (T, !=, -16, ne3) \
+ \
+ TEST_COND_IMM_UNSIGNED_ALL (T, >, 0, gt4) \
+ /* Testing if an unsigned value >= 0 or < 0 is pointless as it will \
+ get folded away by the compiler. */ \
+ TEST_COND_IMM_UNSIGNED_ALL (T, <=, 0, le4) \
+ \
+ TEST_COND_IMM_UNSIGNED_ALL (T, >, 31, gt5) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, <, 31, lt5) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, >=, 31, ge5) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, <=, 31, le5) \
+ \
+ /* Expect immediates to NOT make it into the encoding, and instead be \
+ forced into a register. */ \
+ TEST_COND_IMM_ALL (T, >, 32, gt6) \
+ TEST_COND_IMM_ALL (T, <, 32, lt6) \
+ TEST_COND_IMM_ALL (T, >=, 32, ge6) \
+ TEST_COND_IMM_ALL (T, <=, 32, le6) \
+ TEST_COND_IMM_ALL (T, ==, 32, eq6) \
+ TEST_COND_IMM_ALL (T, !=, 32, ne6)
+
+TEST_VAR_ALL (DEF_VCOND_VAR)
+TEST_IMM_ALL (DEF_VCOND_IMM)
+
+/* { dg-final { scan-assembler {\tsel\tz[0-9]+\.b, p[0-7], z[0-9]+\.b, z[0-9]+\.b\n} } } */
+/* { dg-final { scan-assembler {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} } } */
+/* { dg-final { scan-assembler {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} } } */
+/* { dg-final { scan-assembler {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
+
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
+
+/* { dg-final { scan-assembler {\tcmphs\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
+/* { dg-final { scan-assembler {\tcmphs\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
+/* { dg-final { scan-assembler {\tcmphs\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
+/* { dg-final { scan-assembler {\tcmphs\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} } } */
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} } } */
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} } } */
+
+
+
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
+
+/* { dg-final { scan-assembler {\tcmplt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmplt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmplt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmplt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
+
+/* { dg-final { scan-assembler {\tcmple\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmple\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmple\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmple\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #15\n} } } */
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #15\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpgt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
+
+/* { dg-final { scan-assembler {\tcmplt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmplt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmplt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmplt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpge\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
+
+/* { dg-final { scan-assembler {\tcmple\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmple\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmple\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmple\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpeq\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-16\n} } } */
+/* { dg-final { scan-assembler {\tcmpne\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-16\n} } } */
+
+
+
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #0\n} } } */
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #0\n} } } */
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #0\n} } } */
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #0\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpls\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #0\n} } } */
+/* { dg-final { scan-assembler {\tcmpls\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #0\n} } } */
+/* { dg-final { scan-assembler {\tcmpls\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #0\n} } } */
+/* { dg-final { scan-assembler {\tcmpls\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #0\n} } } */
+
+
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmphi\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #31\n} } } */
+
+/* { dg-final { scan-assembler {\tcmplo\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmplo\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmplo\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmplo\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #31\n} } } */
+
+/* { dg-final { scan-assembler {\tcmphs\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmphs\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmphs\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmphs\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #31\n} } } */
+
+/* { dg-final { scan-assembler {\tcmpls\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmpls\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmpls\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #31\n} } } */
+/* { dg-final { scan-assembler {\tcmpls\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #31\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vcond_1_run.C b/gcc/testsuite/gcc.target/aarch64/sve_vcond_1_run.C
index e2b1c62a667..42e09d94393 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vcond_1_run.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vcond_1_run.C
@@ -1,100 +1,46 @@
-/* { dg-do run { target { ! *-*-* } } } */
-/* { dg-options "-std=c++11 -O3 -fno-inline -march=armv8-a+sve" } */
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O -march=armv8-a+sve" } */
+/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" { target aarch64_sve256_hw } } */
#include "sve_vcond_1.C"
-#include <stdlib.h>
+#define NUM_ELEMS(X) (sizeof (X) / sizeof (X[0]))
-#define TEST_VCOND(TYPE,COND,SUFFIX) \
+#define TEST_VCOND_VAR(TYPE, COND, SUFFIX) \
{ \
- TYPE x = { 1 }, y = { 2 }, a = { 3 }, b = { 4 }; \
- r_##TYPE += vcond_##TYPE##SUFFIX (x, y, a, b); \
+ TYPE x, y, a, b; \
+ for (int i = 0; i < NUM_ELEMS (x); ++i) \
+ { \
+ a[i] = i - 2; \
+ b[i] = NUM_ELEMS (x) - 2 - i; \
+ x[i] = i * 2; \
+ y[i] = -i * 3; \
+ } \
+ TYPE r = vcond_##TYPE##_##SUFFIX (x, y, a, b); \
+ for (int i = 0; i < NUM_ELEMS (x); ++i) \
+ if (r[i] != (a[i] COND b[i] ? x[i] : y[i])) \
+ __builtin_abort (); \
}
-#define TEST_VCOND_IMM(TYPE,COND,IMM,SUFFIX) \
+#define TEST_VCOND_IMM(TYPE, COND, IMM, SUFFIX) \
{ \
- TYPE x = { 1 }, y = { 2 }, a = { 3 }; \
- r_##TYPE += vcond_imm_##TYPE##SUFFIX (x, y, a); \
+ TYPE x, y, a; \
+ for (int i = 0; i < NUM_ELEMS (x); ++i) \
+ { \
+ a[i] = IMM - 2 + i; \
+ x[i] = i * 2; \
+ y[i] = -i * 3; \
+ } \
+ TYPE r = vcond_imm_##TYPE##_##SUFFIX (x, y, a); \
+ for (int i = 0; i < NUM_ELEMS (x); ++i) \
+ if (r[i] != (a[i] COND IMM ? x[i] : y[i])) \
+ __builtin_abort (); \
}
-#define TEST_VCOND_SIGNED_ALL(COND, SUFFIX) \
-TEST_VCOND (v32qi, COND, SUFFIX) \
-TEST_VCOND (v16hi, COND, SUFFIX) \
-TEST_VCOND (v8si, COND, SUFFIX) \
-TEST_VCOND (v4di, COND, SUFFIX)
-
-#define TEST_VCOND_UNSIGNED_ALL(COND, SUFFIX) \
-TEST_VCOND (v32qu, COND, SUFFIX) \
-TEST_VCOND (v16hu, COND, SUFFIX) \
-TEST_VCOND (v8su, COND, SUFFIX) \
-TEST_VCOND (v4du, COND, SUFFIX)
-
-#define TEST_VCOND_ALL(COND, SUFFIX) \
-TEST_VCOND_SIGNED_ALL (COND, SUFFIX) \
-TEST_VCOND_UNSIGNED_ALL(COND, SUFFIX)
-
-#define TEST_VCOND_IMM_SIGNED_ALL(COND, IMM, SUFFIX) \
-TEST_VCOND_IMM (v32qi, COND, IMM, SUFFIX) \
-TEST_VCOND_IMM (v16hi, COND, IMM, SUFFIX) \
-TEST_VCOND_IMM (v8si, COND, IMM, SUFFIX) \
-TEST_VCOND_IMM (v4di, COND, IMM, SUFFIX)
-
-#define TEST_VCOND_IMM_UNSIGNED_ALL(COND, IMM, SUFFIX) \
-TEST_VCOND_IMM (v32qu, COND, IMM, SUFFIX) \
-TEST_VCOND_IMM (v16hu, COND, IMM, SUFFIX) \
-TEST_VCOND_IMM (v8su, COND, IMM, SUFFIX) \
-TEST_VCOND_IMM (v4du, COND, IMM, SUFFIX)
-
-#define TEST_VCOND_IMM_ALL(COND,IMM,SUFFIX) \
-TEST_VCOND_IMM_SIGNED_ALL (COND,IMM,SUFFIX) \
-TEST_VCOND_IMM_UNSIGNED_ALL (COND,IMM,SUFFIX)
-
-#define DEF_INIT_VECTOR(TYPE) \
- TYPE r_##TYPE; \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++ ) \
- r_##TYPE[i] = i * 3;
-
-#define SUM_VECTOR(VAL,TYPE) \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++ ) \
- VAL += r_##TYPE[i];
-
int main (int argc, char **argv)
{
- int result = 0;
- DEF_INIT_VECTOR (v32qi)
- DEF_INIT_VECTOR (v16hi)
- DEF_INIT_VECTOR (v8si)
- DEF_INIT_VECTOR (v4di)
- DEF_INIT_VECTOR (v32qu)
- DEF_INIT_VECTOR (v16hu)
- DEF_INIT_VECTOR (v8su)
- DEF_INIT_VECTOR (v4du)
-
- TEST_VCOND_ALL (>, _gt)
- TEST_VCOND_ALL (<, _lt)
- TEST_VCOND_ALL (>=, _ge)
- TEST_VCOND_ALL (<=, _le)
- TEST_VCOND_ALL (==, _eq)
- TEST_VCOND_ALL (!=, _ne)
-
- TEST_VCOND_IMM_ALL (>, 5, _gt)
- TEST_VCOND_IMM_ALL (<, 5, _lt)
- TEST_VCOND_IMM_ALL (>=, 5, _ge)
- TEST_VCOND_IMM_ALL (<=, 5, _le)
- TEST_VCOND_IMM_ALL (==, 5, _eq)
- TEST_VCOND_IMM_ALL (!=, 5, _ne)
-
- SUM_VECTOR (result, v32qi)
- SUM_VECTOR (result, v16hi)
- SUM_VECTOR (result, v8si)
- SUM_VECTOR (result, v4di)
- SUM_VECTOR (result, v32qu)
- SUM_VECTOR (result, v16hu)
- SUM_VECTOR (result, v8su)
- SUM_VECTOR (result, v4du)
-
- if (result != 4044)
- abort ();
+ TEST_VAR_ALL (TEST_VCOND_VAR)
+ TEST_IMM_ALL (TEST_VCOND_IMM)
return 0;
}
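
For reference, a single TEST_VCOND_VAR instantiation in the rewritten runner boils down to roughly the block below (an illustrative sketch, not text from the patch).  It assumes v8si is the 32-byte GNU vector type defined in sve_vcond_1.C, e.g. int32_t __attribute__ ((vector_size (32))); the exact typedef sits outside this hunk.

  /* Rough expansion of TEST_VCOND_VAR (v8si, >, gt).  */
  {
    v8si x, y, a, b;
    for (int i = 0; i < NUM_ELEMS (x); ++i)      /* 8 lanes for a 32-byte v8si */
      {
        a[i] = i - 2;
        b[i] = NUM_ELEMS (x) - 2 - i;
        x[i] = i * 2;
        y[i] = -i * 3;
      }
    v8si r = vcond_v8si_gt (x, y, a, b);         /* provided by DEF_VCOND_VAR in sve_vcond_1.C */
    for (int i = 0; i < NUM_ELEMS (x); ++i)
      if (r[i] != (a[i] > b[i] ? x[i] : y[i]))
        __builtin_abort ();
  }
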
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vcond_2.C b/gcc/testsuite/gcc.target/aarch64/sve_vcond_2.C
deleted file mode 100644
index 80299d7e4b8..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_vcond_2.C
+++ /dev/null
@@ -1,310 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -march=armv8-a+sve -fno-inline -fno-ipa-icf" } */
-
-#include <stdint.h>
-
-#define NUM_ELEMS(TYPE) (32 / sizeof (TYPE))
-
-#define DEF_VCOND(DATA_TYPE, CMP_TYPE, COND, SUFFIX) \
- void vcond_##CMP_TYPE##SUFFIX (DATA_TYPE *__restrict__ r, \
- DATA_TYPE *__restrict__ a, \
- DATA_TYPE *__restrict__ b, \
- CMP_TYPE *__restrict__ x, \
- CMP_TYPE *__restrict__ y, \
- int n) \
- { \
- for (int i = 0; i < n; i++) \
- { \
- CMP_TYPE yval = y[i], xval = x[i]; \
- DATA_TYPE aval = a[i], bval = b[i]; \
- r[i] = xval COND yval ? aval : bval; \
- } \
- }
-
-#define DEF_VCOND_IMM(DATA_TYPE, CMP_TYPE, COND, IMM, SUFFIX) \
- void vcond_imm_##CMP_TYPE##SUFFIX (DATA_TYPE *__restrict__ r, \
- DATA_TYPE *__restrict__ a, \
- DATA_TYPE *__restrict__ b, \
- CMP_TYPE *__restrict__ x, \
- int n) \
- { \
- for (int i = 0; i < n; i++) \
- { \
- CMP_TYPE xval = x[i]; \
- DATA_TYPE aval = a[i], bval = b[i]; \
- r[i] = xval COND (CMP_TYPE) IMM ? aval : bval; \
- } \
- }
-
-#define DEF_VCOND_SIGNED_ALL(COND, SUFFIX) \
- DEF_VCOND (int8_t, int8_t, COND, SUFFIX) \
- DEF_VCOND (int16_t, int16_t, COND, SUFFIX) \
- DEF_VCOND (int32_t, int32_t, COND, SUFFIX) \
- DEF_VCOND (int64_t, int64_t, COND, SUFFIX) \
- DEF_VCOND (float, int32_t, COND, SUFFIX##_float) \
- DEF_VCOND (double, int64_t, COND, SUFFIX##_double)
-
-#define DEF_VCOND_UNSIGNED_ALL(COND, SUFFIX) \
- DEF_VCOND (uint8_t, uint8_t, COND, SUFFIX) \
- DEF_VCOND (uint16_t, uint16_t, COND, SUFFIX) \
- DEF_VCOND (uint32_t, uint32_t, COND, SUFFIX) \
- DEF_VCOND (uint64_t, uint64_t, COND, SUFFIX) \
- DEF_VCOND (float, uint32_t, COND, SUFFIX##_float) \
- DEF_VCOND (double, uint64_t, COND, SUFFIX##_double)
-
-#define DEF_VCOND_ALL(COND, SUFFIX) \
- DEF_VCOND_SIGNED_ALL (COND, SUFFIX) \
- DEF_VCOND_UNSIGNED_ALL (COND, SUFFIX)
-
-#define DEF_VCOND_IMM_SIGNED_ALL(COND, IMM, SUFFIX) \
- DEF_VCOND_IMM (int8_t, int8_t, COND, IMM, SUFFIX) \
- DEF_VCOND_IMM (int16_t, int16_t, COND, IMM, SUFFIX) \
- DEF_VCOND_IMM (int32_t, int32_t, COND, IMM, SUFFIX) \
- DEF_VCOND_IMM (int64_t, int64_t, COND, IMM, SUFFIX) \
- DEF_VCOND_IMM (float, int32_t, COND, IMM, SUFFIX##_float) \
- DEF_VCOND_IMM (double, int64_t, COND, IMM, SUFFIX##_double)
-
-#define DEF_VCOND_IMM_UNSIGNED_ALL(COND, IMM, SUFFIX) \
- DEF_VCOND_IMM (uint8_t, uint8_t, COND, IMM, SUFFIX) \
- DEF_VCOND_IMM (uint16_t, uint16_t, COND, IMM, SUFFIX) \
- DEF_VCOND_IMM (uint32_t, uint32_t, COND, IMM, SUFFIX) \
- DEF_VCOND_IMM (uint64_t, uint64_t, COND, IMM, SUFFIX) \
- DEF_VCOND_IMM (float, uint32_t, COND, IMM, SUFFIX##_float) \
- DEF_VCOND_IMM (double, uint64_t, COND, IMM, SUFFIX##_double)
-
-#define DEF_VCOND_IMM_ALL(COND, IMM, SUFFIX) \
- DEF_VCOND_IMM_SIGNED_ALL (COND, IMM, SUFFIX) \
- DEF_VCOND_IMM_UNSIGNED_ALL (COND, IMM, SUFFIX)
-
-DEF_VCOND_ALL (>, _gt)
-DEF_VCOND_ALL (<, _lt)
-DEF_VCOND_ALL (>=, _ge)
-DEF_VCOND_ALL (<=, _le)
-DEF_VCOND_ALL (==, _eq)
-DEF_VCOND_ALL (!=, _ne)
-
-/* == Expect immediates to make it into the encoding == */
-
-DEF_VCOND_IMM_ALL (>, 5, _gt)
-DEF_VCOND_IMM_ALL (<, 5, _lt)
-DEF_VCOND_IMM_ALL (>=, 5, _ge)
-DEF_VCOND_IMM_ALL (<=, 5, _le)
-DEF_VCOND_IMM_ALL (==, 5, _eq)
-DEF_VCOND_IMM_ALL (!=, 5, _ne)
-
-DEF_VCOND_IMM_SIGNED_ALL (>, 15, _gt2)
-DEF_VCOND_IMM_SIGNED_ALL (<, 15, _lt2)
-DEF_VCOND_IMM_SIGNED_ALL (>=, 15, _ge2)
-DEF_VCOND_IMM_SIGNED_ALL (<=, 15, _le2)
-DEF_VCOND_IMM_ALL (==, 15, _eq2)
-DEF_VCOND_IMM_ALL (!=, 15, _ne2)
-
-DEF_VCOND_IMM_SIGNED_ALL (>, 16, _gt3)
-DEF_VCOND_IMM_SIGNED_ALL (<, 16, _lt3)
-DEF_VCOND_IMM_SIGNED_ALL (>=, 16, _ge3)
-DEF_VCOND_IMM_SIGNED_ALL (<=, 16, _le3)
-DEF_VCOND_IMM_ALL (==, 16, _eq3)
-DEF_VCOND_IMM_ALL (!=, 16, _ne3)
-
-DEF_VCOND_IMM_SIGNED_ALL (>, -16, _gt4)
-DEF_VCOND_IMM_SIGNED_ALL (<, -16, _lt4)
-DEF_VCOND_IMM_SIGNED_ALL (>=, -16, _ge4)
-DEF_VCOND_IMM_SIGNED_ALL (<=, -16, _le4)
-DEF_VCOND_IMM_ALL (==, -16, _eq4)
-DEF_VCOND_IMM_ALL (!=, -16, _ne4)
-
-DEF_VCOND_IMM_SIGNED_ALL (>, -17, _gt5)
-DEF_VCOND_IMM_SIGNED_ALL (<, -17, _lt5)
-DEF_VCOND_IMM_SIGNED_ALL (>=, -17, _ge5)
-DEF_VCOND_IMM_SIGNED_ALL (<=, -17, _le5)
-DEF_VCOND_IMM_ALL (==, -17, _eq5)
-DEF_VCOND_IMM_ALL (!=, -17, _ne5)
-
-DEF_VCOND_IMM_UNSIGNED_ALL (>, 0, _gt6)
-/* Testing if an unsigned value >= 0 or < 0 is pointless as it will get
- folded away by the compiler. */
-DEF_VCOND_IMM_UNSIGNED_ALL (<=, 0, _le6)
-
-DEF_VCOND_IMM_UNSIGNED_ALL (>, 127, _gt7)
-DEF_VCOND_IMM_UNSIGNED_ALL (<, 127, _lt7)
-DEF_VCOND_IMM_UNSIGNED_ALL (>=, 127, _ge7)
-DEF_VCOND_IMM_UNSIGNED_ALL (<=, 127, _le7)
-
-/* == Expect immediates to NOT make it into the encoding, and instead be
- forced into a register. == */
-DEF_VCOND_IMM_UNSIGNED_ALL (>, 128, _gt8)
-DEF_VCOND_IMM_UNSIGNED_ALL (<, 128, _lt8)
-DEF_VCOND_IMM_UNSIGNED_ALL (>=, 128, _ge8)
-DEF_VCOND_IMM_UNSIGNED_ALL (<=, 128, _le8)
-
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+.b, p[0-7], z[0-9]+\.b, z[0-9]+\.b\n} 66 } } */
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 66 } } */
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 132 } } */
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 132 } } */
-
-/* There are two signed ordered register comparisons for each of .b and .h,
- one for a variable comparison and one for one of the two out-of-range
- constant comparisons. The other out-of-ranger constant comparison can
- be adjusted to an in-range value by inverting the handling of equality.
-
- The same pattern appears twice for each .s and .d, once for integer data
- and once for floating-point data. */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
-
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
-
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
-
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
-
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
-
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
-
-/* Out-of-range >= is converted to in-range >. */
-/* { dg-final { scan-assembler-times {\tcmphs\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmphs\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmphs\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmphs\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
-
-/* Out-of-range < is converted to in-range <=. */
-/* { dg-final { scan-assembler-times {\tcmplo\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmplo\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmplo\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmplo\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
-
-/* 6 for .b and .h: {signed, unsigned\n} x {variable, too high, too low\n}. */
-/* 12 for .s and .d: the above 6 repeated for integer and floating-point
- data. */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 6 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 6 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 12 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 12 } } */
-
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 6 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 6 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 12 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 12 } } */
-
-/* Also used for >= 16. */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #15\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #15\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #15\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #15\n} 4 } } */
-
-/* gcc converts "a < 15" into "a <= 14". */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #14\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #14\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #14\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #14\n} 2 } } */
-
-/* gcc converts "a >= 15" into "a > 14". */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #14\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #14\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #14\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #14\n} 2 } } */
-
-/* Also used for < 16. */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #15\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #15\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #15\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmple\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #15\n} 4 } } */
-
-/* Appears once for each signedness. */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #15\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #15\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #15\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #15\n} 4 } } */
-
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #15\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #15\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #15\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #15\n} 4 } } */
-
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-16\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-16\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-16\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-16\n} 4 } } */
-
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-16\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-16\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-16\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-16\n} 4 } } */
-
-/* gcc converts "a > -16" into "a >= -15". */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-15\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-15\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-15\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-15\n} 2 } } */
-
-/* Also used for <= -17. */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-16\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-16\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-16\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-16\n} 4 } } */
-
-/* Also used for > -17. */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-16\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-16\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-16\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-16\n} 4 } } */
-
-/* gcc converts "a <= -16" into "a < -15". */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #-15\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #-15\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #-15\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #-15\n} 2 } } */
-
-/* gcc converts "a > 0" into "a != 0". */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #0\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #0\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #0\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #0\n} 2 } } */
-
-/* gcc converts "a <= 0" into "a == 0". */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #0\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #0\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #0\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #0\n} 2 } } */
-
-/* Also used for >= 128. */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #127\n} 2 { xfail *-*-* } } } */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #127\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #127\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #127\n} 4 } } */
-
-/* gcc converts "a < 127" into "a <= 126". */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #126\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #126\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #126\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #126\n} 2 } } */
-
-/* gcc converts "a >= 127" into "a > 126". */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #126\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #126\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #126\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #126\n} 2 } } */
-
-/* Also used for < 128. */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].b, p[0-7]/z, z[0-9]+\.b, #127\n} 2 { xfail *-*-* } } } */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].h, p[0-7]/z, z[0-9]+\.h, #127\n} 2 } } */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].s, p[0-7]/z, z[0-9]+\.s, #127\n} 4 } } */
-/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7].d, p[0-7]/z, z[0-9]+\.d, #127\n} 4 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vcond_2.c b/gcc/testsuite/gcc.target/aarch64/sve_vcond_2.c
new file mode 100644
index 00000000000..0c67f8147c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vcond_2.c
@@ -0,0 +1,318 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
+
+#include <stdint.h>
+
+#define DEF_VCOND_VAR(DATA_TYPE, CMP_TYPE, COND, SUFFIX) \
+ void __attribute__ ((noinline, noclone)) \
+ vcond_var_##CMP_TYPE##_##SUFFIX (DATA_TYPE *__restrict__ r, \
+ DATA_TYPE *__restrict__ x, \
+ DATA_TYPE *__restrict__ y, \
+ CMP_TYPE *__restrict__ a, \
+ CMP_TYPE *__restrict__ b, \
+ int n) \
+ { \
+ for (int i = 0; i < n; i++) \
+ { \
+ DATA_TYPE xval = x[i], yval = y[i]; \
+ CMP_TYPE aval = a[i], bval = b[i]; \
+ r[i] = aval COND bval ? xval : yval; \
+ } \
+ }
+
+#define DEF_VCOND_IMM(DATA_TYPE, CMP_TYPE, COND, IMM, SUFFIX) \
+ void __attribute__ ((noinline, noclone)) \
+ vcond_imm_##CMP_TYPE##_##SUFFIX (DATA_TYPE *__restrict__ r, \
+ DATA_TYPE *__restrict__ x, \
+ DATA_TYPE *__restrict__ y, \
+ CMP_TYPE *__restrict__ a, \
+ int n) \
+ { \
+ for (int i = 0; i < n; i++) \
+ { \
+ DATA_TYPE xval = x[i], yval = y[i]; \
+ CMP_TYPE aval = a[i]; \
+ r[i] = aval COND (CMP_TYPE) IMM ? xval : yval; \
+ } \
+ }
+
+#define TEST_COND_VAR_SIGNED_ALL(T, COND, SUFFIX) \
+ T (int8_t, int8_t, COND, SUFFIX) \
+ T (int16_t, int16_t, COND, SUFFIX) \
+ T (int32_t, int32_t, COND, SUFFIX) \
+ T (int64_t, int64_t, COND, SUFFIX) \
+ T (_Float16, int16_t, COND, SUFFIX##_float16) \
+ T (float, int32_t, COND, SUFFIX##_float) \
+ T (double, int64_t, COND, SUFFIX##_double)
+
+#define TEST_COND_VAR_UNSIGNED_ALL(T, COND, SUFFIX) \
+ T (uint8_t, uint8_t, COND, SUFFIX) \
+ T (uint16_t, uint16_t, COND, SUFFIX) \
+ T (uint32_t, uint32_t, COND, SUFFIX) \
+ T (uint64_t, uint64_t, COND, SUFFIX) \
+ T (_Float16, uint16_t, COND, SUFFIX##_float16) \
+ T (float, uint32_t, COND, SUFFIX##_float) \
+ T (double, uint64_t, COND, SUFFIX##_double)
+
+#define TEST_COND_VAR_ALL(T, COND, SUFFIX) \
+ TEST_COND_VAR_SIGNED_ALL (T, COND, SUFFIX) \
+ TEST_COND_VAR_UNSIGNED_ALL (T, COND, SUFFIX)
+
+#define TEST_VAR_ALL(T) \
+ TEST_COND_VAR_ALL (T, >, _gt) \
+ TEST_COND_VAR_ALL (T, <, _lt) \
+ TEST_COND_VAR_ALL (T, >=, _ge) \
+ TEST_COND_VAR_ALL (T, <=, _le) \
+ TEST_COND_VAR_ALL (T, ==, _eq) \
+ TEST_COND_VAR_ALL (T, !=, _ne)
+
+#define TEST_COND_IMM_SIGNED_ALL(T, COND, IMM, SUFFIX) \
+ T (int8_t, int8_t, COND, IMM, SUFFIX) \
+ T (int16_t, int16_t, COND, IMM, SUFFIX) \
+ T (int32_t, int32_t, COND, IMM, SUFFIX) \
+ T (int64_t, int64_t, COND, IMM, SUFFIX) \
+ T (_Float16, int16_t, COND, IMM, SUFFIX##_float16) \
+ T (float, int32_t, COND, IMM, SUFFIX##_float) \
+ T (double, int64_t, COND, IMM, SUFFIX##_double)
+
+#define TEST_COND_IMM_UNSIGNED_ALL(T, COND, IMM, SUFFIX) \
+ T (uint8_t, uint8_t, COND, IMM, SUFFIX) \
+ T (uint16_t, uint16_t, COND, IMM, SUFFIX) \
+ T (uint32_t, uint32_t, COND, IMM, SUFFIX) \
+ T (uint64_t, uint64_t, COND, IMM, SUFFIX) \
+ T (_Float16, uint16_t, COND, IMM, SUFFIX##_float16) \
+ T (float, uint32_t, COND, IMM, SUFFIX##_float) \
+ T (double, uint64_t, COND, IMM, SUFFIX##_double)
+
+#define TEST_COND_IMM_ALL(T, COND, IMM, SUFFIX) \
+ TEST_COND_IMM_SIGNED_ALL (T, COND, IMM, SUFFIX) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, COND, IMM, SUFFIX)
+
+#define TEST_IMM_ALL(T) \
+ /* Expect immediates to make it into the encoding. */ \
+ TEST_COND_IMM_ALL (T, >, 5, _gt) \
+ TEST_COND_IMM_ALL (T, <, 5, _lt) \
+ TEST_COND_IMM_ALL (T, >=, 5, _ge) \
+ TEST_COND_IMM_ALL (T, <=, 5, _le) \
+ TEST_COND_IMM_ALL (T, ==, 5, _eq) \
+ TEST_COND_IMM_ALL (T, !=, 5, _ne) \
+ \
+ TEST_COND_IMM_SIGNED_ALL (T, >, 15, _gt2) \
+ TEST_COND_IMM_SIGNED_ALL (T, <, 15, _lt2) \
+ TEST_COND_IMM_SIGNED_ALL (T, >=, 15, _ge2) \
+ TEST_COND_IMM_SIGNED_ALL (T, <=, 15, _le2) \
+ TEST_COND_IMM_ALL (T, ==, 15, _eq2) \
+ TEST_COND_IMM_ALL (T, !=, 15, _ne2) \
+ \
+ TEST_COND_IMM_SIGNED_ALL (T, >, 16, _gt3) \
+ TEST_COND_IMM_SIGNED_ALL (T, <, 16, _lt3) \
+ TEST_COND_IMM_SIGNED_ALL (T, >=, 16, _ge3) \
+ TEST_COND_IMM_SIGNED_ALL (T, <=, 16, _le3) \
+ TEST_COND_IMM_ALL (T, ==, 16, _eq3) \
+ TEST_COND_IMM_ALL (T, !=, 16, _ne3) \
+ \
+ TEST_COND_IMM_SIGNED_ALL (T, >, -16, _gt4) \
+ TEST_COND_IMM_SIGNED_ALL (T, <, -16, _lt4) \
+ TEST_COND_IMM_SIGNED_ALL (T, >=, -16, _ge4) \
+ TEST_COND_IMM_SIGNED_ALL (T, <=, -16, _le4) \
+ TEST_COND_IMM_ALL (T, ==, -16, _eq4) \
+ TEST_COND_IMM_ALL (T, !=, -16, _ne4) \
+ \
+ TEST_COND_IMM_SIGNED_ALL (T, >, -17, _gt5) \
+ TEST_COND_IMM_SIGNED_ALL (T, <, -17, _lt5) \
+ TEST_COND_IMM_SIGNED_ALL (T, >=, -17, _ge5) \
+ TEST_COND_IMM_SIGNED_ALL (T, <=, -17, _le5) \
+ TEST_COND_IMM_ALL (T, ==, -17, _eq5) \
+ TEST_COND_IMM_ALL (T, !=, -17, _ne5) \
+ \
+ TEST_COND_IMM_UNSIGNED_ALL (T, >, 0, _gt6) \
+ /* Testing if an unsigned value >= 0 or < 0 is pointless as it will \
+ get folded away by the compiler. */ \
+ TEST_COND_IMM_UNSIGNED_ALL (T, <=, 0, _le6) \
+ \
+ TEST_COND_IMM_UNSIGNED_ALL (T, >, 127, _gt7) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, <, 127, _lt7) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, >=, 127, _ge7) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, <=, 127, _le7) \
+ \
+ /* Expect immediates to NOT make it into the encoding, and instead be \
+ forced into a register. */ \
+ TEST_COND_IMM_UNSIGNED_ALL (T, >, 128, _gt8) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, <, 128, _lt8) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, >=, 128, _ge8) \
+ TEST_COND_IMM_UNSIGNED_ALL (T, <=, 128, _le8)
+
+TEST_VAR_ALL (DEF_VCOND_VAR)
+TEST_IMM_ALL (DEF_VCOND_IMM)
+
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b, p[0-7], z[0-9]+\.b, z[0-9]+\.b\n} 66 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 132 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 132 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 132 } } */
+
+/* There are two signed ordered register comparisons for .b, one for a
+   variable comparison and one for one of the two out-of-range constant
+   comparisons.  The other out-of-range constant comparison can be
+   adjusted to an in-range value by inverting the handling of equality.
+
+   The same pattern appears twice for .h, .s and .d, once for integer data
+   and once for floating-point data.  */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
+
+/* Out-of-range >= is converted to in-range >. */
+/* { dg-final { scan-assembler-times {\tcmphs\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcmphs\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmphs\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmphs\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+
+/* Out-of-range < is converted to in-range <=. */
+/* { dg-final { scan-assembler-times {\tcmplo\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcmplo\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmplo\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmplo\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+
+/* 6 for .b: {signed, unsigned} x {variable, too high, too low}.  */
+/* 12 for .h, .s and .d: the above 6 repeated for integer and floating-point
+   data.  */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 12 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 12 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 12 } } */
+
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, z[0-9]+\.b\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 12 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, z[0-9]+\.s\n} 12 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, z[0-9]+\.d\n} 12 } } */
+
+/* Also used for >= 16. */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #15\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #15\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #15\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #15\n} 4 } } */
+
+/* gcc converts "a < 15" into "a <= 14". */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #14\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #14\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #14\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #14\n} 2 } } */
+
+/* gcc converts "a >= 15" into "a > 14". */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #14\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #14\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #14\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpgt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #14\n} 2 } } */
+
+/* Also used for < 16. */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #15\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #15\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #15\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmple\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #15\n} 4 } } */
+
+/* Appears once for each signedness. */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #15\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #15\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #15\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #15\n} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #15\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #15\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #15\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #15\n} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-16\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-16\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-16\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-16\n} 4 } } */
+
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-16\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-16\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-16\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-16\n} 4 } } */
+
+/* gcc converts "a > -16" into "a >= -15". */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-15\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-15\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-15\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-15\n} 2 } } */
+
+/* Also used for <= -17. */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-16\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-16\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-16\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-16\n} 4 } } */
+
+/* Also used for > -17. */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-16\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-16\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-16\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpge\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-16\n} 4 } } */
+
+/* gcc converts "a <= -16" into "a < -15". */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #-15\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #-15\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #-15\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmplt\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #-15\n} 2 } } */
+
+/* gcc converts "a > 0" into "a != 0". */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #0\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #0\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #0\n} 2 } } */
+
+/* gcc converts "a <= 0" into "a == 0". */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #0\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #0\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpeq\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #0\n} 2 } } */
+
+/* Also used for >= 128. */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #127\n} 2 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #127\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #127\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #127\n} 4 } } */
+
+/* gcc converts "a < 127" into "a <= 126". */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #126\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #126\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #126\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #126\n} 2 } } */
+
+/* gcc converts "a >= 127" into "a > 126". */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #126\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #126\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #126\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tcmphi\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #126\n} 2 } } */
+
+/* Also used for < 128. */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.b, p[0-7]/z, z[0-9]+\.b, #127\n} 2 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.h, p[0-7]/z, z[0-9]+\.h, #127\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.s, p[0-7]/z, z[0-9]+\.s, #127\n} 4 } } */
+/* { dg-final { scan-assembler-times {\tcmpls\tp[0-7]\.d, p[0-7]/z, z[0-9]+\.d, #127\n} 4 } } */
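
To make the macro layering above concrete, TEST_COND_VAR_SIGNED_ALL (DEF_VCOND_VAR, <, _lt) emits, among others, roughly the following function for int32_t (an illustrative expansion, not text from the patch); with -ftree-vectorize the scan-assembler directives expect it to become a predicated CMPLT feeding a SEL:

  /* Note the double underscore: it comes from the leading '_' in the _lt suffix.  */
  void __attribute__ ((noinline, noclone))
  vcond_var_int32_t__lt (int32_t *__restrict__ r, int32_t *__restrict__ x,
                         int32_t *__restrict__ y, int32_t *__restrict__ a,
                         int32_t *__restrict__ b, int n)
  {
    for (int i = 0; i < n; i++)
      {
        int32_t xval = x[i], yval = y[i];
        int32_t aval = a[i], bval = b[i];
        r[i] = aval < bval ? xval : yval;
      }
  }
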
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vcond_2_run.C b/gcc/testsuite/gcc.target/aarch64/sve_vcond_2_run.C
deleted file mode 100644
index b3c54b74fde..00000000000
--- a/gcc/testsuite/gcc.target/aarch64/sve_vcond_2_run.C
+++ /dev/null
@@ -1,118 +0,0 @@
-/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -march=armv8-a+sve -fno-inline" } */
-
-#include "sve_vcond_2.C"
-
-#include <stdlib.h>
-
-#define TEST_VCOND(DATA_TYPE, CMP_TYPE, COND, SUFFIX) \
-{ \
- const int n = 32 / sizeof (DATA_TYPE); \
- CMP_TYPE x[n], y[n]; \
- DATA_TYPE a[n], b[n]; \
- for (int i = 0; i < n; ++i) \
- { \
- x[i] = i; \
- y[i] = (i & 1) + 5; \
- a[i] = 6 * i; \
- b[i] = 4 + i; \
- } \
- vcond_##CMP_TYPE##SUFFIX (r_##DATA_TYPE, a, b, x, y, n); \
- for (int i = 0; i < n; ++i) \
- if (r_##DATA_TYPE[i] != (x[i] COND y[i] ? a[i] : b[i])) \
- abort (); \
-}
-
-#define TEST_VCOND_IMM(DATA_TYPE, CMP_TYPE, COND, IMM, SUFFIX) \
-{ \
- const int n = 32 / sizeof (DATA_TYPE); \
- CMP_TYPE x[n]; \
- DATA_TYPE a[n], b[n]; \
- for (int i = 0; i < n; ++i) \
- { \
- x[i] = i - 1; \
- a[i] = 5 * i + IMM; \
- b[i] = 7 + i - IMM * 2; \
- } \
- vcond_imm_##CMP_TYPE##SUFFIX (r_##DATA_TYPE, a, b, x, n); \
- for (int i = 0; i < n; ++i) \
- if (r_##DATA_TYPE[i] != (x[i] COND IMM ? a[i] : b[i])) \
- abort (); \
-}
-
-#define TEST_VCOND_SIGNED_ALL(COND, SUFFIX) \
- TEST_VCOND (int8_t, int8_t, COND, SUFFIX) \
- TEST_VCOND (int16_t, int16_t, COND, SUFFIX) \
- TEST_VCOND (int32_t, int32_t, COND, SUFFIX) \
- TEST_VCOND (int64_t, int64_t, COND, SUFFIX) \
- TEST_VCOND (float, int32_t, COND, SUFFIX##_float) \
- TEST_VCOND (double, int64_t, COND, SUFFIX##_double)
-
-#define TEST_VCOND_UNSIGNED_ALL(COND, SUFFIX) \
- TEST_VCOND (uint8_t, uint8_t, COND, SUFFIX) \
- TEST_VCOND (uint16_t, uint16_t, COND, SUFFIX) \
- TEST_VCOND (uint32_t, uint32_t, COND, SUFFIX) \
- TEST_VCOND (uint64_t, uint64_t, COND, SUFFIX) \
- TEST_VCOND (float, uint32_t, COND, SUFFIX##_float) \
- TEST_VCOND (double, uint64_t, COND, SUFFIX##_double)
-
-#define TEST_VCOND_ALL(COND, SUFFIX) \
- TEST_VCOND_SIGNED_ALL (COND, SUFFIX) \
- TEST_VCOND_UNSIGNED_ALL (COND, SUFFIX)
-
-#define TEST_VCOND_IMM_SIGNED_ALL(COND, IMM, SUFFIX) \
- TEST_VCOND_IMM (int8_t, int8_t, COND, IMM, SUFFIX) \
- TEST_VCOND_IMM (int16_t, int16_t, COND, IMM, SUFFIX) \
- TEST_VCOND_IMM (int32_t, int32_t, COND, IMM, SUFFIX) \
- TEST_VCOND_IMM (int64_t, int64_t, COND, IMM, SUFFIX) \
- TEST_VCOND_IMM (float, int32_t, COND, IMM, SUFFIX##_float) \
- TEST_VCOND_IMM (double, int64_t, COND, IMM, SUFFIX##_double)
-
-#define TEST_VCOND_IMM_UNSIGNED_ALL(COND, IMM, SUFFIX) \
- TEST_VCOND_IMM (uint8_t, uint8_t, COND, IMM, SUFFIX) \
- TEST_VCOND_IMM (uint16_t, uint16_t, COND, IMM, SUFFIX) \
- TEST_VCOND_IMM (uint32_t, uint32_t, COND, IMM, SUFFIX) \
- TEST_VCOND_IMM (uint64_t, uint64_t, COND, IMM, SUFFIX) \
- TEST_VCOND_IMM (float, uint32_t, COND, IMM, SUFFIX##_float) \
- TEST_VCOND_IMM (double, uint64_t, COND, IMM, SUFFIX##_double)
-
-#define TEST_VCOND_IMM_ALL(COND, IMM, SUFFIX) \
- TEST_VCOND_IMM_SIGNED_ALL (COND, IMM, SUFFIX) \
- TEST_VCOND_IMM_UNSIGNED_ALL (COND, IMM, SUFFIX)
-
-#define DEF_INIT_VECTOR(TYPE) \
- TYPE r_##TYPE[NUM_ELEMS(TYPE)]; \
- for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
- r_##TYPE[i] = i * 3;
-
-int __attribute__ ((optimize (1)))
-main (int argc, char **argv)
-{
- int result = 0;
- DEF_INIT_VECTOR (int8_t)
- DEF_INIT_VECTOR (int16_t)
- DEF_INIT_VECTOR (int32_t)
- DEF_INIT_VECTOR (int64_t)
- DEF_INIT_VECTOR (uint8_t)
- DEF_INIT_VECTOR (uint16_t)
- DEF_INIT_VECTOR (uint32_t)
- DEF_INIT_VECTOR (uint64_t)
- DEF_INIT_VECTOR (float)
- DEF_INIT_VECTOR (double)
-
- TEST_VCOND_ALL (>, _gt)
- TEST_VCOND_ALL (<, _lt)
- TEST_VCOND_ALL (>=, _ge)
- TEST_VCOND_ALL (<=, _le)
- TEST_VCOND_ALL (==, _eq)
- TEST_VCOND_ALL (!=, _ne)
-
- TEST_VCOND_IMM_ALL (>, 5, _gt)
- TEST_VCOND_IMM_ALL (<, 5, _lt)
- TEST_VCOND_IMM_ALL (>=, 5, _ge)
- TEST_VCOND_IMM_ALL (<=, 5, _le)
- TEST_VCOND_IMM_ALL (==, 5, _eq)
- TEST_VCOND_IMM_ALL (!=, 5, _ne)
-
- return 0;
-}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vcond_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve_vcond_2_run.c
new file mode 100644
index 00000000000..4cdb5bb9e43
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vcond_2_run.c
@@ -0,0 +1,49 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
+
+#include "sve_vcond_2.c"
+
+#define N 97
+
+#define TEST_VCOND_VAR(DATA_TYPE, CMP_TYPE, COND, SUFFIX) \
+{ \
+ DATA_TYPE x[N], y[N], r[N]; \
+ CMP_TYPE a[N], b[N]; \
+ for (int i = 0; i < N; ++i) \
+ { \
+ x[i] = i; \
+ y[i] = (i & 1) + 5; \
+ a[i] = i - N / 3; \
+ b[i] = N - N / 3 - i; \
+ asm volatile ("" ::: "memory"); \
+ } \
+ vcond_var_##CMP_TYPE##_##SUFFIX (r, x, y, a, b, N); \
+ for (int i = 0; i < N; ++i) \
+ if (r[i] != (a[i] COND b[i] ? x[i] : y[i])) \
+ __builtin_abort (); \
+}
+
+#define TEST_VCOND_IMM(DATA_TYPE, CMP_TYPE, COND, IMM, SUFFIX) \
+{ \
+ DATA_TYPE x[N], y[N], r[N]; \
+ CMP_TYPE a[N]; \
+ for (int i = 0; i < N; ++i) \
+ { \
+ x[i] = i; \
+ y[i] = (i & 1) + 5; \
+ a[i] = IMM - N / 3 + i; \
+ asm volatile ("" ::: "memory"); \
+ } \
+ vcond_imm_##CMP_TYPE##_##SUFFIX (r, x, y, a, N); \
+ for (int i = 0; i < N; ++i) \
+ if (r[i] != (a[i] COND (CMP_TYPE) IMM ? x[i] : y[i])) \
+ __builtin_abort (); \
+}
+
+int __attribute__ ((optimize (1)))
+main (int argc, char **argv)
+{
+ TEST_VAR_ALL (TEST_VCOND_VAR)
+ TEST_IMM_ALL (TEST_VCOND_IMM)
+ return 0;
+}
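
As an example of what this main exercises, the out-of-range unsigned case TEST_VCOND_IMM (uint8_t, uint8_t, <, 128, _lt8) expands to roughly the block below (an illustrative expansion, not text from the patch).  The empty asm with a "memory" clobber is presumably there so that the known initial values cannot be propagated into the final checks:

  {
    uint8_t x[N], y[N], r[N];                    /* N is 97, as defined above */
    uint8_t a[N];
    for (int i = 0; i < N; ++i)
      {
        x[i] = i;
        y[i] = (i & 1) + 5;
        a[i] = 128 - N / 3 + i;
        asm volatile ("" ::: "memory");
      }
    vcond_imm_uint8_t__lt8 (r, x, y, a, N);      /* defined in sve_vcond_2.c */
    for (int i = 0; i < N; ++i)
      if (r[i] != (a[i] < (uint8_t) 128 ? x[i] : y[i]))
        __builtin_abort ();
  }
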
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vcond_3.C b/gcc/testsuite/gcc.target/aarch64/sve_vcond_3.c
index 68c033b1a7d..9750bd07fda 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vcond_3.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vcond_3.c
@@ -5,20 +5,18 @@
#define DEF_SEL_IMM(TYPE, SUFFIX, IMM) \
void \
-sel_##TYPE##_##SUFFIX (TYPE *__restrict__ a, TYPE *__restrict__ b, \
- int n) \
+sel_##TYPE##_##SUFFIX (TYPE *restrict a, TYPE *restrict b, int n) \
{ \
for (int i = 0; i < n; i++) \
a[i] = b[i] != 0 ? IMM : 0; \
}
-#define DEF_SEL_VAR(TYPE) \
-void \
-sel_##TYPE##_var (TYPE *__restrict__ a, TYPE *__restrict__ b, \
- TYPE val, int n) \
-{ \
- for (int i = 0; i < n; i++) \
- a[i] = b[i] != 0 ? val : 0; \
+#define DEF_SEL_VAR(TYPE) \
+void \
+sel_##TYPE##_var (TYPE *restrict a, TYPE *restrict b, TYPE val, int n) \
+{ \
+ for (int i = 0; i < n; i++) \
+ a[i] = b[i] != 0 ? val : 0; \
}
#define TEST_TYPE8(TYPE) \
@@ -54,17 +52,17 @@ TEST_TYPE16 (int16_t)
TEST_TYPE32 (int32_t)
TEST_TYPE32 (int64_t)
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.b, p[0-7]/z, #-128\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.b, p[0-7]/z, #-127\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.b, p[0-7]/z, #2\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.b, p[0-7]/z, #127\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.b, p[0-7]/z, #-128\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.b, p[0-7]/z, #-127\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.b, p[0-7]/z, #2\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.b, p[0-7]/z, #127\n} 1 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.[hsd], p[0-7]/z, #-32768\n} 3 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.[hsd], p[0-7]/z, #-32512\n} 3 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.[hsd], p[0-7]/z, #-256\n} 3 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.[hsd], p[0-7]/z, #-128\n} 3 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.[hsd], p[0-7]/z, #-127\n} 3 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.[hsd], p[0-7]/z, #2\n} 3 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.[hsd], p[0-7]/z, #127\n} 3 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.[hsd], p[0-7]/z, #256\n} 3 } } */
-/* { dg-final { scan-assembler-times {\tmov\tz[0-9]*\.[hsd], p[0-7]/z, #32512\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.[hsd], p[0-7]/z, #-32768\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.[hsd], p[0-7]/z, #-32512\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.[hsd], p[0-7]/z, #-256\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.[hsd], p[0-7]/z, #-128\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.[hsd], p[0-7]/z, #-127\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.[hsd], p[0-7]/z, #2\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.[hsd], p[0-7]/z, #127\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.[hsd], p[0-7]/z, #256\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.[hsd], p[0-7]/z, #32512\n} 3 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vcond_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve_vcond_4_run.c
index 36c43e9f1e8..e8d06bb9f17 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vcond_4_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vcond_4_run.c
@@ -8,8 +8,6 @@
#include <fenv.h>
-extern void abort (void) __attribute__ ((noreturn));
-
#include "sve_vcond_4.c"
#define N 401
@@ -33,6 +31,7 @@ extern void abort (void) __attribute__ ((noreturn));
b[i] = i * 0.1; \
else \
b[i] = i; \
+ asm volatile ("" ::: "memory"); \
} \
feclearexcept (FE_ALL_EXCEPT); \
test_##TYPE1##_##TYPE2##_##CMP##_var (dest1, src, 11, a, b, N); \
@@ -40,15 +39,15 @@ extern void abort (void) __attribute__ ((noreturn));
test_##TYPE1##_##TYPE2##_##CMP##_sel (dest3, 33, 44, a, 9, N); \
if (TEST_EXCEPTIONS \
&& !fetestexcept (FE_INVALID) != !(EXPECT_INVALID)) \
- abort (); \
+ __builtin_abort (); \
for (int i = 0; i < N; ++i) \
{ \
if (dest1[i] != (CMP (a[i], b[i]) ? src[i] : 11)) \
- abort (); \
+ __builtin_abort (); \
if (dest2[i] != (CMP (a[i], 0) ? src[i] : 22)) \
- abort (); \
+ __builtin_abort (); \
if (dest3[i] != (CMP (a[i], 9) ? 33 : 44)) \
- abort (); \
+ __builtin_abort (); \
} \
}
@@ -64,7 +63,7 @@ extern void abort (void) __attribute__ ((noreturn));
RUN_LOOP (uint64_t, double, CMP, EXPECT_INVALID) \
RUN_LOOP (double, double, CMP, EXPECT_INVALID)
-int __attribute__ ((optimize (1, "no-tree-vectorize")))
+int __attribute__ ((optimize (1)))
main (void)
{
RUN_CMP (eq, 0)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vcond_6.c b/gcc/testsuite/gcc.target/aarch64/sve_vcond_6.c
index 8ab040ef51e..74336050d8d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vcond_6.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vcond_6.c
@@ -25,6 +25,7 @@
}
#define TEST_BINOP(T, BINOP) \
+ T (_Float16, BINOP) \
T (float, BINOP) \
T (double, BINOP)
@@ -40,11 +41,11 @@
TEST_ALL (LOOP)
/* Currently we don't manage to remove ANDs from the other loops. */
-/* { dg-final { scan-assembler-times {\tand\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 2 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\tand\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 3 { xfail *-*-* } } } */
/* { dg-final { scan-assembler {\tand\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} } } */
-/* { dg-final { scan-assembler-times {\torr\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 2 } } */
-/* { dg-final { scan-assembler-times {\teor\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 2 } } */
-/* { dg-final { scan-assembler-times {\tnand\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 2 } } */
-/* { dg-final { scan-assembler-times {\tnor\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 2 } } */
-/* { dg-final { scan-assembler-times {\tbic\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 2 } } */
-/* { dg-final { scan-assembler-times {\torn\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 2 } } */
+/* { dg-final { scan-assembler-times {\torr\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 3 } } */
+/* { dg-final { scan-assembler-times {\teor\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 3 } } */
+/* { dg-final { scan-assembler-times {\tnand\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 3 } } */
+/* { dg-final { scan-assembler-times {\tnor\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 3 } } */
+/* { dg-final { scan-assembler-times {\tbic\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 3 } } */
+/* { dg-final { scan-assembler-times {\torn\tp[0-9]+\.b, p[0-9]+/z, p[0-9]+\.b, p[0-9]+\.b} 3 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vcond_6_run.c b/gcc/testsuite/gcc.target/aarch64/sve_vcond_6_run.c
index ff8ad90da9f..edad9b8272d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vcond_6_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vcond_6_run.c
@@ -1,8 +1,6 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
-extern void abort (void) __attribute__ ((noreturn));
-
#include "sve_vcond_6.c"
#define N 401
@@ -17,6 +15,7 @@ extern void abort (void) __attribute__ ((noreturn));
b[i] = i % 7 < 4 ? __builtin_nan("") : i; \
c[i] = i % 9 < 5 ? __builtin_nan("") : i; \
d[i] = i % 11 < 6 ? __builtin_nan("") : i; \
+ asm volatile ("" ::: "memory"); \
} \
test_##TYPE##_##BINOP (dest, src, a, b, c, d, 100, N); \
for (int i = 0; i < N; ++i) \
@@ -24,11 +23,11 @@ extern void abort (void) __attribute__ ((noreturn));
int res = BINOP (__builtin_isunordered (a[i], b[i]), \
__builtin_isunordered (c[i], d[i])); \
if (dest[i] != (res ? src[i] : 100.0)) \
- abort (); \
+ __builtin_abort (); \
} \
}
-int __attribute__ ((optimize (1, "no-tree-vectorize")))
+int __attribute__ ((optimize (1)))
main (void)
{
TEST_ALL (RUN_LOOP)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_init_1.C b/gcc/testsuite/gcc.target/aarch64/sve_vec_init_1.c
index d6194dcbf8f..95f19f7f786 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_init_1.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_init_1.c
@@ -1,25 +1,30 @@
/* { dg-do compile } */
-/* { dg-options "-std=c++11 -O2 -ftree-vectorize -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
#include <stdint.h>
#define NUM_ELEMS(TYPE) (128 / sizeof (TYPE))
#define DUP_FN(TYPE) \
-void dup_##TYPE (TYPE *r, TYPE v) \
+void __attribute__ ((noinline, noclone)) \
+dup_##TYPE (TYPE *r, TYPE v) \
{ \
for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
r[i] = v; \
}
+DUP_FN (int8_t)
DUP_FN (int16_t)
DUP_FN (int32_t)
DUP_FN (int64_t)
+DUP_FN (_Float16)
DUP_FN (float)
DUP_FN (double)
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.b, w[0-9]+\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, w[0-9]+\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, w[0-9]+\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, x[0-9]+\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, h[0-9]+\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, s[0-9]+\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, d[0-9]+\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_init_1_run.C b/gcc/testsuite/gcc.target/aarch64/sve_vec_init_1_run.c
index 579327ef81d..ba7eb44be70 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_init_1_run.C
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_init_1_run.c
@@ -1,23 +1,26 @@
/* { dg-do run { target aarch64_sve_hw } } */
-/* { dg-options "-std=c++11 -O3 -fno-inline -march=armv8-a+sve" } */
+/* { dg-options "-O2 -ftree-vectorize -march=armv8-a+sve" } */
-#include "sve_vec_init_1.C"
-
-#include <stdlib.h>
+#include "sve_vec_init_1.c"
#define TEST_INIT_VECTOR(TYPE, VAL) \
- TYPE r_##TYPE[NUM_ELEMS (TYPE)]; \
- dup_##TYPE (r_##TYPE, VAL); \
+ { \
+ TYPE r[NUM_ELEMS (TYPE)]; \
+ dup_##TYPE (r, VAL); \
for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
- if (r_##TYPE[i] != VAL) \
- abort ();
+ if (r[i] != VAL) \
+ __builtin_abort (); \
+ }
-int main (void)
+int __attribute__ ((optimize (1)))
+main (void)
{
+ TEST_INIT_VECTOR (int8_t, 0x2a);
TEST_INIT_VECTOR (int16_t, 0x3976);
TEST_INIT_VECTOR (int32_t, 0x31232976);
TEST_INIT_VECTOR (int64_t, 0x9489363731232976LL);
+ TEST_INIT_VECTOR (_Float16, -0x1.fp10);
TEST_INIT_VECTOR (float, -0x1.fe02p10);
TEST_INIT_VECTOR (double, 0x1.fe02eeeee1p10);
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1.c b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1.c
index 214c0c3930f..ae8542f2c75 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1.c
@@ -1,15 +1,19 @@
/* { dg-do compile } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" } */
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define VEC_PERM(TYPE, MASKTYPE) \
-TYPE vec_perm_##TYPE (TYPE values1, TYPE values2, MASKTYPE mask) \
+TYPE __attribute__ ((noinline, noclone)) \
+vec_perm_##TYPE (TYPE values1, TYPE values2, MASKTYPE mask) \
{ \
return __builtin_shuffle (values1, values2, mask); \
}
@@ -20,8 +24,9 @@ VEC_PERM (v16hi, v16hi);
VEC_PERM (v32qi, v32qi);
VEC_PERM (v4df, v4di);
VEC_PERM (v8sf, v8si);
+VEC_PERM (v16hf, v16hi);
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1_overrange_run.c b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1_overrange_run.c
index 630e30867cb..6ab82250d4c 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1_overrange_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1_overrange_run.c
@@ -1,8 +1,8 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O -march=armv8-a+sve" } */
+/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" { target aarch64_sve256_hw } } */
#include "sve_vec_perm_1.c"
-extern void abort (void);
#define TEST_VEC_PERM(TYPE, MASK_TYPE, EXPECTED_RESULT, \
VALUES1, VALUES2, MASK) \
@@ -14,7 +14,7 @@ extern void abort (void);
TYPE dest; \
dest = vec_perm_##TYPE (values1, values2, mask); \
if (__builtin_memcmp (&dest, &expected_result, sizeof (TYPE)) != 0) \
- abort (); \
+ __builtin_abort (); \
}
int main (void)
@@ -92,5 +92,20 @@ int main (void)
15 + (16 * 4), 7 + (16 * 4),
6 + (16 * 3), 5 + (16 * 2),
4 + (16 * 1), 10 + (16 * 0) }));
+ TEST_VEC_PERM (v16hf, v16hi,
+ ((v16hf) { 12.0, 16.0, 18.0, 10.0, 42.0, 43.0, 44.0, 34.0,
+ 7.0, 48.0, 3.0, 35.0, 9.0, 8.0, 7.0, 13.0 }),
+ ((v16hf) { 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0,
+ 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0 }),
+ ((v16hf) { 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0,
+ 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0 }),
+ ((v16hi) { 9 + (32 * 2), 13 + (32 * 2),
+ 15 + (32 * 8), 7 + (32 * 9),
+ 25 + (32 * 4), 26 + (32 * 3),
+ 27 + (32 * 1), 17 + (32 * 2),
+ 4 + (32 * 6), 31 + (32 * 7),
+ 0 + (32 * 8), 18 + (32 * 9),
+ 6 + (32 * 6), 5 + (32 * 7),
+ 4 + (32 * 2), 10 + (32 * 2) }));
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1_run.c
index ce8cc79728a..4d46ff02192 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_1_run.c
@@ -3,7 +3,6 @@
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" { target aarch64_sve256_hw } } */
#include "sve_vec_perm_1.c"
-extern void abort (void);
#define TEST_VEC_PERM(TYPE, MASK_TYPE, EXPECTED_RESULT, \
VALUES1, VALUES2, MASK) \
@@ -15,7 +14,7 @@ extern void abort (void);
TYPE dest; \
dest = vec_perm_##TYPE (values1, values2, mask); \
if (__builtin_memcmp (&dest, &expected_result, sizeof (TYPE)) != 0) \
- abort (); \
+ __builtin_abort (); \
}
int main (void)
@@ -67,5 +66,14 @@ int main (void)
((v8sf) { 33.2, 34.2, 35.2, 36.2,
37.2, 38.2, 39.2, 40.2 }),
((v8si) { 9, 13, 15, 7, 6, 5, 4, 10 }));
+ TEST_VEC_PERM (v16hf, v16hi,
+ ((v16hf) { 12.0, 16.0, 18.0, 10.0, 42.0, 43.0, 44.0, 34.0,
+ 7.0, 48.0, 3.0, 35.0, 9.0, 8.0, 7.0, 13.0 }),
+ ((v16hf) { 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0,
+ 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0 }),
+ ((v16hf) { 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0,
+ 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0 }),
+ ((v16hi) { 9, 13, 15, 7, 25, 26, 27, 17,
+ 4, 31, 0, 18, 6, 5, 4, 10 }));
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1.c b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1.c
index d26d0902165..e76b3bc5abb 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1.c
@@ -1,15 +1,19 @@
/* { dg-do compile } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" } */
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define VEC_PERM_CONST(TYPE, MASK) \
-TYPE vec_perm_##TYPE (TYPE values1, TYPE values2) \
+TYPE __attribute__ ((noinline, noclone)) \
+vec_perm_##TYPE (TYPE values1, TYPE values2) \
{ \
return __builtin_shuffle (values1, values2, MASK); \
}
@@ -24,8 +28,10 @@ VEC_PERM_CONST (v32qi, ((v32qi) { 13, 31, 11, 2, 48, 28, 3, 4,
2, 57, 22, 11, 6, 16, 18, 21 }));
VEC_PERM_CONST (v4df, ((v4di) { 7, 3, 2, 1 }));
VEC_PERM_CONST (v8sf, ((v8si) { 1, 9, 13, 11, 2, 5, 4, 2 }));
+VEC_PERM_CONST (v16hf, ((v16hi) { 8, 27, 5, 4, 21, 12, 13, 0,
+ 22, 1, 8, 9, 3, 24, 15, 1 }));
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1_overrun.c b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1_overrun.c
index 8507cb46fb9..b4f82091f7c 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1_overrun.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1_overrun.c
@@ -1,12 +1,15 @@
/* { dg-do compile } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" } */
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define VEC_PERM_CONST_OVERRUN(TYPE, MASK) \
TYPE vec_perm_overrun_##TYPE (TYPE values1, TYPE values2) \
@@ -50,8 +53,16 @@ VEC_PERM_CONST_OVERRUN (v8sf, ((v8si) { 1 + (16 * 1), 9 + (16 * 2),
13 + (16 * 2), 11 + (16 * 3),
2 + (16 * 2), 5 + (16 * 2),
4 + (16 * 4), 2 + (16 * 3) }));
+VEC_PERM_CONST_OVERRUN (v16hf, ((v16hi) { 8 + (32 * 3), 27 + (32 * 1),
+ 5 + (32 * 3), 4 + (32 * 3),
+ 21 + (32 * 1), 12 + (32 * 3),
+ 13 + (32 * 3), 0 + (32 * 1),
+ 22 + (32 * 2), 1 + (32 * 2),
+ 8 + (32 * 2), 9 + (32 * 1),
+ 3 + (32 * 2), 24 + (32 * 2),
+ 15 + (32 * 1), 1 + (32 * 1) }));
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 4 } } */
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 4 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 4 } } */
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1_run.c
index 7edda5398e2..7324c1da0a4 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_1_run.c
@@ -4,7 +4,6 @@
#include "sve_vec_perm_const_1.c"
#include "sve_vec_perm_const_1_overrun.c"
-extern void abort (void);
#define TEST_VEC_PERM(TYPE, EXPECTED_RESULT, VALUES1, VALUES2) \
{ \
@@ -14,11 +13,11 @@ extern void abort (void);
TYPE dest; \
dest = vec_perm_##TYPE (values1, values2); \
if (__builtin_memcmp (&dest, &expected_result, sizeof (TYPE)) != 0) \
- abort (); \
+ __builtin_abort (); \
TYPE dest2; \
dest2 = vec_perm_overrun_##TYPE (values1, values2); \
if (__builtin_memcmp (&dest, &expected_result, sizeof (TYPE)) != 0) \
- abort (); \
+ __builtin_abort (); \
}
int main (void)
@@ -60,5 +59,12 @@ int main (void)
((v8sf) { 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5 }),
((v8sf) { 33.5, 34.5, 35.5, 36.5,
37.5, 38.5, 39.5, 40.5 }));
+ TEST_VEC_PERM (v16hf,
+ ((v16hf) { 11.0, 44.0, 8.0, 7.0, 38.0, 15.0, 16.0, 3.0,
+ 39.0, 4.0, 11.0, 12.0, 6.0, 41.0, 18.0, 4.0 }),
+ ((v16hf) { 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0,
+ 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0 }),
+ ((v16hf) { 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0,
+ 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0 }));
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_single_1.c b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_single_1.c
index c1e12faa850..a4efb4fea79 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_single_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_single_1.c
@@ -1,12 +1,15 @@
/* { dg-do compile } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" } */
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define VEC_PERM_SINGLE(TYPE, MASK) \
TYPE vec_perm_##TYPE (TYPE values1, TYPE values2) \
@@ -24,8 +27,10 @@ VEC_PERM_SINGLE (v32qi, ((v32qi) { 13, 21, 11, 2, 8, 28, 3, 4,
2, 7, 22, 11, 6, 16, 18, 21 }));
VEC_PERM_SINGLE (v4df, ((v4di) { 3, 3, 1, 1 }));
VEC_PERM_SINGLE (v8sf, ((v8si) { 4, 5, 6, 0, 2, 7, 4, 2 }));
+VEC_PERM_SINGLE (v16hf, ((v16hi) { 8, 7, 5, 4, 11, 12, 13, 0,
+ 1, 1, 8, 9, 3, 14, 15, 1 }));
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_single_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_single_1_run.c
index 2aa08f59590..fbae30c8d1c 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_single_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_const_single_1_run.c
@@ -3,7 +3,6 @@
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" { target aarch64_sve256_hw } } */
#include "sve_vec_perm_const_single_1.c"
-extern void abort (void);
#define TEST_VEC_PERM(TYPE, EXPECTED_RESULT, VALUES1, VALUES2) \
{ \
@@ -13,7 +12,7 @@ extern void abort (void);
TYPE dest; \
dest = vec_perm_##TYPE (values1, values2); \
if (__builtin_memcmp (&dest, &expected_result, sizeof (TYPE)) != 0) \
- abort (); \
+ __builtin_abort (); \
}
int main (void)
@@ -55,5 +54,12 @@ int main (void)
((v8sf) { 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5 }),
((v8sf) { 33.5, 34.5, 35.5, 36.5,
37.5, 38.5, 39.5, 40.5 }));
+ TEST_VEC_PERM (v16hf,
+ ((v16hf) { 11.0, 10.0, 8.0, 7.0, 14.0, 15.0, 16.0, 3.0,
+ 4.0, 4.0, 11.0, 12.0, 6.0, 17.0, 18.0, 4.0 }),
+ ((v16hf) { 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0,
+ 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0 }),
+ ((v16hf) { 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0,
+ 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0 }));
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_single_1.c b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_single_1.c
index 54c3a3068b0..a82b57dc378 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_single_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_single_1.c
@@ -1,12 +1,15 @@
/* { dg-do compile } */
/* { dg-options "-O -march=armv8-a+sve -msve-vector-bits=256" } */
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define VEC_PERM(TYPE, MASKTYPE) \
TYPE vec_perm_##TYPE (TYPE values, MASKTYPE mask) \
@@ -14,14 +17,15 @@ TYPE vec_perm_##TYPE (TYPE values, MASKTYPE mask) \
return __builtin_shuffle (values, mask); \
}
-VEC_PERM (v4di, v4di); \
-VEC_PERM (v8si, v8si); \
-VEC_PERM (v16hi, v16hi); \
-VEC_PERM (v32qi, v32qi); \
-VEC_PERM (v4df, v4di); \
-VEC_PERM (v8sf, v8si);
+VEC_PERM (v4di, v4di)
+VEC_PERM (v8si, v8si)
+VEC_PERM (v16hi, v16hi)
+VEC_PERM (v32qi, v32qi)
+VEC_PERM (v4df, v4di)
+VEC_PERM (v8sf, v8si)
+VEC_PERM (v16hf, v16hi)
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
-/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
/* { dg-final { scan-assembler-times {\ttbl\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_single_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_single_1_run.c
index 6caa1f95cfd..539c99d4f61 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_single_1_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_vec_perm_single_1_run.c
@@ -5,7 +5,7 @@
#include "sve_vec_perm_single_1.c"
extern void abort (void);
-#define TEST_VEC_PERM(TYPE, MASK_TYPE, EXPECTED_RESULT, VALUES, MASK) \
+#define TEST_VEC_PERM(TYPE, MASK_TYPE, EXPECTED_RESULT, VALUES, MASK) \
{ \
TYPE expected_result = EXPECTED_RESULT; \
TYPE values = VALUES; \
@@ -13,7 +13,7 @@ extern void abort (void);
TYPE dest; \
dest = vec_perm_##TYPE (values, mask); \
if (__builtin_memcmp (&dest, &expected_result, sizeof (TYPE)) != 0) \
- abort (); \
+ __builtin_abort (); \
}
int main (void)
@@ -54,5 +54,12 @@ int main (void)
((v8sf) { 4.2, 8.2, 10.2, 10.2, 9.2, 8.2, 7.2, 5.2 }),
((v8sf) { 3.2, 4.2, 5.2, 6.2, 7.2, 8.2, 9.2, 10.2 }),
((v8si) { 9, 13, 15, 7, 6, 5, 4, 10 }));
+ TEST_VEC_PERM (v16hf, v16hi,
+ ((v16hf) { 12.0, 16.0, 18.0, 10.0, 12.0, 13.0, 14.0, 4.0,
+ 7.0, 18.0, 3.0, 5.0, 9.0, 8.0, 7.0, 13.0 }),
+ ((v16hf) { 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0,
+ 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0 }),
+ ((v16hi) { 9, 13, 15, 7, 25, 26, 27, 17,
+ 4, 31, 0, 18, 6, 5, 4, 10 }));
return 0;
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_zip1_1.c b/gcc/testsuite/gcc.target/aarch64/sve_zip1_1.c
index 509dddcb100..918313f62bd 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_zip1_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_zip1_1.c
@@ -5,12 +5,15 @@
#define BIAS 0
#endif
-typedef long v4di __attribute__((vector_size (32)));
-typedef int v8si __attribute__((vector_size (32)));
-typedef short v16hi __attribute__((vector_size (32)));
-typedef char v32qi __attribute__((vector_size (32)));
+#include <stdint.h>
+
+typedef int64_t v4di __attribute__((vector_size (32)));
+typedef int32_t v8si __attribute__((vector_size (32)));
+typedef int16_t v16hi __attribute__((vector_size (32)));
+typedef int8_t v32qi __attribute__((vector_size (32)));
typedef double v4df __attribute__((vector_size (32)));
typedef float v8sf __attribute__((vector_size (32)));
+typedef _Float16 v16hf __attribute__((vector_size (32)));
#define MASK_2(X, Y) X, Y + X
#define MASK_4(X, Y) MASK_2 (X, Y), MASK_2 (X + 1, Y)
@@ -38,7 +41,8 @@ typedef float v8sf __attribute__((vector_size (32)));
T (v16hi, 16) \
T (v32qi, 32) \
T (v4df, 4) \
- T (v8sf, 8)
+ T (v8sf, 8) \
+ T (v16hf, 16)
TEST_ALL (PERMUTE)
@@ -46,5 +50,5 @@ TEST_ALL (PERMUTE)
/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d} 2 } } */
/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s} 2 } } */
-/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h} 1 } } */
+/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h} 2 } } */
/* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve_zip2_1.c b/gcc/testsuite/gcc.target/aarch64/sve_zip2_1.c
index 360ffab7d3e..40a899bc40a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve_zip2_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve_zip2_1.c
@@ -8,5 +8,5 @@
/* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d} 2 } } */
/* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s} 2 } } */
-/* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h} 1 } } */
+/* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.h, z[0-9]+\.h, z[0-9]+\.h} 2 } } */
/* { dg-final { scan-assembler-times {\tzip2\tz[0-9]+\.b, z[0-9]+\.b, z[0-9]+\.b} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/target_attr_11.c b/gcc/testsuite/gcc.target/aarch64/target_attr_11.c
index 7cfb826fc44..a3df438206b 100644
--- a/gcc/testsuite/gcc.target/aarch64/target_attr_11.c
+++ b/gcc/testsuite/gcc.target/aarch64/target_attr_11.c
@@ -10,4 +10,4 @@ foo (int a)
}
/* { dg-error "does not allow a negated form" "" { target *-*-* } 0 } */
-/* { dg-error "is invalid" "" { target *-*-* } 0 } */
+/* { dg-error "is not valid" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/aarch64/target_attr_12.c b/gcc/testsuite/gcc.target/aarch64/target_attr_12.c
index 39cb9964003..8a3a25bfed7 100644
--- a/gcc/testsuite/gcc.target/aarch64/target_attr_12.c
+++ b/gcc/testsuite/gcc.target/aarch64/target_attr_12.c
@@ -10,4 +10,4 @@ foo (int a)
}
/* { dg-error "does not accept an argument" "" { target *-*-* } 0 } */
-/* { dg-error "is invalid" "" { target *-*-* } 0 } */
+/* { dg-error "is not valid" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/aarch64/target_attr_17.c b/gcc/testsuite/gcc.target/aarch64/target_attr_17.c
index 483cc6d4a1d..2a7a7511bea 100644
--- a/gcc/testsuite/gcc.target/aarch64/target_attr_17.c
+++ b/gcc/testsuite/gcc.target/aarch64/target_attr_17.c
@@ -5,4 +5,4 @@ foo (int a)
return a + 5;
}
-/* { dg-error "target attribute.*is invalid" "" { target *-*-* } 0 } */ \ No newline at end of file
+/* { dg-error "attribute 'target\\(\"invalid-attr-string\"\\)' is not valid" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-vcvt.c b/gcc/testsuite/gcc.target/aarch64/vect-vcvt.c
index a1422d7090b..436399c6195 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-vcvt.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-vcvt.c
@@ -56,13 +56,13 @@ TEST (SUFFIX, q, 32, 4, u,u,s) \
TEST (SUFFIX, q, 64, 2, u,u,d) \
BUILD_VARIANTS ( )
-/* { dg-final { scan-assembler "fcvtzs\\tw\[0-9\]+, s\[0-9\]+" } } */
-/* { dg-final { scan-assembler "fcvtzs\\tx\[0-9\]+, d\[0-9\]+" } } */
+/* { dg-final { scan-assembler "fcvtzs\\t(w|s)\[0-9\]+, s\[0-9\]+" } } */
+/* { dg-final { scan-assembler "fcvtzs\\t(x|d)\[0-9\]+, d\[0-9\]+" } } */
/* { dg-final { scan-assembler "fcvtzs\\tv\[0-9\]+\.2s, v\[0-9\]+\.2s" } } */
/* { dg-final { scan-assembler "fcvtzs\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s" } } */
/* { dg-final { scan-assembler "fcvtzs\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d" } } */
-/* { dg-final { scan-assembler "fcvtzu\\tw\[0-9\]+, s\[0-9\]+" } } */
-/* { dg-final { scan-assembler "fcvtzu\\tx\[0-9\]+, d\[0-9\]+" } } */
+/* { dg-final { scan-assembler "fcvtzu\\t(w|s)\[0-9\]+, s\[0-9\]+" } } */
+/* { dg-final { scan-assembler "fcvtzu\\t(x|d)\[0-9\]+, d\[0-9\]+" } } */
/* { dg-final { scan-assembler "fcvtzu\\tv\[0-9\]+\.2s, v\[0-9\]+\.2s" } } */
/* { dg-final { scan-assembler "fcvtzu\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s" } } */
/* { dg-final { scan-assembler "fcvtzu\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d" } } */
diff --git a/gcc/testsuite/gcc.target/alpha/sqrt.c b/gcc/testsuite/gcc.target/alpha/sqrt.c
new file mode 100644
index 00000000000..a3c8b243ae4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/alpha/sqrt.c
@@ -0,0 +1,25 @@
+/* glibc bug, https://sourceware.org/ml/libc-alpha/2017-04/msg00256.html
+ When using software completions, we have to prevent the assembler from
+ matching the input and output operands of the sqrtt/sqrtf insn. Fixed in glibc 2.26. */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-builtin-sqrt -mieee" } */
+
+double sqrt (double);
+
+static double
+float64frombits (unsigned long b)
+{
+ union { unsigned long __b; double __d; } u = { .__b = b };
+ return u.__d;
+}
+
+int
+main (void)
+{
+ double a = float64frombits (2);
+
+ if (sqrt (a) != 3.1434555694052576e-162)
+ __builtin_abort ();
+
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arc/loop-1.c b/gcc/testsuite/gcc.target/arc/loop-1.c
new file mode 100755
index 00000000000..274bb4623c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/loop-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check how we handle empty body loops. */
+
+int a;
+void fn1(void) {
+ int i;
+ for (; i < 8; i++) {
+ double A[a];
+ }
+}
diff --git a/gcc/testsuite/gcc.target/arm/peep-ldrd-1.c b/gcc/testsuite/gcc.target/arm/peep-ldrd-1.c
index eb2b86ee7b6..d49eff6b87e 100644
--- a/gcc/testsuite/gcc.target/arm/peep-ldrd-1.c
+++ b/gcc/testsuite/gcc.target/arm/peep-ldrd-1.c
@@ -8,4 +8,4 @@ int foo(int a, int b, int* p, int *q)
*p = a;
return a;
}
-/* { dg-final { scan-assembler "ldrd" } } */
+/* { dg-final { scan-assembler "ldrd\\t" } } */
diff --git a/gcc/testsuite/gcc.target/arm/peep-ldrd-2.c b/gcc/testsuite/gcc.target/arm/peep-ldrd-2.c
new file mode 100644
index 00000000000..6822c2b1454
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/peep-ldrd-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_prefer_ldrd_strd } */
+/* { dg-options "-O2 -mno-unaligned-access" } */
+int foo(int a, int b, int* p, int *q)
+{
+ a = p[2] + p[3];
+ *q = a;
+ *p = a;
+ return a;
+}
+/* { dg-final { scan-assembler-not "ldrd\\t" } } */
diff --git a/gcc/testsuite/gcc.target/arm/peep-strd-1.c b/gcc/testsuite/gcc.target/arm/peep-strd-1.c
index bd330769599..fe1beac7229 100644
--- a/gcc/testsuite/gcc.target/arm/peep-strd-1.c
+++ b/gcc/testsuite/gcc.target/arm/peep-strd-1.c
@@ -6,4 +6,4 @@ void foo(int a, int b, int* p)
p[2] = a;
p[3] = b;
}
-/* { dg-final { scan-assembler "strd" } } */
+/* { dg-final { scan-assembler "strd\\t" } } */
diff --git a/gcc/testsuite/gcc.target/arm/peep-strd-2.c b/gcc/testsuite/gcc.target/arm/peep-strd-2.c
new file mode 100644
index 00000000000..bfc5ebe9eec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/peep-strd-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_prefer_ldrd_strd } */
+/* { dg-options "-O2 -mno-unaligned-access" } */
+void foo(int a, int b, int* p)
+{
+ p[2] = a;
+ p[3] = b;
+}
+/* { dg-final { scan-assembler-not "strd\\t" } } */
diff --git a/gcc/testsuite/gcc.target/arm/require-pic-register-loc.c b/gcc/testsuite/gcc.target/arm/require-pic-register-loc.c
index bd85e8640c2..268e9e42667 100644
--- a/gcc/testsuite/gcc.target/arm/require-pic-register-loc.c
+++ b/gcc/testsuite/gcc.target/arm/require-pic-register-loc.c
@@ -18,12 +18,12 @@ main (int argc) /* line 9. */
return 0;
}
-/* { dg-final { scan-assembler-not "\.loc 1 7 0" } } */
-/* { dg-final { scan-assembler-not "\.loc 1 8 0" } } */
-/* { dg-final { scan-assembler-not "\.loc 1 9 0" } } */
+/* { dg-final { scan-assembler-not "\.loc 1 7 \[0-9\]\+" } } */
+/* { dg-final { scan-assembler-not "\.loc 1 8 \[0-9\]\+" } } */
+/* { dg-final { scan-assembler-not "\.loc 1 9 \[0-9\]\+" } } */
/* The loc at the start of the prologue. */
-/* { dg-final { scan-assembler-times "\.loc 1 10 0" 1 } } */
+/* { dg-final { scan-assembler-times "\.loc 1 10 \[0-9\]\+" 1 } } */
/* The loc at the end of the prologue, with the first user line. */
-/* { dg-final { scan-assembler-times "\.loc 1 11 0" 1 } } */
+/* { dg-final { scan-assembler-times "\.loc 1 11 \[0-9\]\+" 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/vdot-exec.c b/gcc/testsuite/gcc.target/arm/simd/vdot-exec.c
new file mode 100644
index 00000000000..054f4703394
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vdot-exec.c
@@ -0,0 +1,55 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O3" } */
+/* { dg-require-effective-target arm_v8_2a_dotprod_neon_hw } */
+/* { dg-add-options arm_v8_2a_dotprod_neon } */
+
+#include <arm_neon.h>
+
+extern void abort();
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+# define ORDER(x, y) y
+#else
+# define ORDER(x, y) x - y
+#endif
+
+#define P(n1,n2) n1,n1,n1,n1,n2,n2,n2,n2
+#define ARR(nm, p, ty, ...) ty nm##_##p = { __VA_ARGS__ }
+#define TEST(t1, t2, t3, f, r1, r2, n1, n2) \
+ ARR(f, x, t1, r1); \
+ ARR(f, y, t2, r2); \
+ t3 f##_##r = {0}; \
+ f##_##r = f (f##_##r, f##_##x, f##_##y); \
+ if (f##_##r[0] != n1 || f##_##r[1] != n2) \
+ abort ();
+
+#define TEST_LANE(t1, t2, t3, f, r1, r2, n1, n2, n3, n4) \
+ ARR(f, x, t1, r1); \
+ ARR(f, y, t2, r2); \
+ t3 f##_##rx = {0}; \
+ f##_##rx = f (f##_##rx, f##_##x, f##_##y, ORDER (1, 0)); \
+ if (f##_##rx[0] != n1 || f##_##rx[1] != n2) \
+ abort (); \
+ t3 f##_##rx1 = {0}; \
+ f##_##rx1 = f (f##_##rx1, f##_##x, f##_##y, ORDER (1, 1)); \
+ if (f##_##rx1[0] != n3 || f##_##rx1[1] != n4) \
+ abort (); \
+
+int
+main()
+{
+ TEST (uint8x8_t, uint8x8_t, uint32x2_t, vdot_u32, P(1,2), P(2,3), 8, 24);
+ TEST (int8x8_t, int8x8_t, int32x2_t, vdot_s32, P(1,2), P(-2,-3), -8, -24);
+
+ TEST (uint8x16_t, uint8x16_t, uint32x4_t, vdotq_u32, P(1,2), P(2,3), 8, 24);
+ TEST (int8x16_t, int8x16_t, int32x4_t, vdotq_s32, P(1,2), P(-2,-3), -8, -24);
+
+ TEST_LANE (uint8x8_t, uint8x8_t, uint32x2_t, vdot_lane_u32, P(1,2), P(2,3), 8, 16, 12, 24);
+
+ TEST_LANE (int8x8_t, int8x8_t, int32x2_t, vdot_lane_s32, P(1,2), P(-2,-3), -8, -16, -12, -24);
+
+ TEST_LANE (uint8x16_t, uint8x8_t, uint32x4_t, vdotq_lane_u32, P(1,2), P(2,3), 8, 16, 12, 24);
+ TEST_LANE (int8x16_t, int8x8_t, int32x4_t, vdotq_lane_s32, P(1,2), P(-2,-3), -8, -16, -12, -24);
+
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/387-ficom-1.c b/gcc/testsuite/gcc.target/i386/387-ficom-1.c
index 8c73ddcb2da..325698854a9 100644
--- a/gcc/testsuite/gcc.target/i386/387-ficom-1.c
+++ b/gcc/testsuite/gcc.target/i386/387-ficom-1.c
@@ -37,5 +37,5 @@ int test_ld_i (int x)
return (long double)i != x;
}
-/* { dg-final { scan-assembler-times "ficomps" 3 } } */
+/* { dg-final { scan-assembler-times "ficomp\[s\t\]" 3 } } */
/* { dg-final { scan-assembler-times "ficompl" 3 } } */
diff --git a/gcc/testsuite/gcc.target/i386/387-ficom-2.c b/gcc/testsuite/gcc.target/i386/387-ficom-2.c
index 4190ebaae71..d5283684f8e 100644
--- a/gcc/testsuite/gcc.target/i386/387-ficom-2.c
+++ b/gcc/testsuite/gcc.target/i386/387-ficom-2.c
@@ -5,5 +5,5 @@
#include "387-ficom-1.c"
-/* { dg-final { scan-assembler-times "ficomps" 3 } } */
+/* { dg-final { scan-assembler-times "ficomp\[s\t\]" 3 } } */
/* { dg-final { scan-assembler-times "ficompl" 3 } } */
diff --git a/gcc/testsuite/gcc.target/i386/attr-nocf-check-1a.c b/gcc/testsuite/gcc.target/i386/attr-nocf-check-1a.c
new file mode 100644
index 00000000000..9549e697658
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/attr-nocf-check-1a.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-fcf-protection -mcet" } */
+
+int func (int) __attribute__ ((nocf_check));
+int (*fptr) (int) __attribute__ ((nocf_check));
+typedef void (*nocf_check_t) (void) __attribute__ ((nocf_check));
+
+int
+foo1 (int arg)
+{
+ return func (arg) + fptr (arg);
+}
+
+void
+foo2 (void (*foo) (void))
+{
+ void (*func) (void) __attribute__((nocf_check)) = foo; /* { dg-warning "incompatible pointer type" "" { target c } } */
+ /* { dg-error "invalid conversion" "" { target c++ } .-1 } */
+ func ();
+}
+
+void
+foo3 (nocf_check_t foo)
+{
+ foo ();
+}
+
+void
+foo4 (void (*foo) (void) __attribute__((nocf_check)))
+{
+ foo ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/attr-nocf-check-3a.c b/gcc/testsuite/gcc.target/i386/attr-nocf-check-3a.c
new file mode 100644
index 00000000000..1a833012409
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/attr-nocf-check-3a.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-fcf-protection -mcet" } */
+
+int foo (void) __attribute__ ((nocf_check));
+void (*foo1) (void) __attribute__((nocf_check));
+void (*foo2) (void);
+
+int __attribute__ ((nocf_check))
+foo (void) /* The function's address is not tracked. */
+{
+ /* This call site is not tracked for
+ control-flow instrumentation. */
+ (*foo1)();
+
+ foo1 = foo2; /* { dg-warning "incompatible pointer type" "" { target c } } */
+ /* { dg-error "invalid conversion" "" { target c++ } .-1 } */
+ /* This call site is still not tracked for
+ control-flow instrumentation. */
+ (*foo1)();
+
+ /* This call site is tracked for
+ control-flow instrumentation. */
+ (*foo2)();
+
+ foo2 = foo1; /* { dg-warning "incompatible pointer type" "" { target c } } */
+ /* { dg-error "invalid conversion" "" { target c++ } .-1 } */
+ /* This call site is still tracked for
+ control-flow instrumentation. */
+ (*foo2)();
+
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
index 085ba81a672..46238265ae6 100644
--- a/gcc/testsuite/gcc.target/i386/avx-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx-1.c
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -m3dnow -mavx -mavx2 -maes -mpclmul" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -m3dnow -mavx -mavx2 -maes -mpclmul -mgfni" } */
/* { dg-add-options bind_pic_locally } */
#include <mm_malloc.h>
@@ -412,8 +412,8 @@
/* avx512dqintrin.h */
#define __builtin_ia32_kshiftliqi(A, B) __builtin_ia32_kshiftliqi(A, 8)
#define __builtin_ia32_kshiftriqi(A, B) __builtin_ia32_kshiftriqi(A, 8)
-#define __builtin_ia32_reducess(A, B, F) __builtin_ia32_reducess(A, B, 1)
-#define __builtin_ia32_reducesd(A, B, F) __builtin_ia32_reducesd(A, B, 1)
+#define __builtin_ia32_reducess_mask(A, B, F, W, U) __builtin_ia32_reducess_mask(A, B, 1, W, U)
+#define __builtin_ia32_reducesd_mask(A, B, F, W, U) __builtin_ia32_reducesd_mask(A, B, 1, W, U)
#define __builtin_ia32_reduceps512_mask(A, E, C, D) __builtin_ia32_reduceps512_mask(A, 1, C, D)
#define __builtin_ia32_reducepd512_mask(A, E, C, D) __builtin_ia32_reducepd512_mask(A, 1, C, D)
#define __builtin_ia32_rangess128_round(A, B, I, F) __builtin_ia32_rangess128_round(A, B, 1, 8)
@@ -603,6 +603,16 @@
#define __builtin_ia32_extracti64x2_256_mask(A, E, C, D) __builtin_ia32_extracti64x2_256_mask(A, 1, C, D)
#define __builtin_ia32_extractf64x2_256_mask(A, E, C, D) __builtin_ia32_extractf64x2_256_mask(A, 1, C, D)
+/* gfniintrin.h */
+#define __builtin_ia32_vgf2p8affineinvqb_v16qi(A, B, C) __builtin_ia32_vgf2p8affineinvqb_v16qi(A, B, 1)
+#define __builtin_ia32_vgf2p8affineinvqb_v32qi(A, B, C) __builtin_ia32_vgf2p8affineinvqb_v32qi(A, B, 1)
+#define __builtin_ia32_vgf2p8affineinvqb_v64qi(A, B, C) __builtin_ia32_vgf2p8affineinvqb_v64qi(A, B, 1)
+#define __builtin_ia32_vgf2p8affineinvqb_v16qi_mask(A, B, C, D, E) __builtin_ia32_vgf2p8affineinvqb_v16qi_mask(A, B, 1, D, E)
+#define __builtin_ia32_vgf2p8affineinvqb_v32qi_mask(A, B, C, D, E) __builtin_ia32_vgf2p8affineinvqb_v32qi_mask(A, B, 1, D, E)
+#define __builtin_ia32_vgf2p8affineinvqb_v64qi_mask(A, B, C, D, E) __builtin_ia32_vgf2p8affineinvqb_v64qi_mask(A, B, 1, D, E)
+
+
+
#include <wmmintrin.h>
#include <immintrin.h>
#include <mm3dnow.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx-pr82370.c b/gcc/testsuite/gcc.target/i386/avx-pr82370.c
new file mode 100644
index 00000000000..4dc8a5bdaaf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx-pr82370.c
@@ -0,0 +1,65 @@
+/* PR target/82370 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx -mno-avx2 -masm=att" } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 3 } } */
+
+typedef short int v32hi __attribute__((vector_size (64)));
+typedef short int v16hi __attribute__((vector_size (32)));
+typedef short int v8hi __attribute__((vector_size (16)));
+typedef int v16si __attribute__((vector_size (64)));
+typedef int v8si __attribute__((vector_size (32)));
+typedef int v4si __attribute__((vector_size (16)));
+typedef long long int v8di __attribute__((vector_size (64)));
+typedef long long int v4di __attribute__((vector_size (32)));
+typedef long long int v2di __attribute__((vector_size (16)));
+typedef unsigned short int v32uhi __attribute__((vector_size (64)));
+typedef unsigned short int v16uhi __attribute__((vector_size (32)));
+typedef unsigned short int v8uhi __attribute__((vector_size (16)));
+typedef unsigned int v16usi __attribute__((vector_size (64)));
+typedef unsigned int v8usi __attribute__((vector_size (32)));
+typedef unsigned int v4usi __attribute__((vector_size (16)));
+typedef unsigned long long int v8udi __attribute__((vector_size (64)));
+typedef unsigned long long int v4udi __attribute__((vector_size (32)));
+typedef unsigned long long int v2udi __attribute__((vector_size (16)));
+
+#ifdef __AVX512F__
+v32hi f1 (v32hi *x) { return *x >> 3; }
+v32uhi f2 (v32uhi *x) { return *x >> 5; }
+v32uhi f3 (v32uhi *x) { return *x << 7; }
+#endif
+v16hi f4 (v16hi *x) { return *x >> 3; }
+v16uhi f5 (v16uhi *x) { return *x >> 5; }
+v16uhi f6 (v16uhi *x) { return *x << 7; }
+v8hi f7 (v8hi *x) { return *x >> 3; }
+v8uhi f8 (v8uhi *x) { return *x >> 5; }
+v8uhi f9 (v8uhi *x) { return *x << 7; }
+#ifdef __AVX512F__
+v16si f10 (v16si *x) { return *x >> 3; }
+v16usi f11 (v16usi *x) { return *x >> 5; }
+v16usi f12 (v16usi *x) { return *x << 7; }
+#endif
+v8si f13 (v8si *x) { return *x >> 3; }
+v8usi f14 (v8usi *x) { return *x >> 5; }
+v8usi f15 (v8usi *x) { return *x << 7; }
+v4si f16 (v4si *x) { return *x >> 3; }
+v4usi f17 (v4usi *x) { return *x >> 5; }
+v4usi f18 (v4usi *x) { return *x << 7; }
+#ifdef __AVX512F__
+v8di f19 (v8di *x) { return *x >> 3; }
+v8udi f20 (v8udi *x) { return *x >> 5; }
+v8udi f21 (v8udi *x) { return *x << 7; }
+#endif
+v4di f22 (v4di *x) { return *x >> 3; }
+v4udi f23 (v4udi *x) { return *x >> 5; }
+v4udi f24 (v4udi *x) { return *x << 7; }
+v2di f25 (v2di *x) { return *x >> 3; }
+v2udi f26 (v2udi *x) { return *x >> 5; }
+v2udi f27 (v2udi *x) { return *x << 7; }
diff --git a/gcc/testsuite/gcc.target/i386/avx2-pr82370.c b/gcc/testsuite/gcc.target/i386/avx2-pr82370.c
new file mode 100644
index 00000000000..6609ebb504a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx2-pr82370.c
@@ -0,0 +1,23 @@
+/* PR target/82370 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx2 -mno-avx512f -masm=att" } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+
+#include "avx-pr82370.c"
diff --git a/gcc/testsuite/gcc.target/i386/avx512-check.h b/gcc/testsuite/gcc.target/i386/avx512-check.h
index 9693fa46721..9390c1ab9ea 100644
--- a/gcc/testsuite/gcc.target/i386/avx512-check.h
+++ b/gcc/testsuite/gcc.target/i386/avx512-check.h
@@ -75,6 +75,9 @@ main ()
#ifdef AVX512VPOPCNTDQ
&& (ecx & bit_AVX512VPOPCNTDQ)
#endif
+#ifdef GFNI
+ && (ecx & bit_GFNI)
+#endif
&& avx512f_os_support ())
{
DO_TEST ();
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-pr82370.c b/gcc/testsuite/gcc.target/i386/avx512bw-pr82370.c
new file mode 100644
index 00000000000..174f499a885
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-pr82370.c
@@ -0,0 +1,33 @@
+/* PR target/82370 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512bw -mno-avx512vl -masm=att" } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vps\[lr]\[la]\[dwq]\[ \t]\+\\\$\[357], %zmm\[0-9]\+, %zmm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+
+#include "avx-pr82370.c"
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-vpermt2w-1.c b/gcc/testsuite/gcc.target/i386/avx512bw-vpermt2w-1.c
index be8737ec785..a734cb600ce 100644
--- a/gcc/testsuite/gcc.target/i386/avx512bw-vpermt2w-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-vpermt2w-1.c
@@ -1,14 +1,14 @@
/* { dg-do compile } */
/* { dg-options "-mavx512bw -mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vpermt2w\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2w\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } *
-/* { dg-final { scan-assembler-times "vpermt2w\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2w\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2w\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2w\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2w\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2w\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2w\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2w\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2w\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } *
+/* { dg-final { scan-assembler-times "vperm\[ti]2w\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2w\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2w\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2w\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2w\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2w\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2w\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-vreducesd-1.c b/gcc/testsuite/gcc.target/i386/avx512dq-vreducesd-1.c
index b7549fada36..b8f24a0ccbd 100644
--- a/gcc/testsuite/gcc.target/i386/avx512dq-vreducesd-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-vreducesd-1.c
@@ -2,13 +2,24 @@
/* { dg-options "-mavx512dq -O2" } */
/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducesd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+
#include <immintrin.h>
+#define IMM 123
+
volatile __m128d x1, x2;
volatile __mmask8 m;
void extern
avx512dq_test (void)
{
- x1 = _mm_reduce_sd (x1, x2, 123);
+ x1 = _mm_reduce_sd (x1, x2, IMM);
+
+ x1 = _mm_mask_reduce_sd(x1, m, x1, x2, IMM);
+
+ x1 = _mm_maskz_reduce_sd(m, x1, x2, IMM);
}
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-vreducesd-2.c b/gcc/testsuite/gcc.target/i386/avx512dq-vreducesd-2.c
new file mode 100644
index 00000000000..93e18271cbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-vreducesd-2.c
@@ -0,0 +1,66 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512dq" } */
+/* { dg-require-effective-target avx512dq } */
+
+#define AVX512DQ
+#include "avx512f-helper.h"
+#include <string.h>
+
+#define SIZE (AVX512F_LEN / 64)
+#include "avx512f-mask-type.h"
+
+#define IMM 0x23
+
+void
+CALC (double *r, double *s)
+{
+ int i;
+
+ memcpy (&r[1], &s[1], sizeof(double));
+
+ for (i = 0; i < 1; i++)
+ {
+ double tmp = (int) (4 * s[i]) / 4.0;
+ r[i] = s[i] - tmp;
+ }
+}
+
+void
+TEST (void)
+{
+ union128d res1, res2, res3;
+ union128d s1, s2, src;
+ double res_ref[2];
+ MASK_TYPE mask = MASK_VALUE;
+ int j;
+
+ for (j = 0; j < 2; j++)
+ {
+ s1.a[j] = j / 123.456;
+ s2.a[j] = j / 123.456;
+ res_ref[j] = j / 123.456;
+ res1.a[j] = DEFAULT_VALUE;
+ res2.a[j] = DEFAULT_VALUE;
+ res3.a[j] = DEFAULT_VALUE;
+ }
+
+ res1.x = _mm_reduce_sd (s1.x, s2.x, IMM);
+ res2.x = _mm_mask_reduce_sd (s1.x, mask, s1.x, s2.x, IMM);
+ res3.x = _mm_maskz_reduce_sd (mask, s1.x, s2.x, IMM);
+
+ CALC (res_ref, s2.a);
+
+ if (check_union128d (res1, res_ref))
+ abort ();
+
+ MASK_MERGE (d) (res_ref, mask, 1);
+
+ if (check_union128d (res2, res_ref))
+ abort ();
+
+ MASK_ZERO (d) (res_ref, mask, 1);
+
+ if (check_union128d (res3, res_ref))
+ abort ();
+
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-vreducess-1.c b/gcc/testsuite/gcc.target/i386/avx512dq-vreducess-1.c
index 2a6afe9643b..804074e2ba6 100644
--- a/gcc/testsuite/gcc.target/i386/avx512dq-vreducess-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-vreducess-1.c
@@ -2,13 +2,23 @@
/* { dg-options "-mavx512dq -O2" } */
/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vreducess\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
#include <immintrin.h>
+#define IMM 123
+
volatile __m128 x1, x2;
volatile __mmask8 m;
void extern
avx512dq_test (void)
{
- x1 = _mm_reduce_ss (x1, x2, 123);
+ x1 = _mm_reduce_ss (x1, x2, IMM);
+
+ x1 = _mm_mask_reduce_ss (x1, m, x1, x2, IMM);
+
+ x1 = _mm_maskz_reduce_ss (m, x1, x2, IMM);
}
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-vreducess-2.c b/gcc/testsuite/gcc.target/i386/avx512dq-vreducess-2.c
new file mode 100644
index 00000000000..8558c3b3468
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-vreducess-2.c
@@ -0,0 +1,68 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512dq" } */
+/* { dg-require-effective-target avx512dq } */
+
+#define AVX512DQ
+#include "avx512f-helper.h"
+#include <string.h>
+
+#define SIZE (AVX512F_LEN / 64)
+#include "avx512f-mask-type.h"
+
+#define IMM 0x23
+
+void
+CALC (float *r, float *s)
+{
+ int i;
+
+ memcpy (&r[1], &s[1], 2 * sizeof(float));
+
+ for (i = 0; i < 2; i++)
+ {
+ float tmp = (int) (4 * s[i]) / 4.0;
+ r[i] = s[i] - tmp;
+ }
+}
+
+void
+TEST (void)
+{
+ printf("\nsize = %d\n\n", SIZE);
+
+ union128 res1, res2, res3;
+ union128 s1, s2, src;
+ float res_ref[4];
+ MASK_TYPE mask = MASK_VALUE;
+ int j;
+
+ for (j = 0; j < 4; j++)
+ {
+ s1.a[j] = j / 123.456;
+ s2.a[j] = j / 123.456;
+ res_ref[j] = j / 123.456;
+ res1.a[j] = DEFAULT_VALUE;
+ res2.a[j] = DEFAULT_VALUE;
+ res3.a[j] = DEFAULT_VALUE;
+ }
+
+ res1.x = _mm_reduce_ss (s1.x, s2.x, IMM);
+ res2.x = _mm_mask_reduce_ss (s1.x, mask, s1.x, s2.x, IMM);
+ res3.x = _mm_maskz_reduce_ss (mask, s1.x, s2.x, IMM);
+
+ CALC (res_ref, s2.a);
+
+ if (check_union128 (res1, res_ref))
+ abort ();
+
+ MASK_MERGE () (res_ref, mask, 1);
+
+ if (check_union128 (res2, res_ref))
+ abort ();
+
+ MASK_ZERO () (res_ref, mask, 1);
+
+ if (check_union128 (res3, res_ref))
+ abort ();
+
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-gf2p8affineinvqb-2.c b/gcc/testsuite/gcc.target/i386/avx512f-gf2p8affineinvqb-2.c
new file mode 100644
index 00000000000..af4839f4434
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-gf2p8affineinvqb-2.c
@@ -0,0 +1,74 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512f -mgfni -mavx512bw" } */
+/* { dg-require-effective-target avx512f } */
+/* { dg-require-effective-target gfni } */
+
+#define AVX512F
+
+#define GFNI
+#include "avx512f-helper.h"
+
+#define SIZE (AVX512F_LEN / 8)
+
+#include "avx512f-mask-type.h"
+#include <x86intrin.h>
+
+static void
+CALC (unsigned char *r, unsigned char *s1, unsigned char *s2, unsigned char imm)
+{
+ for (int a = 0; a < SIZE/8; a++)
+ {
+ for (int val = 0; val < 8; val++)
+ {
+ unsigned char result = 0;
+ for (int bit = 0; bit < 8; bit++)
+ {
+ unsigned char temp = s1[a*8 + val] & s2[a*8 + bit];
+ unsigned char parity = __popcntd(temp);
+ if (parity % 2)
+ result |= (1 << (8 - bit - 1));
+ }
+ r[a*8 + val] = result ^ imm;
+ }
+ }
+}
+
+void
+TEST (void)
+{
+ int i;
+ UNION_TYPE (AVX512F_LEN, i_b) res1, res2, res3, src1, src2;
+ MASK_TYPE mask = MASK_VALUE;
+ char res_ref[SIZE];
+ unsigned char imm = 0;
+
+ for (i = 0; i < SIZE; i++)
+ {
+ src1.a[i] = i % 2; /* The GF(2^8) inverse of 0 and 1 is 0 and 1, respectively.  */
+ src2.a[i] = 1;
+ }
+
+ for (i = 0; i < SIZE; i++)
+ {
+ res1.a[i] = DEFAULT_VALUE;
+ res2.a[i] = DEFAULT_VALUE;
+ res3.a[i] = DEFAULT_VALUE;
+ }
+
+ CALC (res_ref, src1.a, src2.a, imm);
+
+ res1.x = INTRINSIC (_gf2p8affineinv_epi64_epi8) (src1.x, src2.x, imm);
+ res2.x = INTRINSIC (_mask_gf2p8affineinv_epi64_epi8) (res2.x, mask, src1.x, src2.x, imm);
+ res3.x = INTRINSIC (_maskz_gf2p8affineinv_epi64_epi8) (mask, src1.x, src2.x, imm);
+
+ if (UNION_CHECK (AVX512F_LEN, i_b) (res1, res_ref))
+ abort ();
+
+ MASK_MERGE (i_b) (res_ref, mask, SIZE);
+ if (UNION_CHECK (AVX512F_LEN, i_b) (res2, res_ref))
+ abort ();
+
+ MASK_ZERO (i_b) (res_ref, mask, SIZE);
+ if (UNION_CHECK (AVX512F_LEN, i_b) (res3, res_ref))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-pr82370.c b/gcc/testsuite/gcc.target/i386/avx512f-pr82370.c
new file mode 100644
index 00000000000..20ad8dccd29
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-pr82370.c
@@ -0,0 +1,33 @@
+/* PR target/82370 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f -mno-avx512bw -mno-avx512vl -masm=att" } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vps\[lr]\[la]\[dwq]\[ \t]\+\\\$\[357], %zmm\[0-9]\+, %zmm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 0 } } */
+
+#include "avx-pr82370.c"
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vcmppd-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vcmppd-1.c
index 4b53e379acc..d3c30fcedb9 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vcmppd-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vcmppd-1.c
@@ -1,7 +1,7 @@
/* { dg-do compile } */
/* { dg-options "-O2 -mavx512f" } */
-/* { dg-final { scan-assembler-times "vcmppd\[ \\t\]+\[^\{\n\]*\[^\}\]%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vcmppd\[ \\t\]+\[^\{\n\]*\[^\}\]%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\]\{%k\[0-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vcmppd\[ \\t\]+\[^\{\n\]*\[^\}\]%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\](?:\n|\[ \\t\]+#)" 9 } } */
+/* { dg-final { scan-assembler-times "vcmppd\[ \\t\]+\[^\{\n\]*\[^\}\]%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\]\{%k\[0-7\]\}(?:\n|\[ \\t\]+#)" 9 } } */
/* { dg-final { scan-assembler-times "vcmppd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vcmppd\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\]\{%k\[0-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
@@ -17,4 +17,29 @@ avx512f_test (void)
m = _mm512_mask_cmp_pd_mask (m, x, x, _CMP_FALSE_OQ);
m = _mm512_cmp_round_pd_mask (x, x, _CMP_FALSE_OQ, _MM_FROUND_NO_EXC);
m = _mm512_mask_cmp_round_pd_mask (m, x, x, _CMP_FALSE_OQ, _MM_FROUND_NO_EXC);
+
+ m = _mm512_cmpeq_pd_mask (x, x);
+ m = _mm512_mask_cmpeq_pd_mask (m, x, x);
+
+ m = _mm512_cmplt_pd_mask (x, x);
+ m = _mm512_mask_cmplt_pd_mask (m, x, x);
+
+ m = _mm512_cmple_pd_mask (x, x);
+ m = _mm512_mask_cmple_pd_mask (m, x, x);
+
+ m = _mm512_cmpunord_pd_mask (x, x);
+ m = _mm512_mask_cmpunord_pd_mask (m, x, x);
+
+ m = _mm512_cmpneq_pd_mask (x, x);
+ m = _mm512_mask_cmpneq_pd_mask (m, x, x);
+
+ m = _mm512_cmpnlt_pd_mask (x, x);
+ m = _mm512_mask_cmpnlt_pd_mask (m, x, x);
+
+ m = _mm512_cmpnle_pd_mask (x, x);
+ m = _mm512_mask_cmpnle_pd_mask (m, x, x);
+
+ m = _mm512_cmpord_pd_mask (x, x);
+ m = _mm512_mask_cmpord_pd_mask (m, x, x);
}
+
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vcmppd-2.c b/gcc/testsuite/gcc.target/i386/avx512f-vcmppd-2.c
index 52e226d9f15..cee11971399 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vcmppd-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vcmppd-2.c
@@ -11,58 +11,69 @@
#define SIZE (AVX512F_LEN / 64)
#include "avx512f-mask-type.h"
+#undef SUF
+#undef SSIZE
+#undef GEN_CMP
+#undef CHECK_CMP
+
#if AVX512F_LEN == 512
-#define CMP(imm, rel) \
- dst_ref = 0; \
- for (i = 0; i < 8; i++) \
- { \
- dst_ref = (((int) rel) << i) | dst_ref; \
- } \
- source1.x = _mm512_loadu_pd(s1); \
- source2.x = _mm512_loadu_pd(s2); \
- dst1 = _mm512_cmp_pd_mask(source1.x, source2.x, imm);\
- dst2 = _mm512_mask_cmp_pd_mask(mask, source1.x, source2.x, imm);\
- if (dst_ref != dst1) abort(); \
- if ((dst_ref & mask) != dst2) abort();
+#define SUF(fun) _mm512##fun
+#define SSIZE 8
+
+#define GEN_CMP(type) \
+ { \
+ dst3 = _mm512_cmp##type##_pd_mask(source1.x, source2.x);\
+ dst4 = _mm512_mask_cmp##type##_pd_mask(mask, source1.x, source2.x);\
+ if (dst3 != dst1) abort(); \
+ if (dst4 != dst2) abort(); \
+ }
+
+#define CHECK_CMP(imm) \
+ if (imm == _CMP_EQ_OQ) GEN_CMP(eq) \
+ if (imm == _CMP_LT_OS) GEN_CMP(lt) \
+ if (imm == _CMP_LE_OS) GEN_CMP(le) \
+ if (imm == _CMP_UNORD_Q) GEN_CMP(unord) \
+ if (imm == _CMP_NEQ_UQ) GEN_CMP(neq) \
+ if (imm == _CMP_NLT_US) GEN_CMP(nlt) \
+ if (imm == _CMP_NLE_US) GEN_CMP(nle) \
+ if (imm == _CMP_ORD_Q) GEN_CMP(ord)
+
#endif
#if AVX512F_LEN == 256
-#undef CMP
-#define CMP(imm, rel) \
- dst_ref = 0; \
- for (i = 0; i < 4; i++) \
- { \
- dst_ref = (((int) rel) << i) | dst_ref; \
- } \
- source1.x = _mm256_loadu_pd(s1); \
- source2.x = _mm256_loadu_pd(s2); \
- dst1 = _mm256_cmp_pd_mask(source1.x, source2.x, imm);\
- dst2 = _mm256_mask_cmp_pd_mask(mask, source1.x, source2.x, imm);\
- if (dst_ref != dst1) abort(); \
- if ((dst_ref & mask) != dst2) abort();
+#define SUF(fun) _mm256##fun
+#define SSIZE 4
+#define GEN_CMP(type)
+#define CHECK_CMP(imm)
#endif
#if AVX512F_LEN == 128
+#define SUF(fun) _mm##fun
+#define SSIZE 2
+#define GEN_CMP(type)
+#define CHECK_CMP(imm)
+#endif
+
#undef CMP
#define CMP(imm, rel) \
dst_ref = 0; \
- for (i = 0; i < 2; i++) \
+ for (i = 0; i < SSIZE; i++) \
{ \
dst_ref = (((int) rel) << i) | dst_ref; \
} \
- source1.x = _mm_loadu_pd(s1); \
- source2.x = _mm_loadu_pd(s2); \
- dst1 = _mm_cmp_pd_mask(source1.x, source2.x, imm);\
- dst2 = _mm_mask_cmp_pd_mask(mask, source1.x, source2.x, imm);\
+ source1.x = SUF(_loadu_pd)(s1); \
+ source2.x = SUF(_loadu_pd)(s2); \
+ dst1 = SUF(_cmp_pd_mask)(source1.x, source2.x, imm);\
+ dst2 = SUF(_mask_cmp_pd_mask)(mask, source1.x, source2.x, imm);\
if (dst_ref != dst1) abort(); \
- if ((dst_ref & mask) != dst2) abort();
-#endif
+ if ((dst_ref & mask) != dst2) abort(); \
+ CHECK_CMP(imm)
void
TEST ()
{
UNION_TYPE (AVX512F_LEN, d) source1, source2;
- MASK_TYPE dst1, dst2, dst_ref;
+ MASK_TYPE dst1, dst2, dst3, dst4, dst_ref;
MASK_TYPE mask = MASK_VALUE;
int i;
double s1[8]={2134.3343, 6678.346, 453.345635, 54646.464,
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vcmpps-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vcmpps-1.c
index 9812915a4e7..27be36070ef 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vcmpps-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vcmpps-1.c
@@ -1,7 +1,7 @@
/* { dg-do compile } */
/* { dg-options "-O2 -mavx512f" } */
-/* { dg-final { scan-assembler-times "vcmpps\[ \\t\]+\[^\{\n\]*\[^\}\]%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vcmpps\[ \\t\]+\[^\{\n\]*\[^\}\]%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\]\{%k\[0-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vcmpps\[ \\t\]+\[^\{\n\]*\[^\}\]%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\](?:\n|\[ \\t\]+#)" 9 } } */
+/* { dg-final { scan-assembler-times "vcmpps\[ \\t\]+\[^\{\n\]*\[^\}\]%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\]\{%k\[0-7\]\}(?:\n|\[ \\t\]+#)" 9 } } */
/* { dg-final { scan-assembler-times "vcmpps\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vcmpps\[ \\t\]+\[^\{\n\]*\{sae\}\[^\n\]*%zmm\[0-9\]+\[^\n^k\]*%k\[1-7\]\{%k\[0-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
@@ -17,4 +17,28 @@ avx512f_test (void)
m = _mm512_mask_cmp_ps_mask (m, x, x, _CMP_FALSE_OQ);
m = _mm512_cmp_round_ps_mask (x, x, _CMP_FALSE_OQ, _MM_FROUND_NO_EXC);
m = _mm512_mask_cmp_round_ps_mask (m, x, x, _CMP_FALSE_OQ, _MM_FROUND_NO_EXC);
+
+ m = _mm512_cmpeq_ps_mask (x, x);
+ m = _mm512_mask_cmpeq_ps_mask (m, x, x);
+
+ m = _mm512_cmplt_ps_mask (x, x);
+ m = _mm512_mask_cmplt_ps_mask (m, x, x);
+
+ m = _mm512_cmple_ps_mask (x, x);
+ m = _mm512_mask_cmple_ps_mask (m, x, x);
+
+ m = _mm512_cmpunord_ps_mask (x, x);
+ m = _mm512_mask_cmpunord_ps_mask (m, x, x);
+
+ m = _mm512_cmpneq_ps_mask (x, x);
+ m = _mm512_mask_cmpneq_ps_mask (m, x, x);
+
+ m = _mm512_cmpnlt_ps_mask (x, x);
+ m = _mm512_mask_cmpnlt_ps_mask (m, x, x);
+
+ m = _mm512_cmpnle_ps_mask (x, x);
+ m = _mm512_mask_cmpnle_ps_mask (m, x, x);
+
+ m = _mm512_cmpord_ps_mask (x, x);
+ m = _mm512_mask_cmpord_ps_mask (m, x, x);
}
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vcmpps-2.c b/gcc/testsuite/gcc.target/i386/avx512f-vcmpps-2.c
index 2ffa2ed16b7..22e368f723e 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vcmpps-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vcmpps-2.c
@@ -11,59 +11,69 @@
#define SIZE (AVX512F_LEN / 32)
#include "avx512f-mask-type.h"
+#undef SUF
+#undef SSIZE
+#undef GEN_CMP
+#undef CHECK_CMP
+
#if AVX512F_LEN == 512
-#undef CMP
-#define CMP(imm, rel) \
- dst_ref = 0; \
- for (i = 0; i < 16; i++) \
- { \
- dst_ref = (((int) rel) << i) | dst_ref; \
- } \
- source1.x = _mm512_loadu_ps(s1); \
- source2.x = _mm512_loadu_ps(s2); \
- dst1 = _mm512_cmp_ps_mask(source1.x, source2.x, imm);\
- dst2 = _mm512_mask_cmp_ps_mask(mask, source1.x, source2.x, imm);\
- if (dst_ref != dst1) abort(); \
- if ((dst_ref & mask) != dst2) abort();
+#define SUF(fun) _mm512##fun
+#define SSIZE 16
+
+#define GEN_CMP(type) \
+ { \
+ dst3 = _mm512_cmp##type##_ps_mask(source1.x, source2.x);\
+ dst4 = _mm512_mask_cmp##type##_ps_mask(mask, source1.x, source2.x);\
+ if (dst3 != dst1) abort(); \
+ if (dst4 != dst2) abort(); \
+ }
+
+#define CHECK_CMP(imm) \
+ if (imm == _CMP_EQ_OQ) GEN_CMP(eq) \
+ if (imm == _CMP_LT_OS) GEN_CMP(lt) \
+ if (imm == _CMP_LE_OS) GEN_CMP(le) \
+ if (imm == _CMP_UNORD_Q) GEN_CMP(unord) \
+ if (imm == _CMP_NEQ_UQ) GEN_CMP(neq) \
+ if (imm == _CMP_NLT_US) GEN_CMP(nlt) \
+ if (imm == _CMP_NLE_US) GEN_CMP(nle) \
+ if (imm == _CMP_ORD_Q) GEN_CMP(ord)
+
#endif
#if AVX512F_LEN == 256
-#undef CMP
-#define CMP(imm, rel) \
- dst_ref = 0; \
- for (i = 0; i < 8; i++) \
- { \
- dst_ref = (((int) rel) << i) | dst_ref; \
- } \
- source1.x = _mm256_loadu_ps(s1); \
- source2.x = _mm256_loadu_ps(s2); \
- dst1 = _mm256_cmp_ps_mask(source1.x, source2.x, imm);\
- dst2 = _mm256_mask_cmp_ps_mask(mask, source1.x, source2.x, imm);\
- if (dst_ref != dst1) abort(); \
- if ((dst_ref & mask) != dst2) abort();
+#define SUF(fun) _mm256##fun
+#define SSIZE 8
+#define GEN_CMP(type)
+#define CHECK_CMP(imm)
#endif
#if AVX512F_LEN == 128
+#define SUF(fun) _mm##fun
+#define SSIZE 4
+#define GEN_CMP(type)
+#define CHECK_CMP(imm)
+#endif
+
#undef CMP
#define CMP(imm, rel) \
dst_ref = 0; \
- for (i = 0; i < 4; i++) \
+ for (i = 0; i < SSIZE; i++) \
{ \
dst_ref = (((int) rel) << i) | dst_ref; \
} \
- source1.x = _mm_loadu_ps(s1); \
- source2.x = _mm_loadu_ps(s2); \
- dst1 = _mm_cmp_ps_mask(source1.x, source2.x, imm);\
- dst2 = _mm_mask_cmp_ps_mask(mask, source1.x, source2.x, imm);\
+ source1.x = SUF(_loadu_ps)(s1); \
+ source2.x = SUF(_loadu_ps)(s2); \
+ dst1 = SUF(_cmp_ps_mask)(source1.x, source2.x, imm);\
+ dst2 = SUF(_mask_cmp_ps_mask)(mask, source1.x, source2.x, imm);\
if (dst_ref != dst1) abort(); \
- if ((dst_ref & mask) != dst2) abort();
-#endif
+ if ((dst_ref & mask) != dst2) abort(); \
+ CHECK_CMP(imm)
void
TEST ()
{
UNION_TYPE (AVX512F_LEN,) source1, source2;
- MASK_TYPE dst1, dst2, dst_ref;
+ MASK_TYPE dst1, dst2, dst3, dst4, dst_ref;
MASK_TYPE mask = MASK_VALUE;
int i;
float s1[16] = {2134.3343, 6678.346, 453.345635, 54646.464,
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vpermt2d-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vpermt2d-1.c
index ceb1bd3bf0c..919cd217c98 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vpermt2d-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vpermt2d-1.c
@@ -1,8 +1,8 @@
/* { dg-do compile } */
/* { dg-options "-mavx512f -O2" } */
-/* { dg-final { scan-assembler-times "vpermt2d\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2d\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2d\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2d\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2d\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2d\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vpermt2pd-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vpermt2pd-1.c
index 2a4955b0f34..c021efb3192 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vpermt2pd-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vpermt2pd-1.c
@@ -1,8 +1,8 @@
/* { dg-do compile } */
/* { dg-options "-mavx512f -O2" } */
-/* { dg-final { scan-assembler-times "vpermt2pd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2pd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vpermt2pd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2pd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2pd\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vpermt2ps-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vpermt2ps-1.c
index dadc6d70530..ffe177bf320 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vpermt2ps-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vpermt2ps-1.c
@@ -1,8 +1,8 @@
/* { dg-do compile } */
/* { dg-options "-mavx512f -O2" } */
-/* { dg-final { scan-assembler-times "vpermt2ps\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2ps\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vpermt2ps\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2ps\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2ps\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vpermt2q-1.c b/gcc/testsuite/gcc.target/i386/avx512f-vpermt2q-1.c
index 9c6e989b9dd..74bb4ed037c 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-vpermt2q-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vpermt2q-1.c
@@ -1,8 +1,8 @@
/* { dg-do compile } */
/* { dg-options "-mavx512f -O2" } */
-/* { dg-final { scan-assembler-times "vpermt2q\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2q\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2q\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2q\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2q\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2q\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512vbmi-vpermt2b-1.c b/gcc/testsuite/gcc.target/i386/avx512vbmi-vpermt2b-1.c
index f1c31cc56b0..24a0b9e3fce 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vbmi-vpermt2b-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vbmi-vpermt2b-1.c
@@ -1,14 +1,14 @@
/* { dg-do compile } */
/* { dg-options "-mavx512vbmi -mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vpermt2b\[ \\t\]+\[^\n\]*%zmm\[0-9\]+" 3 } } */
-/* { dg-final { scan-assembler-times "vpermt2b\[ \\t\]+\[^\n\]*%ymm\[0-9\]+" 3 } } *
-/* { dg-final { scan-assembler-times "vpermt2b\[ \\t\]+\[^\n\]*%xmm\[0-9\]+" 3 } } */
-/* { dg-final { scan-assembler-times "vpermt2b\[ \\t\]+\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2b\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2b\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2b\[ \\t\]+\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2b\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2b\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2b\[ \\t\]+\[^\n\]*%zmm\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2b\[ \\t\]+\[^\n\]*%ymm\[0-9\]+" 3 } } *
+/* { dg-final { scan-assembler-times "vperm\[ti]2b\[ \\t\]+\[^\n\]*%xmm\[0-9\]+" 3 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2b\[ \\t\]+\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2b\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2b\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\[^\{\]" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2b\[ \\t\]+\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2b\[ \\t\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2b\[ \\t\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}" 1 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c b/gcc/testsuite/gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c
new file mode 100644
index 00000000000..fa545263041
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512bw -mavx512vl -mgfni" } */
+/* { dg-require-effective-target avx512vl } */
+/* { dg-require-effective-target avx512bw } */
+/* { dg-require-effective-target gfni } */
+
+#define AVX512VL
+#define AVX512F_LEN 256
+#define AVX512F_LEN_HALF 128
+#include "avx512f-gf2p8affineinvqb-2.c"
+
+#undef AVX512F_LEN
+#undef AVX512F_LEN_HALF
+
+#define AVX512F_LEN 128
+#define AVX512F_LEN_HALF 128
+#include "avx512f-gf2p8affineinvqb-2.c"
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-pr82370.c b/gcc/testsuite/gcc.target/i386/avx512vl-pr82370.c
new file mode 100644
index 00000000000..486ece5c2ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-pr82370.c
@@ -0,0 +1,31 @@
+/* PR target/82370 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vl -mno-avx512bw -masm=att" } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %xmm\[0-9]\+, %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, %ymm\[0-9]\+, %ymm\[0-9]\+" 3 } } */
+/* { dg-final { scan-assembler-times "vps\[lr]\[la]\[dq]\[ \t]\+\\\$\[357], %\[xyz]mm\[0-9]\+, %\[xyz]mm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vps\[lr]\[la]w\[ \t]\+\\\$\[357], \\(%\[a-z0-9,]*\\), %\[xyz]mm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+
+#include "avx-pr82370.c"
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2d-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2d-1.c
index 3a6de905cb6..218650c6cc4 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2d-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2d-1.c
@@ -1,11 +1,11 @@
/* { dg-do compile } */
/* { dg-options "-mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vpermt2d\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2d\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2d\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2d\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2d\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2d\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2d\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2d\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2d\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2d\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2d\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2d\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2pd-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2pd-1.c
index 5dd0734bca4..64bd30e40c3 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2pd-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2pd-1.c
@@ -1,11 +1,11 @@
/* { dg-do compile } */
/* { dg-options "-mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vpermt2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vpermt2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vpermt2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2pd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2pd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2ps-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2ps-1.c
index 0d7e37bb548..7af2dea6f9a 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2ps-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2ps-1.c
@@ -1,11 +1,11 @@
/* { dg-do compile } */
/* { dg-options "-mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vpermt2ps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2ps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2ps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2ps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vpermt2ps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
/* { dg-final { scan-assembler-times "vpermt2ps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2ps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2ps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2ps\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2ps\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2q-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2q-1.c
index 475aa6dbd04..0cbd8b5b2a3 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2q-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vpermt2q-1.c
@@ -1,11 +1,11 @@
/* { dg-do compile } */
/* { dg-options "-mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vpermt2q\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2q\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2q\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2q\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2q\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vpermt2q\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2q\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2q\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2q\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2q\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2q\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[ti]2q\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
#include <immintrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx512vlbw-pr82370.c b/gcc/testsuite/gcc.target/i386/avx512vlbw-pr82370.c
new file mode 100644
index 00000000000..6809b4dda60
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vlbw-pr82370.c
@@ -0,0 +1,33 @@
+/* PR target/82370 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vl -mavx512bw -masm=att" } */
+/* { dg-final { scan-assembler-times "vps\[lr]\[la]\[dwq]\[ \t]\+\\\$\[357], %\[xyz]mm\[0-9]\+, %\[xyz]mm\[0-9]\+" 0 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllq\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsllw\[ \t]\+\\\$7, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrad\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraq\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrlw\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+
+#include "avx-pr82370.c"
diff --git a/gcc/testsuite/gcc.target/i386/cet-intrin-10.c b/gcc/testsuite/gcc.target/i386/cet-intrin-10.c
new file mode 100644
index 00000000000..695dc5edc34
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-intrin-10.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcet" } */
+/* { dg-final { scan-assembler-times "clrssbsy" 1 } } */
+
+#include <immintrin.h>
+
+void f2 (void *__B)
+{
+ _clrssbsy (__B);
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-intrin-3.c b/gcc/testsuite/gcc.target/i386/cet-intrin-3.c
new file mode 100644
index 00000000000..bcd7203fdb4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-intrin-3.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 2 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 4 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler "rdsspd|incsspd\[ \t]+(%|)eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "rdssp\[dq]\[ \t]+(%|)\[re]ax" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler "incssp\[dq]\[ \t]+(%|)\[re]di" { target { ! ia32 } } } } */
+
+#include <immintrin.h>
+
+unsigned int f1 ()
+{
+ unsigned int x = 0;
+ return _rdsspd (x);
+}
+
+void f3 (unsigned int _a)
+{
+ _incsspd (_a);
+}
+
+#ifdef __x86_64__
+unsigned long long f2 ()
+{
+ unsigned long long x = 0;
+ return _rdsspq (x);
+}
+
+void f4 (unsigned int _a)
+{
+ _incsspq (_a);
+}
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/cet-intrin-4.c b/gcc/testsuite/gcc.target/i386/cet-intrin-4.c
new file mode 100644
index 00000000000..76ec160543f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-intrin-4.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mshstk" } */
+/* { dg-final { scan-assembler "rdsspd|incsspd\[ \t]+(%|)eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "rdssp\[dq]\[ \t]+(%|)\[re]ax" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler "incssp\[dq]\[ \t]+(%|)\[re]di" { target { ! ia32 } } } } */
+
+#include <immintrin.h>
+
+unsigned int f1 ()
+{
+ unsigned int x = 0;
+ return _rdsspd (x);
+}
+
+void f3 (unsigned int _a)
+{
+ _incsspd (_a);
+}
+
+#ifdef __x86_64__
+unsigned long long f2 ()
+{
+ unsigned long long x = 0;
+ return _rdsspq (x);
+}
+
+void f4 (unsigned int _a)
+{
+ _incsspq (_a);
+}
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/cet-intrin-5.c b/gcc/testsuite/gcc.target/i386/cet-intrin-5.c
new file mode 100644
index 00000000000..8a1b637905c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-intrin-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcet" } */
+/* { dg-final { scan-assembler-times "saveprevssp" 1 } } */
+
+#include <immintrin.h>
+
+void f2 (void)
+{
+ _saveprevssp ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-intrin-6.c b/gcc/testsuite/gcc.target/i386/cet-intrin-6.c
new file mode 100644
index 00000000000..dfa6d20ca26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-intrin-6.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcet" } */
+/* { dg-final { scan-assembler-times "rstorssp" 1 } } */
+
+#include <immintrin.h>
+
+void f2 (void *__B)
+{
+ _rstorssp (__B);
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-intrin-7.c b/gcc/testsuite/gcc.target/i386/cet-intrin-7.c
new file mode 100644
index 00000000000..ecd1825a303
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-intrin-7.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcet" } */
+/* { dg-final { scan-assembler-times "wrssd" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "wrss\[d|q]" 2 { target lp64 } } } */
+
+#include <immintrin.h>
+
+void f1 (unsigned int __A, void *__B)
+{
+ _wrssd (__A, __B);
+}
+
+#ifdef __x86_64__
+void f2 (unsigned long long __A, void *__B)
+{
+ _wrssq (__A, __B);
+}
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/cet-intrin-8.c b/gcc/testsuite/gcc.target/i386/cet-intrin-8.c
new file mode 100644
index 00000000000..2188876cca5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-intrin-8.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcet" } */
+/* { dg-final { scan-assembler-times "wrussd" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "wruss\[d|q]" 2 { target lp64 } } } */
+
+#include <immintrin.h>
+
+void f1 (unsigned int __A, void *__B)
+{
+ _wrussd (__A, __B);
+}
+
+#ifdef __x86_64__
+void f2 (unsigned long long __A, void *__B)
+{
+ _wrussq (__A, __B);
+}
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/cet-intrin-9.c b/gcc/testsuite/gcc.target/i386/cet-intrin-9.c
new file mode 100644
index 00000000000..569931a9492
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-intrin-9.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcet" } */
+/* { dg-final { scan-assembler-times "setssbsy" 1 } } */
+
+#include <immintrin.h>
+
+void f2 (void)
+{
+ _setssbsy ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-label-2.c b/gcc/testsuite/gcc.target/i386/cet-label-2.c
new file mode 100644
index 00000000000..c7f79819079
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-label-2.c
@@ -0,0 +1,24 @@
+/* Verify that CET works. */
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 3 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 3 { target { ! ia32 } } } } */
+
+__attribute__ ((noinline, noclone))
+static int
+func (int arg)
+{
+ static void *array[] = { &&foo, &&bar };
+
+ goto *array[arg];
+foo:
+ return arg*111;
+bar:
+ return arg*777;
+}
+
+int
+foo (int arg)
+{
+ return func (arg);
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-label.c b/gcc/testsuite/gcc.target/i386/cet-label.c
new file mode 100644
index 00000000000..8fb8d420349
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-label.c
@@ -0,0 +1,16 @@
+/* Verify that CET works. */
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 3 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 3 { target { ! ia32 } } } } */
+
+int func (int arg)
+{
+ static void *array[] = { &&foo, &&bar };
+
+ goto *array[arg];
+foo:
+ return arg*111;
+bar:
+ return arg*777;
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-1a.c b/gcc/testsuite/gcc.target/i386/cet-notrack-1a.c
new file mode 100644
index 00000000000..ab0bd3ba9b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-1a.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -fcf-protection=none -mno-cet" } */
+/* { dg-final { scan-assembler-not "endbr" } } */
+/* { dg-final { scan-assembler-not "notrack call\[ \t]+" } } */
+
+int func (int a) __attribute__ ((nocf_check)); /* { dg-warning "'nocf_check' attribute ignored. Use -fcf-protection option to enable it" } */
+int (*fptr) (int a) __attribute__ ((nocf_check)); /* { dg-warning "'nocf_check' attribute ignored. Use -fcf-protection option to enable it" } */
+
+int foo (int arg)
+{
+ int a, b;
+ a = func (arg);
+ b = (*fptr) (arg);
+ return a+b;
+}
+
+int __attribute__ ((nocf_check))
+func (int arg)
+{ /* { dg-warning "'nocf_check' attribute ignored. Use -fcf-protection option to enable it" } */
+ int (*fptrl) (int a) __attribute__ ((nocf_check)); /* { dg-warning "'nocf_check' attribute ignored. Use -fcf-protection option to enable it" } */
+ return arg*(*fptrl)(arg);
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-1b.c b/gcc/testsuite/gcc.target/i386/cet-notrack-1b.c
new file mode 100644
index 00000000000..6faf88fdf04
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-1b.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "notrack call\[ \t]+" 2 } } */
+
+int func (int a) __attribute__ ((nocf_check));
+int (*fptr) (int a) __attribute__ ((nocf_check));
+
+int foo (int arg)
+{
+ int a, b;
+ a = func (arg);
+ b = (*fptr) (arg);
+ return a+b;
+}
+
+int __attribute__ ((nocf_check))
+func (int arg)
+{
+ int (*fptrl) (int a) __attribute__ ((nocf_check));
+ return arg*(*fptrl)(arg);
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-2a.c b/gcc/testsuite/gcc.target/i386/cet-notrack-2a.c
new file mode 100644
index 00000000000..6f441e49edf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-2a.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "notrack call\[ \t]+" 1 } } */
+
+void
+bar (void (*foo) (void))
+{
+ void (*func) (void) __attribute__((nocf_check)) = foo; /* { dg-warning "incompatible pointer type" } */
+ func ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-2b.c b/gcc/testsuite/gcc.target/i386/cet-notrack-2b.c
new file mode 100644
index 00000000000..0df46450e88
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-2b.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "notrack jmp\[ \t]+" 1 } } */
+
+void
+bar (void (*foo) (void))
+{
+ void (*func) (void) __attribute__((nocf_check)) = foo; /* { dg-warning "incompatible pointer type" } */
+ func ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-3.c b/gcc/testsuite/gcc.target/i386/cet-notrack-3.c
new file mode 100644
index 00000000000..5e124c7f95c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-3.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "notrack call\[ \t]+" 1 } } */
+
+typedef void (*func_t) (void) __attribute__((nocf_check));
+extern func_t func;
+
+void
+bar (void)
+{
+ func ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-4a.c b/gcc/testsuite/gcc.target/i386/cet-notrack-4a.c
new file mode 100644
index 00000000000..34cfd9098c2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-4a.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-fcf-protection=none -mno-cet" } */
+
+int var1 __attribute__((nocf_check)); /* { dg-warning "'nocf_check' attribute only applies to function types" } */
+int *var2 __attribute__((nocf_check)); /* { dg-warning "'nocf_check' attribute only applies to function types" } */
+void (**var3) (void) __attribute__((nocf_check)); /* { dg-warning "'nocf_check' attribute only applies to function types" } */
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-4b.c b/gcc/testsuite/gcc.target/i386/cet-notrack-4b.c
new file mode 100644
index 00000000000..6065ef69c25
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-4b.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+
+int var1 __attribute__((nocf_check)); /* { dg-warning "'nocf_check' attribute only applies to function types" } */
+int *var2 __attribute__((nocf_check)); /* { dg-warning "'nocf_check' attribute only applies to function types" } */
+void (**var3) (void) __attribute__((nocf_check)); /* { dg-warning "'nocf_check' attribute only applies to function types" } */
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-5a.c b/gcc/testsuite/gcc.target/i386/cet-notrack-5a.c
new file mode 100644
index 00000000000..d23968e58d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-5a.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-not "\tcall\[ \t]+" } } */
+/* { dg-final { scan-assembler-times "notrack call\[ \t]+" 1 } } */
+
+int (*fptr) (int) __attribute__ ((nocf_check));
+
+int
+foo (int arg)
+{
+ int a;
+ a = (*fptr) (arg); /* notrack call. */
+ return arg+a;
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-5b.c b/gcc/testsuite/gcc.target/i386/cet-notrack-5b.c
new file mode 100644
index 00000000000..42d9d07b19d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-5b.c
@@ -0,0 +1,21 @@
+/* Check that the attribute does not propagate through assignment. */
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcall\[ \t]+" 1 } } */
+/* { dg-final { scan-assembler-times "notrack call\[ \t]+" 1 } } */
+
+int (*fptr) (int) __attribute__ ((nocf_check));
+int (*fptr1) (int);
+
+int
+foo (int arg)
+{
+ int a;
+ a = (*fptr) (arg); /* non-checked call. */
+ arg += a;
+ fptr1 = fptr; /* { dg-warning "incompatible pointer type" } */
+ a = (*fptr1) (arg); /* checked call. */
+ return arg+a;
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-6a.c b/gcc/testsuite/gcc.target/i386/cet-notrack-6a.c
new file mode 100644
index 00000000000..e0fb4f90aaf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-6a.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\t(?:call|jmp)\[ \t]+.*foo" 1 } } */
+/* { dg-final { scan-assembler-not "notrack call\[ \t]+" } } */
+
+int foo (int arg);
+
+int func (int arg)
+{
+ int (*fptrl) (int a) __attribute__ ((nocf_check)) = foo; /* { dg-warning "incompatible pointer type" } */
+
+ return (*fptrl)(arg);
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-6b.c b/gcc/testsuite/gcc.target/i386/cet-notrack-6b.c
new file mode 100644
index 00000000000..1c47c9f7d20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-6b.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-not "\tcall\[ \t]+" } } */
+/* { dg-final { scan-assembler-times "notrack call\[ \t]+" 1 } } */
+
+int foo (int arg);
+
+int func (int arg)
+{
+ int (*fptrl) (int a) __attribute__ ((nocf_check)) = foo; /* { dg-warning "incompatible pointer type" } */
+
+ return (*fptrl)(arg); /* notrack call. */
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-7.c b/gcc/testsuite/gcc.target/i386/cet-notrack-7.c
new file mode 100644
index 00000000000..f2e31d0258a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-7.c
@@ -0,0 +1,15 @@
+/* Check that the notrack prefix is not generated for a direct call. */
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "notrack call\[ \t]+.*foo" 0 } } */
+/* { dg-final { scan-assembler-times "\tcall\[ \t]+.*foo" 1 } } */
+
+extern void foo (void) __attribute__((nocf_check));
+
+void
+bar (void)
+{
+ foo ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-icf-1.c b/gcc/testsuite/gcc.target/i386/cet-notrack-icf-1.c
new file mode 100644
index 00000000000..7987d53d305
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-icf-1.c
@@ -0,0 +1,31 @@
+/* Verify nocf_check functions are not ICF optimized. */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "endbr" } } */
+/* { dg-final { scan-assembler-not "fn3:" } } */
+/* { dg-final { scan-assembler "set\[ \t]+fn2,fn1" } } */
+/* { dg-final { scan-assembler "set\[ \t]+fn3,fn1" } } */
+
+static __attribute__((noinline)) int
+fn1 (int x)
+{
+ return x + 12;
+}
+
+static __attribute__((noinline)) int
+fn2 (int x)
+{
+ return x + 12;
+}
+
+static __attribute__((noinline, nocf_check)) int
+fn3 (int x)
+{ /* { dg-warning "'nocf_check' attribute ignored. Use -fcf-protection option to enable it" } */
+ return x + 12;
+}
+
+int
+fn4 (int x)
+{
+ return fn1 (x) + fn2 (x) + fn3 (x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-icf-2.c b/gcc/testsuite/gcc.target/i386/cet-notrack-icf-2.c
new file mode 100644
index 00000000000..db0b0a44237
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-icf-2.c
@@ -0,0 +1,30 @@
+/* Verify nocf_check functions are not ICF optimized. */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler "endbr" } } */
+/* { dg-final { scan-assembler "fn3:" } } */
+/* { dg-final { scan-assembler "set\[ \t]+fn2,fn1" } } */
+
+static __attribute__((noinline)) int
+fn1 (int x)
+{
+ return x + 12;
+}
+
+static __attribute__((noinline)) int
+fn2 (int x)
+{
+ return x + 12;
+}
+
+static __attribute__((noinline, nocf_check)) int
+fn3 (int x)
+{
+ return x + 12;
+}
+
+int
+fn4 (int x)
+{
+ return fn1 (x) + fn2 (x) + fn3 (x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-icf-3.c b/gcc/testsuite/gcc.target/i386/cet-notrack-icf-3.c
new file mode 100644
index 00000000000..07c4a6b61ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-icf-3.c
@@ -0,0 +1,36 @@
+/* Verify nocf_check function calls are not ICF optimized. */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "endbr" } } */
+/* { dg-final { scan-assembler-not "fn2:" } } */
+/* { dg-final { scan-assembler "set\[ \t]+fn2,fn1" } } */
+/* { dg-final { scan-assembler "set\[ \t]+fn3,fn1" } } */
+
+int (*foo)(int);
+
+typedef int (*type1_t) (int) __attribute__ ((nocf_check)); /* { dg-warning "'nocf_check' attribute ignored. Use -fcf-protection option to enable it" } */
+typedef int (*type2_t) (int);
+
+static __attribute__((noinline)) int
+fn1 (int x)
+{
+ return ((type2_t)foo)(x + 12);
+}
+
+static __attribute__((noinline)) int
+fn2 (int x)
+{
+ return ((type1_t)foo)(x + 12);
+}
+
+static __attribute__((noinline)) int
+fn3 (int x)
+{
+ return ((type2_t)foo)(x + 12);
+}
+
+int
+fn4 (int x)
+{
+ return fn1 (x) + fn2 (x) + fn3 (x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-notrack-icf-4.c b/gcc/testsuite/gcc.target/i386/cet-notrack-icf-4.c
new file mode 100644
index 00000000000..e4e96aaf0dc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-notrack-icf-4.c
@@ -0,0 +1,35 @@
+/* Verify nocf_check function calls are not ICF optimized. */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler "endbr" } } */
+/* { dg-final { scan-assembler "fn2:" } } */
+/* { dg-final { scan-assembler "set\[ \t]+fn3,fn1" } } */
+
+int (*foo)(int);
+
+typedef int (*type1_t) (int) __attribute__ ((nocf_check));
+typedef int (*type2_t) (int);
+
+static __attribute__((noinline)) int
+fn1 (int x)
+{
+ return ((type2_t)foo)(x + 12);
+}
+
+static __attribute__((noinline)) int
+fn2 (int x)
+{
+ return ((type1_t)foo)(x + 12);
+}
+
+static __attribute__((noinline)) int
+fn3 (int x)
+{
+ return ((type2_t)foo)(x + 12);
+}
+
+int
+fn4 (int x)
+{
+ return fn1 (x) + fn2 (x) + fn3 (x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-property-1.c b/gcc/testsuite/gcc.target/i386/cet-property-1.c
new file mode 100644
index 00000000000..df243efc574
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-property-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-fcf-protection -mcet" } */
+/* { dg-final { scan-assembler ".note.gnu.property" } } */
+
+extern void foo (void);
+
+void
+bar (void)
+{
+ foo ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-property-2.c b/gcc/testsuite/gcc.target/i386/cet-property-2.c
new file mode 100644
index 00000000000..5a87dab92f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-property-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-mcet" } */
+/* { dg-final { scan-assembler-not ".note.gnu.property" } } */
+
+extern void foo (void);
+
+void
+bar (void)
+{
+ foo ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-rdssp-1.c b/gcc/testsuite/gcc.target/i386/cet-rdssp-1.c
new file mode 100644
index 00000000000..fb50ff43504
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-rdssp-1.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target cet } } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+
+void _exit(int status) __attribute__ ((__noreturn__));
+
+#ifdef __x86_64__
+# define incssp(x) __builtin_ia32_incsspq (x)
+# define rdssp(x) __builtin_ia32_rdsspq (x)
+#else
+# define incssp(x) __builtin_ia32_incsspd (x)
+# define rdssp(x) __builtin_ia32_rdsspd (x)
+#endif
+
+static void
+__attribute__ ((noinline, noclone))
+test (unsigned long frames)
+{
+ unsigned long ssp = 0;
+ ssp = rdssp (ssp);
+ if (ssp != 0)
+ {
+ unsigned long tmp = frames;
+ while (tmp > 255)
+ {
+ incssp (tmp);
+ tmp -= 255;
+ }
+ incssp (tmp);
+ }
+ /* We must call _exit since the shadow stack is incorrect now. */
+ _exit (0);
+}
+
+int
+main ()
+{
+ test (1);
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-sjlj-1.c b/gcc/testsuite/gcc.target/i386/cet-sjlj-1.c
new file mode 100644
index 00000000000..374d12aa745
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-sjlj-1.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 4 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 4 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "rdssp\[dq]" 2 } } */
+/* { dg-final { scan-assembler-times "incssp\[dq]" 1 } } */
+
+/* Based on gcc.dg/setjmp-3.c. */
+
+void *buf[5];
+
+extern void abort (void);
+
+void raise0(void)
+{
+ __builtin_longjmp (buf, 1);
+}
+
+int execute(int cmd)
+{
+ int last = 0;
+
+ if (__builtin_setjmp (buf) == 0)
+ while (1)
+ {
+ last = 1;
+ raise0 ();
+ }
+
+ if (last == 0)
+ return 0;
+ else
+ return cmd;
+}
+
+int main(void)
+{
+ if (execute (1) == 0)
+ abort ();
+
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-sjlj-2.c b/gcc/testsuite/gcc.target/i386/cet-sjlj-2.c
new file mode 100644
index 00000000000..c97094a19c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-sjlj-2.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target cet } } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+
+#include "cet-sjlj-1.c"
diff --git a/gcc/testsuite/gcc.target/i386/cet-sjlj-3.c b/gcc/testsuite/gcc.target/i386/cet-sjlj-3.c
new file mode 100644
index 00000000000..585f4d7ae89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-sjlj-3.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 4 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 4 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "call _?setjmp" 1 } } */
+/* { dg-final { scan-assembler-times "call longjmp" 1 } } */
+
+#include <stdio.h>
+#include <setjmp.h>
+
+jmp_buf buf;
+int bar (int);
+
+int
+foo (int i)
+{
+ int j = i * 11;
+
+ if (!setjmp (buf))
+ {
+ j += 33;
+ printf ("After setjmp: j = %d\n", j);
+ bar (j);
+ }
+
+ return j + i;
+}
+
+int
+bar (int i)
+{
+ int j = i;
+
+ j -= 111;
+ printf ("In longjmp: j = %d\n", j);
+ longjmp (buf, 1);
+
+ return j;
+}
+
+int
+main ()
+{
+ foo (10);
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-sjlj-4.c b/gcc/testsuite/gcc.target/i386/cet-sjlj-4.c
new file mode 100644
index 00000000000..d41406fde1f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-sjlj-4.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 3 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 3 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "rdssp\[dq]" 2 } } */
+/* { dg-final { scan-assembler-times "incssp\[dq]" 1 } } */
+
+/* Based on gcc.dg/setjmp-3.c. */
+
+void *buf[5];
+
+extern void abort (void);
+
+void
+raise0 (void)
+{
+ __builtin_longjmp (buf, 1);
+}
+
+__attribute__ ((noinline, noclone))
+static int
+execute (int cmd)
+{
+ int last = 0;
+
+ if (__builtin_setjmp (buf) == 0)
+ while (1)
+ {
+ last = 1;
+ raise0 ();
+ }
+
+ if (last == 0)
+ return 0;
+ else
+ return cmd;
+}
+
+int main(void)
+{
+ if (execute (1) == 0)
+ abort ();
+
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-sjlj-5.c b/gcc/testsuite/gcc.target/i386/cet-sjlj-5.c
new file mode 100644
index 00000000000..8e54b4bfec8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-sjlj-5.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 2 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 2 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "call _?setjmp" 1 } } */
+/* { dg-final { scan-assembler-times "call longjmp" 1 } } */
+
+#include <stdio.h>
+#include <setjmp.h>
+
+jmp_buf buf;
+static int bar (int);
+
+__attribute__ ((noinline, noclone))
+static int
+foo (int i)
+{
+ int j = i * 11;
+
+ if (!setjmp (buf))
+ {
+ j += 33;
+ printf ("After setjmp: j = %d\n", j);
+ bar (j);
+ }
+
+ return j + i;
+}
+
+__attribute__ ((noinline, noclone))
+static int
+bar (int i)
+{
+ int j = i;
+
+ j -= 111;
+ printf ("In longjmp: j = %d\n", j);
+ longjmp (buf, 1);
+
+ return j;
+}
+
+int
+main ()
+{
+ foo (10);
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-switch-1.c b/gcc/testsuite/gcc.target/i386/cet-switch-1.c
new file mode 100644
index 00000000000..7a75857fcb1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-switch-1.c
@@ -0,0 +1,26 @@
+/* Verify that CET uses a notrack indirect jump for the switch jump table by default. */
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times "endbr32" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "notrack jmp\[ \t]+\[*]" 1 } } */
+
+void func2 (int);
+
+int func1 (int arg)
+{
+ switch (arg)
+ {
+ case 1: func2 (arg*100);
+ case 2: func2 (arg*300);
+ case 5: func2 (arg*500);
+ case 8: func2 (arg*700);
+ case 7: func2 (arg*900);
+ case -1: func2 (arg*-100);
+ case -2: func2 (arg*-300);
+ case -5: func2 (arg*-500);
+ case -7: func2 (arg*-700);
+ case -9: func2 (arg*-900);
+ }
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-switch-2.c b/gcc/testsuite/gcc.target/i386/cet-switch-2.c
new file mode 100644
index 00000000000..e620b837a3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-switch-2.c
@@ -0,0 +1,26 @@
+/* Verify that -mcet-switch emits ENDBR at the switch jump-table targets instead of a notrack indirect jump. */
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet -mcet-switch" } */
+/* { dg-final { scan-assembler-times "endbr32" 12 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 12 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\[ \t]+jmp\[ \t]+\[*]" 1 } } */
+
+void func2 (int);
+
+int func1 (int arg)
+{
+ switch (arg)
+ {
+ case 1: func2 (arg*100);
+ case 2: func2 (arg*300);
+ case 5: func2 (arg*500);
+ case 8: func2 (arg*700);
+ case 7: func2 (arg*900);
+ case -1: func2 (arg*-100);
+ case -2: func2 (arg*-300);
+ case -5: func2 (arg*-500);
+ case -7: func2 (arg*-700);
+ case -9: func2 (arg*-900);
+ }
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/cet-switch-3.c b/gcc/testsuite/gcc.target/i386/cet-switch-3.c
new file mode 100644
index 00000000000..9b1b4369582
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-switch-3.c
@@ -0,0 +1,34 @@
+/* Verify that -mcet-switch emits ENDBR at the switch jump-table targets instead of a notrack indirect jump. */
+/* { dg-do compile } */
+/* { dg-options "-O -fcf-protection -mcet -mcet-switch" } */
+/* { dg-final { scan-assembler-times "endbr32" 12 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "endbr64" 12 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\[ \t]+jmp\[ \t]+\[*]" 1 } } */
+
+void func2 (int);
+
+__attribute__ ((noinline, noclone))
+static int
+func1 (int arg)
+{
+ switch (arg)
+ {
+ case 1: func2 (arg*100);
+ case 2: func2 (arg*300);
+ case 5: func2 (arg*500);
+ case 8: func2 (arg*700);
+ case 7: func2 (arg*900);
+ case -1: func2 (arg*-100);
+ case -2: func2 (arg*-300);
+ case -5: func2 (arg*-500);
+ case -7: func2 (arg*-700);
+ case -9: func2 (arg*-900);
+ }
+ return 0;
+}
+
+int
+foo (int arg)
+{
+ return func1 (arg);
+}
diff --git a/gcc/testsuite/gcc.target/i386/gfni-1.c b/gcc/testsuite/gcc.target/i386/gfni-1.c
new file mode 100644
index 00000000000..5e22c9eae92
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/gfni-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-mgfni -mavx512bw -mavx512f -O2" } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%zmm\[0-9\]+\[^\n\r]*%zmm\[0-9\]+\[^\n\r]*%zmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%zmm\[0-9\]+\[^\\n\\r]*%zmm\[0-9\]+\[^\\n\\r\]*%zmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%zmm\[0-9\]+\[^\\n\\r]*%zmm\[0-9\]+\[^\\n\\r\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <x86intrin.h>
+
+volatile __m512i x1, x2;
+volatile __mmask64 m64;
+
+void extern
+avx512vl_test (void)
+{
+ x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
+ x1 = _mm512_mask_gf2p8affineinv_epi64_epi8(x1, m64, x2, x1, 3);
+ x1 = _mm512_maskz_gf2p8affineinv_epi64_epi8(m64, x1, x2, 3);
+}
diff --git a/gcc/testsuite/gcc.target/i386/gfni-2.c b/gcc/testsuite/gcc.target/i386/gfni-2.c
new file mode 100644
index 00000000000..4d1f151aa40
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/gfni-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-mgfni -mavx512bw -mavx512vl -O2" } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%ymm\[0-9\]+\[^\n\r]*%ymm\[0-9\]+\[^\n\r]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%ymm\[0-9\]+\[^\\n\\r]*%ymm\[0-9\]+\[^\\n\\r\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%ymm\[0-9\]+\[^\\n\\r]*%ymm\[0-9\]+\[^\\n\\r\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%xmm\[0-9\]+\[^\\n\\r]*%xmm\[0-9\]+\[^\\n\\r\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%xmm\[0-9\]+\[^\\n\\r]*%xmm\[0-9\]+\[^\\n\\r\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <x86intrin.h>
+
+int *p;
+volatile __m256i x3, x4;
+volatile __m128i x5, x6;
+volatile __mmask32 m32;
+volatile __mmask16 m16;
+
+void extern
+avx512vl_test (void)
+{
+ x3 = _mm256_gf2p8affineinv_epi64_epi8(x3, x4, 3);
+ x3 = _mm256_mask_gf2p8affineinv_epi64_epi8(x3, m32, x4, x3, 3);
+ x3 = _mm256_maskz_gf2p8affineinv_epi64_epi8(m32, x3, x4, 3);
+ x5 = _mm_gf2p8affineinv_epi64_epi8(x5, x6, 3);
+ x5 = _mm_mask_gf2p8affineinv_epi64_epi8(x5, m16, x6, x5, 3);
+ x5 = _mm_maskz_gf2p8affineinv_epi64_epi8(m16, x5, x6, 3);
+}
diff --git a/gcc/testsuite/gcc.target/i386/gfni-3.c b/gcc/testsuite/gcc.target/i386/gfni-3.c
new file mode 100644
index 00000000000..de5f80b1124
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/gfni-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-mgfni -mavx -O2" } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%ymm\[0-9\]+\[^\n\r]*%ymm\[0-9\]+\[^\n\r]*%ymm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vgf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <x86intrin.h>
+
+int *p;
+volatile __m256i x3, x4;
+volatile __m128i x5, x6;
+
+void extern
+avx512vl_test (void)
+{
+ x3 = _mm256_gf2p8affineinv_epi64_epi8(x3, x4, 3);
+ x5 = _mm_gf2p8affineinv_epi64_epi8(x5, x6, 3);
+}
diff --git a/gcc/testsuite/gcc.target/i386/gfni-4.c b/gcc/testsuite/gcc.target/i386/gfni-4.c
new file mode 100644
index 00000000000..1532716191e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/gfni-4.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-mgfni -O2" } */
+/* { dg-final { scan-assembler-times "gf2p8affineinvqb\[ \\t\]+\[^\{\n\]*\\\$3\[^\n\r]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <x86intrin.h>
+
+int *p;
+volatile __m128i x5, x6;
+
+void extern
+avx512vl_test (void)
+{
+ x5 = _mm_gf2p8affineinv_epi64_epi8(x5, x6, 3);
+}
diff --git a/gcc/testsuite/gcc.target/i386/i386.exp b/gcc/testsuite/gcc.target/i386/i386.exp
index eae253192ad..b2bdbfdc06b 100644
--- a/gcc/testsuite/gcc.target/i386/i386.exp
+++ b/gcc/testsuite/gcc.target/i386/i386.exp
@@ -421,6 +421,21 @@ proc check_effective_target_avx512vpopcntdq { } {
} "-mavx512vpopcntdq" ]
}
+# Return 1 if gfni instructions can be compiled.
+proc check_effective_target_gfni { } {
+ return [check_no_compiler_messages gfni object {
+ typedef char __v16qi __attribute__ ((__vector_size__ (16)));
+
+ __v16qi
+ _mm_gf2p8affineinv_epi64_epi8 (__v16qi __A, __v16qi __B, const int __C)
+ {
+ return (__v16qi) __builtin_ia32_vgf2p8affineinvqb_v16qi ((__v16qi) __A,
+ (__v16qi) __B,
+ 0);
+ }
+ } "-mgfni" ]
+}
+
# If a testcase doesn't have special options, use these.
global DEFAULT_CFLAGS
if ![info exists DEFAULT_CFLAGS] then {
diff --git a/gcc/testsuite/gcc.target/i386/naked-1.c b/gcc/testsuite/gcc.target/i386/naked-1.c
index cf62bb1114f..07bb10edd8f 100644
--- a/gcc/testsuite/gcc.target/i386/naked-1.c
+++ b/gcc/testsuite/gcc.target/i386/naked-1.c
@@ -10,5 +10,5 @@ foo (void)
__asm__ ("# naked");
}
/* { dg-final { scan-assembler "# naked" } } */
-/* { dg-final { scan-assembler "ud2" } } */
-/* { dg-final { scan-assembler-not "ret" } } */
+/* { dg-final { scan-assembler "(?n)^\\s*ud2$" } } */
+/* { dg-final { scan-assembler-not "(?n)^\\s*ret$" } } */
diff --git a/gcc/testsuite/gcc.target/i386/naked-2.c b/gcc/testsuite/gcc.target/i386/naked-2.c
index adcd7121541..2da8b81a8cb 100644
--- a/gcc/testsuite/gcc.target/i386/naked-2.c
+++ b/gcc/testsuite/gcc.target/i386/naked-2.c
@@ -10,5 +10,5 @@ foo (void)
__asm__ ("# naked");
}
/* { dg-final { scan-assembler "# naked" } } */
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "(?n)^\\s*push" } } */
+/* { dg-final { scan-assembler-not "(?n)^\\s*pop" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr61403.c b/gcc/testsuite/gcc.target/i386/pr61403.c
index 0a89f56753f..38ba4a1b1ec 100644
--- a/gcc/testsuite/gcc.target/i386/pr61403.c
+++ b/gcc/testsuite/gcc.target/i386/pr61403.c
@@ -23,4 +23,4 @@ norm (struct XYZ *in, struct XYZ *out, int size)
}
}
-/* { dg-final { scan-assembler "blend" } } */
+/* { dg-final { scan-assembler "rsqrtps" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr70021.c b/gcc/testsuite/gcc.target/i386/pr70021.c
index de6da345119..6562c0f2bd0 100644
--- a/gcc/testsuite/gcc.target/i386/pr70021.c
+++ b/gcc/testsuite/gcc.target/i386/pr70021.c
@@ -1,7 +1,7 @@
/* PR target/70021 */
/* { dg-do run } */
/* { dg-require-effective-target avx2 } */
-/* { dg-options "-O2 -ftree-vectorize -mavx2 -fdump-tree-vect-details" } */
+/* { dg-options "-O2 -ftree-vectorize -mavx2 -fdump-tree-vect-details -mtune=skylake" } */
#include "avx2-check.h"
diff --git a/gcc/testsuite/gcc.target/i386/pr70263-2.c b/gcc/testsuite/gcc.target/i386/pr70263-2.c
index 18ebbf05fb7..19f79fd0e36 100644
--- a/gcc/testsuite/gcc.target/i386/pr70263-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr70263-2.c
@@ -4,20 +4,13 @@
/* { dg-final { scan-rtl-dump "Adding REG_EQUIV to insn \[0-9\]+ for source of insn \[0-9\]+" "ira" } } */
typedef float XFtype __attribute__ ((mode (XF)));
-typedef _Complex float XCtype __attribute__ ((mode (XC)));
-XCtype
-__mulxc3 (XFtype a, XFtype b, XFtype c, XFtype d)
+
+void bar (XFtype);
+
+void
+foo (XFtype a, XFtype c)
{
- XFtype ac, bd, ad, bc, x, y;
- ac = a * c;
-__asm__ ("": "=m" (ac):"m" (ac));
- if (x != x)
- {
- _Bool recalc = 0;
- if (((!(!(((ac) - (ac)) != ((ac) - (ac)))))))
- recalc = 1;
- if (recalc)
- x = __builtin_huge_vall () * (a * c - b * d);
- }
- return x;
+ XFtype ac = a * c;
+
+ bar (ac);
}
diff --git a/gcc/testsuite/gcc.target/i386/pr79683.c b/gcc/testsuite/gcc.target/i386/pr79683.c
index cbd43fd2af0..9e28d85fc89 100644
--- a/gcc/testsuite/gcc.target/i386/pr79683.c
+++ b/gcc/testsuite/gcc.target/i386/pr79683.c
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O3 -msse2" } */
+/* { dg-options "-O3 -msse2 -fvect-cost-model=unlimited" } */
struct s {
__INT64_TYPE__ a;
diff --git a/gcc/testsuite/gcc.target/i386/pr81706.c b/gcc/testsuite/gcc.target/i386/pr81706.c
new file mode 100644
index 00000000000..333fd159770
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr81706.c
@@ -0,0 +1,32 @@
+/* PR libstdc++/81706 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx2 -mno-avx512f" } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } } */
+/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } } */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+extern double cos (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+extern double sin (double) __attribute__ ((nothrow, leaf, simd ("notinbranch")));
+#ifdef __cplusplus
+}
+#endif
+double p[1024] = { 1.0 };
+double q[1024] = { 1.0 };
+
+void
+foo (void)
+{
+ int i;
+ for (i = 0; i < 1024; i++)
+ p[i] = cos (q[i]);
+}
+
+void
+bar (void)
+{
+ int i;
+ for (i = 0; i < 1024; i++)
+ p[i] = __builtin_sin (q[i]);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82002-1.c b/gcc/testsuite/gcc.target/i386/pr82002-1.c
new file mode 100644
index 00000000000..86678a01992
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82002-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-Ofast -mstackrealign -mabi=ms" } */
+
+void a (char *);
+void
+b ()
+{
+ char c[10000000000];
+ c[1099511627776] = 'b';
+ a (c);
+ a (c);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82002-2a.c b/gcc/testsuite/gcc.target/i386/pr82002-2a.c
new file mode 100644
index 00000000000..bc85080ba8e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82002-2a.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-Ofast -mstackrealign -mabi=ms" } */
+/* { dg-xfail-if "" { *-*-* } } */
+/* { dg-xfail-run-if "" { *-*-* } } */
+
+void __attribute__((sysv_abi)) a (char *);
+void
+b ()
+{
+ char c[10000000000];
+ c[1099511627776] = 'b';
+ a (c);
+ a (c);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82002-2b.c b/gcc/testsuite/gcc.target/i386/pr82002-2b.c
new file mode 100644
index 00000000000..10e44cd7b1d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82002-2b.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-Ofast -mstackrealign -mabi=ms -mcall-ms2sysv-xlogues" } */
+/* { dg-xfail-if "" { *-*-* } } */
+/* { dg-xfail-run-if "" { *-*-* } } */
+
+void __attribute__((sysv_abi)) a (char *);
+void
+b ()
+{
+ char c[10000000000];
+ c[1099511627776] = 'b';
+ a (c);
+ a (c);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82196-1.c b/gcc/testsuite/gcc.target/i386/pr82196-1.c
index 541d975480d..ff108132bb5 100644
--- a/gcc/testsuite/gcc.target/i386/pr82196-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-1.c
@@ -1,5 +1,5 @@
/* { dg-do compile { target lp64 } } */
-/* { dg-options "-msse -mcall-ms2sysv-xlogues -O2" } */
+/* { dg-options "-mno-avx -msse -mcall-ms2sysv-xlogues -O2" } */
/* { dg-final { scan-assembler "call.*__sse_savms64f?_12" } } */
/* { dg-final { scan-assembler "jmp.*__sse_resms64f?x_12" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82370.c b/gcc/testsuite/gcc.target/i386/pr82370.c
new file mode 100644
index 00000000000..cc4d9b6f255
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82370.c
@@ -0,0 +1,18 @@
+/* PR target/82370 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vl -mavx512bw -masm=att" } */
+/* { dg-final { scan-assembler-times "vpslldq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrldq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %xmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpslldq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrldq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %ymm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpslldq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+/* { dg-final { scan-assembler-times "vpsrldq\[ \t]\+\\\$5, \\(%\[a-z0-9,]*\\), %zmm\[0-9]\+" 1 } } */
+
+#include <x86intrin.h>
+
+__m512i f1 (__m512i *x) { return _mm512_bslli_epi128 (*x, 5); }
+__m512i f2 (__m512i *x) { return _mm512_bsrli_epi128 (*x, 5); }
+__m256i f3 (__m256i *x) { return _mm256_bslli_epi128 (*x, 5); }
+__m256i f4 (__m256i *x) { return _mm256_bsrli_epi128 (*x, 5); }
+__m128i f5 (__m128i *x) { return _mm_bslli_si128 (*x, 5); }
+__m128i f6 (__m128i *x) { return _mm_bsrli_si128 (*x, 5); }
diff --git a/gcc/testsuite/gcc.target/i386/pr82460-1.c b/gcc/testsuite/gcc.target/i386/pr82460-1.c
new file mode 100644
index 00000000000..6529c4a9b9e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82460-1.c
@@ -0,0 +1,30 @@
+/* PR target/82460 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vbmi" } */
+/* { dg-final { scan-assembler-not {\mvmovd} } } */
+
+#include <x86intrin.h>
+
+__m512i
+f1 (__m512i x, __m512i y, char *z)
+{
+ return _mm512_permutex2var_epi32 (y, x, _mm512_loadu_si512 (z));
+}
+
+__m512i
+f2 (__m512i x, __m512i y, char *z)
+{
+ return _mm512_permutex2var_epi32 (x, y, _mm512_loadu_si512 (z));
+}
+
+__m512i
+f3 (__m512i x, __m512i y, __m512i z)
+{
+ return _mm512_permutex2var_epi8 (y, x, z);
+}
+
+__m512i
+f4 (__m512i x, __m512i y, __m512i z)
+{
+ return _mm512_permutex2var_epi8 (x, y, z);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82460-2.c b/gcc/testsuite/gcc.target/i386/pr82460-2.c
new file mode 100644
index 00000000000..4d965216b59
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82460-2.c
@@ -0,0 +1,17 @@
+/* PR target/82460 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -mavx512vbmi -mno-prefer-avx256" } */
+/* We want to reuse the permutation mask in the loop, so use vpermt2b rather
+ than vpermi2b. */
+/* { dg-final { scan-assembler-not {\mvpermi2b\M} } } */
+/* { dg-final { scan-assembler {\mvpermt2b\M} } } */
+
+void
+foo (unsigned char *__restrict__ x, const unsigned short *__restrict__ y,
+ unsigned long z)
+{
+ unsigned char *w = x + z;
+ do
+ *x++ = *y++ >> 8;
+ while (x < w);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82499-1.c b/gcc/testsuite/gcc.target/i386/pr82499-1.c
new file mode 100644
index 00000000000..3aba62a466f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82499-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* The pic register save adds unavoidable stack pointer references. */
+/* { dg-skip-if "" { ia32 && { ! nonpic } } } */
+/* These options are selected to ensure 1 word needs to be allocated
+ on the stack to maintain alignment for the call. This should be
+ transformed to push+pop. We also want to force unwind info updates. */
+/* { dg-options "-Os -fomit-frame-pointer -fasynchronous-unwind-tables" } */
+/* { dg-additional-options "-mpreferred-stack-boundary=3" { target ia32 } } */
+/* { dg-additional-options "-mpreferred-stack-boundary=4" { target { ! ia32 } } } */
+/* ms_abi has a reserved stack region. */
+/* { dg-skip-if "" { x86_64-*-mingw* } } */
+
+extern void g (void);
+int
+f (void)
+{
+ g ();
+ return 42;
+}
+
+/* { dg-final { scan-assembler-not "(sub|add)(l|q)\[\\t \]*\\$\[0-9\]*,\[\\t \]*%\[re\]?sp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82499-2.c b/gcc/testsuite/gcc.target/i386/pr82499-2.c
new file mode 100644
index 00000000000..dde4d657e1a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82499-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* The pic register save adds unavoidable stack pointer references. */
+/* { dg-skip-if "" { ia32 && { ! nonpic } } } */
+/* These options are selected to ensure 1 word needs to be allocated
+ on the stack to maintain alignment for the call. This should be
+ transformed to push+pop. We also want to force unwind info updates. */
+/* { dg-options "-Os -fomit-frame-pointer -fasynchronous-unwind-tables" } */
+/* { dg-additional-options "-mpreferred-stack-boundary=3" { target ia32 } } */
+/* { dg-additional-options "-mpreferred-stack-boundary=4 -mno-red-zone" { target { ! ia32 } } } */
+/* ms_abi has a reserved stack region. */
+/* { dg-skip-if "" { x86_64-*-mingw* } } */
+
+extern void g (void);
+int
+f (void)
+{
+ g ();
+ return 42;
+}
+
+/* { dg-final { scan-assembler-not "(sub|add)(l|q)\[\\t \]*\\$\[0-9\]*,\[\\t \]*%\[re\]?sp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82499-3.c b/gcc/testsuite/gcc.target/i386/pr82499-3.c
new file mode 100644
index 00000000000..b55a860fcca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82499-3.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* The pic register save adds unavoidable stack pointer references. */
+/* { dg-skip-if "" { ia32 && { ! nonpic } } } */
+/* These options are selected to ensure 1 word needs to be allocated
+ on the stack to maintain alignment for the call. This should be
+ transformed to push+pop. We also want to force unwind info updates. */
+/* { dg-options "-O2 -mtune-ctrl=single_push,single_pop -fomit-frame-pointer -fasynchronous-unwind-tables" } */
+/* { dg-additional-options "-mpreferred-stack-boundary=3" { target ia32 } } */
+/* { dg-additional-options "-mpreferred-stack-boundary=4" { target { ! ia32 } } } */
+/* ms_abi has a reserved stack region. */
+/* { dg-skip-if "" { x86_64-*-mingw* } } */
+
+extern void g (void);
+int
+f (void)
+{
+ g ();
+ return 42;
+}
+
+/* { dg-final { scan-assembler-not "(sub|add)(l|q)\[\\t \]*\\$\[0-9\]*,\[\\t \]*%\[re\]?sp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82556.c b/gcc/testsuite/gcc.target/i386/pr82556.c
new file mode 100644
index 00000000000..409a301af30
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82556.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-strict-aliasing -fwrapv -fexcess-precision=standard" } */
+extern int foo();
+typedef struct {
+ char id;
+ unsigned char fork_flags;
+ short data_length;
+} Header;
+int a;
+void X() {
+ do {
+ char* b;
+ Header c;
+ if (a)
+ c.fork_flags |= 1;
+ __builtin_memcpy(b, &c, __builtin_offsetof(Header, data_length));
+ b += foo();
+ } while (1);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82580.c b/gcc/testsuite/gcc.target/i386/pr82580.c
new file mode 100644
index 00000000000..965dfeec28d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82580.c
@@ -0,0 +1,39 @@
+/* PR target/82580 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#ifdef __SIZEOF_INT128__
+typedef unsigned __int128 U;
+typedef signed __int128 S;
+#else
+typedef unsigned long long U;
+typedef signed long long S;
+#endif
+void bar (void);
+int f0 (U x, U y) { return x == y; }
+int f1 (U x, U y) { return x != y; }
+int f2 (U x, U y) { return x > y; }
+int f3 (U x, U y) { return x >= y; }
+int f4 (U x, U y) { return x < y; }
+int f5 (U x, U y) { return x <= y; }
+int f6 (S x, S y) { return x == y; }
+int f7 (S x, S y) { return x != y; }
+int f8 (S x, S y) { return x > y; }
+int f9 (S x, S y) { return x >= y; }
+int f10 (S x, S y) { return x < y; }
+int f11 (S x, S y) { return x <= y; }
+void f12 (U x, U y) { if (x == y) bar (); }
+void f13 (U x, U y) { if (x != y) bar (); }
+void f14 (U x, U y) { if (x > y) bar (); }
+void f15 (U x, U y) { if (x >= y) bar (); }
+void f16 (U x, U y) { if (x < y) bar (); }
+void f17 (U x, U y) { if (x <= y) bar (); }
+void f18 (S x, S y) { if (x == y) bar (); }
+void f19 (S x, S y) { if (x != y) bar (); }
+void f20 (S x, S y) { if (x > y) bar (); }
+void f21 (S x, S y) { if (x >= y) bar (); }
+void f22 (S x, S y) { if (x < y) bar (); }
+void f23 (S x, S y) { if (x <= y) bar (); }
+
+/* { dg-final { scan-assembler-times {\msbb} 16 } } */
+/* { dg-final { scan-assembler-not {\mmovzb} { target lp64 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82618.c b/gcc/testsuite/gcc.target/i386/pr82618.c
new file mode 100644
index 00000000000..f6e3589c808
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82618.c
@@ -0,0 +1,18 @@
+/* PR target/82618 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#ifdef __SIZEOF_INT128__
+typedef unsigned __int128 U;
+typedef unsigned long long H;
+#else
+typedef unsigned long long U;
+typedef unsigned int H;
+#endif
+
+H f0 (U x, U y)
+{
+ return (x - y) >> (__CHAR_BIT__ * sizeof (H));
+}
+
+/* { dg-final { scan-assembler {\mcmp} } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82628.c b/gcc/testsuite/gcc.target/i386/pr82628.c
new file mode 100644
index 00000000000..d7135220485
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82628.c
@@ -0,0 +1,34 @@
+/* { dg-do run { target ia32 } } */
+/* { dg-options "-Os" } */
+
+void
+__attribute__ ((noipa))
+foo (const char *x)
+{
+ asm volatile ("" : "+g" (x) : : "memory");
+ if (x)
+ __builtin_abort ();
+}
+
+int a, b = 1;
+
+int
+main ()
+{
+ while (1)
+ {
+ unsigned long long d = 18446744073709551615UL;
+ while (1)
+ {
+ int e = b;
+ while (d < 2)
+ foo ("0");
+ if (a)
+ d++;
+ if (b)
+ break;
+ }
+ break;
+ }
+ return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82659-1.c b/gcc/testsuite/gcc.target/i386/pr82659-1.c
new file mode 100644
index 00000000000..485771d0f38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82659-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times {\mendbr} 1 } } */
+
+extern int x;
+
+static void
+__attribute__ ((noinline, noclone))
+test (int i)
+{
+ x = i;
+}
+
+void
+bar (int i)
+{
+ test (i);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82659-2.c b/gcc/testsuite/gcc.target/i386/pr82659-2.c
new file mode 100644
index 00000000000..7afffa440aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82659-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times {\mendbr} 2 } } */
+
+extern int x;
+
+void
+test (int i)
+{
+ x = i;
+}
+
+void
+bar (int i)
+{
+ test (i);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82659-3.c b/gcc/testsuite/gcc.target/i386/pr82659-3.c
new file mode 100644
index 00000000000..5f97b314092
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82659-3.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times {\mendbr} 2 } } */
+
+extern int x;
+
+static void
+__attribute__ ((noinline, noclone))
+test (int i)
+{
+ x = i;
+}
+
+extern __typeof (test) foo __attribute__ ((alias ("test")));
+
+void
+bar (int i)
+{
+ test (i);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82659-4.c b/gcc/testsuite/gcc.target/i386/pr82659-4.c
new file mode 100644
index 00000000000..c3cacaccbef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82659-4.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times {\mendbr} 2 } } */
+
+static void
+test (void)
+{
+}
+
+void *
+bar (void)
+{
+ return test;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82659-5.c b/gcc/testsuite/gcc.target/i386/pr82659-5.c
new file mode 100644
index 00000000000..95413671d5c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82659-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times {\mendbr} 1 } } */
+
+static void
+test (void)
+{
+}
+
+void (*test_p) (void) = test;
diff --git a/gcc/testsuite/gcc.target/i386/pr82659-6.c b/gcc/testsuite/gcc.target/i386/pr82659-6.c
new file mode 100644
index 00000000000..51fc1a9f5c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82659-6.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection -mcet" } */
+/* { dg-final { scan-assembler-times {\mendbr} 2 } } */
+
+extern int x;
+
+__attribute__ ((visibility ("hidden")))
+void
+test (int i)
+{
+ x = i;
+}
+
+void
+bar (int i)
+{
+ test (i);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr82662.c b/gcc/testsuite/gcc.target/i386/pr82662.c
new file mode 100644
index 00000000000..8a9332b5c5b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82662.c
@@ -0,0 +1,26 @@
+/* PR target/82580 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#ifdef __SIZEOF_INT128__
+typedef unsigned __int128 U;
+typedef signed __int128 S;
+#else
+typedef unsigned long long U;
+typedef signed long long S;
+#endif
+void bar (void);
+int f0 (U x, U y) { return x == y; }
+int f1 (U x, U y) { return x != y; }
+int f2 (U x, U y) { return x > y; }
+int f3 (U x, U y) { return x >= y; }
+int f4 (U x, U y) { return x < y; }
+int f5 (U x, U y) { return x <= y; }
+int f6 (S x, S y) { return x == y; }
+int f7 (S x, S y) { return x != y; }
+int f8 (S x, S y) { return x > y; }
+int f9 (S x, S y) { return x >= y; }
+int f10 (S x, S y) { return x < y; }
+int f11 (S x, S y) { return x <= y; }
+
+/* { dg-final { scan-assembler-times {\mset} 12 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82673.c b/gcc/testsuite/gcc.target/i386/pr82673.c
new file mode 100644
index 00000000000..50eb5a3bcfc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82673.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -fno-omit-frame-pointer -fvar-tracking-assignments" } */
+
+register long *B asm ("ebp");
+
+long y = 20;
+
+void
+bar (void) /* { dg-error "frame pointer required, but reserved" } */
+{
+ B = &y;
+} /* { dg-error "bp cannot be used in asm here" } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82795.c b/gcc/testsuite/gcc.target/i386/pr82795.c
new file mode 100644
index 00000000000..9e7fec74699
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82795.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx2" } */
+
+void
+sj (int qh, int rn, int *by)
+{
+ for (;;)
+ if (qh != 0)
+ {
+ int dc;
+
+ for (dc = 0; dc < 17; ++dc)
+ {
+ int nn;
+
+ nn = (rn != 0) ? qh : dc;
+ if (nn != 0)
+ qh = nn;
+ else
+ qh = (qh != 0) ? *by : dc;
+ }
+ }
+}
diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c b/gcc/testsuite/gcc.target/i386/sse-12.c
index b98b8b60aa7..82f5d3c653b 100644
--- a/gcc/testsuite/gcc.target/i386/sse-12.c
+++ b/gcc/testsuite/gcc.target/i386/sse-12.c
@@ -1,9 +1,9 @@
/* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
- popcntintrin.h and mm_malloc.h are usable
+ popcntintrin.h, gfniintrin.h and mm_malloc.h are usable
with -O -std=c89 -pedantic-errors. */
/* { dg-do compile } */
-/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid" } */
+/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni" } */
#include <x86intrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index c5c43b12611..c35ec9a47cb 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni" } */
/* { dg-add-options bind_pic_locally } */
#include <mm_malloc.h>
@@ -429,8 +429,8 @@
/* avx512dqintrin.h */
#define __builtin_ia32_kshiftliqi(A, B) __builtin_ia32_kshiftliqi(A, 8)
#define __builtin_ia32_kshiftriqi(A, B) __builtin_ia32_kshiftriqi(A, 8)
-#define __builtin_ia32_reducess(A, B, F) __builtin_ia32_reducess(A, B, 1)
-#define __builtin_ia32_reducesd(A, B, F) __builtin_ia32_reducesd(A, B, 1)
+#define __builtin_ia32_reducess_mask(A, B, F, W, U) __builtin_ia32_reducess_mask(A, B, 1, W, U)
+#define __builtin_ia32_reducesd_mask(A, B, F, W, U) __builtin_ia32_reducesd_mask(A, B, 1, W, U)
#define __builtin_ia32_reduceps512_mask(A, E, C, D) __builtin_ia32_reduceps512_mask(A, 1, C, D)
#define __builtin_ia32_reducepd512_mask(A, E, C, D) __builtin_ia32_reducepd512_mask(A, 1, C, D)
#define __builtin_ia32_rangess128_round(A, B, I, F) __builtin_ia32_rangess128_round(A, B, 1, 8)
@@ -620,4 +620,12 @@
#define __builtin_ia32_extracti64x2_256_mask(A, E, C, D) __builtin_ia32_extracti64x2_256_mask(A, 1, C, D)
#define __builtin_ia32_extractf64x2_256_mask(A, E, C, D) __builtin_ia32_extractf64x2_256_mask(A, 1, C, D)
+/* gfniintrin.h */
+#define __builtin_ia32_vgf2p8affineinvqb_v16qi(A, B, C) __builtin_ia32_vgf2p8affineinvqb_v16qi(A, B, 1)
+#define __builtin_ia32_vgf2p8affineinvqb_v32qi(A, B, C) __builtin_ia32_vgf2p8affineinvqb_v32qi(A, B, 1)
+#define __builtin_ia32_vgf2p8affineinvqb_v64qi(A, B, C) __builtin_ia32_vgf2p8affineinvqb_v64qi(A, B, 1)
+#define __builtin_ia32_vgf2p8affineinvqb_v16qi_mask(A, B, C, D, E) __builtin_ia32_vgf2p8affineinvqb_v16qi_mask(A, B, 1, D, E)
+#define __builtin_ia32_vgf2p8affineinvqb_v32qi_mask(A, B, C, D, E) __builtin_ia32_vgf2p8affineinvqb_v32qi_mask(A, B, 1, D, E)
+#define __builtin_ia32_vgf2p8affineinvqb_v64qi_mask(A, B, C, D, E) __builtin_ia32_vgf2p8affineinvqb_v64qi_mask(A, B, 1, D, E)
+
#include <x86intrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/sse-14.c b/gcc/testsuite/gcc.target/i386/sse-14.c
index c2a19b3ccef..388026f927a 100644
--- a/gcc/testsuite/gcc.target/i386/sse-14.c
+++ b/gcc/testsuite/gcc.target/i386/sse-14.c
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid" } */
+/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni" } */
/* { dg-add-options bind_pic_locally } */
#include <mm_malloc.h>
@@ -7,8 +7,8 @@
/* Test that the intrinsics compile without optimization. All of them are
defined as inline functions in {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h,
fma4intrin.h, xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h,
- lwpintrin.h, fmaintrin.h and mm_malloc.h that reference the proper
- builtin functions.
+ lwpintrin.h, fmaintrin.h, gfniintrin.h and mm_malloc.h that reference
+ the proper builtin functions.
Defining away "extern" and "__inline" results in all of them being compiled
as proper functions. */
@@ -684,3 +684,8 @@ test_1 ( __bextri_u32, unsigned int, unsigned int, 1)
#ifdef __x86_64__
test_1 ( __bextri_u64, unsigned long long, unsigned long long, 1)
#endif
+
+/* gfniintrin.h */
+test_2 (_mm_gf2p8affineinv_epi64_epi8, __m128i, __m128i, __m128i, 1)
+test_2 (_mm256_gf2p8affineinv_epi64_epi8, __m256i, __m256i, __m256i, 1)
+test_2 (_mm512_gf2p8affineinv_epi64_epi8, __m512i, __m512i, __m512i, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index cd8945be1cb..3e64e2915ec 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -101,7 +101,7 @@
#ifndef DIFFERENT_PRAGMAS
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni")
#endif
/* Following intrinsics require immediate arguments. They
@@ -218,7 +218,7 @@ test_4 (_mm_cmpestrz, int, __m128i, int, __m128i, int, 1)
/* immintrin.h (AVX/AVX2/RDRND/FSGSBASE/F16C/RTM/AVX512F/SHA) */
#ifdef DIFFERENT_PRAGMAS
-#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx5124fmaps,avx5124vnniw,avx512vpopcntdq")
+#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni")
#endif
#include <immintrin.h>
test_1 (_cvtss_sh, unsigned short, float, 1)
@@ -695,6 +695,11 @@ test_2 (_mm_rsqrt28_round_ss, __m128, __m128, __m128, 8)
/* shaintrin.h */
test_2 (_mm_sha1rnds4_epu32, __m128i, __m128i, __m128i, 1)
+/* gfniintrin.h */
+test_2 (_mm_gf2p8affineinv_epi64_epi8, __m128i, __m128i, __m128i, 1)
+test_2 (_mm256_gf2p8affineinv_epi64_epi8, __m256i, __m256i, __m256i, 1)
+test_2 (_mm512_gf2p8affineinv_epi64_epi8, __m512i, __m512i, __m512i, 1)
+
/* wmmintrin.h (AES/PCLMUL). */
#ifdef DIFFERENT_PRAGMAS
#pragma GCC target ("aes,pclmul")
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index fc339a51e63..911258fa042 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -428,8 +428,8 @@
/* avx512dqintrin.h */
#define __builtin_ia32_kshiftliqi(A, B) __builtin_ia32_kshiftliqi(A, 8)
#define __builtin_ia32_kshiftriqi(A, B) __builtin_ia32_kshiftriqi(A, 8)
-#define __builtin_ia32_reducess(A, B, F) __builtin_ia32_reducess(A, B, 1)
-#define __builtin_ia32_reducesd(A, B, F) __builtin_ia32_reducesd(A, B, 1)
+#define __builtin_ia32_reducess_mask(A, B, F, W, U) __builtin_ia32_reducess_mask(A, B, 1, W, U)
+#define __builtin_ia32_reducesd_mask(A, B, F, W, U) __builtin_ia32_reducesd_mask(A, B, 1, W, U)
#define __builtin_ia32_reduceps512_mask(A, E, C, D) __builtin_ia32_reduceps512_mask(A, 1, C, D)
#define __builtin_ia32_reducepd512_mask(A, E, C, D) __builtin_ia32_reducepd512_mask(A, 1, C, D)
#define __builtin_ia32_rangess128_round(A, B, I, F) __builtin_ia32_rangess128_round(A, B, 1, 8)
@@ -619,6 +619,14 @@
#define __builtin_ia32_extracti64x2_256_mask(A, E, C, D) __builtin_ia32_extracti64x2_256_mask(A, 1, C, D)
#define __builtin_ia32_extractf64x2_256_mask(A, E, C, D) __builtin_ia32_extractf64x2_256_mask(A, 1, C, D)
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku,sgx,rdpid")
+/* gfniintrin.h */
+#define __builtin_ia32_vgf2p8affineinvqb_v16qi(A, B, C) __builtin_ia32_vgf2p8affineinvqb_v16qi(A, B, 1)
+#define __builtin_ia32_vgf2p8affineinvqb_v32qi(A, B, C) __builtin_ia32_vgf2p8affineinvqb_v32qi(A, B, 1)
+#define __builtin_ia32_vgf2p8affineinvqb_v64qi(A, B, C) __builtin_ia32_vgf2p8affineinvqb_v64qi(A, B, 1)
+#define __builtin_ia32_vgf2p8affineinvqb_v16qi_mask(A, B, C, D, E) __builtin_ia32_vgf2p8affineinvqb_v16qi_mask(A, B, 1, D, E)
+#define __builtin_ia32_vgf2p8affineinvqb_v32qi_mask(A, B, C, D, E) __builtin_ia32_vgf2p8affineinvqb_v32qi_mask(A, B, 1, D, E)
+#define __builtin_ia32_vgf2p8affineinvqb_v64qi_mask(A, B, C, D, E) __builtin_ia32_vgf2p8affineinvqb_v64qi_mask(A, B, 1, D, E)
+
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku,sgx,rdpid,gfni")
#include <x86intrin.h>
diff --git a/gcc/testsuite/gcc.target/i386/stack-check-12.c b/gcc/testsuite/gcc.target/i386/stack-check-12.c
new file mode 100644
index 00000000000..cb69bb08086
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/stack-check-12.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection -mtune=generic" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+__attribute__ ((noreturn)) void exit (int);
+
+__attribute__ ((noreturn)) void
+f (void)
+{
+ asm volatile ("nop" ::: "edi");
+ exit (1);
+}
+
+/* { dg-final { scan-assembler-not "or\[ql\]" } } */
+/* { dg-final { scan-assembler "pushl %esi" { target ia32 } } } */
+/* { dg-final { scan-assembler "popl %esi" { target ia32 } } }*/
+/* { dg-final { scan-assembler "pushq %rax" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler "popq %rax" { target { ! ia32 } } } }*/
+
diff --git a/gcc/testsuite/gcc.target/i386/vect-pack-trunc-2.c b/gcc/testsuite/gcc.target/i386/vect-pack-trunc-2.c
index f3d899c1134..3503deaa9d9 100644
--- a/gcc/testsuite/gcc.target/i386/vect-pack-trunc-2.c
+++ b/gcc/testsuite/gcc.target/i386/vect-pack-trunc-2.c
@@ -25,4 +25,4 @@ avx512bw_test ()
abort ();
}
-/* { dg-final { scan-assembler-times "vpermi2w\[ \\t\]+\[^\n\]*%zmm" 1 } } */
+/* { dg-final { scan-assembler-times "vperm\[it]2w\[ \\t\]+\[^\n\]*%zmm" 1 } } */
diff --git a/gcc/testsuite/gcc.target/mips/msa.c b/gcc/testsuite/gcc.target/mips/msa.c
index 6b35e21bfd3..cdd5ca28dac 100644
--- a/gcc/testsuite/gcc.target/mips/msa.c
+++ b/gcc/testsuite/gcc.target/mips/msa.c
@@ -1,6 +1,6 @@
/* Test MIPS MSA ASE instructions */
/* { dg-do compile } */
-/* { dg-options "-mfp64 -mhard-float -mmsa -fexpensive-optimizations" } */
+/* { dg-options "-mfp64 -mhard-float -mmsa -fexpensive-optimizations -fcommon" } */
/* { dg-skip-if "madd and msub need combine" { *-*-* } { "-O0" } { "" } } */
/* { dg-final { scan-assembler-times "\t.comm\tv16i8_\\d+,16,16" 3 } } */
diff --git a/gcc/testsuite/gcc.target/nios2/cdx-branch.c b/gcc/testsuite/gcc.target/nios2/cdx-branch.c
index 3b984f2712a..3a9c459cec3 100644
--- a/gcc/testsuite/gcc.target/nios2/cdx-branch.c
+++ b/gcc/testsuite/gcc.target/nios2/cdx-branch.c
@@ -23,7 +23,7 @@ extern int i (int);
extern int j (int);
extern int k (int);
-int h (int a)
+int h (int a, int b)
{
int x;
@@ -31,7 +31,7 @@ int h (int a)
an unconditional branch from one branch of the "if" to
the return statement. We compile this testcase with -Os to
avoid insertion of a duplicate epilogue in place of the branch. */
- if (a == 1)
+ if (a == b)
x = i (37);
else
x = j (42);
diff --git a/gcc/testsuite/gcc.target/nios2/gpopt-gprel-sec.c b/gcc/testsuite/gcc.target/nios2/gpopt-gprel-sec.c
new file mode 100644
index 00000000000..1083fe6e6ab
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/gpopt-gprel-sec.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mgpopt=local -mgprel-sec=\\.frog.+" } */
+
+extern int a __attribute__ ((section (".frog1")));
+static volatile int b __attribute__ ((section (".frog2"))) = 1;
+extern int c __attribute__ ((section (".data")));
+static volatile int d __attribute__ ((section (".data"))) = 2;
+
+extern int e;
+static volatile int f = 3;
+
+volatile int g __attribute__ ((weak)) = 4;
+
+extern int h[100];
+static int i[100];
+static int j[100] __attribute__ ((section (".sdata")));
+
+typedef int (*ftype) (int);
+extern int foo (int);
+
+extern int bar (int, int*, int*, int*, ftype);
+
+int baz (void)
+{
+ return bar (a + b + c + d + e + f + g, h, i, j, foo);
+}
+
+/* { dg-final { scan-assembler "%gprel\\(a\\)" } } */
+/* { dg-final { scan-assembler "%gprel\\(b\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(c\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(d\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(e\\)" } } */
+/* { dg-final { scan-assembler "%gprel\\(f\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(g\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(h\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(i\\)" } } */
+/* { dg-final { scan-assembler "%gprel\\(j\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(foo\\)" } } */
diff --git a/gcc/testsuite/gcc.target/nios2/gpopt-r0rel-sec.c b/gcc/testsuite/gcc.target/nios2/gpopt-r0rel-sec.c
new file mode 100644
index 00000000000..5fda9e9a381
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/gpopt-r0rel-sec.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mgpopt=local -mr0rel-sec=\\.frog.+" } */
+
+extern int a __attribute__ ((section (".frog1")));
+static volatile int b __attribute__ ((section (".frog2"))) = 1;
+extern int c __attribute__ ((section (".data")));
+static volatile int d __attribute__ ((section (".data"))) = 2;
+
+extern int e;
+static volatile int f = 3;
+
+volatile int g __attribute__ ((weak)) = 4;
+
+extern int h[100];
+static int i[100];
+static int j[100] __attribute__ ((section (".sdata")));
+
+typedef int (*ftype) (int);
+extern int foo (int);
+
+extern int bar (int, int*, int*, int*, ftype);
+
+int baz (void)
+{
+ return bar (a + b + c + d + e + f + g, h, i, j, foo);
+}
+
+/* { dg-final { scan-assembler "%lo\\(a\\)\\(r0\\)" } } */
+/* { dg-final { scan-assembler "%lo\\(b\\)\\(r0\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(c\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(d\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(e\\)" } } */
+/* { dg-final { scan-assembler "%gprel\\(f\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(g\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(h\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(i\\)" } } */
+/* { dg-final { scan-assembler "%gprel\\(j\\)" } } */
+/* { dg-final { scan-assembler-not "%gprel\\(foo\\)" } } */
diff --git a/gcc/testsuite/gcc.target/nios2/lo-addr-bypass.c b/gcc/testsuite/gcc.target/nios2/lo-addr-bypass.c
new file mode 100644
index 00000000000..24e6cfd4cc0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/lo-addr-bypass.c
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=r2 -mbypass-cache" } */
+/* { dg-final { scan-assembler-times "addi\tr., r., %lo" 12 } } */
+/* { dg-final { scan-assembler-not "ldw\t" } } */
+/* { dg-final { scan-assembler-not "stw\t" } } */
+/* { dg-final { scan-assembler-not "ldwio\tr., %lo" } } */
+/* { dg-final { scan-assembler-not "stwio\tr., %lo" } } */
+
+/* Check that we do not generate %lo addresses with R2 ldstio instructions.
+ %lo requires a 16-bit relocation and on R2 these instructions only have a
+ 12-bit register offset. */
+#define TYPE int
+
+struct ss
+{
+ TYPE x1,x2;
+};
+
+extern TYPE S1;
+extern TYPE S2[];
+
+extern struct ss S3;
+extern struct ss S4[];
+
+TYPE *addr1 (void) { return &S1; }
+TYPE get1 (void) { return S1; }
+void set1 (TYPE value) { S1 = value; }
+
+TYPE *addr2 (int i) { return &(S2[i]); }
+TYPE get2 (int i) { return S2[i]; }
+void set2 (int i, TYPE value) { S2[i] = value; }
+
+TYPE *addr3 (void) { return &(S3.x2); }
+TYPE get3 (void) { return S3.x2; }
+void set3 (TYPE value) { S3.x2 = value; }
+
+TYPE *addr4 (int i) { return &(S4[i].x2); }
+TYPE get4 (int i) { return S4[i].x2; }
+void set4 (int i, TYPE value) { S4[i].x2 = value; }
+
diff --git a/gcc/testsuite/gcc.target/nios2/lo-addr-char.c b/gcc/testsuite/gcc.target/nios2/lo-addr-char.c
new file mode 100644
index 00000000000..dd992458323
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/lo-addr-char.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "addi\tr., r., %lo" 4 } } */
+/* { dg-final { scan-assembler-times "ldbu\tr., %lo" 4 } } */
+/* { dg-final { scan-assembler-times "ldb\tr., %lo" 16 } } */
+/* { dg-final { scan-assembler-times "stb\tr., %lo" 4 } } */
+
+/* Check that various address forms involving a symbolic constant
+ with a possible constant offset and/or index register are optimized
+ to generate a %lo relocation in the load/store instructions instead
+ of a plain register indirect addressing mode. */
+/* Note: get* uses ldbu but ext* uses ldb since TYPE is signed. */
+
+#define TYPE signed char
+
+struct ss
+{
+ TYPE x1,x2;
+};
+
+extern TYPE S1;
+extern TYPE S2[];
+
+extern struct ss S3;
+extern struct ss S4[];
+
+TYPE *addr1 (void) { return &S1; }
+TYPE get1 (void) { return S1; }
+void set1 (TYPE value) { S1 = value; }
+
+TYPE *addr2 (int i) { return &(S2[i]); }
+TYPE get2 (int i) { return S2[i]; }
+void set2 (int i, TYPE value) { S2[i] = value; }
+
+TYPE *addr3 (void) { return &(S3.x2); }
+TYPE get3 (void) { return S3.x2; }
+void set3 (TYPE value) { S3.x2 = value; }
+
+TYPE *addr4 (int i) { return &(S4[i].x2); }
+TYPE get4 (int i) { return S4[i].x2; }
+void set4 (int i, TYPE value) { S4[i].x2 = value; }
+
+int extw1 (void) { return (int)(S1); }
+int extw2 (int i) { return (int)(S2[i]); }
+int extw3 (void) { return (int)(S3.x2); }
+int extw4 (int i) { return (int)(S4[i].x2); }
+unsigned int extwu1 (void) { return (unsigned int)(S1); }
+unsigned int extwu2 (int i) { return (unsigned int)(S2[i]); }
+unsigned int extwu3 (void) { return (unsigned int)(S3.x2); }
+unsigned int extwu4 (int i) { return (unsigned int)(S4[i].x2); }
+
+short exth1 (void) { return (short)(S1); }
+short exth2 (int i) { return (short)(S2[i]); }
+short exth3 (void) { return (short)(S3.x2); }
+short exth4 (int i) { return (short)(S4[i].x2); }
+unsigned short exthu1 (void) { return (unsigned short)(S1); }
+unsigned short exthu2 (int i) { return (unsigned short)(S2[i]); }
+unsigned short exthu3 (void) { return (unsigned short)(S3.x2); }
+unsigned short exthu4 (int i) { return (unsigned short)(S4[i].x2); }
+
diff --git a/gcc/testsuite/gcc.target/nios2/lo-addr-int.c b/gcc/testsuite/gcc.target/nios2/lo-addr-int.c
new file mode 100644
index 00000000000..9a6f779d383
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/lo-addr-int.c
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "addi\tr., r., %lo" 4 } } */
+/* { dg-final { scan-assembler-times "ldw\tr., %lo" 4 } } */
+/* { dg-final { scan-assembler-times "stw\tr., %lo" 4 } } */
+
+/* Check that various address forms involving a symbolic constant
+ with a possible constant offset and/or index register are optimized
+ to generate a %lo relocation in the load/store instructions instead
+ of a plain register indirect addressing mode. */
+
+#define TYPE int
+
+struct ss
+{
+ TYPE x1,x2;
+};
+
+extern TYPE S1;
+extern TYPE S2[];
+
+extern struct ss S3;
+extern struct ss S4[];
+
+TYPE *addr1 (void) { return &S1; }
+TYPE get1 (void) { return S1; }
+void set1 (TYPE value) { S1 = value; }
+
+TYPE *addr2 (int i) { return &(S2[i]); }
+TYPE get2 (int i) { return S2[i]; }
+void set2 (int i, TYPE value) { S2[i] = value; }
+
+TYPE *addr3 (void) { return &(S3.x2); }
+TYPE get3 (void) { return S3.x2; }
+void set3 (TYPE value) { S3.x2 = value; }
+
+TYPE *addr4 (int i) { return &(S4[i].x2); }
+TYPE get4 (int i) { return S4[i].x2; }
+void set4 (int i, TYPE value) { S4[i].x2 = value; }
+
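For readers skimming the lo-addr-* family, a hedged sketch of the transformation the scan patterns look for, using get1() as the example; the assembly is hypothetical output (register numbers and scheduling are not guaranteed) and is not part of the commit:

/* Illustrative only: possible Nios II code for
     TYPE get1 (void) { return S1; }
   before and after folding %lo into the memory access.

   without the optimization:         with the optimization:
     movhi  r2, %hiadj(S1)             movhi  r2, %hiadj(S1)
     addi   r2, r2, %lo(S1)            ldw    r2, %lo(S1)(r2)
     ldw    r2, 0(r2)                  ret
     ret
*/
extern int S1_example;                          /* hypothetical symbol */
int get1_example (void) { return S1_example; }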
diff --git a/gcc/testsuite/gcc.target/nios2/lo-addr-pic.c b/gcc/testsuite/gcc.target/nios2/lo-addr-pic.c
new file mode 100644
index 00000000000..bcd623785bd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/lo-addr-pic.c
@@ -0,0 +1,38 @@
+/* { dg-do compile { target nios2-*-linux-gnu } } */
+/* { dg-options "-O2 -fpic" } */
+/* { dg-final { scan-assembler-not "ldw\tr., %lo" } } */
+/* { dg-final { scan-assembler-not "stw\tr., %lo" } } */
+
+/* Check that address transformations for symbolic constants do NOT
+ apply to code compiled with -fPIC, which requires references to
+ go through the GOT pointer (r22) instead. */
+
+#define TYPE int
+
+struct ss
+{
+ TYPE x1,x2;
+};
+
+extern TYPE S1;
+extern TYPE S2[];
+
+extern struct ss S3;
+extern struct ss S4[];
+
+TYPE *addr1 (void) { return &S1; }
+TYPE get1 (void) { return S1; }
+void set1 (TYPE value) { S1 = value; }
+
+TYPE *addr2 (int i) { return &(S2[i]); }
+TYPE get2 (int i) { return S2[i]; }
+void set2 (int i, TYPE value) { S2[i] = value; }
+
+TYPE *addr3 (void) { return &(S3.x2); }
+TYPE get3 (void) { return S3.x2; }
+void set3 (TYPE value) { S3.x2 = value; }
+
+TYPE *addr4 (int i) { return &(S4[i].x2); }
+TYPE get4 (int i) { return S4[i].x2; }
+void set4 (int i, TYPE value) { S4[i].x2 = value; }
+
diff --git a/gcc/testsuite/gcc.target/nios2/lo-addr-short.c b/gcc/testsuite/gcc.target/nios2/lo-addr-short.c
new file mode 100644
index 00000000000..792ec227291
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/lo-addr-short.c
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "addi\tr., r., %lo" 4 } } */
+/* { dg-final { scan-assembler-times "ldhu\tr., %lo" 4 } } */
+/* { dg-final { scan-assembler-times "ldh\tr., %lo" 8 } } */
+/* { dg-final { scan-assembler-times "sth\tr., %lo" 4 } } */
+
+/* Check that various address forms involving a symbolic constant
+ with a possible constant offset and/or index register are optimized
+ to generate a %lo relocation in the load/store instructions instead
+ of a plain register indirect addressing mode. */
+/* Note: get* uses ldhu but ext* uses ldh since TYPE is signed. */
+
+#define TYPE short
+
+struct ss
+{
+ TYPE x1,x2;
+};
+
+extern TYPE S1;
+extern TYPE S2[];
+
+extern struct ss S3;
+extern struct ss S4[];
+
+TYPE *addr1 (void) { return &S1; }
+TYPE get1 (void) { return S1; }
+void set1 (TYPE value) { S1 = value; }
+
+TYPE *addr2 (int i) { return &(S2[i]); }
+TYPE get2 (int i) { return S2[i]; }
+void set2 (int i, TYPE value) { S2[i] = value; }
+
+TYPE *addr3 (void) { return &(S3.x2); }
+TYPE get3 (void) { return S3.x2; }
+void set3 (TYPE value) { S3.x2 = value; }
+
+TYPE *addr4 (int i) { return &(S4[i].x2); }
+TYPE get4 (int i) { return S4[i].x2; }
+void set4 (int i, TYPE value) { S4[i].x2 = value; }
+
+int extw1 (void) { return (int)(S1); }
+int extw2 (int i) { return (int)(S2[i]); }
+int extw3 (void) { return (int)(S3.x2); }
+int extw4 (int i) { return (int)(S4[i].x2); }
+unsigned int extwu1 (void) { return (unsigned int)(S1); }
+unsigned int extwu2 (int i) { return (unsigned int)(S2[i]); }
+unsigned int extwu3 (void) { return (unsigned int)(S3.x2); }
+unsigned int extwu4 (int i) { return (unsigned int)(S4[i].x2); }
+
diff --git a/gcc/testsuite/gcc.target/nios2/lo-addr-tls.c b/gcc/testsuite/gcc.target/nios2/lo-addr-tls.c
new file mode 100644
index 00000000000..d56fbc2ed81
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/lo-addr-tls.c
@@ -0,0 +1,38 @@
+/* { dg-require-effective-target tls } */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "ldw\tr., %lo" } } */
+/* { dg-final { scan-assembler-not "stw\tr., %lo" } } */
+
+/* Check that address transformations for symbolic constants do NOT
+ apply to TLS variables. */
+
+#define TYPE int
+
+struct ss
+{
+ TYPE x1,x2;
+};
+
+extern __thread TYPE S1;
+extern __thread TYPE S2[];
+
+extern __thread struct ss S3;
+extern __thread struct ss S4[];
+
+TYPE *addr1 (void) { return &S1; }
+TYPE get1 (void) { return S1; }
+void set1 (TYPE value) { S1 = value; }
+
+TYPE *addr2 (int i) { return &(S2[i]); }
+TYPE get2 (int i) { return S2[i]; }
+void set2 (int i, TYPE value) { S2[i] = value; }
+
+TYPE *addr3 (void) { return &(S3.x2); }
+TYPE get3 (void) { return S3.x2; }
+void set3 (TYPE value) { S3.x2 = value; }
+
+TYPE *addr4 (int i) { return &(S4[i].x2); }
+TYPE get4 (int i) { return S4[i].x2; }
+void set4 (int i, TYPE value) { S4[i].x2 = value; }
+
diff --git a/gcc/testsuite/gcc.target/nios2/lo-addr-uchar.c b/gcc/testsuite/gcc.target/nios2/lo-addr-uchar.c
new file mode 100644
index 00000000000..e9733afde4a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/lo-addr-uchar.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "addi\tr., r., %lo" 4 } } */
+/* { dg-final { scan-assembler-times "ldbu\tr., %lo" 20 } } */
+/* { dg-final { scan-assembler-times "stb\tr., %lo" 4 } } */
+
+/* Check that various address forms involving a symbolic constant
+ with a possible constant offset and/or index register are optimized
+ to generate a %lo relocation in the load/store instructions instead
+ of a plain register indirect addressing mode. */
+
+#define TYPE unsigned char
+
+struct ss
+{
+ TYPE x1,x2;
+};
+
+extern TYPE S1;
+extern TYPE S2[];
+
+extern struct ss S3;
+extern struct ss S4[];
+
+TYPE *addr1 (void) { return &S1; }
+TYPE get1 (void) { return S1; }
+void set1 (TYPE value) { S1 = value; }
+
+TYPE *addr2 (int i) { return &(S2[i]); }
+TYPE get2 (int i) { return S2[i]; }
+void set2 (int i, TYPE value) { S2[i] = value; }
+
+TYPE *addr3 (void) { return &(S3.x2); }
+TYPE get3 (void) { return S3.x2; }
+void set3 (TYPE value) { S3.x2 = value; }
+
+TYPE *addr4 (int i) { return &(S4[i].x2); }
+TYPE get4 (int i) { return S4[i].x2; }
+void set4 (int i, TYPE value) { S4[i].x2 = value; }
+
+int extw1 (void) { return (int)(S1); }
+int extw2 (int i) { return (int)(S2[i]); }
+int extw3 (void) { return (int)(S3.x2); }
+int extw4 (int i) { return (int)(S4[i].x2); }
+unsigned int extwu1 (void) { return (unsigned int)(S1); }
+unsigned int extwu2 (int i) { return (unsigned int)(S2[i]); }
+unsigned int extwu3 (void) { return (unsigned int)(S3.x2); }
+unsigned int extwu4 (int i) { return (unsigned int)(S4[i].x2); }
+
+short exth1 (void) { return (short)(S1); }
+short exth2 (int i) { return (short)(S2[i]); }
+short exth3 (void) { return (short)(S3.x2); }
+short exth4 (int i) { return (short)(S4[i].x2); }
+unsigned short exthu1 (void) { return (unsigned short)(S1); }
+unsigned short exthu2 (int i) { return (unsigned short)(S2[i]); }
+unsigned short exthu3 (void) { return (unsigned short)(S3.x2); }
+unsigned short exthu4 (int i) { return (unsigned short)(S4[i].x2); }
+
diff --git a/gcc/testsuite/gcc.target/nios2/lo-addr-ushort.c b/gcc/testsuite/gcc.target/nios2/lo-addr-ushort.c
new file mode 100644
index 00000000000..4a19c13bf2c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/lo-addr-ushort.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "addi\tr., r., %lo" 4 } } */
+/* { dg-final { scan-assembler-times "ldhu\tr., %lo" 12 } } */
+/* { dg-final { scan-assembler-times "sth\tr., %lo" 4 } } */
+
+/* Check that various address forms involving a symbolic constant
+ with a possible constant offset and/or index register are optimized
+ to generate a %lo relocation in the load/store instructions instead
+ of a plain register indirect addressing mode. */
+
+#define TYPE unsigned short
+
+struct ss
+{
+ TYPE x1,x2;
+};
+
+extern TYPE S1;
+extern TYPE S2[];
+
+extern struct ss S3;
+extern struct ss S4[];
+
+TYPE *addr1 (void) { return &S1; }
+TYPE get1 (void) { return S1; }
+void set1 (TYPE value) { S1 = value; }
+
+TYPE *addr2 (int i) { return &(S2[i]); }
+TYPE get2 (int i) { return S2[i]; }
+void set2 (int i, TYPE value) { S2[i] = value; }
+
+TYPE *addr3 (void) { return &(S3.x2); }
+TYPE get3 (void) { return S3.x2; }
+void set3 (TYPE value) { S3.x2 = value; }
+
+TYPE *addr4 (int i) { return &(S4[i].x2); }
+TYPE get4 (int i) { return S4[i].x2; }
+void set4 (int i, TYPE value) { S4[i].x2 = value; }
+
+int extw1 (void) { return (int)(S1); }
+int extw2 (int i) { return (int)(S2[i]); }
+int extw3 (void) { return (int)(S3.x2); }
+int extw4 (int i) { return (int)(S4[i].x2); }
+unsigned int extwu1 (void) { return (unsigned int)(S1); }
+unsigned int extwu2 (int i) { return (unsigned int)(S2[i]); }
+unsigned int extwu3 (void) { return (unsigned int)(S3.x2); }
+unsigned int extwu4 (int i) { return (unsigned int)(S4[i].x2); }
+
diff --git a/gcc/testsuite/gcc.target/nios2/lo-addr-volatile.c b/gcc/testsuite/gcc.target/nios2/lo-addr-volatile.c
new file mode 100644
index 00000000000..40a8be429bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nios2/lo-addr-volatile.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=r2 -mno-cache-volatile" } */
+/* { dg-final { scan-assembler-times "addi\tr., r., %lo" 12 } } */
+/* { dg-final { scan-assembler-not "ldw\t" } } */
+/* { dg-final { scan-assembler-not "stw\t" } } */
+/* { dg-final { scan-assembler-not "ldwio\tr., %lo" } } */
+/* { dg-final { scan-assembler-not "stwio\tr., %lo" } } */
+
+/* Check that we do not generate %lo addresses with R2 ldstio instructions.
+ %lo requires a 16-bit relocation and on R2 these instructions only have a
+ 12-bit register offset. */
+
+#define TYPE int
+
+struct ss
+{
+ TYPE x1,x2;
+};
+
+extern volatile TYPE S1;
+extern volatile TYPE S2[];
+
+extern volatile struct ss S3;
+extern volatile struct ss S4[];
+
+volatile TYPE *addr1 (void) { return &S1; }
+TYPE get1 (void) { return S1; }
+void set1 (TYPE value) { S1 = value; }
+
+volatile TYPE *addr2 (int i) { return &(S2[i]); }
+TYPE get2 (int i) { return S2[i]; }
+void set2 (int i, TYPE value) { S2[i] = value; }
+
+volatile TYPE *addr3 (void) { return &(S3.x2); }
+TYPE get3 (void) { return S3.x2; }
+void set3 (TYPE value) { S3.x2 = value; }
+
+volatile TYPE *addr4 (int i) { return &(S4[i].x2); }
+TYPE get4 (int i) { return S4[i].x2; }
+void set4 (int i, TYPE value) { S4[i].x2 = value; }
+
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-fma2.c b/gcc/testsuite/gcc.target/powerpc/float128-fma2.c
deleted file mode 100644
index e5f15aa2de9..00000000000
--- a/gcc/testsuite/gcc.target/powerpc/float128-fma2.c
+++ /dev/null
@@ -1,9 +0,0 @@
-/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
-/* { dg-options "-mpower9-vector -mno-float128-hardware -O2" } */
-
-__float128
-xfma (__float128 a, __float128 b, __float128 c)
-{
- return __builtin_fmaf128 (a, b, c); /* { dg-error "ISA 3.0 IEEE 128-bit" } */
-}
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-hw.c b/gcc/testsuite/gcc.target/powerpc/float128-hw.c
index 68e4c27aa58..929c6ddabe0 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-hw.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-hw.c
@@ -2,16 +2,58 @@
/* { dg-require-effective-target powerpc_p9vector_ok } */
/* { dg-options "-mpower9-vector -O2" } */
-__float128 f128_add (__float128 a, __float128 b) { return a+b; }
-__float128 f128_sub (__float128 a, __float128 b) { return a-b; }
-__float128 f128_mul (__float128 a, __float128 b) { return a*b; }
-__float128 f128_div (__float128 a, __float128 b) { return a/b; }
-__float128 f128_fma (__float128 a, __float128 b, __float128 c) { return (a*b)+c; }
-long f128_cmove (__float128 a, __float128 b, long c, long d) { return (a == b) ? c : d; }
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test the code generation of the various _Float128 operations. */
+TYPE f128_add (TYPE a, TYPE b) { return a+b; }
+TYPE f128_sub (TYPE a, TYPE b) { return a-b; }
+TYPE f128_mul (TYPE a, TYPE b) { return a*b; }
+TYPE f128_div (TYPE a, TYPE b) { return a/b; }
+TYPE f128_fma (TYPE a, TYPE b, TYPE c) { return (a*b)+c; }
+TYPE f128_fms (TYPE a, TYPE b, TYPE c) { return (a*b)-c; }
+TYPE f128_nfma (TYPE a, TYPE b, TYPE c) { return -((a*b)+c); }
+TYPE f128_nfms (TYPE a, TYPE b, TYPE c) { return -((a*b)-c); }
+TYPE f128_neg (TYPE a) { return -a; }
+
+long f128_cmove (TYPE a, TYPE b, long c, long d) { return (a == b) ? c : d; }
+
+double f128_to_double (TYPE a) { return (double)a; }
+float f128_to_float (TYPE a) { return (float)a; }
+long f128_to_long (TYPE a) { return (long)a; }
+unsigned long f128_to_ulong (TYPE a) { return (unsigned long)a; }
+int f128_to_int (TYPE a) { return (int)a; }
+unsigned int f128_to_uint (TYPE a) { return (unsigned int)a; }
+
+TYPE double_to_f128 (double a) { return (TYPE)a; }
+TYPE float_to_f128 (float a) { return (TYPE)a; }
+TYPE long_to_f128 (long a) { return (TYPE)a; }
+TYPE ulong_to_f128 (unsigned long a) { return (TYPE)a; }
+TYPE int_to_f128 (int a) { return (TYPE)a; }
+TYPE uint_to_f128 (unsigned int a) { return (TYPE)a; }
+
+/* { dg-final { scan-assembler {\mmfvsrd\M} } } */
+/* { dg-final { scan-assembler {\mmfvsrwz\M} } } */
+/* { dg-final { scan-assembler {\mmtvsrd\M} } } */
+/* { dg-final { scan-assembler {\mmtvsrwa\M} } } */
+/* { dg-final { scan-assembler {\mxscmpuqp\M} } } */
+/* { dg-final { scan-assembler {\mxscvdpqp\M} } } */
+/* { dg-final { scan-assembler {\mxscvqpdp\M} } } */
+/* { dg-final { scan-assembler {\mxscvqpdpo\M} } } */
+/* { dg-final { scan-assembler {\mxscvqpsdz\M} } } */
+/* { dg-final { scan-assembler {\mxscvqpswz\M} } } */
+/* { dg-final { scan-assembler {\mxscvqpudz\M} } } */
+/* { dg-final { scan-assembler {\mxscvqpuwz\M} } } */
+/* { dg-final { scan-assembler {\mxscvsdqp\M} } } */
+/* { dg-final { scan-assembler {\mxscvudqp\M} } } */
+/* { dg-final { scan-assembler {\mxsdivqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmaddqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmsubqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmulqp\M} } } */
+/* { dg-final { scan-assembler {\mxsnegqp\M} } } */
+/* { dg-final { scan-assembler {\mxsnmaddqp\M} } } */
+/* { dg-final { scan-assembler {\mxsnmsubqp\M} } } */
+/* { dg-final { scan-assembler {\mxssubqp\M} } } */
+/* { dg-final { scan-assembler-not {\mbl\M} } } */
-/* { dg-final { scan-assembler "xsaddqp" } } */
-/* { dg-final { scan-assembler "xssubqp" } } */
-/* { dg-final { scan-assembler "xsmulqp" } } */
-/* { dg-final { scan-assembler "xsdivqp" } } */
-/* { dg-final { scan-assembler "xsmaddqp" } } */
-/* { dg-final { scan-assembler "xscmpuqp" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-hw2.c b/gcc/testsuite/gcc.target/powerpc/float128-hw2.c
new file mode 100644
index 00000000000..f144360da3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-hw2.c
@@ -0,0 +1,60 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2 -ffast-math -std=gnu11" } */
+
+/* Test to make sure the compiler handles the standard _Float128 functions that
+ have hardware support in ISA 3.0/power9. */
+
+#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
+
+#ifndef __FP_FAST_FMAF128
+#error "__FP_FAST_FMAF128 should be defined."
+#endif
+
+extern _Float128 copysignf128 (_Float128, _Float128);
+extern _Float128 sqrtf128 (_Float128);
+extern _Float128 fmaf128 (_Float128, _Float128, _Float128);
+
+_Float128
+do_copysign (_Float128 a, _Float128 b)
+{
+ return copysignf128 (a, b);
+}
+
+_Float128
+do_sqrt (_Float128 a)
+{
+ return sqrtf128 (a);
+}
+
+_Float128
+do_fma (_Float128 a, _Float128 b, _Float128 c)
+{
+ return fmaf128 (a, b, c);
+}
+
+_Float128
+do_fms (_Float128 a, _Float128 b, _Float128 c)
+{
+ return fmaf128 (a, b, -c);
+}
+
+_Float128
+do_nfma (_Float128 a, _Float128 b, _Float128 c)
+{
+ return -fmaf128 (a, b, c);
+}
+
+_Float128
+do_nfms (_Float128 a, _Float128 b, _Float128 c)
+{
+ return -fmaf128 (a, b, -c);
+}
+
+/* { dg-final { scan-assembler {\mxscpsgnqp\M} } } */
+/* { dg-final { scan-assembler {\mxssqrtqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmaddqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmsubqp\M} } } */
+/* { dg-final { scan-assembler {\mxsnmaddqp\M} } } */
+/* { dg-final { scan-assembler {\mxsnmsubqp\M} } } */
+/* { dg-final { scan-assembler-not {\mbl\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-hw3.c b/gcc/testsuite/gcc.target/powerpc/float128-hw3.c
new file mode 100644
index 00000000000..e63099dde08
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-hw3.c
@@ -0,0 +1,56 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2 -ffast-math -std=c11" } */
+
+/* Test to make sure the compiler calls the external function instead of doing
+ the built-in processing for _Float128 functions that have hardware support
+   in ISA 3.0/power9 if we are in strict standards mode, where the <func>f128 name
+ is not a synonym for __builtin_<func>f128. */
+
+extern _Float128 copysignf128 (_Float128, _Float128);
+extern _Float128 sqrtf128 (_Float128);
+extern _Float128 fmaf128 (_Float128, _Float128, _Float128);
+
+_Float128
+do_copysign (_Float128 a, _Float128 b)
+{
+ return copysignf128 (a, b);
+}
+
+_Float128
+do_sqrt (_Float128 a)
+{
+ return sqrtf128 (a);
+}
+
+_Float128
+do_fma (_Float128 a, _Float128 b, _Float128 c)
+{
+ return fmaf128 (a, b, c);
+}
+
+_Float128
+do_fms (_Float128 a, _Float128 b, _Float128 c)
+{
+ return fmaf128 (a, b, -c);
+}
+
+_Float128
+do_nfma (_Float128 a, _Float128 b, _Float128 c)
+{
+ return -fmaf128 (a, b, c);
+}
+
+_Float128
+do_nfms (_Float128 a, _Float128 b, _Float128 c)
+{
+ return -fmaf128 (a, b, -c);
+}
+
+/* { dg-final { scan-assembler-not {\mxscpsgnqp\M} } } */
+/* { dg-final { scan-assembler-not {\mxssqrtqp\M} } } */
+/* { dg-final { scan-assembler-not {\mxsmaddqp\M} } } */
+/* { dg-final { scan-assembler-not {\mxsmsubqp\M} } } */
+/* { dg-final { scan-assembler-not {\mxsnmaddqp\M} } } */
+/* { dg-final { scan-assembler-not {\mxsnmsubqp\M} } } */
+/* { dg-final { scan-assembler-times {\mbl\M} 6 } } */
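Editorial note, not taken from the commit: float128-hw2.c and float128-hw3.c are deliberate mirror images; under -std=gnu11 the f128 names may be treated as built-ins and expanded to the quad-precision instructions, while under strict -std=c11 each call is expected to remain a real call (the six bl instructions scanned for above). A hedged restatement of the pattern both files compile:

/* Illustrative only: which of the two behaviours applies depends on the
   -std= level used, not on this code.  */
extern _Float128 fmaf128 (_Float128, _Float128, _Float128);

_Float128
fused_example (_Float128 x, _Float128 y, _Float128 z)
{
  /* gnu11 + -ffast-math: may collapse to a single xsmaddqp.
     strict c11: expected to stay a call to fmaf128.  */
  return fmaf128 (x, y, z);
}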
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-sqrt2.c b/gcc/testsuite/gcc.target/powerpc/float128-sqrt2.c
deleted file mode 100644
index 94527ebbd98..00000000000
--- a/gcc/testsuite/gcc.target/powerpc/float128-sqrt2.c
+++ /dev/null
@@ -1,9 +0,0 @@
-/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
-/* { dg-options "-mpower9-vector -mno-float128-hardware -O2" } */
-
-__float128
-xsqrt (__float128 a)
-{
- return __builtin_sqrtf128 (a); /* { dg-error "ISA 3.0 IEEE 128-bit" } */
-}
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-char.c
new file mode 100644
index 00000000000..19ea3d3184a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-char.c
@@ -0,0 +1,19 @@
+/* Verify that overloaded built-ins for vec_neg with char
+ inputs produce the right code. */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2" } */
+
+#include <altivec.h>
+
+vector signed char
+test2 (vector signed char x)
+{
+ return vec_neg (x);
+}
+
+/* { dg-final { scan-assembler-times "xxspltib|vspltisw|vxor" 1 } } */
+/* { dg-final { scan-assembler-times "vsububm" 1 } } */
+/* { dg-final { scan-assembler-times "vmaxsb" 0 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-floatdouble.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-floatdouble.c
new file mode 100644
index 00000000000..79ad92465a2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-floatdouble.c
@@ -0,0 +1,23 @@
+/* Verify that overloaded built-ins for vec_neg with float and
+ double inputs for VSX produce the right code. */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+
+vector float
+test1 (vector float x)
+{
+ return vec_neg (x);
+}
+
+vector double
+test2 (vector double x)
+{
+ return vec_neg (x);
+}
+
+/* { dg-final { scan-assembler-times "xvnegsp" 1 } } */
+/* { dg-final { scan-assembler-times "xvnegdp" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-int.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-int.c
new file mode 100644
index 00000000000..d6ca1283bc9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-int.c
@@ -0,0 +1,18 @@
+/* Verify that overloaded built-ins for vec_neg with int
+ inputs produce the right code. */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2" } */
+
+#include <altivec.h>
+
+vector signed int
+test1 (vector signed int x)
+{
+ return vec_neg (x);
+}
+
+/* { dg-final { scan-assembler-times "xxspltib|vspltisw|vxor" 1 } } */
+/* { dg-final { scan-assembler-times "vsubuwm" 1 } } */
+/* { dg-final { scan-assembler-times "vmaxsw" 0 } } */
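A brief aside on the fold-vec-neg-* scan patterns (editorial, not part of the commit): at the -maltivec/-mpower8-vector levels used here there is no direct vector integer negate instruction, so vec_neg is expected to be open-coded as a subtraction from a zero vector, which is what the splat/xor plus vsub* pairs match. A hedged equivalent in plain intrinsics:

/* Illustrative equivalent of vec_neg for vector signed int (what the
   matched vxor/vspltisw/xxspltib + vsubuwm sequence computes); not part
   of this commit.  */
#include <altivec.h>

vector signed int
neg_by_sub (vector signed int x)
{
  vector signed int zero = vec_splats (0);   /* materialize a zero vector */
  return vec_sub (zero, x);                  /* 0 - x == -x               */
}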
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-longlong.c
new file mode 100644
index 00000000000..48f71788648
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-longlong.c
@@ -0,0 +1,18 @@
+/* Verify that overloaded built-ins for vec_neg with long long
+ inputs produce the right code. */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mpower8-vector -O2" } */
+
+#include <altivec.h>
+
+vector signed long long
+test3 (vector signed long long x)
+{
+ return vec_neg (x);
+}
+
+/* { dg-final { scan-assembler-times "xxspltib|vspltisw" 1 } } */
+/* { dg-final { scan-assembler-times "vsubudm" 1 } } */
+/* { dg-final { scan-assembler-times "vmaxsd" 0 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-short.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-short.c
new file mode 100644
index 00000000000..997a9d48617
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-neg-short.c
@@ -0,0 +1,18 @@
+/* Verify that overloaded built-ins for vec_neg with short
+ inputs produce the right code. */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2" } */
+
+#include <altivec.h>
+
+vector signed short
+test3 (vector signed short x)
+{
+ return vec_neg (x);
+}
+
+/* { dg-final { scan-assembler-times "xxspltib|vspltisw|vxor" 1 } } */
+/* { dg-final { scan-assembler-times "vsubuhm" 1 } } */
+/* { dg-final { scan-assembler-times "vmaxsh" 0 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c
index 7f3e57447f9..1333d882e0e 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c
@@ -16,7 +16,7 @@ testbl (vector bool long long vbl2, vector bool long long vbl3,
}
vector signed long long
-testsl (vector signed long vsl2, vector signed long vsl3,
+testsl (vector signed long long vsl2, vector signed long long vsl3,
vector unsigned char vuc)
{
return vec_perm (vsl2, vsl3, vuc);
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-addpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-addpd-1.c
new file mode 100644
index 00000000000..1dffbbe3ce7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-addpd-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_addpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_add_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] + s2.a[0];
+ e[1] = s1.a[1] + s2.a[1];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-addsd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-addsd-1.c
new file mode 100644
index 00000000000..12c414d2d93
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-addsd-1.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+
+#include <stdint.h>
+#include <stdio.h>
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_addsd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_add_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] + s2.a[0];
+ e[1] = s1.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_addsd_1; check_union128d failed\n");
+ printf ("\t [%f,%f] + [%f,%f] -> [%f,%f]\n", s1.a[0], s1.a[1], s2.a[0],
+ s2.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-andnpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-andnpd-1.c
new file mode 100644
index 00000000000..89e52aa1265
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-andnpd-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_andnpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_andnot_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ long long source1[2]={34545, 95567};
+ long long source2[2]={674, 57897};
+ long long e[2];
+
+ s1.x = _mm_loadu_pd ((double *)source1);
+ s2.x = _mm_loadu_pd ((double *)source2);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = (~source1[0]) & source2[0];
+ e[1] = (~source1[1]) & source2[1];
+
+ if (check_union128d (u, (double *)e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-andpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-andpd-1.c
new file mode 100644
index 00000000000..d23099b0e9e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-andpd-1.c
@@ -0,0 +1,49 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_andpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_and_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+
+ union
+ {
+ double d[2];
+ long long ll[2];
+    } source1, source2, e;
+
+ s1.x = _mm_set_pd (34545, 95567);
+ s2.x = _mm_set_pd (674, 57897);
+
+ _mm_storeu_pd (source1.d, s1.x);
+ _mm_storeu_pd (source2.d, s2.x);
+
+ u.x = test (s1.x, s2.x);
+
+ e.ll[0] = source1.ll[0] & source2.ll[0];
+ e.ll[1] = source1.ll[1] & source2.ll[1];
+
+ if (check_union128d (u, e.d))
+ abort ();
+}
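Editorial aside (not part of the commit): _mm_and_pd and _mm_andnot_pd are bitwise operations on the 64-bit images of the doubles, which is why both tests above compute their expected values on long long views of the same data. A hedged scalar restatement of what the checks compare:

/* Illustrative only: one lane of the and case, restated on scalars.  */
#include <string.h>

static double
and_bits (double a, double b)
{
  unsigned long long ia, ib, ir;
  double r;
  memcpy (&ia, &a, sizeof ia);    /* bit image of a  */
  memcpy (&ib, &b, sizeof ib);    /* bit image of b  */
  ir = ia & ib;                   /* bitwise AND     */
  memcpy (&r, &ir, sizeof r);
  return r;
}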
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-check.h b/gcc/testsuite/gcc.target/powerpc/sse2-check.h
new file mode 100644
index 00000000000..beb1b7d24f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-check.h
@@ -0,0 +1,52 @@
+#include <stdlib.h>
+
+/* Define this to enable the combination of VSX vector double and
+ SSE2 data types. */
+#define __VSX_SSE2__ 1
+
+#include "m128-check.h"
+
+/* Define DEBUG to replace abort with printf on error. */
+//#define DEBUG 1
+
+#if 1
+
+#define TEST sse2_test
+
+static void sse2_test (void);
+
+static void
+__attribute__ ((noinline))
+do_test (void)
+{
+ sse2_test ();
+}
+
+int
+main ()
+ {
+#ifdef __BUILTIN_CPU_SUPPORTS__
+ /* Most SSE2 (vector double) intrinsic operations require VSX
+ instructions, but some operations may need only VMX
+     instructions. This is also true for SSE2 scalar doubles as they
+     imply that the "other half" of the vector remains unchanged or set
+     to zeros. The VSX scalar operations leave the "other half"
+     undefined, and require additional merge operations.
+ Some conversions (to/from integer) need the direct register
+ transfer instructions from POWER8 for best performance.
+ So we test for arch_2_07. */
+ if ( __builtin_cpu_supports ("arch_2_07") )
+ {
+ do_test ();
+#ifdef DEBUG
+ printf ("PASSED\n");
+#endif
+ }
+#ifdef DEBUG
+ else
+ printf ("SKIPPED\n");
+#endif
+#endif /* __BUILTIN_CPU_SUPPORTS__ */
+ return 0;
+ }
+#endif
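To make the harness above easier to follow, here is a hedged, stand-alone illustration of the runtime guard it applies before running a test body (hypothetical program, not part of the commit); the individual sse2-*.c files below simply define TEST and include this header:

/* Illustrative only: the dispatch logic sse2-check.h wraps around each
   test.  */
#include <stdio.h>

int
main (void)
{
#ifdef __BUILTIN_CPU_SUPPORTS__
  if (__builtin_cpu_supports ("arch_2_07"))
    puts ("POWER8 (ISA 2.07) present: the VSX/SSE2-compat test would run");
  else
    puts ("SKIPPED: hardware lacks ISA 2.07 support");
#else
  puts ("SKIPPED: __builtin_cpu_supports is not available here");
#endif
  return 0;
}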
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cmppd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cmppd-1.c
new file mode 100644
index 00000000000..af9df4d3209
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cmppd-1.c
@@ -0,0 +1,76 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cmp_pd_1
+#endif
+
+#include <emmintrin.h>
+#include <math.h>
+
+double ps1[] = {2134.3343, 6678.346};
+double ps2[] = {41124.234, 6678.346};
+long long pdd[] = {1, 2}, pd[2];
+union{long long l[2]; double d[2];} pe;
+
+void pd_check(char *id, __m128d dst)
+{
+ __v2di dest = (__v2di)dst;
+
+ if(checkVl(pd, pe.l, 2))
+ {
+ printf("mm_cmp%s_pd FAILED\n", id);
+ printf("dst [%lld, %lld], e.l[%lld, %lld]\n",
+ dest[0], dest[1], pe.l[0], pe.l[1]);
+ }
+}
+
+#define CMP(cmp, rel0, rel1) \
+ pe.l[0] = rel0 ? -1 : 0; \
+ pe.l[1] = rel1 ? -1 : 0; \
+ dest = _mm_loadu_pd((double*)pdd); \
+ source1 = _mm_loadu_pd(ps1); \
+ source2 = _mm_loadu_pd(ps2); \
+ dest = _mm_cmp##cmp##_pd(source1, source2); \
+ _mm_storeu_pd((double*) pd, dest); \
+ pd_check("" #cmp "", dest);
+
+static void
+TEST ()
+{
+ __m128d source1, source2, dest;
+
+ CMP(eq, !isunordered(ps1[0], ps2[0]) && ps1[0] == ps2[0],
+ !isunordered(ps1[1], ps2[1]) && ps1[1] == ps2[1]);
+ CMP(lt, !isunordered(ps1[0], ps2[0]) && ps1[0] < ps2[0],
+ !isunordered(ps1[1], ps2[1]) && ps1[1] < ps2[1]);
+ CMP(le, !isunordered(ps1[0], ps2[0]) && ps1[0] <= ps2[0],
+ !isunordered(ps1[1], ps2[1]) && ps1[1] <= ps2[1]);
+ CMP(unord, isunordered(ps1[0], ps2[0]),
+ isunordered(ps1[1], ps2[1]));
+ CMP(neq, isunordered(ps1[0], ps2[0]) || ps1[0] != ps2[0],
+      isunordered(ps1[1], ps2[1]) || ps1[1] != ps2[1]);
+ CMP(nlt, isunordered(ps1[0], ps2[0]) || ps1[0] >= ps2[0],
+ isunordered(ps1[1], ps2[1]) || ps1[1] >= ps2[1]);
+ CMP(nle, isunordered(ps1[0], ps2[0]) || ps1[0] > ps2[0],
+ isunordered(ps1[1], ps2[1]) || ps1[1] > ps2[1]);
+ CMP(ord, !isunordered(ps1[0], ps2[0]),
+ !isunordered(ps1[1], ps2[1]));
+
+ CMP(ge, isunordered(ps1[0], ps2[0]) || ps1[0] >= ps2[0],
+ isunordered(ps1[1], ps2[1]) || ps1[1] >= ps2[1]);
+ CMP(gt, isunordered(ps1[0], ps2[0]) || ps1[0] > ps2[0],
+ isunordered(ps1[1], ps2[1]) || ps1[1] > ps2[1]);
+ CMP(nge, !isunordered(ps1[0], ps2[0]) && ps1[0] < ps2[0],
+ !isunordered(ps1[1], ps2[1]) && ps1[1] < ps2[1]);
+ CMP(ngt, !isunordered(ps1[0], ps2[0]) && ps1[0] <= ps2[0],
+ !isunordered(ps1[1], ps2[1]) && ps1[1] <= ps2[1]);
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cmpsd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cmpsd-1.c
new file mode 100644
index 00000000000..331923c53d3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cmpsd-1.c
@@ -0,0 +1,65 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cmp_sd_1
+#endif
+
+#include <emmintrin.h>
+#include <math.h>
+
+double s1[] = {2134.3343, 6678.346};
+double s2[] = {41124.234, 6678.346};
+long long dd[] = {1, 2}, d[2];
+union{long long l[2]; double d[2];} e;
+
+void check(char *id, __m128d dst)
+{
+ __v2di dest = (__v2di)dst;
+
+ if(checkVl(d, e.l, 2))
+ {
+ printf("mm_cmp%s_sd FAILED\n", id);
+ printf("dst [%lld, %lld], e.l[%lld]\n",
+ dest[0], dest[1], e.l[0]);
+ }
+}
+
+#define CMP(cmp, rel) \
+ e.l[0] = rel ? -1 : 0; \
+ dest = _mm_loadu_pd((double*)dd); \
+ source1 = _mm_loadu_pd(s1); \
+ source2 = _mm_loadu_pd(s2); \
+ dest = _mm_cmp##cmp##_sd(source1, source2); \
+ _mm_storeu_pd((double*) d, dest); \
+ check("" #cmp "", dest);
+
+static void
+TEST ()
+{
+ __m128d source1, source2, dest;
+
+ e.d[1] = s1[1];
+
+ CMP(eq, !isunordered(s1[0], s2[0]) && s1[0] == s2[0]);
+ CMP(lt, !isunordered(s1[0], s2[0]) && s1[0] < s2[0]);
+ CMP(le, !isunordered(s1[0], s2[0]) && s1[0] <= s2[0]);
+ CMP(unord, isunordered(s1[0], s2[0]));
+ CMP(neq, isunordered(s1[0], s2[0]) || s1[0] != s2[0]);
+ CMP(nlt, isunordered(s1[0], s2[0]) || s1[0] >= s2[0]);
+ CMP(nle, isunordered(s1[0], s2[0]) || s1[0] > s2[0]);
+ CMP(ord, !isunordered(s1[0], s2[0]));
+
+ CMP(ge, isunordered(s1[0], s2[0]) || s1[0] >= s2[0]);
+ CMP(gt, isunordered(s1[0], s2[0]) || s1[0] > s2[0]);
+ CMP(nge, !isunordered(s1[0], s2[0]) && s1[0] < s2[0]);
+ CMP(ngt, !isunordered(s1[0], s2[0]) && s1[0] <= s2[0]);
+}
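Because the CMP macro above packs several steps into one statement, a hand expansion of a single invocation may help; this is an editorial illustration written in the context of TEST () above, not code from the commit:

/* Hand expansion of
     CMP(eq, !isunordered(s1[0], s2[0]) && s1[0] == s2[0]);
   using the globals and locals of the test above; illustrative only.  */
e.l[0] = (!isunordered (s1[0], s2[0]) && s1[0] == s2[0]) ? -1 : 0;
dest = _mm_loadu_pd ((double *) dd);        /* seed the destination       */
source1 = _mm_loadu_pd (s1);
source2 = _mm_loadu_pd (s2);
dest = _mm_cmpeq_sd (source1, source2);     /* lane 0: all-ones or zero;
                                               lane 1: copied from source1 */
_mm_storeu_pd ((double *) d, dest);
check ("eq", dest);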
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-comisd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-1.c
new file mode 100644
index 00000000000..7bed4b41f5b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-1.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_comi_sd_1
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_comieq_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,2344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] == s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-comisd-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-2.c
new file mode 100644
index 00000000000..6a8b45d3102
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-2.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_comi_sd_2
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_comilt_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,2344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] < s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-comisd-3.c b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-3.c
new file mode 100644
index 00000000000..2ed5e4ee06f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-3.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_comi_sd_3
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_comile_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,2344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] <= s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-comisd-4.c b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-4.c
new file mode 100644
index 00000000000..2a3b5b8465f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-4.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_comi_sd_4
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_comigt_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,12344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] > s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-comisd-5.c b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-5.c
new file mode 100644
index 00000000000..59139cb0a9b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-5.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_comi_sd_5
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_comige_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,2344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] >= s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-comisd-6.c b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-6.c
new file mode 100644
index 00000000000..e904e2bc7fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-comisd-6.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_comi_sd_6
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_comineq_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,2344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] != s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtdq2pd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtdq2pd-1.c
new file mode 100644
index 00000000000..0c9ee3a351d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtdq2pd-1.c
@@ -0,0 +1,55 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtepi32_pd
+#endif
+
+#include <emmintrin.h>
+#ifdef _ARCH_PWR8
+static __m128d
+__attribute__((noinline, unused))
+test (__m128i p)
+{
+ return _mm_cvtepi32_pd (p);
+}
+#endif
+
+static void
+TEST (void)
+{
+#ifdef _ARCH_PWR8
+ union128d u;
+ union128i_d s;
+ double e[2];
+
+ s.x = _mm_set_epi32 (123, 321, 456, 987);
+
+ u.x = test (s.x);
+
+ e[0] = (double)s.a[0];
+ e[1] = (double)s.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_cvtepi32_pd; check_union128d failed\n");
+ printf ("\t [%d,%d, %d, %d] -> [%f,%f]\n",
+ s.a[0], s.a[1], s.a[2], s.a[3],
+ u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n",
+ e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtdq2ps-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtdq2ps-1.c
new file mode 100644
index 00000000000..50dec1bf5a9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtdq2ps-1.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtepi32_ps
+#endif
+
+#include <emmintrin.h>
+
+static __m128
+__attribute__((noinline, unused))
+test (__m128i p)
+{
+ return _mm_cvtepi32_ps (p);
+}
+
+static void
+TEST (void)
+{
+ union128 u;
+ union128i_d s;
+ float e[4];
+
+ s.x = _mm_set_epi32 (123, 321, 456, 987);
+
+ u.x = test (s.x);
+
+ e[0] = (float)s.a[0];
+ e[1] = (float)s.a[1];
+ e[2] = (float)s.a[2];
+ e[3] = (float)s.a[3];
+
+ if (check_union128 (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtpd2dq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtpd2dq-1.c
new file mode 100644
index 00000000000..ecbcbe99b53
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtpd2dq-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtpd_epi32
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128d p)
+{
+ return _mm_cvtpd_epi32 (p);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u;
+ union128d s;
+ int e[4] = {0};
+
+ s.x = _mm_set_pd (2.78, 7777768.82);
+
+ u.x = test (s.x);
+
+ e[0] = (int)(s.a[0] + 0.5);
+ e[1] = (int)(s.a[1] + 0.5);
+
+ if (check_union128i_d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_cvtpd_epi32; check_union128i_d failed\n");
+ printf ("\t [%f,%f] -> [%d,%d,%d,%d]\n", s.a[0], s.a[1], u.a[0], u.a[1],
+ u.a[2], u.a[3]);
+ printf ("\t expect [%d,%d,%d,%d]\n", e[0], e[1], e[2], e[3]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtpd2ps-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtpd2ps-1.c
new file mode 100644
index 00000000000..7c9c01dc33a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtpd2ps-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtpd_ps
+#endif
+
+#include <emmintrin.h>
+
+static __m128
+__attribute__((noinline, unused))
+test (__m128d p)
+{
+ return _mm_cvtpd_ps (p);
+}
+
+static void
+TEST (void)
+{
+ union128 u;
+ union128d s;
+ float e[4] = { 0.0 };
+
+ s.x = _mm_set_pd (123.321, 456.987);
+
+ u.x = test (s.x);
+
+ e[0] = (float)s.a[0];
+ e[1] = (float)s.a[1];
+
+ if (check_union128 (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_cvtpd_ps; check_union128 failed\n");
+ printf ("\t [%f,%f] -> [%f,%f,%f,%f]\n", s.a[0], s.a[1], u.a[0], u.a[1],
+ u.a[2], u.a[3]);
+ printf ("\t expect [%f,%f,%f,%f]\n", e[0], e[1], e[2], e[3]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtps2dq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtps2dq-1.c
new file mode 100644
index 00000000000..36a94ff88f9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtps2dq-1.c
@@ -0,0 +1,52 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtps2dq_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128 p)
+{
+ return _mm_cvtps_epi32 (p);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u;
+ union128 s;
+ int e[4] = {0};
+
+ s.x = _mm_set_ps (2.78, 7777768.82, 2.331, 3.456);
+
+ u.x = test (s.x);
+
+ e[0] = (int)(s.a[0] + 0.5);
+ e[1] = (int)(s.a[1] + 0.5);
+ e[2] = (int)(s.a[2] + 0.5);
+ e[3] = (int)(s.a[3] + 0.5);
+
+ if (check_union128i_d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_cvtps2dq_1; check_union128i_d failed\n");
+ printf ("\t [%f,%f,%f,%f] -> [%d,%d,%d,%d]\n", s.a[0], s.a[1], s.a[2],
+ s.a[3], u.a[0], u.a[1], u.a[2], u.a[3]);
+ printf ("\t expect [%d,%d,%d,%d]\n", e[0], e[1], e[2], e[3]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtps2pd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtps2pd-1.c
new file mode 100644
index 00000000000..de85ac407dc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtps2pd-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtps2pd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128 p)
+{
+ return _mm_cvtps_pd (p);
+}
+
+static void
+TEST (void)
+{
+ union128d u;
+ union128 s;
+ double e[2];
+
+ s.x = _mm_set_ps (2.78, 7777768.82, 2.331, 3.456);
+
+ u.x = test (s.x);
+
+ e[0] = (double)s.a[0];
+ e[1] = (double)s.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_cvtps2pd_1; check_union128d failed\n");
+ printf ("\t cvt\t [%f,%f,%f,%f] -> [%f,%f]\n", s.a[0], s.a[1], s.a[2],
+ s.a[3], u.a[0], u.a[1]);
+ printf ("\t expect\t [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2si-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2si-1.c
new file mode 100644
index 00000000000..77a1ad5af4c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2si-1.c
@@ -0,0 +1,49 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtsd2si_1
+#endif
+
+#include <emmintrin.h>
+
+
+static int
+__attribute__((noinline, unused))
+test (__m128d p)
+{
+ return _mm_cvtsd_si32 (p);
+}
+
+static void
+TEST (void)
+{
+ union128d s;
+ int e;
+ int d;
+
+ s.x = _mm_set_pd (123.321, 456.987);
+
+ d = test (s.x);
+
+ e = (int)(s.a[0] + 0.5);
+
+ if (d != e)
+#if DEBUG
+ {
+ printf ("sse2_test_cvtsd2si_1; failed\n");
+ printf ("\t [%f,%f] -> [%d]\n", s.a[0], s.a[1], d);
+ printf ("\t expect [%d]\n", e);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2si-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2si-2.c
new file mode 100644
index 00000000000..a36e0e90fb6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2si-2.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtsd2si_2
+#endif
+
+#include <emmintrin.h>
+
+static long long
+__attribute__((noinline, unused))
+test (__m128d p)
+{
+ return _mm_cvtsd_si64 (p);
+}
+
+static void
+TEST (void)
+{
+ union128d s;
+ long long e;
+ long long d;
+
+ s.x = _mm_set_pd (829496729501.4, 429496729501.4);
+
+ d = test (s.x);
+
+ e = (long long)(s.a[0] + 0.5);
+
+ if (d != e)
+#if DEBUG
+ {
+ printf ("sse2_test_cvtsd2si_2; failed\n");
+      printf ("\t [%f,%f] -> [%lld]\n", s.a[0], s.a[1], d);
+      printf ("\t expect [%lld]\n", e);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2ss-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2ss-1.c
new file mode 100644
index 00000000000..33274cfa73d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtsd2ss-1.c
@@ -0,0 +1,53 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtsd2ss_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128
+__attribute__((noinline, unused))
+test (__m128 p1, __m128d p2)
+{
+ return _mm_cvtsd_ss (p1, p2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1;
+ union128 u, s2;
+ double source1[2] = {123.345, 67.3321};
+ float e[4] = {5633.098, 93.21, 3.34, 4555.2};
+
+ s1.x = _mm_loadu_pd (source1);
+ s2.x = _mm_loadu_ps (e);
+
+ __asm("" : "+v"(s1.x), "+v"(s2.x));
+ u.x = test(s2.x, s1.x);
+
+ e[0] = (float)source1[0];
+
+ if (check_union128(u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_cvtsd2ss_1; check_union128 failed\n");
+ printf ("\t [%f,%f,%f,%f],[%f,%f]\n", s2.a[0], s2.a[1], s2.a[2], s2.a[3],
+ s1.a[0], s1.a[1]);
+ printf ("\t -> \t[%f,%f,%f,%f]\n", u.a[0], u.a[1], u.a[2], u.a[3]);
+ printf ("\texpect\t[%f,%f,%f,%f]\n", e[0], e[1], e[2], e[3]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtsi2sd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtsi2sd-1.c
new file mode 100644
index 00000000000..5465945e8b5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtsi2sd-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtsi2sd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d p, int b)
+{
+ __asm("" : "+v"(p), "+r"(b));
+ return _mm_cvtsi32_sd (p, b);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s;
+ int b = 128;
+ double e[2];
+
+ s.x = _mm_set_pd (123.321, 456.987);
+
+ u.x = test (s.x, b);
+ e[0] = (double)b;
+ e[1] = s.a[1];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtsi2sd-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtsi2sd-2.c
new file mode 100644
index 00000000000..cd8f0840702
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtsi2sd-2.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtsi2sd_2
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d p, long long b)
+{
+ __asm("" : "+v"(p), "+r"(b));
+ return _mm_cvtsi64_sd (p, b);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s;
+ long long b = 42949672951333LL;
+ double e[2];
+
+ s.x = _mm_set_pd (123.321, 456.987);
+
+ u.x = test (s.x, b);
+ e[0] = (double)b;
+ e[1] = s.a[1];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvtss2sd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvtss2sd-1.c
new file mode 100644
index 00000000000..d93bae68d8b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvtss2sd-1.c
@@ -0,0 +1,52 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvtss2sd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d a, __m128 b)
+{
+ return _mm_cvtss_sd (a, b);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1;
+ union128 s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (123.321, 456.987);
+ s2.x = _mm_set_ps (123.321, 456.987, 666.45, 231.987);
+
+ u.x = test (s1.x, s2.x);
+
+ e[0] = (double)s2.a[0];
+ e[1] = s1.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_cvtss2sd_1; check_union128d failed\n");
+ printf ("\t [%f,%f], [%f,%f,%f,%f]\n", s1.a[0], s1.a[1], s2.a[0], s2.a[1],
+ s2.a[2], s2.a[3]);
+ printf ("\t -> \t[%f,%f]\n", u.a[0], u.a[1]);
+ printf ("\texpect\t[%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvttpd2dq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvttpd2dq-1.c
new file mode 100644
index 00000000000..baa7d3baa75
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvttpd2dq-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvttpd_epi32
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128d p)
+{
+ return _mm_cvttpd_epi32 (p);
+}
+
+static void
+TEST (void)
+{
+ union128d s;
+ union128i_d u;
+ int e[4] = {0};
+
+ s.x = _mm_set_pd (123.321, 456.987);
+
+ u.x = test (s.x);
+
+ e[0] = (int)s.a[0];
+ e[1] = (int)s.a[1];
+
+ if (check_union128i_d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_cvttpd_epi32; check_union128i_d failed\n");
+ printf ("\t [%f,%f] -> [%d,%d,%d,%d]\n", s.a[0], s.a[1], u.a[0], u.a[1],
+ u.a[2], u.a[3]);
+ printf ("\t expect [%d,%d,%d,%d]\n", e[0], e[1], e[2], e[3]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvttps2dq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvttps2dq-1.c
new file mode 100644
index 00000000000..88427d8c6f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvttps2dq-1.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvttps2dq_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128 p)
+{
+ return _mm_cvttps_epi32 (p);
+}
+
+static void
+TEST (void)
+{
+ union128 s;
+ union128i_d u;
+ int e[4] = {0};
+
+ s.x = _mm_set_ps (123.321, 456.987, 33.56, 7765.321);
+
+ u.x = test (s.x);
+
+ e[0] = (int)s.a[0];
+ e[1] = (int)s.a[1];
+ e[2] = (int)s.a[2];
+ e[3] = (int)s.a[3];
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvttsd2si-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvttsd2si-1.c
new file mode 100644
index 00000000000..2dc96d1eaca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvttsd2si-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvttsd2si_1
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d p)
+{
+ __asm("" : "+v"(p));
+ return _mm_cvttsd_si32 (p);
+}
+
+static void
+TEST (void)
+{
+ union128d s;
+ int e;
+ int d;
+
+ s.x = _mm_set_pd (123.321, 456.987);
+
+ d = test (s.x);
+ e = (int)(s.a[0]);
+
+ if (d != e)
+#if DEBUG
+ {
+ printf ("sse2_test_cvttsd2si_1; failed\n");
+ printf ("\t [%f,%f] -> [%d]\n", s.a[0], s.a[1], d);
+ printf ("\t expect [%d]\n", e);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-cvttsd2si-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-cvttsd2si-2.c
new file mode 100644
index 00000000000..cd6fa926b4a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-cvttsd2si-2.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_cvttsd2si_2
+#endif
+
+#include <emmintrin.h>
+
+static long long
+__attribute__((noinline, unused))
+test (__m128d p)
+{
+ __asm("" : "+v"(p));
+ return _mm_cvttsd_si64 (p);
+}
+
+static void
+TEST (void)
+{
+ union128d s;
+ long long e;
+ long long d;
+
+ s.x = _mm_set_pd (123.321, 42949672339501.4);
+
+ d = test (s.x);
+ e = (long long)(s.a[0]);
+
+ if (d != e)
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-divpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-divpd-1.c
new file mode 100644
index 00000000000..e4a3bdaa54f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-divpd-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_divpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_div_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] / s2.a[0];
+ e[1] = s1.a[1] / s2.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_divpd_1; check_union128d failed\n");
+ printf ("\t [%f,%f] * [%f,%f] -> [%f,%f]\n", s1.a[0], s1.a[1], s2.a[0],
+ s2.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-divsd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-divsd-1.c
new file mode 100644
index 00000000000..197151e9a18
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-divsd-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_divsd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_div_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] / s2.a[0];
+ e[1] = s1.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_divsd_1; check_union128d failed\n");
+ printf ("\t [%f,%f] / [%f,%f] -> [%f,%f]\n", s1.a[0], s1.a[1], s2.a[0],
+ s2.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-maxpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-maxpd-1.c
new file mode 100644
index 00000000000..4462a2ee011
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-maxpd-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_maxpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_max_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] > s2.a[0] ? s1.a[0]:s2.a[0];
+ e[1] = s1.a[1] > s2.a[1] ? s1.a[1]:s2.a[1];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-maxsd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-maxsd-1.c
new file mode 100644
index 00000000000..e17628950fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-maxsd-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_maxsd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_max_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] > s2.a[0] ? s1.a[0]:s2.a[0];
+ e[1] = s1.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_maxsd_3; check_union128d failed\n");
+ printf ("\t [%f,%f] + [%f,%f] -> [%f,%f]\n", s1.a[0], s1.a[1], s2.a[0],
+ s2.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-minpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-minpd-1.c
new file mode 100644
index 00000000000..f4d1960bf78
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-minpd-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_minpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_min_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] < s2.a[0] ? s1.a[0]:s2.a[0];
+ e[1] = s1.a[1] < s2.a[1] ? s1.a[1]:s2.a[1];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-minsd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-minsd-1.c
new file mode 100644
index 00000000000..4b3087bc403
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-minsd-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_minsd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_min_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] < s2.a[0] ? s1.a[0]:s2.a[0];
+ e[1] = s1.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_minsd_3; check_union128d failed\n");
+ printf ("\t [%f,%f] + [%f,%f] -> [%f,%f]\n", s1.a[0], s1.a[1], s2.a[0],
+ s2.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-mmx.c b/gcc/testsuite/gcc.target/powerpc/sse2-mmx.c
new file mode 100644
index 00000000000..115d83a4283
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-mmx.c
@@ -0,0 +1,82 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#include "sse2-check.h"
+
+#ifndef TEST
+#define TEST sse2_test_mmx_1
+#endif
+
+#include <mmintrin.h>
+
+#define N 4
+
+unsigned long long a[N], b[N], result[N];
+
+unsigned long long check_data[N] =
+ { 0x101010101010100full,
+ 0x1010101010101010ull,
+ 0x1010101010101010ull,
+ 0x1010101010101010ull };
+
+__m64
+unsigned_add3 (const __m64 * a, const __m64 * b,
+ __m64 * result, unsigned int count)
+{
+ __m64 _a, _b, one, sum, carry, onesCarry;
+
+ unsigned int i;
+
+ carry = _mm_setzero_si64 ();
+
+ one = _mm_cmpeq_pi8 (carry, carry);
+ one = _mm_sub_si64 (carry, one);
+
+ for (i = 0; i < count; i++)
+ {
+ _a = a[i];
+ _b = b[i];
+
+ sum = _mm_add_si64 (_a, _b);
+ sum = _mm_add_si64 (sum, carry);
+
+ result[i] = sum;
+
+ onesCarry = _mm_and_si64 (_mm_xor_si64 (_a, _b), carry);
+ onesCarry = _mm_or_si64 (_mm_and_si64 (_a, _b), onesCarry);
+ onesCarry = _mm_and_si64 (onesCarry, one);
+
+ _a = _mm_srli_si64 (_a, 1);
+ _b = _mm_srli_si64 (_b, 1);
+
+ carry = _mm_add_si64 (_mm_add_si64 (_a, _b), onesCarry);
+ carry = _mm_srli_si64 (carry, 63);
+ }
+
+ return carry;
+}
+
+void __attribute__((noinline))
+TEST (void)
+{
+ unsigned long long carry;
+ int i;
+
+ /* Really long numbers. */
+ a[3] = a[2] = a[1] = a[0] = 0xd3d3d3d3d3d3d3d3ull;
+ b[3] = b[2] = b[1] = b[0] = 0x3c3c3c3c3c3c3c3cull;
+
+ carry = (unsigned long long) unsigned_add3
+ ((__m64 *)a, (__m64 *)b, (__m64 *)result, N);
+
+ _mm_empty ();
+
+ if (carry != 1)
+ abort ();
+
+ for (i = 0; i < N; i++)
+ if (result [i] != check_data[i])
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movhpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-movhpd-1.c
new file mode 100644
index 00000000000..9b7c2ec3a92
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movhpd-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movhpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, double *p)
+{
+ __asm("" : "+v"(s1), "+b"(p));
+ return _mm_loadh_pd (s1, p);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1;
+ double s2[2] = {41124.234,2344.2354};
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ u.x = test (s1.x, s2);
+
+ e[0] = s1.a[0];
+ e[1] = s2[0];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movhpd-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-movhpd-2.c
new file mode 100644
index 00000000000..b5eb08657cd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movhpd-2.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movhpd_2
+#endif
+
+#include <emmintrin.h>
+
+static void
+__attribute__((noinline, unused))
+test (double *p, __m128d a)
+{
+ __asm("" : "+v"(a), "+b"(p));
+ _mm_storeh_pd (p, a);
+}
+
+static void
+TEST (void)
+{
+ union128d s;
+ double d[1];
+ double e[1];
+
+ s.x = _mm_set_pd (2134.3343,1234.635654);
+ test (d, s.x);
+
+ e[0] = s.a[1];
+
+ if (e[0] != d[0])
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movlpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-movlpd-1.c
new file mode 100644
index 00000000000..fec05bca99a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movlpd-1.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movlpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d a, double *e)
+{
+ __asm("" : "+v"(a), "+b"(e));
+ return _mm_loadl_pd (a, e);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1;
+ double d[2] = {2134.3343,1234.635654};
+ double e[2];
+
+ s1.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = _mm_loadu_pd (d);
+
+ u.x = test (s1.x, d);
+
+ e[0] = d[0];
+ e[1] = s1.a[1];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movlpd-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-movlpd-2.c
new file mode 100644
index 00000000000..6974d3be646
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movlpd-2.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movlpd_2
+#endif
+
+#include <emmintrin.h>
+
+static void
+__attribute__((noinline, unused))
+test (double *e, __m128d a)
+{
+ __asm("" : "+v"(a), "+b"(e));
+ _mm_storel_pd (e, a);
+}
+
+static void
+TEST (void)
+{
+ union128d u;
+ double e[2];
+
+ u.x = _mm_set_pd (41124.234,2344.2354);
+
+ test (e, u.x);
+
+ e[1] = u.a[1];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movmskpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-movmskpd-1.c
new file mode 100644
index 00000000000..dda519dc762
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movmskpd-1.c
@@ -0,0 +1,61 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movmskpd_1
+#endif
+
+#include <emmintrin.h>
+
+#ifdef _ARCH_PWR8
+static int
+__attribute__((noinline, unused))
+test (__m128d p)
+{
+ __asm("" : "+v"(p));
+ return _mm_movemask_pd (p);
+}
+#endif
+
+static void
+TEST (void)
+{
+#ifdef _ARCH_PWR8
+ double source[2] = {1.234, -2234.23};
+ union128d s1;
+ int d;
+ int e;
+
+ s1.x = _mm_loadu_pd (source);
+
+ d = test (s1.x);
+
+ e = 0;
+ if (source[0] < 0)
+ e |= 1;
+
+ if (source[1] < 0)
+ e |= 1 << 1;
+
+ if (checkVi (&d, &e, 1))
+#if DEBUG
+ {
+ printf ("sse2_test_movmskpd_1; check_union128d failed\n");
+ printf ("\t [%f,%f] -> [%d]\n",
+ s1.a[0], s1.a[1], d);
+ printf ("\t expect [%d]\n",
+ e);
+ }
+#else
+ abort ();
+#endif
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-movq-1.c
new file mode 100644
index 00000000000..6b65e15b09c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movq-1.c
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movq_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i b)
+{
+ __asm("" : "+v"(b));
+ return _mm_move_epi64 (b);
+}
+
+static void
+TEST (void)
+{
+ union128i_q u, s1;
+ long long e[2] = { 0 };
+
+ s1.x = _mm_set_epi64x(12876, 3376590);
+ u.x = test (s1.x);
+ e[0] = s1.a[0];
+
+ if (check_union128i_q (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_movq_1; check_union128i_q failed\n");
+ printf ("\t move_epi64 ([%llx, %llx]) -> [%llx, %llx]\n", s1.a[0],
+ s1.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%llx, %llx]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movq-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-movq-2.c
new file mode 100644
index 00000000000..e742157e9ae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movq-2.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movq_2
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (long long b)
+{
+ __asm("" : "+r" (b));
+ return _mm_cvtsi64_si128 (b);
+}
+
+static void
+TEST (void)
+{
+ union128i_q u;
+ long long b = 4294967295133LL;
+ long long e[2] = {0};
+
+ u.x = test (b);
+
+ e[0] = b;
+
+ if (check_union128i_q (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movq-3.c b/gcc/testsuite/gcc.target/powerpc/sse2-movq-3.c
new file mode 100644
index 00000000000..ea80e2375d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movq-3.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movq_3
+#endif
+
+#include <emmintrin.h>
+
+static long long
+__attribute__((noinline, unused))
+test (__m128i b)
+{
+ __asm("" : "+v"(b));
+ return _mm_cvtsi128_si64 (b);
+}
+
+static void
+TEST (void)
+{
+ union128i_q u;
+ long long e;
+
+ u.x = _mm_set_epi64x (4294967295133LL, 3844294967295133LL);
+ e = test (u.x);
+ if (e != u.a[0])
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movsd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-movsd-1.c
new file mode 100644
index 00000000000..fe471ed1aa2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movsd-1.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movsd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (double *p)
+{
+ return _mm_load_sd (p);
+}
+
+static void
+TEST (void)
+{
+ union128d u;
+ double d[2] = {128.023, 3345.1234};
+ double e[2];
+
+ u.x = _mm_loadu_pd (e);
+ u.x = test (d);
+
+ e[0] = d[0];
+ e[1] = 0.0;
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movsd-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-movsd-2.c
new file mode 100644
index 00000000000..2c1e35538f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movsd-2.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movsd_2
+#endif
+
+#include <emmintrin.h>
+
+static void
+__attribute__((noinline, unused))
+test (double *p, __m128d a)
+{
+ _mm_store_sd (p, a);
+}
+
+static void
+TEST (void)
+{
+ union128d u;
+ double d[1];
+ double e[1];
+
+ u.x = _mm_set_pd (128.023, 3345.1234);
+ test (d, u.x);
+
+ e[0] = u.a[0];
+
+ if (checkVd (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-movsd-3.c b/gcc/testsuite/gcc.target/powerpc/sse2-movsd-3.c
new file mode 100644
index 00000000000..57a5c23ae1c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-movsd-3.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_movsd_3
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d a, __m128d b)
+{
+ __asm("" : "+v"(a), "+v"(b));
+ return _mm_move_sd (a, b);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2] = { 256.046, 3345.1234 };
+
+ s1.x = _mm_setr_pd (128.023, 3345.1234);
+ s2.x = _mm_setr_pd (256.046, 4533.1234);
+ __asm("" : "+v"(s1.x), "+v"(s2.x));
+ u.x = test (s1.x, s2.x);
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_movsd_3; check_union128d failed\n");
+ printf ("\t [%f,%f], [%f,%f] -> [%f,%f]\n", s1.a[0], s1.a[1], s2.a[0],
+ s2.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-mulpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-mulpd-1.c
new file mode 100644
index 00000000000..b19f3b86123
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-mulpd-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_mulpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_mul_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] * s2.a[0];
+ e[1] = s1.a[1] * s2.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_mul_pd_1; check_union128d failed\n");
+ printf ("\t [%f,%f] * [%f,%f] -> [%f,%f]\n", s1.a[0], s1.a[1], s2.a[0],
+ s2.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-mulsd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-mulsd-1.c
new file mode 100644
index 00000000000..8206d263459
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-mulsd-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_mulsd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_mul_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] * s2.a[0];
+ e[1] = s1.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_mul_sd_1; check_union128d failed\n");
+ printf ("\t [%f,%f] * [%f,%f] -> [%f,%f]\n", s1.a[0], s1.a[1], s2.a[0],
+ s2.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-orpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-orpd-1.c
new file mode 100644
index 00000000000..f3d9ab8f458
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-orpd-1.c
@@ -0,0 +1,49 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_orpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_or_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+
+ union
+ {
+ double d[2];
+ long long ll[2];
+ }d1, d2, e;
+
+ s1.x = _mm_set_pd (1234, 44386);
+ s2.x = _mm_set_pd (5198, 23098);
+
+ _mm_storeu_pd (d1.d, s1.x);
+ _mm_storeu_pd (d2.d, s2.x);
+
+ u.x = test (s1.x, s2.x);
+
+ e.ll[0] = d1.ll[0] | d2.ll[0];
+ e.ll[1] = d1.ll[1] | d2.ll[1];
+
+ if (check_union128d (u, e.d))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-packssdw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-packssdw-1.c
new file mode 100644
index 00000000000..d67d47c4360
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-packssdw-1.c
@@ -0,0 +1,73 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_packssdw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_packs_epi32 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_d s1, s2;
+ union128i_w u;
+ short e[8];
+ int i;
+
+ s1.x = _mm_set_epi32 (2134, -128, 655366, 9999);
+ s2.x = _mm_set_epi32 (41124, 234, 2, -800900);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 4; i++)
+ {
+ if (s1.a[i] > 32767)
+ e[i] = 32767;
+ else if (s1.a[i] < -32768)
+ e[i] = -32768;
+ else
+ e[i] = s1.a[i];
+ }
+
+ for (i = 0; i < 4; i++)
+ {
+ if (s2.a[i] > 32767)
+ e[i+4] = 32767;
+ else if (s2.a[i] < -32768)
+ e[i+4] = -32768;
+ else
+ e[i+4] = s2.a[i];
+ }
+
+ if (check_union128i_w (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_packssdw_1; check_union128i_w failed\n");
+ printf (
+ "\t ([%x,%x,%x,%x], [%x,%x,%x,%x]) -> [%x,%x,%x,%x, %x,%x,%x,%x]\n",
+ s1.a[0], s1.a[1], s1.a[2], s1.a[3], s2.a[0], s2.a[1], s2.a[2],
+ s2.a[3], u.a[0], u.a[1], u.a[2], u.a[3], u.a[4], u.a[5], u.a[6],
+ u.a[7]);
+ printf ("\t expect [%x,%x,%x,%x, %x,%x,%x,%x]\n", e[0], e[1], e[2], e[3],
+ e[4], e[5], e[6], e[7]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-packsswb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-packsswb-1.c
new file mode 100644
index 00000000000..3043688bfd4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-packsswb-1.c
@@ -0,0 +1,78 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_packsswb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_packs_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w s1, s2;
+ union128i_b u;
+ char e[16];
+ int i;
+
+ s1.x = _mm_set_epi16 (2134, -128, 1234, 6354, 1002, 3004, 4050, 9999);
+ s2.x = _mm_set_epi16 (41124, 234, 2344, 2354, 607, 1, 2, -8009);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ {
+ if (s1.a[i] > 127)
+ e[i] = 127;
+ else if (s1.a[i] < -128)
+ e[i] = -128;
+ else
+ e[i] = s1.a[i];
+ }
+
+ for (i = 0; i < 8; i++)
+ {
+ if (s2.a[i] > 127)
+ e[i+8] = 127;
+ else if (s2.a[i] < -128)
+ e[i+8] = -128;
+ else
+ e[i+8] = s2.a[i];
+ }
+
+ if (check_union128i_b (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_packsswb_1; check_union128i_w failed\n");
+ printf ("\t ([%x,%x,%x,%x, %x,%x,%x,%x], [%x,%x,%x,%x, %x,%x,%x,%x])\n",
+ s1.a[0], s1.a[1], s1.a[2], s1.a[3], s1.a[4], s1.a[5], s1.a[6],
+ s1.a[7], s2.a[0], s2.a[1], s2.a[2], s2.a[3], s2.a[4], s2.a[5],
+ s2.a[6], s2.a[7]);
+ printf ("\t\t -> [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ u.a[0], u.a[1], u.a[2], u.a[3], u.a[4], u.a[5], u.a[6], u.a[7],
+ u.a[8], u.a[9], u.a[10], u.a[11], u.a[12], u.a[13], u.a[14],
+ u.a[15]);
+ printf (
+ "\t expect [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ e[0], e[1], e[2], e[3], e[4], e[5], e[6], e[7], e[8], e[9], e[10],
+ e[11], e[12], e[13], e[14], e[15]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-packuswb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-packuswb-1.c
new file mode 100644
index 00000000000..825003d6103
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-packuswb-1.c
@@ -0,0 +1,69 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_packuswb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_packus_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w s1, s2;
+ union128i_ub u;
+ unsigned char e[16];
+ int i, tmp;
+
+ s1.x = _mm_set_epi16 (1, 2, 3, 4, -5, -6, -7, -8);
+ s2.x = _mm_set_epi16 (-9, -10, -11, -12, 13, 14, 15, 16);
+ u.x = test (s1.x, s2.x);
+
+ for (i=0; i<8; i++)
+ {
+ tmp = s1.a[i]<0 ? 0 : s1.a[i];
+ tmp = tmp>255 ? 255 : tmp;
+ e[i] = tmp;
+
+ tmp = s2.a[i]<0 ? 0 : s2.a[i];
+ tmp = tmp>255 ? 255 : tmp;
+ e[i+8] = tmp;
+ }
+
+ if (check_union128i_ub (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_packuswb_1; check_union128i_w failed\n");
+ printf ("\t ([%x,%x,%x,%x, %x,%x,%x,%x], [%x,%x,%x,%x, %x,%x,%x,%x])\n",
+ s1.a[0], s1.a[1], s1.a[2], s1.a[3], s1.a[4], s1.a[5], s1.a[6],
+ s1.a[7], s2.a[0], s2.a[1], s2.a[2], s2.a[3], s2.a[4], s2.a[5],
+ s2.a[6], s2.a[7]);
+ printf ("\t\t -> [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ u.a[0], u.a[1], u.a[2], u.a[3], u.a[4], u.a[5], u.a[6], u.a[7],
+ u.a[8], u.a[9], u.a[10], u.a[11], u.a[12], u.a[13], u.a[14],
+ u.a[15]);
+ printf (
+ "\t expect [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ e[0], e[1], e[2], e[3], e[4], e[5], e[6], e[7], e[8], e[9], e[10],
+ e[11], e[12], e[13], e[14], e[15]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-paddb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-paddb-1.c
new file mode 100644
index 00000000000..766e2ecfde7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-paddb-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_paddb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_add_epi8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s1, s2;
+ char e[16];
+ int i;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,-80,-40,-100,-15,98, 25, 98,7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, -100, -34, -78, -39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 16; i++)
+ e[i] = s1.a[i] + s2.a[i];
+
+ if (check_union128i_b (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-paddd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-paddd-1.c
new file mode 100644
index 00000000000..8c5796b282f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-paddd-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_paddd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_add_epi32 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s1, s2;
+ int e[4];
+ int i;
+
+ s1.x = _mm_set_epi32 (30,90,-80,-40);
+ s2.x = _mm_set_epi32 (76, -100, -34, -78);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 4; i++)
+ e[i] = s1.a[i] + s2.a[i];
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-paddq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-paddq-1.c
new file mode 100644
index 00000000000..67a85a089d3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-paddq-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_paddq_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_add_epi64 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_q u, s1, s2;
+ long long e[2];
+ int i;
+
+ s1.x = _mm_set_epi64x (90,-80);
+ s2.x = _mm_set_epi64x (76, -100);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 2; i++)
+ e[i] = s1.a[i] + s2.a[i];
+
+ if (check_union128i_q (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-paddsb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-paddsb-1.c
new file mode 100644
index 00000000000..cb6f37ad229
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-paddsb-1.c
@@ -0,0 +1,74 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_paddsb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_adds_epi8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s1, s2;
+ char e[16];
+ int i, tmp;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,-80,-40,-100,-15,98, 25, 98,7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, -100, -34, -78, -39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 16; i++)
+ {
+ tmp = (signed char)s1.a[i] + (signed char)s2.a[i];
+
+ if (tmp > 127)
+ tmp = 127;
+ if (tmp < -128)
+ tmp = -128;
+
+ e[i] = tmp;
+ }
+
+ if (check_union128i_b (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_paddsb_1; check_union128i_b failed\n");
+ printf (
+ "\tadds\t([%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x],\n",
+ s1.a[0], s1.a[1], s1.a[2], s1.a[3], s1.a[4], s1.a[5], s1.a[6],
+ s1.a[7], s1.a[8], s1.a[9], s1.a[10], s1.a[11], s1.a[12], s1.a[13],
+ s1.a[14], s1.a[15]);
+ printf ("\t\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x])\n",
+ s2.a[0], s2.a[1], s2.a[2], s2.a[3], s2.a[4], s2.a[5], s2.a[6],
+ s2.a[7], s2.a[8], s2.a[9], s2.a[10], s2.a[11], s2.a[12], s2.a[13],
+ s2.a[14], s2.a[15]);
+ printf ("\t ->\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ u.a[0], u.a[1], u.a[2], u.a[3], u.a[4], u.a[5], u.a[6], u.a[7],
+ u.a[8], u.a[9], u.a[10], u.a[11], u.a[12], u.a[13], u.a[14],
+ u.a[15]);
+ printf (
+ "\texpect\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ e[0], e[1], e[2], e[3], e[4], e[5], e[6], e[7], e[8], e[9], e[10],
+ e[11], e[12], e[13], e[14], e[15]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-paddsw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-paddsw-1.c
new file mode 100644
index 00000000000..82ce0a4b62a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-paddsw-1.c
@@ -0,0 +1,65 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_paddsw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_adds_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i, tmp;
+
+ s1.x = _mm_set_epi16 (10,20,30,90,-80,-40,-100,-15);
+ s2.x = _mm_set_epi16 (11, 98, 76, -100, -34, -78, -39, 14);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ {
+ tmp = s1.a[i] + s2.a[i];
+
+ if (tmp > 32767)
+ tmp = 32767;
+ if (tmp < -32768)
+ tmp = -32768;
+
+ e[i] = tmp;
+ }
+
+ if (check_union128i_w (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_paddsw_1; check_union128i_w failed\n");
+ printf ("\tadds\t([%x,%x,%x,%x, %x,%x,%x,%x],\n", s1.a[0], s1.a[1],
+ s1.a[2], s1.a[3], s1.a[4], s1.a[5], s1.a[6], s1.a[7]);
+ printf ("\t\t [%x,%x,%x,%x, %x,%x,%x,%x])\n", s2.a[0], s2.a[1], s2.a[2],
+ s2.a[3], s2.a[4], s2.a[5], s2.a[6], s2.a[7]);
+ printf ("\t ->\t [%x,%x,%x,%x, %x,%x,%x,%x]\n", u.a[0], u.a[1], u.a[2],
+ u.a[3], u.a[4], u.a[5], u.a[6], u.a[7]);
+ printf ("\texpect\t [%x,%x,%x,%x, %x,%x,%x,%x]\n", e[0], e[1], e[2], e[3],
+ e[4], e[5], e[6], e[7]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-paddusb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-paddusb-1.c
new file mode 100644
index 00000000000..df3d8b230ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-paddusb-1.c
@@ -0,0 +1,74 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_paddusb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_adds_epu8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s1, s2;
+ char e[16] = {0};
+ int i, tmp;
+
+ s1.x = _mm_set_epi8 (30, 2, 3, 4, 10, 20, 30, 90, 80, 40, 100, 15, 98, 25, 98, 7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, 100, 34, 78, 39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 16; i++)
+ {
+ tmp = (unsigned char)s1.a[i] + (unsigned char)s2.a[i];
+
+ if (tmp > 255)
+ tmp = -1;
+ if (tmp < 0)
+ tmp = 0;
+
+ e[i] = tmp;
+ }
+
+ if (check_union128i_b (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_paddusb_1; check_union128i_b failed\n");
+ printf (
+ "\tadds\t([%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x],\n",
+ s1.a[0], s1.a[1], s1.a[2], s1.a[3], s1.a[4], s1.a[5], s1.a[6],
+ s1.a[7], s1.a[8], s1.a[9], s1.a[10], s1.a[11], s1.a[12], s1.a[13],
+ s1.a[14], s1.a[15]);
+ printf ("\t\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x])\n",
+ s2.a[0], s2.a[1], s2.a[2], s2.a[3], s2.a[4], s2.a[5], s2.a[6],
+ s2.a[7], s2.a[8], s2.a[9], s2.a[10], s2.a[11], s2.a[12], s2.a[13],
+ s2.a[14], s2.a[15]);
+ printf ("\t ->\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ u.a[0], u.a[1], u.a[2], u.a[3], u.a[4], u.a[5], u.a[6], u.a[7],
+ u.a[8], u.a[9], u.a[10], u.a[11], u.a[12], u.a[13], u.a[14],
+ u.a[15]);
+ printf (
+ "\texpect\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ e[0], e[1], e[2], e[3], e[4], e[5], e[6], e[7], e[8], e[9], e[10],
+ e[11], e[12], e[13], e[14], e[15]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-paddusw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-paddusw-1.c
new file mode 100644
index 00000000000..0bc707446b5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-paddusw-1.c
@@ -0,0 +1,52 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_paddusw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_adds_epu16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i, tmp;
+
+ s1.x = _mm_set_epi16 (10,20,30,90,80,40,100,15);
+ s2.x = _mm_set_epi16 (11, 98, 76, 100, 34, 78, 39, 14);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ {
+ tmp = s1.a[i] + s2.a[i];
+
+ if (tmp > 65535)
+ tmp = -1;
+
+ if (tmp < 0)
+ tmp = 0;
+
+ e[i] = tmp;
+ }
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-paddw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-paddw-1.c
new file mode 100644
index 00000000000..d91351efa92
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-paddw-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_paddw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_add_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i;
+
+ s1.x = _mm_set_epi16 (10,20,30,90,-80,-40,-100,-15);
+ s2.x = _mm_set_epi16 (11, 98, 76, -100, -34, -78, -39, 14);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ e[i] = s1.a[i] + s2.a[i];
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pavgb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pavgb-1.c
new file mode 100644
index 00000000000..5d489a3ce08
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pavgb-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pavgb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_avg_epu8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_ub u, s1, s2;
+ unsigned char e[16];
+ int i;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,80,40,100,15,98, 25, 98,7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, 100, 34, 78, 39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 16; i++)
+ e[i] = (s1.a[i] + s2.a[i]+1)>>1;
+
+ if (check_union128i_ub (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pavgw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pavgw-1.c
new file mode 100644
index 00000000000..995cb8fa488
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pavgw-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pavgw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_avg_epu16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_uw u, s1, s2;
+ unsigned short e[8];
+ int i;
+
+ s1.x = _mm_set_epi16 (10,20,30,90,80,40,100,15);
+ s2.x = _mm_set_epi16 (11, 98, 76, 100, 34, 78, 39, 14);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ e[i] = (s1.a[i] + s2.a[i]+1)>>1;
+
+ if (check_union128i_uw (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqb-1.c
new file mode 100644
index 00000000000..3d0e4c60e33
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqb-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pcmpeqb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_cmpeq_epi8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s1, s2;
+ char e[16];
+ int i;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,80,40,100,15,98, 25, 98,7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, 100, 34, 78, 39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 16; i++)
+ e[i] = (s1.a[i] == s2.a[i]) ? -1:0;
+
+ if (check_union128i_b (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqd-1.c
new file mode 100644
index 00000000000..4af7deccae8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqd-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pcmpeqd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_cmpeq_epi32 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s1, s2;
+ int e[4];
+ int i;
+
+ s1.x = _mm_set_epi32 (98, 25, 98,7);
+ s2.x = _mm_set_epi32 (88, 44, 33, 229);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 4; i++)
+ e[i] = (s1.a[i] == s2.a[i]) ? -1:0;
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqw-1.c
new file mode 100644
index 00000000000..fca6b099e96
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpeqw-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pcmpeqw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_cmpeq_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i;
+
+ s1.x = _mm_set_epi16 (20,30,90,80,40,100,15,98);
+ s2.x = _mm_set_epi16 (34, 78, 39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ e[i] = (s1.a[i] == s2.a[i]) ? -1:0;
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtb-1.c
new file mode 100644
index 00000000000..10fbc5cd912
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtb-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pcmpgtb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_cmpgt_epi8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s1, s2;
+ char e[16];
+ int i;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,80,40,100,15,98, 25, 98,7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, 100, 34, 78, 39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 16; i++)
+ e[i] = (s1.a[i] > s2.a[i]) ? -1:0;
+
+ if (check_union128i_b (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtd-1.c
new file mode 100644
index 00000000000..bc046d6944b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtd-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pcmpgtd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_cmpgt_epi32 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s1, s2;
+ int e[4];
+ int i;
+
+ s1.x = _mm_set_epi32 (98, 25, 98,7);
+ s2.x = _mm_set_epi32 (88, 44, 33, 229);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 4; i++)
+ e[i] = (s1.a[i] > s2.a[i]) ? -1:0;
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtw-1.c
new file mode 100644
index 00000000000..19b82cbd7d4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pcmpgtw-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pcmpgtw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_cmpgt_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i;
+
+ s1.x = _mm_set_epi16 (20,30,90,80,40,100,15,98);
+ s2.x = _mm_set_epi16 (34, 78, 39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ e[i] = (s1.a[i] > s2.a[i]) ? -1:0;
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pextrw.c b/gcc/testsuite/gcc.target/powerpc/sse2-pextrw.c
new file mode 100644
index 00000000000..2bb812e411f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pextrw.c
@@ -0,0 +1,65 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pextrw_1
+#endif
+
+#include <emmintrin.h>
+
+#define msk0 0
+#define msk1 1
+#define msk2 2
+#define msk3 3
+#define msk4 4
+#define msk5 5
+#define msk6 6
+#define msk7 7
+
+static void
+TEST (void)
+{
+ union
+ {
+ __m128i x;
+ int i[4];
+ short s[8];
+ } val1;
+ int res[8], masks[8];
+ int i;
+
+ val1.i[0] = 0x04030201;
+ val1.i[1] = 0x08070605;
+ val1.i[2] = 0x0C0B0A09;
+ val1.i[3] = 0x100F0E0D;
+
+ res[0] = _mm_extract_epi16 (val1.x, msk0);
+ res[1] = _mm_extract_epi16 (val1.x, msk1);
+ res[2] = _mm_extract_epi16 (val1.x, msk2);
+ res[3] = _mm_extract_epi16 (val1.x, msk3);
+ res[4] = _mm_extract_epi16 (val1.x, msk4);
+ res[5] = _mm_extract_epi16 (val1.x, msk5);
+ res[6] = _mm_extract_epi16 (val1.x, msk6);
+ res[7] = _mm_extract_epi16 (val1.x, msk7);
+
+ masks[0] = msk0;
+ masks[1] = msk1;
+ masks[2] = msk2;
+ masks[3] = msk3;
+ masks[4] = msk4;
+ masks[5] = msk5;
+ masks[6] = msk6;
+ masks[7] = msk7;
+
+ for (i = 0; i < 8; i++)
+ if (res[i] != val1.s [masks[i]])
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pinsrw.c b/gcc/testsuite/gcc.target/powerpc/sse2-pinsrw.c
new file mode 100644
index 00000000000..2fd5b0bd625
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pinsrw.c
@@ -0,0 +1,87 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pinsrw_1
+#endif
+
+#include <emmintrin.h>
+#include <string.h>
+
+#define msk0 0x00
+#define msk1 0x01
+#define msk2 0x02
+#define msk3 0x03
+#define msk4 0x04
+#define msk5 0x05
+#define msk6 0x06
+#define msk7 0x07
+
+static void
+TEST (void)
+{
+ union
+ {
+ __m128i x;
+ unsigned int i[4];
+ unsigned short s[8];
+ } res [8], val, tmp;
+ int masks[8];
+ unsigned short ins[4] = { 3, 4, 5, 6 };
+ int i;
+
+ val.i[0] = 0x35251505;
+ val.i[1] = 0x75655545;
+ val.i[2] = 0xB5A59585;
+ val.i[3] = 0xF5E5D5C5;
+
+ /* Check pinsrw imm8, r32, xmm. */
+ res[0].x = _mm_insert_epi16 (val.x, ins[0], msk0);
+ res[1].x = _mm_insert_epi16 (val.x, ins[0], msk1);
+ res[2].x = _mm_insert_epi16 (val.x, ins[0], msk2);
+ res[3].x = _mm_insert_epi16 (val.x, ins[0], msk3);
+ res[4].x = _mm_insert_epi16 (val.x, ins[0], msk4);
+ res[5].x = _mm_insert_epi16 (val.x, ins[0], msk5);
+ res[6].x = _mm_insert_epi16 (val.x, ins[0], msk6);
+ res[7].x = _mm_insert_epi16 (val.x, ins[0], msk7);
+
+ masks[0] = msk0;
+ masks[1] = msk1;
+ masks[2] = msk2;
+ masks[3] = msk3;
+ masks[4] = msk4;
+ masks[5] = msk5;
+ masks[6] = msk6;
+ masks[7] = msk7;
+
+ for (i = 0; i < 8; i++)
+ {
+ tmp.x = val.x;
+ tmp.s[masks[i]] = ins[0];
+ if (memcmp (&tmp, &res[i], sizeof (tmp)))
+ abort ();
+ }
+
+ /* Check pinsrw imm8, m16, xmm. */
+ for (i = 0; i < 8; i++)
+ {
+ res[i].x = _mm_insert_epi16 (val.x, ins[i % 2], msk0);
+ masks[i] = msk0;
+ }
+
+ for (i = 0; i < 8; i++)
+ {
+ tmp.x = val.x;
+ tmp.s[masks[i]] = ins[i % 2];
+ if (memcmp (&tmp, &res[i], sizeof (tmp)))
+ abort ();
+ }
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pmaddwd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pmaddwd-1.c
new file mode 100644
index 00000000000..9595598ac63
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pmaddwd-1.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pmaddwd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_madd_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w s1, s2;
+ union128i_d u;
+ int e[4];
+ int i;
+
+ s1.x = _mm_set_epi16 (2134,3343,1234,6354, 1, 3, 4, 5);
+ s2.x = _mm_set_epi16 (41124,234,2344,2354,9, -1, -8, -10);
+ u.x = test (s1.x, s2.x);
+
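+  /* Expected: each 32-bit lane is the sum of two adjacent signed 16-bit products.  */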
+ for (i = 0; i < 4; i++)
+ e[i] = (s1.a[i*2] * s2.a[i*2])+(s1.a[(i*2) + 1] * s2.a[(i*2) + 1]);
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pmaxsw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pmaxsw-1.c
new file mode 100644
index 00000000000..c1765aeecc6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pmaxsw-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pmaxsw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_max_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i;
+
+ s1.x = _mm_set_epi16 (1,2,3,4,5,6,7,8);
+ s2.x = _mm_set_epi16 (8,7,6,5,4,3,2,1);
+ u.x = test (s1.x, s2.x);
+
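+  /* Expected: element-wise signed 16-bit maximum.  */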
+ for (i=0; i<8; i++)
+ e[i] = s1.a[i]>s2.a[i]?s1.a[i]:s2.a[i];
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pmaxub-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pmaxub-1.c
new file mode 100644
index 00000000000..500d02e2e33
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pmaxub-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pmaxub_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_max_epu8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_ub u, s1, s2;
+ unsigned char e[16];
+ int i;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16);
+ s2.x = _mm_set_epi8 (16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1);
+ u.x = test (s1.x, s2.x);
+
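+  /* Expected: element-wise unsigned 8-bit maximum.  */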
+ for (i=0; i<16; i++)
+ e[i] = s1.a[i]>s2.a[i]?s1.a[i]:s2.a[i];
+
+ if (check_union128i_ub (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pminsw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pminsw-1.c
new file mode 100644
index 00000000000..5af2280a722
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pminsw-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pminsw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_min_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i;
+
+ s1.x = _mm_set_epi16 (1,2,3,4,5,6,7,8);
+ s2.x = _mm_set_epi16 (8,7,6,5,4,3,2,1);
+ u.x = test (s1.x, s2.x);
+
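+  /* Expected: element-wise signed 16-bit minimum.  */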
+ for (i=0; i<8; i++)
+ e[i] = s1.a[i]<s2.a[i]?s1.a[i]:s2.a[i];
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pminub-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pminub-1.c
new file mode 100644
index 00000000000..1eeca208d94
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pminub-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pminub_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_min_epu8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_ub u, s1, s2;
+ unsigned char e[16];
+ int i;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16);
+ s2.x = _mm_set_epi8 (16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1);
+ u.x = test (s1.x, s2.x);
+
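+  /* Expected: element-wise unsigned 8-bit minimum.  */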
+ for (i=0; i<16; i++)
+ e[i] = s1.a[i]<s2.a[i]?s1.a[i]:s2.a[i];
+
+ if (check_union128i_ub (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pmovmskb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pmovmskb-1.c
new file mode 100644
index 00000000000..37d72c231ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pmovmskb-1.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pmovmskb_1
+#endif
+
+#include <emmintrin.h>
+
+#ifdef _ARCH_PWR8
+static int
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_movemask_epi8 (s1);
+}
+#endif
+
+static void
+TEST (void)
+{
+#ifdef _ARCH_PWR8
+ union128i_b s1;
+ int i, u, e=0;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,-80,-40,-100,-15,98, 25, 98,7);
+
+ __asm("" : "+v"(s1.x));
+ u = test (s1.x);
+
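+  /* Expected mask: one bit per byte, set when that byte's sign bit is set.  */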
+ for (i = 0; i < 16; i++)
+ if (s1.a[i] & (1<<7))
+ e = e | (1<<i);
+
+ if (checkVi (&u, &e, 1))
+#if DEBUG
+ {
+ printf ("sse2_test_pmovmskb_1; checkVi failed\n");
+ printf ("\t ([%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x], -> %x)\n",
+ s1.a[0], s1.a[1], s1.a[2], s1.a[3], s1.a[4], s1.a[5], s1.a[6],
+ s1.a[7], s1.a[8], s1.a[9], s1.a[10], s1.a[11], s1.a[12], s1.a[13],
+ s1.a[14], s1.a[15], u);
+ printf ("\t expect %x\n", e);
+ }
+#else
+ abort ();
+#endif
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pmulhuw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pmulhuw-1.c
new file mode 100644
index 00000000000..3635a867dbc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pmulhuw-1.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pmulhuw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_mulhi_epu16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_uw u, s1, s2;
+ unsigned short e[8];
+ int i, tmp;
+
+ s1.x = _mm_set_epi16 (10,2067,3033,90,80,40,1000,15);
+ s2.x = _mm_set_epi16 (11, 9834, 7444, 10222, 34, 7833, 39, 14);
+ u.x = test (s1.x, s2.x);
+
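+  /* Expected: the high 16 bits of each unsigned 16-bit product.  */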
+ for (i = 0; i < 8; i++)
+ {
+ tmp = s1.a[i] * s2.a[i];
+
+ e[i] = (tmp & 0xffff0000)>>16;
+ }
+
+ if (check_union128i_uw (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pmulhw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pmulhw-1.c
new file mode 100644
index 00000000000..1255c03b051
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pmulhw-1.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pmulhw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_mulhi_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i, tmp;
+
+ s1.x = _mm_set_epi16 (10,2067,-3033,90,80,40,-1000,15);
+ s2.x = _mm_set_epi16 (11, 9834, 7444, -10222, 34, -7833, 39, 14);
+ u.x = test (s1.x, s2.x);
+
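+  /* Expected: the high 16 bits of each signed 16-bit product.  */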
+ for (i = 0; i < 8; i++)
+ {
+ tmp = s1.a[i] * s2.a[i];
+
+ e[i] = (tmp & 0xffff0000)>>16;
+ }
+
+ if (check_union128i_w (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_pmulhw_1; check_union128i_w failed\n");
+ printf ("\tmulhi\t([%x,%x,%x,%x, %x,%x,%x,%x],\n", s1.a[0], s1.a[1],
+ s1.a[2], s1.a[3], s1.a[4], s1.a[5], s1.a[6], s1.a[7]);
+ printf ("\t\t [%x,%x,%x,%x, %x,%x,%x,%x])\n", s2.a[0], s2.a[1], s2.a[2],
+ s2.a[3], s2.a[4], s2.a[5], s2.a[6], s2.a[7]);
+ printf ("\t ->\t [%x,%x,%x,%x, %x,%x,%x,%x]\n", u.a[0], u.a[1], u.a[2],
+ u.a[3], u.a[4], u.a[5], u.a[6], u.a[7]);
+ printf ("\texpect\t [%x,%x,%x,%x, %x,%x,%x,%x]\n", e[0], e[1], e[2], e[3],
+ e[4], e[5], e[6], e[7]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pmullw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pmullw-1.c
new file mode 100644
index 00000000000..3dca01ab3a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pmullw-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#define NO_WARN_X86_INTRINSICS 1
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pmullw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_mullo_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i, tmp;
+
+ s1.x = _mm_set_epi16 (10,2067,-3033,90,80,40,-1000,15);
+ s2.x = _mm_set_epi16 (11, 9834, 7444, -10222, 34, -7833, 39, 14);
+ u.x = test (s1.x, s2.x);
+
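+  /* Expected: the low 16 bits of each 16-bit product.  */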
+ for (i = 0; i < 8; i++)
+ {
+ tmp = s1.a[i] * s2.a[i];
+
+ e[i] = tmp;
+ }
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pmuludq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pmuludq-1.c
new file mode 100644
index 00000000000..54fb06fc9fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pmuludq-1.c
@@ -0,0 +1,53 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pmuludq_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_mul_epu32 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_d s1, s2;
+ union128i_q u;
+ long long e[2];
+
+ s1.x = _mm_set_epi32 (10,2067,3033,905);
+ s2.x = _mm_set_epi32 (11, 9834, 7444, 10222);
+ __asm("" : "+v"(s1.x), "+v"(s2.x));
+ u.x = test (s1.x, s2.x);
+
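+  /* Expected: 64-bit products of the even-numbered 32-bit elements.  */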
+ e[0] = s1.a[0] * s2.a[0];
+ e[1] = s1.a[2] * s2.a[2];
+
+ if (check_union128i_q (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_pmuludq_1; check_union128i_q failed\n");
+ printf ("\t ([%x,%x,%x,%x], [%x,%x,%x,%x], -> [%llx, %llx])\n", s1.a[0],
+ s1.a[1], s1.a[2], s1.a[3], s2.a[0], s2.a[1], s2.a[2], s2.a[3],
+ u.a[0], u.a[1]);
+ printf ("\t expect [%llx, %llx]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psadbw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psadbw-1.c
new file mode 100644
index 00000000000..0d6d6a340b8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psadbw-1.c
@@ -0,0 +1,69 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psadbw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ return _mm_sad_epu8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_ub s1, s2;
+ union128i_w u;
+ short e[8] = { 0 };
+ unsigned char tmp[16];
+ int i;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16);
+ s2.x = _mm_set_epi8 (16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1);
+ u.x = test (s1.x, s2.x);
+
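+  /* Expected: sums of absolute byte differences, accumulated per 64-bit half.  */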
+ for (i = 0; i < 16; i++)
+ tmp [i] = __builtin_abs (s1.a[i] - s2.a[i]);
+
+ for (i = 0; i < 8; i++)
+ e[0] += tmp[i];
+
+ for (i = 8; i < 16; i++)
+ e[4] += tmp[i];
+
+ if (check_union128i_w (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_psadbw_1; check_union128i_w failed\n");
+ printf (
+ "\tadds\t([%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x],\n",
+ s1.a[0], s1.a[1], s1.a[2], s1.a[3], s1.a[4], s1.a[5], s1.a[6],
+ s1.a[7], s1.a[8], s1.a[9], s1.a[10], s1.a[11], s1.a[12], s1.a[13],
+ s1.a[14], s1.a[15]);
+ printf ("\t\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x])\n",
+ s2.a[0], s2.a[1], s2.a[2], s2.a[3], s2.a[4], s2.a[5], s2.a[6],
+ s2.a[7], s2.a[8], s2.a[9], s2.a[10], s2.a[11], s2.a[12], s2.a[13],
+ s2.a[14], s2.a[15]);
+ printf ("\t ->\t [%x,%x,%x,%x, %x,%x,%x,%x]\n", u.a[0], u.a[1], u.a[2],
+ u.a[3], u.a[4], u.a[5], u.a[6], u.a[7]);
+ printf ("\texpect\t [%x,%x,%x,%x, %x,%x,%x,%x]\n", e[0], e[1], e[2], e[3],
+ e[4], e[5], e[6], e[7]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pshufd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pshufd-1.c
new file mode 100644
index 00000000000..6c195fb4557
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pshufd-1.c
@@ -0,0 +1,51 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pshufd_1
+#endif
+
+#define N 0xec
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_shuffle_epi32 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s1;
+ int e[4] = { 0 };
+ int i;
+
+ s1.x = _mm_set_epi32 (16,15,14,13);
+ u.x = test (s1.x);
+
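+  /* Expected: dwords selected by successive 2-bit fields of N.  */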
+ for (i = 0; i < 4; i++)
+ e[i] = s1.a[((N & (0x3<<(2*i)))>>(2*i))];
+
+ if (check_union128i_d(u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_pshufd_1; check_union128i_d failed\n");
+ printf ("\t ([%x,%x,%x,%x]) -> [%x,%x,%x,%x]\n", s1.a[0], s1.a[1],
+ s1.a[2], s1.a[3], u.a[0], u.a[1], u.a[2], u.a[3]);
+ printf ("\t expect [%x,%x,%x,%x]\n", e[0], e[1], e[2], e[3]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pshufhw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pshufhw-1.c
new file mode 100644
index 00000000000..a9230217830
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pshufhw-1.c
@@ -0,0 +1,65 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pshufhw_1
+#endif
+
+#define N 0xec
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_shufflehi_epi16 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_q s1;
+ union128i_w u;
+ short e[8] = { 0 };
+ int i;
+ int m1[4] = { 0x3, 0x3<<2, 0x3<<4, 0x3<<6 };
+ int m2[4];
+
+ s1.x = _mm_set_epi64x (0xabcde,0xef58a234);
+ u.x = test (s1.x);
+
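+  /* Expected: low quadword unchanged, high quadword words selected by 2-bit fields of N.  */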
+ for (i = 0; i < 4; i++)
+ e[i] = (s1.a[0]>>(16 * i)) & 0xffff;
+
+ for (i = 0; i < 4; i++)
+ m2[i] = (N & m1[i])>>(2*i);
+
+ for (i = 0; i < 4; i++)
+ e[i+4] = (s1.a[1] >> (16 * m2[i])) & 0xffff;
+
+ if (check_union128i_w(u, e))
+#if DEBUG
+ {
+ union128i_w s;
+ s.x = s1.x;
+ printf ("sse2_test_pshufhw_1; check_union128i_w failed\n");
+ printf ("\t ([%hx,%hx,%hx,%hx, %hx,%hx,%hx,%hx])\n", s.a[0], s.a[1],
+ s.a[2], s.a[3], s.a[4], s.a[5], s.a[6], s.a[7]);
+ printf ("\t\t -> [%hx,%hx,%hx,%hx, %hx,%hx,%hx,%hx]\n", u.a[0], u.a[1],
+ u.a[2], u.a[3], u.a[4], u.a[5], u.a[6], u.a[7]);
+ printf ("\t expect [%hx,%hx,%hx,%hx, %hx,%hx,%hx,%hx]\n", e[0], e[1],
+ e[2], e[3], e[4], e[5], e[6], e[7]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pshuflw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pshuflw-1.c
new file mode 100644
index 00000000000..e662cec6e19
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pshuflw-1.c
@@ -0,0 +1,65 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pshuflw_1
+#endif
+
+#define N 0xec
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_shufflelo_epi16 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_q s1;
+ union128i_w u;
+ short e[8] = { 0 };
+ int i;
+ int m1[4] = { 0x3, 0x3<<2, 0x3<<4, 0x3<<6 };
+ int m2[4];
+
+ s1.x = _mm_set_epi64x (0xabcde,0xef58a234);
+ u.x = test (s1.x);
+
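+  /* Expected: high quadword unchanged, low quadword words selected by 2-bit fields of N.  */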
+ for (i = 0; i < 4; i++)
+ e[i+4] = (s1.a[1]>>(16 * i)) & 0xffff;
+
+ for (i = 0; i < 4; i++)
+ m2[i] = (N & m1[i])>>(2*i);
+
+ for (i = 0; i < 4; i++)
+ e[i] = (s1.a[0] >> (16 * m2[i])) & 0xffff;
+
+ if (check_union128i_w(u, e))
+#if DEBUG
+ {
+ union128i_w s;
+ s.x = s1.x;
+ printf ("sse2_test_pshuflw_1; check_union128i_w failed\n");
+ printf ("\t ([%hx,%hx,%hx,%hx, %hx,%hx,%hx,%hx])\n", s.a[0], s.a[1],
+ s.a[2], s.a[3], s.a[4], s.a[5], s.a[6], s.a[7]);
+ printf ("\t\t -> [%hx,%hx,%hx,%hx, %hx,%hx,%hx,%hx]\n", u.a[0], u.a[1],
+ u.a[2], u.a[3], u.a[4], u.a[5], u.a[6], u.a[7]);
+ printf ("\t expect [%hx,%hx,%hx,%hx, %hx,%hx,%hx,%hx]\n", e[0], e[1],
+ e[2], e[3], e[4], e[5], e[6], e[7]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pslld-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pslld-1.c
new file mode 100644
index 00000000000..90aac103fc2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pslld-1.c
@@ -0,0 +1,44 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pslld_1
+#endif
+
+#define N 0xf
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_slli_epi32 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s;
+ int e[4] = {0};
+ int i;
+
+ s.x = _mm_set_epi32 (1, -2, 3, 4);
+
+ u.x = test (s.x);
+
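+  /* Expected: each dword shifted left by N; shift counts of 32 or more give zero.  */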
+ if (N < 32)
+ for (i = 0; i < 4; i++)
+ e[i] = s.a[i] << N;
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pslld-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-pslld-2.c
new file mode 100644
index 00000000000..2b4c5e797ff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pslld-2.c
@@ -0,0 +1,55 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pslld_2
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i c)
+{
+ return _mm_sll_epi32 (s1, c);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s;
+ union128i_q c;
+ int e[4] = { 0 };
+ int i;
+
+ s.x = _mm_set_epi32 (2, -3, 0x7000, 0x9000);
+ c.x = _mm_set_epi64x (12, 23);
+
+ __asm("" : "+v"(s.x), "+v"(c.x));
+ u.x = test (s.x, c.x);
+
+ if (c.a[0] < 32)
+ for (i = 0; i < 4; i++)
+ e[i] = s.a[i] << c.a[0];
+
+ if (check_union128i_d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_pslld_2; check_union128i_d failed\n");
+ printf ("\tsll\t([%x,%x,%x,%x], [%llx,%llx]\n", s.a[0], s.a[1], s.a[2],
+ s.a[3], c.a[0], c.a[1]);
+ printf ("\t ->\t [%x,%x,%x,%x]\n", u.a[0], u.a[1], u.a[2], u.a[3]);
+ printf ("\texpect\t [%x,%x,%x,%x]\n", e[0], e[1], e[2], e[3]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-pslldq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-pslldq-1.c
new file mode 100644
index 00000000000..f4bd6002fe5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-pslldq-1.c
@@ -0,0 +1,65 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pslldq_1
+#endif
+
+#define N 0x5
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_slli_si128 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s;
+ char src[16] =
+ { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
+ char e[16] =
+ { 0 };
+ int i;
+
+ s.x = _mm_loadu_si128 ((__m128i *) src);
+
+ u.x = test (s.x);
+
+ for (i = 0; i < 16 - N; i++)
+ e[i + N] = src[i];
+
+ if (check_union128i_b (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_pslldq_1; check_union128i_b failed\n");
+
+ printf ("\t s ([%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x])\n",
+ s.a[0], s.a[1], s.a[2], s.a[3], s.a[4], s.a[5], s.a[6], s.a[7],
+ s.a[8], s.a[9], s.a[10], s.a[11], s.a[12], s.a[13], s.a[14],
+ s.a[15]);
+ printf (
+ "\t u ->\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ u.a[0], u.a[1], u.a[2], u.a[3], u.a[4], u.a[5], u.a[6], u.a[7],
+ u.a[8], u.a[9], u.a[10], u.a[11], u.a[12], u.a[13], u.a[14], u.a[15]);
+ printf (
+ "\t expect\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ e[0], e[1], e[2], e[3], e[4], e[5], e[6], e[7], e[8], e[9], e[10],
+ e[11], e[12], e[13], e[14], e[15]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psllq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psllq-1.c
new file mode 100644
index 00000000000..06904c50217
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psllq-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psllq_1
+#endif
+
+#define N 60
+
+#include <emmintrin.h>
+
+#ifdef _ARCH_PWR8
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_slli_epi64 (s1, N);
+}
+#endif
+
+static void
+TEST (void)
+{
+#ifdef _ARCH_PWR8
+ union128i_q u, s;
+ long long e[2] = {0};
+ int i;
+
+ s.x = _mm_set_epi64x (-1, 0xf);
+
+ u.x = test (s.x);
+
+ if (N < 64)
+ for (i = 0; i < 2; i++)
+ e[i] = s.a[i] << N;
+
+ if (check_union128i_q (u, e))
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psllq-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-psllq-2.c
new file mode 100644
index 00000000000..5eb7bc39a60
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psllq-2.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psllq_2
+#endif
+
+#include <emmintrin.h>
+
+#ifdef _ARCH_PWR8
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i c)
+{
+ return _mm_sll_epi64 (s1, c);
+}
+#endif
+
+static void
+TEST (void)
+{
+#ifdef _ARCH_PWR8
+ union128i_q u, s, c;
+ long long e[2] = {0};
+ int i;
+
+ s.x = _mm_set_epi64x (-1, 0xf);
+ c.x = _mm_set_epi64x (60,50);
+
+ __asm("" : "+v"(s.x), "+v"(c.x));
+ u.x = test (s.x, c.x);
+
+ if (c.a[0] < 64)
+ for (i = 0; i < 2; i++)
+ e[i] = s.a[i] << c.a[0];
+
+ if (check_union128i_q (u, e))
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psllw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psllw-1.c
new file mode 100644
index 00000000000..f744bb244cb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psllw-1.c
@@ -0,0 +1,44 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psllw_1
+#endif
+
+#define N 0xb
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_slli_epi16 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s;
+ short e[8] = {0};
+ int i;
+
+ s.x = _mm_set_epi16 (1, 2, 3, 4, 5, 6, 0x7000, 0x9000);
+
+ u.x = test (s.x);
+
+ if (N < 16)
+ for (i = 0; i < 8; i++)
+ e[i] = s.a[i] << N;
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psllw-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-psllw-2.c
new file mode 100644
index 00000000000..1335e2bb249
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psllw-2.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psllw_2
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i c)
+{
+ return _mm_sll_epi16 (s1, c);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s;
+ union128i_q c;
+ short e[8] = {0};
+ int i;
+
+ s.x = _mm_set_epi16 (1, 2, 3, 4, 5, 6, 0x7000, 0x9000);
+ c.x = _mm_set_epi64x (12, 13);
+
+ __asm("" : "+v"(s.x), "+v"(c.x));
+ u.x = test (s.x, c.x);
+
+ if (c.a[0] < 16)
+ for (i = 0; i < 8; i++)
+ e[i] = s.a[i] << c.a[0];
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psrad-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psrad-1.c
new file mode 100644
index 00000000000..03c40a11d07
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psrad-1.c
@@ -0,0 +1,44 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psrad_1
+#endif
+
+#define N 0xf
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_srai_epi32 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s;
+ int e[4] = {0};
+ int i;
+
+ s.x = _mm_set_epi32 (1, -2, 3, 4);
+
+ u.x = test (s.x);
+
+ if (N < 32)
+ for (i = 0; i < 4; i++)
+ e[i] = s.a[i] >> N;
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psrad-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-psrad-2.c
new file mode 100644
index 00000000000..387f383f3db
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psrad-2.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psrad_2
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i count)
+{
+ return _mm_sra_epi32 (s1, count);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s;
+ union128i_q c;
+ int e[4] = {0};
+ int i;
+
+ s.x = _mm_set_epi32 (1, -2, 3, 4);
+ c.x = _mm_set_epi64x (16, 29);
+
+ __asm("" : "+v"(s.x), "+v"(c.x));
+ u.x = test (s.x, c.x);
+
+ if (c.a[0] < 32)
+ for (i = 0; i < 4; i++)
+ e[i] = s.a[i] >> c.a[0];
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psraw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psraw-1.c
new file mode 100644
index 00000000000..23a423a4762
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psraw-1.c
@@ -0,0 +1,44 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psraw_1
+#endif
+
+#define N 0xb
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_srai_epi16 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s;
+ short e[8] = {0};
+ int i;
+
+ s.x = _mm_set_epi16 (1, -2, 3, 4, -5, 6, 0x7000, 0x9000);
+
+ u.x = test (s.x);
+
+ if (N < 16)
+ for (i = 0; i < 8; i++)
+ e[i] = s.a[i] >> N;
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psraw-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-psraw-2.c
new file mode 100644
index 00000000000..b41a6d3157a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psraw-2.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psraw_2
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i c)
+{
+ return _mm_sra_epi16 (s1, c);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s;
+ union128i_q c;
+ short e[8] = {0};
+ int i;
+
+ s.x = _mm_set_epi16 (1, -2, 3, 4, 5, 6, -0x7000, 0x9000);
+ c.x = _mm_set_epi64x (12, 13);
+
+ __asm("" : "+v"(s.x), "+v"(c.x));
+ u.x = test (s.x, c.x);
+
+ if (c.a[0] < 16)
+ for (i = 0; i < 8; i++)
+ e[i] = s.a[i] >> c.a[0];
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psrld-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psrld-1.c
new file mode 100644
index 00000000000..e96cf0a3ca8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psrld-1.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psrld_1
+#endif
+
+#define N 0xf
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_srli_epi32 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s;
+ int e[4] = { 0 };
+ unsigned int tmp;
+ int i;
+
+ s.x = _mm_set_epi32 (1, -2, 3, 4);
+
+ u.x = test (s.x);
+
+ if (N < 32)
+ for (i = 0; i < 4; i++)
+ {
+ tmp = s.a[i];
+ e[i] = tmp >> N;
+ }
+
+ if (check_union128i_d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_psrld_1; check_union128i_d failed\n");
+ printf ("\tsrl\t([%x,%x,%x,%x],%d\n", s.a[0], s.a[1], s.a[2], s.a[3], N);
+ printf ("\t ->\t [%x,%x,%x,%x]\n", u.a[0], u.a[1], u.a[2], u.a[3]);
+ printf ("\texpect\t [%x,%x,%x,%x]\n", e[0], e[1], e[2], e[3]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psrld-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-psrld-2.c
new file mode 100644
index 00000000000..6192e2a2d59
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psrld-2.c
@@ -0,0 +1,59 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psrld_2
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i c)
+{
+ return _mm_srl_epi32 (s1, c);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s;
+ union128i_q c;
+ int e[4] = { 0 };
+ unsigned int tmp;
+ int i;
+
+ s.x = _mm_set_epi32 (2, -3, 0x7000, 0x9000);
+ c.x = _mm_set_epi64x (12, 23);
+
+ __asm("" : "+v"(s.x), "+v"(c.x));
+ u.x = test (s.x, c.x);
+
+ if (c.a[0] < 32)
+ for (i = 0; i < 4; i++)
+ {
+ tmp = s.a[i];
+ e[i] = tmp >> c.a[0];
+ }
+
+ if (check_union128i_d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_psrld_2; check_union128i_d failed\n");
+ printf ("\tsrld\t([%x,%x,%x,%x], [%llx,%llx]\n", s.a[0], s.a[1], s.a[2],
+ s.a[3], c.a[0], c.a[1]);
+ printf ("\t ->\t [%x,%x,%x,%x]\n", u.a[0], u.a[1], u.a[2], u.a[3]);
+ printf ("\texpect\t [%x,%x,%x,%x]\n", e[0], e[1], e[2], e[3]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psrldq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psrldq-1.c
new file mode 100644
index 00000000000..5b74cae7f90
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psrldq-1.c
@@ -0,0 +1,62 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psrldq_1
+#endif
+
+#define N 0x5
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_srli_si128 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s;
+ char src[16] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16 };
+ char e[16] = { 0 };
+ int i;
+
+ s.x = _mm_loadu_si128 ((__m128i *)src);
+
+ u.x = test (s.x);
+
+ for (i = 0; i < 16-N; i++)
+ e[i] = src[i+N];
+
+ if (check_union128i_b (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_psrldq_1; check_union128i_b failed\n");
+ printf ("\tsrl\t([%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x],\n",
+ s.a[0], s.a[1], s.a[2], s.a[3], s.a[4], s.a[5], s.a[6], s.a[7],
+ s.a[8], s.a[9], s.a[10], s.a[11], s.a[12], s.a[13], s.a[14],
+ s.a[15]);
+ printf ("\t ->\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ u.a[0], u.a[1], u.a[2], u.a[3], u.a[4], u.a[5], u.a[6], u.a[7],
+ u.a[8], u.a[9], u.a[10], u.a[11], u.a[12], u.a[13], u.a[14],
+ u.a[15]);
+ printf (
+ "\texpect\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ e[0], e[1], e[2], e[3], e[4], e[5], e[6], e[7], e[8], e[9], e[10],
+ e[11], e[12], e[13], e[14], e[15]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psrlq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psrlq-1.c
new file mode 100644
index 00000000000..9b13f0be7fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psrlq-1.c
@@ -0,0 +1,51 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psrlq_1
+#endif
+
+#define N 60
+
+#include <emmintrin.h>
+
+#ifdef _ARCH_PWR8
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_srli_epi64 (s1, N);
+}
+#endif
+
+static void
+TEST (void)
+{
+#ifdef _ARCH_PWR8
+ union128i_q u, s;
+ long long e[2] = {0};
+ unsigned long long tmp;
+ int i;
+
+ s.x = _mm_set_epi64x (-1, 0xf);
+
+ u.x = test (s.x);
+
+ if (N < 64)
+    for (i = 0; i < 2; i++)
+      {
+	tmp = s.a[i];
+	e[i] = tmp >> N;
+      }
+
+ if (check_union128i_q (u, e))
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psrlq-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-psrlq-2.c
new file mode 100644
index 00000000000..168c77f99ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psrlq-2.c
@@ -0,0 +1,51 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psrlq_2
+#endif
+
+#include <emmintrin.h>
+
+#ifdef _ARCH_PWR8
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i c)
+{
+ return _mm_srl_epi64 (s1, c);
+}
+#endif
+
+static void
+TEST (void)
+{
+#ifdef _ARCH_PWR8
+ union128i_q u, s, c;
+ long long e[2] = {0};
+ unsigned long long tmp;
+ int i;
+
+ s.x = _mm_set_epi64x (-1, 0xf);
+ c.x = _mm_set_epi64x (60,50);
+
+ __asm("" : "+v"(s.x), "+v"(c.x));
+ u.x = test (s.x, c.x);
+
+ if (c.a[0] < 64)
+    for (i = 0; i < 2; i++)
+      {
+	tmp = s.a[i];
+	e[i] = tmp >> c.a[0];
+      }
+
+ if (check_union128i_q (u, e))
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psrlw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psrlw-1.c
new file mode 100644
index 00000000000..6f0a856fc7a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psrlw-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psrlw_1
+#endif
+
+#define N 0xb
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1)
+{
+ return _mm_srli_epi16 (s1, N);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s;
+ short e[8] = {0};
+ unsigned short tmp;
+ int i;
+
+ s.x = _mm_set_epi16 (1, -2, 3, -4, 5, 6, 0x7000, 0x9000);
+
+ u.x = test (s.x);
+
+ if (N < 16)
+ for (i = 0; i < 8; i++)
+ {
+ tmp = s.a[i];
+ e[i] = tmp >> N;
+ }
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psrlw-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-psrlw-2.c
new file mode 100644
index 00000000000..9457b49a8cb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psrlw-2.c
@@ -0,0 +1,49 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psrlw_2
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i c)
+{
+ return _mm_srl_epi16 (s1, c);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s;
+ union128i_q c;
+ short e[8] = {0};
+ unsigned short tmp;
+ int i;
+
+ s.x = _mm_set_epi16 (1, -2, 3, 4, 5, 6, -0x7000, 0x9000);
+ c.x = _mm_set_epi64x (12, 13);
+
+ __asm("" : "+v"(s.x), "+v"(c.x));
+ u.x = test (s.x, c.x);
+
+ if (c.a[0] < 16)
+ for (i = 0; i < 8; i++)
+ {
+ tmp = s.a[i];
+ e[i] = tmp >> c.a[0];
+ }
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psubb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psubb-1.c
new file mode 100644
index 00000000000..a0d99ddada1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psubb-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psubb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_sub_epi8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s1, s2;
+ char e[16];
+ int i;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,-80,-40,-100,-15,98, 25, 98,7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, -100, -34, -78, -39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
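+  /* Expected: element-wise 8-bit subtraction.  */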
+ for (i = 0; i < 16; i++)
+ e[i] = s1.a[i] - s2.a[i];
+
+ if (check_union128i_b (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psubd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psubd-1.c
new file mode 100644
index 00000000000..624ae2de5be
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psubd-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psubd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_sub_epi32 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s1, s2;
+ int e[4];
+ int i;
+
+ s1.x = _mm_set_epi32 (30,90,-80,-40);
+ s2.x = _mm_set_epi32 (76, -100, -34, -78);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 4; i++)
+ e[i] = s1.a[i] - s2.a[i];
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psubq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psubq-1.c
new file mode 100644
index 00000000000..426ccb7dcd3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psubq-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psubq_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_sub_epi64 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_q u, s1, s2;
+ long long e[2];
+ int i;
+
+ s1.x = _mm_set_epi64x (90,-80);
+ s2.x = _mm_set_epi64x (76, -100);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 2; i++)
+ e[i] = s1.a[i] - s2.a[i];
+
+ if (check_union128i_q (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psubsb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psubsb-1.c
new file mode 100644
index 00000000000..be02da5a34b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psubsb-1.c
@@ -0,0 +1,51 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psubsb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_subs_epi8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s1, s2;
+ char e[16];
+ int i, tmp;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,-80,-40,-100,-15,98, 25, 98,7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, -100, -34, -78, -39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
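+  /* Expected: signed byte subtraction saturated to [-128, 127].  */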
+ for (i = 0; i < 16; i++)
+ {
+ tmp = (signed char)s1.a[i] - (signed char)s2.a[i];
+
+ if (tmp > 127)
+ tmp = 127;
+ if (tmp < -128)
+ tmp = -128;
+
+ e[i] = tmp;
+ }
+
+ if (check_union128i_b (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psubsw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psubsw-1.c
new file mode 100644
index 00000000000..afed3c86914
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psubsw-1.c
@@ -0,0 +1,51 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psubsw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_subs_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i, tmp;
+
+ s1.x = _mm_set_epi16 (10,20,30,90,-80,-40,-100,-15);
+ s2.x = _mm_set_epi16 (11, 98, 76, -100, -34, -78, -39, 14);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ {
+ tmp = s1.a[i] - s2.a[i];
+
+ if (tmp > 32767)
+ tmp = 32767;
+ if (tmp < -32768)
+ tmp = -32768;
+
+ e[i] = tmp;
+ }
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psubusb-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psubusb-1.c
new file mode 100644
index 00000000000..e5f128b6c38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psubusb-1.c
@@ -0,0 +1,74 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psubusb_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_subs_epu8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s1, s2;
+ char e[16] = { 0 };
+ int i, tmp;
+
+ s1.x = _mm_set_epi8 (30, 2, 3, 4, 10, 20, 30, 90, 80, 40, 100, 15, 98, 25, 98, 7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, 100, 34, 78, 39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
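+  /* Expected: unsigned byte subtraction saturated at zero.  */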
+ for (i = 0; i < 16; i++)
+ {
+ tmp = (unsigned char)s1.a[i] - (unsigned char)s2.a[i];
+
+ if (tmp > 255)
+ tmp = -1;
+ if (tmp < 0)
+ tmp = 0;
+
+ e[i] = tmp;
+ }
+
+ if (check_union128i_b (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_psubusb_1; check_union128i_b failed\n");
+ printf (
+ "\tadds\t([%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x],\n",
+ s1.a[0], s1.a[1], s1.a[2], s1.a[3], s1.a[4], s1.a[5], s1.a[6],
+ s1.a[7], s1.a[8], s1.a[9], s1.a[10], s1.a[11], s1.a[12], s1.a[13],
+ s1.a[14], s1.a[15]);
+ printf ("\t\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x])\n",
+ s2.a[0], s2.a[1], s2.a[2], s2.a[3], s2.a[4], s2.a[5], s2.a[6],
+ s2.a[7], s2.a[8], s2.a[9], s2.a[10], s2.a[11], s2.a[12], s2.a[13],
+ s2.a[14], s2.a[15]);
+ printf ("\t ->\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ u.a[0], u.a[1], u.a[2], u.a[3], u.a[4], u.a[5], u.a[6], u.a[7],
+ u.a[8], u.a[9], u.a[10], u.a[11], u.a[12], u.a[13], u.a[14],
+ u.a[15]);
+ printf (
+ "\texpect\t [%x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x, %x,%x,%x,%x]\n",
+ e[0], e[1], e[2], e[3], e[4], e[5], e[6], e[7], e[8], e[9], e[10],
+ e[11], e[12], e[13], e[14], e[15]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psubusw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psubusw-1.c
new file mode 100644
index 00000000000..11ddca627e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psubusw-1.c
@@ -0,0 +1,52 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psubusw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_subs_epu16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i, tmp;
+
+ s1.x = _mm_set_epi16 (10,20,30,90,80,40,100,15);
+ s2.x = _mm_set_epi16 (11, 98, 76, 100, 34, 78, 39, 14);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ {
+ tmp = (unsigned short)s1.a[i] - (unsigned short)s2.a[i];
+
+ if (tmp > 65535)
+ tmp = -1;
+
+ if (tmp < 0)
+ tmp = 0;
+
+ e[i] = tmp;
+ }
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-psubw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-psubw-1.c
new file mode 100644
index 00000000000..04570a2bc2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-psubw-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_psubw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_sub_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i;
+
+ s1.x = _mm_set_epi16 (10,20,30,90,-80,-40,-100,-15);
+ s2.x = _mm_set_epi16 (11, 98, 76, -100, -34, -78, -39, 14);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ e[i] = s1.a[i] - s2.a[i];
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-punpckhbw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-punpckhbw-1.c
new file mode 100644
index 00000000000..b666c58e276
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-punpckhbw-1.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_punpckhbw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_unpackhi_epi8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s1, s2;
+ char e[16];
+ int i;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,-80,-40,-100,-15,98, 25, 98,7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, -100, -34, -78, -39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
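+  /* Expected: interleave the high eight bytes of s1 and s2.  */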
+ for (i = 0; i < 8; i++)
+ {
+ e[2*i] = s1.a[8+i];
+ e[2*i + 1] = s2.a[8+i];
+ }
+
+ if (check_union128i_b (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-punpckhdq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-punpckhdq-1.c
new file mode 100644
index 00000000000..c4866198110
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-punpckhdq-1.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_punpckhdq_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_unpackhi_epi32 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s1, s2;
+ int e[4];
+ int i;
+
+ s1.x = _mm_set_epi32 (10,20,-80,-40);
+ s2.x = _mm_set_epi32 (11, -34, -78, -39);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 2; i++)
+ {
+ e[2*i] = s1.a[2+i];
+ e[2*i+1] = s2.a[2+i];
+ }
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-punpckhqdq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-punpckhqdq-1.c
new file mode 100644
index 00000000000..849f23f5433
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-punpckhqdq-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_punpckhqdq_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_unpackhi_epi64 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_q u, s1, s2;
+ long long e[2];
+
+ s1.x = _mm_set_epi64x (10,-40);
+ s2.x = _mm_set_epi64x (1134, -7839);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[1];
+ e[1] = s2.a[1];
+
+ if (check_union128i_q (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-punpckhwd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-punpckhwd-1.c
new file mode 100644
index 00000000000..7077ecc5a2f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-punpckhwd-1.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_punpckhwd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_unpackhi_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i;
+
+ s1.x = _mm_set_epi16 (10,20,30,90,-80,-40,-100,-15);
+ s2.x = _mm_set_epi16 (11, 98, 76, -100, -34, -78, -39, 14);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 4; i++)
+ {
+ e[2*i] = s1.a[4+i];
+ e[2*i+1] = s2.a[4+i];
+ }
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-punpcklbw-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-punpcklbw-1.c
new file mode 100644
index 00000000000..e1ee1aee35d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-punpcklbw-1.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_punpcklbw_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_unpacklo_epi8 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_b u, s1, s2;
+ char e[16];
+ int i;
+
+ s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,-80,-40,-100,-15,98, 25, 98,7);
+ s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, -100, -34, -78, -39, 6, 3, 4, 5, 119);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 8; i++)
+ {
+ e[2*i] = s1.a[i];
+ e[2*i + 1] = s2.a[i];
+ }
+
+ if (check_union128i_b (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-punpckldq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-punpckldq-1.c
new file mode 100644
index 00000000000..a47f72dda80
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-punpckldq-1.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_punpckldq_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_unpacklo_epi32 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_d u, s1, s2;
+ int e[4];
+ int i;
+
+ s1.x = _mm_set_epi32 (10,20,-80,-40);
+ s2.x = _mm_set_epi32 (11, -34, -78, -39);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 2; i++)
+ {
+ e[2*i] = s1.a[i];
+ e[2*i+1] = s2.a[i];
+ }
+
+ if (check_union128i_d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-punpcklqdq-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-punpcklqdq-1.c
new file mode 100644
index 00000000000..a45f636bdf5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-punpcklqdq-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_punpcklqdq_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_unpacklo_epi64 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_q u, s1, s2;
+ long long e[2];
+
+ s1.x = _mm_set_epi64x (10,-40);
+ s2.x = _mm_set_epi64x (1134, -7839);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0];
+ e[1] = s2.a[0];
+
+ if (check_union128i_q (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-punpcklwd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-punpcklwd-1.c
new file mode 100644
index 00000000000..5afd9799df7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-punpcklwd-1.c
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_punpcklwd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_unpacklo_epi16 (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128i_w u, s1, s2;
+ short e[8];
+ int i;
+
+ s1.x = _mm_set_epi16 (10,20,30,90,-80,-40,-100,-15);
+ s2.x = _mm_set_epi16 (11, 98, 76, -100, -34, -78, -39, 14);
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 4; i++)
+ {
+ e[2*i] = s1.a[i];
+ e[2*i+1] = s2.a[i];
+ }
+
+ if (check_union128i_w (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-shufpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-shufpd-1.c
new file mode 100644
index 00000000000..e81c818fa2d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-shufpd-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_shufpd_1
+#endif
+
+#define N 0xab
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_shuffle_pd (s1, s2, N);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2] = {0.0};
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (453.345635,54646.464356);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = (N & (1 << 0)) ? s1.a[1] : s1.a[0];
+ e[1] = (N & (1 << 1)) ? s2.a[1] : s2.a[0];
+
+ if (check_union128d(u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-sqrtpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-sqrtpd-1.c
new file mode 100644
index 00000000000..fa0b1fee8be
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-sqrtpd-1.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_sqrt_pd_1
+#endif
+
+#include <emmintrin.h>
+#include <math.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1)
+{
+ return _mm_sqrt_pd (s1);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1;
+ __m128d bogus = { 123.0, 456.0 };
+ double e[2];
+ int i;
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ u.x = test (s1.x);
+
+ for (i = 0; i < 2; i++)
+ {
+ __m128d tmp = _mm_load_sd (&s1.a[i]);
+ tmp = _mm_sqrt_sd (bogus, tmp);
+ _mm_store_sd (&e[i], tmp);
+ }
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_sqrt_pd_1; check_union128d failed\n");
+ printf ("\t [%f,%f] -> [%f,%f]\n", s1.a[0], s1.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-subpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-subpd-1.c
new file mode 100644
index 00000000000..6428dc92e31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-subpd-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_subpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_sub_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] - s2.a[0];
+ e[1] = s1.a[1] - s2.a[1];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-subsd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-subsd-1.c
new file mode 100644
index 00000000000..c5afa3ab02c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-subsd-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_subsd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_sub_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0] - s2.a[0];
+ e[1] = s1.a[1];
+
+ if (check_union128d (u, e))
+#if DEBUG
+ {
+ printf ("sse2_test_subsd_1; check_union128d failed\n");
+ printf ("\t [%f,%f] - [%f,%f] -> [%f,%f]\n", s1.a[0], s1.a[1], s2.a[0],
+ s2.a[1], u.a[0], u.a[1]);
+ printf ("\t expect [%f,%f]\n", e[0], e[1]);
+ }
+#else
+ abort ();
+#endif
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-1.c
new file mode 100644
index 00000000000..a68ed519ca8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-1.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_ucomisd_1
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_ucomieq_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,2344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] == s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-2.c b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-2.c
new file mode 100644
index 00000000000..b3f00c82632
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-2.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_ucomisd_2
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_ucomilt_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,12344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] < s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-3.c b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-3.c
new file mode 100644
index 00000000000..2bfcf84357d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-3.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_ucomisd_3
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_ucomile_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1] = {0};
+ int e[1] = {0};
+
+ s1.x = _mm_set_pd (2134.3343,12344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] <= s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-4.c b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-4.c
new file mode 100644
index 00000000000..42243b29c5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-4.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_ucomisd_4
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_ucomigt_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,12344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] > s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-5.c b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-5.c
new file mode 100644
index 00000000000..1fc2a2b3c08
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-5.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_ucomisd_5
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_ucomige_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,12344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] >= s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-6.c b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-6.c
new file mode 100644
index 00000000000..5ce8d453527
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-ucomisd-6.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_ucomisd_6
+#endif
+
+#include <emmintrin.h>
+
+static int
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_ucomineq_sd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d s1, s2;
+ int d[1];
+ int e[1];
+
+ s1.x = _mm_set_pd (2134.3343,12344.2354);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ d[0] = test (s1.x, s2.x);
+ e[0] = s1.a[0] != s2.a[0];
+
+ if (checkVi (d, e, 1))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-unpckhpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-unpckhpd-1.c
new file mode 100644
index 00000000000..f1547861c21
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-unpckhpd-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_unpckhpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_unpackhi_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[1];
+ e[1] = s2.a[1];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-unpcklpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-unpcklpd-1.c
new file mode 100644
index 00000000000..5c1ad1e18ae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-unpcklpd-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_unpcklpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ __asm("" : "+v"(s1), "+v"(s2));
+ return _mm_unpacklo_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union128d u, s1, s2;
+ double e[2];
+
+ s1.x = _mm_set_pd (2134.3343,1234.635654);
+ s2.x = _mm_set_pd (41124.234,2344.2354);
+ u.x = test (s1.x, s2.x);
+
+ e[0] = s1.a[0];
+ e[1] = s2.a[0];
+
+ if (check_union128d (u, e))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse2-xorpd-1.c b/gcc/testsuite/gcc.target/powerpc/sse2-xorpd-1.c
new file mode 100644
index 00000000000..d1c04bfc8f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse2-xorpd-1.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_xorpd_1
+#endif
+
+#include <emmintrin.h>
+
+static __m128d
+__attribute__((noinline, unused))
+test (__m128d s1, __m128d s2)
+{
+ return _mm_xor_pd (s1, s2);
+}
+
+static void
+TEST (void)
+{
+ union
+ {
+ double d[2];
+ long long l[2];
+ }source1, source2, e;
+
+ union128d u, s1, s2;
+ int i;
+
+ s1.x = _mm_set_pd (11.1321456, 2.287332);
+ s2.x = _mm_set_pd (3.37768, 4.43222234);
+
+ _mm_storeu_pd (source1.d, s1.x);
+ _mm_storeu_pd (source2.d, s2.x);
+
+ u.x = test (s1.x, s2.x);
+
+ for (i = 0; i < 2; i++)
+ e.l[i] = source1.l[i] ^ source2.l[i];
+
+ if (check_union128d (u, e.d))
+ abort ();
+}
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-cmp-2.c b/gcc/testsuite/gcc.target/s390/zvector/vec-cmp-2.c
index 0711f9c0531..1e63defa063 100644
--- a/gcc/testsuite/gcc.target/s390/zvector/vec-cmp-2.c
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-cmp-2.c
@@ -7,197 +7,197 @@
#include <vecintrin.h>
-extern void foo (void);
+int g = 1;
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_eq_double (vector double a, vector double b)
{
if (__builtin_expect (vec_all_eq (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_eq_double:\n\tvfcedbs\t%v\[0-9\]*,%v24,%v26\n\tjne 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_ne_double (vector double a, vector double b)
{
if (__builtin_expect (vec_all_ne (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_ne_double:\n\tvfcedbs\t%v\[0-9\]*,%v24,%v26\n\tjle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_gt_double (vector double a, vector double b)
{
if (__builtin_expect (vec_all_gt (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_gt_double:\n\tvfchdbs\t%v\[0-9\]*,%v24,%v26\n\tjne 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_lt_double (vector double a, vector double b)
{
if (__builtin_expect (vec_all_lt (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_lt_double:\n\tvfchdbs\t%v\[0-9\]*,%v26,%v24\n\tjne 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_ge_double (vector double a, vector double b)
{
if (__builtin_expect (vec_all_ge (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_ge_double:\n\tvfchedbs\t%v\[0-9\]*,%v24,%v26\n\tjne 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_le_double (vector double a, vector double b)
{
if (__builtin_expect (vec_all_le (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_le_double:\n\tvfchedbs\t%v\[0-9\]*,%v26,%v24\n\tjne 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_eq_double (vector double a, vector double b)
{
if (__builtin_expect (vec_any_eq (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_eq_double:\n\tvfcedbs\t%v\[0-9\]*,%v24,%v26\n\tjnle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_ne_double (vector double a, vector double b)
{
if (__builtin_expect (vec_any_ne (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_ne_double:\n\tvfcedbs\t%v\[0-9\]*,%v24,%v26\n\tje 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_gt_double (vector double a, vector double b)
{
if (__builtin_expect (vec_any_gt (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_gt_double:\n\tvfchdbs\t%v\[0-9\]*,%v24,%v26\n\tjnle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_lt_double (vector double a, vector double b)
{
if (__builtin_expect (vec_any_lt (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_lt_double:\n\tvfchdbs\t%v\[0-9\]*,%v26,%v24\n\tjnle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_ge_double (vector double a, vector double b)
{
if (__builtin_expect (vec_any_ge (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_ge_double:\n\tvfchedbs\t%v\[0-9\]*,%v24,%v26\n\tjnle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_le_double (vector double a, vector double b)
{
if (__builtin_expect (vec_any_le (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_le_double:\n\tvfchedbs\t%v\[0-9\]*,%v26,%v24\n\tjnle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_eq_int (vector int a, vector int b)
{
if (__builtin_expect (vec_all_eq (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_eq_int:\n\tvceqfs\t%v\[0-9\]*,%v24,%v26\n\tjne 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_ne_int (vector int a, vector int b)
{
if (__builtin_expect (vec_all_ne (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_ne_int:\n\tvceqfs\t%v\[0-9\]*,%v24,%v26\n\tjle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_gt_int (vector int a, vector int b)
{
if (__builtin_expect (vec_all_gt (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_gt_int:\n\tvchfs\t%v\[0-9\]*,%v24,%v26\n\tjne 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_lt_int (vector int a, vector int b)
{
if (__builtin_expect (vec_all_lt (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_lt_int:\n\tvchfs\t%v\[0-9\]*,%v26,%v24\n\tjne 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_ge_int (vector int a, vector int b)
{
if (__builtin_expect (vec_all_ge (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_ge_int:\n\tvchfs\t%v\[0-9\]*,%v26,%v24\n\tjle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
all_le_int (vector int a, vector int b)
{
if (__builtin_expect (vec_all_le (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times all_le_int:\n\tvchfs\t%v\[0-9\]*,%v24,%v26\n\tjle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_eq_int (vector int a, vector int b)
{
if (__builtin_expect (vec_any_eq (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_eq_int:\n\tvceqfs\t%v\[0-9\]*,%v24,%v26\n\tjnle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_ne_int (vector int a, vector int b)
{
if (__builtin_expect (vec_any_ne (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_ne_int:\n\tvceqfs\t%v\[0-9\]*,%v24,%v26\n\tje 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_gt_int (vector int a, vector int b)
{
if (__builtin_expect (vec_any_gt (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_gt_int:\n\tvchfs\t%v\[0-9\]*,%v24,%v26\n\tjnle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_lt_int (vector int a, vector int b)
{
if (__builtin_expect (vec_any_lt (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_lt_int:\n\tvchfs\t%v\[0-9\]*,%v26,%v24\n\tjnle 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_ge_int (vector int a, vector int b)
{
if (__builtin_expect (vec_any_ge (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_ge_int:\n\tvchfs\t%v\[0-9\]*,%v26,%v24\n\tje 1 } } */
-int __attribute__((noinline,noclone))
+void __attribute__((noinline,noclone))
any_le_int (vector int a, vector int b)
{
if (__builtin_expect (vec_any_le (a, b), 1))
- foo ();
+ g = 2;
}
/* { dg-final { scan-assembler-times any_le_int:\n\tvchfs\t%v\[0-9\]*,%v24,%v26\n\tje 1 } } */
diff --git a/gcc/testsuite/gfortran.dg/allocate_error_7.f90 b/gcc/testsuite/gfortran.dg/allocate_error_7.f90
new file mode 100644
index 00000000000..f1c8bc3db64
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/allocate_error_7.f90
@@ -0,0 +1,12 @@
+! { dg-do compile }
+!
+! Code contributed by Gerhard Steinmetz
+!
+program pr82620
+ type t(a)
+ integer, len :: a
+ end type
+ type(t(:)), allocatable :: x, y
+ allocate(t(4) :: x)
+ allocate)t(7) :: y) ! { dg-error "Syntax error in ALLOCATE" }
+end program pr82620
diff --git a/gcc/testsuite/gfortran.dg/array_constructor_51.f90 b/gcc/testsuite/gfortran.dg/array_constructor_51.f90
new file mode 100644
index 00000000000..4c3cdf71fcf
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/array_constructor_51.f90
@@ -0,0 +1,20 @@
+! { dg-do compile }
+! { dg-additional-options "-ffrontend-optimize -fdump-tree-original" }
+! PR 82567 - long compile times caused by large constant constructors
+! multiplied by variables
+
+ SUBROUTINE sub()
+ IMPLICIT NONE
+
+ INTEGER, PARAMETER :: n = 1000
+ REAL, ALLOCATABLE :: x(:)
+ REAL :: xc, h
+ INTEGER :: i
+
+ ALLOCATE( x(n) )
+ xc = 100.
+ h = xc/n
+ x = h*[(i,i=1,n)]
+
+end
+! { dg-final { scan-tree-dump-times "__var" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/assumed_size_2.f90 b/gcc/testsuite/gfortran.dg/assumed_size_2.f90
new file mode 100644
index 00000000000..e9a1185b527
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/assumed_size_2.f90
@@ -0,0 +1,4 @@
+! { dg-do compile }
+subroutine foo(a)
+ dimension a(*,*) ! { dg-error "Bad specification for assumed size array" }
+end
diff --git a/gcc/testsuite/gfortran.dg/class_63.f90 b/gcc/testsuite/gfortran.dg/class_63.f90
new file mode 100644
index 00000000000..cf99bcf9cb2
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/class_63.f90
@@ -0,0 +1,80 @@
+! { dg-do run }
+!
+! Tests the fix for PR81758, in which the vpointer for 'ptr' in
+! function 'pointer_value' would be set to the vtable of the component
+! 'container' rather than that of the component 'vec_elem'. In this test
+! case it is ensured that there is a single typebound procedure for both
+! types, so that different values are returned. In the original problem
+! completely different procedures were involved so that a segfault resulted.
+!
+! Reduced from the original code of Dimitry Liakh <liakhdi@ornl.gov> by
+! Paul Thomas <pault@gcc.gnu.org>
+!
+module types
+ type, public:: gfc_container_t
+ contains
+ procedure, public:: get_value => ContTypeGetValue
+ end type gfc_container_t
+
+ !Element of a container:
+ type, public:: gfc_cont_elem_t
+ integer :: value_p
+ contains
+ procedure, public:: get_value => ContElemGetValue
+ end type gfc_cont_elem_t
+
+ !Vector element:
+ type, extends(gfc_cont_elem_t), public:: vector_elem_t
+ end type vector_elem_t
+
+ !Vector:
+ type, extends(gfc_container_t), public:: vector_t
+ type(vector_elem_t), allocatable, private :: vec_elem
+ end type vector_t
+
+ type, public :: vector_iter_t
+ class(vector_t), pointer, private :: container => NULL()
+ contains
+ procedure, public:: get_vector_value => vector_Value
+ procedure, public:: get_pointer_value => pointer_value
+ end type
+
+contains
+ integer function ContElemGetValue (this)
+ class(gfc_cont_elem_t) :: this
+ ContElemGetValue = this%value_p
+ end function
+
+ integer function ContTypeGetValue (this)
+ class(gfc_container_t) :: this
+ ContTypeGetValue = 0
+ end function
+
+ integer function vector_Value (this)
+ class(vector_iter_t) :: this
+ vector_value = this%container%vec_elem%get_value()
+ end function
+
+ integer function pointer_value (this)
+ class(vector_iter_t), target :: this
+ class(gfc_cont_elem_t), pointer :: ptr
+ ptr => this%container%vec_elem
+ pointer_value = ptr%get_value()
+ end function
+
+ subroutine factory (arg)
+ class (vector_iter_t), pointer :: arg
+ allocate (vector_iter_t :: arg)
+ allocate (vector_t :: arg%container)
+ allocate (arg%container%vec_elem)
+ arg%container%vec_elem%value_p = 99
+ end subroutine
+end module
+
+ use types
+ class (vector_iter_t), pointer :: x
+
+ call factory (x)
+ if (x%get_vector_value() .ne. 99) call abort
+ if (x%get_pointer_value() .ne. 99) call abort
+end
diff --git a/gcc/testsuite/gfortran.dg/class_64.f90 b/gcc/testsuite/gfortran.dg/class_64.f90
new file mode 100644
index 00000000000..059ebaa8a01
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/class_64.f90
@@ -0,0 +1,38 @@
+! { dg-do compile }
+! { dg-options "-fdump-tree-original" }
+!
+! Test the fix for PR80850 in which the _len field was not being
+! set for 'arg' in the call to 'foo'.
+!
+ type :: mytype
+ integer :: i
+ end type
+ class (mytype), pointer :: c
+
+ allocate (c, source = mytype (99_8))
+
+ call foo(c)
+ call bar(c)
+
+ deallocate (c)
+
+contains
+
+ subroutine foo (arg)
+ class(*) :: arg
+ select type (arg)
+ type is (mytype)
+ if (arg%i .ne. 99_8) call abort
+ end select
+ end subroutine
+
+ subroutine bar (arg)
+ class(mytype) :: arg
+ select type (arg)
+ type is (mytype)
+ if (arg%i .ne. 99_8) call abort
+ end select
+ end subroutine
+
+end
+! { dg-final { scan-tree-dump-times "arg.*._len" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/dec_structure_22.f90 b/gcc/testsuite/gfortran.dg/dec_structure_22.f90
new file mode 100644
index 00000000000..ddbee02602a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/dec_structure_22.f90
@@ -0,0 +1,38 @@
+ ! { dg-do run }
+ ! { dg-options "-fdec-structure" }
+ !
+ ! PR fortran/82511
+ !
+ ! Verify that structure variables with UNION components
+ ! are accepted in an I/O-list READ.
+ !
+ implicit none
+
+ structure /s/
+ union
+ map
+ character(16) :: c16_1
+ end map
+ map
+ character(16) :: c16_2
+ end map
+ end union
+ end structure
+
+ record /s/ r
+ character(32) :: instr = "ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^"
+
+ r.c16_1 = ' '
+ r.c16_2 = ' '
+ ! The record r shall be treated as if its components are listed:
+ ! read(...) r.c16_1, r.c16_2
+ ! This shall correspond to the formatted read of A16,A16
+ read(instr, '(A16,A16)') r
+
+ ! r.c16_1 and r.c16_2 are in a union, thus share the same memory
+ ! and the first 16 bytes of instr are overwritten
+ if ( r.c16_1 .ne. instr(17:32) .or. r.c16_2 .ne. instr(17:32) ) then
+ call abort()
+ endif
+
+ end
diff --git a/gcc/testsuite/gfortran.dg/derived_init_4.f90 b/gcc/testsuite/gfortran.dg/derived_init_4.f90
new file mode 100644
index 00000000000..114975150aa
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/derived_init_4.f90
@@ -0,0 +1,60 @@
+! { dg-do run }
+!
+! Test the fix for PR81048, where in the second call to 'g2' the
+! default initialization was "forgotten". 'g1', 'g1a' and 'g3' check
+! that this does not occur for scalars and explicit results.
+!
+! Contributed by David Smith <dm577216smith@gmail.com>
+!
+program test
+ type f
+ integer :: f = -1
+ end type
+ type(f) :: a, b(3)
+ type(f), allocatable :: ans
+ b = g2(a)
+ b = g2(a)
+ ans = g1(a)
+ if (ans%f .ne. -1) call abort
+ ans = g1(a)
+ if (ans%f .ne. -1) call abort
+ ans = g1a(a)
+ if (ans%f .ne. -1) call abort
+ ans = g1a(a)
+ if (ans%f .ne. -1) call abort
+ b = g3(a)
+ b = g3(a)
+contains
+ function g3(a) result(res)
+ type(f) :: a, res(3)
+ do j = 1, 3
+ if (res(j)%f == -1) then
+ res(j)%f = a%f - 1
+ else
+ call abort
+ endif
+ enddo
+ end function g3
+
+ function g2(a)
+ type(f) :: a, g2(3)
+ do j = 1, 3
+ if (g2(j)%f == -1) then
+ g2(j)%f = a%f - 1
+ else
+ call abort
+ endif
+ enddo
+ end function g2
+
+ function g1(a)
+ type(f) :: g1, a
+ if (g1%f .ne. -1 ) call abort
+ end function
+
+ function g1a(a) result(res)
+ type(f) :: res, a
+ if (res%f .ne. -1 ) call abort
+ end function
+end program test
+
diff --git a/gcc/testsuite/gfortran.dg/dtio_13.f90 b/gcc/testsuite/gfortran.dg/dtio_13.f90
index 9b907201afc..131af05c847 100644
--- a/gcc/testsuite/gfortran.dg/dtio_13.f90
+++ b/gcc/testsuite/gfortran.dg/dtio_13.f90
@@ -136,9 +136,7 @@ program test
character(3) :: a, b
class(t) :: chairman ! { dg-error "must be dummy, allocatable or pointer" }
open (unit=71, file='myunformatted_data.dat', form='unformatted')
-! The following error is spurious and is eliminated if previous error is corrected.
-! TODO Although better than an ICE, fix me.
- read (71) a, chairman, b ! { dg-error "cannot be polymorphic" }
+ read (71) a, chairman, b
close (unit=71)
end
diff --git a/gcc/testsuite/gfortran.dg/equiv_pure.f90 b/gcc/testsuite/gfortran.dg/equiv_pure.f90
new file mode 100644
index 00000000000..5b0ce419d2a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/equiv_pure.f90
@@ -0,0 +1,52 @@
+! { dg-do compile }
+! PR fortran/82796
+! Code contributed by ripero84 at gmail dot com
+module eq
+ implicit none
+ integer :: n1, n2
+ integer, dimension(2) :: a
+ equivalence (a(1), n1)
+ equivalence (a(2), n2)
+ common /a/ a
+end module eq
+
+module m
+ use eq
+ implicit none
+ type, public :: t
+ integer :: i
+ end type t
+end module m
+
+module p
+ implicit none
+ contains
+ pure integer function d(h)
+ use m
+ implicit none
+ integer, intent(in) :: h
+ d = h
+ end function
+end module p
+
+module q
+ implicit none
+ contains
+ pure integer function d(h)
+ use m, only : t
+ implicit none
+ integer, intent(in) :: h
+ d = h
+ end function
+end module q
+
+module r
+ implicit none
+ contains
+ pure integer function d(h)
+ use m, only : a ! { dg-error "cannot be an EQUIVALENCE object" }
+ implicit none
+ integer, intent(in) :: h
+ d = h
+ end function
+end module r
diff --git a/gcc/testsuite/gfortran.dg/execute_command_line_3.f90 b/gcc/testsuite/gfortran.dg/execute_command_line_3.f90
index 87d73d1b50f..c1790d801f3 100644
--- a/gcc/testsuite/gfortran.dg/execute_command_line_3.f90
+++ b/gcc/testsuite/gfortran.dg/execute_command_line_3.f90
@@ -15,10 +15,9 @@ character(len=:), allocatable :: command
if (j /= 3 .or. msg /= "Invalid command line" ) call abort
msg = ''
call execute_command_line(command , wait=.false., exitstat=i, cmdmsg=msg )
- print *,msg
- if (msg /= '') call abort
- call execute_command_line(command , exitstat=i, cmdstat=j )
if (j /= 3) call abort
call execute_command_line(command , wait=.false., exitstat=i )
+ if (msg /= '') call abort
+ call execute_command_line(command , exitstat=i, cmdstat=j )
end program boom
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr82568.f90 b/gcc/testsuite/gfortran.dg/gomp/pr82568.f90
new file mode 100644
index 00000000000..303278ca58f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/pr82568.f90
@@ -0,0 +1,75 @@
+! PR fortran/82568
+
+MODULE PR82568_MOD
+ INTEGER :: N
+END MODULE
+PROGRAM PR82568
+ INTEGER :: I, L
+ !$OMP PARALLEL DO
+ DO I=1,2
+ BLOCK
+ USE PR82568_MOD
+ INTEGER :: J
+ DO J=1,2
+ PRINT*,I,J
+ END DO
+ DO K=1,2
+ PRINT*,I,K
+ END DO
+ DO L=1,2
+ PRINT*,I,L
+ END DO
+ DO N=1,2
+ PRINT*,I,N
+ END DO
+ END BLOCK
+ DO M=1,2
+ PRINT*,I,M
+ END DO
+ END DO
+ !$OMP TASK
+ DO I=1,2
+ BLOCK
+ USE PR82568_MOD
+ INTEGER :: J
+ DO J=1,2
+ PRINT*,I,J
+ END DO
+ DO K=1,2
+ PRINT*,I,K
+ END DO
+ DO L=1,2
+ PRINT*,I,L
+ END DO
+ DO N=1,2
+ PRINT*,I,N
+ END DO
+ END BLOCK
+ DO M=1,2
+ PRINT*,I,M
+ END DO
+ END DO
+ !$OMP END TASK
+ !$OMP TASKLOOP
+ DO I=1,2
+ BLOCK
+ USE PR82568_MOD
+ INTEGER :: J
+ DO J=1,2
+ PRINT*,I,J
+ END DO
+ DO K=1,2
+ PRINT*,I,K
+ END DO
+ DO L=1,2
+ PRINT*,I,L
+ END DO
+ DO N=1,2
+ PRINT*,I,N
+ END DO
+ END BLOCK
+ DO M=1,2
+ PRINT*,I,M
+ END DO
+ END DO
+END PROGRAM PR82568
diff --git a/gcc/testsuite/gfortran.dg/graphite/pr82672.f90 b/gcc/testsuite/gfortran.dg/graphite/pr82672.f90
new file mode 100644
index 00000000000..77a1c706218
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/graphite/pr82672.f90
@@ -0,0 +1,33 @@
+! { dg-do compile }
+! { dg-options "-O2 -floop-nest-optimize" }
+
+ character(len=20,kind=4) :: s4
+ character(len=20,kind=1) :: s1
+
+ s1 = "foo\u0000"
+ s1 = "foo\u00ff"
+ s1 = "foo\u0100"
+ s1 = "foo\u0101"
+ s1 = "foo\U00000101"
+
+ s1 = 4_"foo bar"
+ s1 = 4_"foo\u00ff"
+ s1 = 4_"foo\u0101"
+ s1 = 4_"foo\u1101"
+ s1 = 4_"foo\UFFFFFFFF"
+
+ s4 = "foo\u0000"
+ s4 = "foo\u00ff"
+ s4 = "foo\u0100"
+ s4 = "foo\U00000100"
+
+ s4 = 4_"foo bar"
+ s4 = 4_"\xFF\x96"
+ s4 = 4_"\x00\x96"
+ s4 = 4_"foo\u00ff"
+ s4 = 4_"foo\u0101"
+ s4 = 4_"foo\u1101"
+ s4 = 4_"foo\Uab98EF56"
+ s4 = 4_"foo\UFFFFFFFF"
+
+end
diff --git a/gcc/testsuite/gfortran.dg/illegal_char.f90 b/gcc/testsuite/gfortran.dg/illegal_char.f90
new file mode 100644
index 00000000000..597c7b98ddd
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/illegal_char.f90
@@ -0,0 +1,6 @@
+! { dg-do compile }
+! PR 82372 - show hexcode of illegal, non-printable characters
+program main
+ tmp =È 1.0 ! { dg-error "Invalid character 0xC8" }
+ print *,tmp
+end
diff --git a/gcc/testsuite/gfortran.dg/implied_do_io_1.f90 b/gcc/testsuite/gfortran.dg/implied_do_io_1.f90
index e4a6d6b37b3..aef36af13eb 100644
--- a/gcc/testsuite/gfortran.dg/implied_do_io_1.f90
+++ b/gcc/testsuite/gfortran.dg/implied_do_io_1.f90
@@ -56,4 +56,4 @@ program main
1000 format (A2,100I4)
end program main
-! { dg-final { scan-tree-dump-times "while" 7 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)^\\s*while \\(1\\)$" 7 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/large_real_kind_2.F90 b/gcc/testsuite/gfortran.dg/large_real_kind_2.F90
index 7ed4c30e0d5..486c8c00361 100644
--- a/gcc/testsuite/gfortran.dg/large_real_kind_2.F90
+++ b/gcc/testsuite/gfortran.dg/large_real_kind_2.F90
@@ -1,6 +1,5 @@
! { dg-do run }
! { dg-require-effective-target fortran_large_real }
-! { dg-xfail-if "" { "*-*-freebsd*" } }
! Testing library calls on large real kinds (larger than kind=8)
implicit none
diff --git a/gcc/testsuite/gfortran.dg/matmul_const.f90 b/gcc/testsuite/gfortran.dg/matmul_const.f90
new file mode 100644
index 00000000000..35dce322774
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/matmul_const.f90
@@ -0,0 +1,10 @@
+! { dg-do run }
+! { dg-additional-options "-fno-frontend-optimize -fdump-tree-original" }
+program main
+ integer, parameter :: A(3,2) = reshape([1,2,3,4,5,6],[3,2])
+ integer, parameter :: B(2,3) = reshape([1,1,1,1,1,1],[2,3])
+ character (len=30) :: line
+ write (unit=line,fmt='(9i3)') matmul(A,B)
+ if (line /= ' 5 7 9 5 7 9 5 7 9') call abort
+end program main
+! { dg-final { scan-tree-dump-times "matmul_i4" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/pdt_16.f03 b/gcc/testsuite/gfortran.dg/pdt_16.f03
new file mode 100644
index 00000000000..067d87d660d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pdt_16.f03
@@ -0,0 +1,21 @@
+! { dg-do compile }
+!
+! Test the fix for all three errors in PR82586
+!
+! Contributed by G Steinmetz <gscfq@t-online.de>
+!
+module m
+ type t(a) ! { dg-error "does not have a component" }
+ end type
+end
+
+program p
+ type t(a ! { dg-error "Expected parameter list" }
+ integer, kind :: a
+ real(a) :: x
+ end type
+ type u(a, a) ! { dg-error "Duplicate name" }
+ integer, kind :: a ! { dg-error "already declared" }
+ integer, len :: a ! { dg-error "already declared" }
+ end type
+end
diff --git a/gcc/testsuite/gfortran.dg/pdt_17.f03 b/gcc/testsuite/gfortran.dg/pdt_17.f03
new file mode 100644
index 00000000000..1b0a30dca4c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pdt_17.f03
@@ -0,0 +1,11 @@
+! { dg-do compile }
+!
+! Test the fix for PR82587
+!
+! Contributed by G Steinmetz <gscfq@t-online.de>
+!
+program p
+ type t(a) ! { dg-error "does not have a component" }
+ integer(kind=t()) :: x ! { dg-error "used before it is defined" }
+ end type
+end
diff --git a/gcc/testsuite/gfortran.dg/pdt_18.f03 b/gcc/testsuite/gfortran.dg/pdt_18.f03
new file mode 100644
index 00000000000..896a727eaae
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pdt_18.f03
@@ -0,0 +1,19 @@
+! { dg-do compile }
+!
+! Test the fix for PR82589
+!
+! Contributed by G Steinmetz <gscfq@t-online.de>
+!
+module m
+ type t(a)
+ integer, KIND, private :: a ! { dg-error "attribute conflicts with" }
+ integer, KIND, allocatable :: a ! { dg-error "attribute conflicts with" }
+ integer, KIND, POINTER :: a ! { dg-error "attribute conflicts with" }
+ integer, KIND, dimension(2) :: a ! { dg-error "attribute conflicts with" }
+ integer, len, private :: a ! { dg-error "attribute conflicts with" }
+ integer, len, allocatable :: a ! { dg-error "attribute conflicts with" }
+ integer, len, POINTER :: a ! { dg-error "attribute conflicts with" }
+ integer, len, dimension(2) :: a ! { dg-error "attribute conflicts with" }
+ integer, kind :: a
+ end type
+end
diff --git a/gcc/testsuite/gfortran.dg/pdt_4.f03 b/gcc/testsuite/gfortran.dg/pdt_4.f03
index 13c00af79f1..15cb6417ca7 100644
--- a/gcc/testsuite/gfortran.dg/pdt_4.f03
+++ b/gcc/testsuite/gfortran.dg/pdt_4.f03
@@ -26,7 +26,7 @@ end module
integer, kind :: bad_kind ! { dg-error "not allowed outside a TYPE definition" }
integer, len :: bad_len ! { dg-error "not allowed outside a TYPE definition" }
- type :: bad_pdt (a,b, c, d)
+ type :: bad_pdt (a,b, c, d) ! { dg-error "does not have a component" }
real, kind :: a ! { dg-error "must be INTEGER" }
INTEGER(8), kind :: b ! { dg-error "be default integer kind" }
real, LEN :: c ! { dg-error "must be INTEGER" }
diff --git a/gcc/testsuite/gfortran.dg/pdt_8.f03 b/gcc/testsuite/gfortran.dg/pdt_8.f03
index d5e393e5e0c..aeec407fb4b 100644
--- a/gcc/testsuite/gfortran.dg/pdt_8.f03
+++ b/gcc/testsuite/gfortran.dg/pdt_8.f03
@@ -15,9 +15,10 @@ type :: t(i,a,x) ! { dg-error "does not|has neither" }
real, kind :: x ! { dg-error "must be INTEGER" }
end type
-type :: t1(k,y) ! { dg-error "not declared as a component of the type" }
+type :: t1(k,y) ! { dg-error "does not have a component" }
integer, kind :: k
end type
-type(t1(4,4)) :: z
+! This is a knock-on from the previous error
+type(t1(4,4)) :: z ! { dg-error "Invalid character in name" }
end
diff --git a/gcc/testsuite/gfortran.dg/pr81735.f90 b/gcc/testsuite/gfortran.dg/pr81735.f90
new file mode 100644
index 00000000000..6aae203aa0f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr81735.f90
@@ -0,0 +1,25 @@
+! { dg-do compile }
+! { dg-options "-fdump-tree-original" }
+!
+! Contributed by Danila <flashmozzg@gmail.com>
+!
+program fooprog
+ implicit none
+ type FooType
+ integer, allocatable :: x
+ end type FooType
+
+ type(FooType), pointer :: bar
+
+ bar => foo()
+
+contains
+ function foo() result(res)
+ type(FooType), pointer :: res
+
+ character(:), allocatable :: rt
+ rt = ""
+ res => null()
+ end function foo
+end program fooprog
+! { dg-final { scan-tree-dump-times "__builtin_free" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/spellcheck-operator.f90 b/gcc/testsuite/gfortran.dg/spellcheck-operator.f90
new file mode 100644
index 00000000000..810a770698b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/spellcheck-operator.f90
@@ -0,0 +1,30 @@
+! { dg-do compile }
+! test levenshtein based spelling suggestions
+
+module mymod1
+ implicit none
+ contains
+ function something_good (iarg1)
+ integer :: something_good
+ integer, intent(in) :: iarg1
+ something_good = iarg1 + 42
+ end function something_good
+end module mymod1
+
+program spellchekc
+ use mymod1
+ implicit none
+
+ interface operator (.mywrong.)
+ module procedure something_wring ! { dg-error "Procedure .something_wring. in operator interface .mywrong. at .1. is neither function nor subroutine; did you mean .something_good.\\?|User operator procedure .something_wring. at .1. must be a FUNCTION" }
+ end interface
+
+ interface operator (.mygood.)
+ module procedure something_good
+ end interface
+
+ integer :: i, j, added
+ i = 0
+ j = 0
+ added = .mygoof. j ! { dg-error "Unknown operator .mygoof. at .1.; did you mean .mygood.\\?" }
+end program spellchekc
diff --git a/gcc/testsuite/gfortran.dg/spellcheck-parameter.f90 b/gcc/testsuite/gfortran.dg/spellcheck-parameter.f90
new file mode 100644
index 00000000000..715c5abcce7
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/spellcheck-parameter.f90
@@ -0,0 +1,15 @@
+! { dg-do compile }
+! Contributed by Joost VandeVondele
+! test levenshtein based spelling suggestions for keyword arguments
+
+module test
+contains
+ subroutine mysub(iarg1)
+ integer :: iarg1
+ end subroutine
+end module
+
+use test
+call mysub(iarg=1) ! { dg-error "Keyword argument .iarg. at .1. is not in the procedure; did you mean .iarg1.\\?" }
+
+end
diff --git a/gcc/testsuite/gfortran.dg/spellcheck-procedure_1.f90 b/gcc/testsuite/gfortran.dg/spellcheck-procedure_1.f90
new file mode 100644
index 00000000000..3b7f7169468
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/spellcheck-procedure_1.f90
@@ -0,0 +1,41 @@
+! { dg-do compile }
+! test levenshtein based spelling suggestions
+
+module mymod1
+ implicit none
+ contains
+ function something_else (iarg1)
+ integer :: something_else
+ integer, intent(in) :: iarg1
+ something_else = iarg1 + 42
+ end function something_else
+ function add_fourtytwo (iarg1)
+ integer :: add_fourtytwo
+ integer, intent(in) :: iarg1
+ add_fourtytwo = iarg1 + 42
+ end function add_fourtytwo
+end module mymod1
+
+function myadd(iarg1, iarg2)
+ implicit none
+ integer :: myadd
+ integer, intent(in) :: iarg1, iarg2
+ myadd = iarg1 + iarg2
+end function myadd
+
+program spellchekc
+ use mymod1, something_good => something_else
+ implicit none
+
+ integer :: myadd, i, j, myvar
+ i = 0
+ j = 0
+
+ j = something_goof(j) ! { dg-error "no IMPLICIT type; did you mean .something_good.\\?" }
+ j = myaddd(i, j) ! { dg-error "no IMPLICIT type; did you mean .myadd.\\?" }
+ if (j /= 42) call abort
+ j = add_fourtytow(i, j) ! { dg-error "no IMPLICIT type; did you mean .add_fourtytwo.\\?" }
+ myval = myadd(i, j) ! { dg-error "no IMPLICIT type; did you mean .myvar.\\?" }
+ if (j /= 42 * 2) call abort
+
+end program spellchekc
diff --git a/gcc/testsuite/gfortran.dg/spellcheck-procedure_2.f90 b/gcc/testsuite/gfortran.dg/spellcheck-procedure_2.f90
new file mode 100644
index 00000000000..a6ea5f9f280
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/spellcheck-procedure_2.f90
@@ -0,0 +1,35 @@
+! { dg-do compile }
+! test levenshtein based spelling suggestions
+
+
+program spellchekc
+ implicit none (external) ! { dg-warning "GNU Extension: IMPORT NONE with spec list" }
+
+ interface
+ subroutine bark_unless_zero(iarg)
+ implicit none
+ integer, intent(in) :: iarg
+ end subroutine bark_unless_zero
+ end interface
+
+ integer :: i
+ i = 0
+
+ if (i /= 1) call abort
+ call bark_unless_0(i) ! { dg-error "not explicitly declared; did you mean .bark_unless_zero.\\?" }
+! call complain_about_0(i) ! { -dg-error "not explicitly declared; did you mean .complain_about_zero.\\?" }
+
+contains
+! We cannot reliably see this ATM, would need an unambiguous bit somewhere
+ subroutine complain_about_zero(iarg)
+ integer, intent(in) :: iarg
+ if (iarg /= 0) call abort
+ end subroutine complain_about_zero
+
+end program spellchekc
+
+subroutine bark_unless_zero(iarg)
+ implicit none
+ integer, intent(in) :: iarg
+ if (iarg /= 0) call abort
+end subroutine bark_unless_zero
diff --git a/gcc/testsuite/gfortran.dg/spellcheck-structure.f90 b/gcc/testsuite/gfortran.dg/spellcheck-structure.f90
new file mode 100644
index 00000000000..929e05f2151
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/spellcheck-structure.f90
@@ -0,0 +1,35 @@
+! { dg-do compile }
+! test levenshtein based spelling suggestions
+implicit none
+
+!!!!!!!!!!!!!! structure tests !!!!!!!!!!!!!!
+type type1
+ real :: radius
+ integer :: i
+end type type1
+
+type type2
+ integer :: myint
+ type(type1) :: mytype
+end type type2
+
+type type3
+ type(type2) :: type_2
+end type type3
+type type4
+ type(type3) :: type_3
+end type type4
+
+type(type1) :: t1
+t1%radiuz = .0 ! { dg-error ".radiuz. at .1. is not a member of the .type1. structure; did you mean .radius.\\?" }
+t1%x = .0 ! { dg-error ".x. at .1. is not a member of the .type1. structure" }
+type(type2) :: t2
+t2%mytape%radius = .0 ! { dg-error ".mytape. at .1. is not a member of the .type2. structure; did you mean .mytype.\\?" }
+t2%mytype%radious = .0 ! { dg-error ".radious. at .1. is not a member of the .type1. structure; did you mean .radius.\\?" }
+type(type4) :: t4
+t4%type_3%type_2%mytype%radium = 88.0 ! { dg-error ".radium. at .1. is not a member of the .type1. structure; did you mean .radius.\\?" }
+
+!!!!!!!!!!!!!! symbol tests !!!!!!!!!!!!!!
+integer :: iarg1
+iarg2 = 1 ! { dg-error "Symbol .iarg2. at .1. has no IMPLICIT type; did you mean .iarg1.\\?" }
+end
diff --git a/gcc/testsuite/gfortran.dg/submodule_30.f08 b/gcc/testsuite/gfortran.dg/submodule_30.f08
new file mode 100644
index 00000000000..25dcbebe656
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/submodule_30.f08
@@ -0,0 +1,42 @@
+! { dg-do run }
+!
+! Test the fix for PR82550 in which the reference to 'p' in 'foo'
+! was not being correctly handled.
+!
+! Contributed by Reinhold Bader <Bader@lrz.de>
+!
+module m_subm_18_pos
+ implicit none
+ integer :: i = 0
+ interface
+ module subroutine foo(fun_ptr)
+ procedure(p), pointer, intent(out) :: fun_ptr
+ end subroutine
+ end interface
+contains
+ subroutine p()
+ i = 1
+ end subroutine p
+end module m_subm_18_pos
+submodule (m_subm_18_pos) subm_18_pos
+ implicit none
+contains
+ module subroutine foo(fun_ptr)
+ procedure(p), pointer, intent(out) :: fun_ptr
+ fun_ptr => p
+ end subroutine
+end submodule
+program p_18_pos
+ use m_subm_18_pos
+ implicit none
+ procedure(), pointer :: x
+ call foo(x)
+ call x()
+ if (i == 1) then
+ write(*,*) 'OK'
+ else
+ write(*,*) 'FAIL'
+ call abort
+ end if
+end program p_18_pos
+
diff --git a/gcc/testsuite/gnat.dg/default_pkg_actual.adb b/gcc/testsuite/gnat.dg/default_pkg_actual.adb
new file mode 100644
index 00000000000..d10ae0c152b
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/default_pkg_actual.adb
@@ -0,0 +1,32 @@
+-- { dg-do compile }
+
+procedure Default_Pkg_Actual is
+
+ generic
+ package As is
+ end As;
+
+ generic
+ type T is private;
+ with package A0 is new As;
+ package Bs is
+ end Bs;
+
+ generic
+ with package Xa is new As;
+ package Xs is
+ package Xb is new Bs(T => Integer, A0 => Xa);
+ end Xs;
+
+ generic
+ with package Yb is new Bs(T => Integer, others => <>);
+ package Ys is
+ end Ys;
+
+ package A is new As;
+ package X is new Xs(Xa => A);
+ package Y is new Ys(Yb => X.Xb);
+
+begin
+ null;
+end;
diff --git a/gcc/testsuite/gnat.dg/default_pkg_actual2.adb b/gcc/testsuite/gnat.dg/default_pkg_actual2.adb
new file mode 100644
index 00000000000..7ab614a0994
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/default_pkg_actual2.adb
@@ -0,0 +1,27 @@
+-- { dg-do compile }
+
+procedure Default_Pkg_Actual2 is
+
+ generic
+ package P1 is
+ end;
+
+ generic
+ with package FP1a is new P1;
+ with package FP1b is new P1;
+ package P2 is
+ end;
+
+ generic
+ with package FP2 is new P2 (FP1a => <>, FP1b => <>);
+ package P3 is
+ end;
+
+ package NP1a is new P1;
+ package NP1b is new P1;
+ package NP2 is new P2 (NP1a, NP1b);
+ package NP4 is new P3 (NP2);
+
+begin
+ null;
+end;
diff --git a/gcc/testsuite/gnat.dg/dimensions.adb b/gcc/testsuite/gnat.dg/dimensions.adb
new file mode 100644
index 00000000000..86fc6eef670
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/dimensions.adb
@@ -0,0 +1,5 @@
+-- { dg-do compile }
+
+package body Dimensions is
+ procedure Dummy is null;
+end Dimensions;
diff --git a/gcc/testsuite/gnat.dg/dimensions.ads b/gcc/testsuite/gnat.dg/dimensions.ads
new file mode 100644
index 00000000000..54bab081470
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/dimensions.ads
@@ -0,0 +1,29 @@
+package Dimensions is
+
+ type Mks_Int_Type is new Integer
+ with
+ Dimension_System => (
+ (Unit_Name => Meter, Unit_Symbol => 'm', Dim_Symbol => 'L'),
+ (Unit_Name => Kilogram, Unit_Symbol => "kg", Dim_Symbol => 'M'),
+ (Unit_Name => Second, Unit_Symbol => 's', Dim_Symbol => 'T'),
+ (Unit_Name => Ampere, Unit_Symbol => 'A', Dim_Symbol => 'I'),
+ (Unit_Name => Kelvin, Unit_Symbol => 'K', Dim_Symbol => '@'),
+ (Unit_Name => Mole, Unit_Symbol => "mol", Dim_Symbol => 'N'),
+ (Unit_Name => Candela, Unit_Symbol => "cd", Dim_Symbol => 'J'));
+
+ subtype Int_Length is Mks_Int_Type
+ with
+ Dimension => (Symbol => 'm',
+ Meter => 1,
+ others => 0);
+
+ subtype Int_Speed is Mks_Int_Type
+ with
+ Dimension => (
+ Meter => 1,
+ Second => -1,
+ others => 0);
+
+ procedure Dummy;
+
+end Dimensions;
diff --git a/gcc/testsuite/gnat.dg/opt68.adb b/gcc/testsuite/gnat.dg/opt68.adb
new file mode 100644
index 00000000000..caf6b713996
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/opt68.adb
@@ -0,0 +1,53 @@
+-- { dg-do compile }
+-- { dg-options "-O3" }
+
+with Ada.Unchecked_Deallocation;
+
+package body Opt68 is
+
+ procedure Free
+ is new Ada.Unchecked_Deallocation (Queue_Element, A_Queue_Element);
+
+ procedure Copy (dest : in out Queue; src : Queue) is
+ d, s, pd, ps, t : A_Queue_Element;
+ begin
+ if src.sz /= 0 then
+ d := dest.front;
+ s := src.front;
+ while d /= null and s /= null loop
+ d.value := s.value;
+ pd := d;
+ ps := s;
+ d := d.next;
+ s := s.next;
+ end loop;
+ if src.sz = dest.sz then
+ return;
+ elsif s = null then
+ while d /= null loop
+ t := d.next;
+ Free (d);
+ d := t;
+ end loop;
+ dest.back := pd;
+ dest.back.next := null;
+ else
+ if pd = null then
+ dest.front := new Queue_Element;
+ dest.front.value := s.value;
+ s := s.next;
+ pd := dest.front;
+ end if;
+ while s /= null loop
+ pd.next := new Queue_Element;
+ pd.next.value := s.value;
+ pd := pd.next;
+ s := s.next;
+ end loop;
+ dest.back := pd;
+ end if;
+ dest.sz := src.sz;
+ end if;
+ end;
+
+end Opt68;
diff --git a/gcc/testsuite/gnat.dg/opt68.ads b/gcc/testsuite/gnat.dg/opt68.ads
new file mode 100644
index 00000000000..25e28a50d7b
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/opt68.ads
@@ -0,0 +1,26 @@
+with Ada.Finalization;
+
+package Opt68 is
+
+ type Cont is new Ada.Finalization.Controlled with null record;
+
+ type Element is record
+ C : Cont;
+ end record;
+
+ type Queue_Element;
+ type A_Queue_Element is access Queue_Element;
+ type Queue_Element is record
+ Value : Element;
+ Next : A_Queue_Element;
+ end record;
+
+ type Queue is limited record
+ Sz : Natural;
+ Front : A_Queue_Element;
+ Back : A_Queue_Element;
+ end record;
+
+ procedure Copy (dest : in out Queue; src : Queue);
+
+end Opt68;
diff --git a/gcc/testsuite/gnat.dg/remote_call_iface.adb b/gcc/testsuite/gnat.dg/remote_call_iface.adb
new file mode 100644
index 00000000000..6816ad95a65
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/remote_call_iface.adb
@@ -0,0 +1,7 @@
+-- { dg-do compile }
+
+package body Remote_Call_Iface is
+ procedure Proc is begin null; end;
+begin
+ Proc;
+end Remote_Call_Iface;
diff --git a/gcc/testsuite/gnat.dg/remote_call_iface.ads b/gcc/testsuite/gnat.dg/remote_call_iface.ads
new file mode 100644
index 00000000000..ce12fef88ca
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/remote_call_iface.ads
@@ -0,0 +1,5 @@
+generic
+package Remote_Call_Iface is
+ pragma Remote_Call_Interface;
+ procedure Proc;
+end Remote_Call_Iface;
diff --git a/gcc/testsuite/gnat.dg/specs/discr_private.ads b/gcc/testsuite/gnat.dg/specs/discr2.ads
index 0ddfbd137ff..f7ece058812 100644
--- a/gcc/testsuite/gnat.dg/specs/discr_private.ads
+++ b/gcc/testsuite/gnat.dg/specs/discr2.ads
@@ -1,7 +1,7 @@
-- { dg-do compile }
-- { dg-options "-gnatws" }
-package Discr_Private is
+package Discr2 is
package Dec is
type T_DECIMAL (Prec : Integer := 1) is private;
@@ -47,4 +47,4 @@ package Discr_Private is
end case;
end record;
-end Discr_Private;
+end Discr2;
diff --git a/gcc/testsuite/gnat.dg/specs/discr_record_constant.ads b/gcc/testsuite/gnat.dg/specs/discr3.ads
index f43b1386909..bcb996b7386 100644
--- a/gcc/testsuite/gnat.dg/specs/discr_record_constant.ads
+++ b/gcc/testsuite/gnat.dg/specs/discr3.ads
@@ -2,7 +2,7 @@
pragma Restrictions (No_Implicit_Heap_Allocations);
-package Discr_Record_Constant is
+package Discr3 is
type T (Big : Boolean := False) is record
case Big is
@@ -19,4 +19,4 @@ package Discr_Record_Constant is
Con : constant T := D; -- Violation of restriction
Ter : constant T := Con; -- Violation of restriction
-end Discr_Record_Constant;
+end Discr3;
diff --git a/gcc/testsuite/gnat.dg/specs/discr4.ads b/gcc/testsuite/gnat.dg/specs/discr4.ads
new file mode 100644
index 00000000000..a7fc25b9d66
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/specs/discr4.ads
@@ -0,0 +1,23 @@
+-- { dg-do compile }
+-- { dg-options "-O" }
+
+with Discr4_Pkg; use Discr4_Pkg;
+
+package Discr4 is
+
+ type Data is record
+ Val : Rec;
+ Set : Boolean;
+ end record;
+
+ type Pair is record
+ Lower, Upper : Data;
+ end record;
+
+ function Build (L, U : Rec) return Pair is ((L, True), (U, False));
+
+ C1 : constant Pair := Build (Rec_One, Rec_Three);
+
+ C2 : constant Pair := Build (Get (0), Rec_Three);
+
+end Discr4;
diff --git a/gcc/testsuite/gnat.dg/specs/discr4_pkg.ads b/gcc/testsuite/gnat.dg/specs/discr4_pkg.ads
new file mode 100644
index 00000000000..231a8fb77e8
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/specs/discr4_pkg.ads
@@ -0,0 +1,27 @@
+package Discr4_Pkg is
+
+ type Enum is (One, Two, Three);
+
+ type Rec is private;
+
+ Rec_One : constant Rec;
+ Rec_Three : constant Rec;
+
+ function Get (Value : Integer) return Rec;
+
+private
+
+ type Rec (D : Enum := Two) is record
+ case D is
+ when One => null;
+ when Two => Value : Integer;
+ when Three => null;
+ end case;
+ end record;
+
+ Rec_One : constant Rec := (D => One);
+ Rec_Three : constant Rec := (D => Three);
+
+ function Get (Value : Integer) return Rec is (Two, Value);
+
+end Discr4_Pkg;
diff --git a/gcc/testsuite/gnat.dg/stack_usage4.adb b/gcc/testsuite/gnat.dg/stack_usage4.adb
new file mode 100644
index 00000000000..24cd1a75bf0
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/stack_usage4.adb
@@ -0,0 +1,11 @@
+-- { dg-do compile }
+-- { dg-options "-Wstack-usage=512" }
+
+with Stack_Usage4_Pkg; use Stack_Usage4_Pkg;
+
+procedure Stack_Usage4 is
+ BS : Bounded_String := Get;
+ S : String := BS.Data (BS.Data'First .. BS.Len);
+begin
+ null;
+end;
diff --git a/gcc/testsuite/gnat.dg/stack_usage4_pkg.ads b/gcc/testsuite/gnat.dg/stack_usage4_pkg.ads
new file mode 100644
index 00000000000..9bad62776cd
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/stack_usage4_pkg.ads
@@ -0,0 +1,12 @@
+package Stack_Usage4_Pkg is
+
+ subtype Name_Index_Type is Natural range 1 .. 63;
+
+ type Bounded_String is record
+ Len : Name_Index_Type;
+ Data : String (Name_Index_Type'Range);
+ end record;
+
+ function Get return Bounded_String;
+
+end Stack_Usage4_Pkg;
diff --git a/gcc/testsuite/gnat.dg/sync_iface_call.adb b/gcc/testsuite/gnat.dg/sync_iface_call.adb
new file mode 100644
index 00000000000..1603981892e
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/sync_iface_call.adb
@@ -0,0 +1,34 @@
+-- { dg-do compile }
+
+with Sync_Iface_Call_Pkg;
+with Sync_Iface_Call_Pkg2;
+
+procedure Sync_Iface_Call is
+
+ Impl : access Sync_Iface_Call_Pkg.IFace'Class :=
+ new Sync_Iface_Call_Pkg2.Impl;
+ Val : aliased Integer := 10;
+begin
+ select
+ Impl.Do_Stuff (Val);
+ or
+ delay 10.0;
+ end select;
+ select
+ Impl.Do_Stuff_Access (Val'Access);
+ or
+ delay 10.0;
+ end select;
+
+ select
+ Impl.Do_Stuff_2 (Val);
+ or
+ delay 10.0;
+ end select;
+
+ select
+ Impl.Do_Stuff_2_Access (Val'Access);
+ or
+ delay 10.0;
+ end select;
+end Sync_Iface_Call;
diff --git a/gcc/testsuite/gnat.dg/sync_iface_call_pkg.ads b/gcc/testsuite/gnat.dg/sync_iface_call_pkg.ads
new file mode 100644
index 00000000000..e392c024c79
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/sync_iface_call_pkg.ads
@@ -0,0 +1,21 @@
+package Sync_Iface_Call_Pkg is
+
+ type IFace is synchronized interface;
+
+ procedure Do_Stuff
+ (This : in out IFace;
+ Value : in Integer) is null;
+
+ procedure Do_Stuff_Access
+ (This : in out IFace;
+ Value : not null access Integer) is null;
+
+ procedure Do_Stuff_2
+ (This : not null access IFace;
+ Value : in Integer) is null;
+
+ procedure Do_Stuff_2_Access
+ (This : not null access IFace;
+ Value : not null access Integer) is null;
+
+end Sync_Iface_Call_Pkg;
diff --git a/gcc/testsuite/gnat.dg/sync_iface_call_pkg2.adb b/gcc/testsuite/gnat.dg/sync_iface_call_pkg2.adb
new file mode 100644
index 00000000000..b3c221e5b1a
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/sync_iface_call_pkg2.adb
@@ -0,0 +1,8 @@
+package body Sync_Iface_Call_Pkg2 is
+
+ task body Impl is
+ begin
+ null;
+ end Impl;
+
+end Sync_Iface_Call_Pkg2;
diff --git a/gcc/testsuite/gnat.dg/sync_iface_call_pkg2.ads b/gcc/testsuite/gnat.dg/sync_iface_call_pkg2.ads
new file mode 100644
index 00000000000..ca21b1d6d08
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/sync_iface_call_pkg2.ads
@@ -0,0 +1,7 @@
+with Sync_Iface_Call_Pkg;
+
+package Sync_Iface_Call_Pkg2 is
+
+ task type Impl is new Sync_Iface_Call_Pkg.IFace with end;
+
+end Sync_Iface_Call_Pkg2;
diff --git a/gcc/testsuite/jit.dg/jit.exp b/gcc/testsuite/jit.dg/jit.exp
index 39e37c2da82..869d9f693a0 100644
--- a/gcc/testsuite/jit.dg/jit.exp
+++ b/gcc/testsuite/jit.dg/jit.exp
@@ -580,6 +580,15 @@ proc jit-dg-test { prog do_what extra_tool_flags } {
verbose "$name is not meant to generate a reproducer"
}
+ # Normally we would return $comp_output and $output_file to the
+ # caller, which would delete $output_file, the generated executable.
+ # If we need to debug, it's handy to be able to suppress this behavior,
+ # keeping the executable around.
+ set preserve_executables [info exists env(PRESERVE_EXECUTABLES)]
+ if $preserve_executables {
+ set output_file ""
+ }
+
return [list $comp_output $output_file]
}
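For anyone wanting to try the escape hatch described in the comment above, setting the variable in the environment of the testsuite run is enough; a minimal sketch, assuming the usual libgccjit check target run from the gcc build directory:

    PRESERVE_EXECUTABLES=1 make check-jit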
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index cb5d1843c92..d8f9b7bd2bb 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -560,7 +560,7 @@ proc gcc-dg-debug-runtest { target_compile trivial opt_opts testcases } {
if ![info exists DEBUG_TORTURE_OPTIONS] {
set DEBUG_TORTURE_OPTIONS ""
- foreach type {-gdwarf-2 -gstabs -gstabs+ -gxcoff -gxcoff+ -gcoff} {
+ foreach type {-gdwarf-2 -gstabs -gstabs+ -gxcoff -gxcoff+} {
set comp_output [$target_compile \
"$srcdir/$subdir/$trivial" "trivial.S" assembly \
"additional_flags=$type"]
diff --git a/gcc/testsuite/lib/gcov.exp b/gcc/testsuite/lib/gcov.exp
index 632d50667a7..ede01e70212 100644
--- a/gcc/testsuite/lib/gcov.exp
+++ b/gcc/testsuite/lib/gcov.exp
@@ -59,7 +59,7 @@ proc verify-lines { testname testcase file } {
while { [gets $fd line] >= 0 } {
# We want to match both "-" and "#####" as count as well as numbers,
# since we want to detect lines that shouldn't be marked as covered.
- if [regexp "^ *(\[^:]*): *(\[0-9\\-#]+):.*count\\((\[0-9\\-#=]+)\\)(.*)" \
+ if [regexp "^ *(\[^:]*): *(\[0-9\\-#]+):.*count\\((\[0-9\\-#=\\.kMGTPEZY]+)\\)(.*)" \
"$line" all is n shouldbe rest] {
if [regexp "^ *{(.*)}" $rest all xfailed] {
switch [dg-process-target $xfailed] {
@@ -108,7 +108,7 @@ proc verify-intermediate { testname testcase file } {
if [regexp "^function:(\[0-9\]+),(\[0-9\]+),.*" $line] {
incr function
}
- if [regexp "^lcount:(\[0-9\]+),(\[0-9\]+)" $line] {
+ if [regexp "^lcount:(\[0-9\]+),(\[0-9\]+),(\[01\])" $line] {
incr lcount
}
if [regexp "^branch:(\[0-9\]+),(taken|nottaken|notexec)" $line] {
diff --git a/gcc/testsuite/lib/gfortran-dg.exp b/gcc/testsuite/lib/gfortran-dg.exp
index 27b2a69b9e2..6f190092f28 100644
--- a/gcc/testsuite/lib/gfortran-dg.exp
+++ b/gcc/testsuite/lib/gfortran-dg.exp
@@ -162,7 +162,7 @@ proc gfortran-dg-debug-runtest { target_compile trivial opt_opts testcases } {
if ![info exists DEBUG_TORTURE_OPTIONS] {
set DEBUG_TORTURE_OPTIONS ""
- set type_list [list "-gstabs" "-gstabs+" "-gxcoff" "-gxcoff+" "-gcoff" "-gdwarf-2" ]
+ set type_list [list "-gstabs" "-gstabs+" "-gxcoff" "-gxcoff+" "-gdwarf-2" ]
foreach type $type_list {
set comp_output [$target_compile \
"$srcdir/$subdir/$trivial" "trivial.S" assembly \
diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index bab23e8e165..a66bb282531 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -231,6 +231,7 @@ proc scan-assembler-times { args } {
set testcase [testname-for-summary]
set pattern [lindex $args 0]
+ set times [lindex $args 1]
set pp_pattern [make_pattern_printable $pattern]
# This must match the rule in gcc-dg.exp.
@@ -239,7 +240,7 @@ proc scan-assembler-times { args } {
set files [glob -nocomplain $output_file]
if { $files == "" } {
verbose -log "$testcase: output file does not exist"
- unresolved "$testcase scan-assembler-times $pp_pattern [lindex $args 1]"
+ unresolved "$testcase scan-assembler-times $pp_pattern $times"
return
}
@@ -247,10 +248,11 @@ proc scan-assembler-times { args } {
set text [read $fd]
close $fd
- if { [llength [regexp -inline -all -- $pattern $text]] == [lindex $args 1]} {
- pass "$testcase scan-assembler-times $pp_pattern [lindex $args 1]"
+ set result_count [llength [regexp -inline -all -- $pattern $text]]
+ if {$result_count == $times} {
+ pass "$testcase scan-assembler-times $pp_pattern $times"
} else {
- fail "$testcase scan-assembler-times $pp_pattern [lindex $args 1]"
+ fail "$testcase scan-assembler-times $pp_pattern $times (found $result_count times)"
}
}
@@ -482,16 +484,16 @@ proc dg-function-on-line { args } {
}
if { [istarget hppa*-*-*] } {
- set pattern [format {\t;[^:]+:%d\n(\t[^\t]+\n)+%s:\n\t.PROC} \
+ set pattern [format {\t;[^:]+:%d(:[0-9]+)?\n(\t[^\t]+\n)+%s:\n\t.PROC} \
$line $symbol]
} elseif { [istarget mips*-*-*] } {
- set pattern [format {\t\.loc [0-9]+ %d 0( [^\n]*)?\n(\t.cfi_startproc[^\t]*\n)*\t\.set\t(no)?mips16\n\t(\.set\t(no)?micromips\n\t)?\.ent\t%s\n\t\.type\t%s, @function\n%s:\n} \
+ set pattern [format {\t\.loc [0-9]+ %d [0-9]+( [^\n]*)?\n(\t.cfi_startproc[^\t]*\n)*\t\.set\t(no)?mips16\n\t(\.set\t(no)?micromips\n\t)?\.ent\t%s\n\t\.type\t%s, @function\n%s:\n} \
$line $symbol $symbol $symbol]
} elseif { [istarget microblaze*-*-*] } {
- set pattern [format {:%d\n\$.*:\n\t\.ent\t%s\n\t\.type\t%s, @function\n%s:\n} \
+ set pattern [format {:%d(:[0-9]+)?\n\$.*:\n\t\.ent\t%s\n\t\.type\t%s, @function\n%s:\n} \
$line $symbol $symbol $symbol]
} else {
- set pattern [format {%s:[^\t]*(\t.(fnstart|frame|mask|file)[^\t]*)*\t[^:]+:%d\n} \
+ set pattern [format {%s:[^\t]*(\t.(fnstart|frame|mask|file)[^\t]*)*\t[^:]+:%d(:[0-9]+)?\n} \
$symbol $line]
}
diff --git a/gcc/testsuite/lib/scandump.exp b/gcc/testsuite/lib/scandump.exp
index 2e6eebfaf33..4a64ac6e05d 100644
--- a/gcc/testsuite/lib/scandump.exp
+++ b/gcc/testsuite/lib/scandump.exp
@@ -86,6 +86,7 @@ proc scan-dump-times { args } {
}
set testcase [testname-for-summary]
+ set times [lindex $args 2]
set suf [dump-suffix [lindex $args 3]]
set printable_pattern [make_pattern_printable [lindex $args 1]]
set testname "$testcase scan-[lindex $args 0]-dump-times $suf \"$printable_pattern\" [lindex $args 2]"
@@ -101,10 +102,11 @@ proc scan-dump-times { args } {
set text [read $fd]
close $fd
- if { [llength [regexp -inline -all -- [lindex $args 1] $text]] == [lindex $args 2]} {
+ set result_count [llength [regexp -inline -all -- [lindex $args 1] $text]]
+ if {$result_count == $times} {
pass "$testname"
} else {
- fail "$testname"
+ fail "$testname (found $result_count times)"
}
}
diff --git a/gcc/testsuite/lib/scanlang.exp b/gcc/testsuite/lib/scanlang.exp
index 796214385c8..729d3069c2a 100644
--- a/gcc/testsuite/lib/scanlang.exp
+++ b/gcc/testsuite/lib/scanlang.exp
@@ -28,11 +28,11 @@ load_lib scandump.exp
proc scan-lang-dump { args } {
if { [llength $args] < 2 } {
- error "scan-tree-dump: too few arguments"
+ error "scan-lang-dump: too few arguments"
return
}
if { [llength $args] > 3 } {
- error "scan-tree-dump: too many arguments"
+ error "scan-lang-dump: too many arguments"
return
}
if { [llength $args] >= 3 } {
diff --git a/gcc/testsuite/lib/target-supports-dg.exp b/gcc/testsuite/lib/target-supports-dg.exp
index d50d8b07ada..6080421fa9e 100644
--- a/gcc/testsuite/lib/target-supports-dg.exp
+++ b/gcc/testsuite/lib/target-supports-dg.exp
@@ -180,6 +180,21 @@ proc dg-require-iconv { args } {
}
}
+# If this target does not have sufficient stack size, skip this test.
+
+proc dg-require-stack-size { args } {
+ if { ![is-effective-target stack_size] } {
+ return
+ }
+
+ set stack_size [dg-effective-target-value stack_size]
+ set required [expr [lindex $args 1]]
+ if { $stack_size < $required } {
+ upvar dg-do-what dg-do-what
+ set dg-do-what [list [lindex ${dg-do-what} 0] "N" "P"]
+ }
+}
+
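A test opts into the new check with the matching directive; the argument is evaluated through expr, so either a literal byte count or a simple expression works (both values below are purely illustrative):

    /* { dg-require-stack-size "2048" } */
    /* { dg-require-stack-size "16 * 1024" } */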
# If this target does not support named sections skip this test.
proc dg-require-named-sections { args } {
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 5fbdb740ac6..b2096723426 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -548,7 +548,8 @@ proc check_effective_target_keeps_null_pointer_checks { } {
if [target_info exists keeps_null_pointer_checks] {
return 1
}
- if { [istarget avr-*-*] } {
+ if { [istarget avr-*-*]
+ || [istarget msp430-*-*] } {
return 1;
}
return 0
@@ -3296,7 +3297,8 @@ proc check_effective_target_vect_peeling_profitable { } {
} else {
set et_vect_peeling_profitable_saved($et_index) 1
if { ([istarget s390*-*-*]
- && [check_effective_target_s390_vx]) } {
+ && [check_effective_target_s390_vx])
+ || [check_effective_target_vect_element_align_preferred] } {
set et_vect_peeling_profitable_saved($et_index) 0
}
}
@@ -3367,12 +3369,8 @@ proc check_effective_target_aarch64_sve { } {
}]
}
-# If targetting AArch64 SVE, return the size in bits of an SVE vector,
-# or -1 if the size is variable. Return 0 if not targetting AArch64 SVE.
+# Return the size in bits of an SVE vector, or 0 if the size is variable.
proc aarch64_sve_bits { } {
- if { ![check_effective_target_aarch64_sve] } {
- return 0
- }
return [check_cached_effective_target aarch64_sve_bits {
global tool
@@ -3388,25 +3386,6 @@ proc aarch64_sve_bits { } {
}]
}
-# Return true if targetting AArch64 SVE and if the target system's
-# vectors have exactly BITS bits.
-proc aarch64_sve_hw_bits { bits } {
- if { ![check_effective_target_aarch64_sve_hw] } {
- return 0
- }
- return [check_runtime aarch64_sve${bits}_hw [subst {
- int
- main (void)
- {
- int res;
- asm volatile ("cntd %0" : "=r" (res));
- if (res * 64 != $bits)
- __builtin_abort ();
- return 0;
- }
- }]]
-}
-
# Return 1 if this is a compiler supporting ARC atomic operations
proc check_effective_target_arc_atomic { } {
return [check_no_compiler_messages arc_atomic assembly {
@@ -4332,18 +4311,45 @@ proc check_effective_target_arm_neon_hw { } {
} [add_options_for_arm_neon ""]]
}
+# Return true if this is an AArch64 target that can run SVE code.
+
proc check_effective_target_aarch64_sve_hw { } {
+ if { ![istarget aarch64*-*-*] } {
+ return 0
+ }
return [check_runtime aarch64_sve_hw_available {
int
main (void)
{
- unsigned long res;
asm volatile ("ptrue p0.b");
return 0;
}
}]
}
+# Return true if this is an AArch64 target that can run SVE code and
+# if its SVE vectors have exactly BITS bits.
+
+proc aarch64_sve_hw_bits { bits } {
+ if { ![check_effective_target_aarch64_sve_hw] } {
+ return 0
+ }
+ return [check_runtime aarch64_sve${bits}_hw [subst {
+ int
+ main (void)
+ {
+ int res;
+ asm volatile ("cntd %0" : "=r" (res));
+ if (res * 64 != $bits)
+ __builtin_abort ();
+ return 0;
+ }
+ }]]
+}
+
+# Return true if this is an AArch64 target that can run SVE code and
+# if its SVE vectors have exactly 256 bits.
+
proc check_effective_target_aarch64_sve256_hw { } {
return [aarch64_sve_hw_bits 256]
}
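For reference, runtime SVE tests would gate themselves on these keywords through the usual target selectors; the directives below are only a sketch of that usage:

    /* { dg-do run { target aarch64_sve_hw } } */
    /* { dg-do run { target aarch64_sve256_hw } } */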
@@ -4470,6 +4476,48 @@ proc check_effective_target_arm_v8_2a_fp16_neon_ok { } {
check_effective_target_arm_v8_2a_fp16_neon_ok_nocache]
}
+# Return 1 if the target supports ARMv8.2 Adv.SIMD Dot Product
+# instructions, 0 otherwise. The test is valid for ARM and for AArch64.
+# Record the command line options needed.
+
+proc check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache { } {
+ global et_arm_v8_2a_dotprod_neon_flags
+ set et_arm_v8_2a_dotprod_neon_flags ""
+
+ if { ![istarget arm*-*-*] && ![istarget aarch64*-*-*] } {
+ return 0;
+ }
+
+ # Iterate through sets of options to find the compiler flags that
+ # need to be added to the -march option.
+ foreach flags {"" "-mfloat-abi=softfp -mfpu=neon-fp-armv8" "-mfloat-abi=hard -mfpu=neon-fp-armv8"} {
+ if { [check_no_compiler_messages_nocache \
+ arm_v8_2a_dotprod_neon_ok object {
+ #if !defined (__ARM_FEATURE_DOTPROD)
+ #error "__ARM_FEATURE_DOTPROD not defined"
+ #endif
+ } "$flags -march=armv8.2-a+dotprod"] } {
+ set et_arm_v8_2a_dotprod_neon_flags "$flags -march=armv8.2-a+dotprod"
+ return 1
+ }
+ }
+
+ return 0;
+}
+
+proc check_effective_target_arm_v8_2a_dotprod_neon_ok { } {
+ return [check_cached_effective_target arm_v8_2a_dotprod_neon_ok \
+ check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache]
+}
+
+proc add_options_for_arm_v8_2a_dotprod_neon { flags } {
+ if { ! [check_effective_target_arm_v8_2a_dotprod_neon_ok] } {
+ return "$flags"
+ }
+ global et_arm_v8_2a_dotprod_neon_flags
+ return "$flags $et_arm_v8_2a_dotprod_neon_flags"
+}
+
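A test using the new Dot Product support would typically pair the _ok check with dg-add-options, which resolves to the add_options_for_ proc above and appends the recorded -march/-mfpu flags; the directives here are illustrative only:

    /* { dg-do compile } */
    /* { dg-require-effective-target arm_v8_2a_dotprod_neon_ok } */
    /* { dg-add-options arm_v8_2a_dotprod_neon } */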
# Return 1 if the target supports executing ARMv8 NEON instructions, 0
# otherwise.
@@ -4607,6 +4655,42 @@ proc check_effective_target_arm_v8_2a_fp16_neon_hw { } {
} [add_options_for_arm_v8_2a_fp16_neon ""]]
}
+# Return 1 if the target supports executing AdvSIMD instructions from ARMv8.2
+# with the Dot Product extension, 0 otherwise. The test is valid for ARM and for
+# AArch64.
+
+proc check_effective_target_arm_v8_2a_dotprod_neon_hw { } {
+ if { ![check_effective_target_arm_v8_2a_dotprod_neon_ok] } {
+ return 0;
+ }
+ return [check_runtime arm_v8_2a_dotprod_neon_hw_available {
+ #include "arm_neon.h"
+ int
+ main (void)
+ {
+
+ uint32x2_t results = {0,0};
+ uint8x8_t a = {1,1,1,1,2,2,2,2};
+ uint8x8_t b = {2,2,2,2,3,3,3,3};
+
+ #ifdef __ARM_ARCH_ISA_A64
+ asm ("udot %0.2s, %1.8b, %2.8b"
+ : "=w"(results)
+ : "w"(a), "w"(b)
+ : /* No clobbers. */);
+
+ #else
+ asm ("vudot.u8 %P0, %P1, %P2"
+ : "=w"(results)
+ : "w"(a), "w"(b)
+ : /* No clobbers. */);
+ #endif
+
+ return (results[0] == 8 && results[1] == 24) ? 1 : 0;
+ }
+ } [add_options_for_arm_v8_2a_dotprod_neon ""]]
+}
+
# Return 1 if this is a ARM target with NEON enabled.
proc check_effective_target_arm_neon { } {
@@ -5544,11 +5628,6 @@ proc check_effective_target_vect_perm { } {
return $et_vect_perm_saved($et_index)
}
-proc check_effective_target_vect_any_perm { } {
- return [expr { [check_effective_target_vect_perm]
- || [istarget aarch64*-*-*] }]
-}
-
# Return 1 if, for some VF:
#
# - the target's default vector size is VF * ELEMENT_BITS bits
@@ -5558,22 +5637,42 @@ proc check_effective_target_vect_any_perm { } {
# int<ELEMENT_BITS>_t s1[COUNT][COUNT * VF], s2[COUNT * VF];
# for (int i = 0; i < COUNT; ++i)
# for (int j = 0; j < COUNT * VF; ++j)
-# s1[i][j] = s2[J - j % COUNT + i % COUNT]
+# s1[i][j] = s2[j - j % COUNT + i]
#
# using only a single 2-vector permute for each vector in s1.
#
# E.g. for COUNT == 3 and vector length 4, the two arrays would be:
#
-# s2 | a0 a1 a2 a3 | b0 b1 b2 b3 | c0 c1 c2 c3
-# ------+-------------+-------------+------------
-# s1[0] | a0 a0 a0 a3 | a3 a3 b2 b2 | b2 c1 c1 c1
-# s1[1] | a1 a1 a1 b0 | b0 b0 b3 b3 | b3 c2 c2 c3
-# s1[2] | a2 a2 a2 b1 | b1 b1 c0 c0 | c0 c3 c3 c3
+# s2 | a0 a1 a2 a3 | b0 b1 b2 b3 | c0 c1 c2 c3
+# ------+-------------+-------------+------------
+# s1[0] | a0 a0 a0 a3 | a3 a3 b2 b2 | b2 c1 c1 c1
+# s1[1] | a1 a1 a1 b0 | b0 b0 b3 b3 | b3 c2 c2 c2
+# s1[2] | a2 a2 a2 b1 | b1 b1 c0 c0 | c0 c3 c3 c3
#
# Each s1 permute requires only two of a, b and c.
#
-# In general, this is possible for a VF if VF <= COUNT or if
-# (VF - gcd (VF, COUNT)) is a multiple of COUNT.
+# The distance between the start of vector n in s1[0] and the start
+# of vector n in s2 is:
+#
+# A = (n * VF) % COUNT
+#
+# The corresponding value for the end of vector n is:
+#
+# B = (n * VF + VF - 1) % COUNT
+#
+# Subtracting i from each value gives the corresponding difference
+# for s1[i]. The condition being tested by this function is false
+# iff A - i > 0 and B - i < 0 for some i and n, such that the first
+# element for s1[i] comes from vector n - 1 of s2 and the last element
+# comes from vector n + 1 of s2. The condition is therefore true iff
+# A <= B for all n. This is turn means the condition is true iff:
+#
+# (n * VF) % COUNT + (VF - 1) % COUNT < COUNT
+#
+# for all n. COUNT - (n * VF) % COUNT is bounded by gcd (VF, COUNT),
+# and will be that value for at least one n in [0, COUNT), so we want:
+#
+# (VF - 1) % COUNT < gcd (VF, COUNT)
proc vect_perm_supported { count element_bits } {
set vector_bits [lindex [available_vector_sizes] 0]
@@ -5581,10 +5680,16 @@ proc vect_perm_supported { count element_bits } {
return 0
}
set vf [expr { $vector_bits / $element_bits }]
- # Since VF is a power of 2, gcd (VF, COUNT) == (COUNT & -COUNT)
- # when COUNT < VF.
- return [expr { $vf <= $count
- || $vf % $count == ($count & -$count) }]
+
+ # Compute gcd (VF, COUNT).
+ set gcd $vf
+ set temp1 $count
+ while { $temp1 > 0 } {
+ set temp2 [expr { $gcd % $temp1 }]
+ set gcd $temp1
+ set temp1 $temp2
+ }
+ return [expr { ($vf - 1) % $count < $gcd }]
}
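To see the new condition with concrete numbers, here is a tiny standalone Tcl sketch (the VF/COUNT pairs are made up, not read from any target): for VF = 4, COUNT = 3 the gcd is 1 and (4 - 1) % 3 = 0 < 1, so the permutation is supported, while for VF = 8, COUNT = 3 we get (8 - 1) % 3 = 1, which is not < 1:

    # Evaluate the (VF - 1) % COUNT < gcd (VF, COUNT) test for two made-up pairs.
    foreach {vf count} {4 3 8 3} {
        set gcd $vf
        set temp1 $count
        while { $temp1 > 0 } {
            set temp2 [expr { $gcd % $temp1 }]
            set gcd $temp1
            set temp1 $temp2
        }
        puts "VF=$vf COUNT=$count supported=[expr { ($vf - 1) % $count < $gcd }]"
    }
    # Prints supported=1 for the first pair and supported=0 for the second.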
# Return 1 if the target supports SLP permutation of 3 vectors when each
@@ -6005,6 +6110,8 @@ proc check_effective_target_vect_sdot_qi { } {
} else {
set et_vect_sdot_qi_saved($et_index) 0
if { [istarget ia64-*-*]
+ || [istarget aarch64*-*-*]
+ || [istarget arm*-*-*]
|| ([istarget mips*-*-*]
&& [et-is-effective-target mips_msa]) } {
set et_vect_udot_qi_saved 1
@@ -6029,6 +6136,8 @@ proc check_effective_target_vect_udot_qi { } {
} else {
set et_vect_udot_qi_saved($et_index) 0
if { [istarget powerpc*-*-*]
+ || [istarget aarch64*-*-*]
+ || [istarget arm*-*-*]
|| [istarget ia64-*-*]
|| ([istarget mips*-*-*]
&& [et-is-effective-target mips_msa]) } {
@@ -6354,9 +6463,8 @@ proc check_effective_target_vect_align_stack_vars { } {
proc check_effective_target_vector_alignment_reachable { } {
set et_vector_alignment_reachable 0
- if { ![check_effective_target_vect_element_align_preferred]
- && ([check_effective_target_vect_aligned_arrays]
- || [check_effective_target_natural_alignment_32]) } {
+ if { [check_effective_target_vect_aligned_arrays]
+ || [check_effective_target_natural_alignment_32] } {
set et_vector_alignment_reachable 1
}
verbose "check_effective_target_vector_alignment_reachable:\
@@ -6368,9 +6476,8 @@ proc check_effective_target_vector_alignment_reachable { } {
proc check_effective_target_vector_alignment_reachable_for_64bit { } {
set et_vector_alignment_reachable_for_64bit 0
- if { ![check_effective_target_vect_element_align_preferred]
- && ([check_effective_target_vect_aligned_arrays]
- || [check_effective_target_natural_alignment_64]) } {
+ if { [check_effective_target_vect_aligned_arrays]
+ || [check_effective_target_natural_alignment_64] } {
set et_vector_alignment_reachable_for_64bit 1
}
verbose "check_effective_target_vector_alignment_reachable_for_64bit:\
@@ -6407,7 +6514,7 @@ proc check_effective_target_vect_element_align { } {
proc check_effective_target_vect_unaligned_possible { } {
return [expr { ![check_effective_target_vect_element_align_preferred]
&& (![check_effective_target_vect_no_align]
- || [check_effective_target vect_hw_misalign]) }]
+ || [check_effective_target_vect_hw_misalign]) }]
}
# Return 1 if the target supports vector LOAD_LANES operations, 0 otherwise.
@@ -6697,19 +6804,6 @@ foreach N {2 3 4 8} {
# Return the list of vector sizes (in bits) that each target supports.
# A vector length of "0" indicates variable-length vectors.
-proc check_effective_target_vect_multiple_sizes { } {
- global et_vect_multiple_sizes_saved
- global et_index
-
- set et_vect_multiple_sizes_saved($et_index) 0
- if { [istarget aarch64*-*-*]
- || [is-effective-target arm_neon]
- || (([istarget i?86-*-*] || [istarget x86_64-*-*])
- && ([check_avx_available] && ![check_prefer_avx128])) } {
- set et_vect_multiple_sizes_saved($et_index) 1
- }
-}
-
proc available_vector_sizes { } {
set result {}
if { [istarget aarch64*-*-*] } {
@@ -6732,28 +6826,22 @@ proc available_vector_sizes { } {
return $result
}
-# Return true if variable-length vectors are supported.
-
-proc check_effective_target_vect_variable_length { } {
- return [expr { [lindex [available_vector_sizes] 0] == 0 }]
-}
-
-# Return true if exactly 3 distinct vector sizes are supported.
+# Return 1 if the target supports multiple vector sizes
-proc check_effective_target_vect_3_sizes { } {
- return [expr { [llength [available_vector_sizes]] == 3 }]
+proc check_effective_target_vect_multiple_sizes { } {
+ return [expr { [llength [available_vector_sizes]] > 1 }]
}
-# Return true if exactly 2 distinct vector sizes are supported.
+# Return true if variable-length vectors are supported.
-proc check_effective_target_vect_2_sizes { } {
- return [expr { [llength [available_vector_sizes]] == 2 }]
+proc check_effective_target_vect_variable_length { } {
+ return [expr { [lindex [available_vector_sizes] 0] == 0 }]
}
-# Return true if exactly 1 distinct vector size is supported.
+# Return 1 if the target supports vectors of 64 bits.
-proc check_effective_target_vect_1_size { } {
- return [expr { [llength [available_vector_sizes]] == 1 }]
+proc check_effective_target_vect64 { } {
+ return [expr { [lsearch -exact [available_vector_sizes] 64] >= 0 }]
}
# Return 1 if the target supports vectors of 256 bits.
@@ -6762,12 +6850,6 @@ proc check_effective_target_vect256 { } {
return [expr { [lsearch -exact [available_vector_sizes] 256] >= 0 }]
}
-# Return 1 if the target supports vectors of 64 bits.
-
-proc check_effective_target_vect64 { } {
- return [expr { [lsearch -exact [available_vector_sizes] 64] >= 0 }]
-}
-
# Return 1 if the target supports vector copysignf calls.
proc check_effective_target_vect_call_copysignf { } {
@@ -8552,7 +8634,7 @@ proc check_effective_target_aarch64_tiny { } {
# Create functions to check that the AArch64 assembler supports the
# various architecture extensions via the .arch_extension pseudo-op.
-foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse"} {
+foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod"} {
eval [string map [list FUNC $aarch64_ext] {
proc check_effective_target_aarch64_asm_FUNC_ok { } {
if { [istarget aarch64*-*-*] } {
@@ -9111,14 +9193,9 @@ proc check_effective_target_autoincdec { } {
#
proc check_effective_target_supports_stack_clash_protection { } {
- # Temporary until the target bits are fully ACK'd.
-# if { [istarget aarch*-*-*] } {
-# return 1
-# }
-
if { [istarget x86_64-*-*] || [istarget i?86-*-*]
|| [istarget powerpc*-*-*] || [istarget rs6000*-*-*]
- || [istarget s390*-*-*] } {
+ || [istarget aarch64*-**] || [istarget s390*-*-*] } {
return 1
}
return 0
@@ -9185,3 +9262,16 @@ proc check_effective_target_callee_realigns_stack { } {
}
return 0
}
+
+# Return 1 if CET instructions can be compiled.
+proc check_effective_target_cet { } {
+ if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
+ return 0
+ }
+ return [check_no_compiler_messages cet object {
+ void foo (void)
+ {
+ asm ("setssbsy");
+ }
+ } "-O2" ]
+}
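Usage-wise, cet behaves like any other effective-target keyword, so a test can simply select on it; whether -fcf-protection also belongs in dg-options depends on the individual test, so treat these lines as a sketch:

    /* { dg-do compile { target cet } } */
    /* { dg-options "-O2 -fcf-protection" } */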
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 48d580c3ab0..e5292d4b314 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -88,8 +88,6 @@ along with GCC; see the file COPYING3. If not see
#include "dbxout.h"
#endif
-#include "sdbout.h"
-
#ifdef XCOFF_DEBUGGING_INFO
#include "xcoffout.h" /* Needed for external data declarations. */
#endif
@@ -958,7 +956,7 @@ output_stack_usage (void)
stack_usage_kind = STATIC;
/* Add the maximum amount of space pushed onto the stack. */
- if (maybe_nonzero (current_function_pushed_stack_size))
+ if (may_ne (current_function_pushed_stack_size, 0))
{
HOST_WIDE_INT extra;
if (current_function_pushed_stack_size.is_constant (&extra))
@@ -1290,6 +1288,32 @@ process_options (void)
"-floop-parallelize-all)");
#endif
+ if (flag_cf_protection != CF_NONE
+ && !(flag_cf_protection & CF_SET))
+ {
+ if (flag_cf_protection == CF_FULL)
+ {
+ error_at (UNKNOWN_LOCATION,
+ "%<-fcf-protection=full%> is not supported for this "
+ "target");
+ flag_cf_protection = CF_NONE;
+ }
+ if (flag_cf_protection == CF_BRANCH)
+ {
+ error_at (UNKNOWN_LOCATION,
+ "%<-fcf-protection=branch%> is not supported for this "
+ "target");
+ flag_cf_protection = CF_NONE;
+ }
+ if (flag_cf_protection == CF_RETURN)
+ {
+ error_at (UNKNOWN_LOCATION,
+ "%<-fcf-protection=return%> is not supported for this "
+ "target");
+ flag_cf_protection = CF_NONE;
+ }
+ }
+
if (flag_check_pointer_bounds)
{
if (targetm.chkp_bound_mode () == VOIDmode)
@@ -1454,8 +1478,6 @@ process_options (void)
else if (write_symbols == XCOFF_DEBUG)
debug_hooks = &xcoff_debug_hooks;
#endif
- else if (SDB_DEBUGGING_INFO && write_symbols == SDB_DEBUG)
- debug_hooks = &sdb_debug_hooks;
#ifdef DWARF2_DEBUGGING_INFO
else if (write_symbols == DWARF2_DEBUG)
debug_hooks = &dwarf2_debug_hooks;
diff --git a/gcc/tracer.c b/gcc/tracer.c
index dd071c1650c..58caf13b0de 100644
--- a/gcc/tracer.c
+++ b/gcc/tracer.c
@@ -132,9 +132,9 @@ count_insns (basic_block bb)
static bool
better_p (const_edge e1, const_edge e2)
{
- if (e1->count.initialized_p () && e2->count.initialized_p ()
- && !(e1->count == e2->count))
- return e1->count > e2->count;
+ if (e1->count ().initialized_p () && e2->count ().initialized_p ()
+ && ((e1->count () > e2->count ()) || (e1->count () < e2->count ())))
+ return e1->count () > e2->count ();
if (EDGE_FREQUENCY (e1) != EDGE_FREQUENCY (e2))
return EDGE_FREQUENCY (e1) > EDGE_FREQUENCY (e2);
/* This is needed to avoid changes in the decision after
@@ -179,7 +179,7 @@ find_best_predecessor (basic_block bb)
if (!best || ignore_bb_p (best->src))
return NULL;
if (EDGE_FREQUENCY (best) * REG_BR_PROB_BASE
- < bb->frequency * branch_ratio_cutoff)
+ < bb->count.to_frequency (cfun) * branch_ratio_cutoff)
return NULL;
return best;
}
@@ -194,7 +194,7 @@ find_trace (basic_block bb, basic_block *trace)
edge e;
if (dump_file)
- fprintf (dump_file, "Trace seed %i [%i]", bb->index, bb->frequency);
+ fprintf (dump_file, "Trace seed %i [%i]", bb->index, bb->count.to_frequency (cfun));
while ((e = find_best_predecessor (bb)) != NULL)
{
@@ -203,11 +203,11 @@ find_trace (basic_block bb, basic_block *trace)
|| find_best_successor (bb2) != e)
break;
if (dump_file)
- fprintf (dump_file, ",%i [%i]", bb->index, bb->frequency);
+ fprintf (dump_file, ",%i [%i]", bb->index, bb->count.to_frequency (cfun));
bb = bb2;
}
if (dump_file)
- fprintf (dump_file, " forward %i [%i]", bb->index, bb->frequency);
+ fprintf (dump_file, " forward %i [%i]", bb->index, bb->count.to_frequency (cfun));
trace[i++] = bb;
/* Follow the trace in forward direction. */
@@ -218,7 +218,7 @@ find_trace (basic_block bb, basic_block *trace)
|| find_best_predecessor (bb) != e)
break;
if (dump_file)
- fprintf (dump_file, ",%i [%i]", bb->index, bb->frequency);
+ fprintf (dump_file, ",%i [%i]", bb->index, bb->count.to_frequency (cfun));
trace[i++] = bb;
}
if (dump_file)
@@ -282,11 +282,11 @@ tail_duplicate (void)
{
int n = count_insns (bb);
if (!ignore_bb_p (bb))
- blocks[bb->index] = heap.insert (-bb->frequency, bb);
+ blocks[bb->index] = heap.insert (-bb->count.to_frequency (cfun), bb);
counts [bb->index] = n;
ninsns += n;
- weighted_insns += n * bb->frequency;
+ weighted_insns += n * bb->count.to_frequency (cfun);
}
if (profile_info && profile_status_for_fn (cfun) == PROFILE_READ)
@@ -314,7 +314,7 @@ tail_duplicate (void)
n = find_trace (bb, trace);
bb = trace[0];
- traced_insns += bb->frequency * counts [bb->index];
+ traced_insns += bb->count.to_frequency (cfun) * counts [bb->index];
if (blocks[bb->index])
{
heap.delete_node (blocks[bb->index]);
@@ -330,7 +330,7 @@ tail_duplicate (void)
heap.delete_node (blocks[bb2->index]);
blocks[bb2->index] = NULL;
}
- traced_insns += bb2->frequency * counts [bb2->index];
+ traced_insns += bb2->count.to_frequency (cfun) * counts [bb2->index];
if (EDGE_COUNT (bb2->preds) > 1
&& can_duplicate_block_p (bb2)
/* We have the tendency to duplicate the loop header
@@ -345,11 +345,11 @@ tail_duplicate (void)
/* Reconsider the original copy of block we've duplicated.
Removing the most common predecessor may make it to be
head. */
- blocks[bb2->index] = heap.insert (-bb2->frequency, bb2);
+ blocks[bb2->index] = heap.insert (-bb2->count.to_frequency (cfun), bb2);
if (dump_file)
fprintf (dump_file, "Duplicated %i as %i [%i]\n",
- bb2->index, copy->index, copy->frequency);
+ bb2->index, copy->index, copy->count.to_frequency (cfun));
bb2 = copy;
changed = true;
diff --git a/gcc/trans-mem.c b/gcc/trans-mem.c
index 40b53681186..ef5655aa61a 100644
--- a/gcc/trans-mem.c
+++ b/gcc/trans-mem.c
@@ -2932,17 +2932,13 @@ expand_transaction (struct tm_region *region, void *data ATTRIBUTE_UNUSED)
edge ef = make_edge (test_bb, join_bb, EDGE_FALSE_VALUE);
redirect_edge_pred (fallthru_edge, join_bb);
- join_bb->frequency = test_bb->frequency = transaction_bb->frequency;
join_bb->count = test_bb->count = transaction_bb->count;
ei->probability = profile_probability::always ();
et->probability = profile_probability::likely ();
ef->probability = profile_probability::unlikely ();
- et->count = test_bb->count.apply_probability (et->probability);
- ef->count = test_bb->count.apply_probability (ef->probability);
- code_bb->count = et->count;
- code_bb->frequency = EDGE_FREQUENCY (et);
+ code_bb->count = et->count ();
transaction_bb = join_bb;
}
@@ -2966,7 +2962,6 @@ expand_transaction (struct tm_region *region, void *data ATTRIBUTE_UNUSED)
gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
edge ei = make_edge (transaction_bb, test_bb, EDGE_FALLTHRU);
- test_bb->frequency = transaction_bb->frequency;
test_bb->count = transaction_bb->count;
ei->probability = profile_probability::always ();
@@ -2975,15 +2970,11 @@ expand_transaction (struct tm_region *region, void *data ATTRIBUTE_UNUSED)
redirect_edge_pred (fallthru_edge, test_bb);
fallthru_edge->flags = EDGE_FALSE_VALUE;
fallthru_edge->probability = profile_probability::very_likely ();
- fallthru_edge->count = test_bb->count.apply_probability
- (fallthru_edge->probability);
// Abort/over edge.
redirect_edge_pred (abort_edge, test_bb);
abort_edge->flags = EDGE_TRUE_VALUE;
abort_edge->probability = profile_probability::unlikely ();
- abort_edge->count = test_bb->count.apply_probability
- (abort_edge->probability);
transaction_bb = test_bb;
}
@@ -3011,8 +3002,7 @@ expand_transaction (struct tm_region *region, void *data ATTRIBUTE_UNUSED)
// out of the fallthru edge.
edge e = make_edge (transaction_bb, test_bb, fallthru_edge->flags);
e->probability = fallthru_edge->probability;
- test_bb->count = e->count = fallthru_edge->count;
- test_bb->frequency = EDGE_FREQUENCY (e);
+ test_bb->count = fallthru_edge->count ();
// Now update the edges to the inst/uninist implementations.
// For now assume that the paths are equally likely. When using HTM,
@@ -3022,14 +3012,10 @@ expand_transaction (struct tm_region *region, void *data ATTRIBUTE_UNUSED)
redirect_edge_pred (inst_edge, test_bb);
inst_edge->flags = EDGE_FALSE_VALUE;
inst_edge->probability = profile_probability::even ();
- inst_edge->count
- = test_bb->count.apply_probability (inst_edge->probability);
redirect_edge_pred (uninst_edge, test_bb);
uninst_edge->flags = EDGE_TRUE_VALUE;
uninst_edge->probability = profile_probability::even ();
- uninst_edge->count
- = test_bb->count.apply_probability (uninst_edge->probability);
}
// If we have no previous special cases, and we have PHIs at the beginning
@@ -3214,10 +3200,7 @@ split_bb_make_tm_edge (gimple *stmt, basic_block dest_bb,
}
edge e = make_edge (bb, dest_bb, EDGE_ABNORMAL);
if (e)
- {
- e->probability = profile_probability::guessed_never ();
- e->count = profile_count::guessed_zero ();
- }
+ e->probability = profile_probability::guessed_never ();
// Record the need for the edge for the benefit of the rtl passes.
if (cfun->gimple_df->tm_restart == NULL)
diff --git a/gcc/tree-affine.c b/gcc/tree-affine.c
index 092b1e017af..5e7aef1113a 100644
--- a/gcc/tree-affine.c
+++ b/gcc/tree-affine.c
@@ -815,16 +815,16 @@ wide_int_constant_multiple_p (const poly_widest_int &val,
{
poly_widest_int rem, cst;
- if (known_zero (val))
+ if (must_eq (val, 0))
{
- if (*mult_set && maybe_nonzero (*mult))
+ if (*mult_set && may_ne (*mult, 0))
return false;
*mult_set = true;
*mult = 0;
return true;
}
- if (maybe_zero (div))
+ if (may_eq (div, 0))
return false;
if (!multiple_p (val, div, &cst))
@@ -848,7 +848,7 @@ aff_combination_constant_multiple_p (aff_tree *val, aff_tree *div,
bool mult_set = false;
unsigned i;
- if (val->n == 0 && known_zero (val->offset))
+ if (val->n == 0 && must_eq (val->offset, 0))
{
*mult = 0;
return true;
diff --git a/gcc/tree-affine.h b/gcc/tree-affine.h
index c08d4e5fc6b..0acf47410a7 100644
--- a/gcc/tree-affine.h
+++ b/gcc/tree-affine.h
@@ -102,7 +102,7 @@ aff_combination_zero_p (aff_tree *aff)
if (!aff)
return true;
- if (aff->n == 0 && known_zero (aff->offset))
+ if (aff->n == 0 && must_eq (aff->offset, 0))
return true;
return false;
@@ -121,7 +121,7 @@ inline bool
aff_combination_singleton_var_p (aff_tree *aff)
{
return (aff->n == 1
- && known_zero (aff->offset)
+ && must_eq (aff->offset, 0)
&& (aff->elts[0].coef == 1 || aff->elts[0].coef == -1));
}
#endif /* GCC_TREE_AFFINE_H */
diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c
index 1578350c0c6..02c89cce62f 100644
--- a/gcc/tree-call-cdce.c
+++ b/gcc/tree-call-cdce.c
@@ -314,6 +314,7 @@ can_test_argument_range (gcall *call)
CASE_FLT_FN (BUILT_IN_POW10):
/* Sqrt. */
CASE_FLT_FN (BUILT_IN_SQRT):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_SQRT):
return check_builtin_call (call);
/* Special one: two argument pow. */
case BUILT_IN_POW:
@@ -342,6 +343,7 @@ edom_only_function (gcall *call)
CASE_FLT_FN (BUILT_IN_SIGNIFICAND):
CASE_FLT_FN (BUILT_IN_SIN):
CASE_FLT_FN (BUILT_IN_SQRT):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_SQRT):
CASE_FLT_FN (BUILT_IN_FMOD):
CASE_FLT_FN (BUILT_IN_REMAINDER):
return true;
@@ -703,6 +705,7 @@ get_no_error_domain (enum built_in_function fnc)
308, true, false);
/* sqrt: [0, +inf) */
CASE_FLT_FN (BUILT_IN_SQRT):
+ CASE_FLT_FN_FLOATN_NX (BUILT_IN_SQRT):
return get_domain (0, true, true,
0, false, false);
default:
@@ -903,7 +906,6 @@ shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds,
Here we take the second approach because it's slightly simpler
and because it's easy to see that it doesn't lose profile counts. */
bi_call_bb->count = profile_count::zero ();
- bi_call_bb->frequency = 0;
while (!edges.is_empty ())
{
edge_pair e = edges.pop ();
@@ -913,23 +915,13 @@ shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds,
gcc_assert (src_bb == nocall_edge->src);
call_edge->probability = profile_probability::very_unlikely ();
- call_edge->count
- = src_bb->count.apply_probability (call_edge->probability);
nocall_edge->probability = profile_probability::always ()
- call_edge->probability;
- nocall_edge->count = src_bb->count - call_edge->count;
- unsigned int call_frequency
- = call_edge->probability.apply (src_bb->frequency);
-
- bi_call_bb->count += call_edge->count;
- bi_call_bb->frequency += call_frequency;
+ bi_call_bb->count += call_edge->count ();
if (nocall_edge->dest != join_tgt_bb)
- {
- nocall_edge->dest->count = nocall_edge->count;
- nocall_edge->dest->frequency = src_bb->frequency - call_frequency;
- }
+ nocall_edge->dest->count = src_bb->count - bi_call_bb->count;
}
if (dom_info_available_p (CDI_DOMINATORS))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 53978fbafa1..105e5a1dde7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -1062,8 +1062,8 @@ gimple_find_sub_bbs (gimple_seq seq, gimple_stmt_iterator *gsi)
edge_iterator ei;
FOR_EACH_EDGE (e, ei, bb->preds)
{
- if (e->count.initialized_p ())
- cnt += e->count;
+ if (e->count ().initialized_p ())
+ cnt += e->count ();
else
all = false;
freq += EDGE_FREQUENCY (e);
@@ -1071,9 +1071,6 @@ gimple_find_sub_bbs (gimple_seq seq, gimple_stmt_iterator *gsi)
tree_guess_outgoing_edge_probabilities (bb);
if (all || profile_status_for_fn (cfun) == PROFILE_READ)
bb->count = cnt;
- bb->frequency = freq;
- FOR_EACH_EDGE (e, ei, bb->succs)
- e->count = bb->count.apply_probability (e->probability);
bb = bb->next_bb;
}
@@ -2083,7 +2080,6 @@ gimple_merge_blocks (basic_block a, basic_block b)
if (a->loop_father == b->loop_father)
{
a->count = a->count.merge (b->count);
- a->frequency = MAX (a->frequency, b->frequency);
}
/* Merge the sequences. */
@@ -2842,8 +2838,7 @@ gimple_split_edge (edge edge_in)
after_bb = split_edge_bb_loc (edge_in);
new_bb = create_empty_bb (after_bb);
- new_bb->frequency = EDGE_FREQUENCY (edge_in);
- new_bb->count = edge_in->count;
+ new_bb->count = edge_in->count ();
e = redirect_edge_and_branch (edge_in, new_bb);
gcc_assert (e == edge_in);
@@ -6342,9 +6337,8 @@ gimple_duplicate_sese_region (edge entry, edge exit,
bool free_region_copy = false, copying_header = false;
struct loop *loop = entry->dest->loop_father;
edge exit_copy;
- vec<basic_block> doms;
+ vec<basic_block> doms = vNULL;
edge redirected;
- int total_freq = 0, entry_freq = 0;
profile_count total_count = profile_count::uninitialized ();
profile_count entry_count = profile_count::uninitialized ();
@@ -6406,27 +6400,16 @@ gimple_duplicate_sese_region (edge entry, edge exit,
if (entry->dest->count.initialized_p ())
{
total_count = entry->dest->count;
- entry_count = entry->count;
+ entry_count = entry->count ();
/* Fix up corner cases, to avoid division by zero or creation of negative
frequencies. */
if (entry_count > total_count)
entry_count = total_count;
}
- if (!(total_count > 0) || !(entry_count > 0))
- {
- total_freq = entry->dest->frequency;
- entry_freq = EDGE_FREQUENCY (entry);
- /* Fix up corner cases, to avoid division by zero or creation of negative
- frequencies. */
- if (total_freq == 0)
- total_freq = 1;
- else if (entry_freq > total_freq)
- entry_freq = total_freq;
- }
copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop,
split_edge_bb_loc (entry), update_dominance);
- if (total_count > 0 && entry_count > 0)
+ if (total_count.initialized_p () && entry_count.initialized_p ())
{
scale_bbs_frequencies_profile_count (region, n_region,
total_count - entry_count,
@@ -6434,12 +6417,6 @@ gimple_duplicate_sese_region (edge entry, edge exit,
scale_bbs_frequencies_profile_count (region_copy, n_region, entry_count,
total_count);
}
- else
- {
- scale_bbs_frequencies_int (region, n_region, total_freq - entry_freq,
- total_freq);
- scale_bbs_frequencies_int (region_copy, n_region, entry_freq, total_freq);
- }
if (copying_header)
{
@@ -6528,7 +6505,6 @@ gimple_duplicate_sese_tail (edge entry, edge exit,
struct loop *orig_loop = entry->dest->loop_father;
basic_block switch_bb, entry_bb, nentry_bb;
vec<basic_block> doms;
- int total_freq = 0, exit_freq = 0;
profile_count total_count = profile_count::uninitialized (),
exit_count = profile_count::uninitialized ();
edge exits[2], nexits[2], e;
@@ -6573,30 +6549,16 @@ gimple_duplicate_sese_tail (edge entry, edge exit,
inside. */
doms = get_dominated_by_region (CDI_DOMINATORS, region, n_region);
- if (exit->src->count > 0)
- {
- total_count = exit->src->count;
- exit_count = exit->count;
- /* Fix up corner cases, to avoid division by zero or creation of negative
- frequencies. */
- if (exit_count > total_count)
- exit_count = total_count;
- }
- else
- {
- total_freq = exit->src->frequency;
- exit_freq = EDGE_FREQUENCY (exit);
- /* Fix up corner cases, to avoid division by zero or creation of negative
- frequencies. */
- if (total_freq == 0)
- total_freq = 1;
- if (exit_freq > total_freq)
- exit_freq = total_freq;
- }
+ total_count = exit->src->count;
+ exit_count = exit->count ();
+ /* Fix up corner cases, to avoid division by zero or creation of negative
+ frequencies. */
+ if (exit_count > total_count)
+ exit_count = total_count;
copy_bbs (region, n_region, region_copy, exits, 2, nexits, orig_loop,
split_edge_bb_loc (exit), true);
- if (total_count.initialized_p ())
+ if (total_count.initialized_p () && exit_count.initialized_p ())
{
scale_bbs_frequencies_profile_count (region, n_region,
total_count - exit_count,
@@ -6604,12 +6566,6 @@ gimple_duplicate_sese_tail (edge entry, edge exit,
scale_bbs_frequencies_profile_count (region_copy, n_region, exit_count,
total_count);
}
- else
- {
- scale_bbs_frequencies_int (region, n_region, total_freq - exit_freq,
- total_freq);
- scale_bbs_frequencies_int (region_copy, n_region, exit_freq, total_freq);
- }
/* Create the switch block, and put the exit condition to it. */
entry_bb = entry->dest;
@@ -6631,10 +6587,8 @@ gimple_duplicate_sese_tail (edge entry, edge exit,
sorig = single_succ_edge (switch_bb);
sorig->flags = exits[1]->flags;
sorig->probability = exits[1]->probability;
- sorig->count = exits[1]->count;
snew = make_edge (switch_bb, nentry_bb, exits[0]->flags);
snew->probability = exits[0]->probability;
- snew->count = exits[1]->count;
/* Register the new edge from SWITCH_BB in loop exit lists. */
@@ -7652,9 +7606,15 @@ move_sese_region_to_fn (struct function *dest_cfun, basic_block entry_bb,
FIXME, this is silly. The CFG ought to become a parameter to
these helpers. */
push_cfun (dest_cfun);
- make_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun), entry_bb, EDGE_FALLTHRU);
+ ENTRY_BLOCK_PTR_FOR_FN (cfun)->count = entry_bb->count;
+ make_single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun), entry_bb, EDGE_FALLTHRU);
if (exit_bb)
- make_edge (exit_bb, EXIT_BLOCK_PTR_FOR_FN (cfun), 0);
+ {
+ make_single_succ_edge (exit_bb, EXIT_BLOCK_PTR_FOR_FN (cfun), 0);
+ EXIT_BLOCK_PTR_FOR_FN (cfun)->count = exit_bb->count;
+ }
+ else
+ EXIT_BLOCK_PTR_FOR_FN (cfun)->count = profile_count::zero ();
pop_cfun ();
/* Back in the original function, the SESE region has disappeared,
@@ -8369,7 +8329,6 @@ gimple_flow_call_edges_add (sbitmap blocks)
}
e = make_edge (bb, EXIT_BLOCK_PTR_FOR_FN (cfun), EDGE_FAKE);
e->probability = profile_probability::guessed_never ();
- e->count = profile_count::guessed_zero ();
}
gsi_prev (&gsi);
}
@@ -8730,7 +8689,7 @@ gimple_account_profile_record (basic_block bb, int after_pass,
else if (profile_status_for_fn (cfun) == PROFILE_GUESSED)
record->time[after_pass]
+= estimate_num_insns (gsi_stmt (i),
- &eni_time_weights) * bb->frequency;
+ &eni_time_weights) * bb->count.to_frequency (cfun);
}
}
@@ -8881,14 +8840,11 @@ insert_cond_bb (basic_block bb, gimple *stmt, gimple *cond,
new_bb = create_empty_bb (bb);
edge e = make_edge (bb, new_bb, EDGE_TRUE_VALUE);
e->probability = prob;
- e->count = bb->count.apply_probability (prob);
- new_bb->count = e->count;
- new_bb->frequency = prob.apply (bb->frequency);
+ new_bb->count = e->count ();
make_single_succ_edge (new_bb, fall->dest, EDGE_FALLTHRU);
/* Fix edge for split bb. */
fall->flags = EDGE_FALSE_VALUE;
- fall->count -= e->count;
fall->probability -= e->probability;
/* Update dominance info. */
@@ -9118,13 +9074,29 @@ pass_warn_function_return::execute (function *fun)
&& EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (fun)->preds) > 0)
{
location = UNKNOWN_LOCATION;
- FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (fun)->preds)
+ for (ei = ei_start (EXIT_BLOCK_PTR_FOR_FN (fun)->preds);
+ (e = ei_safe_edge (ei)); )
{
last = last_stmt (e->src);
if ((gimple_code (last) == GIMPLE_RETURN
|| gimple_call_builtin_p (last, BUILT_IN_RETURN))
- && (location = gimple_location (last)) != UNKNOWN_LOCATION)
+ && location == UNKNOWN_LOCATION
+ && (location = gimple_location (last)) != UNKNOWN_LOCATION
+ && !optimize)
break;
+ /* When optimizing, replace return stmts in noreturn functions
+ with __builtin_unreachable () call. */
+ if (optimize && gimple_code (last) == GIMPLE_RETURN)
+ {
+ tree fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
+ gimple *new_stmt = gimple_build_call (fndecl, 0);
+ gimple_set_location (new_stmt, gimple_location (last));
+ gimple_stmt_iterator gsi = gsi_for_stmt (last);
+ gsi_replace (&gsi, new_stmt, true);
+ remove_edge (e);
+ }
+ else
+ ei_next (&ei);
}
if (location == UNKNOWN_LOCATION)
location = cfun->function_end_locus;
@@ -9286,23 +9258,18 @@ execute_fixup_cfg (void)
basic_block bb;
gimple_stmt_iterator gsi;
int todo = 0;
- edge e;
- edge_iterator ei;
cgraph_node *node = cgraph_node::get (current_function_decl);
profile_count num = node->count;
profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
- bool scale = num.initialized_p ()
- && (den > 0 || num == profile_count::zero ())
- && !(num == den);
+ bool scale = num.initialized_p () && den.ipa_p ()
+ && (den.nonzero_p () || num == profile_count::zero ())
+ && !(num == den.ipa ());
if (scale)
{
ENTRY_BLOCK_PTR_FOR_FN (cfun)->count = node->count;
EXIT_BLOCK_PTR_FOR_FN (cfun)->count
= EXIT_BLOCK_PTR_FOR_FN (cfun)->count.apply_scale (num, den);
-
- FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs)
- e->count = e->count.apply_scale (num, den);
}
FOR_EACH_BB_FN (bb, cfun)
@@ -9377,10 +9344,6 @@ execute_fixup_cfg (void)
gsi_next (&gsi);
}
- if (scale)
- FOR_EACH_EDGE (e, ei, bb->succs)
- e->count = e->count.apply_scale (num, den);
-
/* If we have a basic block with no successors that does not
end with a control statement or a noreturn call end it with
a call to __builtin_unreachable. This situation can occur
diff --git a/gcc/tree-cfgcleanup.c b/gcc/tree-cfgcleanup.c
index 1a71c93aeed..9b7f08c586c 100644
--- a/gcc/tree-cfgcleanup.c
+++ b/gcc/tree-cfgcleanup.c
@@ -195,7 +195,6 @@ cleanup_control_expr_graph (basic_block bb, gimple_stmt_iterator gsi,
}
taken_edge->probability += e->probability;
- taken_edge->count += e->count;
remove_edge_and_dominated_blocks (e);
retval = true;
}
diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index de18984f29c..f73db4000ce 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -2276,8 +2276,7 @@ chkp_build_returned_bound (gcall *call)
it separately. */
if (fndecl
&& DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
- && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_ALLOCA
- || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_ALLOCA_WITH_ALIGN))
+ && ALLOCA_FUNCTION_CODE_P (DECL_FUNCTION_CODE (fndecl)))
{
tree size = gimple_call_arg (call, 0);
gimple_stmt_iterator iter = gsi_for_stmt (call);
diff --git a/gcc/tree-complex.c b/gcc/tree-complex.c
index d61047bbf5f..146b52bbd52 100644
--- a/gcc/tree-complex.c
+++ b/gcc/tree-complex.c
@@ -60,6 +60,11 @@ typedef int complex_lattice_t;
#define PAIR(a, b) ((a) << 2 | (b))
+class complex_propagate : public ssa_propagation_engine
+{
+ enum ssa_prop_result visit_stmt (gimple *, edge *, tree *) FINAL OVERRIDE;
+ enum ssa_prop_result visit_phi (gphi *) FINAL OVERRIDE;
+};
static vec<complex_lattice_t> complex_lattice_values;
@@ -300,9 +305,9 @@ init_dont_simulate_again (void)
/* Evaluate statement STMT against the complex lattice defined above. */
-static enum ssa_prop_result
-complex_visit_stmt (gimple *stmt, edge *taken_edge_p ATTRIBUTE_UNUSED,
- tree *result_p)
+enum ssa_prop_result
+complex_propagate::visit_stmt (gimple *stmt, edge *taken_edge_p ATTRIBUTE_UNUSED,
+ tree *result_p)
{
complex_lattice_t new_l, old_l, op1_l, op2_l;
unsigned int ver;
@@ -395,8 +400,8 @@ complex_visit_stmt (gimple *stmt, edge *taken_edge_p ATTRIBUTE_UNUSED,
/* Evaluate a PHI node against the complex lattice defined above. */
-static enum ssa_prop_result
-complex_visit_phi (gphi *phi)
+enum ssa_prop_result
+complex_propagate::visit_phi (gphi *phi)
{
complex_lattice_t new_l, old_l;
unsigned int ver;
@@ -1186,19 +1191,16 @@ expand_complex_div_wide (gimple_stmt_iterator *gsi, tree inner_type,
bb_join = e->dest;
bb_true = create_empty_bb (bb_cond);
bb_false = create_empty_bb (bb_true);
- bb_true->frequency = bb_false->frequency = bb_cond->frequency / 2;
bb_true->count = bb_false->count
= bb_cond->count.apply_probability (profile_probability::even ());
/* Wire the blocks together. */
e->flags = EDGE_TRUE_VALUE;
- e->count = bb_true->count;
/* TODO: With value profile we could add an historgram to determine real
branch outcome. */
e->probability = profile_probability::even ();
redirect_edge_succ (e, bb_true);
edge e2 = make_edge (bb_cond, bb_false, EDGE_FALSE_VALUE);
- e2->count = bb_false->count;
e2->probability = profile_probability::even ();
make_single_succ_edge (bb_true, bb_join, EDGE_FALLTHRU);
make_single_succ_edge (bb_false, bb_join, EDGE_FALLTHRU);
@@ -1675,7 +1677,8 @@ tree_lower_complex (void)
complex_lattice_values.safe_grow_cleared (num_ssa_names);
init_parameter_lattice_values ();
- ssa_propagate (complex_visit_stmt, complex_visit_phi);
+ class complex_propagate complex_propagate;
+ complex_propagate.ssa_propagate ();
complex_variable_components = new int_tree_htab_type (10);
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 4e18dc650c5..bd85898a6f0 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1544,7 +1544,6 @@ struct GTY(()) tree_type_common {
tree reference_to;
union tree_type_symtab {
int GTY ((tag ("TYPE_SYMTAB_IS_ADDRESS"))) address;
- const char * GTY ((tag ("TYPE_SYMTAB_IS_POINTER"))) pointer;
struct die_struct * GTY ((tag ("TYPE_SYMTAB_IS_DIE"))) die;
} GTY ((desc ("debug_hooks->tree_type_symtab_field"))) symtab;
tree canonical;
@@ -2065,7 +2064,7 @@ struct floatn_type_info {
Global variables
---------------------------------------------------------------------------*/
/* Matrix describing the structures contained in a given tree code. */
-extern unsigned char tree_contains_struct[MAX_TREE_CODES][64];
+extern bool tree_contains_struct[MAX_TREE_CODES][64];
/* Class of tree given its code. */
extern const enum tree_code_class tree_code_type[];
diff --git a/gcc/tree-dump.c b/gcc/tree-dump.c
index ac0c7b868a1..d691278bbb2 100644
--- a/gcc/tree-dump.c
+++ b/gcc/tree-dump.c
@@ -337,7 +337,8 @@ dequeue_and_dump (dump_info_p di)
/* All declarations have names. */
if (DECL_NAME (t))
dump_child ("name", DECL_NAME (t));
- if (DECL_ASSEMBLER_NAME_SET_P (t)
+ if (HAS_DECL_ASSEMBLER_NAME_P (t)
+ && DECL_ASSEMBLER_NAME_SET_P (t)
&& DECL_ASSEMBLER_NAME (t) != DECL_NAME (t))
dump_child ("mngl", DECL_ASSEMBLER_NAME (t));
if (DECL_ABSTRACT_ORIGIN (t))
diff --git a/gcc/tree-eh.c b/gcc/tree-eh.c
index a1d35bace3a..21b2fa9c959 100644
--- a/gcc/tree-eh.c
+++ b/gcc/tree-eh.c
@@ -3223,6 +3223,7 @@ lower_resx (basic_block bb, gresx *stmt,
gimple_stmt_iterator gsi2;
new_bb = create_empty_bb (bb);
+ new_bb->count = bb->count;
add_bb_to_loop (new_bb, bb->loop_father);
lab = gimple_block_label (new_bb);
gsi2 = gsi_start_bb (new_bb);
@@ -3258,7 +3259,6 @@ lower_resx (basic_block bb, gresx *stmt,
gcc_assert (e->flags & EDGE_EH);
e->flags = (e->flags & ~EDGE_EH) | EDGE_FALLTHRU;
e->probability = profile_probability::always ();
- e->count = bb->count;
/* If there are no more EH users of the landing pad, delete it. */
FOR_EACH_EDGE (e, ei, e->dest->preds)
@@ -3779,7 +3779,10 @@ pass_lower_eh_dispatch::execute (function *fun)
}
if (redirected)
- delete_unreachable_blocks ();
+ {
+ free_dominance_info (CDI_DOMINATORS);
+ delete_unreachable_blocks ();
+ }
return flags;
}
@@ -4098,7 +4101,6 @@ unsplit_eh (eh_landing_pad lp)
redirect_edge_pred (e_out, e_in->src);
e_out->flags = e_in->flags;
e_out->probability = e_in->probability;
- e_out->count = e_in->count;
remove_edge (e_in);
return true;
@@ -4291,7 +4293,6 @@ cleanup_empty_eh_move_lp (basic_block bb, edge e_out,
/* Clean up E_OUT for the fallthru. */
e_out->flags = (e_out->flags & ~EDGE_EH) | EDGE_FALLTHRU;
e_out->probability = profile_probability::always ();
- e_out->count = e_out->src->count;
}
/* A subroutine of cleanup_empty_eh. Handle more complex cases of
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 2e32267bbc4..e5965b00168 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -2357,7 +2357,8 @@ predicate_statements (loop_p loop)
gassign *stmt = dyn_cast <gassign *> (gsi_stmt (gsi));
if (!stmt)
;
- else if (is_false_predicate (cond))
+ else if (is_false_predicate (cond)
+ && gimple_vdef (stmt))
{
unlink_stmt_vdef (stmt);
gsi_remove (&gsi, true);
@@ -2386,10 +2387,7 @@ predicate_statements (loop_p loop)
TREE_OPERAND (cond, 0),
TREE_OPERAND (cond, 1));
else
- {
- gcc_assert (TREE_CODE (cond) == SSA_NAME);
- mask = cond;
- }
+ mask = cond;
if (swap)
{
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 3d8a7f10bdb..9eac215e4dc 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -1763,16 +1763,15 @@ remap_gimple_stmt (gimple *stmt, copy_body_data *id)
later */
static basic_block
-copy_bb (copy_body_data *id, basic_block bb, int frequency_scale,
+copy_bb (copy_body_data *id, basic_block bb,
profile_count num, profile_count den)
{
gimple_stmt_iterator gsi, copy_gsi, seq_gsi;
basic_block copy_basic_block;
tree decl;
- gcov_type freq;
basic_block prev;
- bool scale = num.initialized_p ()
- && (den > 0 || num == profile_count::zero ());
+ bool scale = !num.initialized_p ()
+ || (den.nonzero_p () || num == profile_count::zero ());
/* Search for previous copied basic block. */
prev = bb->prev_bb;
@@ -1784,15 +1783,8 @@ copy_bb (copy_body_data *id, basic_block bb, int frequency_scale,
copy_basic_block = create_basic_block (NULL, (basic_block) prev->aux);
if (scale)
copy_basic_block->count = bb->count.apply_scale (num, den);
-
- /* We are going to rebuild frequencies from scratch. These values
- have just small importance to drive canonicalize_loop_headers. */
- freq = apply_scale ((gcov_type)bb->frequency, frequency_scale);
-
- /* We recompute frequencies after inlining, so this is quite safe. */
- if (freq > BB_FREQ_MAX)
- freq = BB_FREQ_MAX;
- copy_basic_block->frequency = freq;
+ else if (num.initialized_p ())
+ copy_basic_block->count = bb->count;
copy_gsi = gsi_start_bb (copy_basic_block);
@@ -2068,8 +2060,8 @@ copy_bb (copy_body_data *id, basic_block bb, int frequency_scale,
fprintf (dump_file,
"Orig bb: %i, orig bb freq %i, new bb freq %i\n",
bb->index,
- bb->frequency,
- copy_basic_block->frequency);
+ bb->count.to_frequency (cfun),
+ copy_basic_block->count.to_frequency (cfun));
}
}
}
@@ -2215,7 +2207,7 @@ update_ssa_across_abnormal_edges (basic_block bb, basic_block ret_bb,
debug stmts are left after a statement that must end the basic block. */
static bool
-copy_edges_for_bb (basic_block bb, profile_count num, profile_count den,
+copy_edges_for_bb (basic_block bb,
basic_block ret_bb, basic_block abnormal_goto_dest)
{
basic_block new_bb = (basic_block) bb->aux;
@@ -2224,8 +2216,6 @@ copy_edges_for_bb (basic_block bb, profile_count num, profile_count den,
gimple_stmt_iterator si;
int flags;
bool need_debug_cleanup = false;
- bool scale = num.initialized_p ()
- && (den > 0 || num == profile_count::zero ());
/* Use the indices from the original blocks to create edges for the
new ones. */
@@ -2242,8 +2232,6 @@ copy_edges_for_bb (basic_block bb, profile_count num, profile_count den,
&& old_edge->dest->aux != EXIT_BLOCK_PTR_FOR_FN (cfun))
flags |= EDGE_FALLTHRU;
new_edge = make_edge (new_bb, (basic_block) old_edge->dest->aux, flags);
- if (scale)
- new_edge->count = old_edge->count.apply_scale (num, den);
new_edge->probability = old_edge->probability;
}
@@ -2324,17 +2312,11 @@ copy_edges_for_bb (basic_block bb, profile_count num, profile_count den,
&& (e = find_edge (copy_stmt_bb,
(basic_block) old_edge->dest->aux))
&& (e->flags & EDGE_EH))
- {
- e->probability = old_edge->probability;
- e->count = old_edge->count;
- }
+ e->probability = old_edge->probability;
FOR_EACH_EDGE (e, ei, copy_stmt_bb->succs)
if ((e->flags & EDGE_EH) && !e->probability.initialized_p ())
- {
- e->probability = profile_probability::never ();
- e->count = profile_count::zero ();
- }
+ e->probability = profile_probability::never ();
}
@@ -2517,11 +2499,8 @@ initialize_cfun (tree new_fndecl, tree callee_fndecl, profile_count count)
profile_status_for_fn (cfun) = profile_status_for_fn (src_cfun);
- /* FIXME: When all counts are known to be zero, scaling is also meaningful.
- */
if (ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count.initialized_p ()
- && count.initialized_p ()
- && ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count.initialized_p ())
+ && count.ipa ().initialized_p ())
{
ENTRY_BLOCK_PTR_FOR_FN (cfun)->count =
ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count.apply_scale (count,
@@ -2530,10 +2509,13 @@ initialize_cfun (tree new_fndecl, tree callee_fndecl, profile_count count)
EXIT_BLOCK_PTR_FOR_FN (src_cfun)->count.apply_scale (count,
ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count);
}
- ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency
- = ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->frequency;
- EXIT_BLOCK_PTR_FOR_FN (cfun)->frequency =
- EXIT_BLOCK_PTR_FOR_FN (src_cfun)->frequency;
+ else
+ {
+ ENTRY_BLOCK_PTR_FOR_FN (cfun)->count
+ = ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count;
+ EXIT_BLOCK_PTR_FOR_FN (cfun)->count
+ = EXIT_BLOCK_PTR_FOR_FN (src_cfun)->count;
+ }
if (src_cfun->eh)
init_eh_for_function ();
@@ -2690,33 +2672,11 @@ redirect_all_calls (copy_body_data * id, basic_block bb)
}
}
-/* Convert estimated frequencies into counts for NODE, scaling COUNT
- with each bb's frequency. Used when NODE has a 0-weight entry
- but we are about to inline it into a non-zero count call bb.
- See the comments for handle_missing_profiles() in predict.c for
- when this can happen for COMDATs. */
-
-void
-freqs_to_counts (struct cgraph_node *node, profile_count count)
-{
- basic_block bb;
- edge_iterator ei;
- edge e;
- struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
-
- FOR_ALL_BB_FN(bb, fn)
- {
- bb->count = count.apply_scale (bb->frequency, BB_FREQ_MAX);
- FOR_EACH_EDGE (e, ei, bb->succs)
- e->count = e->src->count.apply_probability (e->probability);
- }
-}
-
/* Make a copy of the body of FN so that it can be inserted inline in
another function. Walks FN via CFG, returns new fndecl. */
static tree
-copy_cfg_body (copy_body_data * id, profile_count count, int frequency_scale,
+copy_cfg_body (copy_body_data * id, profile_count,
basic_block entry_block_map, basic_block exit_block_map,
basic_block new_entry)
{
@@ -2728,31 +2688,10 @@ copy_cfg_body (copy_body_data * id, profile_count count, int frequency_scale,
tree new_fndecl = NULL;
bool need_debug_cleanup = false;
int last;
- int incoming_frequency = 0;
- profile_count incoming_count = profile_count::zero ();
- profile_count num = count;
profile_count den = ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count;
- bool scale = num.initialized_p ()
- && (den > 0 || num == profile_count::zero ());
-
- /* This can happen for COMDAT routines that end up with 0 counts
- despite being called (see the comments for handle_missing_profiles()
- in predict.c as to why). Apply counts to the blocks in the callee
- before inlining, using the guessed edge frequencies, so that we don't
- end up with a 0-count inline body which can confuse downstream
- optimizations such as function splitting. */
- if (!(ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count > 0) && count > 0)
- {
- /* Apply the larger of the call bb count and the total incoming
- call edge count to the callee. */
- profile_count in_count = profile_count::zero ();
- struct cgraph_edge *in_edge;
- for (in_edge = id->src_node->callers; in_edge;
- in_edge = in_edge->next_caller)
- if (in_edge->count.initialized_p ())
- in_count += in_edge->count;
- freqs_to_counts (id->src_node, count > in_count ? count : in_count);
- }
+ profile_count num = entry_block_map->count;
+
+ cfun_to_copy = id->src_cfun = DECL_STRUCT_FUNCTION (callee_fndecl);
/* Register specific tree functions. */
gimple_register_cfg_hooks ();
@@ -2766,28 +2705,18 @@ copy_cfg_body (copy_body_data * id, profile_count count, int frequency_scale,
{
edge e;
edge_iterator ei;
+ den = profile_count::zero ();
FOR_EACH_EDGE (e, ei, new_entry->preds)
if (!e->src->aux)
- {
- incoming_frequency += EDGE_FREQUENCY (e);
- incoming_count += e->count;
- }
- if (scale)
- incoming_count = incoming_count.apply_scale (num, den);
- else
- incoming_count = profile_count::uninitialized ();
- incoming_frequency
- = apply_scale ((gcov_type)incoming_frequency, frequency_scale);
- ENTRY_BLOCK_PTR_FOR_FN (cfun)->count = incoming_count;
- ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency = incoming_frequency;
+ den += e->count ();
+ ENTRY_BLOCK_PTR_FOR_FN (cfun)->count = den;
}
/* Must have a CFG here at this point. */
gcc_assert (ENTRY_BLOCK_PTR_FOR_FN
(DECL_STRUCT_FUNCTION (callee_fndecl)));
- cfun_to_copy = id->src_cfun = DECL_STRUCT_FUNCTION (callee_fndecl);
ENTRY_BLOCK_PTR_FOR_FN (cfun_to_copy)->aux = entry_block_map;
EXIT_BLOCK_PTR_FOR_FN (cfun_to_copy)->aux = exit_block_map;
@@ -2803,7 +2732,7 @@ copy_cfg_body (copy_body_data * id, profile_count count, int frequency_scale,
FOR_EACH_BB_FN (bb, cfun_to_copy)
if (!id->blocks_to_copy || bitmap_bit_p (id->blocks_to_copy, bb->index))
{
- basic_block new_bb = copy_bb (id, bb, frequency_scale, num, den);
+ basic_block new_bb = copy_bb (id, bb, num, den);
bb->aux = new_bb;
new_bb->aux = bb;
new_bb->loop_father = entry_block_map->loop_father;
@@ -2826,14 +2755,13 @@ copy_cfg_body (copy_body_data * id, profile_count count, int frequency_scale,
FOR_ALL_BB_FN (bb, cfun_to_copy)
if (!id->blocks_to_copy
|| (bb->index > 0 && bitmap_bit_p (id->blocks_to_copy, bb->index)))
- need_debug_cleanup |= copy_edges_for_bb (bb, num, den, exit_block_map,
+ need_debug_cleanup |= copy_edges_for_bb (bb, exit_block_map,
abnormal_goto_dest);
if (new_entry)
{
edge e = make_edge (entry_block_map, (basic_block)new_entry->aux, EDGE_FALLTHRU);
e->probability = profile_probability::always ();
- e->count = incoming_count;
}
/* Duplicate the loop tree, if available and wanted. */
@@ -3031,7 +2959,7 @@ copy_tree_body (copy_body_data *id)
another function. */
static tree
-copy_body (copy_body_data *id, profile_count count, int frequency_scale,
+copy_body (copy_body_data *id, profile_count count,
basic_block entry_block_map, basic_block exit_block_map,
basic_block new_entry)
{
@@ -3040,7 +2968,7 @@ copy_body (copy_body_data *id, profile_count count, int frequency_scale,
/* If this body has a CFG, walk CFG and copy. */
gcc_assert (ENTRY_BLOCK_PTR_FOR_FN (DECL_STRUCT_FUNCTION (fndecl)));
- body = copy_cfg_body (id, count, frequency_scale, entry_block_map, exit_block_map,
+ body = copy_cfg_body (id, count, entry_block_map, exit_block_map,
new_entry);
copy_debug_stmts (id);
@@ -4797,7 +4725,6 @@ expand_call_inline (basic_block bb, gimple *stmt, copy_body_data *id)
a self-referential call; if we're calling ourselves, we need to
duplicate our body before altering anything. */
copy_body (id, cg_edge->callee->count,
- GCOV_COMPUTE_SCALE (cg_edge->frequency, CGRAPH_FREQ_BASE),
bb, return_block, NULL);
reset_debug_bindings (id, stmt_gsi);
@@ -5172,6 +5099,7 @@ optimize_inline_calls (tree fn)
}
/* Fold queued statements. */
+ counts_to_freqs ();
fold_marked_statements (last, id.statements_to_fold);
delete id.statements_to_fold;
@@ -6116,7 +6044,7 @@ tree_function_versioning (tree old_decl, tree new_decl,
}
/* Copy the Function's body. */
- copy_body (&id, old_entry_block->count, REG_BR_PROB_BASE,
+ copy_body (&id, old_entry_block->count,
ENTRY_BLOCK_PTR_FOR_FN (cfun), EXIT_BLOCK_PTR_FOR_FN (cfun),
new_entry);
@@ -6148,6 +6076,7 @@ tree_function_versioning (tree old_decl, tree new_decl,
free_dominance_info (CDI_DOMINATORS);
free_dominance_info (CDI_POST_DOMINATORS);
+ counts_to_freqs ();
fold_marked_statements (0, id.statements_to_fold);
delete id.statements_to_fold;
delete_unreachable_blocks_update_callgraph (&id);
@@ -6167,20 +6096,20 @@ tree_function_versioning (tree old_decl, tree new_decl,
struct cgraph_edge *e;
rebuild_frequencies ();
- new_version_node->count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
+ new_version_node->count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa ();
for (e = new_version_node->callees; e; e = e->next_callee)
{
basic_block bb = gimple_bb (e->call_stmt);
e->frequency = compute_call_stmt_bb_frequency (current_function_decl,
bb);
- e->count = bb->count;
+ e->count = bb->count.ipa ();
}
for (e = new_version_node->indirect_calls; e; e = e->next_callee)
{
basic_block bb = gimple_bb (e->call_stmt);
e->frequency = compute_call_stmt_bb_frequency (current_function_decl,
bb);
- e->count = bb->count;
+ e->count = bb->count.ipa ();
}
}
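
Note on the tree-inline.c hunks above: they stop assigning to edge->count altogether, and copy_cfg_body now accumulates den += e->count (), i.e. the edge count has become an accessor rather than a stored field. A minimal sketch of the assumed relationship (not quoted from the tree; only the accessor name and apply_probability appear in these hunks):

    /* Sketch only: with the new profile representation an edge does not
       carry its own profile_count; it is derived on demand from the count
       of the source block and the edge probability.  */
    inline profile_count
    edge_def::count () const
    {
      return src->count.apply_probability (probability);
    }
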
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 30091453e39..fbb891fdedf 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -90,6 +90,7 @@ along with GCC; see the file COPYING3. If not see
data reuse. */
#include "config.h"
+#define INCLUDE_ALGORITHM /* stable_sort */
#include "system.h"
#include "coretypes.h"
#include "backend.h"
@@ -106,6 +107,7 @@ along with GCC; see the file COPYING3. If not see
#include "stor-layout.h"
#include "tree-cfg.h"
#include "tree-ssa-loop-manip.h"
+#include "tree-ssa-loop-ivopts.h"
#include "tree-ssa-loop.h"
#include "tree-into-ssa.h"
#include "tree-ssa.h"
@@ -604,6 +606,10 @@ struct builtin_info
tree dst_base;
tree src_base;
tree size;
+ /* Base and offset part of dst_base after stripping constant offset. This
+ is only used in memset builtin distribution for now. */
+ tree dst_base_base;
+ unsigned HOST_WIDE_INT dst_base_offset;
};
/* Partition for loop distribution. */
@@ -1286,12 +1292,12 @@ build_rdg_partition_for_vertex (struct graph *rdg, int v)
return partition;
}
-/* Given PARTITION of RDG, record single load/store data references for
- builtin partition in SRC_DR/DST_DR, return false if there is no such
+/* Given PARTITION of LOOP and RDG, record single load/store data references
+   for builtin partition in SRC_DR/DST_DR, return false if there are no such
data references. */
static bool
-find_single_drs (struct graph *rdg, partition *partition,
+find_single_drs (struct loop *loop, struct graph *rdg, partition *partition,
data_reference_p *dst_dr, data_reference_p *src_dr)
{
unsigned i;
@@ -1347,10 +1353,12 @@ find_single_drs (struct graph *rdg, partition *partition,
&& DECL_BIT_FIELD (TREE_OPERAND (DR_REF (single_st), 1)))
return false;
- /* Data reference must be executed exactly once per iteration. */
+ /* Data reference must be executed exactly once per iteration of each
+ loop in the loop nest. We only need to check dominance information
+ against the outermost one in a perfect loop nest because a bb can't
+ dominate outermost loop's latch without dominating inner loop's. */
basic_block bb_st = gimple_bb (DR_STMT (single_st));
- struct loop *inner = bb_st->loop_father;
- if (!dominated_by_p (CDI_DOMINATORS, inner->latch, bb_st))
+ if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb_st))
return false;
if (single_ld)
@@ -1368,14 +1376,16 @@ find_single_drs (struct graph *rdg, partition *partition,
/* Load and store must be in the same loop nest. */
basic_block bb_ld = gimple_bb (DR_STMT (single_ld));
- if (inner != bb_ld->loop_father)
+ if (bb_st->loop_father != bb_ld->loop_father)
return false;
- /* Data reference must be executed exactly once per iteration. */
- if (!dominated_by_p (CDI_DOMINATORS, inner->latch, bb_ld))
+ /* Data reference must be executed exactly once per iteration.
+ Same as single_st, we only need to check against the outermost
+ loop. */
+ if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb_ld))
return false;
- edge e = single_exit (inner);
+ edge e = single_exit (bb_st->loop_father);
bool dom_ld = dominated_by_p (CDI_DOMINATORS, e->src, bb_ld);
bool dom_st = dominated_by_p (CDI_DOMINATORS, e->src, bb_st);
if (dom_ld != dom_st)
@@ -1503,7 +1513,17 @@ classify_builtin_st (loop_p loop, partition *partition, data_reference_p dr)
if (!compute_access_range (loop, dr, &base, &size))
return;
- partition->builtin = alloc_builtin (dr, NULL, base, NULL_TREE, size);
+ poly_uint64 base_offset;
+ unsigned HOST_WIDE_INT const_base_offset;
+ tree base_base = strip_offset (base, &base_offset);
+ if (!base_offset.is_constant (&const_base_offset))
+ return;
+
+ struct builtin_info *builtin;
+ builtin = alloc_builtin (dr, NULL, base, NULL_TREE, size);
+ builtin->dst_base_base = base_base;
+ builtin->dst_base_offset = const_base_offset;
+ partition->builtin = builtin;
partition->kind = PKIND_MEMSET;
}
@@ -1614,7 +1634,7 @@ classify_partition (loop_p loop, struct graph *rdg, partition *partition,
return;
/* Find single load/store data references for builtin partition. */
- if (!find_single_drs (rdg, partition, &single_st, &single_ld))
+ if (!find_single_drs (loop, rdg, partition, &single_st, &single_ld))
return;
/* Classify the builtin kind. */
@@ -2482,6 +2502,113 @@ version_for_distribution_p (vec<struct partition *> *partitions,
return (alias_ddrs->length () > 0);
}
+/* Compare base offset of builtin mem* partitions P1 and P2. */
+
+static bool
+offset_cmp (struct partition *p1, struct partition *p2)
+{
+ gcc_assert (p1 != NULL && p1->builtin != NULL);
+ gcc_assert (p2 != NULL && p2->builtin != NULL);
+ return p1->builtin->dst_base_offset < p2->builtin->dst_base_offset;
+}
+
+/* Fuse adjacent memset builtin PARTITIONS if possible. This is a special
+ case optimization transforming below code:
+
+ __builtin_memset (&obj, 0, 100);
+ _1 = &obj + 100;
+ __builtin_memset (_1, 0, 200);
+ _2 = &obj + 300;
+ __builtin_memset (_2, 0, 100);
+
+ into:
+
+ __builtin_memset (&obj, 0, 400);
+
+ Note we don't have dependence information between different partitions
+ at this point, as a result, we can't handle nonadjacent memset builtin
+ partitions since dependence might be broken. */
+
+static void
+fuse_memset_builtins (vec<struct partition *> *partitions)
+{
+ unsigned i, j;
+ struct partition *part1, *part2;
+
+ for (i = 0; partitions->iterate (i, &part1);)
+ {
+ if (part1->kind != PKIND_MEMSET)
+ {
+ i++;
+ continue;
+ }
+
+ /* Find sub-array of memset builtins of the same base. Index range
+ of the sub-array is [i, j) with "j > i". */
+ for (j = i + 1; partitions->iterate (j, &part2); ++j)
+ {
+ if (part2->kind != PKIND_MEMSET
+ || !operand_equal_p (part1->builtin->dst_base_base,
+ part2->builtin->dst_base_base, 0))
+ break;
+ }
+
+ /* Stable sort is required in order to avoid breaking dependence. */
+ std::stable_sort (&(*partitions)[i],
+ &(*partitions)[i] + j - i, offset_cmp);
+ /* Continue with next partition. */
+ i = j;
+ }
+
+ /* Merge all consecutive memset builtin partitions. */
+ for (i = 0; i < partitions->length () - 1;)
+ {
+ part1 = (*partitions)[i];
+ if (part1->kind != PKIND_MEMSET)
+ {
+ i++;
+ continue;
+ }
+
+ part2 = (*partitions)[i + 1];
+ /* Only merge memset partitions of the same base and with constant
+ access sizes. */
+ if (part2->kind != PKIND_MEMSET
+ || TREE_CODE (part1->builtin->size) != INTEGER_CST
+ || TREE_CODE (part2->builtin->size) != INTEGER_CST
+ || !operand_equal_p (part1->builtin->dst_base_base,
+ part2->builtin->dst_base_base, 0))
+ {
+ i++;
+ continue;
+ }
+ tree rhs1 = gimple_assign_rhs1 (DR_STMT (part1->builtin->dst_dr));
+ tree rhs2 = gimple_assign_rhs1 (DR_STMT (part2->builtin->dst_dr));
+ int bytev1 = const_with_all_bytes_same (rhs1);
+ int bytev2 = const_with_all_bytes_same (rhs2);
+ /* Only merge memset partitions of the same value. */
+ if (bytev1 != bytev2 || bytev1 == -1)
+ {
+ i++;
+ continue;
+ }
+ wide_int end1 = wi::add (part1->builtin->dst_base_offset,
+ wi::to_wide (part1->builtin->size));
+ /* Only merge adjacent memset partitions. */
+ if (wi::ne_p (end1, part2->builtin->dst_base_offset))
+ {
+ i++;
+ continue;
+ }
+ /* Merge partitions[i] and partitions[i+1]. */
+ part1->builtin->size = fold_build2 (PLUS_EXPR, sizetype,
+ part1->builtin->size,
+ part2->builtin->size);
+ partition_free (part2);
+ partitions->ordered_remove (i + 1);
+ }
+}
+
/* Fuse PARTITIONS of LOOP if necessary before finalizing distribution.
ALIAS_DDRS contains ddrs which need runtime alias check. */
@@ -2525,6 +2652,10 @@ finalize_partitions (struct loop *loop, vec<struct partition *> *partitions,
}
partitions->truncate (1);
}
+
+ /* Fuse memset builtins if possible. */
+ if (partitions->length () > 1)
+ fuse_memset_builtins (partitions);
}
/* Distributes the code from LOOP in such a way that producer statements
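
A hedged source-level illustration of the case the new fuse_memset_builtins code targets (the layout assumes 4-byte int and no padding; whether the loop really ends up as builtin partitions depends on the rest of the pass):

    /* Loop distribution can split this loop into two memset partitions,
       one per field.  With 4-byte int they cover bytes [0, 400) and
       [400, 800) of obj: same base, same stored value, adjacent offsets,
       so the new code may fuse them into one 800-byte memset.  */
    struct two_arrays { int a[100]; int b[100]; };
    struct two_arrays obj;

    void
    clear_obj (void)
    {
      for (int i = 0; i < 100; i++)
        {
          obj.a[i] = 0;
          obj.b[i] = 0;
        }
    }
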
diff --git a/gcc/tree-object-size.c b/gcc/tree-object-size.c
index c5a153f8411..3d288ec951c 100644
--- a/gcc/tree-object-size.c
+++ b/gcc/tree-object-size.c
@@ -436,8 +436,7 @@ alloc_object_size (const gcall *call, int object_size_type)
arg2 = 1;
/* fall through */
case BUILT_IN_MALLOC:
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
arg1 = 0;
default:
break;
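
Several hunks in this merge (tree-object-size.c above, tree-ssa-alias.c, tree-ssa-ccp.c and tree-ssa-dce.c below) replace explicit BUILT_IN_ALLOCA / BUILT_IN_ALLOCA_WITH_ALIGN case lists with CASE_BUILT_IN_ALLOCA and ALLOCA_FUNCTION_CODE_P, which also cover the BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX variant visible in the tree-ssa-ccp.c hunk. The macros themselves are not part of this diff; presumably they are defined along these lines (assumption, not quoted from builtins.h):

    /* Assumed shape of the helpers used by the conversions in this merge.  */
    #define CASE_BUILT_IN_ALLOCA \
      case BUILT_IN_ALLOCA: \
      case BUILT_IN_ALLOCA_WITH_ALIGN: \
      case BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX

    #define ALLOCA_FUNCTION_CODE_P(FCODE) \
      ((FCODE) == BUILT_IN_ALLOCA \
       || (FCODE) == BUILT_IN_ALLOCA_WITH_ALIGN \
       || (FCODE) == BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX)
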
diff --git a/gcc/tree-outof-ssa.h b/gcc/tree-outof-ssa.h
index 1220b6256ca..ebbaea1a03e 100644
--- a/gcc/tree-outof-ssa.h
+++ b/gcc/tree-outof-ssa.h
@@ -74,18 +74,6 @@ get_gimple_for_ssa_name (tree exp)
return NULL;
}
-/* Return whether the RTX expression representing the storage of the outof-SSA
- partition that the SSA name EXP is a member of is always initialized. */
-static inline bool
-always_initialized_rtx_for_ssa_name_p (tree exp)
-{
- int p = partition_find (SA.map->var_partition, SSA_NAME_VERSION (exp));
- if (SA.map->partition_to_view)
- p = SA.map->partition_to_view[p];
- gcc_assert (p != NO_PARTITION);
- return !bitmap_bit_p (SA.partitions_for_undefined_values, p);
-}
-
extern bool ssa_is_replaceable_p (gimple *stmt);
extern void finish_out_of_ssa (struct ssaexpand *sa);
extern unsigned int rewrite_out_of_ssa (struct ssaexpand *sa);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index a6599eb8bf0..fbd0dbdf924 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -223,6 +223,7 @@ protected:
current choices have
been optimized. */
#define PROP_gimple_lomp_dev (1 << 16) /* done omp_device_lower */
+#define PROP_rtl_split_insns (1 << 17) /* RTL has insns split. */
#define PROP_trees \
(PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh | PROP_gimple_lomp)
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index a863de3d1d0..059f820e65b 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -1770,7 +1770,7 @@ interpret_rhs_expr (struct loop *loop, gimple *at_stmt,
res = chrec_fold_plus (type, res, chrec2);
}
- if (maybe_nonzero (bitpos))
+ if (may_ne (bitpos, 0))
{
unitpos = size_int (exact_div (bitpos, BITS_PER_UNIT));
chrec3 = analyze_scalar_evolution (loop, unitpos);
@@ -2356,11 +2356,9 @@ instantiate_scev_name (edge instantiate_below,
struct loop *def_loop;
basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (chrec));
- /* A parameter (or loop invariant and we do not want to include
- evolutions in outer loops), nothing to do. */
+ /* A parameter, nothing to do. */
if (!def_bb
- || loop_depth (def_bb->loop_father) == 0
- || ! dominated_by_p (CDI_DOMINATORS, def_bb, instantiate_below->dest))
+ || !dominated_by_p (CDI_DOMINATORS, def_bb, instantiate_below->dest))
return chrec;
/* We cache the value of instantiated variable to avoid exponential
diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
index d732c5f6b0c..417ca8f3c54 100644
--- a/gcc/tree-ssa-address.c
+++ b/gcc/tree-ssa-address.c
@@ -693,7 +693,7 @@ addr_to_parts (tree type, aff_tree *addr, tree iv_cand, tree base_hint,
parts->index = NULL_TREE;
parts->step = NULL_TREE;
- if (maybe_nonzero (addr->offset))
+ if (may_ne (addr->offset, 0))
parts->offset = wide_int_to_tree (sizetype, addr->offset);
else
parts->offset = NULL_TREE;
@@ -747,18 +747,15 @@ gimplify_mem_ref_parts (gimple_stmt_iterator *gsi, struct mem_address *parts)
}
/* Return true if the STEP in PARTS gives a valid BASE + INDEX * STEP
- address for type TYPE and if some other component (the symbol or
- offset) is making it appear invalid. */
+ address for type TYPE and if the offset is making it appear invalid. */
static bool
keep_index_p (tree type, mem_address parts)
{
if (!parts.base)
- parts.base = parts.symbol;
- if (!parts.base)
return false;
- parts.symbol = NULL_TREE;
+ gcc_assert (!parts.symbol);
parts.offset = NULL_TREE;
return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
}
@@ -826,9 +823,8 @@ create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr,
into:
index' = index << step;
[... + index' + ,,,]. */
- if (parts.step
- && !integer_onep (parts.step)
- && !keep_index_p (type, parts))
+ bool scaled_p = (parts.step && !integer_onep (parts.step));
+ if (scaled_p && !keep_index_p (type, parts))
{
gcc_assert (parts.index);
parts.index = force_gimple_operand_gsi (gsi,
@@ -840,6 +836,7 @@ create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr,
mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
if (mem_ref)
return mem_ref;
+ scaled_p = false;
}
/* Add offset to invariant part by transforming address expression:
@@ -853,7 +850,7 @@ create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr,
depending on which one is invariant. */
if (parts.offset
&& !integer_zerop (parts.offset)
- && (!var_in_base || !parts.step || integer_onep (parts.step)))
+ && (!var_in_base || !scaled_p))
{
tree old_base = unshare_expr (parts.base);
tree old_index = unshare_expr (parts.index);
@@ -903,7 +900,7 @@ create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr,
/* Transform [base + index + ...] into:
base' = base + index;
[base' + ...]. */
- if (parts.index && (!parts.step || integer_onep (parts.step)))
+ if (parts.index && !scaled_p)
{
tmp = parts.index;
parts.index = NULL_TREE;
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index f378993f453..b9f25a3ad78 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -1753,8 +1753,7 @@ ref_maybe_used_by_call_p_1 (gcall *call, ao_ref *ref)
case BUILT_IN_POSIX_MEMALIGN:
case BUILT_IN_ALIGNED_ALLOC:
case BUILT_IN_CALLOC:
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
case BUILT_IN_STACK_SAVE:
case BUILT_IN_STACK_RESTORE:
case BUILT_IN_MEMSET:
@@ -2092,8 +2091,7 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref)
return true;
return false;
case BUILT_IN_STACK_SAVE:
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
case BUILT_IN_ASSUME_ALIGNED:
return false;
/* But posix_memalign stores a pointer into the memory pointed to
@@ -2302,8 +2300,8 @@ same_addr_size_stores_p (tree base1, poly_int64 offset1, poly_int64 size1,
poly_int64 max_size2)
{
/* Offsets need to be 0. */
- if (maybe_nonzero (offset1)
- || maybe_nonzero (offset2))
+ if (may_ne (offset1, 0)
+ || may_ne (offset2, 0))
return false;
bool base1_obj_p = SSA_VAR_P (base1);
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 7b075102df2..cc98d18e46e 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -171,6 +171,13 @@ struct ccp_prop_value_t {
widest_int mask;
};
+class ccp_propagate : public ssa_propagation_engine
+{
+ public:
+ enum ssa_prop_result visit_stmt (gimple *, edge *, tree *) FINAL OVERRIDE;
+ enum ssa_prop_result visit_phi (gphi *) FINAL OVERRIDE;
+};
+
/* Array of propagated constant values. After propagation,
CONST_VAL[I].VALUE holds the constant value for SSA_NAME(I). If
the constant is held in an SSA name representing a memory store
@@ -181,7 +188,6 @@ static ccp_prop_value_t *const_val;
static unsigned n_const_val;
static void canonicalize_value (ccp_prop_value_t *);
-static bool ccp_fold_stmt (gimple_stmt_iterator *);
static void ccp_lattice_meet (ccp_prop_value_t *, ccp_prop_value_t *);
/* Dump constant propagation value VAL to file OUTF prefixed by PREFIX. */
@@ -902,6 +908,24 @@ do_dbg_cnt (void)
}
+/* We want to provide our own GET_VALUE and FOLD_STMT virtual methods. */
+class ccp_folder : public substitute_and_fold_engine
+{
+ public:
+ tree get_value (tree) FINAL OVERRIDE;
+ bool fold_stmt (gimple_stmt_iterator *) FINAL OVERRIDE;
+};
+
+/* This method just wraps GET_CONSTANT_VALUE for now. Over time
+ naked calls to GET_CONSTANT_VALUE should be eliminated in favor
+ of calling member functions. */
+
+tree
+ccp_folder::get_value (tree op)
+{
+ return get_constant_value (op);
+}
+
/* Do final substitution of propagated values, cleanup the flowgraph and
free allocated storage. If NONZERO_P, record nonzero bits.
@@ -960,7 +984,8 @@ ccp_finalize (bool nonzero_p)
}
/* Perform substitutions based on the known constant values. */
- something_changed = substitute_and_fold (get_constant_value, ccp_fold_stmt);
+ class ccp_folder ccp_folder;
+ something_changed = ccp_folder.substitute_and_fold ();
free (const_val);
const_val = NULL;
@@ -1064,8 +1089,8 @@ ccp_lattice_meet (ccp_prop_value_t *val1, ccp_prop_value_t *val2)
PHI node is determined calling ccp_lattice_meet with all the arguments
of the PHI node that are incoming via executable edges. */
-static enum ssa_prop_result
-ccp_visit_phi_node (gphi *phi)
+enum ssa_prop_result
+ccp_propagate::visit_phi (gphi *phi)
{
unsigned i;
ccp_prop_value_t new_val;
@@ -1886,11 +1911,10 @@ evaluate_stmt (gimple *stmt)
/ BITS_PER_UNIT - 1);
break;
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
- align = (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_ALLOCA_WITH_ALIGN
- ? TREE_INT_CST_LOW (gimple_call_arg (stmt, 1))
- : BIGGEST_ALIGNMENT);
+ CASE_BUILT_IN_ALLOCA:
+ align = (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_ALLOCA
+ ? BIGGEST_ALIGNMENT
+ : TREE_INT_CST_LOW (gimple_call_arg (stmt, 1)));
val.lattice_val = CONSTANT;
val.value = build_int_cst (TREE_TYPE (gimple_get_lhs (stmt)), 0);
val.mask = ~((HOST_WIDE_INT) align / BITS_PER_UNIT - 1);
@@ -2170,8 +2194,8 @@ fold_builtin_alloca_with_align (gimple *stmt)
/* Fold the stmt at *GSI with CCP specific information that propagating
and regular folding does not catch. */
-static bool
-ccp_fold_stmt (gimple_stmt_iterator *gsi)
+bool
+ccp_folder::fold_stmt (gimple_stmt_iterator *gsi)
{
gimple *stmt = gsi_stmt (*gsi);
@@ -2243,7 +2267,8 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
/* The heuristic of fold_builtin_alloca_with_align differs before and
after inlining, so we don't require the arg to be changed into a
constant for folding, but just to be constant. */
- if (gimple_call_builtin_p (stmt, BUILT_IN_ALLOCA_WITH_ALIGN))
+ if (gimple_call_builtin_p (stmt, BUILT_IN_ALLOCA_WITH_ALIGN)
+ || gimple_call_builtin_p (stmt, BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX))
{
tree new_rhs = fold_builtin_alloca_with_align (stmt);
if (new_rhs)
@@ -2378,8 +2403,8 @@ visit_cond_stmt (gimple *stmt, edge *taken_edge_p)
value, set *TAKEN_EDGE_P accordingly. If STMT produces a varying
value, return SSA_PROP_VARYING. */
-static enum ssa_prop_result
-ccp_visit_stmt (gimple *stmt, edge *taken_edge_p, tree *output_p)
+enum ssa_prop_result
+ccp_propagate::visit_stmt (gimple *stmt, edge *taken_edge_p, tree *output_p)
{
tree def;
ssa_op_iter iter;
@@ -2441,7 +2466,8 @@ do_ssa_ccp (bool nonzero_p)
calculate_dominance_info (CDI_DOMINATORS);
ccp_initialize ();
- ssa_propagate (ccp_visit_stmt, ccp_visit_phi_node);
+ class ccp_propagate ccp_propagate;
+ ccp_propagate.ssa_propagate ();
if (ccp_finalize (nonzero_p || flag_ipa_bit_cp))
{
todo = (TODO_cleanup_cfg | TODO_update_ssa);
@@ -2535,8 +2561,7 @@ optimize_stack_restore (gimple_stmt_iterator i)
if (!callee
|| DECL_BUILT_IN_CLASS (callee) != BUILT_IN_NORMAL
/* All regular builtins are ok, just obviously not alloca. */
- || DECL_FUNCTION_CODE (callee) == BUILT_IN_ALLOCA
- || DECL_FUNCTION_CODE (callee) == BUILT_IN_ALLOCA_WITH_ALIGN)
+ || ALLOCA_FUNCTION_CODE_P (DECL_FUNCTION_CODE (callee)))
return NULL_TREE;
if (DECL_FUNCTION_CODE (callee) == BUILT_IN_STACK_RESTORE)
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 3938f064f67..057d51dcf37 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -164,7 +164,7 @@ coalesce_cost (int frequency, bool optimize_for_size)
static inline int
coalesce_cost_bb (basic_block bb)
{
- return coalesce_cost (bb->frequency, optimize_bb_for_size_p (bb));
+ return coalesce_cost (bb->count.to_frequency (cfun), optimize_bb_for_size_p (bb));
}
diff --git a/gcc/tree-ssa-copy.c b/gcc/tree-ssa-copy.c
index 9f0fe541ded..1f9dbf52346 100644
--- a/gcc/tree-ssa-copy.c
+++ b/gcc/tree-ssa-copy.c
@@ -68,6 +68,13 @@ struct prop_value_t {
tree value;
};
+class copy_prop : public ssa_propagation_engine
+{
+ public:
+ enum ssa_prop_result visit_stmt (gimple *, edge *, tree *) FINAL OVERRIDE;
+ enum ssa_prop_result visit_phi (gphi *) FINAL OVERRIDE;
+};
+
static prop_value_t *copy_of;
static unsigned n_copy_of;
@@ -263,8 +270,8 @@ copy_prop_visit_cond_stmt (gimple *stmt, edge *taken_edge_p)
If the new value produced by STMT is varying, return
SSA_PROP_VARYING. */
-static enum ssa_prop_result
-copy_prop_visit_stmt (gimple *stmt, edge *taken_edge_p, tree *result_p)
+enum ssa_prop_result
+copy_prop::visit_stmt (gimple *stmt, edge *taken_edge_p, tree *result_p)
{
enum ssa_prop_result retval;
@@ -317,8 +324,8 @@ copy_prop_visit_stmt (gimple *stmt, edge *taken_edge_p, tree *result_p)
/* Visit PHI node PHI. If all the arguments produce the same value,
set it to be the value of the LHS of PHI. */
-static enum ssa_prop_result
-copy_prop_visit_phi_node (gphi *phi)
+enum ssa_prop_result
+copy_prop::visit_phi (gphi *phi)
{
enum ssa_prop_result retval;
unsigned i;
@@ -482,10 +489,16 @@ init_copy_prop (void)
}
}
+class copy_folder : public substitute_and_fold_engine
+{
+ public:
+ tree get_value (tree) FINAL OVERRIDE;
+};
+
/* Callback for substitute_and_fold to get at the final copy-of values. */
-static tree
-get_value (tree name)
+tree
+copy_folder::get_value (tree name)
{
tree val;
if (SSA_NAME_VERSION (name) >= n_copy_of)
@@ -550,7 +563,8 @@ fini_copy_prop (void)
}
}
- bool changed = substitute_and_fold (get_value, NULL);
+ class copy_folder copy_folder;
+ bool changed = copy_folder.substitute_and_fold ();
if (changed)
{
free_numbers_of_iterations_estimates (cfun);
@@ -601,7 +615,8 @@ static unsigned int
execute_copy_prop (void)
{
init_copy_prop ();
- ssa_propagate (copy_prop_visit_stmt, copy_prop_visit_phi_node);
+ class copy_prop copy_prop;
+ copy_prop.ssa_propagate ();
if (fini_copy_prop ())
return TODO_cleanup_cfg;
return 0;
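
The tree-ssa-ccp.c and tree-ssa-copy.c hunks convert both passes from the callback-based ssa_propagate / substitute_and_fold interface to engine subclasses. Reduced to a skeleton, the pattern used by ccp_propagate, copy_prop, ccp_folder and copy_folder looks like this (class names here are illustrative; a real client supplies the lattice logic behind the virtuals):

    class my_propagate : public ssa_propagation_engine
    {
     public:
      enum ssa_prop_result visit_stmt (gimple *, edge *, tree *) FINAL OVERRIDE;
      enum ssa_prop_result visit_phi (gphi *) FINAL OVERRIDE;
    };

    class my_folder : public substitute_and_fold_engine
    {
     public:
      tree get_value (tree) FINAL OVERRIDE;
    };

    /* Driver: propagate first, then substitute the lattice values.  */
    static void
    run_my_pass (void)
    {
      class my_propagate my_propagate;
      my_propagate.ssa_propagate ();

      class my_folder my_folder;
      my_folder.substitute_and_fold ();
    }
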
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 270253bbfb2..794a1b3a4d7 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -231,8 +231,7 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
case BUILT_IN_MALLOC:
case BUILT_IN_ALIGNED_ALLOC:
case BUILT_IN_CALLOC:
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
case BUILT_IN_STRDUP:
case BUILT_IN_STRNDUP:
return;
@@ -572,8 +571,7 @@ mark_all_reaching_defs_necessary_1 (ao_ref *ref ATTRIBUTE_UNUSED,
case BUILT_IN_MALLOC:
case BUILT_IN_ALIGNED_ALLOC:
case BUILT_IN_CALLOC:
- case BUILT_IN_ALLOCA:
- case BUILT_IN_ALLOCA_WITH_ALIGN:
+ CASE_BUILT_IN_ALLOCA:
case BUILT_IN_FREE:
return false;
@@ -841,9 +839,7 @@ propagate_necessity (bool aggressive)
|| DECL_FUNCTION_CODE (callee) == BUILT_IN_CALLOC
|| DECL_FUNCTION_CODE (callee) == BUILT_IN_FREE
|| DECL_FUNCTION_CODE (callee) == BUILT_IN_VA_END
- || DECL_FUNCTION_CODE (callee) == BUILT_IN_ALLOCA
- || (DECL_FUNCTION_CODE (callee)
- == BUILT_IN_ALLOCA_WITH_ALIGN)
+ || ALLOCA_FUNCTION_CODE_P (DECL_FUNCTION_CODE (callee))
|| DECL_FUNCTION_CODE (callee) == BUILT_IN_STACK_SAVE
|| DECL_FUNCTION_CODE (callee) == BUILT_IN_STACK_RESTORE
|| DECL_FUNCTION_CODE (callee) == BUILT_IN_ASSUME_ALIGNED))
@@ -1051,7 +1047,6 @@ remove_dead_stmt (gimple_stmt_iterator *i, basic_block bb)
}
gcc_assert (e);
e->probability = profile_probability::always ();
- e->count = bb->count;
/* The edge is no longer associated with a conditional, so it does
not have TRUE/FALSE flags.
@@ -1344,9 +1339,8 @@ eliminate_unnecessary_stmts (void)
|| (DECL_FUNCTION_CODE (call) != BUILT_IN_ALIGNED_ALLOC
&& DECL_FUNCTION_CODE (call) != BUILT_IN_MALLOC
&& DECL_FUNCTION_CODE (call) != BUILT_IN_CALLOC
- && DECL_FUNCTION_CODE (call) != BUILT_IN_ALLOCA
- && (DECL_FUNCTION_CODE (call)
- != BUILT_IN_ALLOCA_WITH_ALIGN)))
+ && !ALLOCA_FUNCTION_CODE_P
+ (DECL_FUNCTION_CODE (call))))
/* Avoid doing so for bndret calls for the same reason. */
&& !chkp_gimple_call_builtin_p (stmt, BUILT_IN_CHKP_BNDRET))
{
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 06be69a530c..eb85b4a09ad 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -113,7 +113,6 @@ static void eliminate_redundant_computations (gimple_stmt_iterator *,
class avail_exprs_stack *);
static void record_equivalences_from_stmt (gimple *, int,
class avail_exprs_stack *);
-static edge single_incoming_edge_ignoring_loop_edges (basic_block);
static void dump_dominator_optimization_stats (FILE *file,
hash_table<expr_elt_hasher> *);
@@ -1057,39 +1056,6 @@ record_equivalences_from_phis (basic_block bb)
}
}
-/* Ignoring loop backedges, if BB has precisely one incoming edge then
- return that edge. Otherwise return NULL. */
-static edge
-single_incoming_edge_ignoring_loop_edges (basic_block bb)
-{
- edge retval = NULL;
- edge e;
- edge_iterator ei;
-
- FOR_EACH_EDGE (e, ei, bb->preds)
- {
- /* A loop back edge can be identified by the destination of
- the edge dominating the source of the edge. */
- if (dominated_by_p (CDI_DOMINATORS, e->src, e->dest))
- continue;
-
- /* We can safely ignore edges that are not executable. */
- if ((e->flags & EDGE_EXECUTABLE) == 0)
- continue;
-
- /* If we have already seen a non-loop edge, then we must have
- multiple incoming non-loop edges and thus we return NULL. */
- if (retval)
- return NULL;
-
- /* This is the first non-loop incoming edge we have found. Record
- it. */
- retval = e;
- }
-
- return retval;
-}
-
/* Record any equivalences created by the incoming edge to BB into
CONST_AND_COPIES and AVAIL_EXPRS_STACK. If BB has more than one
incoming edge, then no equivalence is created. */
@@ -1107,7 +1073,7 @@ record_equivalences_from_incoming_edge (basic_block bb,
the parent was followed. */
parent = get_immediate_dominator (CDI_DOMINATORS, bb);
- e = single_incoming_edge_ignoring_loop_edges (bb);
+ e = single_pred_edge_ignoring_loop_edges (bb, true);
/* If we had a single incoming edge from our parent block, then enter
any data associated with the edge into our tables. */
diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 2aeb1f53410..741180e51bf 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -129,7 +129,7 @@ valid_ao_ref_for_dse (ao_ref *ref)
{
return (ao_ref_base (ref)
&& known_size_p (ref->max_size)
- && maybe_nonzero (ref->size)
+ && may_ne (ref->size, 0)
&& must_eq (ref->max_size, ref->size)
&& must_ge (ref->offset, 0)
&& multiple_p (ref->offset, BITS_PER_UNIT)
@@ -606,7 +606,7 @@ dse_classify_store (ao_ref *ref, gimple *stmt, gimple **use_stmt,
ao_ref use_ref;
ao_ref_init (&use_ref, gimple_assign_rhs1 (use_stmt));
if (valid_ao_ref_for_dse (&use_ref)
- && must_eq (use_ref.base, ref->base)
+ && use_ref.base == ref->base
&& must_eq (use_ref.size, use_ref.max_size)
&& !live_bytes_read (use_ref, ref, live_bytes))
{
diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index ec54a82da24..eb47a08004d 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -1174,7 +1174,7 @@ constant_pointer_difference (tree p1, tree p2)
if (base)
{
q = base;
- if (maybe_nonzero (offset))
+ if (may_ne (offset, 0))
off = size_binop (PLUS_EXPR, off, size_int (offset));
}
if (TREE_CODE (q) == MEM_REF
@@ -1491,9 +1491,14 @@ defcodefor_name (tree name, enum tree_code *code, tree *arg1, tree *arg2)
applied, otherwise return false.
We are looking for X with unsigned type T with bitsize B, OP being
- +, | or ^, some type T2 wider than T and
+ +, | or ^, some type T2 wider than T. For:
(X << CNT1) OP (X >> CNT2) iff CNT1 + CNT2 == B
((T) ((T2) X << CNT1)) OP ((T) ((T2) X >> CNT2)) iff CNT1 + CNT2 == B
+
+ transform these into:
+ X r<< CNT1
+
+ Or for:
(X << Y) OP (X >> (B - Y))
(X << (int) Y) OP (X >> (int) (B - Y))
((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
@@ -1503,12 +1508,23 @@ defcodefor_name (tree name, enum tree_code *code, tree *arg1, tree *arg2)
((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
- and transform these into:
- X r<< CNT1
+ transform these into:
X r<< Y
+ Or for:
+ (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
+ (X << (int) (Y & (B - 1))) | (X >> (int) ((-Y) & (B - 1)))
+ ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
+ ((T) ((T2) X << (int) (Y & (B - 1)))) \
+ | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
+
+ transform these into:
+ X r<< (Y & (B - 1))
+
Note, in the patterns with T2 type, the type of OP operands
- might be even a signed type, but should have precision B. */
+ might be even a signed type, but should have precision B.
+ Expressions with & (B - 1) should be recognized only if B is
+ a power of 2. */
static bool
simplify_rotate (gimple_stmt_iterator *gsi)
@@ -1578,7 +1594,9 @@ simplify_rotate (gimple_stmt_iterator *gsi)
def_arg1[i] = tem;
}
/* Both shifts have to use the same first operand. */
- if (TREE_CODE (def_arg1[0]) != SSA_NAME || def_arg1[0] != def_arg1[1])
+ if (!operand_equal_for_phi_arg_p (def_arg1[0], def_arg1[1])
+ || !types_compatible_p (TREE_TYPE (def_arg1[0]),
+ TREE_TYPE (def_arg1[1])))
return false;
if (!TYPE_UNSIGNED (TREE_TYPE (def_arg1[0])))
return false;
@@ -1649,8 +1667,10 @@ simplify_rotate (gimple_stmt_iterator *gsi)
/* The above sequence isn't safe for Y being 0,
because then one of the shifts triggers undefined behavior.
This alternative is safe even for rotation count of 0.
- One shift count is Y and the other (-Y) & (B - 1). */
+ One shift count is Y and the other (-Y) & (B - 1).
+ Or one shift count is Y & (B - 1) and the other (-Y) & (B - 1). */
else if (cdef_code[i] == BIT_AND_EXPR
+ && pow2p_hwi (TYPE_PRECISION (rtype))
&& tree_fits_shwi_p (cdef_arg2[i])
&& tree_to_shwi (cdef_arg2[i])
== TYPE_PRECISION (rtype) - 1
@@ -1675,17 +1695,50 @@ simplify_rotate (gimple_stmt_iterator *gsi)
rotcnt = tem;
break;
}
- defcodefor_name (tem, &code, &tem, NULL);
+ tree tem2;
+ defcodefor_name (tem, &code, &tem2, NULL);
if (CONVERT_EXPR_CODE_P (code)
- && INTEGRAL_TYPE_P (TREE_TYPE (tem))
- && TYPE_PRECISION (TREE_TYPE (tem))
+ && INTEGRAL_TYPE_P (TREE_TYPE (tem2))
+ && TYPE_PRECISION (TREE_TYPE (tem2))
> floor_log2 (TYPE_PRECISION (rtype))
- && type_has_mode_precision_p (TREE_TYPE (tem))
- && (tem == def_arg2[1 - i]
- || tem == def_arg2_alt[1 - i]))
+ && type_has_mode_precision_p (TREE_TYPE (tem2)))
{
- rotcnt = tem;
- break;
+ if (tem2 == def_arg2[1 - i]
+ || tem2 == def_arg2_alt[1 - i])
+ {
+ rotcnt = tem2;
+ break;
+ }
+ }
+ else
+ tem2 = NULL_TREE;
+
+ if (cdef_code[1 - i] == BIT_AND_EXPR
+ && tree_fits_shwi_p (cdef_arg2[1 - i])
+ && tree_to_shwi (cdef_arg2[1 - i])
+ == TYPE_PRECISION (rtype) - 1
+ && TREE_CODE (cdef_arg1[1 - i]) == SSA_NAME)
+ {
+ if (tem == cdef_arg1[1 - i]
+ || tem2 == cdef_arg1[1 - i])
+ {
+ rotcnt = def_arg2[1 - i];
+ break;
+ }
+ tree tem3;
+ defcodefor_name (cdef_arg1[1 - i], &code, &tem3, NULL);
+ if (CONVERT_EXPR_CODE_P (code)
+ && INTEGRAL_TYPE_P (TREE_TYPE (tem3))
+ && TYPE_PRECISION (TREE_TYPE (tem3))
+ > floor_log2 (TYPE_PRECISION (rtype))
+ && type_has_mode_precision_p (TREE_TYPE (tem3)))
+ {
+ if (tem == tem3 || tem2 == tem3)
+ {
+ rotcnt = def_arg2[1 - i];
+ break;
+ }
+ }
}
}
}
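
The simplify_rotate changes above add the forms where both shift counts are masked with B - 1 (accepted only when B is a power of two). A standalone example of the idiom that is now expected to fold to a single rotate, assuming a 32-bit unsigned int:

    /* Matches the new (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
       pattern with B = 32; the masking keeps the expression well defined
       even for y == 0.  */
    unsigned int
    rotl32 (unsigned int x, unsigned int y)
    {
      return (x << (y & 31)) | (x >> (-y & 31));
    }
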
diff --git a/gcc/tree-ssa-ifcombine.c b/gcc/tree-ssa-ifcombine.c
index a211335889b..ff26dd1f731 100644
--- a/gcc/tree-ssa-ifcombine.c
+++ b/gcc/tree-ssa-ifcombine.c
@@ -358,10 +358,7 @@ update_profile_after_ifcombine (basic_block inner_cond_bb,
outer_cond_bb->(outer_to_inner)->inner_cond_bb->(inner_taken)
and probability of inner_not_taken updated. */
- outer_to_inner->count = outer_cond_bb->count;
inner_cond_bb->count = outer_cond_bb->count;
- inner_taken->count += outer2->count;
- outer2->count = profile_count::zero ();
inner_taken->probability = outer2->probability + outer_to_inner->probability
* inner_taken->probability;
@@ -369,7 +366,6 @@ update_profile_after_ifcombine (basic_block inner_cond_bb,
- inner_taken->probability;
outer_to_inner->probability = profile_probability::always ();
- inner_cond_bb->frequency = outer_cond_bb->frequency;
outer2->probability = profile_probability::never ();
}
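
A quick, purely illustrative check of the probability update kept in update_profile_after_ifcombine: if outer2 has probability 0.3, outer_to_inner 0.7 and the old inner_taken probability is 0.5, the retained formula gives

    inner_taken     = 0.3 + 0.7 * 0.5 = 0.65
    inner_not_taken = 1.0 - 0.65      = 0.35

while outer_to_inner becomes always and outer2 never, which is consistent with the combined test now living in inner_cond_bb.
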
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 3866709fa6c..5103d12cf87 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -1802,7 +1802,7 @@ execute_sm_if_changed (edge ex, tree mem, tree tmp_var, tree flag,
for (hash_set<basic_block>::iterator it = flag_bbs->begin ();
it != flag_bbs->end (); ++it)
{
- freq_sum += (*it)->frequency;
+ freq_sum += (*it)->count.to_frequency (cfun);
if ((*it)->count.initialized_p ())
count_sum += (*it)->count, ncount ++;
if (dominated_by_p (CDI_DOMINATORS, ex->src, *it))
@@ -1814,20 +1814,15 @@ execute_sm_if_changed (edge ex, tree mem, tree tmp_var, tree flag,
if (flag_probability.initialized_p ())
;
- else if (ncount == nbbs && count_sum > 0 && preheader->count >= count_sum)
+ else if (ncount == nbbs
+ && preheader->count () >= count_sum && preheader->count ().nonzero_p ())
{
- flag_probability = count_sum.probability_in (preheader->count);
+ flag_probability = count_sum.probability_in (preheader->count ());
if (flag_probability > cap)
flag_probability = cap;
}
- else if (freq_sum > 0 && EDGE_FREQUENCY (preheader) >= freq_sum)
- {
- flag_probability = profile_probability::from_reg_br_prob_base
- (GCOV_COMPUTE_SCALE (freq_sum, EDGE_FREQUENCY (preheader)));
- if (flag_probability > cap)
- flag_probability = cap;
- }
- else
+
+ if (!flag_probability.initialized_p ())
flag_probability = cap;
/* ?? Insert store after previous store if applicable. See note
@@ -1860,7 +1855,6 @@ execute_sm_if_changed (edge ex, tree mem, tree tmp_var, tree flag,
old_dest = ex->dest;
new_bb = split_edge (ex);
then_bb = create_empty_bb (new_bb);
- then_bb->frequency = flag_probability.apply (new_bb->frequency);
then_bb->count = new_bb->count.apply_probability (flag_probability);
if (irr)
then_bb->flags = BB_IRREDUCIBLE_LOOP;
@@ -1880,13 +1874,11 @@ execute_sm_if_changed (edge ex, tree mem, tree tmp_var, tree flag,
edge e2 = make_edge (new_bb, then_bb,
EDGE_TRUE_VALUE | (irr ? EDGE_IRREDUCIBLE_LOOP : 0));
e2->probability = flag_probability;
- e2->count = then_bb->count;
e1->flags |= EDGE_FALSE_VALUE | (irr ? EDGE_IRREDUCIBLE_LOOP : 0);
e1->flags &= ~EDGE_FALLTHRU;
e1->probability = flag_probability.invert ();
- e1->count = new_bb->count - then_bb->count;
then_old_edge = make_single_succ_edge (then_bb, old_dest,
EDGE_FALLTHRU | (irr ? EDGE_IRREDUCIBLE_LOOP : 0));
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index efb199aaaa2..8b1daa66b3e 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -530,7 +530,6 @@ remove_exits_and_undefined_stmts (struct loop *loop, unsigned int npeeled)
if (!loop_exit_edge_p (loop, exit_edge))
exit_edge = EDGE_SUCC (bb, 1);
exit_edge->probability = profile_probability::always ();
- exit_edge->count = exit_edge->src->count;
gcc_checking_assert (loop_exit_edge_p (loop, exit_edge));
gcond *cond_stmt = as_a <gcond *> (elt->stmt);
if (exit_edge->flags & EDGE_TRUE_VALUE)
@@ -643,13 +642,11 @@ unloop_loops (bitmap loop_closed_ssa_invalidated,
stmt = gimple_build_call (builtin_decl_implicit (BUILT_IN_UNREACHABLE), 0);
latch_edge = make_edge (latch, create_basic_block (NULL, NULL, latch), flags);
latch_edge->probability = profile_probability::never ();
- latch_edge->count = profile_count::zero ();
latch_edge->flags |= flags;
latch_edge->goto_locus = locus;
add_bb_to_loop (latch_edge->dest, current_loops->tree_root);
latch_edge->dest->count = profile_count::zero ();
- latch_edge->dest->frequency = 0;
set_immediate_dominator (CDI_DOMINATORS, latch_edge->dest, latch_edge->src);
gsi = gsi_start_bb (latch_edge->dest);
@@ -1092,7 +1089,6 @@ try_peel_loop (struct loop *loop,
}
}
profile_count entry_count = profile_count::zero ();
- int entry_freq = 0;
edge e;
edge_iterator ei;
@@ -1101,15 +1097,10 @@ try_peel_loop (struct loop *loop,
{
if (e->src->count.initialized_p ())
entry_count = e->src->count + e->src->count;
- entry_freq += e->src->frequency;
gcc_assert (!flow_bb_inside_loop_p (loop, e->src));
}
profile_probability p = profile_probability::very_unlikely ();
- if (loop->header->count > 0)
- p = entry_count.probability_in (loop->header->count);
- else if (loop->header->frequency)
- p = profile_probability::probability_in_gcov_type
- (entry_freq, loop->header->frequency);
+ p = entry_count.probability_in (loop->header->count);
scale_loop_profile (loop, p, 0);
bitmap_set_bit (peeled_loops, loop->num);
return true;
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index fd23eba8158..e550c850e25 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -1571,9 +1571,6 @@ record_invariant (struct ivopts_data *data, tree op, bool nonlinear_use)
bitmap_set_bit (data->relevant, SSA_NAME_VERSION (op));
}
-static tree
-strip_offset (tree expr, poly_uint64 *offset);
-
/* Record a group of TYPE. */
static struct iv_group *
@@ -2724,7 +2721,7 @@ split_address_groups (struct ivopts_data *data)
      /* Split group if asked to, or the offset against the first
use can't fit in offset part of addressing mode. IV uses
having the same offset are still kept in one group. */
- if (maybe_nonzero (offset)
+ if (may_ne (offset, 0)
&& (split_p || !addr_offset_valid_p (use, offset)))
{
if (!new_group)
@@ -2920,7 +2917,7 @@ strip_offset_1 (tree expr, bool inside_addr, bool top_compref,
break;
default:
- if (ptrdiff_tree_p (expr, offset) && maybe_nonzero (*offset))
+ if (ptrdiff_tree_p (expr, offset) && may_ne (*offset, 0))
return build_int_cst (orig_type, 0);
return orig_expr;
}
@@ -2950,8 +2947,8 @@ strip_offset_1 (tree expr, bool inside_addr, bool top_compref,
/* Strips constant offsets from EXPR and stores them to OFFSET. */
-static tree
-strip_offset (tree expr, poly_uint64 *offset)
+tree
+strip_offset (tree expr, poly_uint64_pod *offset)
{
poly_int64 off;
tree core = strip_offset_1 (expr, false, false, &off);
@@ -3228,7 +3225,7 @@ add_autoinc_candidates (struct ivopts_data *data, tree base, tree step,
statement. */
if (use_bb->loop_father != data->current_loop
|| !dominated_by_p (CDI_DOMINATORS, data->current_loop->latch, use_bb)
- || stmt_could_throw_p (use->stmt)
+ || stmt_can_throw_internal (use->stmt)
|| !cst_and_fits_in_hwi (step))
return;
@@ -3499,7 +3496,7 @@ add_iv_candidate_for_use (struct ivopts_data *data, struct iv_use *use)
/* Record common candidate with constant offset stripped in base.
Like the use itself, we also add candidate directly for it. */
base = strip_offset (iv->base, &offset);
- if (maybe_nonzero (offset) || base != iv->base)
+ if (may_ne (offset, 0U) || base != iv->base)
{
record_common_cand (data, base, iv->step, use);
add_candidate (data, base, iv->step, false, use);
@@ -3518,7 +3515,7 @@ add_iv_candidate_for_use (struct ivopts_data *data, struct iv_use *use)
record_common_cand (data, base, step, use);
/* Also record common candidate with offset stripped. */
base = strip_offset (base, &offset);
- if (maybe_nonzero (offset))
+ if (may_ne (offset, 0U))
record_common_cand (data, base, step, use);
}
@@ -4385,9 +4382,9 @@ get_address_cost_ainc (poly_int64 ainc_step, poly_int64 ainc_offset,
}
poly_int64 msize = GET_MODE_SIZE (mem_mode);
- if (known_zero (ainc_offset) && must_eq (msize, ainc_step))
+ if (must_eq (ainc_offset, 0) && must_eq (msize, ainc_step))
return comp_cost (data->costs[AINC_POST_INC], 0);
- if (known_zero (ainc_offset) && must_eq (msize, -ainc_step))
+ if (must_eq (ainc_offset, 0) && must_eq (msize, -ainc_step))
return comp_cost (data->costs[AINC_POST_DEC], 0);
if (must_eq (ainc_offset, msize) && must_eq (msize, ainc_step))
return comp_cost (data->costs[AINC_PRE_INC], 0);
@@ -4429,19 +4426,19 @@ get_address_cost (struct ivopts_data *data, struct iv_use *use,
if (!aff_combination_const_p (aff_inv))
{
parts.index = integer_one_node;
- /* Addressing mode "base + index [<< scale]". */
- parts.step = NULL_TREE;
+ /* Addressing mode "base + index". */
ok_without_ratio_p = valid_mem_ref_p (mem_mode, as, &parts);
if (ratio != 1)
{
parts.step = wide_int_to_tree (type, ratio);
+ /* Addressing mode "base + index << scale". */
ok_with_ratio_p = valid_mem_ref_p (mem_mode, as, &parts);
if (!ok_with_ratio_p)
parts.step = NULL_TREE;
}
if (ok_with_ratio_p || ok_without_ratio_p)
{
- if (maybe_nonzero (aff_inv->offset))
+ if (may_ne (aff_inv->offset, 0))
{
parts.offset = wide_int_to_tree (sizetype, aff_inv->offset);
/* Addressing mode "base + index [<< scale] + offset". */
@@ -4542,7 +4539,7 @@ get_address_cost (struct ivopts_data *data, struct iv_use *use,
cost.complexity += 1;
/* Don't increase the complexity of adding a scaled index if it's
the only kind of index that the target allows. */
- if (ok_with_ratio_p && ok_without_ratio_p)
+ if (parts.step != NULL_TREE && ok_without_ratio_p)
cost.complexity += 1;
if (parts.base != NULL_TREE && parts.index != NULL_TREE)
cost.complexity += 1;
@@ -4559,8 +4556,8 @@ get_address_cost (struct ivopts_data *data, struct iv_use *use,
static comp_cost
get_scaled_computation_cost_at (ivopts_data *data, gimple *at, comp_cost cost)
{
- int loop_freq = data->current_loop->header->frequency;
- int bb_freq = gimple_bb (at)->frequency;
+ int loop_freq = data->current_loop->header->count.to_frequency (cfun);
+ int bb_freq = gimple_bb (at)->count.to_frequency (cfun);
if (loop_freq != 0)
{
gcc_assert (cost.scratch <= cost.cost);
diff --git a/gcc/tree-ssa-loop-ivopts.h b/gcc/tree-ssa-loop-ivopts.h
index f8f31e93856..6326bb01eb9 100644
--- a/gcc/tree-ssa-loop-ivopts.h
+++ b/gcc/tree-ssa-loop-ivopts.h
@@ -28,6 +28,7 @@ extern void dump_cand (FILE *, struct iv_cand *);
extern bool contains_abnormal_ssa_name_p (tree);
extern struct loop *outermost_invariant_loop_for_expr (struct loop *, tree);
extern bool expr_invariant_in_loop_p (struct loop *, tree);
+extern tree strip_offset (tree, poly_uint64_pod *);
bool may_be_nonaddressable_p (tree expr);
void tree_ssa_iv_optimize (void);
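
strip_offset is now exported (taking a poly_uint64_pod out-parameter) so that tree-loop-distribution.c can split a memset destination address into a base and a constant byte offset, as classify_builtin_st does above. The intended call shape, with illustrative variable names:

    /* Split addr into an offset-free base plus the stripped constant
       byte offset, mirroring the use in classify_builtin_st.  */
    poly_uint64 offset;
    tree base = strip_offset (addr, &offset);

    unsigned HOST_WIDE_INT const_offset;
    if (offset.is_constant (&const_offset))
      {
        /* base plus const_offset reconstructs addr; const_offset can now
           be compared between partitions that share the same base.  */
      }
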
diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
index b08b8b9b92c..1efcd272241 100644
--- a/gcc/tree-ssa-loop-manip.c
+++ b/gcc/tree-ssa-loop-manip.c
@@ -1122,6 +1122,9 @@ niter_for_unrolled_loop (struct loop *loop, unsigned factor)
converts back. */
gcov_type new_est_niter = est_niter / factor;
+ if (est_niter == -1)
+ return -1;
+
/* Without profile feedback, loops for which we do not know a better estimate
are assumed to roll 10 times. When we unroll such loop, it appears to
roll too little, and it may even seem to be cold. To avoid this, we
@@ -1294,12 +1297,10 @@ tree_transform_and_unroll_loop (struct loop *loop, unsigned factor,
/* Set the probability of new exit to the same of the old one. Fix
the frequency of the latch block, by scaling it back by
1 - exit->probability. */
- new_exit->count = exit->count;
new_exit->probability = exit->probability;
new_nonexit = single_pred_edge (loop->latch);
new_nonexit->probability = exit->probability.invert ();
new_nonexit->flags = EDGE_TRUE_VALUE;
- new_nonexit->count -= exit->count;
if (new_nonexit->probability.initialized_p ())
scale_bbs_frequencies (&loop->latch, 1, new_nonexit->probability);
@@ -1371,15 +1372,8 @@ tree_transform_and_unroll_loop (struct loop *loop, unsigned factor,
exit edge. */
freq_h = loop->header->count;
- freq_e = (loop_preheader_edge (loop))->count;
- /* Use frequency only if counts are zero. */
- if (!(freq_h > 0) && !(freq_e > 0))
- {
- freq_h = profile_count::from_gcov_type (loop->header->frequency);
- freq_e = profile_count::from_gcov_type
- (EDGE_FREQUENCY (loop_preheader_edge (loop)));
- }
- if (freq_h > 0)
+ freq_e = (loop_preheader_edge (loop))->count ();
+ if (freq_h.nonzero_p ())
{
/* Avoid dropping loop body profile counter to 0 because of zero count
in loop's preheader. */
@@ -1390,17 +1384,14 @@ tree_transform_and_unroll_loop (struct loop *loop, unsigned factor,
exit_bb = single_pred (loop->latch);
new_exit = find_edge (exit_bb, rest);
- new_exit->count = loop_preheader_edge (loop)->count;
new_exit->probability = profile_probability::always ()
.apply_scale (1, new_est_niter + 1);
- rest->count += new_exit->count;
- rest->frequency += EDGE_FREQUENCY (new_exit);
+ rest->count += new_exit->count ();
new_nonexit = single_pred_edge (loop->latch);
prob = new_nonexit->probability;
new_nonexit->probability = new_exit->probability.invert ();
- new_nonexit->count = exit_bb->count - new_exit->count;
prob = new_nonexit->probability / prob;
if (prob.initialized_p ())
scale_bbs_frequencies (&loop->latch, 1, prob);
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 89e57931745..9b8b11048db 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -3901,7 +3901,7 @@ estimate_numbers_of_iterations (struct loop *loop)
recomputing iteration bounds later in the compilation process will just
introduce random roundoff errors. */
if (!loop->any_estimate
- && loop->header->count > 0)
+ && loop->header->count.reliable_p ())
{
gcov_type nit = expected_loop_iterations_unbounded (loop);
bound = gcov_type_to_wide_int (nit);
diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
index e454cc5dc93..dcb7c1ee4c8 100644
--- a/gcc/tree-ssa-loop-split.c
+++ b/gcc/tree-ssa-loop-split.c
@@ -353,11 +353,8 @@ connect_loops (struct loop *loop1, struct loop *loop2)
new_e->flags |= EDGE_TRUE_VALUE;
}
- new_e->count = skip_bb->count;
new_e->probability = profile_probability::likely ();
- new_e->count = skip_e->count.apply_probability (PROB_LIKELY);
- skip_e->count -= new_e->count;
- skip_e->probability = profile_probability::unlikely ();
+ skip_e->probability = new_e->probability.invert ();
return new_e;
}
@@ -560,7 +557,6 @@ split_loop (struct loop *loop1, struct tree_niter_desc *niter)
initialize_original_copy_tables ();
basic_block cond_bb;
- /* FIXME: probabilities seems wrong here. */
struct loop *loop2 = loop_version (loop1, cond, &cond_bb,
profile_probability::always (),
profile_probability::always (),
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index 57aba4f1dd0..ecc72cbaf5c 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -852,17 +852,16 @@ hoist_guard (struct loop *loop, edge guard)
/* Determine the probability that we skip the loop. Assume that loop has
same average number of iterations regardless outcome of guard. */
new_edge->probability = guard->probability;
- profile_count skip_count = guard->src->count > 0
- ? guard->count.apply_scale (pre_header->count,
+ profile_count skip_count = guard->src->count.nonzero_p ()
+ ? guard->count ().apply_scale (pre_header->count,
guard->src->count)
- : guard->count.apply_probability (new_edge->probability);
+ : guard->count ().apply_probability (new_edge->probability);
- if (skip_count > e->count)
+ if (skip_count > e->count ())
{
fprintf (dump_file, " Capping count; expect profile inconsistency\n");
- skip_count = e->count;
+ skip_count = e->count ();
}
- new_edge->count = skip_count;
if (dump_file && (dump_flags & TDF_DETAILS))
{
fprintf (dump_file, " Estimated probability of skipping loop is ");
@@ -874,19 +873,13 @@ hoist_guard (struct loop *loop, edge guard)
First decrease count of path from newly hoisted loop guard
to loop header... */
- e->count -= skip_count;
e->probability = new_edge->probability.invert ();
- e->dest->count = e->count;
- e->dest->frequency = EDGE_FREQUENCY (e);
+ e->dest->count = e->count ();
/* ... now update profile to represent that original guard will be optimized
away ... */
guard->probability = profile_probability::never ();
- guard->count = profile_count::zero ();
not_guard->probability = profile_probability::always ();
- /* This count is wrong (frequency of not_guard does not change),
- but will be scaled later. */
- not_guard->count = guard->src->count;
/* ... finally scale everything in the loop except for guarded basic blocks
where profile does not change. */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 1722ecb3d01..a600516ded8 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -515,6 +515,7 @@ internal_fn_reciprocal (gcall *call)
switch (gimple_call_combined_fn (call))
{
CASE_CFN_SQRT:
+ CASE_CFN_SQRT_FN:
ifn = IFN_RSQRT;
break;
@@ -3258,6 +3259,9 @@ convert_mult_to_widen (gimple *stmt, gimple_stmt_iterator *gsi)
to_mode = SCALAR_INT_TYPE_MODE (type);
from_mode = SCALAR_INT_TYPE_MODE (type1);
+ if (to_mode == from_mode)
+ return false;
+
from_unsigned1 = TYPE_UNSIGNED (type1);
from_unsigned2 = TYPE_UNSIGNED (type2);
@@ -3448,6 +3452,9 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple *stmt,
to_mode = SCALAR_TYPE_MODE (type);
from_mode = SCALAR_TYPE_MODE (type1);
+ if (to_mode == from_mode)
+ return false;
+
from_unsigned1 = TYPE_UNSIGNED (type1);
from_unsigned2 = TYPE_UNSIGNED (type2);
optype = type1;
diff --git a/gcc/tree-ssa-phionlycprop.c b/gcc/tree-ssa-phionlycprop.c
index 65af44834df..fe39aa71f98 100644
--- a/gcc/tree-ssa-phionlycprop.c
+++ b/gcc/tree-ssa-phionlycprop.c
@@ -298,7 +298,6 @@ propagate_rhs_into_lhs (gimple *stmt, tree lhs, tree rhs,
te->probability += e->probability;
- te->count += e->count;
remove_edge (e);
cfg_altered = true;
}
diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
index 1682f18eabb..ed342dac46d 100644
--- a/gcc/tree-ssa-phiopt.c
+++ b/gcc/tree-ssa-phiopt.c
@@ -375,7 +375,6 @@ replace_phi_edge_with_variable (basic_block cond_block,
EDGE_SUCC (cond_block, 0)->flags |= EDGE_FALLTHRU;
EDGE_SUCC (cond_block, 0)->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
EDGE_SUCC (cond_block, 0)->probability = profile_probability::always ();
- EDGE_SUCC (cond_block, 0)->count += EDGE_SUCC (cond_block, 1)->count;
block_to_remove = EDGE_SUCC (cond_block, 1)->dest;
}
@@ -385,7 +384,6 @@ replace_phi_edge_with_variable (basic_block cond_block,
EDGE_SUCC (cond_block, 1)->flags
&= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
EDGE_SUCC (cond_block, 1)->probability = profile_probability::always ();
- EDGE_SUCC (cond_block, 1)->count += EDGE_SUCC (cond_block, 0)->count;
block_to_remove = EDGE_SUCC (cond_block, 0)->dest;
}
@@ -697,7 +695,7 @@ jump_function_from_stmt (tree *arg, gimple *stmt)
&offset);
if (tem
&& TREE_CODE (tem) == MEM_REF
- && known_zero (mem_ref_offset (tem) + offset))
+ && must_eq (mem_ref_offset (tem) + offset, 0))
{
*arg = TREE_OPERAND (tem, 0);
return true;
@@ -995,11 +993,13 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
}
- /* Now optimize (x != 0) ? x + y : y to just y.
- The following condition is too restrictive, there can easily be another
- stmt in middle_bb, for instance a CONVERT_EXPR for the second argument. */
- gimple *assign = last_and_only_stmt (middle_bb);
- if (!assign || gimple_code (assign) != GIMPLE_ASSIGN
+ /* Now optimize (x != 0) ? x + y : y to just x + y. */
+ gsi = gsi_last_nondebug_bb (middle_bb);
+ if (gsi_end_p (gsi))
+ return 0;
+
+ gimple *assign = gsi_stmt (gsi);
+ if (!is_gimple_assign (assign)
|| gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS
|| (!INTEGRAL_TYPE_P (TREE_TYPE (arg0))
&& !POINTER_TYPE_P (TREE_TYPE (arg0))))
@@ -1009,6 +1009,71 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
if (!gimple_seq_empty_p (phi_nodes (middle_bb)))
return 0;
+ /* Allow up to 2 cheap preparation statements that prepare argument
+ for assign, e.g.:
+ if (y_4 != 0)
+ goto <bb 3>;
+ else
+ goto <bb 4>;
+ <bb 3>:
+ _1 = (int) y_4;
+ iftmp.0_6 = x_5(D) r<< _1;
+ <bb 4>:
+ # iftmp.0_2 = PHI <iftmp.0_6(3), x_5(D)(2)>
+ or:
+ if (y_3(D) == 0)
+ goto <bb 4>;
+ else
+ goto <bb 3>;
+ <bb 3>:
+ y_4 = y_3(D) & 31;
+ _1 = (int) y_4;
+ _6 = x_5(D) r<< _1;
+ <bb 4>:
+ # _2 = PHI <x_5(D)(2), _6(3)> */
+ gimple *prep_stmt[2] = { NULL, NULL };
+ int prep_cnt;
+ for (prep_cnt = 0; ; prep_cnt++)
+ {
+ gsi_prev_nondebug (&gsi);
+ if (gsi_end_p (gsi))
+ break;
+
+ gimple *g = gsi_stmt (gsi);
+ if (gimple_code (g) == GIMPLE_LABEL)
+ break;
+
+ if (prep_cnt == 2 || !is_gimple_assign (g))
+ return 0;
+
+ tree lhs = gimple_assign_lhs (g);
+ tree rhs1 = gimple_assign_rhs1 (g);
+ use_operand_p use_p;
+ gimple *use_stmt;
+ if (TREE_CODE (lhs) != SSA_NAME
+ || TREE_CODE (rhs1) != SSA_NAME
+ || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+ || !INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
+ || !single_imm_use (lhs, &use_p, &use_stmt)
+ || use_stmt != (prep_cnt ? prep_stmt[prep_cnt - 1] : assign))
+ return 0;
+ switch (gimple_assign_rhs_code (g))
+ {
+ CASE_CONVERT:
+ break;
+ case PLUS_EXPR:
+ case BIT_AND_EXPR:
+ case BIT_IOR_EXPR:
+ case BIT_XOR_EXPR:
+ if (TREE_CODE (gimple_assign_rhs2 (g)) != INTEGER_CST)
+ return 0;
+ break;
+ default:
+ return 0;
+ }
+ prep_stmt[prep_cnt] = g;
+ }
+
/* Only transform if it removes the condition. */
if (!single_non_singleton_phi_for_edges (phi_nodes (gimple_bb (phi)), e0, e1))
return 0;
@@ -1019,7 +1084,7 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
&& profile_status_for_fn (cfun) != PROFILE_ABSENT
&& EDGE_PRED (middle_bb, 0)->probability < profile_probability::even ()
/* If assign is cheap, there is no point avoiding it. */
- && estimate_num_insns (assign, &eni_time_weights)
+ && estimate_num_insns (bb_seq (middle_bb), &eni_time_weights)
>= 3 * estimate_num_insns (cond, &eni_time_weights))
return 0;
@@ -1030,6 +1095,32 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
tree cond_lhs = gimple_cond_lhs (cond);
tree cond_rhs = gimple_cond_rhs (cond);
+ /* Propagate the cond_rhs constant through preparation stmts,
+ make sure UB isn't invoked while doing that. */
+ for (int i = prep_cnt - 1; i >= 0; --i)
+ {
+ gimple *g = prep_stmt[i];
+ tree grhs1 = gimple_assign_rhs1 (g);
+ if (!operand_equal_for_phi_arg_p (cond_lhs, grhs1))
+ return 0;
+ cond_lhs = gimple_assign_lhs (g);
+ cond_rhs = fold_convert (TREE_TYPE (grhs1), cond_rhs);
+ if (TREE_CODE (cond_rhs) != INTEGER_CST
+ || TREE_OVERFLOW (cond_rhs))
+ return 0;
+ if (gimple_assign_rhs_class (g) == GIMPLE_BINARY_RHS)
+ {
+ cond_rhs = int_const_binop (gimple_assign_rhs_code (g), cond_rhs,
+ gimple_assign_rhs2 (g));
+ if (TREE_OVERFLOW (cond_rhs))
+ return 0;
+ }
+ cond_rhs = fold_convert (TREE_TYPE (cond_lhs), cond_rhs);
+ if (TREE_CODE (cond_rhs) != INTEGER_CST
+ || TREE_OVERFLOW (cond_rhs))
+ return 0;
+ }
+
if (((code == NE_EXPR && e1 == false_edge)
|| (code == EQ_EXPR && e1 == true_edge))
&& arg0 == lhs
@@ -1071,7 +1162,15 @@ value_replacement (basic_block cond_bb, basic_block middle_bb,
duplicate_ssa_name_range_info (lhs, SSA_NAME_RANGE_TYPE (phires),
phires_range_info);
}
- gimple_stmt_iterator gsi_from = gsi_for_stmt (assign);
+ gimple_stmt_iterator gsi_from;
+ for (int i = prep_cnt - 1; i >= 0; --i)
+ {
+ tree plhs = gimple_assign_lhs (prep_stmt[i]);
+ SSA_NAME_RANGE_INFO (plhs) = NULL;
+ gsi_from = gsi_for_stmt (prep_stmt[i]);
+ gsi_move_before (&gsi_from, &gsi);
+ }
+ gsi_from = gsi_for_stmt (assign);
gsi_move_before (&gsi_from, &gsi);
replace_phi_edge_with_variable (cond_bb, e1, phi, lhs);
return 2;
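
The value_replacement changes above extend the (x != 0) ? x OP y : y simplification so that up to two cheap preparation statements (conversions, or PLUS/AND/IOR/XOR with a constant operand) may sit between the condition and the final assignment, with the condition's constant propagated through them to check that the guarded and unguarded paths agree without invoking undefined behavior.  A hedged source-level illustration of the kind of guarded rotate the new code targets, written as ordinary C++ rather than GIMPLE (the helper name is made up):

#include <cstdint>
#include <iostream>

/* For y == 0 the rotate is just x, so once the mask and conversion feeding
   the rotate are known to preserve that, the guard becomes redundant and
   the rotate can run unconditionally.  */
static uint32_t
guarded_rotate_left (uint32_t x, uint32_t y)
{
  return y != 0 ? (x << (y & 31)) | (x >> ((32 - y) & 31)) : x;
}

int
main ()
{
  std::cout << guarded_rotate_left (0x80000001u, 1) << " "
            << guarded_rotate_left (0x80000001u, 0) << "\n";   /* 3 2147483649 */
}

Whether the branch is actually removed still depends on the cost comparison against bb_seq (middle_bb) shown in the hunk below.
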
@@ -1813,9 +1912,24 @@ cond_store_replacement (basic_block middle_bb, basic_block join_bb,
gsi_remove (&gsi, true);
release_defs (assign);
+ /* Make both store and load use alias-set zero as we have to
+ deal with the case of the store being a conditional change
+ of the dynamic type. */
+ lhs = unshare_expr (lhs);
+ tree *basep = &lhs;
+ while (handled_component_p (*basep))
+ basep = &TREE_OPERAND (*basep, 0);
+ if (TREE_CODE (*basep) == MEM_REF
+ || TREE_CODE (*basep) == TARGET_MEM_REF)
+ TREE_OPERAND (*basep, 1)
+ = fold_convert (ptr_type_node, TREE_OPERAND (*basep, 1));
+ else
+ *basep = build2 (MEM_REF, TREE_TYPE (*basep),
+ build_fold_addr_expr (*basep),
+ build_zero_cst (ptr_type_node));
+
/* 2) Insert a load from the memory of the store to the temporary
on the edge which did not contain the store. */
- lhs = unshare_expr (lhs);
name = make_temp_ssa_name (TREE_TYPE (lhs), NULL, "cstore");
new_stmt = gimple_build_assign (name, lhs);
gimple_set_location (new_stmt, locus);
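
The cond_store_replacement hunk forces both the store and the freshly inserted load to alias-set zero (by rewriting their base to a MEM_REF with a ptr_type_node offset) so that a store which conditionally changes an object's dynamic type is still handled safely.  The surrounding transform itself is store sinking: load on the edge that did not store, select, then one unconditional store.  A minimal source-level sketch of that shape, leaving the alias-set detail aside (names are illustrative):

#include <iostream>

/* Original form: the store happens on only one path.  */
static void
conditional_store (int *p, bool cond, int v)
{
  if (cond)
    *p = v;
}

/* After the transform (conceptually): a "cstore" temporary merges the new
   value with the old one loaded on the non-store path, and the store
   becomes unconditional.  The pass only does this when the access is known
   not to trap on either path.  */
static void
sunk_store (int *p, bool cond, int v)
{
  int tmp = cond ? v : *p;
  *p = tmp;
}

int
main ()
{
  int a = 1, b = 1;
  conditional_store (&a, false, 5);
  sunk_store (&b, false, 5);
  std::cout << a << " " << b << "\n";   /* 1 1 */
}
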
diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index c9079cd6751..45a82f95eb2 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -39,7 +39,6 @@ along with GCC; see the file COPYING3. If not see
#include "gimplify.h"
#include "gimple-iterator.h"
#include "tree-cfg.h"
-#include "tree-ssa-loop.h"
#include "tree-into-ssa.h"
#include "tree-dfa.h"
#include "tree-ssa.h"
@@ -50,9 +49,7 @@ along with GCC; see the file COPYING3. If not see
#include "dbgcnt.h"
#include "domwalk.h"
#include "tree-ssa-propagate.h"
-#include "ipa-utils.h"
#include "tree-cfgcleanup.h"
-#include "langhooks.h"
#include "alias.h"
/* Even though this file is called tree-ssa-pre.c, we actually
@@ -516,9 +513,6 @@ typedef struct bb_bitmap_sets
optimization PRE was able to perform. */
static struct
{
- /* The number of RHS computations eliminated by PRE. */
- int eliminations;
-
/* The number of new expressions/temporaries generated by PRE. */
int insertions;
@@ -537,7 +531,6 @@ static pre_expr bitmap_find_leader (bitmap_set_t, unsigned int);
static void bitmap_value_insert_into_set (bitmap_set_t, pre_expr);
static void bitmap_value_replace_in_set (bitmap_set_t, pre_expr);
static void bitmap_set_copy (bitmap_set_t, bitmap_set_t);
-static void bitmap_set_and (bitmap_set_t, bitmap_set_t);
static bool bitmap_set_contains_value (bitmap_set_t, unsigned int);
static void bitmap_insert_into_set (bitmap_set_t, pre_expr);
static bitmap_set_t bitmap_set_new (void);
@@ -552,12 +545,6 @@ static unsigned int get_expr_value_id (pre_expr);
static object_allocator<bitmap_set> bitmap_set_pool ("Bitmap sets");
static bitmap_obstack grand_bitmap_obstack;
-/* Set of blocks with statements that have had their EH properties changed. */
-static bitmap need_eh_cleanup;
-
-/* Set of blocks with statements that have had their AB properties changed. */
-static bitmap need_ab_cleanup;
-
/* A three tuple {e, pred, v} used to cache phi translations in the
phi_translate_table. */
@@ -720,14 +707,11 @@ sccvn_valnum_from_value_id (unsigned int val)
/* Remove an expression EXPR from a bitmapped set. */
static void
-bitmap_remove_from_set (bitmap_set_t set, pre_expr expr)
+bitmap_remove_expr_from_set (bitmap_set_t set, pre_expr expr)
{
unsigned int val = get_expr_value_id (expr);
- if (!value_id_constant_p (val))
- {
- bitmap_clear_bit (&set->values, val);
- bitmap_clear_bit (&set->expressions, get_expression_id (expr));
- }
+ bitmap_clear_bit (&set->values, val);
+ bitmap_clear_bit (&set->expressions, get_expression_id (expr));
}
/* Insert an expression EXPR into a bitmapped set. */
@@ -800,40 +784,10 @@ sorted_array_from_bitmap_set (bitmap_set_t set)
return result;
}
-/* Perform bitmapped set operation DEST &= ORIG. */
-
-static void
-bitmap_set_and (bitmap_set_t dest, bitmap_set_t orig)
-{
- bitmap_iterator bi;
- unsigned int i;
-
- if (dest != orig)
- {
- bitmap_and_into (&dest->values, &orig->values);
-
- unsigned int to_clear = -1U;
- FOR_EACH_EXPR_ID_IN_SET (dest, i, bi)
- {
- if (to_clear != -1U)
- {
- bitmap_clear_bit (&dest->expressions, to_clear);
- to_clear = -1U;
- }
- pre_expr expr = expression_for_id (i);
- unsigned int value_id = get_expr_value_id (expr);
- if (!bitmap_bit_p (&dest->values, value_id))
- to_clear = i;
- }
- if (to_clear != -1U)
- bitmap_clear_bit (&dest->expressions, to_clear);
- }
-}
-
/* Subtract all expressions contained in ORIG from DEST. */
static bitmap_set_t
-bitmap_set_subtract (bitmap_set_t dest, bitmap_set_t orig)
+bitmap_set_subtract_expressions (bitmap_set_t dest, bitmap_set_t orig)
{
bitmap_set_t result = bitmap_set_new ();
bitmap_iterator bi;
@@ -864,15 +818,15 @@ bitmap_set_subtract_values (bitmap_set_t a, bitmap_set_t b)
{
if (to_remove)
{
- bitmap_remove_from_set (a, to_remove);
+ bitmap_remove_expr_from_set (a, to_remove);
to_remove = NULL;
}
pre_expr expr = expression_for_id (i);
- if (bitmap_set_contains_value (b, get_expr_value_id (expr)))
+ if (bitmap_bit_p (&b->values, get_expr_value_id (expr)))
to_remove = expr;
}
if (to_remove)
- bitmap_remove_from_set (a, to_remove);
+ bitmap_remove_expr_from_set (a, to_remove);
}
@@ -884,9 +838,6 @@ bitmap_set_contains_value (bitmap_set_t set, unsigned int value_id)
if (value_id_constant_p (value_id))
return true;
- if (!set || bitmap_empty_p (&set->expressions))
- return false;
-
return bitmap_bit_p (&set->values, value_id);
}
@@ -896,44 +847,6 @@ bitmap_set_contains_expr (bitmap_set_t set, const pre_expr expr)
return bitmap_bit_p (&set->expressions, get_expression_id (expr));
}
-/* Replace an instance of value LOOKFOR with expression EXPR in SET. */
-
-static void
-bitmap_set_replace_value (bitmap_set_t set, unsigned int lookfor,
- const pre_expr expr)
-{
- bitmap exprset;
- unsigned int i;
- bitmap_iterator bi;
-
- if (value_id_constant_p (lookfor))
- return;
-
- if (!bitmap_set_contains_value (set, lookfor))
- return;
-
- /* The number of expressions having a given value is usually
- significantly less than the total number of expressions in SET.
- Thus, rather than check, for each expression in SET, whether it
- has the value LOOKFOR, we walk the reverse mapping that tells us
- what expressions have a given value, and see if any of those
- expressions are in our set. For large testcases, this is about
- 5-10x faster than walking the bitmap. If this is somehow a
- significant lose for some cases, we can choose which set to walk
- based on the set size. */
- exprset = value_expressions[lookfor];
- EXECUTE_IF_SET_IN_BITMAP (exprset, 0, i, bi)
- {
- if (bitmap_clear_bit (&set->expressions, i))
- {
- bitmap_set_bit (&set->expressions, get_expression_id (expr));
- return;
- }
- }
-
- gcc_unreachable ();
-}
-
/* Return true if two bitmap sets are equal. */
static bool
@@ -949,9 +862,33 @@ static void
bitmap_value_replace_in_set (bitmap_set_t set, pre_expr expr)
{
unsigned int val = get_expr_value_id (expr);
+ if (value_id_constant_p (val))
+ return;
if (bitmap_set_contains_value (set, val))
- bitmap_set_replace_value (set, val, expr);
+ {
+ /* The number of expressions having a given value is usually
+ significantly less than the total number of expressions in SET.
+ Thus, rather than check, for each expression in SET, whether it
+ has the value LOOKFOR, we walk the reverse mapping that tells us
+ what expressions have a given value, and see if any of those
+ expressions are in our set. For large testcases, this is about
+ 5-10x faster than walking the bitmap. If this is somehow a
+     significant loss for some cases, we can choose which set to walk
+ based on the set size. */
+ unsigned int i;
+ bitmap_iterator bi;
+ bitmap exprset = value_expressions[val];
+ EXECUTE_IF_SET_IN_BITMAP (exprset, 0, i, bi)
+ {
+ if (bitmap_clear_bit (&set->expressions, i))
+ {
+ bitmap_set_bit (&set->expressions, get_expression_id (expr));
+ return;
+ }
+ }
+ gcc_unreachable ();
+ }
else
bitmap_insert_into_set (set, expr);
}
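
The rewritten bitmap_value_replace_in_set inlines the old bitmap_set_replace_value: instead of scanning every expression in the set, it walks the usually much shorter reverse mapping from a value to its expressions and swaps the first member it finds.  A standalone model of that lookup strategy with standard containers in place of GCC's bitmaps (all names are stand-ins):

#include <iostream>
#include <set>
#include <unordered_map>
#include <vector>

/* Replace whichever member of EXPRS carries value VAL with NEW_EXPR,
   walking the value->expressions reverse map rather than the whole set.  */
static void
replace_value_with_expr (std::set<int> &exprs,
                         const std::unordered_map<int, std::vector<int> > &value_to_exprs,
                         int val, int new_expr)
{
  auto it = value_to_exprs.find (val);
  if (it == value_to_exprs.end ())
    return;
  for (int old_expr : it->second)
    if (exprs.erase (old_expr))        /* found the member with this value  */
      {
        exprs.insert (new_expr);       /* keep exactly one expression per value  */
        return;
      }
}

int
main ()
{
  std::set<int> exprs = {3, 7, 9};
  std::unordered_map<int, std::vector<int> > rev = { {42, {5, 7, 11}} };
  replace_value_with_expr (exprs, rev, 42, 20);
  for (int e : exprs)
    std::cout << e << " ";             /* 3 9 20 */
  std::cout << "\n";
}
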
@@ -2010,14 +1947,12 @@ valid_in_sets (bitmap_set_t set1, bitmap_set_t set2, pre_expr expr)
}
}
-/* Clean the set of expressions that are no longer valid in SET1 or
- SET2. This means expressions that are made up of values we have no
- leaders for in SET1 or SET2. This version is used for partial
- anticipation, which means it is not valid in either ANTIC_IN or
- PA_IN. */
+/* Clean the set of expressions SET1 that are no longer valid in SET1 or SET2.
+ This means expressions that are made up of values we have no leaders for
+ in SET1 or SET2. */
static void
-dependent_clean (bitmap_set_t set1, bitmap_set_t set2)
+clean (bitmap_set_t set1, bitmap_set_t set2 = NULL)
{
vec<pre_expr> exprs = sorted_array_from_bitmap_set (set1);
pre_expr expr;
@@ -2026,26 +1961,7 @@ dependent_clean (bitmap_set_t set1, bitmap_set_t set2)
FOR_EACH_VEC_ELT (exprs, i, expr)
{
if (!valid_in_sets (set1, set2, expr))
- bitmap_remove_from_set (set1, expr);
- }
- exprs.release ();
-}
-
-/* Clean the set of expressions that are no longer valid in SET. This
- means expressions that are made up of values we have no leaders for
- in SET. */
-
-static void
-clean (bitmap_set_t set)
-{
- vec<pre_expr> exprs = sorted_array_from_bitmap_set (set);
- pre_expr expr;
- int i;
-
- FOR_EACH_VEC_ELT (exprs, i, expr)
- {
- if (!valid_in_sets (set, NULL, expr))
- bitmap_remove_from_set (set, expr);
+ bitmap_remove_expr_from_set (set1, expr);
}
exprs.release ();
}
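
The hunks above merge dependent_clean and clean into a single clean (set1, set2 = NULL): an expression survives only if leaders for all of its operand values exist in SET1 or, when given, in SET2.  A compact standalone model of that validity filter (the types below are stand-ins for GCC's pre_expr machinery):

#include <algorithm>
#include <iostream>
#include <set>
#include <vector>

struct expr_model { int id; std::vector<int> operand_values; };

/* Drop from SET1 every expression using a value with no leader in
   LEADERS1 or in the optional LEADERS2.  */
static void
clean_model (std::vector<expr_model> &set1, const std::set<int> &leaders1,
             const std::set<int> *leaders2 = nullptr)
{
  auto valid = [&] (const expr_model &e)
    {
      for (int v : e.operand_values)
        if (!leaders1.count (v) && !(leaders2 && leaders2->count (v)))
          return false;
      return true;
    };
  set1.erase (std::remove_if (set1.begin (), set1.end (),
                              [&] (const expr_model &e) { return !valid (e); }),
              set1.end ());
}

int
main ()
{
  std::vector<expr_model> set1 = { {1, {10}}, {2, {10, 20}}, {3, {30}} };
  std::set<int> leaders1 = {10};
  std::set<int> leaders2 = {20};
  clean_model (set1, leaders1, &leaders2);
  for (const expr_model &e : set1)
    std::cout << e.id << " ";          /* 1 2 */
  std::cout << "\n";
}
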
@@ -2065,7 +1981,7 @@ prune_clobbered_mems (bitmap_set_t set, basic_block block)
/* Remove queued expr. */
if (to_remove)
{
- bitmap_remove_from_set (set, to_remove);
+ bitmap_remove_expr_from_set (set, to_remove);
to_remove = NULL;
}
@@ -2100,7 +2016,7 @@ prune_clobbered_mems (bitmap_set_t set, basic_block block)
/* Remove queued expr. */
if (to_remove)
- bitmap_remove_from_set (set, to_remove);
+ bitmap_remove_expr_from_set (set, to_remove);
}
static sbitmap has_abnormal_preds;
@@ -2182,17 +2098,54 @@ compute_antic_aux (basic_block block, bool block_has_abnormal_pred_edge)
phi_translate_set (ANTIC_OUT, ANTIC_IN (first), block, first);
+ /* If we have multiple successors we need to intersect the ANTIC_OUT
+ sets. For values that's a simple intersection but for
+ expressions it is a union. Given we want to have a single
+ expression per value in our sets we have to canonicalize.
+ Avoid randomness and running into cycles like for PR82129 and
+ canonicalize the expression we choose to the one with the
+ lowest id. This requires we actually compute the union first. */
FOR_EACH_VEC_ELT (worklist, i, bprime)
{
if (!gimple_seq_empty_p (phi_nodes (bprime)))
{
bitmap_set_t tmp = bitmap_set_new ();
phi_translate_set (tmp, ANTIC_IN (bprime), block, bprime);
- bitmap_set_and (ANTIC_OUT, tmp);
+ bitmap_and_into (&ANTIC_OUT->values, &tmp->values);
+ bitmap_ior_into (&ANTIC_OUT->expressions, &tmp->expressions);
bitmap_set_free (tmp);
}
else
- bitmap_set_and (ANTIC_OUT, ANTIC_IN (bprime));
+ {
+ bitmap_and_into (&ANTIC_OUT->values, &ANTIC_IN (bprime)->values);
+ bitmap_ior_into (&ANTIC_OUT->expressions,
+ &ANTIC_IN (bprime)->expressions);
+ }
+ }
+ if (! worklist.is_empty ())
+ {
+ /* Prune expressions not in the value set, canonicalizing to
+ expression with lowest ID. */
+ bitmap_iterator bi;
+ unsigned int i;
+ unsigned int to_clear = -1U;
+ bitmap seen_value = BITMAP_ALLOC (NULL);
+ FOR_EACH_EXPR_ID_IN_SET (ANTIC_OUT, i, bi)
+ {
+ if (to_clear != -1U)
+ {
+ bitmap_clear_bit (&ANTIC_OUT->expressions, to_clear);
+ to_clear = -1U;
+ }
+ pre_expr expr = expression_for_id (i);
+ unsigned int value_id = get_expr_value_id (expr);
+ if (!bitmap_bit_p (&ANTIC_OUT->values, value_id)
+ || !bitmap_set_bit (seen_value, value_id))
+ to_clear = i;
+ }
+ if (to_clear != -1U)
+ bitmap_clear_bit (&ANTIC_OUT->expressions, to_clear);
+ BITMAP_FREE (seen_value);
}
}
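
The new merge code above intersects the value sets of all successors but unions their expression sets, then prunes the union so that each surviving value keeps exactly one expression, the one with the lowest id, avoiding the non-determinism behind PR82129.  A standalone model of intersect-union-then-canonicalize using ordered containers instead of bitmaps (names and types are stand-ins):

#include <iostream>
#include <map>
#include <set>

/* EXPR_VALUE maps expression id -> value id (stand-in for
   get_expr_value_id).  Values are intersected, expressions unioned, and
   the result pruned so the lowest-id expression represents each value.  */
static std::set<int>
merge_and_canonicalize (const std::set<int> &exprs_a, const std::set<int> &exprs_b,
                        const std::set<int> &values_a, const std::set<int> &values_b,
                        const std::map<int, int> &expr_value)
{
  std::set<int> values;
  for (int v : values_a)
    if (values_b.count (v))
      values.insert (v);                       /* intersection of values  */

  std::set<int> exprs (exprs_a);
  exprs.insert (exprs_b.begin (), exprs_b.end ());   /* union of expressions  */

  std::set<int> result, seen_value;
  for (int e : exprs)                          /* visited in increasing id order  */
    {
      int v = expr_value.at (e);
      if (values.count (v) && seen_value.insert (v).second)
        result.insert (e);                     /* lowest-id expression wins  */
    }
  return result;
}

int
main ()
{
  std::map<int, int> expr_value = { {1, 10}, {2, 10}, {3, 20}, {4, 30} };
  std::set<int> r = merge_and_canonicalize ({1, 3}, {2, 4},
                                            {10, 30}, {10, 20, 30}, expr_value);
  for (int e : r)
    std::cout << e << " ";                     /* 1 4 */
  std::cout << "\n";
}
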
@@ -2201,11 +2154,11 @@ compute_antic_aux (basic_block block, bool block_has_abnormal_pred_edge)
prune_clobbered_mems (ANTIC_OUT, block);
/* Generate ANTIC_OUT - TMP_GEN. */
- S = bitmap_set_subtract (ANTIC_OUT, TMP_GEN (block));
+ S = bitmap_set_subtract_expressions (ANTIC_OUT, TMP_GEN (block));
/* Start ANTIC_IN with EXP_GEN - TMP_GEN. */
- ANTIC_IN (block) = bitmap_set_subtract (EXP_GEN (block),
- TMP_GEN (block));
+ ANTIC_IN (block) = bitmap_set_subtract_expressions (EXP_GEN (block),
+ TMP_GEN (block));
/* Then union in the ANTIC_OUT - TMP_GEN values,
to get ANTIC_OUT U EXP_GEN - TMP_GEN */
@@ -2250,8 +2203,7 @@ compute_antic_aux (basic_block block, bool block_has_abnormal_pred_edge)
else if succs(BLOCK) == 1 then
PA_OUT[BLOCK] = phi_translate (PA_IN[succ(BLOCK)])
- PA_IN[BLOCK] = dependent_clean(PA_OUT[BLOCK] - TMP_GEN[BLOCK]
- - ANTIC_IN[BLOCK])
+ PA_IN[BLOCK] = clean(PA_OUT[BLOCK] - TMP_GEN[BLOCK] - ANTIC_IN[BLOCK])
*/
static void
@@ -2344,7 +2296,7 @@ compute_partial_antic_aux (basic_block block,
/* PA_IN starts with PA_OUT - TMP_GEN.
Then we subtract things from ANTIC_IN. */
- PA_IN (block) = bitmap_set_subtract (PA_OUT, TMP_GEN (block));
+ PA_IN (block) = bitmap_set_subtract_expressions (PA_OUT, TMP_GEN (block));
/* For partial antic, we want to put back in the phi results, since
we will properly avoid making them partially antic over backedges. */
@@ -2354,7 +2306,7 @@ compute_partial_antic_aux (basic_block block,
/* PA_IN[block] = PA_IN[block] - ANTIC_IN[block] */
bitmap_set_subtract_values (PA_IN (block), ANTIC_IN (block));
- dependent_clean (PA_IN (block), ANTIC_IN (block));
+ clean (PA_IN (block), ANTIC_IN (block));
maybe_dump_sets:
if (dump_file && (dump_flags & TDF_DETAILS))
@@ -4088,810 +4040,6 @@ compute_avail (void)
free (worklist);
}
-
-/* Local state for the eliminate domwalk. */
-static vec<gimple *> el_to_remove;
-static vec<gimple *> el_to_fixup;
-static unsigned int el_todo;
-static vec<tree> el_avail;
-static vec<tree> el_avail_stack;
-
-/* Return a leader for OP that is available at the current point of the
- eliminate domwalk. */
-
-static tree
-eliminate_avail (tree op)
-{
- tree valnum = VN_INFO (op)->valnum;
- if (TREE_CODE (valnum) == SSA_NAME)
- {
- if (SSA_NAME_IS_DEFAULT_DEF (valnum))
- return valnum;
- if (el_avail.length () > SSA_NAME_VERSION (valnum))
- return el_avail[SSA_NAME_VERSION (valnum)];
- }
- else if (is_gimple_min_invariant (valnum))
- return valnum;
- return NULL_TREE;
-}
-
-/* At the current point of the eliminate domwalk make OP available. */
-
-static void
-eliminate_push_avail (tree op)
-{
- tree valnum = VN_INFO (op)->valnum;
- if (TREE_CODE (valnum) == SSA_NAME)
- {
- if (el_avail.length () <= SSA_NAME_VERSION (valnum))
- el_avail.safe_grow_cleared (SSA_NAME_VERSION (valnum) + 1);
- tree pushop = op;
- if (el_avail[SSA_NAME_VERSION (valnum)])
- pushop = el_avail[SSA_NAME_VERSION (valnum)];
- el_avail_stack.safe_push (pushop);
- el_avail[SSA_NAME_VERSION (valnum)] = op;
- }
-}
-
-/* Insert the expression recorded by SCCVN for VAL at *GSI. Returns
- the leader for the expression if insertion was successful. */
-
-static tree
-eliminate_insert (gimple_stmt_iterator *gsi, tree val)
-{
- /* We can insert a sequence with a single assignment only. */
- gimple_seq stmts = VN_INFO (val)->expr;
- if (!gimple_seq_singleton_p (stmts))
- return NULL_TREE;
- gassign *stmt = dyn_cast <gassign *> (gimple_seq_first_stmt (stmts));
- if (!stmt
- || (!CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt))
- && gimple_assign_rhs_code (stmt) != VIEW_CONVERT_EXPR
- && gimple_assign_rhs_code (stmt) != BIT_FIELD_REF
- && (gimple_assign_rhs_code (stmt) != BIT_AND_EXPR
- || TREE_CODE (gimple_assign_rhs2 (stmt)) != INTEGER_CST)))
- return NULL_TREE;
-
- tree op = gimple_assign_rhs1 (stmt);
- if (gimple_assign_rhs_code (stmt) == VIEW_CONVERT_EXPR
- || gimple_assign_rhs_code (stmt) == BIT_FIELD_REF)
- op = TREE_OPERAND (op, 0);
- tree leader = TREE_CODE (op) == SSA_NAME ? eliminate_avail (op) : op;
- if (!leader)
- return NULL_TREE;
-
- tree res;
- stmts = NULL;
- if (gimple_assign_rhs_code (stmt) == BIT_FIELD_REF)
- res = gimple_build (&stmts, BIT_FIELD_REF,
- TREE_TYPE (val), leader,
- TREE_OPERAND (gimple_assign_rhs1 (stmt), 1),
- TREE_OPERAND (gimple_assign_rhs1 (stmt), 2));
- else if (gimple_assign_rhs_code (stmt) == BIT_AND_EXPR)
- res = gimple_build (&stmts, BIT_AND_EXPR,
- TREE_TYPE (val), leader, gimple_assign_rhs2 (stmt));
- else
- res = gimple_build (&stmts, gimple_assign_rhs_code (stmt),
- TREE_TYPE (val), leader);
- if (TREE_CODE (res) != SSA_NAME
- || SSA_NAME_IS_DEFAULT_DEF (res)
- || gimple_bb (SSA_NAME_DEF_STMT (res)))
- {
- gimple_seq_discard (stmts);
-
- /* During propagation we have to treat SSA info conservatively
- and thus we can end up simplifying the inserted expression
- at elimination time to sth not defined in stmts. */
- /* But then this is a redundancy we failed to detect. Which means
- res now has two values. That doesn't play well with how
- we track availability here, so give up. */
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- if (TREE_CODE (res) == SSA_NAME)
- res = eliminate_avail (res);
- if (res)
- {
- fprintf (dump_file, "Failed to insert expression for value ");
- print_generic_expr (dump_file, val);
- fprintf (dump_file, " which is really fully redundant to ");
- print_generic_expr (dump_file, res);
- fprintf (dump_file, "\n");
- }
- }
-
- return NULL_TREE;
- }
- else
- {
- gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
- VN_INFO_GET (res)->valnum = val;
- }
-
- pre_stats.insertions++;
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "Inserted ");
- print_gimple_stmt (dump_file, SSA_NAME_DEF_STMT (res), 0);
- }
-
- return res;
-}
-
-class eliminate_dom_walker : public dom_walker
-{
-public:
- eliminate_dom_walker (cdi_direction direction, bool do_pre_)
- : dom_walker (direction), do_pre (do_pre_) {}
-
- virtual edge before_dom_children (basic_block);
- virtual void after_dom_children (basic_block);
-
- bool do_pre;
-};
-
-/* Perform elimination for the basic-block B during the domwalk. */
-
-edge
-eliminate_dom_walker::before_dom_children (basic_block b)
-{
- /* Mark new bb. */
- el_avail_stack.safe_push (NULL_TREE);
-
- /* Skip unreachable blocks marked unreachable during the SCCVN domwalk. */
- edge_iterator ei;
- edge e;
- FOR_EACH_EDGE (e, ei, b->preds)
- if (e->flags & EDGE_EXECUTABLE)
- break;
- if (! e)
- return NULL;
-
- for (gphi_iterator gsi = gsi_start_phis (b); !gsi_end_p (gsi);)
- {
- gphi *phi = gsi.phi ();
- tree res = PHI_RESULT (phi);
-
- if (virtual_operand_p (res))
- {
- gsi_next (&gsi);
- continue;
- }
-
- tree sprime = eliminate_avail (res);
- if (sprime
- && sprime != res)
- {
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "Replaced redundant PHI node defining ");
- print_generic_expr (dump_file, res);
- fprintf (dump_file, " with ");
- print_generic_expr (dump_file, sprime);
- fprintf (dump_file, "\n");
- }
-
- /* If we inserted this PHI node ourself, it's not an elimination. */
- if (inserted_exprs
- && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (res)))
- pre_stats.phis--;
- else
- pre_stats.eliminations++;
-
- /* If we will propagate into all uses don't bother to do
- anything. */
- if (may_propagate_copy (res, sprime))
- {
- /* Mark the PHI for removal. */
- el_to_remove.safe_push (phi);
- gsi_next (&gsi);
- continue;
- }
-
- remove_phi_node (&gsi, false);
-
- if (!useless_type_conversion_p (TREE_TYPE (res), TREE_TYPE (sprime)))
- sprime = fold_convert (TREE_TYPE (res), sprime);
- gimple *stmt = gimple_build_assign (res, sprime);
- gimple_stmt_iterator gsi2 = gsi_after_labels (b);
- gsi_insert_before (&gsi2, stmt, GSI_NEW_STMT);
- continue;
- }
-
- eliminate_push_avail (res);
- gsi_next (&gsi);
- }
-
- for (gimple_stmt_iterator gsi = gsi_start_bb (b);
- !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- tree sprime = NULL_TREE;
- gimple *stmt = gsi_stmt (gsi);
- tree lhs = gimple_get_lhs (stmt);
- if (lhs && TREE_CODE (lhs) == SSA_NAME
- && !gimple_has_volatile_ops (stmt)
- /* See PR43491. Do not replace a global register variable when
- it is a the RHS of an assignment. Do replace local register
- variables since gcc does not guarantee a local variable will
- be allocated in register.
- ??? The fix isn't effective here. This should instead
- be ensured by not value-numbering them the same but treating
- them like volatiles? */
- && !(gimple_assign_single_p (stmt)
- && (TREE_CODE (gimple_assign_rhs1 (stmt)) == VAR_DECL
- && DECL_HARD_REGISTER (gimple_assign_rhs1 (stmt))
- && is_global_var (gimple_assign_rhs1 (stmt)))))
- {
- sprime = eliminate_avail (lhs);
- if (!sprime)
- {
- /* If there is no existing usable leader but SCCVN thinks
- it has an expression it wants to use as replacement,
- insert that. */
- tree val = VN_INFO (lhs)->valnum;
- if (val != VN_TOP
- && TREE_CODE (val) == SSA_NAME
- && VN_INFO (val)->needs_insertion
- && VN_INFO (val)->expr != NULL
- && (sprime = eliminate_insert (&gsi, val)) != NULL_TREE)
- eliminate_push_avail (sprime);
- }
-
- /* If this now constitutes a copy duplicate points-to
- and range info appropriately. This is especially
- important for inserted code. See tree-ssa-copy.c
- for similar code. */
- if (sprime
- && TREE_CODE (sprime) == SSA_NAME)
- {
- basic_block sprime_b = gimple_bb (SSA_NAME_DEF_STMT (sprime));
- if (POINTER_TYPE_P (TREE_TYPE (lhs))
- && VN_INFO_PTR_INFO (lhs)
- && ! VN_INFO_PTR_INFO (sprime))
- {
- duplicate_ssa_name_ptr_info (sprime,
- VN_INFO_PTR_INFO (lhs));
- if (b != sprime_b)
- mark_ptr_info_alignment_unknown
- (SSA_NAME_PTR_INFO (sprime));
- }
- else if (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
- && VN_INFO_RANGE_INFO (lhs)
- && ! VN_INFO_RANGE_INFO (sprime)
- && b == sprime_b)
- duplicate_ssa_name_range_info (sprime,
- VN_INFO_RANGE_TYPE (lhs),
- VN_INFO_RANGE_INFO (lhs));
- }
-
- /* Inhibit the use of an inserted PHI on a loop header when
- the address of the memory reference is a simple induction
- variable. In other cases the vectorizer won't do anything
- anyway (either it's loop invariant or a complicated
- expression). */
- if (sprime
- && TREE_CODE (sprime) == SSA_NAME
- && do_pre
- && (flag_tree_loop_vectorize || flag_tree_parallelize_loops > 1)
- && loop_outer (b->loop_father)
- && has_zero_uses (sprime)
- && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))
- && gimple_assign_load_p (stmt))
- {
- gimple *def_stmt = SSA_NAME_DEF_STMT (sprime);
- basic_block def_bb = gimple_bb (def_stmt);
- if (gimple_code (def_stmt) == GIMPLE_PHI
- && def_bb->loop_father->header == def_bb)
- {
- loop_p loop = def_bb->loop_father;
- ssa_op_iter iter;
- tree op;
- bool found = false;
- FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
- {
- affine_iv iv;
- def_bb = gimple_bb (SSA_NAME_DEF_STMT (op));
- if (def_bb
- && flow_bb_inside_loop_p (loop, def_bb)
- && simple_iv (loop, loop, op, &iv, true))
- {
- found = true;
- break;
- }
- }
- if (found)
- {
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "Not replacing ");
- print_gimple_expr (dump_file, stmt, 0);
- fprintf (dump_file, " with ");
- print_generic_expr (dump_file, sprime);
- fprintf (dump_file, " which would add a loop"
- " carried dependence to loop %d\n",
- loop->num);
- }
- /* Don't keep sprime available. */
- sprime = NULL_TREE;
- }
- }
- }
-
- if (sprime)
- {
- /* If we can propagate the value computed for LHS into
- all uses don't bother doing anything with this stmt. */
- if (may_propagate_copy (lhs, sprime))
- {
- /* Mark it for removal. */
- el_to_remove.safe_push (stmt);
-
- /* ??? Don't count copy/constant propagations. */
- if (gimple_assign_single_p (stmt)
- && (TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
- || gimple_assign_rhs1 (stmt) == sprime))
- continue;
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "Replaced ");
- print_gimple_expr (dump_file, stmt, 0);
- fprintf (dump_file, " with ");
- print_generic_expr (dump_file, sprime);
- fprintf (dump_file, " in all uses of ");
- print_gimple_stmt (dump_file, stmt, 0);
- }
-
- pre_stats.eliminations++;
- continue;
- }
-
- /* If this is an assignment from our leader (which
- happens in the case the value-number is a constant)
- then there is nothing to do. */
- if (gimple_assign_single_p (stmt)
- && sprime == gimple_assign_rhs1 (stmt))
- continue;
-
- /* Else replace its RHS. */
- bool can_make_abnormal_goto
- = is_gimple_call (stmt)
- && stmt_can_make_abnormal_goto (stmt);
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "Replaced ");
- print_gimple_expr (dump_file, stmt, 0);
- fprintf (dump_file, " with ");
- print_generic_expr (dump_file, sprime);
- fprintf (dump_file, " in ");
- print_gimple_stmt (dump_file, stmt, 0);
- }
-
- pre_stats.eliminations++;
- gimple *orig_stmt = stmt;
- if (!useless_type_conversion_p (TREE_TYPE (lhs),
- TREE_TYPE (sprime)))
- sprime = fold_convert (TREE_TYPE (lhs), sprime);
- tree vdef = gimple_vdef (stmt);
- tree vuse = gimple_vuse (stmt);
- propagate_tree_value_into_stmt (&gsi, sprime);
- stmt = gsi_stmt (gsi);
- update_stmt (stmt);
- if (vdef != gimple_vdef (stmt))
- VN_INFO (vdef)->valnum = vuse;
-
- /* If we removed EH side-effects from the statement, clean
- its EH information. */
- if (maybe_clean_or_replace_eh_stmt (orig_stmt, stmt))
- {
- bitmap_set_bit (need_eh_cleanup,
- gimple_bb (stmt)->index);
- if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf (dump_file, " Removed EH side-effects.\n");
- }
-
- /* Likewise for AB side-effects. */
- if (can_make_abnormal_goto
- && !stmt_can_make_abnormal_goto (stmt))
- {
- bitmap_set_bit (need_ab_cleanup,
- gimple_bb (stmt)->index);
- if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf (dump_file, " Removed AB side-effects.\n");
- }
-
- continue;
- }
- }
-
- /* If the statement is a scalar store, see if the expression
- has the same value number as its rhs. If so, the store is
- dead. */
- if (gimple_assign_single_p (stmt)
- && !gimple_has_volatile_ops (stmt)
- && !is_gimple_reg (gimple_assign_lhs (stmt))
- && (TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
- || is_gimple_min_invariant (gimple_assign_rhs1 (stmt))))
- {
- tree val;
- tree rhs = gimple_assign_rhs1 (stmt);
- vn_reference_t vnresult;
- val = vn_reference_lookup (lhs, gimple_vuse (stmt), VN_WALKREWRITE,
- &vnresult, false);
- if (TREE_CODE (rhs) == SSA_NAME)
- rhs = VN_INFO (rhs)->valnum;
- if (val
- && operand_equal_p (val, rhs, 0))
- {
- /* We can only remove the later store if the former aliases
- at least all accesses the later one does or if the store
- was to readonly memory storing the same value. */
- alias_set_type set = get_alias_set (lhs);
- if (! vnresult
- || vnresult->set == set
- || alias_set_subset_of (set, vnresult->set))
- {
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "Deleted redundant store ");
- print_gimple_stmt (dump_file, stmt, 0);
- }
-
- /* Queue stmt for removal. */
- el_to_remove.safe_push (stmt);
- continue;
- }
- }
- }
-
- /* If this is a control statement value numbering left edges
- unexecuted on force the condition in a way consistent with
- that. */
- if (gcond *cond = dyn_cast <gcond *> (stmt))
- {
- if ((EDGE_SUCC (b, 0)->flags & EDGE_EXECUTABLE)
- ^ (EDGE_SUCC (b, 1)->flags & EDGE_EXECUTABLE))
- {
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "Removing unexecutable edge from ");
- print_gimple_stmt (dump_file, stmt, 0);
- }
- if (((EDGE_SUCC (b, 0)->flags & EDGE_TRUE_VALUE) != 0)
- == ((EDGE_SUCC (b, 0)->flags & EDGE_EXECUTABLE) != 0))
- gimple_cond_make_true (cond);
- else
- gimple_cond_make_false (cond);
- update_stmt (cond);
- el_todo |= TODO_cleanup_cfg;
- continue;
- }
- }
-
- bool can_make_abnormal_goto = stmt_can_make_abnormal_goto (stmt);
- bool was_noreturn = (is_gimple_call (stmt)
- && gimple_call_noreturn_p (stmt));
- tree vdef = gimple_vdef (stmt);
- tree vuse = gimple_vuse (stmt);
-
- /* If we didn't replace the whole stmt (or propagate the result
- into all uses), replace all uses on this stmt with their
- leaders. */
- bool modified = false;
- use_operand_p use_p;
- ssa_op_iter iter;
- FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
- {
- tree use = USE_FROM_PTR (use_p);
- /* ??? The call code above leaves stmt operands un-updated. */
- if (TREE_CODE (use) != SSA_NAME)
- continue;
- tree sprime = eliminate_avail (use);
- if (sprime && sprime != use
- && may_propagate_copy (use, sprime)
- /* We substitute into debug stmts to avoid excessive
- debug temporaries created by removed stmts, but we need
- to avoid doing so for inserted sprimes as we never want
- to create debug temporaries for them. */
- && (!inserted_exprs
- || TREE_CODE (sprime) != SSA_NAME
- || !is_gimple_debug (stmt)
- || !bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))))
- {
- propagate_value (use_p, sprime);
- modified = true;
- }
- }
-
- /* Fold the stmt if modified, this canonicalizes MEM_REFs we propagated
- into which is a requirement for the IPA devirt machinery. */
- gimple *old_stmt = stmt;
- if (modified)
- {
- /* If a formerly non-invariant ADDR_EXPR is turned into an
- invariant one it was on a separate stmt. */
- if (gimple_assign_single_p (stmt)
- && TREE_CODE (gimple_assign_rhs1 (stmt)) == ADDR_EXPR)
- recompute_tree_invariant_for_addr_expr (gimple_assign_rhs1 (stmt));
- gimple_stmt_iterator prev = gsi;
- gsi_prev (&prev);
- if (fold_stmt (&gsi))
- {
- /* fold_stmt may have created new stmts inbetween
- the previous stmt and the folded stmt. Mark
- all defs created there as varying to not confuse
- the SCCVN machinery as we're using that even during
- elimination. */
- if (gsi_end_p (prev))
- prev = gsi_start_bb (b);
- else
- gsi_next (&prev);
- if (gsi_stmt (prev) != gsi_stmt (gsi))
- do
- {
- tree def;
- ssa_op_iter dit;
- FOR_EACH_SSA_TREE_OPERAND (def, gsi_stmt (prev),
- dit, SSA_OP_ALL_DEFS)
- /* As existing DEFs may move between stmts
- we have to guard VN_INFO_GET. */
- if (! has_VN_INFO (def))
- VN_INFO_GET (def)->valnum = def;
- if (gsi_stmt (prev) == gsi_stmt (gsi))
- break;
- gsi_next (&prev);
- }
- while (1);
- }
- stmt = gsi_stmt (gsi);
- /* In case we folded the stmt away schedule the NOP for removal. */
- if (gimple_nop_p (stmt))
- el_to_remove.safe_push (stmt);
- }
-
- /* Visit indirect calls and turn them into direct calls if
- possible using the devirtualization machinery. Do this before
- checking for required EH/abnormal/noreturn cleanup as devird
- may expose more of those. */
- if (gcall *call_stmt = dyn_cast <gcall *> (stmt))
- {
- tree fn = gimple_call_fn (call_stmt);
- if (fn
- && flag_devirtualize
- && virtual_method_call_p (fn))
- {
- tree otr_type = obj_type_ref_class (fn);
- unsigned HOST_WIDE_INT otr_tok
- = tree_to_uhwi (OBJ_TYPE_REF_TOKEN (fn));
- tree instance;
- ipa_polymorphic_call_context context (current_function_decl,
- fn, stmt, &instance);
- context.get_dynamic_type (instance, OBJ_TYPE_REF_OBJECT (fn),
- otr_type, stmt);
- bool final;
- vec <cgraph_node *> targets
- = possible_polymorphic_call_targets (obj_type_ref_class (fn),
- otr_tok, context, &final);
- if (dump_file)
- dump_possible_polymorphic_call_targets (dump_file,
- obj_type_ref_class (fn),
- otr_tok, context);
- if (final && targets.length () <= 1 && dbg_cnt (devirt))
- {
- tree fn;
- if (targets.length () == 1)
- fn = targets[0]->decl;
- else
- fn = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
- if (dump_enabled_p ())
- {
- location_t loc = gimple_location (stmt);
- dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc,
- "converting indirect call to "
- "function %s\n",
- lang_hooks.decl_printable_name (fn, 2));
- }
- gimple_call_set_fndecl (call_stmt, fn);
- /* If changing the call to __builtin_unreachable
- or similar noreturn function, adjust gimple_call_fntype
- too. */
- if (gimple_call_noreturn_p (call_stmt)
- && VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fn)))
- && TYPE_ARG_TYPES (TREE_TYPE (fn))
- && (TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fn)))
- == void_type_node))
- gimple_call_set_fntype (call_stmt, TREE_TYPE (fn));
- maybe_remove_unused_call_args (cfun, call_stmt);
- modified = true;
- }
- }
- }
-
- if (modified)
- {
- /* When changing a call into a noreturn call, cfg cleanup
- is needed to fix up the noreturn call. */
- if (!was_noreturn
- && is_gimple_call (stmt) && gimple_call_noreturn_p (stmt))
- el_to_fixup.safe_push (stmt);
- /* When changing a condition or switch into one we know what
- edge will be executed, schedule a cfg cleanup. */
- if ((gimple_code (stmt) == GIMPLE_COND
- && (gimple_cond_true_p (as_a <gcond *> (stmt))
- || gimple_cond_false_p (as_a <gcond *> (stmt))))
- || (gimple_code (stmt) == GIMPLE_SWITCH
- && TREE_CODE (gimple_switch_index
- (as_a <gswitch *> (stmt))) == INTEGER_CST))
- el_todo |= TODO_cleanup_cfg;
- /* If we removed EH side-effects from the statement, clean
- its EH information. */
- if (maybe_clean_or_replace_eh_stmt (old_stmt, stmt))
- {
- bitmap_set_bit (need_eh_cleanup,
- gimple_bb (stmt)->index);
- if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf (dump_file, " Removed EH side-effects.\n");
- }
- /* Likewise for AB side-effects. */
- if (can_make_abnormal_goto
- && !stmt_can_make_abnormal_goto (stmt))
- {
- bitmap_set_bit (need_ab_cleanup,
- gimple_bb (stmt)->index);
- if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf (dump_file, " Removed AB side-effects.\n");
- }
- update_stmt (stmt);
- if (vdef != gimple_vdef (stmt))
- VN_INFO (vdef)->valnum = vuse;
- }
-
- /* Make new values available - for fully redundant LHS we
- continue with the next stmt above and skip this. */
- def_operand_p defp;
- FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_DEF)
- eliminate_push_avail (DEF_FROM_PTR (defp));
- }
-
- /* Replace destination PHI arguments. */
- FOR_EACH_EDGE (e, ei, b->succs)
- if (e->flags & EDGE_EXECUTABLE)
- for (gphi_iterator gsi = gsi_start_phis (e->dest);
- !gsi_end_p (gsi);
- gsi_next (&gsi))
- {
- gphi *phi = gsi.phi ();
- use_operand_p use_p = PHI_ARG_DEF_PTR_FROM_EDGE (phi, e);
- tree arg = USE_FROM_PTR (use_p);
- if (TREE_CODE (arg) != SSA_NAME
- || virtual_operand_p (arg))
- continue;
- tree sprime = eliminate_avail (arg);
- if (sprime && may_propagate_copy (arg, sprime))
- propagate_value (use_p, sprime);
- }
- return NULL;
-}
-
-/* Make no longer available leaders no longer available. */
-
-void
-eliminate_dom_walker::after_dom_children (basic_block)
-{
- tree entry;
- while ((entry = el_avail_stack.pop ()) != NULL_TREE)
- {
- tree valnum = VN_INFO (entry)->valnum;
- tree old = el_avail[SSA_NAME_VERSION (valnum)];
- if (old == entry)
- el_avail[SSA_NAME_VERSION (valnum)] = NULL_TREE;
- else
- el_avail[SSA_NAME_VERSION (valnum)] = entry;
- }
-}
-
-/* Eliminate fully redundant computations. */
-
-static unsigned int
-eliminate (bool do_pre)
-{
- need_eh_cleanup = BITMAP_ALLOC (NULL);
- need_ab_cleanup = BITMAP_ALLOC (NULL);
-
- el_to_remove.create (0);
- el_to_fixup.create (0);
- el_todo = 0;
- el_avail.create (num_ssa_names);
- el_avail_stack.create (0);
-
- eliminate_dom_walker (CDI_DOMINATORS,
- do_pre).walk (cfun->cfg->x_entry_block_ptr);
-
- el_avail.release ();
- el_avail_stack.release ();
-
- return el_todo;
-}
-
-/* Perform CFG cleanups made necessary by elimination. */
-
-static unsigned
-fini_eliminate (void)
-{
- gimple_stmt_iterator gsi;
- gimple *stmt;
- unsigned todo = 0;
-
- /* We cannot remove stmts during BB walk, especially not release SSA
- names there as this confuses the VN machinery. The stmts ending
- up in el_to_remove are either stores or simple copies.
- Remove stmts in reverse order to make debug stmt creation possible. */
- while (!el_to_remove.is_empty ())
- {
- stmt = el_to_remove.pop ();
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "Removing dead stmt ");
- print_gimple_stmt (dump_file, stmt, 0, 0);
- }
-
- gsi = gsi_for_stmt (stmt);
- if (gimple_code (stmt) == GIMPLE_PHI)
- remove_phi_node (&gsi, true);
- else
- {
- basic_block bb = gimple_bb (stmt);
- unlink_stmt_vdef (stmt);
- if (gsi_remove (&gsi, true))
- bitmap_set_bit (need_eh_cleanup, bb->index);
- if (is_gimple_call (stmt) && stmt_can_make_abnormal_goto (stmt))
- bitmap_set_bit (need_ab_cleanup, bb->index);
- release_defs (stmt);
- }
-
- /* Removing a stmt may expose a forwarder block. */
- todo |= TODO_cleanup_cfg;
- }
- el_to_remove.release ();
-
- /* Fixup stmts that became noreturn calls. This may require splitting
- blocks and thus isn't possible during the dominator walk. Do this
- in reverse order so we don't inadvertedly remove a stmt we want to
- fixup by visiting a dominating now noreturn call first. */
- while (!el_to_fixup.is_empty ())
- {
- stmt = el_to_fixup.pop ();
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "Fixing up noreturn call ");
- print_gimple_stmt (dump_file, stmt, 0);
- }
-
- if (fixup_noreturn_call (stmt))
- todo |= TODO_cleanup_cfg;
- }
- el_to_fixup.release ();
-
- bool do_eh_cleanup = !bitmap_empty_p (need_eh_cleanup);
- bool do_ab_cleanup = !bitmap_empty_p (need_ab_cleanup);
-
- if (do_eh_cleanup)
- gimple_purge_all_dead_eh_edges (need_eh_cleanup);
-
- if (do_ab_cleanup)
- gimple_purge_all_dead_abnormal_call_edges (need_ab_cleanup);
-
- BITMAP_FREE (need_eh_cleanup);
- BITMAP_FREE (need_ab_cleanup);
-
- if (do_eh_cleanup || do_ab_cleanup)
- todo |= TODO_cleanup_cfg;
- return todo;
-}
-
/* Cheap DCE of a known set of possibly dead stmts.
Because we don't follow exactly the standard PRE algorithm, and decide not
@@ -5078,18 +4226,16 @@ pass_pre::execute (function *fun)
gcc_assert (!need_ssa_update_p (fun));
/* Remove all the redundant expressions. */
- todo |= eliminate (true);
+ todo |= vn_eliminate (inserted_exprs);
statistics_counter_event (fun, "Insertions", pre_stats.insertions);
statistics_counter_event (fun, "PA inserted", pre_stats.pa_insert);
statistics_counter_event (fun, "HOIST inserted", pre_stats.hoist_insert);
statistics_counter_event (fun, "New PHIs", pre_stats.phis);
- statistics_counter_event (fun, "Eliminated", pre_stats.eliminations);
clear_expression_ids ();
scev_finalize ();
- todo |= fini_eliminate ();
remove_dead_inserted_code ();
fini_pre ();
loop_optimizer_finalize ();
@@ -5125,63 +4271,3 @@ make_pass_pre (gcc::context *ctxt)
{
return new pass_pre (ctxt);
}
-
-namespace {
-
-const pass_data pass_data_fre =
-{
- GIMPLE_PASS, /* type */
- "fre", /* name */
- OPTGROUP_NONE, /* optinfo_flags */
- TV_TREE_FRE, /* tv_id */
- ( PROP_cfg | PROP_ssa ), /* properties_required */
- 0, /* properties_provided */
- 0, /* properties_destroyed */
- 0, /* todo_flags_start */
- 0, /* todo_flags_finish */
-};
-
-class pass_fre : public gimple_opt_pass
-{
-public:
- pass_fre (gcc::context *ctxt)
- : gimple_opt_pass (pass_data_fre, ctxt)
- {}
-
- /* opt_pass methods: */
- opt_pass * clone () { return new pass_fre (m_ctxt); }
- virtual bool gate (function *) { return flag_tree_fre != 0; }
- virtual unsigned int execute (function *);
-
-}; // class pass_fre
-
-unsigned int
-pass_fre::execute (function *fun)
-{
- unsigned int todo = 0;
-
- run_scc_vn (VN_WALKREWRITE);
-
- memset (&pre_stats, 0, sizeof (pre_stats));
-
- /* Remove all the redundant expressions. */
- todo |= eliminate (false);
-
- todo |= fini_eliminate ();
-
- scc_vn_restore_ssa_info ();
- free_scc_vn ();
-
- statistics_counter_event (fun, "Insertions", pre_stats.insertions);
- statistics_counter_event (fun, "Eliminated", pre_stats.eliminations);
-
- return todo;
-}
-
-} // anon namespace
-
-gimple_opt_pass *
-make_pass_fre (gcc::context *ctxt)
-{
- return new pass_fre (ctxt);
-}
diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c
index 00ab3d72564..62955bef9c5 100644
--- a/gcc/tree-ssa-propagate.c
+++ b/gcc/tree-ssa-propagate.c
@@ -108,10 +108,6 @@
[3] Advanced Compiler Design and Implementation,
Steven Muchnick, Morgan Kaufmann, 1997, Section 12.6 */
-/* Function pointers used to parameterize the propagation engine. */
-static ssa_prop_visit_stmt_fn ssa_prop_visit_stmt;
-static ssa_prop_visit_phi_fn ssa_prop_visit_phi;
-
/* Worklist of control flow edge destinations. This contains
the CFG order number of the blocks so we can iterate in CFG
order by visiting in bit-order. */
@@ -217,8 +213,8 @@ add_control_edge (edge e)
/* Simulate the execution of STMT and update the work lists accordingly. */
-static void
-simulate_stmt (gimple *stmt)
+void
+ssa_propagation_engine::simulate_stmt (gimple *stmt)
{
enum ssa_prop_result val = SSA_PROP_NOT_INTERESTING;
edge taken_edge = NULL;
@@ -234,11 +230,11 @@ simulate_stmt (gimple *stmt)
if (gimple_code (stmt) == GIMPLE_PHI)
{
- val = ssa_prop_visit_phi (as_a <gphi *> (stmt));
+ val = visit_phi (as_a <gphi *> (stmt));
output_name = gimple_phi_result (stmt);
}
else
- val = ssa_prop_visit_stmt (stmt, &taken_edge, &output_name);
+ val = visit_stmt (stmt, &taken_edge, &output_name);
if (val == SSA_PROP_VARYING)
{
@@ -321,8 +317,8 @@ simulate_stmt (gimple *stmt)
when an SSA edge is added to it in simulate_stmt. Return true if a stmt
was simulated. */
-static void
-process_ssa_edge_worklist ()
+void
+ssa_propagation_engine::process_ssa_edge_worklist (void)
{
/* Process the next entry from the worklist. */
unsigned stmt_uid = bitmap_first_set_bit (ssa_edge_worklist);
@@ -345,8 +341,8 @@ process_ssa_edge_worklist ()
/* Simulate the execution of BLOCK. Evaluate the statement associated
with each variable reference inside the block. */
-static void
-simulate_block (basic_block block)
+void
+ssa_propagation_engine::simulate_block (basic_block block)
{
gimple_stmt_iterator gsi;
@@ -781,19 +777,15 @@ update_call_from_tree (gimple_stmt_iterator *si_p, tree expr)
return false;
}
-
/* Entry point to the propagation engine.
- VISIT_STMT is called for every statement visited.
- VISIT_PHI is called for every PHI node visited. */
+ The VISIT_STMT virtual function is called for every statement
+ visited and the VISIT_PHI virtual function is called for every PHI
+ node visited. */
void
-ssa_propagate (ssa_prop_visit_stmt_fn visit_stmt,
- ssa_prop_visit_phi_fn visit_phi)
+ssa_propagation_engine::ssa_propagate (void)
{
- ssa_prop_visit_stmt = visit_stmt;
- ssa_prop_visit_phi = visit_phi;
-
ssa_prop_init ();
/* Iterate until the worklists are empty. */
@@ -861,7 +853,7 @@ static struct prop_stats_d prop_stats;
PROP_VALUE. Return true if at least one reference was replaced. */
bool
-replace_uses_in (gimple *stmt, ssa_prop_get_value_fn get_value)
+substitute_and_fold_engine::replace_uses_in (gimple *stmt)
{
bool replaced = false;
use_operand_p use;
@@ -870,7 +862,7 @@ replace_uses_in (gimple *stmt, ssa_prop_get_value_fn get_value)
FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
{
tree tuse = USE_FROM_PTR (use);
- tree val = (*get_value) (tuse);
+ tree val = get_value (tuse);
if (val == tuse || val == NULL_TREE)
continue;
@@ -899,8 +891,8 @@ replace_uses_in (gimple *stmt, ssa_prop_get_value_fn get_value)
/* Replace propagated values into all the arguments for PHI using the
values from PROP_VALUE. */
-static bool
-replace_phi_args_in (gphi *phi, ssa_prop_get_value_fn get_value)
+bool
+substitute_and_fold_engine::replace_phi_args_in (gphi *phi)
{
size_t i;
bool replaced = false;
@@ -917,7 +909,7 @@ replace_phi_args_in (gphi *phi, ssa_prop_get_value_fn get_value)
if (TREE_CODE (arg) == SSA_NAME)
{
- tree val = (*get_value) (arg);
+ tree val = get_value (arg);
if (val && val != arg && may_propagate_copy (arg, val))
{
@@ -968,10 +960,10 @@ class substitute_and_fold_dom_walker : public dom_walker
{
public:
substitute_and_fold_dom_walker (cdi_direction direction,
- ssa_prop_get_value_fn get_value_fn_,
- ssa_prop_fold_stmt_fn fold_fn_)
- : dom_walker (direction), get_value_fn (get_value_fn_),
- fold_fn (fold_fn_), something_changed (false)
+ class substitute_and_fold_engine *engine)
+ : dom_walker (direction),
+ something_changed (false),
+ substitute_and_fold_engine (engine)
{
stmts_to_remove.create (0);
stmts_to_fixup.create (0);
@@ -987,12 +979,12 @@ public:
virtual edge before_dom_children (basic_block);
virtual void after_dom_children (basic_block) {}
- ssa_prop_get_value_fn get_value_fn;
- ssa_prop_fold_stmt_fn fold_fn;
bool something_changed;
vec<gimple *> stmts_to_remove;
vec<gimple *> stmts_to_fixup;
bitmap need_eh_cleanup;
+
+ class substitute_and_fold_engine *substitute_and_fold_engine;
};
edge
@@ -1009,7 +1001,7 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb)
continue;
if (res && TREE_CODE (res) == SSA_NAME)
{
- tree sprime = get_value_fn (res);
+ tree sprime = substitute_and_fold_engine->get_value (res);
if (sprime
&& sprime != res
&& may_propagate_copy (res, sprime))
@@ -1018,7 +1010,7 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb)
continue;
}
}
- something_changed |= replace_phi_args_in (phi, get_value_fn);
+ something_changed |= substitute_and_fold_engine->replace_phi_args_in (phi);
}
/* Propagate known values into stmts. In some case it exposes
@@ -1035,7 +1027,7 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb)
tree lhs = gimple_get_lhs (stmt);
if (lhs && TREE_CODE (lhs) == SSA_NAME)
{
- tree sprime = get_value_fn (lhs);
+ tree sprime = substitute_and_fold_engine->get_value (lhs);
if (sprime
&& sprime != lhs
&& may_propagate_copy (lhs, sprime)
@@ -1064,7 +1056,7 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb)
&& gimple_call_noreturn_p (stmt));
/* Replace real uses in the statement. */
- did_replace |= replace_uses_in (stmt, get_value_fn);
+ did_replace |= substitute_and_fold_engine->replace_uses_in (stmt);
/* If we made a replacement, fold the statement. */
if (did_replace)
@@ -1077,16 +1069,13 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb)
/* Some statements may be simplified using propagator
specific information. Do this before propagating
into the stmt to not disturb pass specific information. */
- if (fold_fn)
+ update_stmt_if_modified (stmt);
+	  if (substitute_and_fold_engine->fold_stmt (&i))
{
- update_stmt_if_modified (stmt);
- if ((*fold_fn)(&i))
- {
- did_replace = true;
- prop_stats.num_stmts_folded++;
- stmt = gsi_stmt (i);
- gimple_set_modified (stmt, true);
- }
+ did_replace = true;
+ prop_stats.num_stmts_folded++;
+ stmt = gsi_stmt (i);
+ gimple_set_modified (stmt, true);
}
/* If this is a control statement the propagator left edges
@@ -1172,19 +1161,15 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb)
Return TRUE when something changed. */
bool
-substitute_and_fold (ssa_prop_get_value_fn get_value_fn,
- ssa_prop_fold_stmt_fn fold_fn)
+substitute_and_fold_engine::substitute_and_fold (void)
{
- gcc_assert (get_value_fn);
-
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "\nSubstituting values and folding statements\n\n");
memset (&prop_stats, 0, sizeof (prop_stats));
calculate_dominance_info (CDI_DOMINATORS);
- substitute_and_fold_dom_walker walker(CDI_DOMINATORS,
- get_value_fn, fold_fn);
+ substitute_and_fold_dom_walker walker (CDI_DOMINATORS, this);
walker.walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
/* We cannot remove stmts during the BB walk, especially not release
diff --git a/gcc/tree-ssa-propagate.h b/gcc/tree-ssa-propagate.h
index 9a8ecc845d1..be4500bc83a 100644
--- a/gcc/tree-ssa-propagate.h
+++ b/gcc/tree-ssa-propagate.h
@@ -61,21 +61,11 @@ enum ssa_prop_result {
};
-/* Call-back functions used by the value propagation engine. */
-typedef enum ssa_prop_result (*ssa_prop_visit_stmt_fn) (gimple *, edge *,
- tree *);
-typedef enum ssa_prop_result (*ssa_prop_visit_phi_fn) (gphi *);
-typedef bool (*ssa_prop_fold_stmt_fn) (gimple_stmt_iterator *gsi);
-typedef tree (*ssa_prop_get_value_fn) (tree);
-
-
extern bool valid_gimple_rhs_p (tree);
extern void move_ssa_defining_stmt_for_defs (gimple *, gimple *);
extern bool update_gimple_call (gimple_stmt_iterator *, tree, int, ...);
extern bool update_call_from_tree (gimple_stmt_iterator *, tree);
-extern void ssa_propagate (ssa_prop_visit_stmt_fn, ssa_prop_visit_phi_fn);
extern bool stmt_makes_single_store (gimple *);
-extern bool substitute_and_fold (ssa_prop_get_value_fn, ssa_prop_fold_stmt_fn);
extern bool may_propagate_copy (tree, tree);
extern bool may_propagate_copy_into_stmt (gimple *, tree);
extern bool may_propagate_copy_into_asm (tree);
@@ -83,6 +73,42 @@ extern void propagate_value (use_operand_p, tree);
extern void replace_exp (use_operand_p, tree);
extern void propagate_tree_value (tree *, tree);
extern void propagate_tree_value_into_stmt (gimple_stmt_iterator *, tree);
-extern bool replace_uses_in (gimple *stmt, ssa_prop_get_value_fn get_value);
+
+/* Public interface into the SSA propagation engine. Clients should inherit
+ from this class and provide their own visitors. */
+
+class ssa_propagation_engine
+{
+ public:
+
+ virtual ~ssa_propagation_engine (void) { }
+
+ /* Virtual functions the clients must provide to visit statements
+ and phi nodes respectively. */
+ virtual enum ssa_prop_result visit_stmt (gimple *, edge *, tree *) = 0;
+ virtual enum ssa_prop_result visit_phi (gphi *) = 0;
+
+ /* Main interface into the propagation engine. */
+ void ssa_propagate (void);
+
+ private:
+ /* Internal implementation details. */
+ void simulate_stmt (gimple *stmt);
+ void process_ssa_edge_worklist (void);
+ void simulate_block (basic_block);
+
+};
+
+class substitute_and_fold_engine
+{
+ public:
+ virtual ~substitute_and_fold_engine (void) { }
+ virtual bool fold_stmt (gimple_stmt_iterator *) { return false; }
+ virtual tree get_value (tree) { return NULL_TREE; }
+
+ bool substitute_and_fold (void);
+ bool replace_uses_in (gimple *);
+ bool replace_phi_args_in (gphi *);
+};
#endif /* _TREE_SSA_PROPAGATE_H */
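
The header changes above replace the old callback typedefs with two small class interfaces: passes now subclass ssa_propagation_engine (overriding visit_stmt and visit_phi) or substitute_and_fold_engine (overriding get_value and optionally fold_stmt).  A self-contained toy model of the subclassing pattern, with simplified stand-in types instead of gimple/tree, purely to show the shape of a client:

#include <iostream>
#include <vector>

struct stmt { int id; };                       /* stand-in for gimple *  */

enum prop_result { NOT_INTERESTING, INTERESTING, VARYING };

class propagation_engine_model
{
public:
  virtual ~propagation_engine_model () {}
  /* The client must supply the visitor, mirroring visit_stmt above.  */
  virtual prop_result visit_stmt (const stmt &) = 0;

  /* Mirrors ssa_propagate: the engine drives the walk and calls back
     into the virtual visitor.  */
  void propagate (const std::vector<stmt> &stmts)
  {
    for (const stmt &s : stmts)
      if (visit_stmt (s) == VARYING)
        std::cout << "stmt " << s.id << " is varying\n";
  }
};

/* A pass simply derives from the engine and provides its lattice logic.  */
class oddness_propagator : public propagation_engine_model
{
  prop_result visit_stmt (const stmt &s) override
  {
    return (s.id % 2) ? VARYING : INTERESTING;
  }
};

int
main ()
{
  oddness_propagator p;
  p.propagate ({ {1}, {2}, {3} });             /* reports stmts 1 and 3 */
}

In the real engine the worklists, the CFG simulation and the substitute_and_fold dominator walk stay inside the base classes exactly as before; only the parameterization changed from function pointers to virtual functions.
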
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index cc57ae320a3..5e8cac69d5d 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -5625,6 +5625,7 @@ attempt_builtin_copysign (vec<operand_entry *> *ops)
switch (gimple_call_combined_fn (old_call))
{
CASE_CFN_COPYSIGN:
+ CASE_CFN_COPYSIGN_FN:
arg0 = gimple_call_arg (old_call, 0);
arg1 = gimple_call_arg (old_call, 1);
/* The first argument of copysign must be a constant,
@@ -5910,7 +5911,7 @@ reassociate_bb (basic_block bb)
move it to the front. This helps ensure that we generate
(X & Y) & C rather than (X & C) & Y. The former will
often match a canonical bit test when we get to RTL. */
- if (ops.length () != 2
+ if (ops.length () > 2
&& (rhs_code == BIT_AND_EXPR
|| rhs_code == BIT_IOR_EXPR
|| rhs_code == BIT_XOR_EXPR)
@@ -6033,12 +6034,10 @@ branch_fixup (void)
edge etrue = make_edge (cond_bb, merge_bb, EDGE_TRUE_VALUE);
etrue->probability = profile_probability::even ();
- etrue->count = cond_bb->count.apply_scale (1, 2);
edge efalse = find_edge (cond_bb, then_bb);
efalse->flags = EDGE_FALSE_VALUE;
efalse->probability -= etrue->probability;
- efalse->count -= etrue->count;
- then_bb->count -= etrue->count;
+ then_bb->count -= etrue->count ();
tree othervar = NULL_TREE;
if (gimple_assign_rhs1 (use_stmt) == var)
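
The reassociate_bb tweak above (ops.length () > 2 instead of != 2) restricts the constant-to-front rotation to operand lists with more than two entries; the goal, per the existing comment, is to associate as (X & Y) & C rather than (X & C) & Y so a canonical bit test can be matched at RTL level.  A small illustration of the two equivalent associations (plain C++, illustrative names):

#include <iostream>

/* Both return the same value; the first association keeps the constant
   mask outermost, which is the shape the comment above prefers.  */
static bool
bit_test_canonical (unsigned x, unsigned y, unsigned c)
{
  return ((x & y) & c) != 0;
}

static bool
bit_test_rotated (unsigned x, unsigned y, unsigned c)
{
  return ((x & c) & y) != 0;
}

int
main ()
{
  std::cout << bit_test_canonical (0xA, 0x6, 0x2) << " "
            << bit_test_rotated (0xA, 0x6, 0x2) << "\n";   /* 1 1 */
}
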
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 1a130d0d133..a5beb2a1c9f 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -55,13 +55,21 @@ along with GCC; see the file COPYING3. If not see
#include "cfgloop.h"
#include "params.h"
#include "tree-ssa-propagate.h"
-#include "tree-ssa-sccvn.h"
#include "tree-cfg.h"
#include "domwalk.h"
#include "gimple-iterator.h"
#include "gimple-match.h"
#include "stringpool.h"
#include "attribs.h"
+#include "tree-pass.h"
+#include "statistics.h"
+#include "langhooks.h"
+#include "ipa-utils.h"
+#include "dbgcnt.h"
+#include "tree-cfgcleanup.h"
+#include "tree-ssa-loop.h"
+#include "tree-scalar-evolution.h"
+#include "tree-ssa-sccvn.h"
/* This algorithm is based on the SCC algorithm presented by Keith
Cooper and L. Taylor Simpson in "SCC-Based Value numbering"
@@ -791,7 +799,7 @@ copy_reference_ops_from_ref (tree ref, vec<vn_reference_op_s> *result)
(checking cfun->after_inlining does the
trick here). */
if (TREE_CODE (orig) != ADDR_EXPR
- || maybe_nonzero (off)
+ || may_ne (off, 0)
|| cfun->after_inlining)
off.to_shwi (&temp.off);
}
@@ -1212,7 +1220,7 @@ vn_reference_maybe_forwprop_address (vec<vn_reference_op_s> *ops,
dereference isn't offsetted. */
if (!addr_base
&& *i_p == ops->length () - 1
- && known_zero (off)
+ && must_eq (off, 0)
/* This makes us disable this transform for PRE where the
reference ops might be also used for code insertion which
is invalid. */
@@ -2148,7 +2156,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *vr_,
copy_reference_ops_from_ref (gimple_assign_rhs1 (def_stmt), &rhs);
/* Apply an extra offset to the inner MEM_REF of the RHS. */
- if (maybe_nonzero (extra_off))
+ if (may_ne (extra_off, 0))
{
if (rhs.length () < 2
|| rhs[0].opcode != MEM_REF
@@ -4824,34 +4832,18 @@ sccvn_dom_walker::after_dom_children (basic_block bb)
edge
sccvn_dom_walker::before_dom_children (basic_block bb)
{
- edge e;
- edge_iterator ei;
-
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "Visiting BB %d\n", bb->index);
/* If we have a single predecessor record the equivalence from a
possible condition on the predecessor edge. */
- edge pred_e = NULL;
- FOR_EACH_EDGE (e, ei, bb->preds)
- {
- /* Ignore simple backedges from this to allow recording conditions
- in loop headers. */
- if (dominated_by_p (CDI_DOMINATORS, e->src, e->dest))
- continue;
- if (! pred_e)
- pred_e = e;
- else
- {
- pred_e = NULL;
- break;
- }
- }
+ edge pred_e = single_pred_edge_ignoring_loop_edges (bb, false);
if (pred_e)
{
/* Check if there are multiple executable successor edges in
the source block. Otherwise there is no additional info
to be recorded. */
+ edge_iterator ei;
edge e2;
FOR_EACH_EDGE (e2, ei, pred_e->src->succs)
if (e2 != pred_e
@@ -5134,3 +5126,868 @@ vn_nary_may_trap (vn_nary_op_t nary)
return false;
}
+
+
+class eliminate_dom_walker : public dom_walker
+{
+public:
+ eliminate_dom_walker (cdi_direction, bitmap);
+ ~eliminate_dom_walker ();
+
+ virtual edge before_dom_children (basic_block);
+ virtual void after_dom_children (basic_block);
+
+ tree eliminate_avail (tree op);
+ void eliminate_push_avail (tree op);
+ tree eliminate_insert (gimple_stmt_iterator *gsi, tree val);
+
+ bool do_pre;
+ unsigned int el_todo;
+ unsigned int eliminations;
+ unsigned int insertions;
+
+ /* SSA names that had their defs inserted by PRE if do_pre. */
+ bitmap inserted_exprs;
+
+ /* Blocks with statements that have had their EH properties changed. */
+ bitmap need_eh_cleanup;
+
+ /* Blocks with statements that have had their AB properties changed. */
+ bitmap need_ab_cleanup;
+
+ auto_vec<gimple *> to_remove;
+ auto_vec<gimple *> to_fixup;
+ auto_vec<tree> avail;
+ auto_vec<tree> avail_stack;
+};
+
+eliminate_dom_walker::eliminate_dom_walker (cdi_direction direction,
+ bitmap inserted_exprs_)
+ : dom_walker (direction), do_pre (inserted_exprs_ != NULL),
+ el_todo (0), eliminations (0), insertions (0),
+ inserted_exprs (inserted_exprs_)
+{
+ need_eh_cleanup = BITMAP_ALLOC (NULL);
+ need_ab_cleanup = BITMAP_ALLOC (NULL);
+}
+
+eliminate_dom_walker::~eliminate_dom_walker ()
+{
+ BITMAP_FREE (need_eh_cleanup);
+ BITMAP_FREE (need_ab_cleanup);
+}
+
+/* Return a leader for OP that is available at the current point of the
+ eliminate domwalk. */
+
+tree
+eliminate_dom_walker::eliminate_avail (tree op)
+{
+ tree valnum = VN_INFO (op)->valnum;
+ if (TREE_CODE (valnum) == SSA_NAME)
+ {
+ if (SSA_NAME_IS_DEFAULT_DEF (valnum))
+ return valnum;
+ if (avail.length () > SSA_NAME_VERSION (valnum))
+ return avail[SSA_NAME_VERSION (valnum)];
+ }
+ else if (is_gimple_min_invariant (valnum))
+ return valnum;
+ return NULL_TREE;
+}
+
+/* At the current point of the eliminate domwalk make OP available. */
+
+void
+eliminate_dom_walker::eliminate_push_avail (tree op)
+{
+ tree valnum = VN_INFO (op)->valnum;
+ if (TREE_CODE (valnum) == SSA_NAME)
+ {
+ if (avail.length () <= SSA_NAME_VERSION (valnum))
+ avail.safe_grow_cleared (SSA_NAME_VERSION (valnum) + 1);
+ tree pushop = op;
+ if (avail[SSA_NAME_VERSION (valnum)])
+ pushop = avail[SSA_NAME_VERSION (valnum)];
+ avail_stack.safe_push (pushop);
+ avail[SSA_NAME_VERSION (valnum)] = op;
+ }
+}
+
+/* Insert the expression recorded by SCCVN for VAL at *GSI. Returns
+ the leader for the expression if insertion was successful. */
+
+tree
+eliminate_dom_walker::eliminate_insert (gimple_stmt_iterator *gsi, tree val)
+{
+ /* We can insert a sequence with a single assignment only. */
+ gimple_seq stmts = VN_INFO (val)->expr;
+ if (!gimple_seq_singleton_p (stmts))
+ return NULL_TREE;
+ gassign *stmt = dyn_cast <gassign *> (gimple_seq_first_stmt (stmts));
+ if (!stmt
+ || (!CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt))
+ && gimple_assign_rhs_code (stmt) != VIEW_CONVERT_EXPR
+ && gimple_assign_rhs_code (stmt) != BIT_FIELD_REF
+ && (gimple_assign_rhs_code (stmt) != BIT_AND_EXPR
+ || TREE_CODE (gimple_assign_rhs2 (stmt)) != INTEGER_CST)))
+ return NULL_TREE;
+
+ tree op = gimple_assign_rhs1 (stmt);
+ if (gimple_assign_rhs_code (stmt) == VIEW_CONVERT_EXPR
+ || gimple_assign_rhs_code (stmt) == BIT_FIELD_REF)
+ op = TREE_OPERAND (op, 0);
+ tree leader = TREE_CODE (op) == SSA_NAME ? eliminate_avail (op) : op;
+ if (!leader)
+ return NULL_TREE;
+
+ tree res;
+ stmts = NULL;
+ if (gimple_assign_rhs_code (stmt) == BIT_FIELD_REF)
+ res = gimple_build (&stmts, BIT_FIELD_REF,
+ TREE_TYPE (val), leader,
+ TREE_OPERAND (gimple_assign_rhs1 (stmt), 1),
+ TREE_OPERAND (gimple_assign_rhs1 (stmt), 2));
+ else if (gimple_assign_rhs_code (stmt) == BIT_AND_EXPR)
+ res = gimple_build (&stmts, BIT_AND_EXPR,
+ TREE_TYPE (val), leader, gimple_assign_rhs2 (stmt));
+ else
+ res = gimple_build (&stmts, gimple_assign_rhs_code (stmt),
+ TREE_TYPE (val), leader);
+ if (TREE_CODE (res) != SSA_NAME
+ || SSA_NAME_IS_DEFAULT_DEF (res)
+ || gimple_bb (SSA_NAME_DEF_STMT (res)))
+ {
+ gimple_seq_discard (stmts);
+
+ /* During propagation we have to treat SSA info conservatively
+ and thus we can end up simplifying the inserted expression
+ at elimination time to something not defined in stmts. */
+ /* But then this is a redundancy we failed to detect, which means
+ res now has two values. That doesn't play well with how
+ we track availability here, so give up. */
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ if (TREE_CODE (res) == SSA_NAME)
+ res = eliminate_avail (res);
+ if (res)
+ {
+ fprintf (dump_file, "Failed to insert expression for value ");
+ print_generic_expr (dump_file, val);
+ fprintf (dump_file, " which is really fully redundant to ");
+ print_generic_expr (dump_file, res);
+ fprintf (dump_file, "\n");
+ }
+ }
+
+ return NULL_TREE;
+ }
+ else
+ {
+ gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+ VN_INFO_GET (res)->valnum = val;
+ }
+
+ insertions++;
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Inserted ");
+ print_gimple_stmt (dump_file, SSA_NAME_DEF_STMT (res), 0);
+ }
+
+ return res;
+}
+
+
+
+/* Perform elimination for the basic-block B during the domwalk. */
+
+edge
+eliminate_dom_walker::before_dom_children (basic_block b)
+{
+ /* Mark new bb. */
+ avail_stack.safe_push (NULL_TREE);
+
+ /* Skip blocks marked unreachable during the SCCVN domwalk. */
+ edge_iterator ei;
+ edge e;
+ FOR_EACH_EDGE (e, ei, b->preds)
+ if (e->flags & EDGE_EXECUTABLE)
+ break;
+ if (! e)
+ return NULL;
+
+ for (gphi_iterator gsi = gsi_start_phis (b); !gsi_end_p (gsi);)
+ {
+ gphi *phi = gsi.phi ();
+ tree res = PHI_RESULT (phi);
+
+ if (virtual_operand_p (res))
+ {
+ gsi_next (&gsi);
+ continue;
+ }
+
+ tree sprime = eliminate_avail (res);
+ if (sprime
+ && sprime != res)
+ {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Replaced redundant PHI node defining ");
+ print_generic_expr (dump_file, res);
+ fprintf (dump_file, " with ");
+ print_generic_expr (dump_file, sprime);
+ fprintf (dump_file, "\n");
+ }
+
+ /* If we inserted this PHI node ourselves, it's not an elimination. */
+ if (! inserted_exprs
+ || ! bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (res)))
+ eliminations++;
+
+ /* If we will propagate into all uses don't bother to do
+ anything. */
+ if (may_propagate_copy (res, sprime))
+ {
+ /* Mark the PHI for removal. */
+ to_remove.safe_push (phi);
+ gsi_next (&gsi);
+ continue;
+ }
+
+ remove_phi_node (&gsi, false);
+
+ if (!useless_type_conversion_p (TREE_TYPE (res), TREE_TYPE (sprime)))
+ sprime = fold_convert (TREE_TYPE (res), sprime);
+ gimple *stmt = gimple_build_assign (res, sprime);
+ gimple_stmt_iterator gsi2 = gsi_after_labels (b);
+ gsi_insert_before (&gsi2, stmt, GSI_NEW_STMT);
+ continue;
+ }
+
+ eliminate_push_avail (res);
+ gsi_next (&gsi);
+ }
+
+ for (gimple_stmt_iterator gsi = gsi_start_bb (b);
+ !gsi_end_p (gsi);
+ gsi_next (&gsi))
+ {
+ tree sprime = NULL_TREE;
+ gimple *stmt = gsi_stmt (gsi);
+ tree lhs = gimple_get_lhs (stmt);
+ if (lhs && TREE_CODE (lhs) == SSA_NAME
+ && !gimple_has_volatile_ops (stmt)
+ /* See PR43491. Do not replace a global register variable when
+ it is the RHS of an assignment. Do replace local register
+ variables since gcc does not guarantee a local variable will
+ be allocated in a register.
+ ??? The fix isn't effective here. This should instead
+ be ensured by not value-numbering them the same but treating
+ them like volatiles? */
+ && !(gimple_assign_single_p (stmt)
+ && (TREE_CODE (gimple_assign_rhs1 (stmt)) == VAR_DECL
+ && DECL_HARD_REGISTER (gimple_assign_rhs1 (stmt))
+ && is_global_var (gimple_assign_rhs1 (stmt)))))
+ {
+ sprime = eliminate_avail (lhs);
+ if (!sprime)
+ {
+ /* If there is no existing usable leader but SCCVN thinks
+ it has an expression it wants to use as replacement,
+ insert that. */
+ tree val = VN_INFO (lhs)->valnum;
+ if (val != VN_TOP
+ && TREE_CODE (val) == SSA_NAME
+ && VN_INFO (val)->needs_insertion
+ && VN_INFO (val)->expr != NULL
+ && (sprime = eliminate_insert (&gsi, val)) != NULL_TREE)
+ eliminate_push_avail (sprime);
+ }
+
+ /* If this now constitutes a copy, duplicate points-to
+ and range info appropriately. This is especially
+ important for inserted code. See tree-ssa-copy.c
+ for similar code. */
+ if (sprime
+ && TREE_CODE (sprime) == SSA_NAME)
+ {
+ basic_block sprime_b = gimple_bb (SSA_NAME_DEF_STMT (sprime));
+ if (POINTER_TYPE_P (TREE_TYPE (lhs))
+ && VN_INFO_PTR_INFO (lhs)
+ && ! VN_INFO_PTR_INFO (sprime))
+ {
+ duplicate_ssa_name_ptr_info (sprime,
+ VN_INFO_PTR_INFO (lhs));
+ if (b != sprime_b)
+ mark_ptr_info_alignment_unknown
+ (SSA_NAME_PTR_INFO (sprime));
+ }
+ else if (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+ && VN_INFO_RANGE_INFO (lhs)
+ && ! VN_INFO_RANGE_INFO (sprime)
+ && b == sprime_b)
+ duplicate_ssa_name_range_info (sprime,
+ VN_INFO_RANGE_TYPE (lhs),
+ VN_INFO_RANGE_INFO (lhs));
+ }
+
+ /* Inhibit the use of an inserted PHI on a loop header when
+ the address of the memory reference is a simple induction
+ variable. In other cases the vectorizer won't do anything
+ anyway (either it's loop invariant or a complicated
+ expression). */
+ if (sprime
+ && TREE_CODE (sprime) == SSA_NAME
+ && do_pre
+ && (flag_tree_loop_vectorize || flag_tree_parallelize_loops > 1)
+ && loop_outer (b->loop_father)
+ && has_zero_uses (sprime)
+ && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))
+ && gimple_assign_load_p (stmt))
+ {
+ gimple *def_stmt = SSA_NAME_DEF_STMT (sprime);
+ basic_block def_bb = gimple_bb (def_stmt);
+ if (gimple_code (def_stmt) == GIMPLE_PHI
+ && def_bb->loop_father->header == def_bb)
+ {
+ loop_p loop = def_bb->loop_father;
+ ssa_op_iter iter;
+ tree op;
+ bool found = false;
+ FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
+ {
+ affine_iv iv;
+ def_bb = gimple_bb (SSA_NAME_DEF_STMT (op));
+ if (def_bb
+ && flow_bb_inside_loop_p (loop, def_bb)
+ && simple_iv (loop, loop, op, &iv, true))
+ {
+ found = true;
+ break;
+ }
+ }
+ if (found)
+ {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Not replacing ");
+ print_gimple_expr (dump_file, stmt, 0);
+ fprintf (dump_file, " with ");
+ print_generic_expr (dump_file, sprime);
+ fprintf (dump_file, " which would add a loop"
+ " carried dependence to loop %d\n",
+ loop->num);
+ }
+ /* Don't keep sprime available. */
+ sprime = NULL_TREE;
+ }
+ }
+ }
+
+ if (sprime)
+ {
+ /* If we can propagate the value computed for LHS into
+ all uses don't bother doing anything with this stmt. */
+ if (may_propagate_copy (lhs, sprime))
+ {
+ /* Mark it for removal. */
+ to_remove.safe_push (stmt);
+
+ /* ??? Don't count copy/constant propagations. */
+ if (gimple_assign_single_p (stmt)
+ && (TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
+ || gimple_assign_rhs1 (stmt) == sprime))
+ continue;
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Replaced ");
+ print_gimple_expr (dump_file, stmt, 0);
+ fprintf (dump_file, " with ");
+ print_generic_expr (dump_file, sprime);
+ fprintf (dump_file, " in all uses of ");
+ print_gimple_stmt (dump_file, stmt, 0);
+ }
+
+ eliminations++;
+ continue;
+ }
+
+ /* If this is an assignment from our leader (which
+ happens in the case the value-number is a constant)
+ then there is nothing to do. */
+ if (gimple_assign_single_p (stmt)
+ && sprime == gimple_assign_rhs1 (stmt))
+ continue;
+
+ /* Else replace its RHS. */
+ bool can_make_abnormal_goto
+ = is_gimple_call (stmt)
+ && stmt_can_make_abnormal_goto (stmt);
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Replaced ");
+ print_gimple_expr (dump_file, stmt, 0);
+ fprintf (dump_file, " with ");
+ print_generic_expr (dump_file, sprime);
+ fprintf (dump_file, " in ");
+ print_gimple_stmt (dump_file, stmt, 0);
+ }
+
+ eliminations++;
+ gimple *orig_stmt = stmt;
+ if (!useless_type_conversion_p (TREE_TYPE (lhs),
+ TREE_TYPE (sprime)))
+ sprime = fold_convert (TREE_TYPE (lhs), sprime);
+ tree vdef = gimple_vdef (stmt);
+ tree vuse = gimple_vuse (stmt);
+ propagate_tree_value_into_stmt (&gsi, sprime);
+ stmt = gsi_stmt (gsi);
+ update_stmt (stmt);
+ if (vdef != gimple_vdef (stmt))
+ VN_INFO (vdef)->valnum = vuse;
+
+ /* If we removed EH side-effects from the statement, clean
+ its EH information. */
+ if (maybe_clean_or_replace_eh_stmt (orig_stmt, stmt))
+ {
+ bitmap_set_bit (need_eh_cleanup,
+ gimple_bb (stmt)->index);
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, " Removed EH side-effects.\n");
+ }
+
+ /* Likewise for AB side-effects. */
+ if (can_make_abnormal_goto
+ && !stmt_can_make_abnormal_goto (stmt))
+ {
+ bitmap_set_bit (need_ab_cleanup,
+ gimple_bb (stmt)->index);
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, " Removed AB side-effects.\n");
+ }
+
+ continue;
+ }
+ }
+
+ /* If the statement is a scalar store, see if the expression
+ has the same value number as its rhs. If so, the store is
+ dead. */
+ if (gimple_assign_single_p (stmt)
+ && !gimple_has_volatile_ops (stmt)
+ && !is_gimple_reg (gimple_assign_lhs (stmt))
+ && (TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
+ || is_gimple_min_invariant (gimple_assign_rhs1 (stmt))))
+ {
+ tree val;
+ tree rhs = gimple_assign_rhs1 (stmt);
+ vn_reference_t vnresult;
+ val = vn_reference_lookup (lhs, gimple_vuse (stmt), VN_WALKREWRITE,
+ &vnresult, false);
+ if (TREE_CODE (rhs) == SSA_NAME)
+ rhs = VN_INFO (rhs)->valnum;
+ if (val
+ && operand_equal_p (val, rhs, 0))
+ {
+ /* We can only remove the later store if the former aliases
+ at least all accesses the later one does or if the store
+ was to readonly memory storing the same value. */
+ alias_set_type set = get_alias_set (lhs);
+ if (! vnresult
+ || vnresult->set == set
+ || alias_set_subset_of (set, vnresult->set))
+ {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Deleted redundant store ");
+ print_gimple_stmt (dump_file, stmt, 0);
+ }
+
+ /* Queue stmt for removal. */
+ to_remove.safe_push (stmt);
+ continue;
+ }
+ }
+ }
+
+ /* If this is a control statement for which value numbering left
+ some edges unexecuted, force the condition in a way consistent
+ with that. */
+ if (gcond *cond = dyn_cast <gcond *> (stmt))
+ {
+ if ((EDGE_SUCC (b, 0)->flags & EDGE_EXECUTABLE)
+ ^ (EDGE_SUCC (b, 1)->flags & EDGE_EXECUTABLE))
+ {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Removing unexecutable edge from ");
+ print_gimple_stmt (dump_file, stmt, 0);
+ }
+ if (((EDGE_SUCC (b, 0)->flags & EDGE_TRUE_VALUE) != 0)
+ == ((EDGE_SUCC (b, 0)->flags & EDGE_EXECUTABLE) != 0))
+ gimple_cond_make_true (cond);
+ else
+ gimple_cond_make_false (cond);
+ update_stmt (cond);
+ el_todo |= TODO_cleanup_cfg;
+ continue;
+ }
+ }
+
+ bool can_make_abnormal_goto = stmt_can_make_abnormal_goto (stmt);
+ bool was_noreturn = (is_gimple_call (stmt)
+ && gimple_call_noreturn_p (stmt));
+ tree vdef = gimple_vdef (stmt);
+ tree vuse = gimple_vuse (stmt);
+
+ /* If we didn't replace the whole stmt (or propagate the result
+ into all uses), replace all uses on this stmt with their
+ leaders. */
+ bool modified = false;
+ use_operand_p use_p;
+ ssa_op_iter iter;
+ FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
+ {
+ tree use = USE_FROM_PTR (use_p);
+ /* ??? The call code above leaves stmt operands un-updated. */
+ if (TREE_CODE (use) != SSA_NAME)
+ continue;
+ tree sprime = eliminate_avail (use);
+ if (sprime && sprime != use
+ && may_propagate_copy (use, sprime)
+ /* We substitute into debug stmts to avoid excessive
+ debug temporaries created by removed stmts, but we need
+ to avoid doing so for inserted sprimes as we never want
+ to create debug temporaries for them. */
+ && (!inserted_exprs
+ || TREE_CODE (sprime) != SSA_NAME
+ || !is_gimple_debug (stmt)
+ || !bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))))
+ {
+ propagate_value (use_p, sprime);
+ modified = true;
+ }
+ }
+
+ /* Fold the stmt if modified; this canonicalizes MEM_REFs we propagated
+ into, which is a requirement for the IPA devirt machinery. */
+ gimple *old_stmt = stmt;
+ if (modified)
+ {
+ /* If a formerly non-invariant ADDR_EXPR is turned into an
+ invariant one it was on a separate stmt. */
+ if (gimple_assign_single_p (stmt)
+ && TREE_CODE (gimple_assign_rhs1 (stmt)) == ADDR_EXPR)
+ recompute_tree_invariant_for_addr_expr (gimple_assign_rhs1 (stmt));
+ gimple_stmt_iterator prev = gsi;
+ gsi_prev (&prev);
+ if (fold_stmt (&gsi))
+ {
+ /* fold_stmt may have created new stmts in between
+ the previous stmt and the folded stmt. Mark
+ all defs created there as varying to not confuse
+ the SCCVN machinery as we're using that even during
+ elimination. */
+ if (gsi_end_p (prev))
+ prev = gsi_start_bb (b);
+ else
+ gsi_next (&prev);
+ if (gsi_stmt (prev) != gsi_stmt (gsi))
+ do
+ {
+ tree def;
+ ssa_op_iter dit;
+ FOR_EACH_SSA_TREE_OPERAND (def, gsi_stmt (prev),
+ dit, SSA_OP_ALL_DEFS)
+ /* As existing DEFs may move between stmts
+ we have to guard VN_INFO_GET. */
+ if (! has_VN_INFO (def))
+ VN_INFO_GET (def)->valnum = def;
+ if (gsi_stmt (prev) == gsi_stmt (gsi))
+ break;
+ gsi_next (&prev);
+ }
+ while (1);
+ }
+ stmt = gsi_stmt (gsi);
+ /* In case we folded the stmt away schedule the NOP for removal. */
+ if (gimple_nop_p (stmt))
+ to_remove.safe_push (stmt);
+ }
+
+ /* Visit indirect calls and turn them into direct calls if
+ possible using the devirtualization machinery. Do this before
+ checking for required EH/abnormal/noreturn cleanup as devirt
+ may expose more of those. */
+ if (gcall *call_stmt = dyn_cast <gcall *> (stmt))
+ {
+ tree fn = gimple_call_fn (call_stmt);
+ if (fn
+ && flag_devirtualize
+ && virtual_method_call_p (fn))
+ {
+ tree otr_type = obj_type_ref_class (fn);
+ unsigned HOST_WIDE_INT otr_tok
+ = tree_to_uhwi (OBJ_TYPE_REF_TOKEN (fn));
+ tree instance;
+ ipa_polymorphic_call_context context (current_function_decl,
+ fn, stmt, &instance);
+ context.get_dynamic_type (instance, OBJ_TYPE_REF_OBJECT (fn),
+ otr_type, stmt);
+ bool final;
+ vec <cgraph_node *> targets
+ = possible_polymorphic_call_targets (obj_type_ref_class (fn),
+ otr_tok, context, &final);
+ if (dump_file)
+ dump_possible_polymorphic_call_targets (dump_file,
+ obj_type_ref_class (fn),
+ otr_tok, context);
+ if (final && targets.length () <= 1 && dbg_cnt (devirt))
+ {
+ tree fn;
+ if (targets.length () == 1)
+ fn = targets[0]->decl;
+ else
+ fn = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
+ if (dump_enabled_p ())
+ {
+ location_t loc = gimple_location (stmt);
+ dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc,
+ "converting indirect call to "
+ "function %s\n",
+ lang_hooks.decl_printable_name (fn, 2));
+ }
+ gimple_call_set_fndecl (call_stmt, fn);
+ /* If changing the call to __builtin_unreachable
+ or similar noreturn function, adjust gimple_call_fntype
+ too. */
+ if (gimple_call_noreturn_p (call_stmt)
+ && VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fn)))
+ && TYPE_ARG_TYPES (TREE_TYPE (fn))
+ && (TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fn)))
+ == void_type_node))
+ gimple_call_set_fntype (call_stmt, TREE_TYPE (fn));
+ maybe_remove_unused_call_args (cfun, call_stmt);
+ modified = true;
+ }
+ }
+ }
+
+ if (modified)
+ {
+ /* When changing a call into a noreturn call, cfg cleanup
+ is needed to fix up the noreturn call. */
+ if (!was_noreturn
+ && is_gimple_call (stmt) && gimple_call_noreturn_p (stmt))
+ to_fixup.safe_push (stmt);
+ /* When changing a condition or switch into one where we know
+ which edge will be executed, schedule a cfg cleanup. */
+ if ((gimple_code (stmt) == GIMPLE_COND
+ && (gimple_cond_true_p (as_a <gcond *> (stmt))
+ || gimple_cond_false_p (as_a <gcond *> (stmt))))
+ || (gimple_code (stmt) == GIMPLE_SWITCH
+ && TREE_CODE (gimple_switch_index
+ (as_a <gswitch *> (stmt))) == INTEGER_CST))
+ el_todo |= TODO_cleanup_cfg;
+ /* If we removed EH side-effects from the statement, clean
+ its EH information. */
+ if (maybe_clean_or_replace_eh_stmt (old_stmt, stmt))
+ {
+ bitmap_set_bit (need_eh_cleanup,
+ gimple_bb (stmt)->index);
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, " Removed EH side-effects.\n");
+ }
+ /* Likewise for AB side-effects. */
+ if (can_make_abnormal_goto
+ && !stmt_can_make_abnormal_goto (stmt))
+ {
+ bitmap_set_bit (need_ab_cleanup,
+ gimple_bb (stmt)->index);
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, " Removed AB side-effects.\n");
+ }
+ update_stmt (stmt);
+ if (vdef != gimple_vdef (stmt))
+ VN_INFO (vdef)->valnum = vuse;
+ }
+
+ /* Make new values available - for fully redundant LHS we
+ continue with the next stmt above and skip this. */
+ def_operand_p defp;
+ FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_DEF)
+ eliminate_push_avail (DEF_FROM_PTR (defp));
+ }
+
+ /* Replace destination PHI arguments. */
+ FOR_EACH_EDGE (e, ei, b->succs)
+ if (e->flags & EDGE_EXECUTABLE)
+ for (gphi_iterator gsi = gsi_start_phis (e->dest);
+ !gsi_end_p (gsi);
+ gsi_next (&gsi))
+ {
+ gphi *phi = gsi.phi ();
+ use_operand_p use_p = PHI_ARG_DEF_PTR_FROM_EDGE (phi, e);
+ tree arg = USE_FROM_PTR (use_p);
+ if (TREE_CODE (arg) != SSA_NAME
+ || virtual_operand_p (arg))
+ continue;
+ tree sprime = eliminate_avail (arg);
+ if (sprime && may_propagate_copy (arg, sprime))
+ propagate_value (use_p, sprime);
+ }
+ return NULL;
+}
+
+/* Make leaders recorded in this block that are no longer available
+   beyond it unavailable again. */
+
+void
+eliminate_dom_walker::after_dom_children (basic_block)
+{
+ tree entry;
+ while ((entry = avail_stack.pop ()) != NULL_TREE)
+ {
+ tree valnum = VN_INFO (entry)->valnum;
+ tree old = avail[SSA_NAME_VERSION (valnum)];
+ if (old == entry)
+ avail[SSA_NAME_VERSION (valnum)] = NULL_TREE;
+ else
+ avail[SSA_NAME_VERSION (valnum)] = entry;
+ }
+}
+
+/* Eliminate fully redundant computations. */
+
+unsigned int
+vn_eliminate (bitmap inserted_exprs)
+{
+ eliminate_dom_walker el (CDI_DOMINATORS, inserted_exprs);
+ el.avail.reserve (num_ssa_names);
+
+ el.walk (cfun->cfg->x_entry_block_ptr);
+
+ /* We cannot remove stmts during BB walk, especially not release SSA
+ names there as this confuses the VN machinery. The stmts ending
+ up in to_remove are either stores or simple copies.
+ Remove stmts in reverse order to make debug stmt creation possible. */
+ while (!el.to_remove.is_empty ())
+ {
+ gimple *stmt = el.to_remove.pop ();
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Removing dead stmt ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ }
+
+ gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+ if (gimple_code (stmt) == GIMPLE_PHI)
+ remove_phi_node (&gsi, true);
+ else
+ {
+ basic_block bb = gimple_bb (stmt);
+ unlink_stmt_vdef (stmt);
+ if (gsi_remove (&gsi, true))
+ bitmap_set_bit (el.need_eh_cleanup, bb->index);
+ if (is_gimple_call (stmt) && stmt_can_make_abnormal_goto (stmt))
+ bitmap_set_bit (el.need_ab_cleanup, bb->index);
+ release_defs (stmt);
+ }
+
+ /* Removing a stmt may expose a forwarder block. */
+ el.el_todo |= TODO_cleanup_cfg;
+ }
+
+ /* Fix up stmts that became noreturn calls. This may require splitting
+ blocks and thus isn't possible during the dominator walk. Do this
+ in reverse order so we don't inadvertently remove a stmt we want to
+ fix up by visiting a dominating now-noreturn call first. */
+ while (!el.to_fixup.is_empty ())
+ {
+ gimple *stmt = el.to_fixup.pop ();
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Fixing up noreturn call ");
+ print_gimple_stmt (dump_file, stmt, 0);
+ }
+
+ if (fixup_noreturn_call (stmt))
+ el.el_todo |= TODO_cleanup_cfg;
+ }
+
+ bool do_eh_cleanup = !bitmap_empty_p (el.need_eh_cleanup);
+ bool do_ab_cleanup = !bitmap_empty_p (el.need_ab_cleanup);
+
+ if (do_eh_cleanup)
+ gimple_purge_all_dead_eh_edges (el.need_eh_cleanup);
+
+ if (do_ab_cleanup)
+ gimple_purge_all_dead_abnormal_call_edges (el.need_ab_cleanup);
+
+ if (do_eh_cleanup || do_ab_cleanup)
+ el.el_todo |= TODO_cleanup_cfg;
+
+ statistics_counter_event (cfun, "Eliminated", el.eliminations);
+ statistics_counter_event (cfun, "Insertions", el.insertions);
+
+ return el.el_todo;
+}
+
+
+namespace {
+
+const pass_data pass_data_fre =
+{
+ GIMPLE_PASS, /* type */
+ "fre", /* name */
+ OPTGROUP_NONE, /* optinfo_flags */
+ TV_TREE_FRE, /* tv_id */
+ ( PROP_cfg | PROP_ssa ), /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ 0, /* todo_flags_finish */
+};
+
+class pass_fre : public gimple_opt_pass
+{
+public:
+ pass_fre (gcc::context *ctxt)
+ : gimple_opt_pass (pass_data_fre, ctxt)
+ {}
+
+ /* opt_pass methods: */
+ opt_pass * clone () { return new pass_fre (m_ctxt); }
+ virtual bool gate (function *) { return flag_tree_fre != 0; }
+ virtual unsigned int execute (function *);
+
+}; // class pass_fre
+
+unsigned int
+pass_fre::execute (function *)
+{
+ unsigned int todo = 0;
+
+ run_scc_vn (VN_WALKREWRITE);
+
+ /* Remove all the redundant expressions. */
+ todo |= vn_eliminate (NULL);
+
+ scc_vn_restore_ssa_info ();
+ free_scc_vn ();
+
+ return todo;
+}
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_fre (gcc::context *ctxt)
+{
+ return new pass_fre (ctxt);
+}
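The eliminate_dom_walker added above keeps its leaders in the avail array and undoes per-block changes via avail_stack: before_dom_children pushes a NULL_TREE sentinel, eliminate_push_avail records the shadowed (or newly pushed) leader, and after_dom_children pops back to the sentinel. A standalone restatement of that marker/unwind discipline, in generic C++ with illustrative names and types only:

    #include <string>
    #include <unordered_map>
    #include <utility>
    #include <vector>

    /* Sketch only: valnum -> current leader, plus an undo stack whose
       (-1, "") entries mark block boundaries.  */
    static std::unordered_map<int, std::string> avail;
    static std::vector<std::pair<int, std::string> > avail_stack;

    static void enter_block () { avail_stack.emplace_back (-1, ""); }

    static void push_avail (int valnum, const std::string &leader)
    {
      auto it = avail.find (valnum);
      /* Remember what to restore: the shadowed leader, or the new one.  */
      avail_stack.emplace_back (valnum,
                                it != avail.end () ? it->second : leader);
      avail[valnum] = leader;
    }

    static void leave_block ()
    {
      while (avail_stack.back ().first != -1)
        {
          std::pair<int, std::string> entry = avail_stack.back ();
          avail_stack.pop_back ();
          if (avail[entry.first] == entry.second)
            avail.erase (entry.first);          /* was newly made available */
          else
            avail[entry.first] = entry.second;  /* restore shadowed leader */
        }
      avail_stack.pop_back ();                  /* drop the sentinel */
    }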
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index 68191562b85..830876849bf 100644
--- a/gcc/tree-ssa-sccvn.h
+++ b/gcc/tree-ssa-sccvn.h
@@ -214,6 +214,7 @@ extern vn_ssa_aux_t VN_INFO (tree);
extern vn_ssa_aux_t VN_INFO_GET (tree);
tree vn_get_expr_for (tree);
void run_scc_vn (vn_lookup_kind);
+unsigned int vn_eliminate (bitmap);
void free_scc_vn (void);
void scc_vn_restore_ssa_info (void);
tree vn_nary_op_lookup (tree, vn_nary_op_t *);
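The newly exported vn_eliminate pairs with run_scc_vn: value-number first, then eliminate, then restore SSA info and free the tables, exactly as pass_fre::execute does in tree-ssa-sccvn.c above. A minimal caller sketch, with the surrounding pass boilerplate assumed:

    /* Sketch of a client of the exported SCCVN interface; mirrors
       pass_fre::execute.  Pass NULL when there are no PRE-inserted
       expressions, or the bitmap of inserted SSA versions from PRE.  */
    unsigned int todo = 0;
    run_scc_vn (VN_WALKREWRITE);
    todo |= vn_eliminate (NULL);
    scc_vn_restore_ssa_info ();
    free_scc_vn ();
    return todo;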
diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
index acf832d66f6..1c5d7dd7556 100644
--- a/gcc/tree-ssa-sink.c
+++ b/gcc/tree-ssa-sink.c
@@ -226,7 +226,8 @@ select_best_block (basic_block early_bb,
/* If BEST_BB is at the same nesting level, then require it to have
     significantly lower execution frequency to avoid gratuitous movement.  */
if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
- && best_bb->frequency < (early_bb->frequency * threshold / 100.0))
+ && best_bb->count.to_frequency (cfun)
+ < (early_bb->count.to_frequency (cfun) * threshold / 100.0))
return best_bb;
/* No better block found, so return EARLY_BB, which happens to be the
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index 3296058dd4b..e6d708f77b4 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -3256,7 +3256,7 @@ get_constraint_for_component_ref (tree t, vec<ce_s> *results,
we may have to do something cute here. */
if (may_lt (poly_uint64 (bitpos), get_varinfo (result.var)->fullsize)
- && maybe_nonzero (bitmaxsize))
+ && may_ne (bitmaxsize, 0))
{
/* It's also not true that the constraint will actually start at the
right offset, it may start in some padding. We only care about
@@ -3302,7 +3302,7 @@ get_constraint_for_component_ref (tree t, vec<ce_s> *results,
results->safe_push (cexpr);
}
}
- else if (known_zero (bitmaxsize))
+ else if (must_eq (bitmaxsize, 0))
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "Access to zero-sized part of variable, "
diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index a65ff31d900..97e90233d58 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -1530,8 +1530,6 @@ static void
replace_block_by (basic_block bb1, basic_block bb2)
{
edge pred_edge;
- edge e1, e2;
- edge_iterator ei;
unsigned int i;
gphi *bb2_phi;
@@ -1560,9 +1558,13 @@ replace_block_by (basic_block bb1, basic_block bb2)
bb2->count += bb1->count;
+ /* FIXME: Fix merging of probabilities. They need to be redistributed
+ according to the relative counts of merged BBs. */
+#if 0
/* Merge the outgoing edge counts from bb1 onto bb2. */
profile_count out_sum = profile_count::zero ();
int out_freq_sum = 0;
+ edge e1, e2;
/* Recompute the edge probabilities from the new merged edge count.
Use the sum of the new merged edge counts computed above instead
@@ -1570,40 +1572,36 @@ replace_block_by (basic_block bb1, basic_block bb2)
making the bb count inconsistent with the edge weights. */
FOR_EACH_EDGE (e1, ei, bb1->succs)
{
- if (e1->count.initialized_p ())
- out_sum += e1->count;
+ if (e1->count ().initialized_p ())
+ out_sum += e1->count ();
out_freq_sum += EDGE_FREQUENCY (e1);
}
FOR_EACH_EDGE (e1, ei, bb2->succs)
{
- if (e1->count.initialized_p ())
- out_sum += e1->count;
+ if (e1->count ().initialized_p ())
+ out_sum += e1->count ();
out_freq_sum += EDGE_FREQUENCY (e1);
}
-
FOR_EACH_EDGE (e1, ei, bb1->succs)
{
e2 = find_edge (bb2, e1->dest);
gcc_assert (e2);
- e2->count += e1->count;
- if (out_sum > 0 && e2->count.initialized_p ())
+ if (out_sum > 0 && e2->count ().initialized_p ())
{
- e2->probability = e2->count.probability_in (bb2->count);
+ e2->probability = e2->count ().probability_in (bb2->count);
}
- else if (bb1->frequency && bb2->frequency)
+ else if (bb1->count.to_frequency (cfun) && bb2->count.to_frequency (cfun))
e2->probability = e1->probability;
- else if (bb2->frequency && !bb1->frequency)
+ else if (bb2->count.to_frequency (cfun) && !bb1->count.to_frequency (cfun))
;
else if (out_freq_sum)
e2->probability = profile_probability::from_reg_br_prob_base
(GCOV_COMPUTE_SCALE (EDGE_FREQUENCY (e1)
+ EDGE_FREQUENCY (e2),
out_freq_sum));
- out_sum += e2->count;
+ out_sum += e2->count ();
}
- bb2->frequency += bb1->frequency;
- if (bb2->frequency > BB_FREQ_MAX)
- bb2->frequency = BB_FREQ_MAX;
+#endif
/* Move over any user labels from bb1 after the bb2 labels. */
gimple_stmt_iterator gsi1 = gsi_start_bb (bb1);
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 28c81a6ec57..1dab0f1fab4 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -303,7 +303,6 @@ remove_ctrl_stmt_and_useless_edges (basic_block bb, basic_block dest_bb)
else
{
e->probability = profile_probability::always ();
- e->count = bb->count;
ei_next (&ei);
}
}
@@ -340,7 +339,6 @@ create_block_for_threading (basic_block bb,
e->aux = NULL;
/* Zero out the profile, since the block is unreachable for now. */
- rd->dup_blocks[count]->frequency = 0;
rd->dup_blocks[count]->count = profile_count::uninitialized ();
if (duplicate_blocks)
bitmap_set_bit (*duplicate_blocks, rd->dup_blocks[count]->index);
@@ -591,7 +589,7 @@ any_remaining_duplicated_blocks (vec<jump_thread_edge *> *path,
}
-/* Compute the amount of profile count/frequency coming into the jump threading
+/* Compute the amount of profile count coming into the jump threading
path stored in RD that we are duplicating, returned in PATH_IN_COUNT_PTR and
PATH_IN_FREQ_PTR, as well as the amount of counts flowing out of the
duplicated path, returned in PATH_OUT_COUNT_PTR. LOCAL_INFO is used to
@@ -599,7 +597,7 @@ any_remaining_duplicated_blocks (vec<jump_thread_edge *> *path,
edges that need to be ignored in the analysis. Return true if path contains
a joiner, false otherwise.
- In the non-joiner case, this is straightforward - all the counts/frequency
+ In the non-joiner case, this is straightforward - all the counts
flowing into the jump threading path should flow through the duplicated
block and out of the duplicated path.
@@ -741,7 +739,7 @@ compute_path_counts (struct redirection_data *rd,
same last path edge in the case where the last edge has a nocopy
source block. */
gcc_assert (ein_path->last ()->e == elast);
- path_in_count += ein->count;
+ path_in_count += ein->count ();
path_in_freq += EDGE_FREQUENCY (ein);
}
else if (!ein_path)
@@ -749,7 +747,7 @@ compute_path_counts (struct redirection_data *rd,
/* Keep track of the incoming edges that are not on any jump-threading
path. These counts will still flow out of original path after all
jump threading is complete. */
- nonpath_count += ein->count;
+ nonpath_count += ein->count ();
}
}
@@ -789,7 +787,7 @@ compute_path_counts (struct redirection_data *rd,
for (unsigned int i = 1; i < path->length (); i++)
{
edge epath = (*path)[i]->e;
- profile_count cur_count = epath->count;
+ profile_count cur_count = epath->count ();
if ((*path)[i]->type == EDGE_COPY_SRC_JOINER_BLOCK)
{
has_joiner = true;
@@ -809,13 +807,13 @@ compute_path_counts (struct redirection_data *rd,
they are redirected by an invocation of this routine. */
&& !bitmap_bit_p (local_info->duplicate_blocks,
ein->src->index))
- nonpath_count += ein->count;
+ nonpath_count += ein->count ();
}
}
if (cur_count < path_out_count)
path_out_count = cur_count;
- if (epath->count < min_path_count)
- min_path_count = epath->count;
+ if (epath->count () < min_path_count)
+ min_path_count = epath->count ();
}
/* We computed path_out_count above assuming that this path targeted
@@ -830,12 +828,12 @@ compute_path_counts (struct redirection_data *rd,
(since any path through the joiner with a different elast will not
include a copy of this elast in its duplicated path).
So ensure that this path's path_out_count is at least the
- difference between elast->count and nonpath_count. Otherwise the edge
+ difference between elast->count () and nonpath_count. Otherwise the edge
counts after threading will not be sane. */
if (local_info->need_profile_correction
- && has_joiner && path_out_count < elast->count - nonpath_count)
+ && has_joiner && path_out_count < elast->count () - nonpath_count)
{
- path_out_count = elast->count - nonpath_count;
+ path_out_count = elast->count () - nonpath_count;
/* But neither can we go above the minimum count along the path
we are duplicating. This can be an issue due to profile
insanities coming in to this pass. */
@@ -852,267 +850,95 @@ compute_path_counts (struct redirection_data *rd,
/* Update the counts and frequencies for both an original path
edge EPATH and its duplicate EDUP. The duplicate source block
- will get a count/frequency of PATH_IN_COUNT and PATH_IN_FREQ,
+ will get a count of PATH_IN_COUNT,
and the duplicate edge EDUP will have a count of PATH_OUT_COUNT. */
static void
update_profile (edge epath, edge edup, profile_count path_in_count,
- profile_count path_out_count, int path_in_freq)
+ profile_count path_out_count)
{
- /* First update the duplicated block's count / frequency. */
+ /* First update the duplicated block's count. */
if (edup)
{
basic_block dup_block = edup->src;
+
+ /* Edup's count is reduced by path_out_count. We need to redistribute
+ probabilities to the remaining edges. */
+
+ edge esucc;
+ edge_iterator ei;
+ profile_probability edup_prob
+ = path_out_count.probability_in (path_in_count);
+
+ /* Either scale up or scale down the remaining edges.
+ Probabilities are always in the range <0,1> and thus we can't do
+ both in the same loop. */
+ if (edup->probability > edup_prob)
+ {
+ profile_probability rev_scale
+ = (profile_probability::always () - edup->probability)
+ / (profile_probability::always () - edup_prob);
+ FOR_EACH_EDGE (esucc, ei, dup_block->succs)
+ if (esucc != edup)
+ esucc->probability /= rev_scale;
+ }
+ else if (edup->probability < edup_prob)
+ {
+ profile_probability scale
+ = (profile_probability::always () - edup_prob)
+ / (profile_probability::always () - edup->probability);
+ FOR_EACH_EDGE (esucc, ei, dup_block->succs)
+ if (esucc != edup)
+ esucc->probability *= scale;
+ }
+ if (edup_prob.initialized_p ())
+ edup->probability = edup_prob;
+
gcc_assert (!dup_block->count.initialized_p ());
- gcc_assert (dup_block->frequency == 0);
dup_block->count = path_in_count;
- dup_block->frequency = path_in_freq;
}
- /* Now update the original block's count and frequency in the
+ if (path_in_count == profile_count::zero ())
+ return;
+
+ profile_count final_count = epath->count () - path_out_count;
+
+ /* Now update the original block's count in the
opposite manner - remove the counts/freq that will flow
into the duplicated block. Handle underflow due to precision/
rounding issues. */
epath->src->count -= path_in_count;
- epath->src->frequency -= path_in_freq;
- if (epath->src->frequency < 0)
- epath->src->frequency = 0;
/* Next update this path edge's original and duplicated counts. We know
that the duplicated path will have path_out_count flowing
out of it (in the joiner case this is the count along the duplicated path
out of the duplicated joiner). This count can then be removed from the
original path edge. */
- if (edup)
- edup->count = path_out_count;
- epath->count -= path_out_count;
- /* FIXME: can epath->count be legally uninitialized here? */
-}
-
-/* The duplicate and original joiner blocks may end up with different
- probabilities (different from both the original and from each other).
- Recompute the probabilities here once we have updated the edge
- counts and frequencies. */
-
-static void
-recompute_probabilities (basic_block bb)
-{
edge esucc;
edge_iterator ei;
- FOR_EACH_EDGE (esucc, ei, bb->succs)
- {
- if (!(bb->count > 0))
- continue;
-
- /* Prevent overflow computation due to insane profiles. */
- if (esucc->count < bb->count)
- esucc->probability = esucc->count.probability_in (bb->count).guessed ();
- else
- /* Can happen with missing/guessed probabilities, since we
- may determine that more is flowing along duplicated
- path than joiner succ probabilities allowed.
- Counts and freqs will be insane after jump threading,
- at least make sure probability is sane or we will
- get a flow verification error.
- Not much we can do to make counts/freqs sane without
- redoing the profile estimation. */
- esucc->probability = profile_probability::guessed_always ();
- }
-}
-
-
-/* Update the counts of the original and duplicated edges from a joiner
- that go off path, given that we have already determined that the
- duplicate joiner DUP_BB has incoming count PATH_IN_COUNT and
- outgoing count along the path PATH_OUT_COUNT. The original (on-)path
- edge from joiner is EPATH. */
-
-static void
-update_joiner_offpath_counts (edge epath, basic_block dup_bb,
- profile_count path_in_count,
- profile_count path_out_count)
-{
- /* Compute the count that currently flows off path from the joiner.
- In other words, the total count of joiner's out edges other than
- epath. Compute this by walking the successors instead of
- subtracting epath's count from the joiner bb count, since there
- are sometimes slight insanities where the total out edge count is
- larger than the bb count (possibly due to rounding/truncation
- errors). */
- profile_count total_orig_off_path_count = profile_count::zero ();
- edge enonpath;
- edge_iterator ei;
- FOR_EACH_EDGE (enonpath, ei, epath->src->succs)
- {
- if (enonpath == epath)
- continue;
- total_orig_off_path_count += enonpath->count;
- }
-
- /* For the path that we are duplicating, the amount that will flow
- off path from the duplicated joiner is the delta between the
- path's cumulative in count and the portion of that count we
- estimated above as flowing from the joiner along the duplicated
- path. */
- profile_count total_dup_off_path_count = path_in_count - path_out_count;
-
- /* Now do the actual updates of the off-path edges. */
- FOR_EACH_EDGE (enonpath, ei, epath->src->succs)
- {
- /* Look for edges going off of the threading path. */
- if (enonpath == epath)
- continue;
-
- /* Find the corresponding edge out of the duplicated joiner. */
- edge enonpathdup = find_edge (dup_bb, enonpath->dest);
- gcc_assert (enonpathdup);
-
- /* We can't use the original probability of the joiner's out
- edges, since the probabilities of the original branch
- and the duplicated branches may vary after all threading is
- complete. But apportion the duplicated joiner's off-path
- total edge count computed earlier (total_dup_off_path_count)
- among the duplicated off-path edges based on their original
- ratio to the full off-path count (total_orig_off_path_count).
- */
- profile_probability scale
- = enonpath->count.probability_in (total_orig_off_path_count);
- /* Give the duplicated offpath edge a portion of the duplicated
- total. */
- enonpathdup->count = total_dup_off_path_count.apply_probability (scale);
- /* Now update the original offpath edge count, handling underflow
- due to rounding errors. */
- enonpath->count -= enonpathdup->count;
- }
-}
-
-
-/* Check if the paths through RD all have estimated frequencies but zero
- profile counts. This is more accurate than checking the entry block
- for a zero profile count, since profile insanities sometimes creep in. */
-
-static bool
-estimated_freqs_path (struct redirection_data *rd)
-{
- edge e = rd->incoming_edges->e;
- vec<jump_thread_edge *> *path = THREAD_PATH (e);
- edge ein;
- edge_iterator ei;
- bool non_zero_freq = false;
- FOR_EACH_EDGE (ein, ei, e->dest->preds)
- {
- if (ein->count > 0)
- return false;
- non_zero_freq |= ein->src->frequency != 0;
- }
-
- for (unsigned int i = 1; i < path->length (); i++)
- {
- edge epath = (*path)[i]->e;
- if (epath->src->count > 0)
- return false;
- non_zero_freq |= epath->src->frequency != 0;
- edge esucc;
- FOR_EACH_EDGE (esucc, ei, epath->src->succs)
- {
- if (esucc->count > 0)
- return false;
- non_zero_freq |= esucc->src->frequency != 0;
- }
- }
- return non_zero_freq;
-}
-
-
-/* Invoked for routines that have guessed frequencies and no profile
- counts to record the block and edge frequencies for paths through RD
- in the profile count fields of those blocks and edges. This is because
- ssa_fix_duplicate_block_edges incrementally updates the block and
- edge counts as edges are redirected, and it is difficult to do that
- for edge frequencies which are computed on the fly from the source
- block frequency and probability. When a block frequency is updated
- its outgoing edge frequencies are affected and become difficult to
- adjust. */
-
-static void
-freqs_to_counts_path (struct redirection_data *rd)
-{
- edge e = rd->incoming_edges->e;
- vec<jump_thread_edge *> *path = THREAD_PATH (e);
- edge ein;
- edge_iterator ei;
- FOR_EACH_EDGE (ein, ei, e->dest->preds)
- {
- /* Scale up the frequency by REG_BR_PROB_BASE, to avoid rounding
- errors applying the probability when the frequencies are very
- small. */
- if (ein->probability.initialized_p ())
- ein->count = profile_count::from_gcov_type
- (apply_probability (ein->src->frequency * REG_BR_PROB_BASE,
- ein->probability
- .to_reg_br_prob_base ())).guessed ();
- else
- /* FIXME: this is hack; we should track uninitialized values. */
- ein->count = profile_count::zero ();
- }
+ profile_probability epath_prob = final_count.probability_in (epath->src->count);
- for (unsigned int i = 1; i < path->length (); i++)
+ if (epath->probability > epath_prob)
{
- edge epath = (*path)[i]->e;
- edge esucc;
- /* Scale up the frequency by REG_BR_PROB_BASE, to avoid rounding
- errors applying the edge probability when the frequencies are very
- small. */
- epath->src->count =
- profile_count::from_gcov_type
- (epath->src->frequency * REG_BR_PROB_BASE);
- FOR_EACH_EDGE (esucc, ei, epath->src->succs)
- esucc->count =
- esucc->src->count.apply_probability (esucc->probability);
+ profile_probability rev_scale
+ = (profile_probability::always () - epath->probability)
+ / (profile_probability::always () - epath_prob);
+ FOR_EACH_EDGE (esucc, ei, epath->src->succs)
+ if (esucc != epath)
+ esucc->probability /= rev_scale;
}
-}
-
-
-/* For routines that have guessed frequencies and no profile counts, where we
- used freqs_to_counts_path to record block and edge frequencies for paths
- through RD, we clear the counts after completing all updates for RD.
- The updates in ssa_fix_duplicate_block_edges are based off the count fields,
- but the block frequencies and edge probabilities were updated as well,
- so we can simply clear the count fields. */
-
-static void
-clear_counts_path (struct redirection_data *rd)
-{
- edge e = rd->incoming_edges->e;
- vec<jump_thread_edge *> *path = THREAD_PATH (e);
- edge ein, esucc;
- edge_iterator ei;
- profile_count val = profile_count::uninitialized ();
- if (profile_status_for_fn (cfun) == PROFILE_READ)
- val = profile_count::zero ();
-
- FOR_EACH_EDGE (ein, ei, e->dest->preds)
- ein->count = val;
-
- /* First clear counts along original path. */
- for (unsigned int i = 1; i < path->length (); i++)
+ else if (epath->probability < epath_prob)
{
- edge epath = (*path)[i]->e;
+ profile_probability scale
+ = (profile_probability::always () - epath_prob)
+ / (profile_probability::always () - epath->probability);
FOR_EACH_EDGE (esucc, ei, epath->src->succs)
- esucc->count = val;
- epath->src->count = val;
- }
- /* Also need to clear the counts along duplicated path. */
- for (unsigned int i = 0; i < 2; i++)
- {
- basic_block dup = rd->dup_blocks[i];
- if (!dup)
- continue;
- FOR_EACH_EDGE (esucc, ei, dup->succs)
- esucc->count = val;
- dup->count = val;
+ if (esucc != epath)
+ esucc->probability *= scale;
}
+ if (epath_prob.initialized_p ())
+ epath->probability = epath_prob;
}
/* Wire up the outgoing edges from the duplicate blocks and
@@ -1130,20 +956,6 @@ ssa_fix_duplicate_block_edges (struct redirection_data *rd,
profile_count path_out_count = profile_count::zero ();
int path_in_freq = 0;
- /* This routine updates profile counts, frequencies, and probabilities
- incrementally. Since it is difficult to do the incremental updates
- using frequencies/probabilities alone, for routines without profile
- data we first take a snapshot of the existing block and edge frequencies
- by copying them into the empty profile count fields. These counts are
- then used to do the incremental updates, and cleared at the end of this
- routine. If the function is marked as having a profile, we still check
- to see if the paths through RD are using estimated frequencies because
- the routine had zero profile counts. */
- bool do_freqs_to_counts = (profile_status_for_fn (cfun) != PROFILE_READ
- || estimated_freqs_path (rd));
- if (do_freqs_to_counts)
- freqs_to_counts_path (rd);
-
/* First determine how much profile count to move from original
path to the duplicate path. This is tricky in the presence of
a joiner (see comments for compute_path_counts), where some portion
@@ -1154,7 +966,6 @@ ssa_fix_duplicate_block_edges (struct redirection_data *rd,
&path_in_count, &path_out_count,
&path_in_freq);
- int cur_path_freq = path_in_freq;
for (unsigned int count = 0, i = 1; i < path->length (); i++)
{
edge epath = (*path)[i]->e;
@@ -1220,30 +1031,14 @@ ssa_fix_duplicate_block_edges (struct redirection_data *rd,
}
}
- /* Update the counts and frequency of both the original block
+ /* Update the counts of both the original block
and path edge, and the duplicates. The path duplicate's
- incoming count and frequency are the totals for all edges
+ incoming count are the totals for all edges
incoming to this jump threading path computed earlier.
And we know that the duplicated path will have path_out_count
flowing out of it (i.e. along the duplicated path out of the
duplicated joiner). */
- update_profile (epath, e2, path_in_count, path_out_count,
- path_in_freq);
-
- /* Next we need to update the counts of the original and duplicated
- edges from the joiner that go off path. */
- update_joiner_offpath_counts (epath, e2->src, path_in_count,
- path_out_count);
-
- /* Finally, we need to set the probabilities on the duplicated
- edges out of the duplicated joiner (e2->src). The probabilities
- along the original path will all be updated below after we finish
- processing the whole path. */
- recompute_probabilities (e2->src);
-
- /* Record the frequency flowing to the downstream duplicated
- path blocks. */
- cur_path_freq = EDGE_FREQUENCY (e2);
+ update_profile (epath, e2, path_in_count, path_out_count);
}
else if ((*path)[i]->type == EDGE_COPY_SRC_BLOCK)
{
@@ -1253,7 +1048,7 @@ ssa_fix_duplicate_block_edges (struct redirection_data *rd,
if (count == 1)
single_succ_edge (rd->dup_blocks[1])->aux = NULL;
- /* Update the counts and frequency of both the original block
+ /* Update the counts of both the original block
and path edge, and the duplicates. Since we are now after
any joiner that may have existed on the path, the count
flowing along the duplicated threaded path is path_out_count.
@@ -1263,8 +1058,7 @@ ssa_fix_duplicate_block_edges (struct redirection_data *rd,
been updated at the end of that handling to the edge frequency
along the duplicated joiner path edge. */
update_profile (epath, EDGE_SUCC (rd->dup_blocks[count], 0),
- path_out_count, path_out_count,
- cur_path_freq);
+ path_out_count, path_out_count);
}
else
{
@@ -1281,8 +1075,7 @@ ssa_fix_duplicate_block_edges (struct redirection_data *rd,
thread path (path_in_freq). If we had a joiner, it would have
been updated at the end of that handling to the edge frequency
along the duplicated joiner path edge. */
- update_profile (epath, NULL, path_out_count, path_out_count,
- cur_path_freq);
+ update_profile (epath, NULL, path_out_count, path_out_count);
}
/* Increment the index into the duplicated path when we processed
@@ -1293,19 +1086,6 @@ ssa_fix_duplicate_block_edges (struct redirection_data *rd,
count++;
}
}
-
- /* Now walk orig blocks and update their probabilities, since the
- counts and freqs should be updated properly by above loop. */
- for (unsigned int i = 1; i < path->length (); i++)
- {
- edge epath = (*path)[i]->e;
- recompute_probabilities (epath->src);
- }
-
- /* Done with all profile and frequency updates, clear counts if they
- were copied. */
- if (do_freqs_to_counts)
- clear_counts_path (rd);
}
/* Hash table traversal callback routine to create duplicate blocks. */
@@ -2215,7 +1995,6 @@ duplicate_thread_path (edge entry, edge exit, basic_block *region,
struct loop *loop = entry->dest->loop_father;
edge exit_copy;
edge redirected;
- int curr_freq;
profile_count curr_count;
if (!can_copy_bbs_p (region, n_region))
@@ -2247,8 +2026,7 @@ duplicate_thread_path (edge entry, edge exit, basic_block *region,
invalidating the property that is propagated by executing all the blocks of
the jump-thread path in order. */
- curr_count = entry->count;
- curr_freq = EDGE_FREQUENCY (entry);
+ curr_count = entry->count ();
for (i = 0; i < n_region; i++)
{
@@ -2259,10 +2037,8 @@ duplicate_thread_path (edge entry, edge exit, basic_block *region,
/* Watch inconsistent profile. */
if (curr_count > region[i]->count)
curr_count = region[i]->count;
- if (curr_freq > region[i]->frequency)
- curr_freq = region[i]->frequency;
/* Scale current BB. */
- if (region[i]->count > 0 && curr_count.initialized_p ())
+ if (region[i]->count.nonzero_p () && curr_count.initialized_p ())
{
/* In the middle of the path we only scale the frequencies.
In last BB we need to update probabilities of outgoing edges
@@ -2273,24 +2049,11 @@ duplicate_thread_path (edge entry, edge exit, basic_block *region,
region[i]->count);
else
update_bb_profile_for_threading (region[i],
- curr_freq, curr_count,
+ curr_count,
exit);
scale_bbs_frequencies_profile_count (region_copy + i, 1, curr_count,
region_copy[i]->count);
}
- else if (region[i]->frequency)
- {
- if (i + 1 != n_region)
- scale_bbs_frequencies_int (region + i, 1,
- region[i]->frequency - curr_freq,
- region[i]->frequency);
- else
- update_bb_profile_for_threading (region[i],
- curr_freq, curr_count,
- exit);
- scale_bbs_frequencies_int (region_copy + i, 1, curr_freq,
- region_copy[i]->frequency);
- }
if (single_succ_p (bb))
{
@@ -2299,8 +2062,7 @@ duplicate_thread_path (edge entry, edge exit, basic_block *region,
|| region_copy[i + 1] == single_succ_edge (bb)->dest);
if (i + 1 != n_region)
{
- curr_freq = EDGE_FREQUENCY (single_succ_edge (bb));
- curr_count = single_succ_edge (bb)->count;
+ curr_count = single_succ_edge (bb)->count ();
}
continue;
}
@@ -2330,8 +2092,7 @@ duplicate_thread_path (edge entry, edge exit, basic_block *region,
}
else
{
- curr_freq = EDGE_FREQUENCY (e);
- curr_count = e->count;
+ curr_count = e->count ();
}
}
@@ -2353,7 +2114,6 @@ duplicate_thread_path (edge entry, edge exit, basic_block *region,
{
rescan_loop_exit (e, true, false);
e->probability = profile_probability::always ();
- e->count = region_copy[n_region - 1]->count;
}
/* Redirect the entry and add the phi node arguments. */
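To illustrate the rescaling update_profile now performs (made-up numbers, for illustration only): suppose the duplicated path edge EDUP carried probability 0.6 and edup_prob = path_out_count / path_in_count works out to 0.2. The other successors of the duplicated block previously summed to 0.4 and must now sum to 0.8, so each is divided by rev_scale = (1 - 0.6) / (1 - 0.2) = 0.5, i.e. doubled, keeping the outgoing probabilities summing to 1. When the new probability is larger than the old one, the remaining edges are instead multiplied by (1 - new) / (1 - old). The same adjustment is applied around EPATH using epath_prob = final_count / epath->src->count.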
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index 200ec70c9b7..35a49d28420 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -408,40 +408,10 @@ uncprop_into_successor_phis (basic_block bb)
}
}
-/* Ignoring loop backedges, if BB has precisely one incoming edge then
- return that edge. Otherwise return NULL. */
-static edge
-single_incoming_edge_ignoring_loop_edges (basic_block bb)
-{
- edge retval = NULL;
- edge e;
- edge_iterator ei;
-
- FOR_EACH_EDGE (e, ei, bb->preds)
- {
- /* A loop back edge can be identified by the destination of
- the edge dominating the source of the edge. */
- if (dominated_by_p (CDI_DOMINATORS, e->src, e->dest))
- continue;
-
- /* If we have already seen a non-loop edge, then we must have
- multiple incoming non-loop edges and thus we return NULL. */
- if (retval)
- return NULL;
-
- /* This is the first non-loop incoming edge we have found. Record
- it. */
- retval = e;
- }
-
- return retval;
-}
-
edge
uncprop_dom_walker::before_dom_children (basic_block bb)
{
basic_block parent;
- edge e;
bool recorded = false;
/* If this block is dominated by a single incoming edge and that edge
@@ -450,7 +420,7 @@ uncprop_dom_walker::before_dom_children (basic_block bb)
parent = get_immediate_dominator (CDI_DOMINATORS, bb);
if (parent)
{
- e = single_incoming_edge_ignoring_loop_edges (bb);
+ edge e = single_pred_edge_ignoring_loop_edges (bb, false);
if (e && e->src == parent && e->aux)
{
diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index dc9fc84c6a0..f0d158391df 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -107,8 +107,7 @@ hoist_edge_and_branch_if_true (gimple_stmt_iterator *gsip,
e_false->flags &= ~EDGE_FALLTHRU;
e_false->flags |= EDGE_FALSE_VALUE;
e_false->probability = e_true->probability.invert ();
- e_false->count = split_bb->count - e_true->count;
- new_bb->count = e_false->count;
+ new_bb->count = e_false->count ();
if (update_dominators)
{
@@ -239,9 +238,9 @@ case_bit_test_cmp (const void *p1, const void *p2)
const struct case_bit_test *const d1 = (const struct case_bit_test *) p1;
const struct case_bit_test *const d2 = (const struct case_bit_test *) p2;
- if (d2->target_edge->count < d1->target_edge->count)
+ if (d2->target_edge->count () < d1->target_edge->count ())
return -1;
- if (d2->target_edge->count > d1->target_edge->count)
+ if (d2->target_edge->count () > d1->target_edge->count ())
return 1;
if (d2->bits != d1->bits)
return d2->bits - d1->bits;
@@ -635,10 +634,10 @@ collect_switch_conv_info (gswitch *swtch, struct switch_conv_info *info)
= label_to_block (CASE_LABEL (gimple_switch_default_label (swtch)));
e_default = find_edge (info->switch_bb, info->default_bb);
info->default_prob = e_default->probability;
- info->default_count = e_default->count;
+ info->default_count = e_default->count ();
FOR_EACH_EDGE (e, ei, info->switch_bb->succs)
if (e != e_default)
- info->other_count += e->count;
+ info->other_count += e->count ();
/* Get upper and lower bounds of case values, and the covered range. */
min_case = gimple_switch_label (swtch, 1);
@@ -1424,19 +1423,16 @@ gen_inbound_check (gswitch *swtch, struct switch_conv_info *info)
if (!info->default_case_nonstandard)
e01 = make_edge (bb0, bb1, EDGE_TRUE_VALUE);
e01->probability = info->default_prob.invert ();
- e01->count = info->other_count;
/* flags and profiles of the edge taking care of out-of-range values */
e02->flags &= ~EDGE_FALLTHRU;
e02->flags |= EDGE_FALSE_VALUE;
e02->probability = info->default_prob;
- e02->count = info->default_count;
bbf = info->final_bb;
e1f = make_edge (bb1, bbf, EDGE_FALLTHRU);
e1f->probability = profile_probability::always ();
- e1f->count = info->other_count;
if (info->default_case_nonstandard)
e2f = NULL;
@@ -1444,14 +1440,13 @@ gen_inbound_check (gswitch *swtch, struct switch_conv_info *info)
{
e2f = make_edge (bb2, bbf, EDGE_FALLTHRU);
e2f->probability = profile_probability::always ();
- e2f->count = info->default_count;
}
/* frequencies of the new BBs */
- bb1->frequency = EDGE_FREQUENCY (e01);
- bb2->frequency = EDGE_FREQUENCY (e02);
+ bb1->count = e01->count ();
+ bb2->count = e02->count ();
if (!info->default_case_nonstandard)
- bbf->frequency = EDGE_FREQUENCY (e1f) + EDGE_FREQUENCY (e2f);
+ bbf->count = e1f->count () + e2f->count ();
/* Tidy blocks that have become unreachable. */
prune_bbs (bbd, info->final_bb,
@@ -2248,12 +2243,10 @@ do_jump_if_equal (basic_block bb, tree op0, tree op1, basic_block label_bb,
edge false_edge = split_block (bb, cond);
false_edge->flags = EDGE_FALSE_VALUE;
false_edge->probability = prob.invert ();
- false_edge->count = bb->count.apply_probability (false_edge->probability);
edge true_edge = make_edge (bb, label_bb, EDGE_TRUE_VALUE);
fix_phi_operands_for_edge (true_edge, phi_mapping);
true_edge->probability = prob;
- true_edge->count = bb->count.apply_probability (true_edge->probability);
return false_edge->dest;
}
@@ -2293,12 +2286,10 @@ emit_cmp_and_jump_insns (basic_block bb, tree op0, tree op1,
edge false_edge = split_block (bb, cond);
false_edge->flags = EDGE_FALSE_VALUE;
false_edge->probability = prob.invert ();
- false_edge->count = bb->count.apply_probability (false_edge->probability);
edge true_edge = make_edge (bb, label_bb, EDGE_TRUE_VALUE);
fix_phi_operands_for_edge (true_edge, phi_mapping);
true_edge->probability = prob;
- true_edge->count = bb->count.apply_probability (true_edge->probability);
return false_edge->dest;
}
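
Throughout tree-switch-conversion.c (and the other files below), the explicit e->count assignments disappear because an edge's profile count is now computed on demand from its source block's count and the edge probability. The following is a rough standalone model of that relationship, using invented types (edge_model, not GCC's edge_def), intended only as an illustration of why the stored field becomes redundant:

#include <cstdint>
#include <iostream>

/* Invented types, not GCC's edge/profile_count classes.  */
struct edge_model
{
  std::uint64_t src_count;  /* profile count of the source block */
  double probability;       /* probability of taking this edge   */

  std::uint64_t count () const  /* analogue of the new e->count () */
  {
    return static_cast<std::uint64_t> (src_count * probability + 0.5);
  }
};

int
main ()
{
  edge_model e_true = { 1000, 0.9 };
  edge_model e_false = { 1000, 0.1 };
  std::cout << e_true.count () << ' ' << e_false.count () << '\n';  /* 900 100 */
  return 0;
}

With the count derived this way, there is no stored e->count to keep in sync after every probability update, which is what allows the deleted assignments to go.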
diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
index c4b8cee4d27..0e637147e8c 100644
--- a/gcc/tree-tailcall.c
+++ b/gcc/tree-tailcall.c
@@ -805,20 +805,14 @@ adjust_return_value (basic_block bb, tree m, tree a)
/* Subtract COUNT and FREQUENCY from the basic block and its
outgoing edge. */
static void
-decrease_profile (basic_block bb, profile_count count, int frequency)
+decrease_profile (basic_block bb, profile_count count)
{
- edge e;
bb->count = bb->count - count;
- bb->frequency -= frequency;
- if (bb->frequency < 0)
- bb->frequency = 0;
if (!single_succ_p (bb))
{
gcc_assert (!EDGE_COUNT (bb->succs));
return;
}
- e = single_succ_edge (bb);
- e->count -= count;
}
/* Returns true if argument PARAM of the tail recursive call needs to be copied
@@ -895,11 +889,10 @@ eliminate_tail_call (struct tailcall *t)
/* Number of executions of function has reduced by the tailcall. */
e = single_succ_edge (gsi_bb (t->call_gsi));
- decrease_profile (EXIT_BLOCK_PTR_FOR_FN (cfun), e->count, EDGE_FREQUENCY (e));
- decrease_profile (ENTRY_BLOCK_PTR_FOR_FN (cfun), e->count,
- EDGE_FREQUENCY (e));
+ decrease_profile (EXIT_BLOCK_PTR_FOR_FN (cfun), e->count ());
+ decrease_profile (ENTRY_BLOCK_PTR_FOR_FN (cfun), e->count ());
if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
- decrease_profile (e->dest, e->count, EDGE_FREQUENCY (e));
+ decrease_profile (e->dest, e->count ());
/* Replace the call by a jump to the start of function. */
e = redirect_edge_and_branch (single_succ_edge (gsi_bb (t->call_gsi)),
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index 1152222be08..b19b6fbc0f0 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -1443,6 +1443,7 @@ ssa_uniform_vector_p (tree op)
{
if (TREE_CODE (op) == VECTOR_CST
|| TREE_CODE (op) == VEC_DUPLICATE_CST
+ || TREE_CODE (op) == VEC_DUPLICATE_EXPR
|| TREE_CODE (op) == CONSTRUCTOR)
return uniform_vector_p (op);
if (TREE_CODE (op) == SSA_NAME)
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 8cdc3c2521e..8811679e2bf 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -1481,13 +1481,10 @@ slpeel_add_loop_guard (basic_block guard_bb, tree cond,
/* Add new edge to connect guard block to the merge/loop-exit block. */
new_e = make_edge (guard_bb, guard_to, EDGE_TRUE_VALUE);
- new_e->count = guard_bb->count;
new_e->probability = probability;
- new_e->count = enter_e->count.apply_probability (probability);
if (irreducible_p)
new_e->flags |= EDGE_IRREDUCIBLE_LOOP;
- enter_e->count -= new_e->count;
enter_e->probability = probability.invert ();
set_immediate_dominator (CDI_DOMINATORS, guard_to, dom_bb);
@@ -2914,9 +2911,7 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
/* Simply propagate profile info from guard_bb to guard_to which is
a merge point of control flow. */
- guard_to->frequency = guard_bb->frequency;
guard_to->count = guard_bb->count;
- single_succ_edge (guard_to)->count = guard_to->count;
/* Scale probability of epilog loop back.
FIXME: We should avoid scaling down and back up. Profile may
get lost if we scale down to 0. */
@@ -3288,7 +3283,7 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
cond_expr = fold_build2 (GE_EXPR, boolean_type_node, scalar_loop_iters,
build_int_cst (TREE_TYPE (scalar_loop_iters),
th - 1));
- if (maybe_nonzero (versioning_threshold))
+ if (may_ne (versioning_threshold, 0U))
{
tree expr = fold_build2 (GE_EXPR, boolean_type_node, scalar_loop_iters,
build_int_cst (TREE_TYPE (scalar_loop_iters),
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 61da0e7998d..91a3610a1a0 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -1841,7 +1841,7 @@ vect_update_vf_for_slp (loop_vec_info loop_vinfo)
"=== vect_update_vf_for_slp ===\n");
vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
- gcc_assert (known_nonzero (vectorization_factor));
+ gcc_assert (must_ne (vectorization_factor, 0U));
/* If all the stmts in the loop can be SLPed, we perform only SLP, and
vectorization factor of the loop is the unrolling factor required by
@@ -2366,7 +2366,7 @@ start_over:
/* Now the vectorization factor is final. */
poly_uint64 vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
- gcc_assert (known_nonzero (vectorization_factor));
+ gcc_assert (must_ne (vectorization_factor, 0U));
if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) && dump_enabled_p ())
{
@@ -2814,7 +2814,7 @@ vect_analyze_loop (struct loop *loop, loop_vec_info orig_loop_vinfo)
if (fatal
|| next_size == vector_sizes.length ()
- || known_zero (current_vector_size))
+ || must_eq (current_vector_size, 0U))
return NULL;
/* Try the next biggest vector size. */
@@ -6666,9 +6666,13 @@ vectorizable_reduction (gimple *stmt, gimple_stmt_iterator *gsi,
reduc_index = i;
continue;
}
- else
+ else if (tem)
{
- if (!vectype_in)
+ /* To properly compute ncopies we are interested in the widest
+ input type in case we're looking at a widening accumulation. */
+ if (!vectype_in
+ || (GET_MODE_SIZE (SCALAR_TYPE_MODE (TREE_TYPE (vectype_in)))
+ < GET_MODE_SIZE (SCALAR_TYPE_MODE (TREE_TYPE (tem)))))
vectype_in = tem;
}
@@ -8642,36 +8646,27 @@ scale_profile_for_vect_loop (struct loop *loop, unsigned vf)
edge preheader = loop_preheader_edge (loop);
/* Reduce loop iterations by the vectorization factor. */
gcov_type new_est_niter = niter_for_unrolled_loop (loop, vf);
- profile_count freq_h = loop->header->count, freq_e = preheader->count;
+ profile_count freq_h = loop->header->count, freq_e = preheader->count ();
- /* Use frequency only if counts are zero. */
- if (!(freq_h > 0) && !(freq_e > 0))
- {
- freq_h = profile_count::from_gcov_type (loop->header->frequency);
- freq_e = profile_count::from_gcov_type (EDGE_FREQUENCY (preheader));
- }
- if (freq_h > 0)
+ if (freq_h.nonzero_p ())
{
profile_probability p;
/* Avoid dropping loop body profile counter to 0 because of zero count
in loop's preheader. */
- if (!(freq_e > profile_count::from_gcov_type (1)))
- freq_e = profile_count::from_gcov_type (1);
+ if (!(freq_e == profile_count::zero ()))
+ freq_e = freq_e.force_nonzero ();
p = freq_e.apply_scale (new_est_niter + 1, 1).probability_in (freq_h);
scale_loop_frequencies (loop, p);
}
- basic_block exit_bb = single_pred (loop->latch);
edge exit_e = single_exit (loop);
- exit_e->count = loop_preheader_edge (loop)->count;
exit_e->probability = profile_probability::always ()
.apply_scale (1, new_est_niter + 1);
edge exit_l = single_pred_edge (loop->latch);
profile_probability prob = exit_l->probability;
exit_l->probability = exit_e->probability.invert ();
- exit_l->count = exit_bb->count - exit_e->count;
if (prob.initialized_p () && exit_l->probability.initialized_p ())
scale_bbs_frequencies (&loop->latch, 1, exit_l->probability / prob);
}
@@ -9327,7 +9322,7 @@ optimize_mask_stores (struct loop *loop)
efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE);
/* Put STORE_BB to likely part. */
efalse->probability = profile_probability::unlikely ();
- store_bb->frequency = PROB_ALWAYS - EDGE_FREQUENCY (efalse);
+ store_bb->count = efalse->count ();
make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU);
if (dom_info_available_p (CDI_DOMINATORS))
set_immediate_dominator (CDI_DOMINATORS, store_bb, bb);
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 57c76364b9c..066ec48c056 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -3979,6 +3979,20 @@ vect_recog_mask_conversion_pattern (vec<gimple *> *stmts, tree *type_in,
}
else if (COMPARISON_CLASS_P (rhs1))
{
+ /* Check whether we're comparing scalar booleans and (if so)
+ whether a better mask type exists than the mask associated
+ with boolean-sized elements. This avoids unnecessary packs
+ and unpacks if the booleans are set from comparisons of
+ wider types. E.g. in:
+
+ int x1, x2, x3, x4, y1, y2;
+ ...
+ bool b1 = (x1 == x2);
+ bool b2 = (x3 == x4);
+ ... = b1 == b2 ? y1 : y2;
+
+ it is better for b1 and b2 to use the mask type associated
+ with int elements rather than bool (byte) elements. */
rhs1_type = search_type_for_mask (TREE_OPERAND (rhs1, 0), vinfo);
if (!rhs1_type)
rhs1_type = TREE_TYPE (TREE_OPERAND (rhs1, 0));
@@ -3992,9 +4006,11 @@ vect_recog_mask_conversion_pattern (vec<gimple *> *stmts, tree *type_in,
return NULL;
/* Continue if a conversion is needed. Also continue if we have
- a comparison whose natural vector type is different from VECTYPE2;
- in that case we'll replace the comparison with an SSA name and
- behave as though the comparison was an SSA name from the outset. */
+ a comparison whose vector type would normally be different from
+ VECTYPE2 when considered in isolation. In that case we'll
+ replace the comparison with an SSA name (so that we can record
+ its vector type) and behave as though the comparison was an SSA
+ name from the outset. */
if (must_eq (TYPE_VECTOR_SUBPARTS (vectype1),
TYPE_VECTOR_SUBPARTS (vectype2))
&& (TREE_CODE (rhs1) == SSA_NAME
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index c6ad2c557fa..01dfecaafb6 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3242,7 +3242,7 @@ vect_slp_bb (basic_block bb)
if (vectorized
|| next_size == vector_sizes.length ()
- || known_zero (current_vector_size)
+ || must_eq (current_vector_size, 0U)
/* If vect_slp_analyze_bb_1 signaled that analysis for all
vector sizes will fail do not bother iterating. */
|| fatal)
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 1bc2d43773c..f810212ca1b 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -98,6 +98,12 @@ record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
enum vect_cost_for_stmt kind, stmt_vec_info stmt_info,
int misalign, enum vect_cost_model_location where)
{
+ if ((kind == vector_load || kind == unaligned_load)
+ && STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+ kind = vector_gather_load;
+ if ((kind == vector_store || kind == unaligned_store)
+ && STMT_VINFO_GATHER_SCATTER_P (stmt_info))
+ kind = vector_scatter_store;
if (body_cost_vec)
{
tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
@@ -4756,7 +4762,7 @@ vectorizable_simd_clone_call (gimple *stmt, gimple_stmt_iterator *gsi,
vec<tree> vargs = vNULL;
size_t i, nargs;
tree lhs, rtype, ratype;
- vec<constructor_elt, va_gc> *ret_ctor_elts;
+ vec<constructor_elt, va_gc> *ret_ctor_elts = NULL;
/* Is STMT a vectorizable call? */
if (!is_gimple_call (stmt))
@@ -8975,7 +8981,7 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,
we need to skip the gaps after we manage to fully load
all elements. group_gap_adj is GROUP_SIZE here. */
group_elt += nunits;
- if (maybe_nonzero (group_gap_adj)
+ if (may_ne (group_gap_adj, 0U)
&& !slp_perm
&& must_eq (group_elt, group_size - group_gap_adj))
{
@@ -8990,7 +8996,7 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,
}
/* Bump the vector pointer to account for a gap or for excess
elements loaded for a permuted SLP load. */
- if (maybe_nonzero (group_gap_adj) && slp_perm)
+ if (may_ne (group_gap_adj, 0U) && slp_perm)
{
poly_wide_int bump_val
= (wi::to_wide (TYPE_SIZE_UNIT (elem_type))
@@ -10541,7 +10547,7 @@ get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size)
/* If no size was supplied, use the mode the target prefers. Otherwise
look up a vector mode of the specified size. */
- if (known_zero (size))
+ if (must_eq (size, 0U))
simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
else if (!multiple_p (size, nbytes, &nunits)
|| !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
@@ -10579,7 +10585,7 @@ get_vectype_for_scalar_type (tree scalar_type)
vectype = get_vectype_for_scalar_type_and_size (scalar_type,
current_vector_size);
if (vectype
- && known_zero (current_vector_size))
+ && must_eq (current_vector_size, 0U))
current_vector_size = GET_MODE_SIZE (TYPE_MODE (vectype));
return vectype;
}
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index a34dea905d3..f229c8983bf 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -1619,21 +1619,20 @@ extract_range_from_ssa_name (value_range *vr, tree var)
}
-/* Wrapper around int_const_binop. If the operation overflows and
- overflow is undefined, then adjust the result to be
- -INF or +INF depending on CODE, VAL1 and VAL2. Sets *OVERFLOW_P
- to whether the operation overflowed. For division by zero
- the result is indeterminate but *OVERFLOW_P is set. */
+/* Wrapper around int_const_binop. Return true if we can compute the
+ result; i.e. if the operation doesn't overflow or if the overflow is
+ undefined. In the latter case (if the operation overflows and
+ overflow is undefined), then adjust the result to be -INF or +INF
+ depending on CODE, VAL1 and VAL2. Return the value in *RES.
-static wide_int
-vrp_int_const_binop (enum tree_code code, tree val1, tree val2,
- bool *overflow_p)
+ Return false for division by zero, for which the result is
+ indeterminate. */
+
+static bool
+vrp_int_const_binop (enum tree_code code, tree val1, tree val2, wide_int *res)
{
bool overflow = false;
signop sign = TYPE_SIGN (TREE_TYPE (val1));
- wide_int res;
-
- *overflow_p = false;
switch (code)
{
@@ -1654,57 +1653,45 @@ vrp_int_const_binop (enum tree_code code, tree val1, tree val2,
/* It's unclear from the C standard whether shifts can overflow.
The following code ignores overflow; perhaps a C standard
interpretation ruling is needed. */
- res = wi::rshift (wi::to_wide (val1), wval2, sign);
+ *res = wi::rshift (wi::to_wide (val1), wval2, sign);
else
- res = wi::lshift (wi::to_wide (val1), wval2);
+ *res = wi::lshift (wi::to_wide (val1), wval2);
break;
}
case MULT_EXPR:
- res = wi::mul (wi::to_wide (val1),
- wi::to_wide (val2), sign, &overflow);
+ *res = wi::mul (wi::to_wide (val1),
+ wi::to_wide (val2), sign, &overflow);
break;
case TRUNC_DIV_EXPR:
case EXACT_DIV_EXPR:
if (val2 == 0)
- {
- *overflow_p = true;
- return res;
- }
+ return false;
else
- res = wi::div_trunc (wi::to_wide (val1),
- wi::to_wide (val2), sign, &overflow);
+ *res = wi::div_trunc (wi::to_wide (val1),
+ wi::to_wide (val2), sign, &overflow);
break;
case FLOOR_DIV_EXPR:
if (val2 == 0)
- {
- *overflow_p = true;
- return res;
- }
- res = wi::div_floor (wi::to_wide (val1),
- wi::to_wide (val2), sign, &overflow);
+ return false;
+ *res = wi::div_floor (wi::to_wide (val1),
+ wi::to_wide (val2), sign, &overflow);
break;
case CEIL_DIV_EXPR:
if (val2 == 0)
- {
- *overflow_p = true;
- return res;
- }
- res = wi::div_ceil (wi::to_wide (val1),
- wi::to_wide (val2), sign, &overflow);
+ return false;
+ *res = wi::div_ceil (wi::to_wide (val1),
+ wi::to_wide (val2), sign, &overflow);
break;
case ROUND_DIV_EXPR:
if (val2 == 0)
- {
- *overflow_p = 0;
- return res;
- }
- res = wi::div_round (wi::to_wide (val1),
- wi::to_wide (val2), sign, &overflow);
+ return false;
+ *res = wi::div_round (wi::to_wide (val1),
+ wi::to_wide (val2), sign, &overflow);
break;
default:
@@ -1747,16 +1734,15 @@ vrp_int_const_binop (enum tree_code code, tree val1, tree val2,
|| code == CEIL_DIV_EXPR
|| code == EXACT_DIV_EXPR
|| code == ROUND_DIV_EXPR)
- return wi::max_value (TYPE_PRECISION (TREE_TYPE (val1)),
+ *res = wi::max_value (TYPE_PRECISION (TREE_TYPE (val1)),
TYPE_SIGN (TREE_TYPE (val1)));
else
- return wi::min_value (TYPE_PRECISION (TREE_TYPE (val1)),
+ *res = wi::min_value (TYPE_PRECISION (TREE_TYPE (val1)),
TYPE_SIGN (TREE_TYPE (val1)));
+ return true;
}
- *overflow_p = overflow;
-
- return res;
+ return !overflow;
}
@@ -1852,7 +1838,6 @@ extract_range_from_multiplicative_op_1 (value_range *vr,
{
enum value_range_type rtype;
wide_int val, min, max;
- bool sop;
tree type;
/* Multiplications, divisions and shifts are a bit tricky to handle,
@@ -1883,58 +1868,50 @@ extract_range_from_multiplicative_op_1 (value_range *vr,
signop sgn = TYPE_SIGN (type);
/* Compute the 4 cross operations and their minimum and maximum value. */
- sop = false;
- val = vrp_int_const_binop (code, vr0->min, vr1->min, &sop);
- if (! sop)
- min = max = val;
-
- if (vr1->max == vr1->min)
- ;
- else if (! sop)
+ if (!vrp_int_const_binop (code, vr0->min, vr1->min, &val))
{
- val = vrp_int_const_binop (code, vr0->min, vr1->max, &sop);
- if (! sop)
- {
- if (wi::lt_p (val, min, sgn))
- min = val;
- else if (wi::gt_p (val, max, sgn))
- max = val;
- }
+ set_value_range_to_varying (vr);
+ return;
}
+ min = max = val;
- if (vr0->max == vr0->min)
- ;
- else if (! sop)
+ if (vr1->max != vr1->min)
{
- val = vrp_int_const_binop (code, vr0->max, vr1->min, &sop);
- if (! sop)
+ if (!vrp_int_const_binop (code, vr0->min, vr1->max, &val))
{
- if (wi::lt_p (val, min, sgn))
- min = val;
- else if (wi::gt_p (val, max, sgn))
- max = val;
+ set_value_range_to_varying (vr);
+ return;
}
+ if (wi::lt_p (val, min, sgn))
+ min = val;
+ else if (wi::gt_p (val, max, sgn))
+ max = val;
}
- if (vr0->min == vr0->max || vr1->min == vr1->max)
- ;
- else if (! sop)
+ if (vr0->max != vr0->min)
{
- val = vrp_int_const_binop (code, vr0->max, vr1->max, &sop);
- if (! sop)
+ if (!vrp_int_const_binop (code, vr0->max, vr1->min, &val))
{
- if (wi::lt_p (val, min, sgn))
- min = val;
- else if (wi::gt_p (val, max, sgn))
- max = val;
+ set_value_range_to_varying (vr);
+ return;
}
+ if (wi::lt_p (val, min, sgn))
+ min = val;
+ else if (wi::gt_p (val, max, sgn))
+ max = val;
}
- /* If either operation overflowed, drop to VARYING. */
- if (sop)
+ if (vr0->min != vr0->max && vr1->min != vr1->max)
{
- set_value_range_to_varying (vr);
- return;
+ if (!vrp_int_const_binop (code, vr0->max, vr1->max, &val))
+ {
+ set_value_range_to_varying (vr);
+ return;
+ }
+ if (wi::lt_p (val, min, sgn))
+ min = val;
+ else if (wi::gt_p (val, max, sgn))
+ max = val;
}
/* If the new range has its limits swapped around (MIN > MAX),
@@ -2804,7 +2781,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
return;
}
}
- else if (!symbolic_range_p (&vr0) && !symbolic_range_p (&vr1))
+ else if (range_int_cst_p (&vr0) && range_int_cst_p (&vr1))
{
extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
return;
@@ -6856,10 +6833,7 @@ check_array_bounds (tree *tp, int *walk_subtree, void *data)
if (EXPR_HAS_LOCATION (t))
location = EXPR_LOCATION (t);
else
- {
- location_t *locp = (location_t *) wi->info;
- location = *locp;
- }
+ location = gimple_location (wi->stmt);
*walk_subtree = TRUE;
@@ -6906,9 +6880,6 @@ check_all_array_refs (void)
memset (&wi, 0, sizeof (wi));
- location_t loc = gimple_location (stmt);
- wi.info = &loc;
-
walk_gimple_op (gsi_stmt (si),
check_array_bounds,
&wi);
@@ -8111,6 +8082,13 @@ extract_range_from_stmt (gimple *stmt, edge *taken_edge_p,
vrp_visit_switch_stmt (as_a <gswitch *> (stmt), taken_edge_p);
}
+class vrp_prop : public ssa_propagation_engine
+{
+ public:
+ enum ssa_prop_result visit_stmt (gimple *, edge *, tree *) FINAL OVERRIDE;
+ enum ssa_prop_result visit_phi (gphi *) FINAL OVERRIDE;
+};
+
/* Evaluate statement STMT. If the statement produces a useful range,
return SSA_PROP_INTERESTING and record the SSA name with the
interesting range into *OUTPUT_P.
@@ -8120,8 +8098,8 @@ extract_range_from_stmt (gimple *stmt, edge *taken_edge_p,
If STMT produces a varying value, return SSA_PROP_VARYING. */
-static enum ssa_prop_result
-vrp_visit_stmt (gimple *stmt, edge *taken_edge_p, tree *output_p)
+enum ssa_prop_result
+vrp_prop::visit_stmt (gimple *stmt, edge *taken_edge_p, tree *output_p)
{
value_range vr = VR_INITIALIZER;
tree lhs = gimple_get_lhs (stmt);
@@ -9212,8 +9190,8 @@ update_range:
edges. If a valid value range can be derived from all the incoming
value ranges, set a new range for the LHS of PHI. */
-static enum ssa_prop_result
-vrp_visit_phi_node (gphi *phi)
+enum ssa_prop_result
+vrp_prop::visit_phi (gphi *phi)
{
tree lhs = PHI_RESULT (phi);
value_range vr_result = VR_INITIALIZER;
@@ -10548,10 +10526,17 @@ fold_predicate_in (gimple_stmt_iterator *si)
return false;
}
+class vrp_folder : public substitute_and_fold_engine
+{
+ public:
+ tree get_value (tree) FINAL OVERRIDE;
+ bool fold_stmt (gimple_stmt_iterator *) FINAL OVERRIDE;
+};
+
/* Callback for substitute_and_fold folding the stmt at *SI. */
-static bool
-vrp_fold_stmt (gimple_stmt_iterator *si)
+bool
+vrp_folder::fold_stmt (gimple_stmt_iterator *si)
{
if (fold_predicate_in (si))
return true;
@@ -10559,6 +10544,18 @@ vrp_fold_stmt (gimple_stmt_iterator *si)
return simplify_stmt_using_ranges (si);
}
+/* If OP has a value range with a single constant value return that,
+ otherwise return NULL_TREE. This returns OP itself if OP is a
+ constant.
+
+ Implemented as a pure wrapper right now, but this will change. */
+
+tree
+vrp_folder::get_value (tree op)
+{
+ return op_with_constant_singleton_value_range (op);
+}
+
/* Return the LHS of any ASSERT_EXPR where OP appears as the first
argument to the ASSERT_EXPR and in which the ASSERT_EXPR dominates
BB. If no such ASSERT_EXPR is found, return OP. */
@@ -10900,7 +10897,8 @@ vrp_finalize (bool warn_array_bounds_p)
wi::to_wide (vr_value[i]->max));
}
- substitute_and_fold (op_with_constant_singleton_value_range, vrp_fold_stmt);
+ class vrp_folder vrp_folder;
+ vrp_folder.substitute_and_fold ();
if (warn_array_bounds && warn_array_bounds_p)
check_all_array_refs ();
@@ -10968,33 +10966,17 @@ evrp_dom_walker::try_find_new_range (tree name,
edge
evrp_dom_walker::before_dom_children (basic_block bb)
{
- tree op0 = NULL_TREE;
- edge_iterator ei;
- edge e;
-
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "Visiting BB%d\n", bb->index);
stack.safe_push (std::make_pair (NULL_TREE, (value_range *)NULL));
- edge pred_e = NULL;
- FOR_EACH_EDGE (e, ei, bb->preds)
- {
- /* Ignore simple backedges from this to allow recording conditions
- in loop headers. */
- if (dominated_by_p (CDI_DOMINATORS, e->src, e->dest))
- continue;
- if (! pred_e)
- pred_e = e;
- else
- {
- pred_e = NULL;
- break;
- }
- }
+ edge pred_e = single_pred_edge_ignoring_loop_edges (bb, false);
if (pred_e)
{
gimple *stmt = last_stmt (pred_e->src);
+ tree op0 = NULL_TREE;
+
if (stmt
&& gimple_code (stmt) == GIMPLE_COND
&& (op0 = gimple_cond_lhs (stmt))
@@ -11038,6 +11020,8 @@ evrp_dom_walker::before_dom_children (basic_block bb)
/* Visit PHI stmts and discover any new VRs possible. */
bool has_unvisited_preds = false;
+ edge_iterator ei;
+ edge e;
FOR_EACH_EDGE (e, ei, bb->preds)
if (e->flags & EDGE_EXECUTABLE
&& !(e->src->flags & BB_VISITED))
@@ -11237,8 +11221,8 @@ evrp_dom_walker::before_dom_children (basic_block bb)
}
/* Try folding stmts with the VR discovered. */
- bool did_replace
- = replace_uses_in (stmt, op_with_constant_singleton_value_range);
+ class vrp_folder vrp_folder;
+ bool did_replace = vrp_folder.replace_uses_in (stmt);
if (fold_stmt (&gsi, follow_single_use_edges)
|| did_replace)
{
@@ -11488,7 +11472,8 @@ execute_vrp (bool warn_array_bounds_p)
vrp_initialize_lattice ();
vrp_initialize ();
- ssa_propagate (vrp_visit_stmt, vrp_visit_phi_node);
+ class vrp_prop vrp_prop;
+ vrp_prop.ssa_propagate ();
vrp_finalize (warn_array_bounds_p);
/* We must identify jump threading opportunities before we release
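
The tree-vrp.c hunks above change vrp_int_const_binop from returning a wide_int plus an *OVERFLOW_P flag to returning a bool and writing the result through a pointer, so extract_range_from_multiplicative_op_1 can drop to VR_VARYING as soon as any of the four cross products cannot be computed. A minimal standalone sketch of that calling convention follows; it uses plain int instead of wide_int and an invented helper name (checked_div), so it is not GCC code:

#include <climits>
#include <iostream>

/* Invented helper mirroring the new convention: return false when no
   result can be produced, otherwise write the value to *RES.  */
static bool
checked_div (int val1, int val2, int *res)
{
  if (val2 == 0)
    return false;                    /* indeterminate, like division by zero */
  if (val1 == INT_MIN && val2 == -1)
    {
      *res = INT_MAX;                /* saturate the one signed-overflow case */
      return true;
    }
  *res = val1 / val2;
  return true;
}

int
main ()
{
  int r;
  if (!checked_div (10, 0, &r))
    std::cout << "varying\n";        /* caller would set VR_VARYING */
  if (checked_div (10, 3, &r))
    std::cout << r << '\n';          /* prints 3 */
  return 0;
}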
diff --git a/gcc/tree.c b/gcc/tree.c
index ee12d3f3c4f..e5ee29e49ce 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -270,7 +270,7 @@ tree integer_types[itk_none];
bool int_n_enabled_p[NUM_INT_N_ENTS];
struct int_n_trees_t int_n_trees [NUM_INT_N_ENTS];
-unsigned char tree_contains_struct[MAX_TREE_CODES][64];
+bool tree_contains_struct[MAX_TREE_CODES][64];
/* Number of operands for each OpenMP clause. */
unsigned const char omp_clause_num_ops[] =
@@ -780,40 +780,53 @@ tree_code_size (enum tree_code code)
switch (TREE_CODE_CLASS (code))
{
case tcc_declaration: /* A decl node */
- {
- switch (code)
- {
- case FIELD_DECL:
- return sizeof (struct tree_field_decl);
- case PARM_DECL:
- return sizeof (struct tree_parm_decl);
- case VAR_DECL:
- return sizeof (struct tree_var_decl);
- case LABEL_DECL:
- return sizeof (struct tree_label_decl);
- case RESULT_DECL:
- return sizeof (struct tree_result_decl);
- case CONST_DECL:
- return sizeof (struct tree_const_decl);
- case TYPE_DECL:
- return sizeof (struct tree_type_decl);
- case FUNCTION_DECL:
- return sizeof (struct tree_function_decl);
- case DEBUG_EXPR_DECL:
- return sizeof (struct tree_decl_with_rtl);
- case TRANSLATION_UNIT_DECL:
- return sizeof (struct tree_translation_unit_decl);
- case NAMESPACE_DECL:
- case IMPORTED_DECL:
- case NAMELIST_DECL:
- return sizeof (struct tree_decl_non_common);
- default:
- return lang_hooks.tree_size (code);
- }
- }
+ switch (code)
+ {
+ case FIELD_DECL: return sizeof (tree_field_decl);
+ case PARM_DECL: return sizeof (tree_parm_decl);
+ case VAR_DECL: return sizeof (tree_var_decl);
+ case LABEL_DECL: return sizeof (tree_label_decl);
+ case RESULT_DECL: return sizeof (tree_result_decl);
+ case CONST_DECL: return sizeof (tree_const_decl);
+ case TYPE_DECL: return sizeof (tree_type_decl);
+ case FUNCTION_DECL: return sizeof (tree_function_decl);
+ case DEBUG_EXPR_DECL: return sizeof (tree_decl_with_rtl);
+ case TRANSLATION_UNIT_DECL: return sizeof (tree_translation_unit_decl);
+ case NAMESPACE_DECL:
+ case IMPORTED_DECL:
+ case NAMELIST_DECL: return sizeof (tree_decl_non_common);
+ default:
+ gcc_checking_assert (code >= NUM_TREE_CODES);
+ return lang_hooks.tree_size (code);
+ }
case tcc_type: /* a type node */
- return sizeof (struct tree_type_non_common);
+ switch (code)
+ {
+ case OFFSET_TYPE:
+ case ENUMERAL_TYPE:
+ case BOOLEAN_TYPE:
+ case INTEGER_TYPE:
+ case REAL_TYPE:
+ case POINTER_TYPE:
+ case REFERENCE_TYPE:
+ case NULLPTR_TYPE:
+ case FIXED_POINT_TYPE:
+ case COMPLEX_TYPE:
+ case VECTOR_TYPE:
+ case ARRAY_TYPE:
+ case RECORD_TYPE:
+ case UNION_TYPE:
+ case QUAL_UNION_TYPE:
+ case VOID_TYPE:
+ case POINTER_BOUNDS_TYPE:
+ case FUNCTION_TYPE:
+ case METHOD_TYPE:
+ case LANG_TYPE: return sizeof (tree_type_non_common);
+ default:
+ gcc_checking_assert (code >= NUM_TREE_CODES);
+ return lang_hooks.tree_size (code);
+ }
case tcc_reference: /* a reference */
case tcc_expression: /* an expression */
@@ -827,18 +840,18 @@ tree_code_size (enum tree_code code)
case tcc_constant: /* a constant */
switch (code)
{
- case VOID_CST: return sizeof (struct tree_typed);
+ case VOID_CST: return sizeof (tree_typed);
case INTEGER_CST: gcc_unreachable ();
- case POLY_INT_CST: return sizeof (struct tree_poly_int_cst);
- case REAL_CST: return sizeof (struct tree_real_cst);
- case FIXED_CST: return sizeof (struct tree_fixed_cst);
- case COMPLEX_CST: return sizeof (struct tree_complex);
- case VECTOR_CST: return sizeof (struct tree_vector);
- case VEC_DUPLICATE_CST: return sizeof (struct tree_vector);
- case VEC_SERIES_CST:
- return sizeof (struct tree_vector) + sizeof (tree);
+ case POLY_INT_CST: return sizeof (tree_poly_int_cst);
+ case REAL_CST: return sizeof (tree_real_cst);
+ case FIXED_CST: return sizeof (tree_fixed_cst);
+ case COMPLEX_CST: return sizeof (tree_complex);
+ case VECTOR_CST: return sizeof (tree_vector);
+ case VEC_DUPLICATE_CST: return sizeof (tree_vector);
+ case VEC_SERIES_CST: return sizeof (tree_vector) + sizeof (tree);
case STRING_CST: gcc_unreachable ();
default:
+ gcc_checking_assert (code >= NUM_TREE_CODES);
return lang_hooks.tree_size (code);
}
@@ -846,23 +859,24 @@ tree_code_size (enum tree_code code)
switch (code)
{
case IDENTIFIER_NODE: return lang_hooks.identifier_size;
- case TREE_LIST: return sizeof (struct tree_list);
+ case TREE_LIST: return sizeof (tree_list);
case ERROR_MARK:
- case PLACEHOLDER_EXPR: return sizeof (struct tree_common);
+ case PLACEHOLDER_EXPR: return sizeof (tree_common);
- case TREE_VEC:
+ case TREE_VEC: gcc_unreachable ();
case OMP_CLAUSE: gcc_unreachable ();
- case SSA_NAME: return sizeof (struct tree_ssa_name);
+ case SSA_NAME: return sizeof (tree_ssa_name);
- case STATEMENT_LIST: return sizeof (struct tree_statement_list);
+ case STATEMENT_LIST: return sizeof (tree_statement_list);
case BLOCK: return sizeof (struct tree_block);
- case CONSTRUCTOR: return sizeof (struct tree_constructor);
- case OPTIMIZATION_NODE: return sizeof (struct tree_optimization_option);
- case TARGET_OPTION_NODE: return sizeof (struct tree_target_option);
+ case CONSTRUCTOR: return sizeof (tree_constructor);
+ case OPTIMIZATION_NODE: return sizeof (tree_optimization_option);
+ case TARGET_OPTION_NODE: return sizeof (tree_target_option);
default:
+ gcc_checking_assert (code >= NUM_TREE_CODES);
return lang_hooks.tree_size (code);
}
@@ -1218,8 +1232,8 @@ copy_node (tree node MEM_STAT_DECL)
The two statements usually duplicate each other
(because they clear fields of the same union),
but the optimizer should catch that. */
- TYPE_SYMTAB_POINTER (t) = 0;
TYPE_SYMTAB_ADDRESS (t) = 0;
+ TYPE_SYMTAB_DIE (t) = 0;
/* Do not copy the values cache. */
if (TYPE_CACHED_VALUES_P (t))
@@ -1804,13 +1818,16 @@ cst_and_fits_in_hwi (const_tree x)
/* Build a new VEC_DUPLICATE_CST with type TYPE and operand EXP.
- Note that this function is only suitable for callers that specifically
- need a VEC_DUPLICATE_CST node. Use build_vector_from_val to duplicate
- a general scalar into a general vector type. */
+ This function is only suitable for callers that know TYPE is a
+ variable-length vector and specifically need a VEC_DUPLICATE_CST node.
+ Use build_vector_from_val to duplicate a general scalar into a general
+ vector type. */
-tree
+static tree
build_vec_duplicate_cst (tree type, tree exp MEM_STAT_DECL)
{
+ gcc_assert (!TYPE_VECTOR_SUBPARTS (type).is_constant ());
+
int length = sizeof (struct tree_vector);
record_node_allocation_statistics (VEC_DUPLICATE_CST, length);
@@ -1832,9 +1849,11 @@ build_vec_duplicate_cst (tree type, tree exp MEM_STAT_DECL)
need a VEC_SERIES_CST node. Use build_vec_series to build a general
series vector from a general base and step. */
-tree
+static tree
build_vec_series_cst (tree type, tree base, tree step MEM_STAT_DECL)
{
+ gcc_assert (!TYPE_VECTOR_SUBPARTS (type).is_constant ());
+
int length = sizeof (struct tree_vector) + sizeof (tree);
record_node_allocation_statistics (VEC_SERIES_CST, length);
@@ -1970,7 +1989,8 @@ build_vector_from_val (tree vectype, tree sc)
}
/* Build a vector series of type TYPE in which element I has the value
- BASE + I * STEP. */
+ BASE + I * STEP. The result is a constant if BASE and STEP are constant
+ and a VEC_SERIES_EXPR otherwise. */
tree
build_vec_series (tree type, tree base, tree step)
@@ -1978,7 +1998,20 @@ build_vec_series (tree type, tree base, tree step)
if (integer_zerop (step))
return build_vector_from_val (type, base);
if (CONSTANT_CLASS_P (base) && CONSTANT_CLASS_P (step))
- return build_vec_series_cst (type, base, step);
+ {
+ unsigned HOST_WIDE_INT nunits;
+ if (!TYPE_VECTOR_SUBPARTS (type).is_constant (&nunits))
+ return build_vec_series_cst (type, base, step);
+
+ auto_vec<tree, 32> v (nunits);
+ v.quick_push (base);
+ for (unsigned int i = 1; i < nunits; ++i)
+ {
+ base = const_binop (PLUS_EXPR, TREE_TYPE (base), base, step);
+ v.quick_push (base);
+ }
+ return build_vector (type, v);
+ }
return build2 (VEC_SERIES_EXPR, type, base, step);
}
@@ -10309,6 +10342,13 @@ build_common_builtin_nodes (void)
"__builtin_alloca_with_align",
alloca_flags);
+ ftype = build_function_type_list (ptr_type_node, size_type_node,
+ size_type_node, size_type_node, NULL_TREE);
+ local_define_builtin ("__builtin_alloca_with_align_and_max", ftype,
+ BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX,
+ "__builtin_alloca_with_align_and_max",
+ alloca_flags);
+
ftype = build_function_type_list (void_type_node,
ptr_type_node, ptr_type_node,
ptr_type_node, NULL_TREE);
@@ -10623,7 +10663,7 @@ build_same_sized_truth_vector_type (tree vectype)
poly_uint64 size = GET_MODE_SIZE (TYPE_MODE (vectype));
- if (known_zero (size))
+ if (must_eq (size, 0U))
size = tree_to_uhwi (TYPE_SIZE_UNIT (vectype));
return build_truth_vector_type (TYPE_VECTOR_SUBPARTS (vectype), size);
@@ -11060,6 +11100,33 @@ maybe_build_call_expr_loc (location_t loc, combined_fn fn, tree type,
}
}
+/* Return a function call to the appropriate builtin alloca variant.
+
+ SIZE is the size to be allocated. ALIGN, if non-zero, is the requested
+ alignment of the allocated area. MAX_SIZE, if non-negative, is an upper
+ bound for SIZE in case it is not a fixed value. */
+
+tree
+build_alloca_call_expr (tree size, unsigned int align, HOST_WIDE_INT max_size)
+{
+ if (max_size >= 0)
+ {
+ tree t = builtin_decl_explicit (BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX);
+ return
+ build_call_expr (t, 3, size, size_int (align), size_int (max_size));
+ }
+ else if (align > 0)
+ {
+ tree t = builtin_decl_explicit (BUILT_IN_ALLOCA_WITH_ALIGN);
+ return build_call_expr (t, 2, size, size_int (align));
+ }
+ else
+ {
+ tree t = builtin_decl_explicit (BUILT_IN_ALLOCA);
+ return build_call_expr (t, 1, size);
+ }
+}
+
/* Create a new constant string literal and return a char* pointer to it.
The STRING_CST value is the LEN characters at STR. */
tree
@@ -12394,7 +12461,7 @@ get_binfo_at_offset (tree binfo, poly_int64 offset, tree expected_type)
/* Offset 0 indicates the primary base, whose vtable contents are
represented in the binfo for the derived class. */
- else if (maybe_nonzero (offset))
+ else if (may_ne (offset, 0))
{
tree found_binfo = NULL, base_binfo;
/* Offsets in BINFO are in bytes relative to the whole structure
@@ -12876,6 +12943,9 @@ array_at_struct_end_p (tree ref)
else
return false;
+ if (TREE_CODE (ref) == STRING_CST)
+ return false;
+
while (handled_component_p (ref))
{
/* If the reference chain contains a component reference to a
@@ -14180,10 +14250,15 @@ test_integer_constants ()
static void
test_vec_duplicate_predicates_int (tree type)
{
- tree vec_type = build_vector_type (type, 4);
+ scalar_int_mode int_mode = SCALAR_INT_TYPE_MODE (type);
+ machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (int_mode);
+ /* This will be 1 if VEC_MODE isn't a vector mode. */
+ poly_uint64 nunits = GET_MODE_NUNITS (vec_mode);
+
+ tree vec_type = build_vector_type (type, nunits);
tree zero = build_zero_cst (type);
- tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
+ tree vec_zero = build_vector_from_val (vec_type, zero);
ASSERT_TRUE (integer_zerop (vec_zero));
ASSERT_FALSE (integer_onep (vec_zero));
ASSERT_FALSE (integer_minus_onep (vec_zero));
@@ -14192,7 +14267,7 @@ test_vec_duplicate_predicates_int (tree type)
ASSERT_TRUE (initializer_zerop (vec_zero));
tree one = build_one_cst (type);
- tree vec_one = build_vec_duplicate_cst (vec_type, one);
+ tree vec_one = build_vector_from_val (vec_type, one);
ASSERT_FALSE (integer_zerop (vec_one));
ASSERT_TRUE (integer_onep (vec_one));
ASSERT_FALSE (integer_minus_onep (vec_one));
@@ -14201,7 +14276,7 @@ test_vec_duplicate_predicates_int (tree type)
ASSERT_FALSE (initializer_zerop (vec_one));
tree minus_one = build_minus_one_cst (type);
- tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
+ tree vec_minus_one = build_vector_from_val (vec_type, minus_one);
ASSERT_FALSE (integer_zerop (vec_minus_one));
ASSERT_FALSE (integer_onep (vec_minus_one));
ASSERT_TRUE (integer_minus_onep (vec_minus_one));
@@ -14223,24 +14298,29 @@ test_vec_duplicate_predicates_int (tree type)
static void
test_vec_duplicate_predicates_float (tree type)
{
- tree vec_type = build_vector_type (type, 4);
+ scalar_float_mode float_mode = SCALAR_FLOAT_TYPE_MODE (type);
+ machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (float_mode);
+ /* This will be 1 if VEC_MODE isn't a vector mode. */
+ poly_uint64 nunits = GET_MODE_NUNITS (vec_mode);
+
+ tree vec_type = build_vector_type (type, nunits);
tree zero = build_zero_cst (type);
- tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
+ tree vec_zero = build_vector_from_val (vec_type, zero);
ASSERT_TRUE (real_zerop (vec_zero));
ASSERT_FALSE (real_onep (vec_zero));
ASSERT_FALSE (real_minus_onep (vec_zero));
ASSERT_TRUE (initializer_zerop (vec_zero));
tree one = build_one_cst (type);
- tree vec_one = build_vec_duplicate_cst (vec_type, one);
+ tree vec_one = build_vector_from_val (vec_type, one);
ASSERT_FALSE (real_zerop (vec_one));
ASSERT_TRUE (real_onep (vec_one));
ASSERT_FALSE (real_minus_onep (vec_one));
ASSERT_FALSE (initializer_zerop (vec_one));
tree minus_one = build_minus_one_cst (type);
- tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
+ tree vec_minus_one = build_vector_from_val (vec_type, minus_one);
ASSERT_FALSE (real_zerop (vec_minus_one));
ASSERT_FALSE (real_onep (vec_minus_one));
ASSERT_TRUE (real_minus_onep (vec_minus_one));
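
In tree.c, build_vec_series now expands a constant BASE/STEP series for fixed-length vectors into an ordinary VECTOR_CST, adding STEP once per element, and reserves VEC_SERIES_CST for variable-length vectors only. The following standalone illustration of the element-by-element expansion uses std::vector<int> as a stand-in for the tree vector; the names are invented and it is only a sketch of the idea:

#include <iostream>
#include <vector>

/* Element I of the result is BASE + I * STEP, built by adding STEP once
   per element, as the new build_vec_series loop does with PLUS_EXPR.  */
static std::vector<int>
build_series (int base, int step, unsigned nunits)
{
  std::vector<int> v;
  v.reserve (nunits);
  v.push_back (base);
  for (unsigned i = 1; i < nunits; ++i)
    {
      base += step;
      v.push_back (base);
    }
  return v;
}

int
main ()
{
  for (int e : build_series (3, 2, 4))
    std::cout << e << ' ';           /* prints: 3 5 7 9 */
  std::cout << '\n';
  return 0;
}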
diff --git a/gcc/tree.def b/gcc/tree.def
index 608d950b20e..051ecd4897e 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -308,11 +308,14 @@ DEFTREECODE (COMPLEX_CST, "complex_cst", tcc_constant, 0)
DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)
/* Represents a vector constant in which every element is equal to
- VEC_DUPLICATE_CST_ELT. */
+ VEC_DUPLICATE_CST_ELT. This is only ever used for variable-length
+ vectors; fixed-length vectors must use VECTOR_CST instead. */
DEFTREECODE (VEC_DUPLICATE_CST, "vec_duplicate_cst", tcc_constant, 0)
/* Represents a vector constant in which element i is equal to
- VEC_SERIES_CST_BASE + i * VEC_SERIES_CST_STEP. */
+ VEC_SERIES_CST_BASE + i * VEC_SERIES_CST_STEP. This is only ever
+ used for variable-length vectors; fixed-length vectors must use
+ VECTOR_CST instead. */
DEFTREECODE (VEC_SERIES_CST, "vec_series_cst", tcc_constant, 0)
/* Contents are TREE_STRING_LENGTH and the actual contents of the string. */
diff --git a/gcc/tree.h b/gcc/tree.h
index 1b05d969541..a73928fa3ee 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -76,64 +76,43 @@ as_internal_fn (combined_fn code)
/* Macros for initializing `tree_contains_struct'. */
#define MARK_TS_BASE(C) \
- do { \
- tree_contains_struct[C][TS_BASE] = 1; \
- } while (0)
+ (tree_contains_struct[C][TS_BASE] = true)
#define MARK_TS_TYPED(C) \
- do { \
- MARK_TS_BASE (C); \
- tree_contains_struct[C][TS_TYPED] = 1; \
- } while (0)
+ (MARK_TS_BASE (C), \
+ tree_contains_struct[C][TS_TYPED] = true)
#define MARK_TS_COMMON(C) \
- do { \
- MARK_TS_TYPED (C); \
- tree_contains_struct[C][TS_COMMON] = 1; \
- } while (0)
+ (MARK_TS_TYPED (C), \
+ tree_contains_struct[C][TS_COMMON] = true)
#define MARK_TS_TYPE_COMMON(C) \
- do { \
- MARK_TS_COMMON (C); \
- tree_contains_struct[C][TS_TYPE_COMMON] = 1; \
- } while (0)
+ (MARK_TS_COMMON (C), \
+ tree_contains_struct[C][TS_TYPE_COMMON] = true)
#define MARK_TS_TYPE_WITH_LANG_SPECIFIC(C) \
- do { \
- MARK_TS_TYPE_COMMON (C); \
- tree_contains_struct[C][TS_TYPE_WITH_LANG_SPECIFIC] = 1; \
- } while (0)
+ (MARK_TS_TYPE_COMMON (C), \
+ tree_contains_struct[C][TS_TYPE_WITH_LANG_SPECIFIC] = true)
#define MARK_TS_DECL_MINIMAL(C) \
- do { \
- MARK_TS_COMMON (C); \
- tree_contains_struct[C][TS_DECL_MINIMAL] = 1; \
- } while (0)
+ (MARK_TS_COMMON (C), \
+ tree_contains_struct[C][TS_DECL_MINIMAL] = true)
#define MARK_TS_DECL_COMMON(C) \
- do { \
- MARK_TS_DECL_MINIMAL (C); \
- tree_contains_struct[C][TS_DECL_COMMON] = 1; \
- } while (0)
+ (MARK_TS_DECL_MINIMAL (C), \
+ tree_contains_struct[C][TS_DECL_COMMON] = true)
#define MARK_TS_DECL_WRTL(C) \
- do { \
- MARK_TS_DECL_COMMON (C); \
- tree_contains_struct[C][TS_DECL_WRTL] = 1; \
- } while (0)
+ (MARK_TS_DECL_COMMON (C), \
+ tree_contains_struct[C][TS_DECL_WRTL] = true)
#define MARK_TS_DECL_WITH_VIS(C) \
- do { \
- MARK_TS_DECL_WRTL (C); \
- tree_contains_struct[C][TS_DECL_WITH_VIS] = 1; \
- } while (0)
+ (MARK_TS_DECL_WRTL (C), \
+ tree_contains_struct[C][TS_DECL_WITH_VIS] = true)
#define MARK_TS_DECL_NON_COMMON(C) \
- do { \
- MARK_TS_DECL_WITH_VIS (C); \
- tree_contains_struct[C][TS_DECL_NON_COMMON] = 1; \
- } while (0)
-
+ (MARK_TS_DECL_WITH_VIS (C), \
+ tree_contains_struct[C][TS_DECL_NON_COMMON] = true)
/* Returns the string representing CLASS. */
@@ -2103,11 +2082,6 @@ extern machine_mode vector_type_mode (const_tree);
#define TYPE_SYMTAB_ADDRESS(NODE) \
(TYPE_CHECK (NODE)->type_common.symtab.address)
-/* Symtab field as a string. Used by COFF generator in sdbout.c to
- hold struct/union type tag names. */
-#define TYPE_SYMTAB_POINTER(NODE) \
- (TYPE_CHECK (NODE)->type_common.symtab.pointer)
-
/* Symtab field as a pointer to a DWARF DIE. Used by DWARF generator
in dwarf2out.c to point to the DIE generated for the type. */
#define TYPE_SYMTAB_DIE(NODE) \
@@ -2118,8 +2092,7 @@ extern machine_mode vector_type_mode (const_tree);
union. */
#define TYPE_SYMTAB_IS_ADDRESS (0)
-#define TYPE_SYMTAB_IS_POINTER (1)
-#define TYPE_SYMTAB_IS_DIE (2)
+#define TYPE_SYMTAB_IS_DIE (1)
#define TYPE_LANG_SPECIFIC(NODE) \
(TYPE_CHECK (NODE)->type_with_lang_specific.lang_specific)
@@ -2427,6 +2400,18 @@ extern machine_mode vector_type_mode (const_tree);
#define DECL_FUNCTION_CODE(NODE) \
(FUNCTION_DECL_CHECK (NODE)->function_decl.function_code)
+/* Test if FCODE is a function code for an alloca operation. */
+#define ALLOCA_FUNCTION_CODE_P(FCODE) \
+ ((FCODE) == BUILT_IN_ALLOCA \
+ || (FCODE) == BUILT_IN_ALLOCA_WITH_ALIGN \
+ || (FCODE) == BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX)
+
+/* Generate case for an alloca operation. */
+#define CASE_BUILT_IN_ALLOCA \
+ case BUILT_IN_ALLOCA: \
+ case BUILT_IN_ALLOCA_WITH_ALIGN: \
+ case BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX
+
#define DECL_FUNCTION_PERSONALITY(NODE) \
(FUNCTION_DECL_CHECK (NODE)->function_decl.personality)
@@ -4093,8 +4078,6 @@ extern tree build_int_cst (tree, poly_int64);
extern tree build_int_cstu (tree type, poly_uint64);
extern tree build_int_cst_type (tree, poly_int64);
extern tree make_vector (unsigned CXX_MEM_STAT_INFO);
-extern tree build_vec_duplicate_cst (tree, tree CXX_MEM_STAT_INFO);
-extern tree build_vec_series_cst (tree, tree, tree CXX_MEM_STAT_INFO);
extern tree build_vector (tree, vec<tree> CXX_MEM_STAT_INFO);
extern tree build_vector_from_ctor (tree, vec<constructor_elt, va_gc> *);
extern tree build_vector_from_val (tree, tree);
@@ -4144,6 +4127,7 @@ extern tree build_call_expr_internal_loc_array (location_t, enum internal_fn,
tree, int, const tree *);
extern tree maybe_build_call_expr_loc (location_t, combined_fn, tree,
int, ...);
+extern tree build_alloca_call_expr (tree, unsigned int, HOST_WIDE_INT);
extern tree build_string_literal (int, const char *);
/* Construct various nodes representing data types. */
@@ -5601,8 +5585,9 @@ template <typename T>
bool
wi::fits_to_boolean_p (const T &x, const_tree type)
{
- return (known_zero (x)
- || (TYPE_UNSIGNED (type) ? known_one (x) : known_all_ones (x)));
+ typedef typename poly_int_traits<T>::int_type int_type;
+ return (must_eq (x, int_type (0))
+ || must_eq (x, int_type (TYPE_UNSIGNED (type) ? 1 : -1)));
}
template <typename T>
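
The tree.h hunk rewrites the MARK_TS_* initializers from do { ... } while (0) statement macros into comma expressions, so each level evaluates the previous level and then sets its own tree_contains_struct entry as a single expression. A minimal sketch of the pattern, with invented names (MARK_BASE/MARK_TYPED and a small contains array) rather than the real macros:

#include <iostream>

static bool contains[2][2];

#define MARK_BASE(C)  (contains[C][0] = true)
#define MARK_TYPED(C) (MARK_BASE (C), contains[C][1] = true) /* chains as one expression */

int
main ()
{
  MARK_TYPED (1);
  std::cout << contains[1][0] << contains[1][1] << '\n';  /* prints 11 */
  return 0;
}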
diff --git a/gcc/ubsan.c b/gcc/ubsan.c
index cfa08c0e6b6..3a0584271a3 100644
--- a/gcc/ubsan.c
+++ b/gcc/ubsan.c
@@ -804,6 +804,7 @@ ubsan_expand_null_ifn (gimple_stmt_iterator *gsip)
this edge is unlikely taken, so set up the probability accordingly. */
e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
e->probability = profile_probability::very_unlikely ();
+ then_bb->count = e->count ();
/* Connect 'then block' with the 'else block'. This is needed
as the ubsan routines we call in the 'then block' are not noreturn.
@@ -813,7 +814,6 @@ ubsan_expand_null_ifn (gimple_stmt_iterator *gsip)
/* Set up the fallthrough basic block. */
e = find_edge (cond_bb, fallthru_bb);
e->flags = EDGE_FALSE_VALUE;
- e->count = cond_bb->count;
e->probability = profile_probability::very_likely ();
/* Update dominance info for the newly created then_bb; note that
@@ -830,15 +830,17 @@ ubsan_expand_null_ifn (gimple_stmt_iterator *gsip)
enum built_in_function bcode
= (flag_sanitize_recover & ((check_align ? SANITIZE_ALIGNMENT : 0)
| (check_null ? SANITIZE_NULL : 0)))
- ? BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH
- : BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_ABORT;
+ ? BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1
+ : BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1_ABORT;
tree fn = builtin_decl_implicit (bcode);
+ int align_log = tree_log2 (align);
tree data
= ubsan_create_data ("__ubsan_null_data", 1, &loc,
ubsan_type_descriptor (TREE_TYPE (ckind),
UBSAN_PRINT_POINTER),
NULL_TREE,
- align,
+ build_int_cst (unsigned_char_type_node,
+ MAX (align_log, 0)),
fold_convert (unsigned_char_type_node, ckind),
NULL_TREE);
data = build_fold_addr_expr_loc (loc, data);
@@ -882,7 +884,6 @@ ubsan_expand_null_ifn (gimple_stmt_iterator *gsip)
/* Set up the fallthrough basic block. */
e = find_edge (cond1_bb, cond2_bb);
e->flags = EDGE_FALSE_VALUE;
- e->count = cond1_bb->count;
e->probability = profile_probability::very_likely ();
/* Update dominance info. */
@@ -1001,14 +1002,14 @@ ubsan_expand_objsize_ifn (gimple_stmt_iterator *gsi)
ubsan_type_descriptor (TREE_TYPE (ptr),
UBSAN_PRINT_POINTER),
NULL_TREE,
- build_zero_cst (pointer_sized_int_node),
+ build_zero_cst (unsigned_char_type_node),
ckind,
NULL_TREE);
data = build_fold_addr_expr_loc (loc, data);
enum built_in_function bcode
= (flag_sanitize_recover & SANITIZE_OBJECT_SIZE)
- ? BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH
- : BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_ABORT;
+ ? BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1
+ : BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1_ABORT;
tree p = make_ssa_name (pointer_sized_int_node);
g = gimple_build_assign (p, NOP_EXPR, ptr);
gimple_set_location (g, loc);
@@ -1073,7 +1074,6 @@ ubsan_expand_ptr_ifn (gimple_stmt_iterator *gsip)
e->flags = EDGE_FALSE_VALUE;
if (pos_neg != 3)
{
- e->count = cond_bb->count;
e->probability = profile_probability::very_likely ();
/* Connect 'then block' with the 'else block'. This is needed
@@ -1086,35 +1086,33 @@ ubsan_expand_ptr_ifn (gimple_stmt_iterator *gsip)
accordingly. */
e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
e->probability = profile_probability::very_unlikely ();
+ then_bb->count = e->count ();
}
else
{
- profile_count count = cond_bb->count.apply_probability (PROB_EVEN);
- e->count = count;
e->probability = profile_probability::even ();
e = split_block (fallthru_bb, (gimple *) NULL);
cond_neg_bb = e->src;
fallthru_bb = e->dest;
- e->count = count;
e->probability = profile_probability::very_likely ();
e->flags = EDGE_FALSE_VALUE;
e = make_edge (cond_neg_bb, then_bb, EDGE_TRUE_VALUE);
e->probability = profile_probability::very_unlikely ();
+ then_bb->count = e->count ();
cond_pos_bb = create_empty_bb (cond_bb);
add_bb_to_loop (cond_pos_bb, cond_bb->loop_father);
e = make_edge (cond_bb, cond_pos_bb, EDGE_TRUE_VALUE);
- e->count = count;
e->probability = profile_probability::even ();
+ cond_pos_bb->count = e->count ();
e = make_edge (cond_pos_bb, then_bb, EDGE_TRUE_VALUE);
e->probability = profile_probability::very_unlikely ();
e = make_edge (cond_pos_bb, fallthru_bb, EDGE_FALSE_VALUE);
- e->count = count;
e->probability = profile_probability::very_likely ();
make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU);
@@ -1449,7 +1447,7 @@ maybe_instrument_pointer_overflow (gimple_stmt_iterator *gsi, tree t)
fits, don't instrument anything. */
poly_int64 base_size;
if (offset == NULL_TREE
- && maybe_nonzero (bitpos)
+ && may_ne (bitpos, 0)
&& (VAR_P (base)
|| TREE_CODE (base) == PARM_DECL
|| TREE_CODE (base) == RESULT_DECL)
@@ -1476,7 +1474,7 @@ maybe_instrument_pointer_overflow (gimple_stmt_iterator *gsi, tree t)
if (!POINTER_TYPE_P (TREE_TYPE (base)) && !DECL_P (base))
return;
bytepos = bits_to_bytes_round_down (bitpos);
- if (offset == NULL_TREE && known_zero (bytepos) && moff == NULL_TREE)
+ if (offset == NULL_TREE && must_eq (bytepos, 0) && moff == NULL_TREE)
return;
tree base_addr = base;
@@ -1484,7 +1482,7 @@ maybe_instrument_pointer_overflow (gimple_stmt_iterator *gsi, tree t)
base_addr = build1 (ADDR_EXPR,
build_pointer_type (TREE_TYPE (base)), base);
t = offset;
- if (maybe_nonzero (bytepos))
+ if (may_ne (bytepos, 0))
{
if (t)
t = fold_build2 (PLUS_EXPR, TREE_TYPE (t), t,
@@ -2025,15 +2023,18 @@ instrument_nonnull_return (gimple_stmt_iterator *gsi)
else
{
tree data = ubsan_create_data ("__ubsan_nonnull_return_data",
- 2, loc, NULL_TREE, NULL_TREE);
+ 1, &loc[1], NULL_TREE, NULL_TREE);
data = build_fold_addr_expr_loc (loc[0], data);
+ tree data2 = ubsan_create_data ("__ubsan_nonnull_return_data",
+ 1, &loc[0], NULL_TREE, NULL_TREE);
+ data2 = build_fold_addr_expr_loc (loc[0], data2);
enum built_in_function bcode
= (flag_sanitize_recover & SANITIZE_RETURNS_NONNULL_ATTRIBUTE)
- ? BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN
- : BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_ABORT;
+ ? BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_V1
+ : BUILT_IN_UBSAN_HANDLE_NONNULL_RETURN_V1_ABORT;
tree fn = builtin_decl_explicit (bcode);
- g = gimple_build_call (fn, 1, data);
+ g = gimple_build_call (fn, 2, data, data2);
}
gimple_set_location (g, loc[0]);
gsi_insert_before (gsi, g, GSI_SAME_STMT);
@@ -2217,6 +2218,72 @@ instrument_object_size (gimple_stmt_iterator *gsi, tree t, bool is_lhs)
gsi_insert_before (gsi, g, GSI_SAME_STMT);
}
+/* Instrument values passed to builtin functions. */
+
+static void
+instrument_builtin (gimple_stmt_iterator *gsi)
+{
+ gimple *stmt = gsi_stmt (*gsi);
+ location_t loc = gimple_location (stmt);
+ tree arg;
+ enum built_in_function fcode
+ = DECL_FUNCTION_CODE (gimple_call_fndecl (stmt));
+ int kind = 0;
+ switch (fcode)
+ {
+ CASE_INT_FN (BUILT_IN_CLZ):
+ kind = 1;
+ gcc_fallthrough ();
+ CASE_INT_FN (BUILT_IN_CTZ):
+ arg = gimple_call_arg (stmt, 0);
+ if (!integer_nonzerop (arg))
+ {
+ gimple *g;
+ if (!is_gimple_val (arg))
+ {
+ g = gimple_build_assign (make_ssa_name (TREE_TYPE (arg)), arg);
+ gimple_set_location (g, loc);
+ gsi_insert_before (gsi, g, GSI_SAME_STMT);
+ arg = gimple_assign_lhs (g);
+ }
+
+ basic_block then_bb, fallthru_bb;
+ *gsi = create_cond_insert_point (gsi, true, false, true,
+ &then_bb, &fallthru_bb);
+ g = gimple_build_cond (EQ_EXPR, arg,
+ build_zero_cst (TREE_TYPE (arg)),
+ NULL_TREE, NULL_TREE);
+ gimple_set_location (g, loc);
+ gsi_insert_after (gsi, g, GSI_NEW_STMT);
+
+ *gsi = gsi_after_labels (then_bb);
+ if (flag_sanitize_undefined_trap_on_error)
+ g = gimple_build_call (builtin_decl_explicit (BUILT_IN_TRAP), 0);
+ else
+ {
+ tree t = build_int_cst (unsigned_char_type_node, kind);
+ tree data = ubsan_create_data ("__ubsan_builtin_data",
+ 1, &loc, NULL_TREE, t, NULL_TREE);
+ data = build_fold_addr_expr_loc (loc, data);
+ enum built_in_function bcode
+ = (flag_sanitize_recover & SANITIZE_BUILTIN)
+ ? BUILT_IN_UBSAN_HANDLE_INVALID_BUILTIN
+ : BUILT_IN_UBSAN_HANDLE_INVALID_BUILTIN_ABORT;
+ tree fn = builtin_decl_explicit (bcode);
+
+ g = gimple_build_call (fn, 1, data);
+ }
+ gimple_set_location (g, loc);
+ gsi_insert_before (gsi, g, GSI_SAME_STMT);
+ ubsan_create_edge (g);
+ }
+ *gsi = gsi_for_stmt (stmt);
+ break;
+ default:
+ break;
+ }
+}
+
namespace {
const pass_data pass_data_ubsan =
@@ -2248,7 +2315,8 @@ public:
| SANITIZE_NONNULL_ATTRIBUTE
| SANITIZE_RETURNS_NONNULL_ATTRIBUTE
| SANITIZE_OBJECT_SIZE
- | SANITIZE_POINTER_OVERFLOW));
+ | SANITIZE_POINTER_OVERFLOW
+ | SANITIZE_BUILTIN));
}
virtual unsigned int execute (function *);
@@ -2313,6 +2381,13 @@ pass_ubsan::execute (function *fun)
bb = gimple_bb (stmt);
}
+ if (sanitize_flags_p (SANITIZE_BUILTIN, fun->decl)
+ && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
+ {
+ instrument_builtin (&gsi);
+ bb = gimple_bb (stmt);
+ }
+
if (sanitize_flags_p (SANITIZE_RETURNS_NONNULL_ATTRIBUTE, fun->decl)
&& gimple_code (stmt) == GIMPLE_RETURN)
{
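
The new instrument_builtin hook in ubsan.c adds a runtime zero check on the argument of __builtin_clz/__builtin_ctz, gated by the new SANITIZE_BUILTIN flag, since those builtins are undefined for a zero argument. Assuming the user-facing spelling is -fsanitize=builtin (the option definition is outside this diff), a small program of the kind the check is meant to catch:

#include <cstdio>

int
main ()
{
  volatile unsigned x = 0;
  /* __builtin_clz is undefined for a zero argument; with the new
     instrumentation the zero is reported at run time instead of the
     call silently returning an unspecified value.  */
  std::printf ("%d\n", __builtin_clz (x));
  return 0;
}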
diff --git a/gcc/unique-ptr-tests.cc b/gcc/unique-ptr-tests.cc
new file mode 100644
index 00000000000..d2756947189
--- /dev/null
+++ b/gcc/unique-ptr-tests.cc
@@ -0,0 +1,234 @@
+/* Unit tests for unique-ptr.h.
+ Copyright (C) 2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#define INCLUDE_UNIQUE_PTR
+#include "system.h"
+#include "coretypes.h"
+#include "selftest.h"
+
+#if CHECKING_P
+
+namespace selftest {
+
+namespace {
+
+/* A class for counting ctor and dtor invocations. */
+
+struct stats
+{
+ stats () : ctor_count (0), dtor_count (0) {}
+
+ int ctor_count;
+ int dtor_count;
+};
+
+/* A class that uses "stats" to track its ctor and dtor invocations. */
+
+class foo
+{
+public:
+ foo (stats &s) : m_s (s) { ++m_s.ctor_count; }
+ ~foo () { ++m_s.dtor_count; }
+
+ int example_method () const { return 42; }
+
+private:
+ foo (const foo&);
+ foo & operator= (const foo &);
+
+private:
+ stats &m_s;
+};
+
+/* A struct for testing unique_ptr<T[]>. */
+
+struct has_default_ctor
+{
+ has_default_ctor () : m_field (42) {}
+ int m_field;
+};
+
+/* A dummy struct for testing unique_xmalloc_ptr. */
+
+struct dummy
+{
+ int field;
+};
+
+} // anonymous namespace
+
+/* Verify that the default ctor inits ptrs to NULL. */
+
+static void
+test_null_ptr ()
+{
+ gnu::unique_ptr<void *> p;
+ ASSERT_EQ (NULL, p);
+
+ gnu::unique_xmalloc_ptr<void *> q;
+ ASSERT_EQ (NULL, q);
+}
+
+/* Verify that deletion happens when a unique_ptr goes out of scope. */
+
+static void
+test_implicit_deletion ()
+{
+ stats s;
+ ASSERT_EQ (0, s.ctor_count);
+ ASSERT_EQ (0, s.dtor_count);
+
+ {
+ gnu::unique_ptr<foo> f (new foo (s));
+ ASSERT_NE (NULL, f);
+ ASSERT_EQ (1, s.ctor_count);
+ ASSERT_EQ (0, s.dtor_count);
+ }
+
+ /* Verify that the foo was implicitly deleted. */
+ ASSERT_EQ (1, s.ctor_count);
+ ASSERT_EQ (1, s.dtor_count);
+}
+
+/* Verify that we can assign to a NULL unique_ptr. */
+
+static void
+test_overwrite_of_null ()
+{
+ stats s;
+ ASSERT_EQ (0, s.ctor_count);
+ ASSERT_EQ (0, s.dtor_count);
+
+ {
+ gnu::unique_ptr<foo> f;
+ ASSERT_EQ (NULL, f);
+ ASSERT_EQ (0, s.ctor_count);
+ ASSERT_EQ (0, s.dtor_count);
+
+ /* Overwrite with a non-NULL value. */
+ f = gnu::unique_ptr<foo> (new foo (s));
+ ASSERT_EQ (1, s.ctor_count);
+ ASSERT_EQ (0, s.dtor_count);
+ }
+
+ /* Verify that the foo is implicitly deleted. */
+ ASSERT_EQ (1, s.ctor_count);
+ ASSERT_EQ (1, s.dtor_count);
+}
+
+/* Verify that we can assign to a non-NULL unique_ptr. */
+
+static void
+test_overwrite_of_non_null ()
+{
+ stats s;
+ ASSERT_EQ (0, s.ctor_count);
+ ASSERT_EQ (0, s.dtor_count);
+
+ {
+ gnu::unique_ptr<foo> f (new foo (s));
+ ASSERT_NE (NULL, f);
+ ASSERT_EQ (1, s.ctor_count);
+ ASSERT_EQ (0, s.dtor_count);
+
+ /* Overwrite with a different value. */
+ f = gnu::unique_ptr<foo> (new foo (s));
+ ASSERT_EQ (2, s.ctor_count);
+ ASSERT_EQ (1, s.dtor_count);
+ }
+
+ /* Verify that the 2nd foo was implicitly deleted. */
+ ASSERT_EQ (2, s.ctor_count);
+ ASSERT_EQ (2, s.dtor_count);
+}
+
+/* Verify that unique_ptr's overloaded ops work. */
+
+static void
+test_overloaded_ops ()
+{
+ stats s;
+ gnu::unique_ptr<foo> f (new foo (s));
+ ASSERT_EQ (42, f->example_method ());
+ ASSERT_EQ (42, (*f).example_method ());
+ ASSERT_EQ (f, f);
+ ASSERT_NE (NULL, f.get ());
+
+ gnu::unique_ptr<foo> g (new foo (s));
+ ASSERT_NE (f, g);
+}
+
+/* Verify that the gnu::unique_ptr specialization for T[] works. */
+
+static void
+test_array_new ()
+{
+ const int num = 10;
+ gnu::unique_ptr<has_default_ctor[]> p (new has_default_ctor[num]);
+ ASSERT_NE (NULL, p.get ());
+ /* Verify that operator[] works, and that the default ctor was called
+ on each element. */
+ for (int i = 0; i < num; i++)
+ ASSERT_EQ (42, p[i].m_field);
+}
+
+/* Verify that gnu::unique_xmalloc_ptr works. */
+
+static void
+test_xmalloc ()
+{
+ gnu::unique_xmalloc_ptr<dummy> p (XNEW (dummy));
+ ASSERT_NE (NULL, p.get ());
+}
+
+/* Verify the gnu::unique_xmalloc_ptr specialization for T[]. */
+
+static void
+test_xmalloc_array ()
+{
+ const int num = 10;
+ gnu::unique_xmalloc_ptr<dummy[]> p (XNEWVEC (dummy, num));
+ ASSERT_NE (NULL, p.get ());
+
+ /* Verify that operator[] works. */
+ for (int i = 0; i < num; i++)
+ p[i].field = 42;
+ for (int i = 0; i < num; i++)
+ ASSERT_EQ (42, p[i].field);
+}
+
+/* Run all of the selftests within this file. */
+
+void
+unique_ptr_tests_cc_tests ()
+{
+ test_null_ptr ();
+ test_implicit_deletion ();
+ test_overwrite_of_null ();
+ test_overwrite_of_non_null ();
+ test_overloaded_ops ();
+ test_array_new ();
+ test_xmalloc ();
+ test_xmalloc_array ();
+}
+
+} // namespace selftest
+
+#endif /* #if CHECKING_P */
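
The selftests above cover construction, implicit deletion, assignment, the overloaded operators, the T[] specialization, and unique_xmalloc_ptr. For orientation, here is a minimal usage sketch of the same API outside the selftest harness; the struct `point` and the function name are made up for illustration, while the includes and the gnu::unique_ptr / gnu::unique_xmalloc_ptr calls mirror what the tests exercise.

/* Usage sketch only, not part of the commit.  */
#include "config.h"
#define INCLUDE_UNIQUE_PTR
#include "system.h"
#include "coretypes.h"

/* Hypothetical payload type.  */
struct point { int x, y; };

static int
unique_ptr_usage_sketch ()
{
  /* Owns one heap object; deleted automatically when P goes out of scope.  */
  gnu::unique_ptr<point> p (new point ());
  p->x = 1;

  /* Owns xmalloc'd storage; the deleter calls free () instead of delete.  */
  gnu::unique_xmalloc_ptr<char> s (xstrdup ("hello"));

  /* Array form: released with delete[] and indexable via operator[].  */
  gnu::unique_ptr<int[]> a (new int[4] ());
  a[0] = 2;

  return p->x + a[0] + (int) strlen (s.get ());
}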
diff --git a/gcc/value-prof.c b/gcc/value-prof.c
index 23b8dc26471..85de3189f83 100644
--- a/gcc/value-prof.c
+++ b/gcc/value-prof.c
@@ -583,7 +583,7 @@ static bool
check_counter (gimple *stmt, const char * name,
gcov_type *count, gcov_type *all, profile_count bb_count_d)
{
- gcov_type bb_count = bb_count_d.to_gcov_type ();
+ gcov_type bb_count = bb_count_d.ipa ().to_gcov_type ();
if (*all != bb_count || *count > *all)
{
location_t locus;
@@ -745,20 +745,16 @@ gimple_divmod_fixed_value (gassign *stmt, tree value, profile_probability prob,
e12->flags &= ~EDGE_FALLTHRU;
e12->flags |= EDGE_FALSE_VALUE;
e12->probability = prob;
- e12->count = profile_count::from_gcov_type (count);
e13 = make_edge (bb, bb3, EDGE_TRUE_VALUE);
e13->probability = prob.invert ();
- e13->count = profile_count::from_gcov_type (all - count);
remove_edge (e23);
e24 = make_edge (bb2, bb4, EDGE_FALLTHRU);
e24->probability = profile_probability::always ();
- e24->count = profile_count::from_gcov_type (count);
e34->probability = profile_probability::always ();
- e34->count = profile_count::from_gcov_type (all - count);
return tmp2;
}
@@ -910,20 +906,16 @@ gimple_mod_pow2 (gassign *stmt, profile_probability prob, gcov_type count, gcov_
e12->flags &= ~EDGE_FALLTHRU;
e12->flags |= EDGE_FALSE_VALUE;
e12->probability = prob;
- e12->count = profile_count::from_gcov_type (count);
e13 = make_edge (bb, bb3, EDGE_TRUE_VALUE);
e13->probability = prob.invert ();
- e13->count = profile_count::from_gcov_type (all - count);
remove_edge (e23);
e24 = make_edge (bb2, bb4, EDGE_FALLTHRU);
e24->probability = profile_probability::always ();
- e24->count = profile_count::from_gcov_type (count);
e34->probability = profile_probability::always ();
- e34->count = profile_count::from_gcov_type (all - count);
return result;
}
@@ -1076,26 +1068,21 @@ gimple_mod_subtract (gassign *stmt, profile_probability prob1,
e12->flags &= ~EDGE_FALLTHRU;
e12->flags |= EDGE_FALSE_VALUE;
e12->probability = prob1.invert ();
- e12->count = profile_count::from_gcov_type (all - count1);
e14 = make_edge (bb, bb4, EDGE_TRUE_VALUE);
e14->probability = prob1;
- e14->count = profile_count::from_gcov_type (count1);
if (ncounts) /* Assumed to be 0 or 1. */
{
e23->flags &= ~EDGE_FALLTHRU;
e23->flags |= EDGE_FALSE_VALUE;
- e23->count = profile_count::from_gcov_type (all - count1 - count2);
e23->probability = prob2.invert ();
e24 = make_edge (bb2, bb4, EDGE_TRUE_VALUE);
e24->probability = prob2;
- e24->count = profile_count::from_gcov_type (count2);
}
e34->probability = profile_probability::always ();
- e34->count = profile_count::from_gcov_type (all - count1 - count2);
return result;
}
@@ -1312,7 +1299,7 @@ check_ic_target (gcall *call_stmt, struct cgraph_node *target)
gcall *
gimple_ic (gcall *icall_stmt, struct cgraph_node *direct_call,
- profile_probability prob, profile_count count, profile_count all)
+ profile_probability prob)
{
gcall *dcall_stmt;
gassign *load_stmt;
@@ -1367,11 +1354,11 @@ gimple_ic (gcall *icall_stmt, struct cgraph_node *direct_call,
/* Edge e_cd connects cond_bb to dcall_bb, etc; note the first letters. */
e_cd = split_block (cond_bb, cond_stmt);
dcall_bb = e_cd->dest;
- dcall_bb->count = count;
+ dcall_bb->count = cond_bb->count.apply_probability (prob);
e_di = split_block (dcall_bb, dcall_stmt);
icall_bb = e_di->dest;
- icall_bb->count = all - count;
+ icall_bb->count = cond_bb->count - dcall_bb->count;
/* Do not disturb existing EH edges from the indirect call. */
if (!stmt_ends_bb_p (icall_stmt))
@@ -1383,37 +1370,29 @@ gimple_ic (gcall *icall_stmt, struct cgraph_node *direct_call,
if (e_ij != NULL)
{
e_ij->probability = profile_probability::always ();
- e_ij->count = all - count;
e_ij = single_pred_edge (split_edge (e_ij));
}
}
if (e_ij != NULL)
{
join_bb = e_ij->dest;
- join_bb->count = all;
+ join_bb->count = cond_bb->count;
}
e_cd->flags = (e_cd->flags & ~EDGE_FALLTHRU) | EDGE_TRUE_VALUE;
e_cd->probability = prob;
- e_cd->count = count;
e_ci = make_edge (cond_bb, icall_bb, EDGE_FALSE_VALUE);
e_ci->probability = prob.invert ();
- e_ci->count = all - count;
remove_edge (e_di);
if (e_ij != NULL)
{
- if ((dflags & ECF_NORETURN) != 0)
- e_ij->count = all;
- else
+ if ((dflags & ECF_NORETURN) == 0)
{
e_dj = make_edge (dcall_bb, join_bb, EDGE_FALLTHRU);
e_dj->probability = profile_probability::always ();
- e_dj->count = count;
-
- e_ij->count = all - count;
}
e_ij->probability = profile_probability::always ();
}
@@ -1494,7 +1473,6 @@ gimple_ic (gcall *icall_stmt, struct cgraph_node *direct_call,
{
e = make_edge (dcall_bb, e_eh->dest, e_eh->flags);
e->probability = e_eh->probability;
- e->count = e_eh->count;
for (gphi_iterator psi = gsi_start_phis (e_eh->dest);
!gsi_end_p (psi); gsi_next (&psi))
{
@@ -1540,7 +1518,7 @@ gimple_ic_transform (gimple_stmt_iterator *gsi)
count = histogram->hvalue.counters [1];
all = histogram->hvalue.counters [2];
- bb_all = gimple_bb (stmt)->count.to_gcov_type ();
+ bb_all = gimple_bb (stmt)->count.ipa ().to_gcov_type ();
/* The order of CHECK_COUNTER calls is important -
since check_counter can correct the third parameter
and we want to make count <= all <= bb_all. */
@@ -1704,20 +1682,16 @@ gimple_stringop_fixed_value (gcall *vcall_stmt, tree icall_size, profile_probabi
e_ci->flags = (e_ci->flags & ~EDGE_FALLTHRU) | EDGE_TRUE_VALUE;
e_ci->probability = prob;
- e_ci->count = profile_count::from_gcov_type (count);
e_cv = make_edge (cond_bb, vcall_bb, EDGE_FALSE_VALUE);
e_cv->probability = prob.invert ();
- e_cv->count = profile_count::from_gcov_type (all - count);
remove_edge (e_iv);
e_ij = make_edge (icall_bb, join_bb, EDGE_FALLTHRU);
e_ij->probability = profile_probability::always ();
- e_ij->count = profile_count::from_gcov_type (count);
e_vj->probability = profile_probability::always ();
- e_vj->count = profile_count::from_gcov_type (all - count);
/* Insert PHI node for the call result if necessary. */
if (gimple_call_lhs (vcall_stmt)
diff --git a/gcc/value-prof.h b/gcc/value-prof.h
index f72bb2d2241..8190bfd074f 100644
--- a/gcc/value-prof.h
+++ b/gcc/value-prof.h
@@ -90,8 +90,7 @@ void gimple_move_stmt_histograms (struct function *, gimple *, gimple *);
void verify_histograms (void);
void free_histograms (function *);
void stringop_block_profile (gimple *, unsigned int *, HOST_WIDE_INT *);
-gcall *gimple_ic (gcall *, struct cgraph_node *, profile_probability,
- profile_count, profile_count);
+gcall *gimple_ic (gcall *, struct cgraph_node *, profile_probability);
bool check_ic_target (gcall *, struct cgraph_node *);
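
The value-prof.c and value-prof.h hunks above change gimple_ic to take only a branch probability: the explicit profile_count arguments go away, and the counts of the blocks created for the speculative call are derived from the conditional block's own count. A minimal sketch of that derivation, using only the operations visible in the hunks (apply_probability, profile_count subtraction and assignment); the helper name and the includes are illustrative, not new GCC API, and may need adjusting to build in-tree.

/* Sketch only: derive block counts from the branch probability the way
   gimple_ic now does, rather than threading gcov counts through.  */
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"   /* assumed to provide basic_block and the profile types  */

static void
set_speculative_counts (basic_block cond_bb, basic_block dcall_bb,
			basic_block icall_bb, basic_block join_bb,
			profile_probability prob)
{
  /* The direct-call block executes whenever the speculation succeeds.  */
  dcall_bb->count = cond_bb->count.apply_probability (prob);
  /* Whatever is left falls through to the original indirect call.  */
  icall_bb->count = cond_bb->count - dcall_bb->count;
  /* The join block sees all executions again.  */
  join_bb->count = cond_bb->count;
}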
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 4682aabc18f..92a0f77551f 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -5350,7 +5350,7 @@ track_loc_p (rtx loc, tree expr, poly_int64 offset, bool store_reg_p,
|| (store_reg_p
&& !COMPLEX_MODE_P (DECL_MODE (expr))
&& hard_regno_nregs (REGNO (loc), DECL_MODE (expr)) == 1))
- && known_zero (offset + byte_lowpart_offset (DECL_MODE (expr), mode)))
+ && must_eq (offset + byte_lowpart_offset (DECL_MODE (expr), mode), 0))
{
mode = DECL_MODE (expr);
offset = 0;
diff --git a/gcc/varasm.c b/gcc/varasm.c
index bf19ab7f413..046fd8bc2f0 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -2397,8 +2397,7 @@ incorporeal_function_p (tree decl)
const char *name;
if (DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL
- && (DECL_FUNCTION_CODE (decl) == BUILT_IN_ALLOCA
- || DECL_FUNCTION_CODE (decl) == BUILT_IN_ALLOCA_WITH_ALIGN))
+ && ALLOCA_FUNCTION_CODE_P (DECL_FUNCTION_CODE (decl)))
return true;
name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
@@ -2894,8 +2893,10 @@ decode_addr_const (tree exp, struct addr_const *value)
else if (TREE_CODE (target) == ARRAY_REF
|| TREE_CODE (target) == ARRAY_RANGE_REF)
{
- offset += (tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (target)))
- * tree_to_poly_int64 (TREE_OPERAND (target, 1)));
+ /* Truncate big offset. */
+ offset
+ += (TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (target)))
+ * wi::to_poly_widest (TREE_OPERAND (target, 1)).force_shwi ());
target = TREE_OPERAND (target, 0);
}
else if (TREE_CODE (target) == MEM_REF
@@ -3931,8 +3932,8 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
gcc_assert (GET_CODE (x) == CONST_VECTOR);
/* Pick the smallest integer mode that contains at least one
- whole element. Often this will be byte_mode and will contain
- more than one element. */
+ whole element. Often this is byte_mode and contains more
+ than one element. */
unsigned int nelts = CONST_VECTOR_NUNITS (x);
unsigned int elt_bits = GET_MODE_BITSIZE (mode) / nelts;
unsigned int int_bits = MAX (elt_bits, BITS_PER_UNIT);
@@ -3947,9 +3948,8 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
for (unsigned int j = 0; j < limit; ++j)
if (INTVAL (CONST_VECTOR_ELT (x, i + j)) != 0)
value |= 1 << (j * elt_bits);
- output_constant_pool_2 (byte_mode,
- gen_int_mode (value, int_mode),
- i ? MIN (align, int_bits) : align);
+ output_constant_pool_2 (int_mode, gen_int_mode (value, int_mode),
+ i != 0 ? MIN (align, int_bits) : align);
}
break;
}
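
The output_constant_pool_2 hunk above packs the vector's elements into chunks of the chosen integer mode, setting a bit group for each nonzero element, and now emits each chunk in int_mode rather than byte_mode. A stand-alone illustration of the same packing idea on a plain bool array; the container, the 64-bit chunk width, the loop structure outside what the hunk shows, and the function name are all illustrative, not the GCC routine itself.

/* Illustrative only: mirror the bit-packing loop from the hunk above.  */
#include <algorithm>
#include <cstdint>
#include <vector>

static std::vector<uint64_t>
pack_bool_elements (const bool *elts, unsigned int nelts, unsigned int elt_bits)
{
  const unsigned int int_bits = 64;   /* stand-in for the integer mode size  */
  std::vector<uint64_t> chunks;
  for (unsigned int i = 0; i < nelts; i += int_bits / elt_bits)
    {
      unsigned int limit = std::min (nelts - i, int_bits / elt_bits);
      uint64_t value = 0;
      for (unsigned int j = 0; j < limit; ++j)
	if (elts[i + j])
	  value |= uint64_t (1) << (j * elt_bits);
      chunks.push_back (value);
    }
  return chunks;
}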
diff --git a/gcc/wide-int.cc b/gcc/wide-int.cc
index a2c8fa72302..ba0fd25b093 100644
--- a/gcc/wide-int.cc
+++ b/gcc/wide-int.cc
@@ -2146,6 +2146,39 @@ template void generic_wide_int <wide_int_ref_storage <true> >::dump () const;
template void offset_int::dump () const;
template void widest_int::dump () const;
+/* We could add all the above ::dump variants here, but wide_int and
+ widest_int should handle the common cases. Besides, you can always
+ call the dump method directly. */
+
+DEBUG_FUNCTION void
+debug (const wide_int &ref)
+{
+ ref.dump ();
+}
+
+DEBUG_FUNCTION void
+debug (const wide_int *ptr)
+{
+ if (ptr)
+ debug (*ptr);
+ else
+ fprintf (stderr, "<nil>\n");
+}
+
+DEBUG_FUNCTION void
+debug (const widest_int &ref)
+{
+ ref.dump ();
+}
+
+DEBUG_FUNCTION void
+debug (const widest_int *ptr)
+{
+ if (ptr)
+ debug (*ptr);
+ else
+ fprintf (stderr, "<nil>\n");
+}
#if CHECKING_P
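
The new debug overloads added to wide-int.cc are DEBUG_FUNCTION helpers: the reference forms forward to the dump method, and the pointer forms print "<nil>" for a null pointer. They are typically invoked by hand from a debugger (for example, "call debug (x)" under gdb). A small sketch of direct calls, assuming the matching declarations are visible (e.g. via wide-int.h) and that wi::shwi is available to build a test value:

/* Sketch only: exercise the new debug () overloads.  */
static void
debug_wide_int_sketch ()
{
  wide_int w = wi::shwi (42, 64);    /* the value 42 at 64-bit precision  */
  debug (w);                         /* reference overload: calls w.dump ()  */
  debug (&w);                        /* pointer overload: forwards to the above  */

  const widest_int *missing = NULL;
  debug (missing);                   /* null pointer: prints "<nil>"  */
}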
diff --git a/gcc/xcoffout.c b/gcc/xcoffout.c
index 17b201aced6..cf2064d5ba5 100644
--- a/gcc/xcoffout.c
+++ b/gcc/xcoffout.c
@@ -20,7 +20,7 @@ along with GCC; see the file COPYING3. If not see
/* Output xcoff-format symbol table data. The main functionality is contained
in dbxout.c. This file implements the sdbout-like parts of the xcoff
interface. Many functions are very similar to their counterparts in
- sdbout.c. */
+ the former sdbout.c file. */
#include "config.h"
#include "system.h"
@@ -452,7 +452,7 @@ xcoffout_begin_prologue (unsigned int line,
ASM_OUTPUT_LFB (asm_out_file, line);
dbxout_parms (DECL_ARGUMENTS (current_function_decl));
- /* Emit the symbols for the outermost BLOCK's variables. sdbout.c does this
+ /* Emit the symbols for the outermost BLOCK's variables. sdbout.c did this
in sdbout_begin_block, but there is no guarantee that there will be any
inner block 1, so we must do it here. This gives a result similar to
dbxout, so it does make some sense. */