author:    bstarynk <bstarynk@138bc75d-0d04-0410-961f-82ee72b054a4>  2016-04-15 13:13:48 +0000
committer: bstarynk <bstarynk@138bc75d-0d04-0410-961f-82ee72b054a4>  2016-04-15 13:13:48 +0000
commit:    df168526dd4d08c5faa014d585874f978bf73d80 (patch)
tree:      baab9f6705e45f350fc6dbdd45e2924ae75d2d1b
parent:    ffbf47a37f7d2d4aa647f4bf0f231a8f2399049b (diff)
download:  gcc-df168526dd4d08c5faa014d585874f978bf73d80.tar.gz
2016-04-15 Basile Starynkevitch <basile@starynkevitch.net>
{{merging with even more of GCC 6, using subversion 1.9
svn merge -r230101:230160 ^/trunk
}}
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/melt-branch@235026 138bc75d-0d04-0410-961f-82ee72b054a4
86 files changed, 2946 insertions, 525 deletions
diff --git a/ChangeLog.MELT b/ChangeLog.MELT index e7ffae7f403..2705c783c3e 100644 --- a/ChangeLog.MELT +++ b/ChangeLog.MELT @@ -1,5 +1,10 @@ 2016-04-15 Basile Starynkevitch <basile@starynkevitch.net> + {{merging with even more of GCC 6, using subversion 1.9 + svn merge -r230101:230160 ^/trunk + }} + +2016-04-15 Basile Starynkevitch <basile@starynkevitch.net> {{trouble merging with GCC 6 svn rev 230222, should investigate}} 2016-04-15 Basile Starynkevitch <basile@starynkevitch.net> diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 2100306a011..c29980930a9 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,296 @@ +2015-11-11 Simon Dardis <simon.dardis@imgtec.com> + + * config/mips/mips.c (mips_breakable_sequence_p): New function. + (mips_break_sequence): New function. + (mips_reorg_process_insns): Use them. Use compact branches in selected + situations. + +2015-11-11 Alan Lawrence <alan.lawrence@arm.com> + + * fold-const.c (get_array_ctor_element_at_index): Fix whitespace, typo. + +2015-11-11 Jiong Wang <jiong.wang@arm.com> + Jim Wilson <wilson@gcc.gnu.org> + + PR target/67305 + * config/arm/arm.md (neon_vector_mem_operand): Return FALSE if strict + be true and eliminable registers mentioned. + +2015-11-11 Claudiu Zissulescu <claziss@synopsys.com> + + * common/config/arc/arc-common.c (arc_handle_option): Handle ARCv2 + options. + * config/arc/arc-opts.h: Add ARCv2 CPUs. + * config/arc/arc-protos.h (arc_secondary_reload_conv): Prototype. + * config/arc/arc.c (arc_secondary_reload): Handle subreg (reg) + situation, and store instructions with large offsets. + (arc_secondary_reload_conv): New function. + (arc_init): Add ARCv2 options. + (arc_conditional_register_usage): Select the proper register usage + for ARCv2 processors. + (arc_handle_interrupt_attribute): ILINK2 is only valid for ARCv1 + architecture. + (arc_compute_function_type): Likewise. + (arc_print_operand): Handle new ARCv2 punctuation characters. 
+ (arc_return_in_memory): ARCv2 ABI returns in registers up to 16 + bytes. + (workaround_arc_anomaly, arc_asm_insn_p, arc_loop_hazard): New + function. + (arc_reorg, arc_hazard): Use it. + * config/arc/arc.h (TARGET_CPU_CPP_BUILTINS): Define __HS__ and + __EM__. + (ASM_SPEC): Add ARCv2 options. + (TARGET_NORM): ARC HS has norm instructions by default. + (TARGET_OPTFPE): Use optimized floating point emulation for ARC + HS. + (TARGET_AT_DBR_CONDEXEC): Only for ARC600 family. + (TARGET_EM, TARGET_HS, TARGET_V2, TARGET_MPYW, TARGET_MULTI): + Define. + (SIGNED_INT16, TARGET_MPY, TARGET_ARC700_MPY, TARGET_ANY_MPY): + Likewise. + (TARGET_ARC600_FAMILY, TARGET_ARCOMPACT_FAMILY): Likewise. + (TARGET_LP_WR_INTERLOCK): Likewise. + * config/arc/arc.md + (commutative_binary_mult_comparison_result_used, movsicc_insn) + (mulsi3, mulsi3_600_lib, mulsidi3, mulsidi3_700, mulsi3_highpart) + (umulsi3_highpart_i, umulsi3_highpart_int, umulsi3_highpart) + (umulsidi3, umulsidi3_700, cstoresi4, simple_return, p_return_i): + Use it for ARCv2. + (mulhisi3, mulhisi3_imm, mulhisi3_reg, umulhisi3, umulhisi3_imm) + (umulhisi3_reg, umulhisi3_reg, mulsi3_v2, nopv, bswapsi2) + (prefetch, divsi3, udivsi3 modsi3, umodsi3, arcset, arcsetltu) + (arcsetgeu, arcsethi, arcsetls, reload_*_load, reload_*_store) + (extzvsi): New pattern. + * config/arc/arc.opt: New ARCv2 options. + * config/arc/arcEM.md: New file. + * config/arc/arcHS.md: Likewise. + * config/arc/constraints.md (C3p): New constraint, accepts 1 and 2 + values. + (Cm2): A signed 9-bit integer constant constraint. + (C62): An unsigned 6-bit integer constant constraint. + (C16): A signed 16-bit integer constant constraint. + * config/arc/predicates.md (mult_operator): Add ARCv2 processort. + (short_const_int_operand): New predicate. + * config/arc/t-arc-newlib: Add ARCv2 multilib options. + * doc/invoke.texi: Add documentation for -mcpu=<archs/arcem> + -mcode-density and -mdiv-rem. 
+ +2015-11-11 Julia Koval <julia.koval@intel.com> + + * config/i386/i386.c (m_SKYLAKE_AVX512): Fix typo. + +2015-11-11 Julia Koval <julia.koval@intel.com> + + * config/i386/i386.c: Handle "skylake" and + "skylake-avx512". + +2015-11-11 Martin Liska <mliska@suse.cz> + + * gimple-ssa-strength-reduction.c (create_phi_basis): + Use auto_vec. + * passes.c (release_dump_file_name): New function. + (pass_init_dump_file): Used from this function. + (pass_fini_dump_file): Likewise. + * tree-sra.c (convert_callers_for_node): Use xstrdup_for_dump. + * var-tracking.c (vt_initialize): Use pool_allocator. + +2015-11-11 Richard Biener <rguenth@gcc.gnu.org> + Jiong Wang <jiong.wang@arm.com> + + PR tree-optimization/68234 + * tree-vrp.c (vrp_visit_phi_node): Extend SCEV check to those loop PHI + node which estimiated to be VR_VARYING initially. + +2015-11-11 Robert Suchanek <robert.suchanek@imgtec.com> + + * regname.c (scan_rtx_reg): Check the matching number of consecutive + registers when tying chains. + (build_def_use): Move terminated_this_insn earlier in the function. + +2015-11-10 Mike Frysinger <vapier@gentoo.org> + + * configure.ac: Use = with test and not ==. + * configure: Regenerated. + +2015-11-11 David Edelsohn <dje.gcc@gmail.com> + + * config/rs6000/aix.h (TARGET_OS_AIX_CPP_BUILTINS): Add cpu and + machine asserts. Update defines for 64 bit. + +2015-11-11 Charles Baylis <charles.baylis@linaro.org> + + PR target/63870 + * config/arm/neon.md (neon_vld1_lane<mode>): Remove error for invalid + lane number. + (neon_vst1_lane<mode>): Likewise. + (neon_vld2_lane<mode>): Likewise. + (neon_vst2_lane<mode>): Likewise. + (neon_vld3_lane<mode>): Likewise. + (neon_vst3_lane<mode>): Likewise. + (neon_vld4_lane<mode>): Likewise. + (neon_vst4_lane<mode>): Likewise. + +2015-11-11 Charles Baylis <charles.baylis@linaro.org> + + PR target/63870 + * config/arm/arm-builtins.c: (arm_load1_qualifiers) Use + qualifier_struct_load_store_lane_index. + (arm_storestruct_lane_qualifiers) Likewise. 
+ * config/arm/neon.md: (neon_vld1_lane<mode>) Reverse lane numbers for + big-endian. + (neon_vst1_lane<mode>) Likewise. + (neon_vld2_lane<mode>) Likewise. + (neon_vst2_lane<mode>) Likewise. + (neon_vld3_lane<mode>) Likewise. + (neon_vst3_lane<mode>) Likewise. + (neon_vld4_lane<mode>) Likewise. + (neon_vst4_lane<mode>) Likewise. + +2015-11-11 Charles Baylis <charles.baylis@linaro.org> + + PR target/63870 + * config/arm/arm-builtins.c (enum arm_type_qualifiers): New enumerator + qualifier_struct_load_store_lane_index. + (builtin_arg): New enumerator NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX. + (arm_expand_neon_args): New parameter. Remove ellipsis. Handle NEON + argument qualifiers. + (arm_expand_neon_builtin): Handle new NEON argument qualifier. + * config/arm/arm.h (NEON_ENDIAN_LANE_N): New macro. + +2015-11-10 Nathan Sidwell <nathan@codesourcery.com> + + * config/nvptx/nvptx.opt (moptimize): New flag. + * config/nvptx/nvptx.c (nvptx_option_override): Set nvptx_optimize + default. + (nvptx_optimize_inner): New. + (nvptx_process_pars): Call it when optimizing. + * doc/invoke.texi (Nvidia PTX Options): Document -moptimize. + +2015-11-10 Bill Schmidt <wschmidt@linux.vnet.ibm.com> + + * config/rs6000/rs6000.c (rs6000_secondary_reload_direct_move): + Remove redundant code. + +2015-11-10 Jeff Law <law@redhat.com> + + * config/ft32/ft32.c (ft32_print_operand): Supply mode to + call to output_address. + * config/moxie/moxie.c (moxie_print_operand_address): Similarly. + Add unnamed machine_mode argument. + +2015-11-10 Michael Meissner <meissner@linux.vnet.ibm.com> + + * config.gcc (powerpc*-*-*, rs6000*-*-*): Add power9 to hosts that + default to 64-bit. + +2015-11-10 Uros Bizjak <ubizjak@gmail.com> + + * config/i386/i386.md (*movabs<mode>_1): Add explicit + size directives for -masm=intel. + (*movabs<mode>_2): Ditto. + +2015-11-10 Uros Bizjak <ubizjak@gmail.com> + + * config/i386/i386.c (ix86_print_operand): Remove dead code that + tried to avoid (%rip) for call operands. 
+ +2015-11-10 Uros Bizjak <ubizjak@gmail.com> + + * config/i386/i386.c (ix86_print_operand_address_as): Add no_rip + argument. Do not use RIP relative addressing when no_rip is set. + (ix86_print_operand): Update call to ix86_print_operand_address_as. + (ix86_print_operand_address): Ditto. + * config/i386/i386.md (*movabs<mode>_1): Use %P modifier for + absolute movabs operand 0. Add square braces for -masm=intel. + (*movabs<mode>_2): Ditto for operand 1. + +2015-11-10 Kyrylo Tkachov <kyrylo.tkachov@arm.com> + + * config/arm/arm.c (arm_new_rtx_costs, FIX case): Handle + combine_vcvtf2i pattern. + +2015-11-10 Kyrylo Tkachov <kyrylo.tkachov@arm.com> + + * config/arm/arm.c (neon_valid_immediate): Remove integer + CONST_DOUBLE handling. It should never occur. + +2015-11-10 Matthew Wahab <matthew.wahab@arm.com> + + * config/aarch64/atomics.md (unspecv): Move to iterators.md. + (ATOMIC_LDOP): Likewise. + (atomic_ldop): Likewise. + * config/aarch64/iterators.md (unspecv): Moved from atomics.md. + (ATOMIC_LDOP): Likewise. + (atomic_ldop): Likewise. + +2015-11-10 Martin Liska <mliska@suse.cz> + + * alloc-pool.h (allocate_raw): New function. + (operator new (size_t, object_allocator<T> &a)): Use the + function instead of object_allocator::allocate). + +2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> + + * config/i386/sse.md (HALFMASKMODE): New attribute. + (DOUBLEMASKMODE): New attribute. + (vec_pack_trunc_qi): New. + (vec_pack_trunc_<mode>): New. + (vec_unpacks_lo_hi): New. + (vec_unpacks_lo_si): New. + (vec_unpacks_lo_di): New. + (vec_unpacks_hi_hi): New. + (vec_unpacks_hi_<mode>): New. + +2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> + + * optabs.c (expand_binop_directly): Allow scalar mode for + vec_pack_trunc_optab. + * tree-vect-loop.c (vect_determine_vectorization_factor): Skip + boolean vector producers from pattern sequence when computing VF. + * tree-vect-patterns.c (vect_vect_recog_func_ptrs) Add + vect_recog_mask_conversion_pattern. 
+ (search_type_for_mask): Choose the smallest + type if different size types are mixed. + (build_mask_conversion): New. + (vect_recog_mask_conversion_pattern): New. + (vect_pattern_recog_1): Allow scalar mode for boolean vectype. + * tree-vect-stmts.c (vectorizable_mask_load_store): Support masked + load with pattern. + (vectorizable_conversion): Support boolean vectors. + (free_stmt_vec_info): Allow patterns for statements with no lhs. + * tree-vectorizer.h (NUM_PATTERNS): Increase to 14. + +2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> + + * config/i386/i386-protos.h (ix86_expand_sse_movcc): New. + * config/i386/i386.c (ix86_expand_sse_movcc): Make public. + Cast mask to FP mode if required. + * config/i386/sse.md (vcond_mask_<mode><avx512fmaskmodelower>): New. + (vcond_mask_<mode><avx512fmaskmodelower>): New. + (vcond_mask_<mode><sseintvecmodelower>): New. + (vcond_mask_<mode><sseintvecmodelower>): New. + (vcond_mask_v2div2di): New. + (vcond_mask_<mode><sseintvecmodelower>): New. + (vcond_mask_<mode><sseintvecmodelower>): New. + +2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> + + * optabs-query.h (get_vcond_mask_icode): New. + * optabs-tree.c (expand_vec_cond_expr_p): Use + get_vcond_mask_icode for VEC_COND_EXPR with mask. + * optabs.c (expand_vec_cond_mask_expr): New. + (expand_vec_cond_expr): Use get_vcond_mask_icode + when possible. + * optabs.def (vcond_mask_optab): New. + * tree-vect-patterns.c (vect_recog_bool_pattern): Don't + generate redundant comparison for COND_EXPR. + * tree-vect-stmts.c (vect_is_simple_cond): Allow SSA_NAME + as a condition. + (vectorizable_condition): Likewise. + * tree-vect-slp.c (vect_get_and_check_slp_defs): Allow + cond_exp with no embedded comparison. + (vect_build_slp_tree_1): Likewise. + 2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> * config/i386/sse.md (maskload<mode>): Rename to ... @@ -302,6 +595,7 @@ Fix comment typo. 
2015-11-09 Michael Meissner <meissner@linux.vnet.ibm.com> + Peter Bergner <bergner@vnet.ibm.com> * config/rs6000/rs6000.opt (-mpower9-fusion): Add new switches for ISA 3.0 (power9). diff --git a/gcc/DATESTAMP b/gcc/DATESTAMP index 7ed3ab068e1..ef86fadfebb 100644 --- a/gcc/DATESTAMP +++ b/gcc/DATESTAMP @@ -1 +1 @@ -20151110 +20151111 diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h index bf9b0ebd6ee..38aff284997 100644 --- a/gcc/alloc-pool.h +++ b/gcc/alloc-pool.h @@ -477,12 +477,25 @@ public: m_allocator.release_if_empty (); } + + /* Allocate memory for instance of type T and call a default constructor. */ + inline T * allocate () ATTRIBUTE_MALLOC { return ::new (m_allocator.allocate ()) T; } + /* Allocate memory for instance of type T and return void * that + could be used in situations where a default constructor is not provided + by the class T. */ + + inline void * + allocate_raw () ATTRIBUTE_MALLOC + { + return m_allocator.allocate (); + } + inline void remove (T *object) { @@ -528,7 +541,7 @@ template <typename T> inline void * operator new (size_t, object_allocator<T> &a) { - return a.allocate (); + return a.allocate_raw (); } /* Hashtable mapping alloc_pool names to descriptors. */ diff --git a/gcc/common/config/arc/arc-common.c b/gcc/common/config/arc/arc-common.c index 489bdb22533..c06f488d285 100644 --- a/gcc/common/config/arc/arc-common.c +++ b/gcc/common/config/arc/arc-common.c @@ -33,7 +33,7 @@ arc_option_init_struct (struct gcc_options *opts) { opts->x_flag_no_common = 255; /* Mark as not user-initialized. */ - /* Which cpu we're compiling for (ARC600, ARC601, ARC700). */ + /* Which cpu we're compiling for (ARC600, ARC601, ARC700, ARCv2). 
*/ arc_cpu = PROCESSOR_NONE; } @@ -68,6 +68,7 @@ arc_handle_option (struct gcc_options *opts, struct gcc_options *opts_set, { size_t code = decoded->opt_index; int value = decoded->value; + const char *arg = decoded->arg; switch (code) { @@ -91,9 +92,40 @@ arc_handle_option (struct gcc_options *opts, struct gcc_options *opts_set, if (! (opts_set->x_target_flags & MASK_BARREL_SHIFTER) ) opts->x_target_flags &= ~MASK_BARREL_SHIFTER; break; + case PROCESSOR_ARCHS: + if ( !(opts_set->x_target_flags & MASK_BARREL_SHIFTER)) + opts->x_target_flags |= MASK_BARREL_SHIFTER; /* Default: on. */ + if ( !(opts_set->x_target_flags & MASK_CODE_DENSITY)) + opts->x_target_flags |= MASK_CODE_DENSITY; /* Default: on. */ + if ( !(opts_set->x_target_flags & MASK_NORM_SET)) + opts->x_target_flags |= MASK_NORM_SET; /* Default: on. */ + if ( !(opts_set->x_target_flags & MASK_SWAP_SET)) + opts->x_target_flags |= MASK_SWAP_SET; /* Default: on. */ + if ( !(opts_set->x_target_flags & MASK_DIVREM)) + opts->x_target_flags |= MASK_DIVREM; /* Default: on. */ + break; + + case PROCESSOR_ARCEM: + if ( !(opts_set->x_target_flags & MASK_BARREL_SHIFTER)) + opts->x_target_flags |= MASK_BARREL_SHIFTER; /* Default: on. */ + if ( !(opts_set->x_target_flags & MASK_CODE_DENSITY)) + opts->x_target_flags &= ~MASK_CODE_DENSITY; /* Default: off. */ + if ( !(opts_set->x_target_flags & MASK_NORM_SET)) + opts->x_target_flags &= ~MASK_NORM_SET; /* Default: off. */ + if ( !(opts_set->x_target_flags & MASK_SWAP_SET)) + opts->x_target_flags &= ~MASK_SWAP_SET; /* Default: off. */ + if ( !(opts_set->x_target_flags & MASK_DIVREM)) + opts->x_target_flags &= ~MASK_DIVREM; /* Default: off. 
*/ + break; default: gcc_unreachable (); } + break; + + case OPT_mmpy_option_: + if (value < 0 || value > 9) + error_at (loc, "bad value %qs for -mmpy-option switch", arg); + break; } return true; diff --git a/gcc/config.gcc b/gcc/config.gcc index 9cc765e2bc1..59aee2cfdcd 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -439,7 +439,7 @@ powerpc*-*-*) cpu_type=rs6000 extra_headers="ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h" case x$with_cpu in - xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[345678]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500) + xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500) cpu_is_64bit=yes ;; esac @@ -4131,7 +4131,7 @@ case "${target}" in eval "with_$which=405" ;; "" | common | native \ - | power | power[2345678] | power6x | powerpc | powerpc64 \ + | power | power[23456789] | power6x | powerpc | powerpc64 \ | rios | rios1 | rios2 | rsc | rsc1 | rs64a \ | 401 | 403 | 405 | 405fp | 440 | 440fp | 464 | 464fp \ | 476 | 476fp | 505 | 601 | 602 | 603 | 603e | ec603e \ diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md index e7ac5f6fc1c..3c034fb4376 100644 --- a/gcc/config/aarch64/atomics.md +++ b/gcc/config/aarch64/atomics.md @@ -18,34 +18,6 @@ ;; along with GCC; see the file COPYING3. If not see ;; <http://www.gnu.org/licenses/>. -(define_c_enum "unspecv" - [ - UNSPECV_LX ; Represent a load-exclusive. - UNSPECV_SX ; Represent a store-exclusive. - UNSPECV_LDA ; Represent an atomic load or load-acquire. - UNSPECV_STL ; Represent an atomic store or store-release. - UNSPECV_ATOMIC_CMPSW ; Represent an atomic compare swap. - UNSPECV_ATOMIC_EXCHG ; Represent an atomic exchange. - UNSPECV_ATOMIC_CAS ; Represent an atomic CAS. - UNSPECV_ATOMIC_SWP ; Represent an atomic SWP. - UNSPECV_ATOMIC_OP ; Represent an atomic operation. 
- UNSPECV_ATOMIC_LDOP ; Represent an atomic load-operation - UNSPECV_ATOMIC_LDOP_OR ; Represent an atomic load-or - UNSPECV_ATOMIC_LDOP_BIC ; Represent an atomic load-bic - UNSPECV_ATOMIC_LDOP_XOR ; Represent an atomic load-xor - UNSPECV_ATOMIC_LDOP_PLUS ; Represent an atomic load-add -]) - -;; Iterators for load-operate instructions. - -(define_int_iterator ATOMIC_LDOP - [UNSPECV_ATOMIC_LDOP_OR UNSPECV_ATOMIC_LDOP_BIC - UNSPECV_ATOMIC_LDOP_XOR UNSPECV_ATOMIC_LDOP_PLUS]) - -(define_int_attr atomic_ldop - [(UNSPECV_ATOMIC_LDOP_OR "set") (UNSPECV_ATOMIC_LDOP_BIC "clr") - (UNSPECV_ATOMIC_LDOP_XOR "eor") (UNSPECV_ATOMIC_LDOP_PLUS "add")]) - ;; Instruction patterns. (define_expand "atomic_compare_and_swap<mode>" diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index c4a1c9888ea..c2eb7dec99d 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -306,6 +306,29 @@ UNSPEC_VEC_SHR ; Used in aarch64-simd.md. ]) +;; ------------------------------------------------------------------ +;; Unspec enumerations for Atomics. They are here so that they can be +;; used in the int_iterators for atomic operations. +;; ------------------------------------------------------------------ + +(define_c_enum "unspecv" + [ + UNSPECV_LX ; Represent a load-exclusive. + UNSPECV_SX ; Represent a store-exclusive. + UNSPECV_LDA ; Represent an atomic load or load-acquire. + UNSPECV_STL ; Represent an atomic store or store-release. + UNSPECV_ATOMIC_CMPSW ; Represent an atomic compare swap. + UNSPECV_ATOMIC_EXCHG ; Represent an atomic exchange. + UNSPECV_ATOMIC_CAS ; Represent an atomic CAS. + UNSPECV_ATOMIC_SWP ; Represent an atomic SWP. + UNSPECV_ATOMIC_OP ; Represent an atomic operation. 
+ UNSPECV_ATOMIC_LDOP ; Represent an atomic load-operation + UNSPECV_ATOMIC_LDOP_OR ; Represent an atomic load-or + UNSPECV_ATOMIC_LDOP_BIC ; Represent an atomic load-bic + UNSPECV_ATOMIC_LDOP_XOR ; Represent an atomic load-xor + UNSPECV_ATOMIC_LDOP_PLUS ; Represent an atomic load-add +]) + ;; ------------------------------------------------------------------- ;; Mode attributes ;; ------------------------------------------------------------------- @@ -965,6 +988,16 @@ (define_int_iterator CRYPTO_SHA256 [UNSPEC_SHA256H UNSPEC_SHA256H2]) +;; Iterators for atomic operations. + +(define_int_iterator ATOMIC_LDOP + [UNSPECV_ATOMIC_LDOP_OR UNSPECV_ATOMIC_LDOP_BIC + UNSPECV_ATOMIC_LDOP_XOR UNSPECV_ATOMIC_LDOP_PLUS]) + +(define_int_attr atomic_ldop + [(UNSPECV_ATOMIC_LDOP_OR "set") (UNSPECV_ATOMIC_LDOP_BIC "clr") + (UNSPECV_ATOMIC_LDOP_XOR "eor") (UNSPECV_ATOMIC_LDOP_PLUS "add")]) + ;; ------------------------------------------------------------------- ;; Int Iterators Attributes. ;; ------------------------------------------------------------------- diff --git a/gcc/config/arc/arc-opts.h b/gcc/config/arc/arc-opts.h index cca1f035636..a33f4b77521 100644 --- a/gcc/config/arc/arc-opts.h +++ b/gcc/config/arc/arc-opts.h @@ -23,5 +23,7 @@ enum processor_type PROCESSOR_NONE, PROCESSOR_ARC600, PROCESSOR_ARC601, - PROCESSOR_ARC700 + PROCESSOR_ARC700, + PROCESSOR_ARCEM, + PROCESSOR_ARCHS }; diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h index ff82ecf63dd..6e04351159b 100644 --- a/gcc/config/arc/arc-protos.h +++ b/gcc/config/arc/arc-protos.h @@ -118,3 +118,4 @@ extern bool arc_epilogue_uses (int regno); extern int regno_clobbered_p (unsigned int, rtx_insn *, machine_mode, int); extern int arc_return_slot_offset (void); extern bool arc_legitimize_reload_address (rtx *, machine_mode, int, int); +extern void arc_secondary_reload_conv (rtx, rtx, rtx, bool); diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c index 01261bc702a..85d53e4d2e3 100644 --- 
a/gcc/config/arc/arc.c +++ b/gcc/config/arc/arc.c @@ -590,10 +590,26 @@ arc_sched_adjust_priority (rtx_insn *insn, int priority) return priority; } +/* For ARC base register + offset addressing, the validity of the + address is mode-dependent for most of the offset range, as the + offset can be scaled by the access size. + We don't expose these as mode-dependent addresses in the + mode_dependent_address_p target hook, because that would disable + lots of optimizations, and most uses of these addresses are for 32 + or 64 bit accesses anyways, which are fine. + However, that leaves some addresses for 8 / 16 bit values not + properly reloaded by the generic code, which is why we have to + schedule secondary reloads for these. */ + static reg_class_t -arc_secondary_reload (bool in_p, rtx x, reg_class_t cl, machine_mode, - secondary_reload_info *) +arc_secondary_reload (bool in_p, + rtx x, + reg_class_t cl, + machine_mode mode, + secondary_reload_info *sri) { + enum rtx_code code = GET_CODE (x); + if (cl == DOUBLE_REGS) return GENERAL_REGS; @@ -601,9 +617,86 @@ arc_secondary_reload (bool in_p, rtx x, reg_class_t cl, machine_mode, if ((cl == LPCOUNT_REG || cl == WRITABLE_CORE_REGS) && in_p && MEM_P (x)) return GENERAL_REGS; + + /* If we have a subreg (reg), where reg is a pseudo (that will end in + a memory location), then we may need a scratch register to handle + the fp/sp+largeoffset address. */ + if (code == SUBREG) + { + rtx addr = NULL_RTX; + x = SUBREG_REG (x); + + if (REG_P (x)) + { + int regno = REGNO (x); + if (regno >= FIRST_PSEUDO_REGISTER) + regno = reg_renumber[regno]; + + if (regno != -1) + return NO_REGS; + + /* It is a pseudo that ends in a stack location. */ + if (reg_equiv_mem (REGNO (x))) + { + /* Get the equivalent address and check the range of the + offset. 
*/ + rtx mem = reg_equiv_mem (REGNO (x)); + addr = find_replacement (&XEXP (mem, 0)); + } + } + else + { + gcc_assert (MEM_P (x)); + addr = XEXP (x, 0); + addr = simplify_rtx (addr); + } + if (addr && GET_CODE (addr) == PLUS + && CONST_INT_P (XEXP (addr, 1)) + && (!RTX_OK_FOR_OFFSET_P (mode, XEXP (addr, 1)))) + { + switch (mode) + { + case QImode: + sri->icode = + in_p ? CODE_FOR_reload_qi_load : CODE_FOR_reload_qi_store; + break; + case HImode: + sri->icode = + in_p ? CODE_FOR_reload_hi_load : CODE_FOR_reload_hi_store; + break; + default: + break; + } + } + } return NO_REGS; } +/* Convert reloads using offsets that are too large to use indirect + addressing. */ + +void +arc_secondary_reload_conv (rtx reg, rtx mem, rtx scratch, bool store_p) +{ + rtx addr; + + gcc_assert (GET_CODE (mem) == MEM); + addr = XEXP (mem, 0); + + /* Large offset: use a move. FIXME: ld ops accepts limms as + offsets. Hence, the following move insn is not required. */ + emit_move_insn (scratch, addr); + mem = replace_equiv_address_nv (mem, scratch); + + /* Now create the move. */ + if (store_p) + emit_insn (gen_rtx_SET (mem, reg)); + else + emit_insn (gen_rtx_SET (reg, mem)); + + return; +} + static unsigned arc_ifcvt (void); namespace { @@ -687,23 +780,35 @@ arc_init (void) { enum attr_tune tune_dflt = TUNE_NONE; - if (TARGET_ARC600) + switch (arc_cpu) { + case PROCESSOR_ARC600: arc_cpu_string = "ARC600"; tune_dflt = TUNE_ARC600; - } - else if (TARGET_ARC601) - { + break; + + case PROCESSOR_ARC601: arc_cpu_string = "ARC601"; tune_dflt = TUNE_ARC600; - } - else if (TARGET_ARC700) - { + break; + + case PROCESSOR_ARC700: arc_cpu_string = "ARC700"; tune_dflt = TUNE_ARC700_4_2_STD; + break; + + case PROCESSOR_ARCEM: + arc_cpu_string = "EM"; + break; + + case PROCESSOR_ARCHS: + arc_cpu_string = "HS"; + break; + + default: + gcc_unreachable (); } - else - gcc_unreachable (); + if (arc_tune == TUNE_NONE) arc_tune = tune_dflt; /* Note: arc_multcost is only used in rtx_cost if speed is true. 
*/ @@ -737,15 +842,15 @@ arc_init (void) } /* Support mul64 generation only for ARC600. */ - if (TARGET_MUL64_SET && TARGET_ARC700) - error ("-mmul64 not supported for ARC700"); + if (TARGET_MUL64_SET && (!TARGET_ARC600_FAMILY)) + error ("-mmul64 not supported for ARC700 or ARCv2"); - /* MPY instructions valid only for ARC700. */ - if (TARGET_NOMPY_SET && !TARGET_ARC700) - error ("-mno-mpy supported only for ARC700"); + /* MPY instructions valid only for ARC700 or ARCv2. */ + if (TARGET_NOMPY_SET && TARGET_ARC600_FAMILY) + error ("-mno-mpy supported only for ARC700 or ARCv2"); /* mul/mac instructions only for ARC600. */ - if (TARGET_MULMAC_32BY16_SET && !(TARGET_ARC600 || TARGET_ARC601)) + if (TARGET_MULMAC_32BY16_SET && (!TARGET_ARC600_FAMILY)) error ("-mmul32x16 supported only for ARC600 or ARC601"); if (!TARGET_DPFP && TARGET_DPFP_DISABLE_LRSR) @@ -757,18 +862,25 @@ arc_init (void) error ("FPX fast and compact options cannot be specified together"); /* FPX-2. No fast-spfp for arc600 or arc601. */ - if (TARGET_SPFP_FAST_SET && (TARGET_ARC600 || TARGET_ARC601)) + if (TARGET_SPFP_FAST_SET && TARGET_ARC600_FAMILY) error ("-mspfp_fast not available on ARC600 or ARC601"); /* FPX-3. No FPX extensions on pre-ARC600 cores. */ if ((TARGET_DPFP || TARGET_SPFP) - && !(TARGET_ARC600 || TARGET_ARC601 || TARGET_ARC700)) + && !TARGET_ARCOMPACT_FAMILY) error ("FPX extensions not available on pre-ARC600 cores"); + /* Only selected multiplier configurations are available for HS. */ + if (TARGET_HS && ((arc_mpy_option > 2 && arc_mpy_option < 7) + || (arc_mpy_option == 1))) + error ("This multiplier configuration is not available for HS cores"); + /* Warn for unimplemented PIC in pre-ARC700 cores, and disable flag_pic. */ - if (flag_pic && !TARGET_ARC700) + if (flag_pic && TARGET_ARC600_FAMILY) { - warning (DK_WARNING, "PIC is not supported for %s. Generating non-PIC code only..", arc_cpu_string); + warning (DK_WARNING, + "PIC is not supported for %s. 
Generating non-PIC code only..", + arc_cpu_string); flag_pic = 0; } @@ -782,6 +894,8 @@ arc_init (void) arc_punct_chars['!'] = 1; arc_punct_chars['^'] = 1; arc_punct_chars['&'] = 1; + arc_punct_chars['+'] = 1; + arc_punct_chars['_'] = 1; if (optimize > 1 && !TARGET_NO_COND_EXEC) { @@ -825,7 +939,7 @@ arc_override_options (void) if (flag_no_common == 255) flag_no_common = !TARGET_NO_SDATA_SET; - /* TARGET_COMPACT_CASESI needs the "q" register class. */ \ + /* TARGET_COMPACT_CASESI needs the "q" register class. */ if (TARGET_MIXED_CODE) TARGET_Q_CLASS = 1; if (!TARGET_Q_CLASS) @@ -1198,6 +1312,8 @@ arc_init_reg_tables (void) char rname57[5] = "r57"; char rname58[5] = "r58"; char rname59[5] = "r59"; + char rname29[7] = "ilink1"; + char rname30[7] = "ilink2"; static void arc_conditional_register_usage (void) @@ -1206,6 +1322,14 @@ arc_conditional_register_usage (void) int i; int fix_start = 60, fix_end = 55; + if (TARGET_V2) + { + /* For ARCv2 the core register set is changed. */ + strcpy (rname29, "ilink"); + strcpy (rname30, "r30"); + fixed_regs[30] = call_used_regs[30] = 1; + } + if (TARGET_MUL64_SET) { fix_start = 57; @@ -1271,7 +1395,7 @@ arc_conditional_register_usage (void) machine_dependent_reorg. */ if (TARGET_ARC600) CLEAR_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], LP_COUNT); - else if (!TARGET_ARC700) + else if (!TARGET_LP_WR_INTERLOCK) fixed_regs[LP_COUNT] = 1; for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) if (!call_used_regs[regno]) @@ -1279,7 +1403,7 @@ arc_conditional_register_usage (void) for (regno = 32; regno < 60; regno++) if (!fixed_regs[regno]) SET_HARD_REG_BIT (reg_class_contents[WRITABLE_CORE_REGS], regno); - if (TARGET_ARC700) + if (!TARGET_ARC600_FAMILY) { for (regno = 32; regno <= 60; regno++) CLEAR_HARD_REG_BIT (reg_class_contents[CHEAP_CORE_REGS], regno); @@ -1313,7 +1437,7 @@ arc_conditional_register_usage (void) = (fixed_regs[i] ? (TEST_HARD_REG_BIT (reg_class_contents[CHEAP_CORE_REGS], i) ? 
CHEAP_CORE_REGS : ALL_CORE_REGS) - : ((TARGET_ARC700 + : (((!TARGET_ARC600_FAMILY) && TEST_HARD_REG_BIT (reg_class_contents[CHEAP_CORE_REGS], i)) ? CHEAP_CORE_REGS : WRITABLE_CORE_REGS)); else @@ -1331,7 +1455,8 @@ arc_conditional_register_usage (void) /* Handle Special Registers. */ arc_regno_reg_class[29] = LINK_REGS; /* ilink1 register. */ - arc_regno_reg_class[30] = LINK_REGS; /* ilink2 register. */ + if (!TARGET_V2) + arc_regno_reg_class[30] = LINK_REGS; /* ilink2 register. */ arc_regno_reg_class[31] = LINK_REGS; /* blink register. */ arc_regno_reg_class[60] = LPCOUNT_REG; arc_regno_reg_class[61] = NO_REGS; /* CC_REG: must be NO_REGS. */ @@ -1413,13 +1538,23 @@ arc_handle_interrupt_attribute (tree *, tree name, tree args, int, *no_add_attrs = true; } else if (strcmp (TREE_STRING_POINTER (value), "ilink1") - && strcmp (TREE_STRING_POINTER (value), "ilink2")) + && strcmp (TREE_STRING_POINTER (value), "ilink2") + && !TARGET_V2) { warning (OPT_Wattributes, "argument of %qE attribute is not \"ilink1\" or \"ilink2\"", name); *no_add_attrs = true; } + else if (TARGET_V2 + && strcmp (TREE_STRING_POINTER (value), "ilink")) + { + warning (OPT_Wattributes, + "argument of %qE attribute is not \"ilink\"", + name); + *no_add_attrs = true; + } + return NULL_TREE; } @@ -1931,7 +2066,8 @@ arc_compute_function_type (struct function *fun) { tree value = TREE_VALUE (args); - if (!strcmp (TREE_STRING_POINTER (value), "ilink1")) + if (!strcmp (TREE_STRING_POINTER (value), "ilink1") + || !strcmp (TREE_STRING_POINTER (value), "ilink")) fn_type = ARC_FUNCTION_ILINK1; else if (!strcmp (TREE_STRING_POINTER (value), "ilink2")) fn_type = ARC_FUNCTION_ILINK2; @@ -3115,6 +3251,18 @@ arc_print_operand (FILE *file, rtx x, int code) if (TARGET_ANNOTATE_ALIGN && cfun->machine->size_reason) fprintf (file, "; unalign: %d", cfun->machine->unalign); return; + case '+': + if (TARGET_V2) + fputs ("m", file); + else + fputs ("h", file); + return; + case '_': + if (TARGET_V2) + fputs ("h", file); + 
else + fputs ("w", file); + return; default : /* Unknown flag. */ output_operand_lossage ("invalid operand output code"); @@ -4224,7 +4372,7 @@ arc_rtx_costs (rtx x, machine_mode mode, int outer_code, *total= arc_multcost; /* We do not want synth_mult sequences when optimizing for size. */ - else if (TARGET_MUL64_SET || (TARGET_ARC700 && !TARGET_NOMPY_SET)) + else if (TARGET_MUL64_SET || TARGET_ARC700_MPY) *total = COSTS_N_INSNS (1); else *total = COSTS_N_INSNS (2); @@ -5639,7 +5787,7 @@ arc_return_in_memory (const_tree type, const_tree fntype ATTRIBUTE_UNUSED) else { HOST_WIDE_INT size = int_size_in_bytes (type); - return (size == -1 || size > 8); + return (size == -1 || size > (TARGET_V2 ? 16 : 8)); } } @@ -5737,6 +5885,26 @@ arc_invalid_within_doloop (const rtx_insn *insn) return NULL; } +/* The same functionality as arc_hazard. It is called in machine + reorg before any other optimization. Hence, the NOP size is taken + into account when doing branch shortening. */ + +static void +workaround_arc_anomaly (void) +{ + rtx_insn *insn, *succ0; + + /* For any architecture: call arc_hazard here. */ + for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) + { + succ0 = next_real_insn (insn); + if (arc_hazard (insn, succ0)) + { + emit_insn_before (gen_nopv (), succ0); + } + } +} + static int arc_reorg_in_progress = 0; /* ARC's machince specific reorg function. */ @@ -5750,6 +5918,8 @@ arc_reorg (void) long offset; int changed; + workaround_arc_anomaly (); + cfun->machine->arc_reorg_started = 1; arc_reorg_in_progress = 1; @@ -7758,6 +7928,109 @@ arc600_corereg_hazard (rtx_insn *pred, rtx_insn *succ) return 0; } +/* Given a rtx, check if it is an assembly instruction or not. 
*/ + +static int +arc_asm_insn_p (rtx x) +{ + int i, j; + + if (x == 0) + return 0; + + switch (GET_CODE (x)) + { + case ASM_OPERANDS: + case ASM_INPUT: + return 1; + + case SET: + return arc_asm_insn_p (SET_SRC (x)); + + case PARALLEL: + j = 0; + for (i = XVECLEN (x, 0) - 1; i >= 0; i--) + j += arc_asm_insn_p (XVECEXP (x, 0, i)); + if (j > 0) + return 1; + break; + + default: + break; + } + + return 0; +} + +/* We might have a CALL to a non-returning function before a loop end. + ??? Although the manual says that's OK (the target is outside the + loop, and the loop counter unused there), the assembler barfs on + this for ARC600, so we must insert a nop before such a call too. + For ARC700 and ARCv2, the last ZOL instruction must not be a jump + to a location where lp_count is modified. */ + +static bool +arc_loop_hazard (rtx_insn *pred, rtx_insn *succ) +{ + rtx_insn *jump = NULL; + rtx_insn *label = NULL; + basic_block succ_bb; + + if (recog_memoized (succ) != CODE_FOR_doloop_end_i) + return false; + + /* Phase 1: ARC600 and ARCv2HS don't allow any control instruction + (i.e., jump/call) as the last instruction of a ZOL. */ + if (TARGET_ARC600 || TARGET_HS) + if (JUMP_P (pred) || CALL_P (pred) + || arc_asm_insn_p (PATTERN (pred)) + || GET_CODE (PATTERN (pred)) == SEQUENCE) + return true; + + /* Phase 2: On any architecture, the last ZOL instruction must not + be a jump to a location where lp_count is modified. */ + + /* Phase 2a: Dig for the jump instruction. */ + if (JUMP_P (pred)) + jump = pred; + else if (GET_CODE (PATTERN (pred)) == SEQUENCE + && JUMP_P (XVECEXP (PATTERN (pred), 0, 0))) + jump = as_a <rtx_insn *> XVECEXP (PATTERN (pred), 0, 0); + else + return false; + + label = JUMP_LABEL_AS_INSN (jump); + if (!label) + return false; + + /* Phase 2b: Make sure it is not a millicode jump. 
*/ + if ((GET_CODE (PATTERN (jump)) == PARALLEL) + && (XVECEXP (PATTERN (jump), 0, 0) == ret_rtx)) + return false; + + /* Phase 2c: Make sure it is not a simple_return. */ + if ((GET_CODE (PATTERN (jump)) == SIMPLE_RETURN) + || (GET_CODE (label) == SIMPLE_RETURN)) + return false; + + /* Phase 2d: Go to the target of the jump and check for liveness of + the LP_COUNT register. */ + succ_bb = BLOCK_FOR_INSN (label); + if (!succ_bb) + { + gcc_assert (NEXT_INSN (label)); + if (NOTE_INSN_BASIC_BLOCK_P (NEXT_INSN (label))) + succ_bb = NOTE_BASIC_BLOCK (NEXT_INSN (label)); + else + succ_bb = BLOCK_FOR_INSN (NEXT_INSN (label)); + } + + if (succ_bb && REGNO_REG_SET_P (df_get_live_out (succ_bb), LP_COUNT)) + return true; + + return false; +} + /* For ARC600: A write to a core reg greater or equal to 32 must not be immediately followed by a use. Anticipate the length requirement to insert a nop @@ -7766,19 +8039,16 @@ arc600_corereg_hazard (rtx_insn *pred, rtx_insn *succ) int arc_hazard (rtx_insn *pred, rtx_insn *succ) { - if (!TARGET_ARC600) - return 0; if (!pred || !INSN_P (pred) || !succ || !INSN_P (succ)) return 0; - /* We might have a CALL to a non-returning function before a loop end. - ??? Although the manual says that's OK (the target is outside the loop, - and the loop counter unused there), the assembler barfs on this, so we - must instert a nop before such a call too. */ - if (recog_memoized (succ) == CODE_FOR_doloop_end_i - && (JUMP_P (pred) || CALL_P (pred) - || GET_CODE (PATTERN (pred)) == SEQUENCE)) + + if (arc_loop_hazard (pred, succ)) return 4; - return arc600_corereg_hazard (pred, succ); + + if (TARGET_ARC600) + return arc600_corereg_hazard (pred, succ); + + return 0; } /* Return length adjustment for INSN. */ diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h index e8baf5b8d79..d312f9f14a7 100644 --- a/gcc/config/arc/arc.h +++ b/gcc/config/arc/arc.h @@ -80,6 +80,14 @@ along with GCC; see the file COPYING3. 
If not see builtin_define ("__A7__"); \ builtin_define ("__ARC700__"); \ } \ + else if (TARGET_EM) \ + { \ + builtin_define ("__EM__"); \ + } \ + else if (TARGET_HS) \ + { \ + builtin_define ("__HS__"); \ + } \ if (TARGET_NORM) \ { \ builtin_define ("__ARC_NORM__");\ @@ -143,6 +151,8 @@ along with GCC; see the file COPYING3. If not see %{mcpu=ARC700|!mcpu=*:%{mlock}} \ %{mcpu=ARC700|!mcpu=*:%{mswape}} \ %{mcpu=ARC700|!mcpu=*:%{mrtsc}} \ +%{mcpu=ARCHS:-mHS} \ +%{mcpu=ARCEM:-mEM} \ " #if DEFAULT_LIBC == LIBC_UCLIBC @@ -246,12 +256,13 @@ along with GCC; see the file COPYING3. If not see /* Non-zero means the cpu supports norm instruction. This flag is set by default for A7, and only for pre A7 cores when -mnorm is given. */ -#define TARGET_NORM (TARGET_ARC700 || TARGET_NORM_SET) +#define TARGET_NORM (TARGET_ARC700 || TARGET_NORM_SET || TARGET_HS) /* Indicate if an optimized floating point emulation library is available. */ #define TARGET_OPTFPE \ (TARGET_ARC700 \ /* We need a barrel shifter and NORM. */ \ - || (TARGET_ARC600 && TARGET_NORM_SET)) + || (TARGET_ARC600 && TARGET_NORM_SET) \ + || TARGET_HS) /* Non-zero means the cpu supports swap instruction. This flag is set by default for A7, and only for pre A7 cores when -mswap is given. */ @@ -271,11 +282,15 @@ along with GCC; see the file COPYING3. If not see /* For an anulled-true delay slot insn for a delayed branch, should we only use conditional execution? */ -#define TARGET_AT_DBR_CONDEXEC (!TARGET_ARC700) +#define TARGET_AT_DBR_CONDEXEC (!TARGET_ARC700 && !TARGET_V2) #define TARGET_ARC600 (arc_cpu == PROCESSOR_ARC600) #define TARGET_ARC601 (arc_cpu == PROCESSOR_ARC601) #define TARGET_ARC700 (arc_cpu == PROCESSOR_ARC700) +#define TARGET_EM (arc_cpu == PROCESSOR_ARCEM) +#define TARGET_HS (arc_cpu == PROCESSOR_ARCHS) +#define TARGET_V2 \ + ((arc_cpu == PROCESSOR_ARCHS) || (arc_cpu == PROCESSOR_ARCEM)) /* Recast the cpu class to be the cpu attribute. 
*/ #define arc_cpu_attr ((enum attr_cpu)arc_cpu) @@ -744,6 +759,7 @@ extern enum reg_class arc_regno_reg_class[]; ((unsigned) (((X) >> (SHIFT)) + 0x100) \ < 0x200 - ((unsigned) (OFFSET) >> (SHIFT))) #define SIGNED_INT12(X) ((unsigned) ((X) + 0x800) < 0x1000) +#define SIGNED_INT16(X) ((unsigned) ((X) + 0x8000) < 0x10000) #define LARGE_INT(X) \ (((X) < 0) \ ? (X) >= (-(HOST_WIDE_INT) 0x7fffffff - 1) \ @@ -1305,6 +1321,7 @@ do { \ #endif #define SET_ASM_OP "\t.set\t" +extern char rname29[], rname30[]; extern char rname56[], rname57[], rname58[], rname59[]; /* How to refer to registers in assembler output. This sequence is indexed by compiler's hard-register-number (see above). */ @@ -1312,7 +1329,7 @@ extern char rname56[], rname57[], rname58[], rname59[]; { "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", \ "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", \ "r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23", \ - "r24", "r25", "gp", "fp", "sp", "ilink1", "ilink2", "blink", \ + "r24", "r25", "gp", "fp", "sp", rname29, rname30, "blink", \ "r32", "r33", "r34", "r35", "r36", "r37", "r38", "r39", \ "d1", "d1", "d2", "d2", "r44", "r45", "r46", "r47", \ "r48", "r49", "r50", "r51", "r52", "r53", "r54", "r55", \ @@ -1678,4 +1695,25 @@ enum #define SFUNC_CHECK_PREDICABLE \ (GET_CODE (PATTERN (insn)) != COND_EXEC || !flag_pic || !TARGET_MEDIUM_CALLS) +/* MPYW feature macro. Only valid for ARCHS and ARCEM cores. */ +#define TARGET_MPYW ((arc_mpy_option > 0) && TARGET_V2) +/* Full ARCv2 multiplication feature macro. */ +#define TARGET_MULTI ((arc_mpy_option > 1) && TARGET_V2) +/* General MPY feature macro. */ +#define TARGET_MPY ((TARGET_ARC700 && (!TARGET_NOMPY_SET)) || TARGET_MULTI) +/* ARC700 MPY feature macro. */ +#define TARGET_ARC700_MPY (TARGET_ARC700 && (!TARGET_NOMPY_SET)) +/* Any multiplication feature macro. */ +#define TARGET_ANY_MPY \ + (TARGET_MPY || TARGET_MUL64_SET || TARGET_MULMAC_32BY16_SET) + +/* ARC600 and ARC601 feature macro. 
*/ +#define TARGET_ARC600_FAMILY (TARGET_ARC600 || TARGET_ARC601) +/* ARC600, ARC601 and ARC700 feature macro. */ +#define TARGET_ARCOMPACT_FAMILY \ + (TARGET_ARC600 || TARGET_ARC601 || TARGET_ARC700) +/* The loop count register can be read in the very next instruction + after it has been written by an ordinary instruction. */ +#define TARGET_LP_WR_INTERLOCK (!TARGET_ARC600_FAMILY) + #endif /* GCC_ARC_H */ diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md index e1da4d70085..1d070a30d82 100644 --- a/gcc/config/arc/arc.md +++ b/gcc/config/arc/arc.md @@ -84,6 +84,8 @@ ;; Include DFA scheduluers (include ("arc600.md")) (include ("arc700.md")) +(include ("arcEM.md")) +(include ("arcHS.md")) ;; Predicates @@ -124,6 +126,7 @@ (VUNSPEC_SR 26) ; blockage insn for writing to an auxiliary register (VUNSPEC_TRAP_S 27) ; blockage insn for trap_s generation (VUNSPEC_UNIMP_S 28) ; blockage insn for unimp_s generation + (VUNSPEC_NOP 29) ; volatile NOP (R0_REG 0) (R1_REG 1) @@ -165,7 +168,7 @@ simd_varith_with_acc, simd_vlogic, simd_vlogic_with_acc, simd_vcompare, simd_vpermute, simd_vpack, simd_vpack_with_acc, simd_valign, simd_valign_with_acc, simd_vcontrol, - simd_vspecial_3cycle, simd_vspecial_4cycle, simd_dma" + simd_vspecial_3cycle, simd_vspecial_4cycle, simd_dma, mul16_em, div_rem" (cond [(eq_attr "is_sfunc" "yes") (cond [(match_test "!TARGET_LONG_CALLS_SET && (!TARGET_MEDIUM_CALLS || GET_CODE (PATTERN (insn)) != COND_EXEC)") (const_string "call") (match_test "flag_pic") (const_string "sfunc")] @@ -188,7 +191,7 @@ ;; Attribute describing the processor -(define_attr "cpu" "none,ARC600,ARC700" +(define_attr "cpu" "none,ARC600,ARC700,ARCEM,ARCHS" (const (symbol_ref "arc_cpu_attr"))) ;; true for compact instructions (those with _s suffix) @@ -226,8 +229,21 @@ (symbol_ref "get_attr_length (NEXT_INSN (PREV_INSN (insn))) - get_attr_length (insn)"))) +; for ARCv2 we need to disable/enable different instruction alternatives +(define_attr "cpu_facility" "std,av1,av2" + (const_string 
"std")) -(define_attr "enabled" "no,yes" (const_string "yes")) +; We should consider all the instructions enabled until otherwise +(define_attr "enabled" "no,yes" + (cond [(and (eq_attr "cpu_facility" "av1") + (match_test "TARGET_V2")) + (const_string "no") + + (and (eq_attr "cpu_facility" "av2") + (not (match_test "TARGET_V2"))) + (const_string "no") + ] + (const_string "yes"))) (define_attr "predicable" "no,yes" (const_string "no")) ;; if 'predicable' were not so brain-dead, we would specify: @@ -580,7 +596,8 @@ stb%U0%V0 %1,%0" [(set_attr "type" "move,move,move,move,move,move,move,load,store,load,load,store,store") (set_attr "iscompact" "maybe,maybe,maybe,false,false,false,false,true,true,true,false,false,false") - (set_attr "predicable" "yes,no,yes,yes,no,yes,yes,no,no,no,no,no,no")]) + (set_attr "predicable" "yes,no,yes,yes,no,yes,yes,no,no,no,no,no,no") + (set_attr "cpu_facility" "*,*,av1,*,*,*,*,*,*,*,*,*,*")]) (define_expand "movhi" [(set (match_operand:HI 0 "move_dest_operand" "") @@ -607,15 +624,16 @@ mov%? %0,%1 mov%? %0,%S1%& mov%? %0,%S1 - ldw%? %0,%1%& - stw%? %1,%0%& - ldw%U1%V1 %0,%1 - stw%U0%V0 %1,%0 - stw%U0%V0 %1,%0 - stw%U0%V0 %S1,%0" + ld%_%? %0,%1%& + st%_%? %1,%0%& + ld%_%U1%V1 %0,%1 + st%_%U0%V0 %1,%0 + st%_%U0%V0 %1,%0 + st%_%U0%V0 %S1,%0" [(set_attr "type" "move,move,move,move,move,move,move,move,load,store,load,store,store,store") (set_attr "iscompact" "maybe,maybe,maybe,false,false,false,maybe_limm,false,true,true,false,false,false,false") - (set_attr "predicable" "yes,no,yes,yes,no,yes,yes,yes,no,no,no,no,no,no")]) + (set_attr "predicable" "yes,no,yes,yes,no,yes,yes,yes,no,no,no,no,no,no") + (set_attr "cpu_facility" "*,*,av1,*,*,*,*,*,*,*,*,*,*,*")]) (define_expand "movsi" [(set (match_operand:SI 0 "move_dest_operand" "") @@ -669,7 +687,8 @@ ; Use default length for iscompact to allow for COND_EXEC. But set length ; of Crr to 4. 
(set_attr "length" "*,*,*,4,4,4,4,8,8,*,8,*,*,*,*,*,*,*,*,8") - (set_attr "predicable" "yes,no,yes,yes,no,no,yes,no,no,yes,yes,no,no,no,no,no,no,no,no,no")]) + (set_attr "predicable" "yes,no,yes,yes,no,no,yes,no,no,yes,yes,no,no,no,no,no,no,no,no,no") + (set_attr "cpu_facility" "*,*,av1,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*")]) ;; Sometimes generated by the epilogue code. We don't want to ;; recognize these addresses in general, because the limm is costly, @@ -698,7 +717,7 @@ (define_insn_and_split "*movsi_set_cc_insn" [(set (match_operand:CC_ZN 2 "cc_set_register" "") - (match_operator 3 "zn_compare_operator" + (match_operator:CC_ZN 3 "zn_compare_operator" [(match_operand:SI 1 "nonmemory_operand" "cI,cL,Cal") (const_int 0)])) (set (match_operand:SI 0 "register_operand" "=w,w,w") (match_dup 1))] @@ -715,7 +734,7 @@ (define_insn "unary_comparison" [(set (match_operand:CC_ZN 0 "cc_set_register" "") - (match_operator 3 "zn_compare_operator" + (match_operator:CC_ZN 3 "zn_compare_operator" [(match_operator:SI 2 "unary_operator" [(match_operand:SI 1 "register_operand" "c")]) (const_int 0)]))] @@ -779,7 +798,7 @@ (define_insn "*commutative_binary_comparison" [(set (match_operand:CC_ZN 0 "cc_set_register" "") - (match_operator 5 "zn_compare_operator" + (match_operator:CC_ZN 5 "zn_compare_operator" [(match_operator:SI 4 "commutative_operator" [(match_operand:SI 1 "register_operand" "%c,c,c") (match_operand:SI 2 "nonmemory_operand" "cL,I,?Cal")]) @@ -857,7 +876,7 @@ ; Make sure to use the W class to not touch LP_COUNT. 
(set (match_operand:SI 0 "register_operand" "=W,W,W") (match_dup 4))] - "TARGET_ARC700" + "!TARGET_ARC600_FAMILY" "%O4.f %0,%1,%2 ; mult commutative" [(set_attr "type" "compare,compare,compare") (set_attr "cond" "set_zn,set_zn,set_zn") @@ -881,7 +900,7 @@ (define_insn "*noncommutative_binary_comparison" [(set (match_operand:CC_ZN 0 "cc_set_register" "") - (match_operator 5 "zn_compare_operator" + (match_operator:CC_ZN 5 "zn_compare_operator" [(match_operator:SI 4 "noncommutative_operator" [(match_operand:SI 1 "register_operand" "c,c,c") (match_operand:SI 2 "nonmemory_operand" "cL,I,?Cal")]) @@ -1145,7 +1164,7 @@ (set (match_operand:SI 0 "dest_reg_operand" "=w,w") (plus:SI (match_dup 1) (match_dup 2)))] "" - "ldw.a%V4 %3,[%0,%S2]" + "ld%_.a%V4 %3,[%0,%S2]" [(set_attr "type" "load,load") (set_attr "length" "4,8")]) @@ -1157,7 +1176,7 @@ (set (match_operand:SI 0 "dest_reg_operand" "=r,r") (plus:SI (match_dup 1) (match_dup 2)))] "" - "ldw.a%V4 %3,[%0,%S2]" + "ld%_.a%V4 %3,[%0,%S2]" [(set_attr "type" "load,load") (set_attr "length" "4,8")]) @@ -1170,7 +1189,7 @@ (set (match_operand:SI 0 "dest_reg_operand" "=w,w") (plus:SI (match_dup 1) (match_dup 2)))] "" - "ldw.x.a%V4 %3,[%0,%S2]" + "ld%_.x.a%V4 %3,[%0,%S2]" [(set_attr "type" "load,load") (set_attr "length" "4,8")]) @@ -1182,7 +1201,7 @@ (set (match_operand:SI 0 "dest_reg_operand" "=w") (plus:SI (match_dup 1) (match_dup 2)))] "" - "stw.a%V4 %3,[%0,%2]" + "st%_.a%V4 %3,[%0,%2]" [(set_attr "type" "store") (set_attr "length" "4")]) @@ -1283,7 +1302,7 @@ && satisfies_constraint_Rcq (operands[0])) return "sub%?.ne %0,%0,%0"; /* ??? might be good for speed on ARC600 too, *if* properly scheduled. */ - if ((TARGET_ARC700 || optimize_size) + if ((optimize_size && (!TARGET_ARC600_FAMILY)) && rtx_equal_p (operands[1], constm1_rtx) && GET_CODE (operands[3]) == LTU) return "sbc.cs %0,%0,%0"; @@ -1435,13 +1454,13 @@ (zero_extend:SI (match_operand:HI 1 "nonvol_nonimm_operand" "0,q,0,c,Usd,Usd,m")))] "" "@ - extw%? %0,%1%& - extw%? 
%0,%1%& + ext%_%? %0,%1%& + ext%_%? %0,%1%& bmsk%? %0,%1,15 - extw %0,%1 - ldw%? %0,%1%& - ldw%U1 %0,%1 - ldw%U1%V1 %0,%1" + ext%_ %0,%1 + ld%_%? %0,%1%& + ld%_%U1 %0,%1 + ld%_%U1%V1 %0,%1" [(set_attr "type" "unary,unary,unary,unary,load,load,load") (set_attr "iscompact" "maybe,true,false,false,true,false,false") (set_attr "predicable" "no,no,yes,no,no,no,no")]) @@ -1498,9 +1517,9 @@ (sign_extend:SI (match_operand:HI 1 "nonvol_nonimm_operand" "Rcqq,c,m")))] "" "@ - sexw%? %0,%1%& - sexw %0,%1 - ldw.x%U1%V1 %0,%1" + sex%_%? %0,%1%& + sex%_ %0,%1 + ld%_.x%U1%V1 %0,%1" [(set_attr "type" "unary,unary,load") (set_attr "iscompact" "true,false,false")]) @@ -1604,7 +1623,88 @@ (set_attr "cond" "canuse,canuse,canuse,canuse,canuse,canuse,nocond,canuse,nocond,nocond,nocond,nocond,canuse_limm,canuse_limm,canuse,canuse,nocond") ]) -;; ARC700/ARC600 multiply +;; ARCv2 MPYW and MPYUW +(define_expand "mulhisi3" + [(set (match_operand:SI 0 "register_operand" "") + (mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" "")) + (sign_extend:SI (match_operand:HI 2 "nonmemory_operand" ""))))] + "TARGET_MPYW" + "{ + if (CONSTANT_P (operands[2])) + { + emit_insn (gen_mulhisi3_imm (operands[0], operands[1], operands[2])); + DONE; + } + }" +) + +(define_insn "mulhisi3_imm" + [(set (match_operand:SI 0 "register_operand" "=r,r,r, r, r") + (mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" "0,r,0, 0, r")) + (match_operand:HI 2 "short_const_int_operand" "L,L,I,C16,C16")))] + "TARGET_MPYW" + "mpyw%? 
%0,%1,%2" + [(set_attr "length" "4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "mul16_em") + (set_attr "predicable" "yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse_limm,nocond") + ]) + +(define_insn "mulhisi3_reg" + [(set (match_operand:SI 0 "register_operand" "=Rcqq,r,r") + (mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" " 0,0,r")) + (sign_extend:SI (match_operand:HI 2 "nonmemory_operand" "Rcqq,r,r"))))] + "TARGET_MPYW" + "mpyw%? %0,%1,%2" + [(set_attr "length" "*,4,4") + (set_attr "iscompact" "maybe,false,false") + (set_attr "type" "mul16_em") + (set_attr "predicable" "yes,yes,no") + (set_attr "cond" "canuse,canuse,nocond") + ]) + +(define_expand "umulhisi3" + [(set (match_operand:SI 0 "register_operand" "") + (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "")) + (zero_extend:SI (match_operand:HI 2 "nonmemory_operand" ""))))] + "TARGET_MPYW" + "{ + if (CONSTANT_P (operands[2])) + { + emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2])); + DONE; + } + }" +) + +(define_insn "umulhisi3_imm" + [(set (match_operand:SI 0 "register_operand" "=r, r,r, r, r") + (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " 0, r,0, 0, r")) + (match_operand:HI 2 "short_const_int_operand" " L, L,I,C16,C16")))] + "TARGET_MPYW" + "mpyuw%? %0,%1,%2" + [(set_attr "length" "4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "mul16_em") + (set_attr "predicable" "yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse_limm,nocond") + ]) + +(define_insn "umulhisi3_reg" + [(set (match_operand:SI 0 "register_operand" "=Rcqq, r, r") + (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " 0, 0, r")) + (zero_extend:SI (match_operand:HI 2 "register_operand" " Rcqq, r, r"))))] + "TARGET_MPYW" + "mpyuw%? 
%0,%1,%2" + [(set_attr "length" "*,4,4") + (set_attr "iscompact" "maybe,false,false") + (set_attr "type" "mul16_em") + (set_attr "predicable" "yes,yes,no") + (set_attr "cond" "canuse,canuse,nocond") + ]) + +;; ARC700/ARC600/V2 multiply ;; SI <- SI * SI (define_expand "mulsi3" @@ -1613,7 +1713,7 @@ (match_operand:SI 2 "nonmemory_operand" "")))] "" { - if (TARGET_ARC700 && !TARGET_NOMPY_SET) + if (TARGET_MPY) { if (!register_operand (operands[0], SImode)) { @@ -1743,8 +1843,7 @@ (clobber (reg:SI LP_START)) (clobber (reg:SI LP_END)) (clobber (reg:CC CC_REG))] - "!TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET - && (!TARGET_ARC700 || TARGET_NOMPY_SET) + "!TARGET_ANY_MPY && SFUNC_CHECK_PREDICABLE" "*return arc_output_libcall (\"__mulsi3\");" [(set_attr "is_sfunc" "yes") @@ -1794,23 +1893,35 @@ [(set (match_operand:SI 0 "mpy_dest_reg_operand" "=Rcr,r,r,Rcr,r") (mult:SI (match_operand:SI 1 "register_operand" " 0,c,0,0,c") (match_operand:SI 2 "nonmemory_operand" "cL,cL,I,Cal,Cal")))] -"TARGET_ARC700 && !TARGET_NOMPY_SET" + "TARGET_ARC700_MPY" "mpyu%? %0,%1,%2" [(set_attr "length" "4,4,4,8,8") (set_attr "type" "umulti") (set_attr "predicable" "yes,no,no,yes,no") (set_attr "cond" "canuse,nocond,canuse_limm,canuse,nocond")]) +; ARCv2 has no penalties between mpy and mpyu. So, we use mpy because of its +; short variant. LP_COUNT constraints are still valid. +(define_insn "mulsi3_v2" + [(set (match_operand:SI 0 "mpy_dest_reg_operand" "=Rcqq,Rcr, r,r,Rcr, r") + (mult:SI (match_operand:SI 1 "register_operand" "%0, 0, c,0, 0, c") + (match_operand:SI 2 "nonmemory_operand" " Rcqq, cL,cL,I,Cal,Cal")))] + "TARGET_MULTI" + "mpy%? 
%0,%1,%2" + [(set_attr "length" "*,4,4,4,8,8") + (set_attr "iscompact" "maybe,false,false,false,false,false") + (set_attr "type" "umulti") + (set_attr "predicable" "no,yes,no,no,yes,no") + (set_attr "cond" "nocond,canuse,nocond,canuse_limm,canuse,nocond")]) + (define_expand "mulsidi3" [(set (match_operand:DI 0 "nonimmediate_operand" "") (mult:DI (sign_extend:DI(match_operand:SI 1 "register_operand" "")) (sign_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))] - "(TARGET_ARC700 && !TARGET_NOMPY_SET) - || TARGET_MUL64_SET - || TARGET_MULMAC_32BY16_SET" + "TARGET_ANY_MPY" " { - if (TARGET_ARC700 && !TARGET_NOMPY_SET) + if (TARGET_MPY) { operands[2] = force_reg (SImode, operands[2]); if (!register_operand (operands[0], DImode)) @@ -1892,7 +2003,7 @@ [(set (match_operand:DI 0 "register_operand" "=&r") (mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "%c")) (sign_extend:DI (match_operand:SI 2 "extend_operand" "cL"))))] - "TARGET_ARC700 && !TARGET_NOMPY_SET" + "TARGET_MPY" "#" "&& reload_completed" [(const_int 0)] @@ -1902,7 +2013,7 @@ rtx l0 = simplify_gen_subreg (word_mode, operands[0], DImode, lo); rtx h0 = simplify_gen_subreg (word_mode, operands[0], DImode, hi); emit_insn (gen_mulsi3_highpart (h0, operands[1], operands[2])); - emit_insn (gen_mulsi3_700 (l0, operands[1], operands[2])); + emit_insn (gen_mulsi3 (l0, operands[1], operands[2])); DONE; } [(set_attr "type" "multi") @@ -1916,8 +2027,8 @@ (sign_extend:DI (match_operand:SI 1 "register_operand" "%0,c, 0,c")) (sign_extend:DI (match_operand:SI 2 "extend_operand" "c,c, i,i"))) (const_int 32))))] - "TARGET_ARC700 && !TARGET_NOMPY_SET" - "mpyh%? %0,%1,%2" + "TARGET_MPY" + "mpy%+%? 
%0,%1,%2" [(set_attr "length" "4,4,8,8") (set_attr "type" "multi") (set_attr "predicable" "yes,no,yes,no") @@ -1933,8 +2044,8 @@ (zero_extend:DI (match_operand:SI 1 "register_operand" "%0,c, 0,c")) (zero_extend:DI (match_operand:SI 2 "extend_operand" "c,c, i,i"))) (const_int 32))))] - "TARGET_ARC700 && !TARGET_NOMPY_SET" - "mpyhu%? %0,%1,%2" + "TARGET_MPY" + "mpy%+u%? %0,%1,%2" [(set_attr "length" "4,4,8,8") (set_attr "type" "multi") (set_attr "predicable" "yes,no,yes,no") @@ -1956,8 +2067,7 @@ (clobber (reg:DI MUL64_OUT_REG)) (clobber (reg:CC CC_REG))] "!TARGET_BIG_ENDIAN - && !TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET - && (!TARGET_ARC700 || TARGET_NOMPY_SET) + && !TARGET_ANY_MPY && SFUNC_CHECK_PREDICABLE" "*return arc_output_libcall (\"__umulsi3_highpart\");" [(set_attr "is_sfunc" "yes") @@ -1977,8 +2087,7 @@ (clobber (reg:DI MUL64_OUT_REG)) (clobber (reg:CC CC_REG))] "TARGET_BIG_ENDIAN - && !TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET - && (!TARGET_ARC700 || TARGET_NOMPY_SET) + && !TARGET_ANY_MPY && SFUNC_CHECK_PREDICABLE" "*return arc_output_libcall (\"__umulsi3_highpart\");" [(set_attr "is_sfunc" "yes") @@ -1995,8 +2104,8 @@ (zero_extend:DI (match_operand:SI 1 "register_operand" " 0, c, 0, 0, c")) (match_operand:DI 2 "immediate_usidi_operand" "L, L, I, Cal, Cal")) (const_int 32))))] - "TARGET_ARC700 && !TARGET_NOMPY_SET" - "mpyhu%? %0,%1,%2" + "TARGET_MPY" + "mpy%+u%? 
%0,%1,%2" [(set_attr "length" "4,4,4,8,8") (set_attr "type" "multi") (set_attr "predicable" "yes,no,no,yes,no") @@ -2010,12 +2119,12 @@ (zero_extend:DI (match_operand:SI 1 "register_operand" "")) (zero_extend:DI (match_operand:SI 2 "nonmemory_operand" ""))) (const_int 32))))] - "TARGET_ARC700 || (!TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET)" + "!TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET" " { rtx target = operands[0]; - if (!TARGET_ARC700 || TARGET_NOMPY_SET) + if (!TARGET_MPY) { emit_move_insn (gen_rtx_REG (SImode, 0), operands[1]); emit_move_insn (gen_rtx_REG (SImode, 1), operands[2]); @@ -2047,7 +2156,7 @@ (zero_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))] "" { - if (TARGET_ARC700 && !TARGET_NOMPY_SET) + if (TARGET_MPY) { operands[2] = force_reg (SImode, operands[2]); if (!register_operand (operands[0], DImode)) @@ -2141,7 +2250,7 @@ [(set (match_operand:DI 0 "dest_reg_operand" "=&r") (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "%c")) (zero_extend:DI (match_operand:SI 2 "extend_operand" "cL"))))] - "TARGET_ARC700 && !TARGET_NOMPY_SET" + "TARGET_MPY" "#" "reload_completed" [(const_int 0)] @@ -2151,7 +2260,7 @@ rtx l0 = operand_subword (operands[0], lo, 0, DImode); rtx h0 = operand_subword (operands[0], hi, 0, DImode); emit_insn (gen_umulsi3_highpart (h0, operands[1], operands[2])); - emit_insn (gen_mulsi3_700 (l0, operands[1], operands[2])); + emit_insn (gen_mulsi3 (l0, operands[1], operands[2])); DONE; } [(set_attr "type" "umulti") @@ -2166,8 +2275,7 @@ (clobber (reg:SI R12_REG)) (clobber (reg:DI MUL64_OUT_REG)) (clobber (reg:CC CC_REG))] - "!TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET - && (!TARGET_ARC700 || TARGET_NOMPY_SET) + "!TARGET_ANY_MPY && SFUNC_CHECK_PREDICABLE" "*return arc_output_libcall (\"__umulsidi3\");" [(set_attr "is_sfunc" "yes") @@ -2183,8 +2291,7 @@ (clobber (reg:SI R12_REG)) (clobber (reg:DI MUL64_OUT_REG)) (clobber (reg:CC CC_REG))])] - "!TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET - && 
(!TARGET_ARC700 || TARGET_NOMPY_SET) + "!TARGET_ANY_MPY && peep2_regno_dead_p (1, TARGET_BIG_ENDIAN ? R1_REG : R0_REG)" [(pc)] { @@ -2350,7 +2457,7 @@ adc %0,%1,%2" ; if we have a bad schedule after sched2, split. "reload_completed - && !optimize_size && TARGET_ARC700 + && !optimize_size && (!TARGET_ARC600_FAMILY) && arc_scheduling_not_expected () && arc_sets_cc_p (prev_nonnote_insn (insn)) /* If next comes a return or other insn that needs a delay slot, @@ -2564,7 +2671,7 @@ sbc %0,%1,%2" ; if we have a bad schedule after sched2, split. "reload_completed - && !optimize_size && TARGET_ARC700 + && !optimize_size && (!TARGET_ARC600_FAMILY) && arc_scheduling_not_expected () && arc_sets_cc_p (prev_nonnote_insn (insn)) /* If next comes a return or other insn that needs a delay slot, @@ -2802,7 +2909,7 @@ return \"bclr%? %0,%1,%M2%&\"; case 4: return (INTVAL (operands[2]) == 0xff - ? \"extb%? %0,%1%&\" : \"extw%? %0,%1%&\"); + ? \"extb%? %0,%1%&\" : \"ext%_%? %0,%1%&\"); case 9: case 14: return \"bic%? %0,%1,%n2-1\"; case 18: if (TARGET_BIG_ENDIAN) @@ -2813,11 +2920,11 @@ xop[1] = adjust_address (operands[1], QImode, INTVAL (operands[2]) == 0xff ? 3 : 2); output_asm_insn (INTVAL (operands[2]) == 0xff - ? \"ldb %0,%1\" : \"ldw %0,%1\", + ? \"ldb %0,%1\" : \"ld%_ %0,%1\", xop); return \"\"; } - return INTVAL (operands[2]) == 0xff ? \"ldb %0,%1\" : \"ldw %0,%1\"; + return INTVAL (operands[2]) == 0xff ? \"ldb %0,%1\" : \"ld%_ %0,%1\"; default: gcc_unreachable (); } @@ -3196,19 +3303,19 @@ ;; Next come the scc insns. 
(define_expand "cstoresi4" - [(set (reg:CC CC_REG) - (compare:CC (match_operand:SI 2 "nonmemory_operand" "") - (match_operand:SI 3 "nonmemory_operand" ""))) - (set (match_operand:SI 0 "dest_reg_operand" "") - (match_operator:SI 1 "ordered_comparison_operator" [(reg CC_REG) - (const_int 0)]))] + [(set (match_operand:SI 0 "dest_reg_operand" "") + (match_operator:SI 1 "ordered_comparison_operator" [(match_operand:SI 2 "nonmemory_operand" "") + (match_operand:SI 3 "nonmemory_operand" "")]))] "" { - gcc_assert (XEXP (operands[1], 0) == operands[2]); - gcc_assert (XEXP (operands[1], 1) == operands[3]); - operands[1] = gen_compare_reg (operands[1], SImode); - emit_insn (gen_scc_insn (operands[0], operands[1])); - DONE; + if (!TARGET_CODE_DENSITY) + { + gcc_assert (XEXP (operands[1], 0) == operands[2]); + gcc_assert (XEXP (operands[1], 1) == operands[3]); + operands[1] = gen_compare_reg (operands[1], SImode); + emit_insn (gen_scc_insn (operands[0], operands[1])); + DONE; + } }) (define_mode_iterator SDF [SF DF]) @@ -3590,8 +3697,8 @@ return \"ld.as %0,[%1,%2]%&\"; case HImode: if (ADDR_DIFF_VEC_FLAGS (diff_vec).offset_unsigned) - return \"ldw.as %0,[%1,%2]\"; - return \"ldw.x.as %0,[%1,%2]\"; + return \"ld%_.as %0,[%1,%2]\"; + return \"ld%_.x.as %0,[%1,%2]\"; case QImode: if (ADDR_DIFF_VEC_FLAGS (diff_vec).offset_unsigned) return \"ldb%? %0,[%1,%2]%&\"; @@ -3658,7 +3765,7 @@ 2 of these are for alignment, and are anticipated in the length of the ADDR_DIFF_VEC. 
*/ if (unalign && !satisfies_constraint_Rcq (xop[0])) - s = \"add2 %2,pcl,%0\n\tld_s%2,[%2,12]\"; + s = \"add2 %2,pcl,%0\n\tld_s %2,[%2,12]\"; else if (unalign) s = \"add_s %2,%0,2\n\tld.as %2,[pcl,%2]\"; else @@ -3670,12 +3777,12 @@ { if (satisfies_constraint_Rcq (xop[0])) { - s = \"add_s %2,%0,%1\n\tldw.as %2,[pcl,%2]\"; + s = \"add_s %2,%0,%1\n\tld%_.as %2,[pcl,%2]\"; xop[1] = GEN_INT ((10 - unalign) / 2U); } else { - s = \"add1 %2,pcl,%0\n\tldw_s %2,[%2,%1]\"; + s = \"add1 %2,pcl,%0\n\tld%__s %2,[%2,%1]\"; xop[1] = GEN_INT (10 + unalign); } } @@ -3683,12 +3790,12 @@ { if (satisfies_constraint_Rcq (xop[0])) { - s = \"add_s %2,%0,%1\n\tldw.x.as %2,[pcl,%2]\"; + s = \"add_s %2,%0,%1\n\tld%_.x.as %2,[pcl,%2]\"; xop[1] = GEN_INT ((10 - unalign) / 2U); } else { - s = \"add1 %2,pcl,%0\n\tldw_s.x %2,[%2,%1]\"; + s = \"add1 %2,pcl,%0\n\tld%__s.x %2,[%2,%1]\"; xop[1] = GEN_INT (10 + unalign); } } @@ -3886,6 +3993,14 @@ (set_attr "cond" "canuse") (set_attr "length" "2")]) +(define_insn "nopv" + [(unspec_volatile [(const_int 0)] VUNSPEC_NOP)] + "" + "nop%?" + [(set_attr "type" "misc") + (set_attr "iscompact" "true") + (set_attr "length" "2")]) + ;; Special pattern to flush the icache. ;; ??? Not sure what to do here. Some ARC's are known to support this. 
@@ -3985,7 +4100,7 @@ (set (match_operand:SI 4 "register_operand" "") (mult:SI (match_operand:SI 2 "register_operand") (match_operand:SI 3 "nonmemory_operand" "")))] - "TARGET_ARC700 && !TARGET_NOMPY_SET + "TARGET_ARC700_MPY && (rtx_equal_p (operands[0], operands[2]) || rtx_equal_p (operands[0], operands[3])) && peep2_regno_dead_p (0, CC_REG) @@ -4015,7 +4130,7 @@ (set (match_operand:SI 4 "register_operand" "") (mult:SI (match_operand:SI 2 "register_operand") (match_operand:SI 3 "nonmemory_operand" "")))] - "TARGET_ARC700 && !TARGET_NOMPY_SET + "TARGET_ARC700_MPY && (rtx_equal_p (operands[0], operands[2]) || rtx_equal_p (operands[0], operands[3])) && peep2_regno_dead_p (2, CC_REG)" @@ -4068,8 +4183,8 @@ (clrsb:HI (match_operand:HI 1 "general_operand" "cL,Cal"))))] "TARGET_NORM" "@ - normw \t%0, %1 - normw \t%0, %S1" + norm%_ \t%0, %1 + norm%_ \t%0, %S1" [(set_attr "length" "4,8") (set_attr "type" "two_cycle_core,two_cycle_core")]) @@ -4479,6 +4594,11 @@ = gen_rtx_REG (Pmode, arc_return_address_regs[arc_compute_function_type (cfun)]); + if (arc_compute_function_type (cfun) == ARC_FUNCTION_ILINK1 + && TARGET_V2) + { + return \"rtie\"; + } if (TARGET_PAD_RETURN) arc_pad_return (); output_asm_insn (\"j%!%* [%0]%&\", ®); @@ -4487,8 +4607,13 @@ [(set_attr "type" "return") ; predicable won't help here since the canonical rtl looks different ; for branches. 
- (set_attr "cond" "canuse") - (set (attr "iscompact") + (set (attr "cond") + (cond [(and (eq (symbol_ref "arc_compute_function_type (cfun)") + (symbol_ref "ARC_FUNCTION_ILINK1")) + (match_test "TARGET_V2")) + (const_string "nocond")] + (const_string "canuse"))) + (set (attr "iscompact") (cond [(eq (symbol_ref "arc_compute_function_type (cfun)") (symbol_ref "ARC_FUNCTION_NORMAL")) (const_string "maybe")] @@ -4504,7 +4629,9 @@ (if_then_else (match_operator 0 "proper_comparison_operator" [(reg CC_REG) (const_int 0)]) (simple_return) (pc)))] - "reload_completed" + "reload_completed + && !(TARGET_V2 + && arc_compute_function_type (cfun) == ARC_FUNCTION_ILINK1)" { rtx xop[2]; xop[0] = operands[0]; @@ -4909,7 +5036,7 @@ (define_expand "doloop_end" [(use (match_operand 0 "register_operand" "")) (use (label_ref (match_operand 1 "" "")))] - "TARGET_ARC600 || TARGET_ARC700" + "!TARGET_ARC601" { /* We could do smaller bivs with biv widening, and wider bivs by having a high-word counter in an outer loop - but punt on this for now. */ @@ -5158,6 +5285,247 @@ ;; this would not work right for -0. OTOH optabs.c has already code ;; to synthesyze negate by flipping the sign bit. 
+;;V2 instructions +(define_insn "bswapsi2" + [(set (match_operand:SI 0 "register_operand" "= r,r") + (bswap:SI (match_operand:SI 1 "nonmemory_operand" "rL,Cal")))] + "TARGET_V2 && TARGET_SWAP" + "swape %0, %1" + [(set_attr "length" "4,8") + (set_attr "type" "two_cycle_core")]) + +(define_expand "prefetch" + [(prefetch (match_operand:SI 0 "address_operand" "") + (match_operand:SI 1 "const_int_operand" "") + (match_operand:SI 2 "const_int_operand" ""))] + "TARGET_HS" + "") + +(define_insn "prefetch_1" + [(prefetch (match_operand:SI 0 "register_operand" "r") + (match_operand:SI 1 "const_int_operand" "n") + (match_operand:SI 2 "const_int_operand" "n"))] + "TARGET_HS" + { + if (INTVAL (operands[1])) + return "prefetchw [%0]"; + else + return "prefetch [%0]"; + } + [(set_attr "type" "load") + (set_attr "length" "4")]) + +(define_insn "prefetch_2" + [(prefetch (plus:SI (match_operand:SI 0 "register_operand" "r,r,r") + (match_operand:SI 1 "nonmemory_operand" "r,Cm2,Cal")) + (match_operand:SI 2 "const_int_operand" "n,n,n") + (match_operand:SI 3 "const_int_operand" "n,n,n"))] + "TARGET_HS" + { + if (INTVAL (operands[2])) + return "prefetchw [%0, %1]"; + else + return "prefetch [%0, %1]"; + } + [(set_attr "type" "load") + (set_attr "length" "4,4,8")]) + +(define_insn "prefetch_3" + [(prefetch (match_operand:SI 0 "address_operand" "p") + (match_operand:SI 1 "const_int_operand" "n") + (match_operand:SI 2 "const_int_operand" "n"))] + "TARGET_HS" + { + operands[0] = gen_rtx_MEM (SImode, operands[0]); + if (INTVAL (operands[1])) + return "prefetchw%U0 %0"; + else + return "prefetch%U0 %0"; + } + [(set_attr "type" "load") + (set_attr "length" "8")]) + +(define_insn "divsi3" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r,r,r, r, r") + (div:SI (match_operand:SI 1 "nonmemory_operand" "0,r,Cal,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r, r,L,L,I,Cal,Cal")))] + "TARGET_DIVREM" + "div%? 
%0, %1, %2" + [(set_attr "length" "4,4,8,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "div_rem") + (set_attr "predicable" "yes,no,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +(define_insn "udivsi3" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r,r,r, r, r") + (udiv:SI (match_operand:SI 1 "nonmemory_operand" "0,r,Cal,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r, r,L,L,I,Cal,Cal")))] + "TARGET_DIVREM" + "divu%? %0, %1, %2" + [(set_attr "length" "4,4,8,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "div_rem") + (set_attr "predicable" "yes,no,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +(define_insn "modsi3" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r,r,r, r, r") + (mod:SI (match_operand:SI 1 "nonmemory_operand" "0,r,Cal,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r, r,L,L,I,Cal,Cal")))] + "TARGET_DIVREM" + "rem%? %0, %1, %2" + [(set_attr "length" "4,4,8,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "div_rem") + (set_attr "predicable" "yes,no,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +(define_insn "umodsi3" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r,r,r, r, r") + (umod:SI (match_operand:SI 1 "nonmemory_operand" "0,r,Cal,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r, r,L,L,I,Cal,Cal")))] + "TARGET_DIVREM" + "remu%? 
%0, %1, %2" + [(set_attr "length" "4,4,8,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "div_rem") + (set_attr "predicable" "yes,no,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +;; SETcc instructions +(define_code_iterator arcCC_cond [eq ne gt lt ge le]) + +(define_insn "arcset<code>" + [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") + (arcCC_cond:SI (match_operand:SI 1 "nonmemory_operand" "0,r,0,r,0,0,r") + (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,n,n")))] + "TARGET_V2 && TARGET_CODE_DENSITY" + "set<code>%? %0, %1, %2" + [(set_attr "length" "4,4,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "compare") + (set_attr "predicable" "yes,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +(define_insn "arcsetltu" + [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r, r, r") + (ltu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I, n, n")))] + "TARGET_V2 && TARGET_CODE_DENSITY" + "setlo%? %0, %1, %2" + [(set_attr "length" "4,4,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "compare") + (set_attr "predicable" "yes,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +(define_insn "arcsetgeu" + [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r, r, r") + (geu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I, n, n")))] + "TARGET_V2 && TARGET_CODE_DENSITY" + "seths%? 
%0, %1, %2" + [(set_attr "length" "4,4,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "compare") + (set_attr "predicable" "yes,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +;; Special cases of SETCC +(define_insn_and_split "arcsethi" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r") + (gtu:SI (match_operand:SI 1 "nonmemory_operand" "r,r, r,r") + (match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))] + "TARGET_V2 && TARGET_CODE_DENSITY" + "setlo%? %0, %2, %1" + "reload_completed + && CONST_INT_P (operands[2]) + && satisfies_constraint_C62 (operands[2])" + [(const_int 0)] + "{ + /* sethi a,b,u6 => seths a,b,u6 + 1. */ + operands[2] = GEN_INT (INTVAL (operands[2]) + 1); + emit_insn (gen_arcsetgeu (operands[0], operands[1], operands[2])); + DONE; + }" + [(set_attr "length" "4,4,4,8") + (set_attr "iscompact" "false") + (set_attr "type" "compare") + (set_attr "predicable" "yes,no,no,no") + (set_attr "cond" "canuse,nocond,nocond,nocond")] +) + +(define_insn_and_split "arcsetls" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r") + (leu:SI (match_operand:SI 1 "nonmemory_operand" "r,r, r,r") + (match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))] + "TARGET_V2 && TARGET_CODE_DENSITY" + "seths%? %0, %2, %1" + "reload_completed + && CONST_INT_P (operands[2]) + && satisfies_constraint_C62 (operands[2])" + [(const_int 0)] + "{ + /* setls a,b,u6 => setlo a,b,u6 + 1. 
*/ + operands[2] = GEN_INT (INTVAL (operands[2]) + 1); + emit_insn (gen_arcsetltu (operands[0], operands[1], operands[2])); + DONE; + }" + [(set_attr "length" "4,4,4,8") + (set_attr "iscompact" "false") + (set_attr "type" "compare") + (set_attr "predicable" "yes,no,no,no") + (set_attr "cond" "canuse,nocond,nocond,nocond")] +) + +; Any mode that needs to be solved by secondary reload +(define_mode_iterator SRI [QI HI]) + +(define_expand "reload_<mode>_load" + [(parallel [(match_operand:SRI 0 "register_operand" "=r") + (match_operand:SRI 1 "memory_operand" "m") + (match_operand:SI 2 "register_operand" "=&r")])] + "" +{ + arc_secondary_reload_conv (operands[0], operands[1], operands[2], false); + DONE; +}) + +(define_expand "reload_<mode>_store" + [(parallel [(match_operand:SRI 0 "memory_operand" "=m") + (match_operand:SRI 1 "register_operand" "r") + (match_operand:SI 2 "register_operand" "=&r")])] + "" +{ + arc_secondary_reload_conv (operands[1], operands[0], operands[2], true); + DONE; +}) + + +(define_insn "extzvsi" + [(set (match_operand:SI 0 "register_operand" "=r , r , r, r, r") + (zero_extract:SI (match_operand:SI 1 "register_operand" "0 , r , 0, 0, r") + (match_operand:SI 2 "const_int_operand" "C3p, C3p, i, i, i") + (match_operand:SI 3 "const_int_operand" "i , i , i, i, i")))] + "TARGET_HS && TARGET_BARREL_SHIFTER" + { + int assemble_op2 = (((INTVAL (operands[2]) - 1) & 0x1f) << 5) | (INTVAL (operands[3]) & 0x1f); + operands[2] = GEN_INT (assemble_op2); + return "xbfu%? %0,%1,%2"; + } + [(set_attr "type" "shift") + (set_attr "iscompact" "false") + (set_attr "length" "4,4,4,8,8") + (set_attr "predicable" "yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse,nocond")]) ;; include the arc-FPX instructions (include "fpx.md") diff --git a/gcc/config/arc/arc.opt b/gcc/config/arc/arc.opt index 29e89f93d15..0c10c67c4e7 100644 --- a/gcc/config/arc/arc.opt +++ b/gcc/config/arc/arc.opt @@ -53,6 +53,18 @@ mARC700 Target Report Same as -mA7. 
+mmpy-option= +Target RejectNegative Joined UInteger Var(arc_mpy_option) Init(2) +-mmpy-option={0,1,2,3,4,5,6,7,8,9} Compile ARCv2 code with a multiplier design option. Option 2 is default on. + +mdiv-rem +Target Report Mask(DIVREM) +Enable DIV-REM instructions for ARCv2 + +mcode-density +Target Report Mask(CODE_DENSITY) +Enable code density instructions for ARCv2 + mmixed-code Target Report Mask(MIXED_CODE_SET) Tweak register allocation to help 16-bit instruction generation. @@ -162,11 +174,32 @@ EnumValue Enum(processor_type) String(ARC600) Value(PROCESSOR_ARC600) EnumValue +Enum(processor_type) String(arc600) Value(PROCESSOR_ARC600) + +EnumValue Enum(processor_type) String(ARC601) Value(PROCESSOR_ARC601) EnumValue +Enum(processor_type) String(arc601) Value(PROCESSOR_ARC601) + +EnumValue Enum(processor_type) String(ARC700) Value(PROCESSOR_ARC700) +EnumValue +Enum(processor_type) String(arc700) Value(PROCESSOR_ARC700) + +EnumValue +Enum(processor_type) String(ARCEM) Value(PROCESSOR_ARCEM) + +EnumValue +Enum(processor_type) String(arcem) Value(PROCESSOR_ARCEM) + +EnumValue +Enum(processor_type) String(ARCHS) Value(PROCESSOR_ARCHS) + +EnumValue +Enum(processor_type) String(archs) Value(PROCESSOR_ARCHS) + msize-level= Target RejectNegative Joined UInteger Var(arc_size_opt_level) Init(-1) size optimization level: 0:none 1:opportunistic 2: regalloc 3:drop align, -Os. diff --git a/gcc/config/arc/arcEM.md b/gcc/config/arc/arcEM.md new file mode 100644 index 00000000000..a72d2504e52 --- /dev/null +++ b/gcc/config/arc/arcEM.md @@ -0,0 +1,93 @@ +;; DFA scheduling description of the Synopsys DesignWare ARC EM cpu +;; for GNU C compiler +;; Copyright (C) 2007-2015 Free Software Foundation, Inc. +;; Contributor: Claudiu Zissulescu <claudiu.zissulescu@synopsys.com> + +;; This file is part of GCC. 
+ +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. + +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. + +(define_automaton "ARCEM") + +(define_cpu_unit "em_issue, ld_st, mul_em, divrem_em" "ARCEM") + +(define_insn_reservation "em_data_load" 2 + (and (match_test "TARGET_EM") + (eq_attr "type" "load")) + "em_issue+ld_st,nothing") + +(define_insn_reservation "em_data_store" 1 + (and (match_test "TARGET_EM") + (eq_attr "type" "store")) + "em_issue+ld_st") + +;; Multipliers options +(define_insn_reservation "mul_em_mpyw_1" 1 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option > 0") + (match_test "arc_mpy_option <= 2") + (eq_attr "type" "mul16_em")) + "em_issue+mul_em") + +(define_insn_reservation "mul_em_mpyw_2" 2 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option > 2") + (match_test "arc_mpy_option <= 5") + (eq_attr "type" "mul16_em")) + "em_issue+mul_em, nothing") + +(define_insn_reservation "mul_em_mpyw_4" 4 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 6") + (eq_attr "type" "mul16_em")) + "em_issue+mul_em, mul_em*3") + +(define_insn_reservation "mul_em_multi_wlh1" 1 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 2") + (eq_attr "type" "multi,umulti")) + "em_issue+mul_em") + +(define_insn_reservation "mul_em_multi_wlh2" 2 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 3") + (eq_attr "type" "multi,umulti")) + "em_issue+mul_em, nothing") + +(define_insn_reservation 
"mul_em_multi_wlh3" 3 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 4") + (eq_attr "type" "multi,umulti")) + "em_issue+mul_em, mul_em*2") + +;; FIXME! Make the difference between MPY and MPYM for WLH4 +(define_insn_reservation "mul_em_multi_wlh4" 4 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 5") + (eq_attr "type" "multi,umulti")) + "em_issue+mul_em, mul_em*4") + +(define_insn_reservation "mul_em_multi_wlh5" 9 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 6") + (eq_attr "type" "multi,umulti")) + "em_issue+mul_em, mul_em*8") + +;; Radix-4 divider timing +(define_insn_reservation "em_divrem" 3 + (and (match_test "TARGET_EM") + (match_test "TARGET_DIVREM") + (eq_attr "type" "div_rem")) + "em_issue+mul_em+divrem_em, (mul_em+divrem_em)*2") diff --git a/gcc/config/arc/arcHS.md b/gcc/config/arc/arcHS.md new file mode 100644 index 00000000000..06937445a47 --- /dev/null +++ b/gcc/config/arc/arcHS.md @@ -0,0 +1,76 @@ +;; DFA scheduling description of the Synopsys DesignWare ARC HS cpu +;; for GNU C compiler +;; Copyright (C) 2007-2015 Free Software Foundation, Inc. +;; Contributor: Claudiu Zissulescu <claudiu.zissulescu@synopsys.com> + +;; This file is part of GCC. + +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. + +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. 
+ +(define_automaton "ARCHS") + +(define_cpu_unit "hs_issue, hs_ld_st, divrem_hs, mul_hs, x1, x2" "ARCHS") + +(define_insn_reservation "hs_data_load" 4 + (and (match_test "TARGET_HS") + (eq_attr "type" "load")) + "hs_issue+hs_ld_st,hs_ld_st,nothing*2") + +(define_insn_reservation "hs_data_store" 1 + (and (match_test "TARGET_HS") + (eq_attr "type" "store")) + "hs_issue+hs_ld_st") + +(define_insn_reservation "hs_alu0" 2 + (and (match_test "TARGET_HS") + (eq_attr "type" "cc_arith, two_cycle_core, shift, lr, sr")) + "hs_issue+x1,x2") + +(define_insn_reservation "hs_alu1" 4 + (and (match_test "TARGET_HS") + (eq_attr "type" "move, cmove, unary, binary, compare, misc")) + "hs_issue+x1, nothing*3") + +(define_insn_reservation "hs_divrem" 13 + (and (match_test "TARGET_HS") + (match_test "TARGET_DIVREM") + (eq_attr "type" "div_rem")) + "hs_issue+divrem_hs, (divrem_hs)*12") + +(define_insn_reservation "hs_mul" 3 + (and (match_test "TARGET_HS") + (eq_attr "type" "mul16_em, multi, umulti")) + "hs_issue+mul_hs, nothing*3") + +;; BYPASS EALU -> +(define_bypass 1 "hs_alu0" "hs_divrem") +(define_bypass 1 "hs_alu0" "hs_mul") + +;; BYPASS BALU -> +(define_bypass 1 "hs_alu1" "hs_alu1") +(define_bypass 1 "hs_alu1" "hs_data_store" "store_data_bypass_p") + +;; BYPASS LD -> +(define_bypass 1 "hs_data_load" "hs_alu1") +(define_bypass 3 "hs_data_load" "hs_divrem") +(define_bypass 3 "hs_data_load" "hs_data_load") +(define_bypass 3 "hs_data_load" "hs_mul") +(define_bypass 1 "hs_data_load" "hs_data_store" "store_data_bypass_p") + +;; BYPASS MPY -> +;;(define_bypass 3 "hs_mul" "hs_mul") +(define_bypass 1 "hs_mul" "hs_alu1") +(define_bypass 3 "hs_mul" "hs_divrem") +(define_bypass 1 "hs_mul" "hs_data_store" "store_data_bypass_p") diff --git a/gcc/config/arc/constraints.md b/gcc/config/arc/constraints.md index 3d0db360557..65ea44a9f13 100644 --- a/gcc/config/arc/constraints.md +++ b/gcc/config/arc/constraints.md @@ -127,6 +127,12 @@ (and (match_code "const_int") (match_test "UNSIGNED_INT6 
(-ival)"))) +(define_constraint "C16" + "@internal + A 16-bit signed integer constant" + (and (match_code "const_int") + (match_test "SIGNED_INT16 (ival)"))) + (define_constraint "M" "@internal A 5-bit unsigned integer constant" @@ -212,6 +218,12 @@ (and (match_code "const_int") (match_test "ival && IS_POWEROF2_P (ival + 1)"))) +(define_constraint "C3p" + "@internal + constant int used to select xbfu a,b,u6 instruction. The values accepted are 1 and 2." + (and (match_code "const_int") + (match_test "((ival == 1) || (ival == 2))"))) + (define_constraint "Ccp" "@internal constant such that ~x (one's Complement) is a power of two" @@ -397,3 +409,15 @@ Integer constant zero" (and (match_code "const_int") (match_test "IS_ZERO (ival)"))) + +(define_constraint "Cm2" + "@internal + A signed 9-bit integer constant." + (and (match_code "const_int") + (match_test "(ival >= -256) && (ival <=255)"))) + +(define_constraint "C62" + "@internal + An unsigned 6-bit integer constant, up to 62." + (and (match_code "const_int") + (match_test "UNSIGNED_INT6 (ival - 1)"))) diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md index d72f097eb71..43f9474c691 100644 --- a/gcc/config/arc/predicates.md +++ b/gcc/config/arc/predicates.md @@ -664,7 +664,7 @@ (match_operand 0 "shiftr4_operator"))) (define_predicate "mult_operator" - (and (match_code "mult") (match_test "TARGET_ARC700 && !TARGET_NOMPY_SET")) + (and (match_code "mult") (match_test "TARGET_MPY")) ) (define_predicate "commutative_operator" @@ -809,3 +809,7 @@ (match_test "INTVAL (op) >= 0") (and (match_test "const_double_operand (op, mode)") (match_test "CONST_DOUBLE_HIGH (op) == 0")))) + +(define_predicate "short_const_int_operand" + (and (match_operand 0 "const_int_operand") + (match_test "satisfies_constraint_C16 (op)"))) diff --git a/gcc/config/arc/t-arc-newlib b/gcc/config/arc/t-arc-newlib index 8823805b8aa..ea43a52cdc0 100644 --- a/gcc/config/arc/t-arc-newlib +++ b/gcc/config/arc/t-arc-newlib @@ -17,8 +17,8 
@@ # with GCC; see the file COPYING3. If not see # <http://www.gnu.org/licenses/>. -MULTILIB_OPTIONS=mcpu=ARC600/mcpu=ARC601 mmul64/mmul32x16 mnorm -MULTILIB_DIRNAMES=arc600 arc601 mul64 mul32x16 norm +MULTILIB_OPTIONS=mcpu=ARC600/mcpu=ARC601/mcpu=ARC700/mcpu=ARCEM/mcpu=ARCHS mmul64/mmul32x16 mnorm +MULTILIB_DIRNAMES=arc600 arc601 arc700 em hs mul64 mul32x16 norm # # Aliases: MULTILIB_MATCHES = mcpu?ARC600=mcpu?arc600 @@ -26,10 +26,21 @@ MULTILIB_MATCHES += mcpu?ARC600=mARC600 MULTILIB_MATCHES += mcpu?ARC600=mA6 MULTILIB_MATCHES += mcpu?ARC600=mno-mpy MULTILIB_MATCHES += mcpu?ARC601=mcpu?arc601 +MULTILIB_MATCHES += mcpu?ARC700=mA7 +MULTILIB_MATCHES += mcpu?ARC700=mARC700 +MULTILIB_MATCHES += mcpu?ARC700=mcpu?arc700 +MULTILIB_MATCHES += mcpu?ARCEM=mcpu?arcem +MULTILIB_MATCHES += mcpu?ARCHS=mcpu?archs MULTILIB_MATCHES += EL=mlittle-endian MULTILIB_MATCHES += EB=mbig-endian # # These don't make sense for the ARC700 default target: -MULTILIB_EXCEPTIONS=mmul64* mmul32x16* mnorm* +MULTILIB_EXCEPTIONS=mmul64* mmul32x16* norm* # And neither of the -mmul* options make sense without -mnorm: MULTILIB_EXCLUSIONS=mARC600/mmul64/!mnorm mcpu=ARC601/mmul64/!mnorm mARC600/mmul32x16/!mnorm +# Exclusions for ARC700 +MULTILIB_EXCEPTIONS += mcpu=ARC700/mnorm* mcpu=ARC700/mmul64* mcpu=ARC700/mmul32x16* +# Exclusions for ARCv2EM +MULTILIB_EXCEPTIONS += mcpu=ARCEM/mmul64* mcpu=ARCEM/mmul32x16* +# Exclusions for ARCv2HS +MULTILIB_EXCEPTIONS += mcpu=ARCHS/mmul64* mcpu=ARCHS/mmul32x16* mcpu=ARCHS/mnorm* diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index bad3dc381a1..f73afc269c3 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -67,7 +67,9 @@ enum arm_type_qualifiers /* Polynomial types. */ qualifier_poly = 0x100, /* Lane indices - must be within range of previous argument = a vector. */ - qualifier_lane_index = 0x200 + qualifier_lane_index = 0x200, + /* Lane indices for single lane structure loads and stores. 
*/ + qualifier_struct_load_store_lane_index = 0x400 }; /* The qualifier_internal allows generation of a unary builtin from @@ -150,7 +152,7 @@ arm_load1_qualifiers[SIMD_MAX_BUILTIN_ARGS] static enum arm_type_qualifiers arm_load1_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_none, qualifier_const_pointer_map_mode, - qualifier_none, qualifier_immediate }; + qualifier_none, qualifier_struct_load_store_lane_index }; #define LOAD1LANE_QUALIFIERS (arm_load1_lane_qualifiers) /* The first argument (return type) of a store should be void type, @@ -169,7 +171,7 @@ arm_store1_qualifiers[SIMD_MAX_BUILTIN_ARGS] static enum arm_type_qualifiers arm_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_void, qualifier_pointer_map_mode, - qualifier_none, qualifier_immediate }; + qualifier_none, qualifier_struct_load_store_lane_index }; #define STORE1LANE_QUALIFIERS (arm_storestruct_lane_qualifiers) #define v8qi_UP V8QImode @@ -1963,6 +1965,7 @@ typedef enum { NEON_ARG_COPY_TO_REG, NEON_ARG_CONSTANT, NEON_ARG_LANE_INDEX, + NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX, NEON_ARG_MEMORY, NEON_ARG_STOP } builtin_arg; @@ -2020,9 +2023,9 @@ neon_dereference_pointer (tree exp, tree type, machine_mode mem_mode, /* Expand a Neon builtin. */ static rtx arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode, - int icode, int have_retval, tree exp, ...) 
+ int icode, int have_retval, tree exp, + builtin_arg *args) { - va_list ap; rtx pat; tree arg[SIMD_MAX_BUILTIN_ARGS]; rtx op[SIMD_MAX_BUILTIN_ARGS]; @@ -2037,13 +2040,11 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode, || !(*insn_data[icode].operand[0].predicate) (target, tmode))) target = gen_reg_rtx (tmode); - va_start (ap, exp); - formals = TYPE_ARG_TYPES (TREE_TYPE (arm_builtin_decls[fcode])); for (;;) { - builtin_arg thisarg = (builtin_arg) va_arg (ap, int); + builtin_arg thisarg = args[argc]; if (thisarg == NEON_ARG_STOP) break; @@ -2079,6 +2080,18 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode, op[argc] = copy_to_mode_reg (mode[argc], op[argc]); break; + case NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX: + gcc_assert (argc > 1); + if (CONST_INT_P (op[argc])) + { + neon_lane_bounds (op[argc], 0, + GET_MODE_NUNITS (map_mode), exp); + /* Keep to GCC-vector-extension lane indices in the RTL. */ + op[argc] = + GEN_INT (NEON_ENDIAN_LANE_N (map_mode, INTVAL (op[argc]))); + } + goto constant_arg; + case NEON_ARG_LANE_INDEX: /* Previous argument must be a vector, which this indexes. */ gcc_assert (argc > 0); @@ -2089,19 +2102,22 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode, } /* Fall through - if the lane index isn't a constant then the next case will error. */ + case NEON_ARG_CONSTANT: +constant_arg: if (!(*insn_data[icode].operand[opno].predicate) (op[argc], mode[argc])) - error_at (EXPR_LOCATION (exp), "incompatible type for argument %d, " - "expected %<const int%>", argc + 1); + { + error ("%Kargument %d must be a constant immediate", + exp, argc + 1); + return const0_rtx; + } break; + case NEON_ARG_MEMORY: /* Check if expand failed. */ if (op[argc] == const0_rtx) - { - va_end (ap); return 0; - } gcc_assert (MEM_P (op[argc])); PUT_MODE (op[argc], mode[argc]); /* ??? 
arm_neon.h uses the same built-in functions for signed @@ -2122,8 +2138,6 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode, } } - va_end (ap); - if (have_retval) switch (argc) { @@ -2235,6 +2249,8 @@ arm_expand_neon_builtin (int fcode, tree exp, rtx target) if (d->qualifiers[qualifiers_k] & qualifier_lane_index) args[k] = NEON_ARG_LANE_INDEX; + else if (d->qualifiers[qualifiers_k] & qualifier_struct_load_store_lane_index) + args[k] = NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX; else if (d->qualifiers[qualifiers_k] & qualifier_immediate) args[k] = NEON_ARG_CONSTANT; else if (d->qualifiers[qualifiers_k] & qualifier_maybe_immediate) @@ -2260,11 +2276,7 @@ arm_expand_neon_builtin (int fcode, tree exp, rtx target) the function is void, and a 1 if it is not. */ return arm_expand_neon_args (target, d->mode, fcode, icode, !is_void, exp, - args[1], - args[2], - args[3], - args[4], - NEON_ARG_STOP); + &args[1]); } /* Expand an expression EXP that calls a built-in function, diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index f4ebbc80f16..709369441d0 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -11049,6 +11049,23 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, case UNSIGNED_FIX: if (TARGET_HARD_FLOAT) { + /* The *combine_vcvtf2i reduces a vmul+vcvt into + a vcvt fixed-point conversion. */ + if (code == FIX && mode == SImode + && GET_CODE (XEXP (x, 0)) == FIX + && GET_MODE (XEXP (x, 0)) == SFmode + && GET_CODE (XEXP (XEXP (x, 0), 0)) == MULT + && vfp3_const_double_for_bits (XEXP (XEXP (XEXP (x, 0), 0), 1)) + > 0) + { + if (speed_p) + *cost += extra_cost->fp[0].toint; + + *cost += rtx_cost (XEXP (XEXP (XEXP (x, 0), 0), 0), mode, + code, 0, speed_p); + return true; + } + if (GET_MODE_CLASS (mode) == MODE_INT) { mode = GET_MODE (XEXP (x, 0)); @@ -12339,32 +12356,15 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse, { rtx el = vector ? 
CONST_VECTOR_ELT (op, i) : op; unsigned HOST_WIDE_INT elpart; - unsigned int part, parts; - if (CONST_INT_P (el)) - { - elpart = INTVAL (el); - parts = 1; - } - else if (CONST_DOUBLE_P (el)) - { - elpart = CONST_DOUBLE_LOW (el); - parts = 2; - } - else - gcc_unreachable (); + gcc_assert (CONST_INT_P (el)); + elpart = INTVAL (el); - for (part = 0; part < parts; part++) - { - unsigned int byte; - for (byte = 0; byte < innersize; byte++) - { - bytes[idx++] = (elpart & 0xff) ^ invmask; - elpart >>= BITS_PER_UNIT; - } - if (CONST_DOUBLE_P (el)) - elpart = CONST_DOUBLE_HIGH (el); - } + for (unsigned int byte = 0; byte < innersize; byte++) + { + bytes[idx++] = (elpart & 0xff) ^ invmask; + elpart >>= BITS_PER_UNIT; + } } /* Sanity check. */ @@ -12960,14 +12960,14 @@ neon_vector_mem_operand (rtx op, int type, bool strict) rtx ind; /* Reject eliminable registers. */ - if (! (reload_in_progress || reload_completed) - && ( reg_mentioned_p (frame_pointer_rtx, op) + if (strict && ! (reload_in_progress || reload_completed) + && (reg_mentioned_p (frame_pointer_rtx, op) || reg_mentioned_p (arg_pointer_rtx, op) || reg_mentioned_p (virtual_incoming_args_rtx, op) || reg_mentioned_p (virtual_outgoing_args_rtx, op) || reg_mentioned_p (virtual_stack_dynamic_rtx, op) || reg_mentioned_p (virtual_stack_vars_rtx, op))) - return !strict; + return FALSE; /* Constants are converted into offsets from labels. */ if (!MEM_P (op)) @@ -30103,4 +30103,5 @@ arm_sched_fusion_priority (rtx_insn *insn, int max_pri, *pri = tmp; return; } + #include "gt-arm.h" diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index a1a04a94ef2..313fed5b450 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -284,6 +284,12 @@ extern void (*arm_lang_output_object_attributes_hook)(void); #define TARGET_BPABI false #endif +/* Transform lane numbers on big endian targets. 
This is used to allow for the + endianness difference between NEON architectural lane numbers and those + used in RTL */ +#define NEON_ENDIAN_LANE_N(mode, n) \ + (BYTES_BIG_ENDIAN ? GET_MODE_NUNITS (mode) - 1 - n : n) + /* Support for a compile-time default CPU, et cetera. The rules are: --with-arch is ignored if -march or -mcpu are specified. --with-cpu is ignored if -march or -mcpu are specified, and is overridden diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index e5a2b0f1c9a..119550c4baa 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -4253,6 +4253,9 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load1_1reg<q>")] ) +;; The lane numbers in the RTL are in GCC lane order, having been flipped +;; in arm_expand_neon_args. The lane numbers are restored to architectural +;; lane order here. (define_insn "neon_vld1_lane<mode>" [(set (match_operand:VDX 0 "s_register_operand" "=w") (unspec:VDX [(match_operand:<V_elem> 1 "neon_struct_operand" "Um") @@ -4261,10 +4264,9 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD1_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); - if (lane < 0 || lane >= max) - error ("lane out of range"); + operands[3] = GEN_INT (lane); if (max == 1) return "vld1.<V_sz_elem>\t%P0, %A1"; else @@ -4273,6 +4275,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load1_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vld1_lane<mode>" [(set (match_operand:VQX 0 "s_register_operand" "=w") (unspec:VQX [(match_operand:<V_elem> 1 "neon_struct_operand" "Um") @@ -4281,12 +4285,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD1_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); + operands[3] = GEN_INT (lane); int regno = REGNO (operands[0]); - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -4359,6 +4362,8 @@ if (BYTES_BIG_ENDIAN) "vst1.<V_sz_elem>\t%h1, %A0" [(set_attr "type" "neon_store1_1reg<q>")]) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vst1_lane<mode>" [(set (match_operand:<V_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_elem> @@ -4367,10 +4372,9 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST1_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); - if (lane < 0 || lane >= max) - error ("lane out of range"); + operands[2] = GEN_INT (lane); if (max == 1) return "vst1.<V_sz_elem>\t{%P1}, %A0"; else @@ -4379,6 +4383,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store1_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vst1_lane<mode>" [(set (match_operand:<V_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_elem> @@ -4387,17 +4393,15 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST1_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; - operands[2] = GEN_INT (lane); } + operands[2] = GEN_INT (lane); operands[1] = gen_rtx_REG (<V_HALF>mode, regno); if (max == 2) return "vst1.<V_sz_elem>\t{%P1}, %A0"; @@ -4448,6 +4452,8 @@ if (BYTES_BIG_ENDIAN) "vld2.<V_sz_elem>\t%h0, %A1" [(set_attr "type" "neon_load2_2reg_q")]) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vld2_lane<mode>" [(set (match_operand:TI 0 "s_register_operand" "=w") (unspec:TI [(match_operand:<V_two_elem> 1 "neon_struct_operand" "Um") @@ -4457,22 +4463,22 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD2_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[4]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = gen_rtx_REG (DImode, regno); ops[1] = gen_rtx_REG (DImode, regno + 2); ops[2] = operands[1]; - ops[3] = operands[3]; + ops[3] = GEN_INT (lane); output_asm_insn ("vld2.<V_sz_elem>\t{%P0[%c3], %P1[%c3]}, %A2", ops); return ""; } [(set_attr "type" "neon_load2_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vld2_lane<mode>" [(set (match_operand:OI 0 "s_register_operand" "=w") (unspec:OI [(match_operand:<V_two_elem> 1 "neon_struct_operand" "Um") @@ -4482,13 +4488,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD2_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[4]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -4563,6 +4567,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store2_4reg<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vst2_lane<mode>" [(set (match_operand:<V_two_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_two_elem> @@ -4572,22 +4578,22 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST2_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[4]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = operands[0]; ops[1] = gen_rtx_REG (DImode, regno); ops[2] = gen_rtx_REG (DImode, regno + 2); - ops[3] = operands[2]; + ops[3] = GEN_INT (lane); output_asm_insn ("vst2.<V_sz_elem>\t{%P1[%c3], %P2[%c3]}, %A0", ops); return ""; } [(set_attr "type" "neon_store2_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vst2_lane<mode>" [(set (match_operand:<V_two_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_two_elem> @@ -4597,13 +4603,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST2_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[4]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -4707,6 +4711,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load3_3reg<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vld3_lane<mode>" [(set (match_operand:EI 0 "s_register_operand" "=w") (unspec:EI [(match_operand:<V_three_elem> 1 "neon_struct_operand" "Um") @@ -4716,17 +4722,15 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD3_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N (<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[5]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = gen_rtx_REG (DImode, regno); ops[1] = gen_rtx_REG (DImode, regno + 2); ops[2] = gen_rtx_REG (DImode, regno + 4); ops[3] = operands[1]; - ops[4] = operands[3]; + ops[4] = GEN_INT (lane); output_asm_insn ("vld3.<V_sz_elem>\t{%P0[%c4], %P1[%c4], %P2[%c4]}, %3", ops); return ""; @@ -4734,6 +4738,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load3_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vld3_lane<mode>" [(set (match_operand:CI 0 "s_register_operand" "=w") (unspec:CI [(match_operand:<V_three_elem> 1 "neon_struct_operand" "Um") @@ -4743,13 +4749,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD3_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[5]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -4879,6 +4883,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store3_3reg<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vst3_lane<mode>" [(set (match_operand:<V_three_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_three_elem> @@ -4888,17 +4894,15 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST3_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[5]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = operands[0]; ops[1] = gen_rtx_REG (DImode, regno); ops[2] = gen_rtx_REG (DImode, regno + 2); ops[3] = gen_rtx_REG (DImode, regno + 4); - ops[4] = operands[2]; + ops[4] = GEN_INT (lane); output_asm_insn ("vst3.<V_sz_elem>\t{%P1[%c4], %P2[%c4], %P3[%c4]}, %0", ops); return ""; @@ -4906,6 +4910,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store3_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vst3_lane<mode>" [(set (match_operand:<V_three_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_three_elem> @@ -4915,13 +4921,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST3_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[5]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -5029,6 +5033,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load4_4reg<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vld4_lane<mode>" [(set (match_operand:OI 0 "s_register_operand" "=w") (unspec:OI [(match_operand:<V_four_elem> 1 "neon_struct_operand" "Um") @@ -5038,18 +5044,16 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD4_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[6]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = gen_rtx_REG (DImode, regno); ops[1] = gen_rtx_REG (DImode, regno + 2); ops[2] = gen_rtx_REG (DImode, regno + 4); ops[3] = gen_rtx_REG (DImode, regno + 6); ops[4] = operands[1]; - ops[5] = operands[3]; + ops[5] = GEN_INT (lane); output_asm_insn ("vld4.<V_sz_elem>\t{%P0[%c5], %P1[%c5], %P2[%c5], %P3[%c5]}, %A4", ops); return ""; @@ -5057,6 +5061,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load4_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vld4_lane<mode>" [(set (match_operand:XI 0 "s_register_operand" "=w") (unspec:XI [(match_operand:<V_four_elem> 1 "neon_struct_operand" "Um") @@ -5066,13 +5072,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD4_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[6]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -5209,6 +5213,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store4_4reg<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vst4_lane<mode>" [(set (match_operand:<V_four_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_four_elem> @@ -5218,18 +5224,16 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST4_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[6]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = operands[0]; ops[1] = gen_rtx_REG (DImode, regno); ops[2] = gen_rtx_REG (DImode, regno + 2); ops[3] = gen_rtx_REG (DImode, regno + 4); ops[4] = gen_rtx_REG (DImode, regno + 6); - ops[5] = operands[2]; + ops[5] = GEN_INT (lane); output_asm_insn ("vst4.<V_sz_elem>\t{%P1[%c5], %P2[%c5], %P3[%c5], %P4[%c5]}, %A0", ops); return ""; @@ -5237,6 +5241,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store4_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vst4_lane<mode>" [(set (match_operand:<V_four_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_four_elem> @@ -5246,13 +5252,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST4_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[6]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; diff --git a/gcc/config/ft32/ft32.c b/gcc/config/ft32/ft32.c index 85e5ba3bbe5..ab620617bf7 100644 --- a/gcc/config/ft32/ft32.c +++ b/gcc/config/ft32/ft32.c @@ -238,7 +238,7 @@ ft32_print_operand (FILE * file, rtx x, int code) return; case MEM: - output_address (XEXP (operand, 0)); + output_address (GET_MODE (XEXP (operand, 0)), XEXP (operand, 0)); return; default: diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 9e20714099d..bd084dc9714 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -132,6 +132,7 @@ extern bool ix86_expand_vec_perm_const (rtx[]); extern bool ix86_expand_mask_vec_cmp (rtx[]); extern bool ix86_expand_int_vec_cmp (rtx[]); extern bool ix86_expand_fp_vec_cmp (rtx[]); +extern void ix86_expand_sse_movcc (rtx, rtx, rtx, rtx); extern void ix86_expand_sse_unpack (rtx, rtx, bool, bool); extern bool ix86_expand_int_addcc (rtx[]); extern rtx ix86_expand_call (rtx, rtx, rtx, rtx, rtx, bool); diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index f6c17dfd405..571f7d7b5ec 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -80,7 +80,7 @@ along with GCC; see the file COPYING3. 
If not see static rtx legitimize_dllimport_symbol (rtx, bool); static rtx legitimize_pe_coff_extern_decl (rtx, bool); static rtx legitimize_pe_coff_symbol (rtx, bool); -static void ix86_print_operand_address_as (FILE *file, rtx addr, addr_space_t); +static void ix86_print_operand_address_as (FILE *, rtx, addr_space_t, bool); #ifndef CHECK_STACK_LIMIT #define CHECK_STACK_LIMIT (-1) @@ -2175,7 +2175,7 @@ const struct processor_costs *ix86_cost = &pentium_cost; #define m_BONNELL (1<<PROCESSOR_BONNELL) #define m_SILVERMONT (1<<PROCESSOR_SILVERMONT) #define m_KNL (1<<PROCESSOR_KNL) -#define m_SKYLAKE_AVX512 (1<<PROCESSOT_SKYLAKE_AVX512) +#define m_SKYLAKE_AVX512 (1<<PROCESSOR_SKYLAKE_AVX512) #define m_INTEL (1<<PROCESSOR_INTEL) #define m_GEODE (1<<PROCESSOR_GEODE) @@ -17131,13 +17131,6 @@ ix86_print_operand (FILE *file, rtx x, int code) { rtx addr = XEXP (x, 0); - /* Avoid (%rip) for call operands. */ - if (code == 'P' && CONSTANT_ADDRESS_P (x) && !CONST_INT_P (x)) - { - output_addr_const (file, addr); - return; - } - /* No `byte ptr' prefix for call instructions ... */ if (ASSEMBLER_DIALECT == ASM_INTEL && code != 'X' && code != 'P') { @@ -17187,7 +17180,8 @@ ix86_print_operand (FILE *file, rtx x, int code) if (this_is_asm_operands && ! address_operand (addr, VOIDmode)) output_operand_lossage ("invalid constraints for operand"); else - ix86_print_operand_address_as (file, addr, MEM_ADDR_SPACE (x)); + ix86_print_operand_address_as + (file, addr, MEM_ADDR_SPACE (x), code == 'p' || code == 'P'); } else if (CONST_DOUBLE_P (x) && GET_MODE (x) == SFmode) @@ -17272,7 +17266,8 @@ ix86_print_operand_punct_valid_p (unsigned char code) /* Print a memory operand whose address is ADDR. 
*/ static void -ix86_print_operand_address_as (FILE *file, rtx addr, addr_space_t as) +ix86_print_operand_address_as (FILE *file, rtx addr, + addr_space_t as, bool no_rip) { struct ix86_address parts; rtx base, index, disp; @@ -17346,7 +17341,7 @@ ix86_print_operand_address_as (FILE *file, rtx addr, addr_space_t as) } /* Use one byte shorter RIP relative addressing for 64bit mode. */ - if (TARGET_64BIT && !base && !index) + if (TARGET_64BIT && !base && !index && !no_rip) { rtx symbol = disp; @@ -17360,10 +17355,10 @@ ix86_print_operand_address_as (FILE *file, rtx addr, addr_space_t as) && SYMBOL_REF_TLS_MODEL (symbol) == 0)) base = pc_rtx; } + if (!base && !index) { /* Displacement only requires special attention. */ - if (CONST_INT_P (disp)) { if (ASSEMBLER_DIALECT == ASM_INTEL && parts.seg == ADDR_SPACE_GENERIC) @@ -17505,7 +17500,7 @@ ix86_print_operand_address_as (FILE *file, rtx addr, addr_space_t as) static void ix86_print_operand_address (FILE *file, machine_mode /*mode*/, rtx addr) { - ix86_print_operand_address_as (file, addr, ADDR_SPACE_GENERIC); + ix86_print_operand_address_as (file, addr, ADDR_SPACE_GENERIC, false); } /* Implementation of TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA. */ @@ -22633,7 +22628,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1, /* Expand DEST = CMP ? OP_TRUE : OP_FALSE into a sequence of logical operations. This is used for both scalar and vector conditional moves. 
*/ -static void +void ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) { machine_mode mode = GET_MODE (dest); @@ -36113,7 +36108,11 @@ get_builtin_code_for_version (tree decl, tree *predicate_list) priority = P_PROC_AVX; break; case PROCESSOR_HASWELL: - if (new_target->x_ix86_isa_flags & OPTION_MASK_ISA_ADX) + if (new_target->x_ix86_isa_flags & OPTION_MASK_ISA_AVX512VL) + arg_str = "skylake-avx512"; + else if (new_target->x_ix86_isa_flags & OPTION_MASK_ISA_XSAVES) + arg_str = "skylake"; + else if (new_target->x_ix86_isa_flags & OPTION_MASK_ISA_ADX) arg_str = "broadwell"; else arg_str = "haswell"; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 52dd03717b4..34a6d3f4d82 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2601,7 +2601,7 @@ switch (which_alternative) { case 0: - return "movabs{<imodesuffix>}\t{%1, %0|%0, %1}"; + return "movabs{<imodesuffix>}\t{%1, %P0|<iptrsize> PTR [%P0], %1}"; case 1: return "mov{<imodesuffix>}\t{%1, %0|%0, %1}"; default: @@ -2625,7 +2625,7 @@ switch (which_alternative) { case 0: - return "movabs{<imodesuffix>}\t{%1, %0|%0, %1}"; + return "movabs{<imodesuffix>}\t{%P1, %0|%0, <iptrsize> PTR [%P1]}"; case 1: return "mov{<imodesuffix>}\t{%1, %0|%0, %1}"; default: diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index f804255aedf..aad6a0ddd98 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -799,6 +799,14 @@ [(V32QI "t") (V16HI "t") (V8SI "t") (V4DI "t") (V8SF "t") (V4DF "t") (V64QI "g") (V32HI "g") (V16SI "g") (V8DI "g") (V16SF "g") (V8DF "g")]) +;; Half mask mode for unpacks +(define_mode_attr HALFMASKMODE + [(DI "SI") (SI "HI")]) + +;; Double mask mode for packs +(define_mode_attr DOUBLEMASKMODE + [(HI "SI") (SI "DI")]) + ;; Include define_subst patterns for instructions with mask (include "subst.md") @@ -3015,6 +3023,87 @@ DONE; }) +(define_expand "vcond_mask_<mode><avx512fmaskmodelower>" + [(set (match_operand:V48_AVX512VL 0 
"register_operand") + (vec_merge:V48_AVX512VL + (match_operand:V48_AVX512VL 1 "nonimmediate_operand") + (match_operand:V48_AVX512VL 2 "vector_move_operand") + (match_operand:<avx512fmaskmode> 3 "register_operand")))] + "TARGET_AVX512F") + +(define_expand "vcond_mask_<mode><avx512fmaskmodelower>" + [(set (match_operand:VI12_AVX512VL 0 "register_operand") + (vec_merge:VI12_AVX512VL + (match_operand:VI12_AVX512VL 1 "nonimmediate_operand") + (match_operand:VI12_AVX512VL 2 "vector_move_operand") + (match_operand:<avx512fmaskmode> 3 "register_operand")))] + "TARGET_AVX512BW") + +(define_expand "vcond_mask_<mode><sseintvecmodelower>" + [(set (match_operand:VI_256 0 "register_operand") + (vec_merge:VI_256 + (match_operand:VI_256 1 "nonimmediate_operand") + (match_operand:VI_256 2 "vector_move_operand") + (match_operand:<sseintvecmode> 3 "register_operand")))] + "TARGET_AVX2" +{ + ix86_expand_sse_movcc (operands[0], operands[3], + operands[1], operands[2]); + DONE; +}) + +(define_expand "vcond_mask_<mode><sseintvecmodelower>" + [(set (match_operand:VI124_128 0 "register_operand") + (vec_merge:VI124_128 + (match_operand:VI124_128 1 "nonimmediate_operand") + (match_operand:VI124_128 2 "vector_move_operand") + (match_operand:<sseintvecmode> 3 "register_operand")))] + "TARGET_SSE2" +{ + ix86_expand_sse_movcc (operands[0], operands[3], + operands[1], operands[2]); + DONE; +}) + +(define_expand "vcond_mask_v2div2di" + [(set (match_operand:V2DI 0 "register_operand") + (vec_merge:V2DI + (match_operand:V2DI 1 "nonimmediate_operand") + (match_operand:V2DI 2 "vector_move_operand") + (match_operand:V2DI 3 "register_operand")))] + "TARGET_SSE4_2" +{ + ix86_expand_sse_movcc (operands[0], operands[3], + operands[1], operands[2]); + DONE; +}) + +(define_expand "vcond_mask_<mode><sseintvecmodelower>" + [(set (match_operand:VF_256 0 "register_operand") + (vec_merge:VF_256 + (match_operand:VF_256 1 "nonimmediate_operand") + (match_operand:VF_256 2 "vector_move_operand") + 
(match_operand:<sseintvecmode> 3 "register_operand")))] + "TARGET_AVX" +{ + ix86_expand_sse_movcc (operands[0], operands[3], + operands[1], operands[2]); + DONE; +}) + +(define_expand "vcond_mask_<mode><sseintvecmodelower>" + [(set (match_operand:VF_128 0 "register_operand") + (vec_merge:VF_128 + (match_operand:VF_128 1 "nonimmediate_operand") + (match_operand:VF_128 2 "vector_move_operand") + (match_operand:<sseintvecmode> 3 "register_operand")))] + "TARGET_SSE" +{ + ix86_expand_sse_movcc (operands[0], operands[3], + operands[1], operands[2]); + DONE; +}) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; ;; Parallel floating point logical operations @@ -11497,6 +11586,23 @@ DONE; }) +(define_expand "vec_pack_trunc_qi" + [(set (match_operand:HI 0 ("register_operand")) + (ior:HI (ashift:HI (zero_extend:HI (match_operand:QI 1 ("register_operand"))) + (const_int 8)) + (zero_extend:HI (match_operand:QI 2 ("register_operand")))))] + "TARGET_AVX512F") + +(define_expand "vec_pack_trunc_<mode>" + [(set (match_operand:<DOUBLEMASKMODE> 0 ("register_operand")) + (ior:<DOUBLEMASKMODE> (ashift:<DOUBLEMASKMODE> (zero_extend:<DOUBLEMASKMODE> (match_operand:SWI24 1 ("register_operand"))) + (match_dup 3)) + (zero_extend:<DOUBLEMASKMODE> (match_operand:SWI24 2 ("register_operand")))))] + "TARGET_AVX512BW" +{ + operands[3] = GEN_INT (GET_MODE_BITSIZE (<MODE>mode)); +}) + (define_insn "<sse2_avx2>_packsswb<mask_name>" [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x") (vec_concat:VI1_AVX512 @@ -13393,12 +13499,42 @@ "TARGET_SSE2" "ix86_expand_sse_unpack (operands[0], operands[1], true, false); DONE;") +(define_expand "vec_unpacks_lo_hi" + [(set (match_operand:QI 0 "register_operand") + (subreg:QI (match_operand:HI 1 "register_operand") 0))] + "TARGET_AVX512DQ") + +(define_expand "vec_unpacks_lo_si" + [(set (match_operand:HI 0 "register_operand") + (subreg:HI (match_operand:SI 1 "register_operand") 0))] + "TARGET_AVX512F") + +(define_expand 
"vec_unpacks_lo_di" + [(set (match_operand:SI 0 "register_operand") + (subreg:SI (match_operand:DI 1 "register_operand") 0))] + "TARGET_AVX512BW") + (define_expand "vec_unpacku_hi_<mode>" [(match_operand:<sseunpackmode> 0 "register_operand") (match_operand:VI124_AVX2_24_AVX512F_1_AVX512BW 1 "register_operand")] "TARGET_SSE2" "ix86_expand_sse_unpack (operands[0], operands[1], true, true); DONE;") +(define_expand "vec_unpacks_hi_hi" + [(set (subreg:HI (match_operand:QI 0 "register_operand") 0) + (lshiftrt:HI (match_operand:HI 1 "register_operand") + (const_int 8)))] + "TARGET_AVX512F") + +(define_expand "vec_unpacks_hi_<mode>" + [(set (subreg:SWI48x (match_operand:<HALFMASKMODE> 0 "register_operand") 0) + (lshiftrt:SWI48x (match_operand:SWI48x 1 "register_operand") + (match_dup 2)))] + "TARGET_AVX512BW" +{ + operands[2] = GEN_INT (GET_MODE_BITSIZE (<HALFMASKMODE>mode)); +}) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; ;; Miscellaneous diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 9880b236d6d..d3b7730486d 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -16824,6 +16824,34 @@ mips_avoid_hazard (rtx_insn *after, rtx_insn *insn, int *hilo_delay, } } +/* A SEQUENCE is breakable iff the branch inside it has a compact form + and the target has compact branches. */ + +static bool +mips_breakable_sequence_p (rtx_insn *insn) +{ + return (insn && GET_CODE (PATTERN (insn)) == SEQUENCE + && TARGET_CB_MAYBE + && get_attr_compact_form (SEQ_BEGIN (insn)) != COMPACT_FORM_NEVER); +} + +/* Remove a SEQUENCE and replace it with the delay slot instruction + followed by the branch and return the instruction in the delay slot. + Return the first of the two new instructions. + Subroutine of mips_reorg_process_insns. 
*/ + +static rtx_insn * +mips_break_sequence (rtx_insn *insn) +{ + rtx_insn *before = PREV_INSN (insn); + rtx_insn *branch = SEQ_BEGIN (insn); + rtx_insn *ds = SEQ_END (insn); + remove_insn (insn); + add_insn_after (ds, before, NULL); + add_insn_after (branch, ds, NULL); + return ds; +} + /* Go through the instruction stream and insert nops where necessary. Also delete any high-part relocations whose partnering low parts are now all dead. See if the whole function can then be put into @@ -16916,6 +16944,68 @@ mips_reorg_process_insns (void) { if (GET_CODE (PATTERN (insn)) == SEQUENCE) { + rtx_insn *next_active = next_active_insn (insn); + /* Undo delay slots to avoid bubbles if the next instruction can + be placed in a forbidden slot or the cost of adding an + explicit NOP in a forbidden slot is OK and if the SEQUENCE is + safely breakable. */ + if (TARGET_CB_MAYBE + && mips_breakable_sequence_p (insn) + && INSN_P (SEQ_BEGIN (insn)) + && INSN_P (SEQ_END (insn)) + && ((next_active + && INSN_P (next_active) + && GET_CODE (PATTERN (next_active)) != SEQUENCE + && get_attr_can_delay (next_active) == CAN_DELAY_YES) + || !optimize_size)) + { + /* To hide a potential pipeline bubble, if we scan backwards + from the current SEQUENCE and find that there is a load + of a value that is used in the CTI and there are no + dependencies between the CTI and instruction in the delay + slot, break the sequence so the load delay is hidden. */ + HARD_REG_SET uses; + CLEAR_HARD_REG_SET (uses); + note_uses (&PATTERN (SEQ_BEGIN (insn)), record_hard_reg_uses, + &uses); + HARD_REG_SET delay_sets; + CLEAR_HARD_REG_SET (delay_sets); + note_stores (PATTERN (SEQ_END (insn)), record_hard_reg_sets, + &delay_sets); + + rtx_insn *prev = prev_active_insn (insn); + if (prev + && GET_CODE (PATTERN (prev)) == SET + && MEM_P (SET_SRC (PATTERN (prev)))) + { + HARD_REG_SET sets; + CLEAR_HARD_REG_SET (sets); + note_stores (PATTERN (prev), record_hard_reg_sets, + &sets); + + /* Re-order if safe. 
*/ + if (!hard_reg_set_intersect_p (delay_sets, uses) + && hard_reg_set_intersect_p (uses, sets)) + { + next_insn = mips_break_sequence (insn); + /* Need to process the hazards of the newly + introduced instructions. */ + continue; + } + } + + /* If we find an orphaned high-part relocation in a delay + slot then we can convert to a compact branch and get + the orphaned high part deleted. */ + if (mips_orphaned_high_part_p (&htab, SEQ_END (insn))) + { + next_insn = mips_break_sequence (insn); + /* Need to process the hazards of the newly + introduced instructions. */ + continue; + } + } + /* If we find an orphaned high-part relocation in a delay slot, it's easier to turn that instruction into a NOP than to delete it. The delay slot will be a NOP either way. */ @@ -16950,6 +17040,33 @@ mips_reorg_process_insns (void) { mips_avoid_hazard (last_insn, insn, &hilo_delay, &delayed_reg, lo_reg, &fs_delay); + /* When a compact branch introduces a forbidden slot hazard + and the next useful instruction is a SEQUENCE of a jump + and a non-nop instruction in the delay slot, remove the + sequence and replace it with the delay slot instruction + then the jump to clear the forbidden slot hazard. */ + + if (fs_delay) + { + /* Search onwards from the current position looking for + a SEQUENCE. We are looking for pipeline hazards here + and do not need to worry about labels or barriers as + the optimization only undoes delay slot filling which + only affects the order of the branch and its delay + slot. */ + rtx_insn *next = next_active_insn (insn); + if (next + && USEFUL_INSN_P (next) + && GET_CODE (PATTERN (next)) == SEQUENCE + && mips_breakable_sequence_p (next)) + { + last_insn = insn; + next_insn = mips_break_sequence (next); + /* Need to process the hazards of the newly + introduced instructions. 
*/ + continue; + } + } last_insn = insn; } } diff --git a/gcc/config/moxie/moxie.c b/gcc/config/moxie/moxie.c index a45b825ced0..756e2f74e2d 100644 --- a/gcc/config/moxie/moxie.c +++ b/gcc/config/moxie/moxie.c @@ -106,7 +106,7 @@ moxie_operand_lossage (const char *msgid, rtx op) /* The PRINT_OPERAND_ADDRESS worker. */ static void -moxie_print_operand_address (FILE *file, rtx x) +moxie_print_operand_address (FILE *file, machine_mode, rtx x) { switch (GET_CODE (x)) { @@ -183,7 +183,7 @@ moxie_print_operand (FILE *file, rtx x, int code) return; case MEM: - output_address (XEXP (operand, 0)); + output_address (GET_MODE (XEXP (operand, 0)), XEXP (operand, 0)); return; default: diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index f1ac307b346..d8673018819 100644 --- a/gcc/config/nvptx/nvptx.c +++ b/gcc/config/nvptx/nvptx.c @@ -137,6 +137,9 @@ nvptx_option_override (void) write_symbols = NO_DEBUG; debug_info_level = DINFO_LEVEL_NONE; + if (nvptx_optimize < 0) + nvptx_optimize = optimize > 0; + declared_fndecls_htab = hash_table<tree_hasher>::create_ggc (17); needed_fndecls_htab = hash_table<tree_hasher>::create_ggc (17); declared_libfuncs_htab @@ -2942,6 +2945,69 @@ nvptx_skip_par (unsigned mask, parallel *par) nvptx_single (mask, par->forked_block, pre_tail); } +/* If PAR has a single inner parallel and PAR itself only contains + empty entry and exit blocks, swallow the inner PAR. */ + +static void +nvptx_optimize_inner (parallel *par) +{ + parallel *inner = par->inner; + + /* We mustn't be the outer dummy par. */ + if (!par->mask) + return; + + /* We must have a single inner par. */ + if (!inner || inner->next) + return; + + /* We must only contain 2 blocks ourselves -- the head and tail of + the inner par. */ + if (par->blocks.length () != 2) + return; + + /* We must be disjoint partitioning. As we only have vector and + worker partitioning, this is sufficient to guarantee the pars + have adjacent partitioning. 
*/ + if ((par->mask & inner->mask) & (GOMP_DIM_MASK (GOMP_DIM_MAX) - 1)) + /* This indicates malformed code generation. */ + return; + + /* The outer forked insn should be immediately followed by the inner + fork insn. */ + rtx_insn *forked = par->forked_insn; + rtx_insn *fork = BB_END (par->forked_block); + + if (NEXT_INSN (forked) != fork) + return; + gcc_checking_assert (recog_memoized (fork) == CODE_FOR_nvptx_fork); + + /* The outer joining insn must immediately follow the inner join + insn. */ + rtx_insn *joining = par->joining_insn; + rtx_insn *join = inner->join_insn; + if (NEXT_INSN (join) != joining) + return; + + /* Preconditions met. Swallow the inner par. */ + if (dump_file) + fprintf (dump_file, "Merging loop %x [%d,%d] into %x [%d,%d]\n", + inner->mask, inner->forked_block->index, + inner->join_block->index, + par->mask, par->forked_block->index, par->join_block->index); + + par->mask |= inner->mask & (GOMP_DIM_MASK (GOMP_DIM_MAX) - 1); + + par->blocks.reserve (inner->blocks.length ()); + while (inner->blocks.length ()) + par->blocks.quick_push (inner->blocks.pop ()); + + par->inner = inner->inner; + inner->inner = NULL; + + delete inner; +} + /* Process the parallel PAR and all its contained parallels. We do everything but the neutering. Return mask of partitioned modes used within this parallel. */ @@ -2949,6 +3015,9 @@ nvptx_skip_par (unsigned mask, parallel *par) static unsigned nvptx_process_pars (parallel *par) { + if (nvptx_optimize) + nvptx_optimize_inner (par); + unsigned inner_mask = par->mask; /* Do the inner parallels first. */ diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt index 80170465bea..342915d8095 100644 --- a/gcc/config/nvptx/nvptx.opt +++ b/gcc/config/nvptx/nvptx.opt @@ -28,3 +28,7 @@ Generate code for a 64-bit ABI. mmainkernel Target Report RejectNegative Link in code for a __main kernel. 
+ +moptimize +Target Report Var(nvptx_optimize) Init(-1) +Optimize partition neutering diff --git a/gcc/config/rs6000/aix.h b/gcc/config/rs6000/aix.h index dbcfb9579cb..375a13edb27 100644 --- a/gcc/config/rs6000/aix.h +++ b/gcc/config/rs6000/aix.h @@ -101,8 +101,6 @@ { \ builtin_define ("_IBMR2"); \ builtin_define ("_POWER"); \ - builtin_define ("__powerpc__"); \ - builtin_define ("__PPC__"); \ builtin_define ("__unix__"); \ builtin_define ("_AIX"); \ builtin_define ("_AIX32"); \ @@ -112,6 +110,22 @@ builtin_define ("__LONGDOUBLE128"); \ builtin_assert ("system=unix"); \ builtin_assert ("system=aix"); \ + if (TARGET_64BIT) \ + { \ + builtin_define ("__PPC__"); \ + builtin_define ("__PPC64__"); \ + builtin_define ("__powerpc__"); \ + builtin_define ("__powerpc64__"); \ + builtin_assert ("cpu=powerpc64"); \ + builtin_assert ("machine=powerpc64"); \ + } \ + else \ + { \ + builtin_define ("__PPC__"); \ + builtin_define ("__powerpc__"); \ + builtin_assert ("cpu=powerpc"); \ + builtin_assert ("machine=powerpc"); \ + } \ } \ while (0) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index ca93609bb6b..7b6aca9e813 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -18150,28 +18150,7 @@ rs6000_secondary_reload_direct_move (enum rs6000_reg_type to_type, } } - if (TARGET_POWERPC64 && size == 16) - { - /* Handle moving 128-bit values from GPRs to VSX point registers on - power8 when running in 64-bit mode using XXPERMDI to glue the two - 64-bit values back together. */ - if (to_type == VSX_REG_TYPE && from_type == GPR_REG_TYPE) - { - cost = 3; /* 2 mtvsrd's, 1 xxpermdi. */ - icode = reg_addr[mode].reload_vsx_gpr; - } - - /* Handle moving 128-bit values from VSX point registers to GPRs on - power8 when running in 64-bit mode using XXPERMDI to get access to the - bottom 64-bit value. */ - else if (to_type == GPR_REG_TYPE && from_type == VSX_REG_TYPE) - { - cost = 3; /* 2 mfvsrd's, 1 xxpermdi. 
*/
-	  icode = reg_addr[mode].reload_gpr_vsx;
-	}
-    }
-
-  else if (!TARGET_POWERPC64 && size == 8)
+  else if (size == 8)
     {
       /* Handle moving 64-bit values from GPRs to floating point registers on
	 power8 when running in 32-bit mode using FMRGOW to glue the two 32-bit
diff --git a/gcc/configure b/gcc/configure
index 0cd85fb8646..4b4e72457a7 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -28329,7 +28329,7 @@ else
   enable_default_ssp=no
 fi
-if test x$enable_default_ssp == xyes ; then
+if test x$enable_default_ssp = xyes ; then
 $as_echo "#define ENABLE_DEFAULT_SSP 1" >>confdefs.h
@@ -29181,7 +29181,7 @@ else
   enable_default_pie=no
 fi
-if test x$enable_default_pie == xyes ; then
+if test x$enable_default_pie = xyes ; then
 $as_echo "#define ENABLE_DEFAULT_PIE 1" >>confdefs.h
diff --git a/gcc/configure.ac b/gcc/configure.ac
index ed2e665b40c..42d8f136e9c 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -5463,7 +5463,7 @@ else
   enable_default_ssp=no
 fi],
 enable_default_ssp=no)
-if test x$enable_default_ssp == xyes ; then
+if test x$enable_default_ssp = xyes ; then
   AC_DEFINE(ENABLE_DEFAULT_SSP, 1,
     [Define if your target supports default stack protector and it is enabled.])
 fi
@@ -6028,7 +6028,7 @@ AC_ARG_ENABLE(default-pie,
 [enable Position Independent Executable as default])],
 enable_default_pie=$enableval,
 enable_default_pie=no)
-if test x$enable_default_pie == xyes ; then
+if test x$enable_default_pie = xyes ; then
   AC_DEFINE(ENABLE_DEFAULT_PIE, 1,
     [Define if your target supports default PIE and it is enabled.])
 fi
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 43d58a3475d..1c2fa5826dc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -549,7 +549,9 @@ Objective-C and Objective-C++ Dialects}.
-mexpand-adddi -mindexed-loads -mlra -mlra-priority-none @gol -mlra-priority-compact mlra-priority-noncompact -mno-millicode @gol -mmixed-code -mq-class -mRcq -mRcw -msize-level=@var{level} @gol --mtune=@var{cpu} -mmultcost=@var{num} -munalign-prob-threshold=@var{probability}} +-mtune=@var{cpu} -mmultcost=@var{num} @gol +-munalign-prob-threshold=@var{probability} -mmpy-option=@var{multo} @gol +-mdiv-rem -mcode-density} @emph{ARM Options} @gccoptlist{-mapcs-frame -mno-apcs-frame @gol @@ -873,7 +875,7 @@ Objective-C and Objective-C++ Dialects}. -march=@var{arch} -mbmx -mno-bmx -mcdx -mno-cdx} @emph{Nvidia PTX Options} -@gccoptlist{-m32 -m64 -mmainkernel} +@gccoptlist{-m32 -m64 -mmainkernel -moptimize} @emph{PDP-11 Options} @gccoptlist{-mfpu -msoft-float -mac0 -mno-ac0 -m40 -m45 -m10 @gol @@ -12846,7 +12848,7 @@ is being compiled: @item -mbarrel-shifter @opindex mbarrel-shifter Generate instructions supported by barrel shifter. This is the default -unless @option{-mcpu=ARC601} is in effect. +unless @option{-mcpu=ARC601} or @samp{-mcpu=ARCEM} is in effect. @item -mcpu=@var{cpu} @opindex mcpu @@ -12859,17 +12861,28 @@ values for @var{cpu} are @opindex mA6 @opindex mARC600 @item ARC600 +@item arc600 Compile for ARC600. Aliases: @option{-mA6}, @option{-mARC600}. @item ARC601 +@item arc601 @opindex mARC601 Compile for ARC601. Alias: @option{-mARC601}. @item ARC700 +@item arc700 @opindex mA7 @opindex mARC700 Compile for ARC700. Aliases: @option{-mA7}, @option{-mARC700}. This is the default when configured with @option{--with-cpu=arc700}@. + +@item ARCEM +@item arcem +Compile for ARC EM. + +@item ARCHS +@item archs +Compile for ARC HS. @end table @item -mdpfp @@ -12940,6 +12953,62 @@ can overridden by FPX options; @samp{mspfp}, @samp{mspfp-compact}, or @opindex mswap Generate swap instructions. +@item -mdiv-rem +@opindex mdiv-rem +Enable DIV/REM instructions for ARCv2 cores. 
+ +@item -mcode-density +@opindex mcode-density +Enable code density instructions for ARC EM, default on for ARC HS. + +@item -mmpy-option=@var{multo} +@opindex mmpy-option +Compile ARCv2 code with a multiplier design option. @samp{wlh1} is +the default value. The recognized values for @var{multo} are: + +@table @samp +@item 0 +No multiplier available. + +@item 1 +@opindex w +The multiply option is set to w: 16x16 multiplier, fully pipelined. +The following instructions are enabled: MPYW and MPYUW. + +@item 2 +@opindex wlh1 +The multiply option is set to wlh1: 32x32 multiplier, fully +pipelined (1 stage). The following instructions are additionally +enabled: MPY, MPYU, MPYM, MPYMU, and MPY_S. + +@item 3 +@opindex wlh2 +The multiply option is set to wlh2: 32x32 multiplier, fully pipelined +(2 stages). The following instructions are additionally enabled: MPY, +MPYU, MPYM, MPYMU, and MPY_S. + +@item 4 +@opindex wlh3 +The multiply option is set to wlh3: Two 16x16 multipliers, blocking, +sequential. The following instructions are additionally enabled: MPY, +MPYU, MPYM, MPYMU, and MPY_S. + +@item 5 +@opindex wlh4 +The multiply option is set to wlh4: One 16x16 multiplier, blocking, +sequential. The following instructions are additionally enabled: MPY, +MPYU, MPYM, MPYMU, and MPY_S. + +@item 6 +@opindex wlh5 +The multiply option is set to wlh5: One 32x4 multiplier, blocking, +sequential. The following instructions are additionally enabled: MPY, +MPYU, MPYM, MPYMU, and MPY_S. + +@end table + +This option is only available for ARCv2 cores@. + @end table The following options are passed through to the assembler, and also @@ -18965,6 +19034,11 @@ Generate code for 32-bit or 64-bit ABI. Link in code for a __main kernel. This is for stand-alone instead of offloading execution. + +@item -moptimize +@opindex moptimize +Apply partitioned execution optimizations. This is the default when any +level of optimization is selected.
+ @end table @node PDP-11 Options diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 8b437ab8f26..eb76117ca1a 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -11886,16 +11886,16 @@ get_array_ctor_element_at_index (tree ctor, offset_int access_index) offset_int low_bound = 0; if (TREE_CODE (TREE_TYPE (ctor)) == ARRAY_TYPE) - { - tree domain_type = TYPE_DOMAIN (TREE_TYPE (ctor)); - if (domain_type && TYPE_MIN_VALUE (domain_type)) { - /* Static constructors for variably sized objects makes no sense. */ - gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST); - index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type)); - low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type)); + tree domain_type = TYPE_DOMAIN (TREE_TYPE (ctor)); + if (domain_type && TYPE_MIN_VALUE (domain_type)) + { + /* Static constructors for variably sized objects makes no sense. */ + gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST); + index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type)); + low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type)); + } } - } if (index_type) access_index = wi::ext (access_index, TYPE_PRECISION (index_type), @@ -11911,29 +11911,29 @@ get_array_ctor_element_at_index (tree ctor, offset_int access_index) tree cfield, cval; FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), cnt, cfield, cval) - { - /* Array constructor might explicitely set index, or specify range - * or leave index NULL meaning that it is next index after previous - * one. */ - if (cfield) { - if (TREE_CODE (cfield) == INTEGER_CST) - max_index = index = wi::to_offset (cfield); + /* Array constructor might explicitly set index, or specify a range, + or leave index NULL meaning that it is next index after previous + one. 
*/ + if (cfield) + { + if (TREE_CODE (cfield) == INTEGER_CST) + max_index = index = wi::to_offset (cfield); + else + { + gcc_assert (TREE_CODE (cfield) == RANGE_EXPR); + index = wi::to_offset (TREE_OPERAND (cfield, 0)); + max_index = wi::to_offset (TREE_OPERAND (cfield, 1)); + } + } else - { - gcc_assert (TREE_CODE (cfield) == RANGE_EXPR); - index = wi::to_offset (TREE_OPERAND (cfield, 0)); - max_index = wi::to_offset (TREE_OPERAND (cfield, 1)); - } - } - else - { - index += 1; - if (index_type) - index = wi::ext (index, TYPE_PRECISION (index_type), - TYPE_SIGN (index_type)); - max_index = index; - } + { + index += 1; + if (index_type) + index = wi::ext (index, TYPE_PRECISION (index_type), + TYPE_SIGN (index_type)); + max_index = index; + } /* Do we have match? */ if (wi::cmpu (access_index, index) >= 0 diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog index cd4c94e6764..33c541a38d6 100644 --- a/gcc/fortran/ChangeLog +++ b/gcc/fortran/ChangeLog @@ -1,3 +1,8 @@ +2015-11-11 Dominique d'Humieres <dominiq@lps.ens.fr> + + PR fortran/67826 + * openmp.c (gfc_omp_udr_find): Fix typo. + 2015-11-08 Steven G.
Kargl <kargl@gcc.gnu.org> PR fortran/68053 diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c index a7c7a1927e3..4af139a2a17 100644 --- a/gcc/fortran/openmp.c +++ b/gcc/fortran/openmp.c @@ -1820,7 +1820,7 @@ gfc_omp_udr_find (gfc_symtree *st, gfc_typespec *ts) for (omp_udr = st->n.omp_udr; omp_udr; omp_udr = omp_udr->next) if (omp_udr->ts.type == ts->type || ((omp_udr->ts.type == BT_DERIVED || omp_udr->ts.type == BT_CLASS) - && (ts->type == BT_DERIVED && ts->type == BT_CLASS))) + && (ts->type == BT_DERIVED || ts->type == BT_CLASS))) { if (omp_udr->ts.type == BT_DERIVED || omp_udr->ts.type == BT_CLASS) { diff --git a/gcc/gimple-ssa-strength-reduction.c b/gcc/gimple-ssa-strength-reduction.c index ce32ad33e94..b8078230f34 100644 --- a/gcc/gimple-ssa-strength-reduction.c +++ b/gcc/gimple-ssa-strength-reduction.c @@ -2226,12 +2226,11 @@ create_phi_basis (slsr_cand_t c, gimple *from_phi, tree basis_name, int i; tree name, phi_arg; gphi *phi; - vec<tree> phi_args; slsr_cand_t basis = lookup_cand (c->basis); int nargs = gimple_phi_num_args (from_phi); basic_block phi_bb = gimple_bb (from_phi); slsr_cand_t phi_cand = base_cand_from_table (gimple_phi_result (from_phi)); - phi_args.create (nargs); + auto_vec<tree> phi_args (nargs); /* Process each argument of the existing phi that represents conditionally-executed add candidates. */ diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE index f325bb33ecb..d23a6cb5f58 100644 --- a/gcc/go/gofrontend/MERGE +++ b/gcc/go/gofrontend/MERGE @@ -1,4 +1,4 @@ -012ab5cb2ef1c26e8023ce90d3a2bba174da7b30 +e3aef41ce0c5be81e2589e60d9cb0db1516e9e2d The first line of this file holds the git revision number of the last merge done from the gofrontend repository. diff --git a/gcc/optabs.c b/gcc/optabs.c index f9fbfde967d..4ffbc0cdefd 100644 --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -1047,7 +1047,8 @@ expand_binop_directly (machine_mode mode, optab binoptab, /* The mode of the result is different then the mode of the arguments. 
*/ tmp_mode = insn_data[(int) icode].operand[0].mode; - if (GET_MODE_NUNITS (tmp_mode) != 2 * GET_MODE_NUNITS (mode)) + if (VECTOR_MODE_P (mode) + && GET_MODE_NUNITS (tmp_mode) != 2 * GET_MODE_NUNITS (mode)) { delete_insns_since (last); return NULL_RTX; diff --git a/gcc/passes.c b/gcc/passes.c index d3d6e1d76b5..8a283ae8a7a 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -2058,6 +2058,18 @@ verify_curr_properties (function *fn, void *data) gcc_assert ((fn->curr_properties & props) == props); } +/* Release dump file name if set. */ + +static void +release_dump_file_name (void) +{ + if (dump_file_name) + { + free (CONST_CAST (char *, dump_file_name)); + dump_file_name = NULL; + } +} + /* Initialize pass dump file. */ /* This is non-static so that the plugins can use it. */ @@ -2071,6 +2083,7 @@ pass_init_dump_file (opt_pass *pass) gcc::dump_manager *dumps = g->get_dumps (); bool initializing_dump = !dumps->dump_initialized_p (pass->static_pass_number); + release_dump_file_name (); dump_file_name = dumps->get_dump_file_name (pass->static_pass_number); dumps->dump_start (pass->static_pass_number, &dump_flags); if (dump_file && current_function_decl) @@ -2098,11 +2111,7 @@ pass_fini_dump_file (opt_pass *pass) timevar_push (TV_DUMP); /* Flush and close dump file. 
*/ - if (dump_file_name) - { - free (CONST_CAST (char *, dump_file_name)); - dump_file_name = NULL; - } + release_dump_file_name (); g->get_dumps ()->dump_finish (pass->static_pass_number); timevar_pop (TV_DUMP); diff --git a/gcc/regrename.c b/gcc/regrename.c index d727dd9095b..d41410a9348 100644 --- a/gcc/regrename.c +++ b/gcc/regrename.c @@ -1068,7 +1068,9 @@ scan_rtx_reg (rtx_insn *insn, rtx *loc, enum reg_class cl, enum scan_actions act && GET_CODE (pat) == SET && GET_CODE (SET_DEST (pat)) == REG && GET_CODE (SET_SRC (pat)) == REG - && terminated_this_insn) + && terminated_this_insn + && terminated_this_insn->nregs + == REG_NREGS (recog_data.operand[1])) { gcc_assert (terminated_this_insn->regno == REGNO (recog_data.operand[1])); @@ -1593,6 +1595,7 @@ build_def_use (basic_block bb) enum rtx_code set_code = SET; enum rtx_code clobber_code = CLOBBER; insn_rr_info *insn_info = NULL; + terminated_this_insn = NULL; /* Process the insn, determining its effect on the def-use chains and live hard registers. We perform the following @@ -1749,8 +1752,6 @@ build_def_use (basic_block bb) scan_rtx (insn, &XEXP (note, 0), ALL_REGS, mark_read, OP_INOUT); - terminated_this_insn = NULL; - /* Step 4: Close chains for registers that die here, unless the register is mentioned in a REG_UNUSED note. In that case we keep the chain open until step #7 below to ensure diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 1ac009f2613..4637d5fc6a8 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,25 @@ +2015-11-11 Simon Dardis <simon.dardis@imgtec.com> + + * gcc.target/mips/split-ds-sequence.c: New test. + +2015-11-11 Julia Koval <julia.koval@intel.com> + + * g++.dg/ext/mv16.C: New functions. + +2015-11-11 Richard Biener <rguenth@gcc.gnu.org> + Jiong Wang <jiong.wang@arm.com> + + * gcc.dg/tree-ssa/pr68234.c: New testcase. + +2015-11-10 Nathan Sidwell <nathan@codesourcery.com> + + * gcc.dg/goacc/nvptx-opt-1.c: New test. 
+ +2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> + + * gcc.target/i386/mask-pack.c: New test. + * gcc.target/i386/mask-unpack.c: New test. + 2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> * gcc.target/i386/avx2-vec-mask-bit-not.c: New test. diff --git a/gcc/testsuite/g++.dg/ext/mv16.C b/gcc/testsuite/g++.dg/ext/mv16.C index 8992bfc6fc1..a3a0fe804fd 100644 --- a/gcc/testsuite/g++.dg/ext/mv16.C +++ b/gcc/testsuite/g++.dg/ext/mv16.C @@ -44,6 +44,18 @@ foo () return 12; } +int __attribute__ ((target("arch=broadwell"))) foo () { + return 13; +} + +int __attribute__ ((target("arch=skylake"))) foo () { + return 14; +} + +int __attribute__ ((target("arch=skylake-avx512"))) foo () { + return 15; +} + int main () { int val = foo (); @@ -58,6 +70,12 @@ int main () assert (val == 9); else if (__builtin_cpu_is ("haswell")) assert (val == 12); + else if (__builtin_cpu_is ("broadwell")) + assert (val == 13); + else if (__builtin_cpu_is ("skylake")) + assert (val == 14); + else if (__builtin_cpu_is ("skylake-avx512")) + assert (val == 15); else assert (val == 0); diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr68234.c b/gcc/testsuite/gcc.dg/tree-ssa/pr68234.c new file mode 100644 index 00000000000..e7c2a95aa4c --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr68234.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-vrp2" } */ + +extern int nc; +void ff (unsigned long long); + +void +f (void) +{ + unsigned char resp[1024]; + int c; + int bl = 0; + unsigned long long *dwords = (unsigned long long *) (resp + 5); + for (c = 0; c < nc; c++) + { + /* PR middle-end/68234, this signed division should be optimized into + a right shift, as the vrp pass should deduce that the range of 'bl' + falls into positive numbers.
*/ + ff (dwords[bl / 64]); + bl++; + } +} + +/* { dg-final { scan-tree-dump ">> 6" "vrp2" } } */ diff --git a/gcc/testsuite/gcc.target/i386/mask-pack.c b/gcc/testsuite/gcc.target/i386/mask-pack.c new file mode 100644 index 00000000000..0b564ef4284 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mask-pack.c @@ -0,0 +1,100 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512bw -O3 -fopenmp-simd -fdump-tree-vect-details" } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 10 "vect" } } */ +/* { dg-final { scan-assembler-not "maskmov" } } */ + +#define LENGTH 1000 + +long l1[LENGTH], l2[LENGTH]; +int i1[LENGTH], i2[LENGTH]; +short s1[LENGTH], s2[LENGTH]; +char c1[LENGTH], c2[LENGTH]; +double d1[LENGTH], d2[LENGTH]; + +int test1 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (l1[i] > l2[i]) + i1[i] = 1; +} + +int test2 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (i1[i] > i2[i]) + s1[i] = 1; +} + +int test3 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (s1[i] > s2[i]) + c1[i] = 1; +} + +int test4 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (d1[i] > d2[i]) + c1[i] = 1; +} + +int test5 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + i1[i] = l1[i] > l2[i] ? 3 : 4; +} + +int test6 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + s1[i] = i1[i] > i2[i] ? 3 : 4; +} + +int test7 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + c1[i] = s1[i] > s2[i] ? 3 : 4; +} + +int test8 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + c1[i] = d1[i] > d2[i] ? 
3 : 4; +} + +int test9 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (l1[i] > l2[i] && i1[i] < i2[i]) + c1[i] = 1; +} + +int test10 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (l1[i] > l2[i] && i1[i] < i2[i]) + c1[i] = 1; + else + c1[i] = 2; +} diff --git a/gcc/testsuite/gcc.target/i386/mask-unpack.c b/gcc/testsuite/gcc.target/i386/mask-unpack.c new file mode 100644 index 00000000000..5905e1cf00f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mask-unpack.c @@ -0,0 +1,100 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512bw -mavx512dq -O3 -fopenmp-simd -fdump-tree-vect-details" } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 10 "vect" } } */ +/* { dg-final { scan-assembler-not "maskmov" } } */ + +#define LENGTH 1000 + +long l1[LENGTH], l2[LENGTH]; +int i1[LENGTH], i2[LENGTH]; +short s1[LENGTH], s2[LENGTH]; +char c1[LENGTH], c2[LENGTH]; +double d1[LENGTH], d2[LENGTH]; + +int test1 () +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (i1[i] > i2[i]) + l1[i] = 1; +} + +int test2 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + if (s1[i] > s2[i]) + i1[i] = 1; +} + +int test3 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + if (c1[i] > c2[i]) + s1[i] = 1; +} + +int test4 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + if (c1[i] > c2[i]) + d1[i] = 1; +} + +int test5 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + l1[i] = i1[i] > i2[i] ? 1 : 2; +} + +int test6 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + i1[i] = s1[i] > s2[i] ? 1 : 2; +} + +int test7 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + s1[i] = c1[i] > c2[i] ? 
1 : 2; +} + +int test8 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + d1[i] = c1[i] > c2[i] ? 1 : 2; +} + +int test9 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (c1[i] > c2[i] && i1[i] < i2[i]) + l1[i] = 1; +} + +int test10 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (c1[i] > c2[i] && i1[i] < i2[i]) + l1[i] = 1; + else + l1[i] = 2; +} diff --git a/gcc/testsuite/gcc.target/mips/split-ds-sequence.c b/gcc/testsuite/gcc.target/mips/split-ds-sequence.c new file mode 100644 index 00000000000..e60270db304 --- /dev/null +++ b/gcc/testsuite/gcc.target/mips/split-ds-sequence.c @@ -0,0 +1,19 @@ +/* { dg-options "isa_rev>=6" } */ +/* { dg-skip-if "code quality test" { *-*-* } { "-mcompact-branches=never" } { "" } } */ +/* { dg-final { scan-assembler-not "nop" } } */ + +int +testg2 (int a, int c) +{ + + int j = 0; + do + { + j += a; + } + while (j < 56); + + j += c; + return j; + +} diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 30aee19aae7..2835c993588 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -4996,9 +4996,9 @@ convert_callers_for_node (struct cgraph_node *node, if (dump_file) fprintf (dump_file, "Adjusting call %s/%i -> %s/%i\n", - xstrdup (cs->caller->name ()), + xstrdup_for_dump (cs->caller->name ()), cs->caller->order, - xstrdup (cs->callee->name ()), + xstrdup_for_dump (cs->callee->name ()), cs->callee->order); ipa_modify_call_arguments (cs, cs->call_stmt, *adjustments); diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index cbf0073ffcf..55e53093caa 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -492,20 +492,27 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo) } } - /* The vectorization factor is according to the smallest - scalar type (or the largest vector size, but we only - support one vector size per loop). 
*/ - if (!bool_result) - scalar_type = vect_get_smallest_scalar_type (stmt, &dummy, - &dummy); - if (dump_enabled_p ()) + /* Don't try to compute VF out of scalar types if the stmt + produces a boolean vector. Use result vectype instead. */ + if (VECTOR_BOOLEAN_TYPE_P (vectype)) + vf_vectype = vectype; + else { - dump_printf_loc (MSG_NOTE, vect_location, - "get vectype for scalar type: "); - dump_generic_expr (MSG_NOTE, TDF_SLIM, scalar_type); - dump_printf (MSG_NOTE, "\n"); + /* The vectorization factor is according to the smallest + scalar type (or the largest vector size, but we only + support one vector size per loop). */ + if (!bool_result) + scalar_type = vect_get_smallest_scalar_type (stmt, &dummy, + &dummy); + if (dump_enabled_p ()) + { + dump_printf_loc (MSG_NOTE, vect_location, + "get vectype for scalar type: "); + dump_generic_expr (MSG_NOTE, TDF_SLIM, scalar_type); + dump_printf (MSG_NOTE, "\n"); + } + vf_vectype = get_vectype_for_scalar_type (scalar_type); } - vf_vectype = get_vectype_for_scalar_type (scalar_type); if (!vf_vectype) { if (dump_enabled_p ()) diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c index e91c6e008a0..4e1d2dbe858 100644 --- a/gcc/tree-vect-patterns.c +++ b/gcc/tree-vect-patterns.c @@ -66,6 +66,7 @@ static gimple *vect_recog_mult_pattern (vec<gimple *> *, static gimple *vect_recog_mixed_size_cond_pattern (vec<gimple *> *, tree *, tree *); static gimple *vect_recog_bool_pattern (vec<gimple *> *, tree *, tree *); +static gimple *vect_recog_mask_conversion_pattern (vec<gimple *> *, tree *, tree *); static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = { vect_recog_widen_mult_pattern, vect_recog_widen_sum_pattern, @@ -79,7 +80,8 @@ static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = { vect_recog_divmod_pattern, vect_recog_mult_pattern, vect_recog_mixed_size_cond_pattern, - vect_recog_bool_pattern}; + vect_recog_bool_pattern, + vect_recog_mask_conversion_pattern}; static inline void
append_pattern_def_seq (stmt_vec_info stmt_info, gimple *stmt) @@ -3152,7 +3154,7 @@ search_type_for_mask (tree var, vec_info *vinfo) enum vect_def_type dt; tree rhs1; enum tree_code rhs_code; - tree res = NULL_TREE; + tree res = NULL_TREE, res2; if (TREE_CODE (var) != SSA_NAME) return NULL_TREE; @@ -3185,13 +3187,26 @@ search_type_for_mask (tree var, vec_info *vinfo) case BIT_AND_EXPR: case BIT_IOR_EXPR: case BIT_XOR_EXPR: - if (!(res = search_type_for_mask (rhs1, vinfo))) - res = search_type_for_mask (gimple_assign_rhs2 (def_stmt), vinfo); + res = search_type_for_mask (rhs1, vinfo); + res2 = search_type_for_mask (gimple_assign_rhs2 (def_stmt), vinfo); + if (!res || (res2 && TYPE_PRECISION (res) > TYPE_PRECISION (res2))) + res = res2; break; default: if (TREE_CODE_CLASS (rhs_code) == tcc_comparison) { + tree comp_vectype, mask_type; + + comp_vectype = get_vectype_for_scalar_type (TREE_TYPE (rhs1)); + if (comp_vectype == NULL_TREE) + return NULL_TREE; + + mask_type = get_mask_type_for_scalar_type (TREE_TYPE (rhs1)); + if (!mask_type + || !expand_vec_cmp_expr_p (comp_vectype, mask_type)) + return NULL_TREE; + if (TREE_CODE (TREE_TYPE (rhs1)) != INTEGER_TYPE || !TYPE_UNSIGNED (TREE_TYPE (rhs1))) { @@ -3461,6 +3476,255 @@ vect_recog_bool_pattern (vec<gimple *> *stmts, tree *type_in, } +/* A helper for vect_recog_mask_conversion_pattern. Build + conversion of MASK to a type suitable for masking VECTYPE. + Built statement gets required vectype and is appended to + a pattern sequence of STMT_VINFO. + + Return converted mask. 
*/ + +static tree +build_mask_conversion (tree mask, tree vectype, stmt_vec_info stmt_vinfo, + vec_info *vinfo) +{ + gimple *stmt; + tree masktype, tmp; + stmt_vec_info new_stmt_info; + + masktype = build_same_sized_truth_vector_type (vectype); + tmp = vect_recog_temp_ssa_var (TREE_TYPE (masktype), NULL); + stmt = gimple_build_assign (tmp, CONVERT_EXPR, mask); + new_stmt_info = new_stmt_vec_info (stmt, vinfo); + set_vinfo_for_stmt (stmt, new_stmt_info); + STMT_VINFO_VECTYPE (new_stmt_info) = masktype; + append_pattern_def_seq (stmt_vinfo, stmt); + + return tmp; +} + + +/* Function vect_recog_mask_conversion_pattern + + Try to find statements which require boolean type + conversion. Additional conversion statements are + added to handle such cases. For example: + + bool m_1, m_2, m_3; + int i_4, i_5; + double d_6, d_7; + char c_1, c_2, c_3; + + S1 m_1 = i_4 > i_5; + S2 m_2 = d_6 < d_7; + S3 m_3 = m_1 & m_2; + S4 c_1 = m_3 ? c_2 : c_3; + + Will be transformed into: + + S1 m_1 = i_4 > i_5; + S2 m_2 = d_6 < d_7; + S3'' m_2' = (_Bool[bitsize=32])m_2 + S3' m_3' = m_1 & m_2'; + S4'' m_3'' = (_Bool[bitsize=8])m_3' + S4' c_1' = m_3'' ? c_2 : c_3; */ + +static gimple * +vect_recog_mask_conversion_pattern (vec<gimple *> *stmts, tree *type_in, + tree *type_out) +{ + gimple *last_stmt = stmts->pop (); + enum tree_code rhs_code; + tree lhs, rhs1, rhs2, tmp, rhs1_type, rhs2_type, vectype1, vectype2; + stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); + stmt_vec_info pattern_stmt_info; + vec_info *vinfo = stmt_vinfo->vinfo; + gimple *pattern_stmt; + + /* Check for MASK_LOAD and MASK_STORE calls requiring mask conversion.
*/ + if (is_gimple_call (last_stmt) + && gimple_call_internal_p (last_stmt) + && (gimple_call_internal_fn (last_stmt) == IFN_MASK_STORE + || gimple_call_internal_fn (last_stmt) == IFN_MASK_LOAD)) + { + bool load = (gimple_call_internal_fn (last_stmt) == IFN_MASK_LOAD); + + if (load) + { + lhs = gimple_call_lhs (last_stmt); + vectype1 = get_vectype_for_scalar_type (TREE_TYPE (lhs)); + } + else + { + rhs2 = gimple_call_arg (last_stmt, 3); + vectype1 = get_vectype_for_scalar_type (TREE_TYPE (rhs2)); + } + + rhs1 = gimple_call_arg (last_stmt, 2); + rhs1_type = search_type_for_mask (rhs1, vinfo); + if (!rhs1_type) + return NULL; + vectype2 = get_mask_type_for_scalar_type (rhs1_type); + + if (!vectype1 || !vectype2 + || TYPE_VECTOR_SUBPARTS (vectype1) == TYPE_VECTOR_SUBPARTS (vectype2)) + return NULL; + + tmp = build_mask_conversion (rhs1, vectype1, stmt_vinfo, vinfo); + + if (load) + { + lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL); + pattern_stmt + = gimple_build_call_internal (IFN_MASK_LOAD, 3, + gimple_call_arg (last_stmt, 0), + gimple_call_arg (last_stmt, 1), + tmp); + gimple_call_set_lhs (pattern_stmt, lhs); + } + else + pattern_stmt + = gimple_build_call_internal (IFN_MASK_STORE, 4, + gimple_call_arg (last_stmt, 0), + gimple_call_arg (last_stmt, 1), + tmp, + gimple_call_arg (last_stmt, 3)); + + + pattern_stmt_info = new_stmt_vec_info (pattern_stmt, vinfo); + set_vinfo_for_stmt (pattern_stmt, pattern_stmt_info); + STMT_VINFO_DATA_REF (pattern_stmt_info) + = STMT_VINFO_DATA_REF (stmt_vinfo); + STMT_VINFO_DR_BASE_ADDRESS (pattern_stmt_info) + = STMT_VINFO_DR_BASE_ADDRESS (stmt_vinfo); + STMT_VINFO_DR_INIT (pattern_stmt_info) = STMT_VINFO_DR_INIT (stmt_vinfo); + STMT_VINFO_DR_OFFSET (pattern_stmt_info) + = STMT_VINFO_DR_OFFSET (stmt_vinfo); + STMT_VINFO_DR_STEP (pattern_stmt_info) = STMT_VINFO_DR_STEP (stmt_vinfo); + STMT_VINFO_DR_ALIGNED_TO (pattern_stmt_info) + = STMT_VINFO_DR_ALIGNED_TO (stmt_vinfo); + DR_STMT (STMT_VINFO_DATA_REF (stmt_vinfo)) = 
pattern_stmt; + + *type_out = vectype1; + *type_in = vectype1; + stmts->safe_push (last_stmt); + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_recog_mask_conversion_pattern: detected:\n"); + + return pattern_stmt; + } + + if (!is_gimple_assign (last_stmt)) + return NULL; + + lhs = gimple_assign_lhs (last_stmt); + rhs1 = gimple_assign_rhs1 (last_stmt); + rhs_code = gimple_assign_rhs_code (last_stmt); + + /* Check for cond expression requiring mask conversion. */ + if (rhs_code == COND_EXPR) + { + /* vect_recog_mixed_size_cond_pattern could apply. + Do nothing then. */ + if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo)) + return NULL; + + vectype1 = get_vectype_for_scalar_type (TREE_TYPE (lhs)); + + if (TREE_CODE (rhs1) == SSA_NAME) + { + rhs1_type = search_type_for_mask (rhs1, vinfo); + if (!rhs1_type) + return NULL; + } + else + rhs1_type = TREE_TYPE (TREE_OPERAND (rhs1, 0)); + + vectype2 = get_mask_type_for_scalar_type (rhs1_type); + + if (!vectype1 || !vectype2 + || TYPE_VECTOR_SUBPARTS (vectype1) == TYPE_VECTOR_SUBPARTS (vectype2)) + return NULL; + + /* If rhs1 is a comparison we need to move it into a + separate statement. 
*/ + if (TREE_CODE (rhs1) != SSA_NAME) + { + tmp = vect_recog_temp_ssa_var (TREE_TYPE (rhs1), NULL); + pattern_stmt = gimple_build_assign (tmp, rhs1); + rhs1 = tmp; + + pattern_stmt_info = new_stmt_vec_info (pattern_stmt, vinfo); + set_vinfo_for_stmt (pattern_stmt, pattern_stmt_info); + STMT_VINFO_VECTYPE (pattern_stmt_info) = vectype2; + append_pattern_def_seq (stmt_vinfo, pattern_stmt); + } + + tmp = build_mask_conversion (rhs1, vectype1, stmt_vinfo, vinfo); + + lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL); + pattern_stmt = gimple_build_assign (lhs, COND_EXPR, tmp, + gimple_assign_rhs2 (last_stmt), + gimple_assign_rhs3 (last_stmt)); + + *type_out = vectype1; + *type_in = vectype1; + stmts->safe_push (last_stmt); + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_recog_mask_conversion_pattern: detected:\n"); + + return pattern_stmt; + } + + /* Now check for binary boolean operations requiring conversion for + one of operands. */ + if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE) + return NULL; + + if (rhs_code != BIT_IOR_EXPR + && rhs_code != BIT_XOR_EXPR + && rhs_code != BIT_AND_EXPR) + return NULL; + + rhs2 = gimple_assign_rhs2 (last_stmt); + + rhs1_type = search_type_for_mask (rhs1, vinfo); + rhs2_type = search_type_for_mask (rhs2, vinfo); + + if (!rhs1_type || !rhs2_type + || TYPE_PRECISION (rhs1_type) == TYPE_PRECISION (rhs2_type)) + return NULL; + + if (TYPE_PRECISION (rhs1_type) < TYPE_PRECISION (rhs2_type)) + { + vectype1 = get_mask_type_for_scalar_type (rhs1_type); + if (!vectype1) + return NULL; + rhs2 = build_mask_conversion (rhs2, vectype1, stmt_vinfo, vinfo); + } + else + { + vectype1 = get_mask_type_for_scalar_type (rhs2_type); + if (!vectype1) + return NULL; + rhs1 = build_mask_conversion (rhs1, vectype1, stmt_vinfo, vinfo); + } + + lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL); + pattern_stmt = gimple_build_assign (lhs, rhs_code, rhs1, rhs2); + + *type_out = vectype1; + *type_in = vectype1; + 
 	  stmts->safe_push (last_stmt);
+  if (dump_enabled_p ())
+    dump_printf_loc (MSG_NOTE, vect_location,
+                     "vect_recog_mask_conversion_pattern: detected:\n");
+
+  return pattern_stmt;
+}
+
+
 /* Mark statements that are involved in a pattern. */
 
 static inline void
@@ -3556,7 +3820,8 @@ vect_pattern_recog_1 (vect_recog_func_ptr vect_recog_func,
   stmt_info = vinfo_for_stmt (stmt);
   loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
 
-  if (VECTOR_MODE_P (TYPE_MODE (type_in)))
+  if (VECTOR_BOOLEAN_TYPE_P (type_in)
+      || VECTOR_MODE_P (TYPE_MODE (type_in)))
     {
       /* No need to check target support (already checked by the
          pattern recognition function). */
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index bdf16faff79..e6a320b341e 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1974,6 +1974,11 @@ vectorizable_mask_load_store (gimple *stmt, gimple_stmt_iterator *gsi,
       /* Ensure that even with -fno-tree-dce the scalar MASK_LOAD is removed
          from the IL.  */
+      if (STMT_VINFO_RELATED_STMT (stmt_info))
+        {
+          stmt = STMT_VINFO_RELATED_STMT (stmt_info);
+          stmt_info = vinfo_for_stmt (stmt);
+        }
       tree lhs = gimple_call_lhs (stmt);
       new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
       set_vinfo_for_stmt (new_stmt, stmt_info);
@@ -2092,6 +2097,11 @@ vectorizable_mask_load_store (gimple *stmt, gimple_stmt_iterator *gsi,
     {
       /* Ensure that even with -fno-tree-dce the scalar MASK_LOAD is removed
          from the IL.  */
+      if (STMT_VINFO_RELATED_STMT (stmt_info))
+        {
+          stmt = STMT_VINFO_RELATED_STMT (stmt_info);
+          stmt_info = vinfo_for_stmt (stmt);
+        }
       tree lhs = gimple_call_lhs (stmt);
       new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
       set_vinfo_for_stmt (new_stmt, stmt_info);
@@ -3565,12 +3575,13 @@ vectorizable_conversion (gimple *stmt, gimple_stmt_iterator *gsi,
             && SCALAR_FLOAT_TYPE_P (rhs_type))))
     return false;
 
-  if ((INTEGRAL_TYPE_P (lhs_type)
-       && (TYPE_PRECISION (lhs_type)
-           != GET_MODE_PRECISION (TYPE_MODE (lhs_type))))
-      || (INTEGRAL_TYPE_P (rhs_type)
-          && (TYPE_PRECISION (rhs_type)
-              != GET_MODE_PRECISION (TYPE_MODE (rhs_type)))))
+  if (!VECTOR_BOOLEAN_TYPE_P (vectype_out)
+      && ((INTEGRAL_TYPE_P (lhs_type)
+           && (TYPE_PRECISION (lhs_type)
+               != GET_MODE_PRECISION (TYPE_MODE (lhs_type))))
+          || (INTEGRAL_TYPE_P (rhs_type)
+              && (TYPE_PRECISION (rhs_type)
+                  != GET_MODE_PRECISION (TYPE_MODE (rhs_type))))))
     {
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -3628,6 +3639,21 @@ vectorizable_conversion (gimple *stmt, gimple_stmt_iterator *gsi,
       return false;
     }
 
+  if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
+      && !VECTOR_BOOLEAN_TYPE_P (vectype_in))
+    {
+      if (dump_enabled_p ())
+        {
+          dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                           "can't convert between boolean and non "
+                           "boolean vectors");
+          dump_generic_expr (MSG_MISSED_OPTIMIZATION, TDF_SLIM, rhs_type);
+          dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
+        }
+
+      return false;
+    }
+
   nunits_in = TYPE_VECTOR_SUBPARTS (vectype_in);
   nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
   if (nunits_in < nunits_out)
@@ -8170,7 +8196,7 @@ free_stmt_vec_info (gimple *stmt)
       gimple *patt_stmt = STMT_VINFO_STMT (patt_info);
       gimple_set_bb (patt_stmt, NULL);
       tree lhs = gimple_get_lhs (patt_stmt);
-      if (TREE_CODE (lhs) == SSA_NAME)
+      if (lhs && TREE_CODE (lhs) == SSA_NAME)
         release_ssa_name (lhs);
       if (seq)
         {
@@ -8180,7 +8206,7 @@ free_stmt_vec_info (gimple *stmt)
           gimple *seq_stmt = gsi_stmt (si);
           gimple_set_bb (seq_stmt, NULL);
           lhs = gimple_get_lhs (seq_stmt);
-          if (TREE_CODE (lhs) == SSA_NAME)
+          if (lhs && TREE_CODE (lhs) == SSA_NAME)
             release_ssa_name (lhs);
           free_stmt_vec_info (seq_stmt);
         }
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 9bf7949e081..45c2d9bbbe1 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1087,7 +1087,7 @@ extern gimple *vect_find_last_scalar_stmt_in_slp (slp_tree);
    Additional pattern recognition functions can (and will) be added
    in the future. */
 typedef gimple *(* vect_recog_func_ptr) (vec<gimple *> *, tree *, tree *);
-#define NUM_PATTERNS 13
+#define NUM_PATTERNS 14
 void vect_pattern_recog (vec_info *);
 
 /* In tree-vectorizer.c. */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index cb43430ecb1..b0f6c78d734 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -8810,20 +8810,11 @@ vrp_visit_phi_node (gphi *phi)
       /* If we dropped either bound to +-INF then if this is a loop
          PHI node SCEV may known more about its value-range. */
-      if ((cmp_min > 0 || cmp_min < 0
+      if (cmp_min > 0 || cmp_min < 0
           || cmp_max < 0 || cmp_max > 0)
-          && (l = loop_containing_stmt (phi))
-          && l->header == gimple_bb (phi))
-        adjust_range_with_scev (&vr_result, l, phi, lhs);
-
-      /* If we will end up with a (-INF, +INF) range, set it to
-         VARYING.  Same if the previous max value was invalid for
-         the type and we end up with vr_result.min > vr_result.max. */
-      if ((vrp_val_is_max (vr_result.max)
-           && vrp_val_is_min (vr_result.min))
-          || compare_values (vr_result.min,
-                             vr_result.max) > 0)
-        goto varying;
+        goto scev_check;
+
+      goto infinite_check;
     }
 
   /* If the new range is different than the previous value, keep
@@ -8849,8 +8840,28 @@ update_range:
   /* Nothing changed, don't add outgoing edges. */
   return SSA_PROP_NOT_INTERESTING;
 
-  /* No match found.  Set the LHS to VARYING. */
 varying:
+  set_value_range_to_varying (&vr_result);
+
+scev_check:
+  /* If this is a loop PHI node SCEV may known more about its value-range.
+     scev_check can be reached from two paths, one is a fall through from above
+     "varying" label, the other is direct goto from code block which tries to
+     avoid infinite simulation. */
+  if ((l = loop_containing_stmt (phi))
+      && l->header == gimple_bb (phi))
+    adjust_range_with_scev (&vr_result, l, phi, lhs);
+
+infinite_check:
+  /* If we will end up with a (-INF, +INF) range, set it to
+     VARYING.  Same if the previous max value was invalid for
+     the type and we end up with vr_result.min > vr_result.max. */
+  if ((vr_result.type == VR_RANGE || vr_result.type == VR_ANTI_RANGE)
+      && !((vrp_val_is_max (vr_result.max) && vrp_val_is_min (vr_result.min))
+           || compare_values (vr_result.min, vr_result.max) > 0))
+    goto update_range;
+
+  /* No match found.  Set the LHS to VARYING. */
   set_value_range_to_varying (lhs_vr);
   return SSA_PROP_VARYING;
 }
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index de2674058f5..c8be4e8b722 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -9814,7 +9814,7 @@ vt_initialize (void)
 
   alloc_aux_for_blocks (sizeof (variable_tracking_info));
 
-  empty_shared_hash = new shared_hash;
+  empty_shared_hash = shared_hash_pool.allocate ();
  empty_shared_hash->refcount = 1;
   empty_shared_hash->htab = new variable_table_type (1);
   changed_variables = new variable_table_type (10);
diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index 13ed8133857..549d0ac3990 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,19 @@
+2015-11-11  Claudiu Zissulescu  <claziss@synopsys.com>
+
+	* config/arc/dp-hack.h: Add support for ARCHS.
+	* config/arc/ieee-754/divdf3.S: Likewise.
+	* config/arc/ieee-754/divsf3-stdmul.S: Likewise.
+	* config/arc/ieee-754/muldf3.S: Likewise.
+	* config/arc/ieee-754/mulsf3.S: Likewise
+	* config/arc/lib1funcs.S: Likewise
+	* config/arc/gmon/dcache_linesz.S: Don't read the build register
+	for ARCv2 cores.
+	* config/arc/gmon/profil.S (__profil, __profil_irq): Don't profile
+	for ARCv2 cores.
+	* config/arc/ieee-754/arc-ieee-754.h (MPYHU, MPYH): Define.
+	* config/arc/t-arc700-uClibc: Remove hard selection for ARC 700
+	cores.
+
 2015-11-09  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
 
 	* config/ia64/crtbegin.S: Check HAVE_INITFINI_ARRAY_SUPPORT
diff --git a/libgcc/config/arc/dp-hack.h b/libgcc/config/arc/dp-hack.h
index c1ab9b2294e..a212e3b8b60 100644
--- a/libgcc/config/arc/dp-hack.h
+++ b/libgcc/config/arc/dp-hack.h
@@ -48,7 +48,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #define L_mul_df
 #define L_div_df
 #elif (!defined (__ARC700__) && !defined (__ARC_MUL64__) \
-       && !defined(__ARC_MUL32BY16__))
+       && !defined (__ARC_MUL32BY16__) && !defined (__HS__))
 #define L_mul_df
 #define L_div_df
 #undef QUIET_NAN
diff --git a/libgcc/config/arc/gmon/dcache_linesz.S b/libgcc/config/arc/gmon/dcache_linesz.S
index 8cf64426aca..972a5879fed 100644
--- a/libgcc/config/arc/gmon/dcache_linesz.S
+++ b/libgcc/config/arc/gmon/dcache_linesz.S
@@ -38,6 +38,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 	.global __dcache_linesz
 	.balign 4
 __dcache_linesz:
+#if !defined (__EM__) && !defined (__HS__)
 	lr r12,[D_CACHE_BUILD]
 	extb_s r0,r12
 	breq_s r0,0,.Lsz_nocache
@@ -51,5 +52,6 @@ __dcache_linesz:
 	asl_s r0,r0,r12
 	j_s [blink]
 .Lsz_nocache:
+#endif /* !__EM__ && !__HS__ */
 	mov_s r0,1
 	j_s [blink]
diff --git a/libgcc/config/arc/gmon/profil.S b/libgcc/config/arc/gmon/profil.S
index 3be2869c924..df10dbd6af7 100644
--- a/libgcc/config/arc/gmon/profil.S
+++ b/libgcc/config/arc/gmon/profil.S
@@ -45,6 +45,7 @@ __profil_offset:
 	.global __dcache_linesz
 	.global __profil
 	FUNC(__profil)
+#if !defined (__EM__) && !defined (__HS__)
 .Lstop_profiling:
 	sr r0,[CONTROL0]
 	j_s [blink]
@@ -107,6 +108,12 @@ nocache:
 	j_s [blink]
 	.balign 4
 1:	j __profil_irq
+#else
+__profil:
+	.balign 4
+	mov_s r0,-1
+	j_s [blink]
+#endif /* !__EM__ && !__HS__ */
 	ENDFUNC(__profil)
 
 	FUNC(__profil_irq)
@@ -114,6 +121,7 @@ nocache:
 	.balign 32,0,12 ; make sure the code spans no more that two cache lines
 	nop_s
__profil_irq:
+#if !defined (__EM__) && !defined (__HS__)
 	push_s r0
 	ld r0,[__profil_offset]
 	push_s r1
@@ -128,6 +136,9 @@ __profil_irq:
 nostore:ld.ab r2,[sp,8]
 	pop_s r0
 	j.f [ilink1]
+#else
+	rtie
+#endif /* !__EM__ && !__HS__ */
 	ENDFUNC(__profil_irq)
 
 ; could save one cycle if the counters were allocated at link time and
diff --git a/libgcc/config/arc/ieee-754/arc-ieee-754.h b/libgcc/config/arc/ieee-754/arc-ieee-754.h
index 08a14a6f429..f1ac98e4278 100644
--- a/libgcc/config/arc/ieee-754/arc-ieee-754.h
+++ b/libgcc/config/arc/ieee-754/arc-ieee-754.h
@@ -54,3 +54,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define bmsk_l bmsk
 #define bxor_l bxor
 #define bcs_s blo_s
+#if defined (__HS__) || defined (__EM__)
+#define MPYHU mpymu
+#define MPYH mpym
+#else
+#define MPYHU mpyhu
+#define MPYH mpyh
+#endif
diff --git a/libgcc/config/arc/ieee-754/divdf3.S b/libgcc/config/arc/ieee-754/divdf3.S
index 2d000e40a04..27705ed5909 100644
--- a/libgcc/config/arc/ieee-754/divdf3.S
+++ b/libgcc/config/arc/ieee-754/divdf3.S
@@ -118,7 +118,7 @@ __divdf3_support: /* This label makes debugger output saner. */
 	sub r11,r11,11
 	asl DBL1L,DBL1L,r11
 	sub r11,r11,1
-	mpyhu r5,r4,r8
+	MPYHU r5,r4,r8
 	sub r7,r7,r11
 	asl r4,r4,12
 	b.d .Lpast_denorm_dbl1
@@ -189,25 +189,33 @@ __divdf3:
 	asl r8,DBL1H,12
 	lsr r12,DBL1L,20
 	lsr r4,r8,26
+#ifdef __HS__
+	add3 r10,pcl,60 ; (.Ldivtab-.) >> 3
+#else
 	add3 r10,pcl,59 ; (.Ldivtab-.) >> 3
+#endif
 	ld.as r4,[r10,r4]
+#ifdef __HS__
+	ld.as r9,[pcl,182]; [pcl,(-((.-.L7ff00000) >> 2))] ; 0x7ff00000
+#else
 	ld.as r9,[pcl,180]; [pcl,(-((.-.L7ff00000) >> 2))] ; 0x7ff00000
+#endif
 	or r8,r8,r12
-	mpyhu r5,r4,r8
+	MPYHU r5,r4,r8
 	and.f r7,DBL1H,r9
 	asl r4,r4,12 ; having the asl here is a concession to the XMAC pipeline.
 	beq.d .Ldenorm_dbl1
 	and r6,DBL0H,r9
 .Lpast_denorm_dbl1:
 	; wb stall
 	sub r4,r4,r5
-	mpyhu r5,r4,r4
+	MPYHU r5,r4,r4
 	breq.d r6,0,.Ldenorm_dbl0
 	lsr r8,r8,1
 	asl r12,DBL0H,11
 	lsr r10,DBL0L,21
.Lpast_denorm_dbl0:
 	; wb stall
 	bset r8,r8,31
-	mpyhu r11,r5,r8
+	MPYHU r11,r5,r8
 	add_s r12,r12,r10
 	bset r5,r12,31
 	cmp r5,r8
@@ -215,7 +223,7 @@ __divdf3:
 	; wb stall
 	lsr.cc r5,r5,1
 	sub r4,r4,r11 ; u1.31 inverse, about 30 bit
-	mpyhu r11,r5,r4 ; result fraction highpart
+	MPYHU r11,r5,r4 ; result fraction highpart
 	breq r7,r9,.Linf_nan_dbl1
 	lsr r8,r8,2 ; u3.29
 	add r5,r6, /* wait for immediate / XMAC wb stall */ \
@@ -226,7 +234,7 @@ __divdf3:
 	asl_s DBL1L,DBL1L,9 ; u-29.23:9
 	sbc r6,r5,r7 ; resource conflict (not for XMAC)
-	mpyhu r5,r11,DBL1L ; u-28.23:9
+	MPYHU r5,r11,DBL1L ; u-28.23:9
 	add.cs DBL0L,DBL0L,DBL0L
 	asl_s DBL0L,DBL0L,6 ; u-26.25:7
 	asl r10,r11,23
@@ -234,7 +242,7 @@ __divdf3:
 	; wb stall (before 'and' for XMAC)
 	lsr r7,r11,9
 	sub r5,DBL0L,r5 ; rest msw ; u-26.31:0
-	mpyh r12,r5,r4 ; result fraction lowpart
+	MPYH r12,r5,r4 ; result fraction lowpart
 	xor.f 0,DBL0H,DBL1H
 	and DBL0H,r6,r9
 	add_s DBL0H,DBL0H,r7 ; (XMAC wb stall)
@@ -261,7 +269,7 @@ __divdf3:
 	sub.cs DBL0H,DBL0H,1
 	sub.f r12,r12,2 ; resource conflict (not for XMAC)
-	mpyhu r7,r12,DBL1L ; u-51.32
+	MPYHU r7,r12,DBL1L ; u-51.32
 	asl r5,r5,25 ; s-51.7:25
 	lsr r10,r10,7 ; u-51.30:2
 	; resource conflict (not for XMAC)
@@ -291,10 +299,21 @@ __divdf3:
 	rsub r7,r6,5
 	asr r10,r12,28
 	bmsk r4,r12,27
+#ifdef __HS__
+	min r7, r7, 31
+	asr DBL0L, r4, r7
+#else
 	asrs DBL0L,r4,r7
+#endif
 	add DBL1H,r11,r10
+#ifdef __HS__
+	abs.f r10, r4
+	sub.mi r10, r10, 1
+#endif
 	add.f r7,r6,32-5
+#ifdef __ARC700__
 	abss r10,r4
+#endif
 	asl r4,r4,r7
 	mov.mi r4,r10
 	add.f r10,r6,23
@@ -319,7 +338,7 @@ __divdf3:
 	and r9,DBL0L,1 ; tie-breaker: round to even
 	lsr r11,r11,7 ; u-51.30:2
 	; resource conflict (not for XMAC)
-	mpyhu r8,r12,DBL1L ; u-51.32
+	MPYHU r8,r12,DBL1L ; u-51.32
 	sub.mi r11,r11,DBL1L ; signed multiply adjust for r12*DBL1L
 	add_s DBL1H,DBL1H,r11 ; resource conflict (not for XMAC)
diff --git a/libgcc/config/arc/ieee-754/divsf3-stdmul.S b/libgcc/config/arc/ieee-754/divsf3-stdmul.S
index 09861d3318c..f13944ae11a 100644
--- a/libgcc/config/arc/ieee-754/divsf3-stdmul.S
+++ b/libgcc/config/arc/ieee-754/divsf3-stdmul.S
@@ -144,7 +144,7 @@ __divsf3_support: /* This label makes debugger output saner. */
 	ld.as r5,[r3,r5]
 	add r4,r6,r6 ; load latency
-	mpyhu r7,r5,r4
+	MPYHU r7,r5,r4
 	bic.ne.f 0, \
 	0x60000000,r0 ; large number / denorm -> Inf
 	beq_s .Linf_NaN
@@ -152,7 +152,7 @@ __divsf3_support: /* This label makes debugger output saner. */
 	; wb stall
 	; slow track
 	sub r7,r5,r7
-	mpyhu r8,r7,r6
+	MPYHU r8,r7,r6
 	asl_s r12,r12,23
 	and.f r2,r0,r9
 	add r2,r2,r12
@@ -160,7 +160,7 @@ __divsf3_support: /* This label makes debugger output saner. */
 	; wb stall
 	bne.d .Lpast_denorm_fp1
.Ldenorm_fp0:
-	mpyhu r8,r8,r7
+	MPYHU r8,r8,r7
 	bclr r12,r12,31
 	norm.f r3,r12 ; flag for 0/x -> 0 check
 	bic.ne.f 0,0x60000000,r1 ; denorm/large number -> 0
@@ -209,7 +209,7 @@ __divsf3:
 	ld.as r5,[r3,r2]
 	asl r4,r1,9
 	ld.as r9,[pcl,-114]; [pcl,(-((.-.L7f800000) >> 2))] ; 0x7f800000
-	mpyhu r7,r5,r4
+	MPYHU r7,r5,r4
 	asl r6,r1,8
 	and.f r11,r1,r9
 	bset r6,r6,31
@@ -217,14 +217,14 @@ __divsf3:
 	; wb stall
 	beq .Ldenorm_fp1
 	sub r7,r5,r7
-	mpyhu r8,r7,r6
+	MPYHU r8,r7,r6
 	breq.d r11,r9,.Linf_nan_fp1
 	and.f r2,r0,r9
 	beq.d .Ldenorm_fp0
 	asl r12,r0,8
 	; wb stall
 	breq r2,r9,.Linf_nan_fp0
-	mpyhu r8,r8,r7
+	MPYHU r8,r8,r7
.Lpast_denorm_fp1:
 	bset r3,r12,31
.Lpast_denorm_fp0:
@@ -234,7 +234,7 @@ __divsf3:
 	/* wb stall */ \
 	0x3f000000
 	sub r7,r7,r8 ; u1.31 inverse, about 30 bit
-	mpyhu r3,r3,r7
+	MPYHU r3,r3,r7
 	sbc r2,r2,r11
 	xor.f 0,r0,r1
 	and r0,r2,r9
diff --git a/libgcc/config/arc/ieee-754/muldf3.S b/libgcc/config/arc/ieee-754/muldf3.S
index 805db5c8922..5f562e23354 100644
--- a/libgcc/config/arc/ieee-754/muldf3.S
+++ b/libgcc/config/arc/ieee-754/muldf3.S
@@ -132,19 +132,19 @@ __muldf3_support: /* This label makes debugger output saner. */
 	.balign 4
 __muldf3:
 	ld.as r9,[pcl,0x4b] ; ((.L7ff00000-.+2)/4)]
-	mpyhu r4,DBL0L,DBL1L
+	MPYHU r4,DBL0L,DBL1L
 	bmsk r6,DBL0H,19
 	bset r6,r6,20
 	mpyu r7,r6,DBL1L
 	and r11,DBL0H,r9
 	breq r11,0,.Ldenorm_dbl0
-	mpyhu r8,r6,DBL1L
+	MPYHU r8,r6,DBL1L
 	bmsk r10,DBL1H,19
 	bset r10,r10,20
-	mpyhu r5,r10,DBL0L
+	MPYHU r5,r10,DBL0L
 	add.f r4,r4,r7
 	and r12,DBL1H,r9
-	mpyhu r7,r6,r10
+	MPYHU r7,r6,r10
 	breq r12,0,.Ldenorm_dbl1
 	adc.f r5,r5,r8
 	mpyu r8,r10,DBL0L
diff --git a/libgcc/config/arc/ieee-754/mulsf3.S b/libgcc/config/arc/ieee-754/mulsf3.S
index 7a6c7916ddb..df2660a2102 100644
--- a/libgcc/config/arc/ieee-754/mulsf3.S
+++ b/libgcc/config/arc/ieee-754/mulsf3.S
@@ -64,7 +64,7 @@ __mulsf3:
 	bset r2,r0,23
 	asl_s r2,r2,8
 	bset r3,r4,23
-	mpyhu r6,r2,r3
+	MPYHU r6,r2,r3
 	and r11,r0,r9
 	breq r11,0,.Ldenorm_dbl0
 	mpyu r7,r2,r3
@@ -144,7 +144,7 @@ __mulsf3:
 	add_s r2,r2,r2
 	asl r2,r2,r4
 	asl r4,r4,23
-	mpyhu r6,r2,r3
+	MPYHU r6,r2,r3
 	breq r12,r9,.Ldenorm_dbl0_inf_nan_dbl1
 	sub.ne.f r12,r12,r4
 	mpyu r7,r2,r3
@@ -163,7 +163,7 @@ __mulsf3:
 	asl r4,r4,r3
 	sub_s r3,r3,1
 	asl_s r3,r3,23
-	mpyhu r6,r2,r4
+	MPYHU r6,r2,r4
 	sub.ne.f r11,r11,r3
 	bmsk r8,r0,30
 	mpyu r7,r2,r4
diff --git a/libgcc/config/arc/lib1funcs.S b/libgcc/config/arc/lib1funcs.S
index e59340a2242..022a2ea0cbe 100644
--- a/libgcc/config/arc/lib1funcs.S
+++ b/libgcc/config/arc/lib1funcs.S
@@ -79,7 +79,7 @@ SYM(__mulsi3):
 	j_s.d [blink]
 	mov_s r0,mlo
 	ENDFUNC(__mulsi3)
-#elif defined (__ARC700__)
+#elif defined (__ARC700__) || defined (__HS__)
 	HIDDEN_FUNC(__mulsi3)
 	mpyu r0,r0,r1
 	nop_s
@@ -393,7 +393,12 @@ SYM(__udivmodsi4):
 	lsr_s r1,r1
 	cmp_s r0,r1
 	xor.f r2,lp_count,31
+#if !defined (__EM__)
 	mov_s lp_count,r2
+#else
+	mov lp_count,r2
+	nop_s
+#endif /* !__EM__ */
 #endif /* !__ARC_NORM__ */
 	sub.cc r0,r0,r1
 	mov_s r3,3
@@ -1260,7 +1265,7 @@ SYM(__ld_r13_to_r14_ret):
 #endif
 
 #ifdef L_muldf3
-#ifdef __ARC700__
+#if defined (__ARC700__) || defined (__HS__)
 #include "ieee-754/muldf3.S"
 #elif defined (__ARC_NORM__) && defined(__ARC_MUL64__)
 #include "ieee-754/arc600-mul64/muldf3.S"
@@ -1276,7 +1281,7 @@ SYM(__ld_r13_to_r14_ret):
 #endif
 
 #ifdef L_mulsf3
-#ifdef __ARC700__
+#if defined (__ARC700__) || defined (__HS__)
 #include "ieee-754/mulsf3.S"
 #elif defined (__ARC_NORM__) && defined(__ARC_MUL64__)
 #include "ieee-754/arc600-mul64/mulsf3.S"
@@ -1288,7 +1293,7 @@ SYM(__ld_r13_to_r14_ret):
 #endif
 
 #ifdef L_divdf3
-#ifdef __ARC700__
+#if defined (__ARC700__) || defined (__HS__)
 #include "ieee-754/divdf3.S"
 #elif defined (__ARC_NORM__) && defined(__ARC_MUL64__)
 #include "ieee-754/arc600-mul64/divdf3.S"
@@ -1298,7 +1303,7 @@ SYM(__ld_r13_to_r14_ret):
 #endif
 
 #ifdef L_divsf3
-#ifdef __ARC700__
+#if defined (__ARC700__) || defined (__HS__)
 #include "ieee-754/divsf3-stdmul.S"
 #elif defined (__ARC_NORM__) && defined(__ARC_MUL64__)
 #include "ieee-754/arc600-mul64/divsf3.S"
diff --git a/libgcc/config/arc/t-arc700-uClibc b/libgcc/config/arc/t-arc700-uClibc
index 651c3de5260..ff570398d90 100644
--- a/libgcc/config/arc/t-arc700-uClibc
+++ b/libgcc/config/arc/t-arc700-uClibc
@@ -28,10 +28,10 @@ CRTSTUFF_T_CFLAGS += -mno-sdata
 
 # Compile crtbeginS.o and crtendS.o with pic.
-CRTSTUFF_T_CFLAGS_S = $(CRTSTUFF_T_CFLAGS) -mA7 -fPIC
+CRTSTUFF_T_CFLAGS_S = $(CRTSTUFF_T_CFLAGS) -fPIC
 
 # Compile libgcc2.a with pic.
-TARGET_LIBGCC2_CFLAGS = -mA7 -fPIC
+TARGET_LIBGCC2_CFLAGS = -fPIC
 
 PROFILE_OSDEP = prof-freq.o
diff --git a/libgo/configure b/libgo/configure
index 08a197d5a61..eb37e29d2f8 100755
--- a/libgo/configure
+++ b/libgo/configure
@@ -14249,6 +14249,46 @@ fi
 fi
 
 unset ac_cv_func_gethostbyname
+  ac_fn_c_check_func "$LINENO" "sendfile" "ac_cv_func_sendfile"
+if test "x$ac_cv_func_sendfile" = x""yes; then :
+
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for main in -lsendfile" >&5
+$as_echo_n "checking for main in -lsendfile... " >&6; }
+if test "${ac_cv_lib_sendfile_main+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_check_lib_save_LIBS=$LIBS
+LIBS="-lsendfile $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+
+int
+main ()
+{
+return main ();
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  ac_cv_lib_sendfile_main=yes
+else
+  ac_cv_lib_sendfile_main=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+LIBS=$ac_check_lib_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_sendfile_main" >&5
+$as_echo "$ac_cv_lib_sendfile_main" >&6; }
+if test "x$ac_cv_lib_sendfile_main" = x""yes; then :
+  libgo_cv_lib_sockets="$libgo_cv_lib_sockets -lsendfile"
+fi
+
+fi
+
 LIBS=$libgo_old_libs
 fi
diff --git a/libgo/configure.ac b/libgo/configure.ac
index 332e540a302..6e23a85fa6d 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -473,6 +473,9 @@ AC_CACHE_CHECK([for socket libraries], libgo_cv_lib_sockets,
   [AC_CHECK_LIB(nsl, main,
     [libgo_cv_lib_sockets="$libgo_cv_lib_sockets -lnsl"])])
 unset ac_cv_func_gethostbyname
+  AC_CHECK_FUNC(sendfile, ,
+    [AC_CHECK_LIB(sendfile, main,
+      [libgo_cv_lib_sockets="$libgo_cv_lib_sockets -lsendfile"])])
 LIBS=$libgo_old_libs
 ])
 NET_LIBS="$libgo_cv_lib_sockets"
diff --git a/libgo/go/cmd/go/build.go b/libgo/go/cmd/go/build.go
index 3afac2ee062..865871c5314 100644
--- a/libgo/go/cmd/go/build.go
+++ b/libgo/go/cmd/go/build.go
@@ -2555,17 +2555,9 @@ func (tools gccgoToolchain) ld(b *builder, root *action, out string, allactions
 		}
 	}
 
-	switch ldBuildmode {
-	case "c-archive", "c-shared":
-		ldflags = append(ldflags, "-Wl,--whole-archive")
-	}
-
+	ldflags = append(ldflags, "-Wl,--whole-archive")
 	ldflags = append(ldflags, afiles...)
-
-	switch ldBuildmode {
-	case "c-archive", "c-shared":
-		ldflags = append(ldflags, "-Wl,--no-whole-archive")
-	}
+	ldflags = append(ldflags, "-Wl,--no-whole-archive")
 	ldflags = append(ldflags, cgoldflags...)
 	ldflags = append(ldflags, envList("CGO_LDFLAGS", "")...)
diff --git a/libgo/mksysinfo.sh b/libgo/mksysinfo.sh
index 6d39df96e95..662619f2076 100755
--- a/libgo/mksysinfo.sh
+++ b/libgo/mksysinfo.sh
@@ -1488,4 +1488,24 @@ grep '^type _zone_net_addr_t ' gen-sysinfo.go | \
     sed -e 's/_in6_addr/[16]byte/' \
     >> ${OUT}
 
+# The Solaris 12 _flow_arp_desc_t struct.
+grep '^type _flow_arp_desc_t ' gen-sysinfo.go | \
+    sed -e 's/_in6_addr_t/[16]byte/g' \
+    >> ${OUT}
+
+# The Solaris 12 _flow_l3_desc_t struct.
+grep '^type _flow_l3_desc_t ' gen-sysinfo.go | \
+    sed -e 's/_in6_addr_t/[16]byte/g' \
+    >> ${OUT}
+
+# The Solaris 12 _mac_ipaddr_t struct.
+grep '^type _mac_ipaddr_t ' gen-sysinfo.go | \
+    sed -e 's/_in6_addr_t/[16]byte/g' \
+    >> ${OUT}
+
+# The Solaris 12 _mactun_info_t struct.
+grep '^type _mactun_info_t ' gen-sysinfo.go | \
+    sed -e 's/_in6_addr_t/[16]byte/g' \
+    >> ${OUT}
+
 exit $?
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 08d467b1055..ed86943bb32 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,6 +1,10 @@
 2015-11-09  Nathan Sidwell  <nathan@codesourcery.com>
 
-	* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: New.
+	* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Remove
+	inadvertent commit.
+
+2015-11-09  Nathan Sidwell  <nathan@codesourcery.com>
+
 	* testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: New.
 	* testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: New.
 	* testsuite/libgomp.oacc-c-c++-common/routine-v-1.c: New.
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c
deleted file mode 100644
index 7f5d3d37617..00000000000
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c
+++ /dev/null
@@ -1,41 +0,0 @@
-/* { dg-do run } */
-
-#include <openacc.h>
-
-int main ()
-{
-  int ok = 1;
-  int val = 2;
-  int ary[32];
-  int ondev = 0;
-
-  for (int i = 0; i < 32; i++)
-    ary[i] = ~0;
-
-#pragma acc parallel num_gangs (32) copy (ok) firstprivate (val) copy(ary, ondev)
-  {
-    ondev = acc_on_device (acc_device_not_host);
-#pragma acc loop gang(static:1)
-    for (unsigned i = 0; i < 32; i++)
-      {
-        if (val != 2)
-          ok = 0;
-        val += i;
-        ary[i] = val;
-      }
-  }
-
-  if (ondev)
-    {
-      if (!ok)
-        return 1;
-      if (val != 2)
-        return 1;
-
-      for (int i = 0; i < 32; i++)
-        if (ary[i] != 2 + i)
-          return 1;
-    }
-
-  return 0;
-}
diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index 540041d63dd..960a56ca186 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,5 +1,24 @@
+2015-11-11  Jonathan Wakely  <jwakely@redhat.com>
+
+	PR libstdc++/64651
+	* libsupc++/exception_ptr.h (rethrow_exception): Add using-declaration
+	to __exception_ptr namespace.
+	* testsuite/18_support/exception_ptr/rethrow_exception.cc: Test ADL.
+	Remove unnecessary test variables.
+
 2015-11-10  Jonathan Wakely  <jwakely@redhat.com>
 
+	PR libstdc++/68190
+	* include/bits/stl_multiset.h (multiset::find): Fix return types.
+	* include/bits/stl_set.h (set::find): Likewise.
+	* testsuite/23_containers/map/operations/2.cc: Test find return types.
+	* testsuite/23_containers/multimap/operations/2.cc: Likewise.
+	* testsuite/23_containers/multiset/operations/2.cc: Likewise.
+	* testsuite/23_containers/set/operations/2.cc: Likewise.
+
+	* doc/xml/manual/status_cxx2017.xml: Update.
+	* doc/html/*: Regenerate.
+	* include/bits/functional_hash.h: Fix grammar in comment.
 2015-11-09  François Dumont  <fdumont@gcc.gnu.org>
diff --git a/libstdc++-v3/doc/html/manual/status.html b/libstdc++-v3/doc/html/manual/status.html
index cdbc8b94f2f..91404aace42 100644
--- a/libstdc++-v3/doc/html/manual/status.html
+++ b/libstdc++-v3/doc/html/manual/status.html
@@ -495,11 +495,11 @@ not in any particular release.
 <a class="link" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4277.html" target="_top">
 	N4277
       </a>
-    </td><td align="left">TriviallyCopyable <code class="code">reference_wrapper</code> </td><td align="left">Y</td><td align="left"> </td></tr><tr bgcolor="#B0B0B0"><td align="left">
+    </td><td align="left">TriviallyCopyable <code class="code">reference_wrapper</code> </td><td align="left">Y</td><td align="left"> </td></tr><tr><td align="left">
 <a class="link" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4258.pdf" target="_top">
 	N4258
       </a>
-    </td><td align="left">Cleaning-up noexcept in the Library</td><td align="left">Partial</td><td align="left">Changes to basic_string not complete.</td></tr><tr><td align="left">
+    </td><td align="left">Cleaning-up noexcept in the Library</td><td align="left">Y</td><td align="left"> </td></tr><tr><td align="left">
 <a class="link" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4279.html" target="_top">
 	N4279
       </a>
@@ -507,11 +507,11 @@ not in any particular release.
 <a class="link" href="http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2014/n3911.pdf" target="_top">
 	N3911
       </a>
-    </td><td align="left">Transformation Trait Alias <code class="code">void_t</code></td><td align="left">Y</td><td align="left"> </td></tr><tr bgcolor="#C8B0B0"><td align="left">
+    </td><td align="left">Transformation Trait Alias <code class="code">void_t</code></td><td align="left">Y</td><td align="left"> </td></tr><tr><td align="left">
 <a class="link" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4169.html" target="_top">
 	N4169
       </a>
-    </td><td align="left">A proposal to add invoke function template</td><td align="left">N</td><td align="left">In progress</td></tr><tr><td align="left">
+    </td><td align="left">A proposal to add invoke function template</td><td align="left">Y</td><td align="left"> </td></tr><tr><td align="left">
 <a class="link" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4280.pdf" target="_top">
 	N4280
       </a>
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index fc2ebd2466f..4ea0d1e0293 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -103,15 +103,14 @@ not in any particular release.
     </row>
 
     <row>
-      <?dbhtml bgcolor="#B0B0B0" ?>
       <entry>
 	<link xmlns:xlink="http://www.w3.org/1999/xlink"
	      xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4258.pdf">
	  N4258
	</link>
       </entry>
       <entry>Cleaning-up noexcept in the Library</entry>
-      <entry>Partial</entry>
-      <entry>Changes to basic_string not complete.</entry>
+      <entry>Y</entry>
+      <entry/>
     </row>
 
     <row>
@@ -137,15 +136,14 @@ not in any particular release.
     </row>
 
     <row>
-      <?dbhtml bgcolor="#C8B0B0" ?>
       <entry>
 	<link xmlns:xlink="http://www.w3.org/1999/xlink"
	      xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4169.html">
	  N4169
	</link>
       </entry>
       <entry>A proposal to add invoke function template</entry>
-      <entry>N</entry>
-      <entry>In progress</entry>
+      <entry>Y</entry>
+      <entry/>
     </row>
 
     <row>
diff --git a/libstdc++-v3/include/bits/stl_multiset.h b/libstdc++-v3/include/bits/stl_multiset.h
index 5ccc6dd61f7..e6e233772b3 100644
--- a/libstdc++-v3/include/bits/stl_multiset.h
+++ b/libstdc++-v3/include/bits/stl_multiset.h
@@ -680,13 +680,15 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 #if __cplusplus > 201103L
       template<typename _Kt>
 	auto
-	find(const _Kt& __x) -> decltype(_M_t._M_find_tr(__x))
-	{ return _M_t._M_find_tr(__x); }
+	find(const _Kt& __x)
+	-> decltype(iterator{_M_t._M_find_tr(__x)})
+	{ return iterator{_M_t._M_find_tr(__x)}; }
 
       template<typename _Kt>
	auto
-	find(const _Kt& __x) const -> decltype(_M_t._M_find_tr(__x))
-	{ return _M_t._M_find_tr(__x); }
+	find(const _Kt& __x) const
+	-> decltype(const_iterator{_M_t._M_find_tr(__x)})
+	{ return const_iterator{_M_t._M_find_tr(__x)}; }
 #endif
       //@}
diff --git a/libstdc++-v3/include/bits/stl_set.h b/libstdc++-v3/include/bits/stl_set.h
index cf74368fa0e..8bea61a3b23 100644
--- a/libstdc++-v3/include/bits/stl_set.h
+++ b/libstdc++-v3/include/bits/stl_set.h
@@ -699,13 +699,15 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 #if __cplusplus > 201103L
       template<typename _Kt>
	auto
-	find(const _Kt& __x) -> decltype(_M_t._M_find_tr(__x))
-	{ return _M_t._M_find_tr(__x); }
+	find(const _Kt& __x)
+	-> decltype(iterator{_M_t._M_find_tr(__x)})
+	{ return iterator{_M_t._M_find_tr(__x)}; }
 
       template<typename _Kt>
	auto
-	find(const _Kt& __x) const -> decltype(_M_t._M_find_tr(__x))
-	{ return _M_t._M_find_tr(__x); }
+	find(const _Kt& __x) const
+	-> decltype(const_iterator{_M_t._M_find_tr(__x)})
+	{ return const_iterator{_M_t._M_find_tr(__x)}; }
 #endif
       //@}
diff --git a/libstdc++-v3/libsupc++/exception_ptr.h b/libstdc++-v3/libsupc++/exception_ptr.h
index 8fbad1c86d1..7821c149f0e 100644
--- a/libstdc++-v3/libsupc++/exception_ptr.h
+++ b/libstdc++-v3/libsupc++/exception_ptr.h
@@ -68,6 +68,8 @@ namespace std
 
   namespace __exception_ptr
   {
+    using std::rethrow_exception;
+
     /**
      * @brief An opaque pointer to an arbitrary exception.
      * @ingroup exceptions
diff --git a/libstdc++-v3/testsuite/18_support/exception_ptr/rethrow_exception.cc b/libstdc++-v3/testsuite/18_support/exception_ptr/rethrow_exception.cc
index 31da2ecbe82..7d3989213e3 100644
--- a/libstdc++-v3/testsuite/18_support/exception_ptr/rethrow_exception.cc
+++ b/libstdc++-v3/testsuite/18_support/exception_ptr/rethrow_exception.cc
@@ -30,7 +30,6 @@
 void test01()
 {
-  bool test __attribute__((unused)) = true;
   using namespace std;
 
   try {
@@ -54,7 +53,6 @@ void test02()
 void test03()
 {
-  bool test __attribute__((unused)) = true;
   using namespace std;
 
   exception_ptr ep;
@@ -71,7 +69,6 @@ void test03()
 void test04()
 {
-  bool test __attribute__((unused)) = true;
   using namespace std;
 
   // Weave the exceptions in an attempt to confuse the machinery.
@@ -103,12 +100,23 @@ void test04()
     }
 }
 
+void test05()
+{
+  // libstdc++/64651 std::rethrow_exception not found by ADL
+  // This is not required to work but is a conforming extension.
+  try {
+    rethrow_exception(std::make_exception_ptr(0));
+  } catch(...) {
+  }
+}
+
 int main()
 {
   test01();
   test02();
   test03();
   test04();
+  test05();
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/23_containers/map/operations/2.cc b/libstdc++-v3/testsuite/23_containers/map/operations/2.cc
index 6cc277aedce..ef301ef136c 100644
--- a/libstdc++-v3/testsuite/23_containers/map/operations/2.cc
+++ b/libstdc++-v3/testsuite/23_containers/map/operations/2.cc
@@ -54,6 +54,11 @@ test01()
   VERIFY( cit == cx.end() );
 
   VERIFY( Cmp::count == 0);
+
+  static_assert(std::is_same<decltype(it), test_type::iterator>::value,
+		"find returns iterator");
+  static_assert(std::is_same<decltype(cit), test_type::const_iterator>::value,
+		"const find returns const_iterator");
 }
 
 void
diff --git a/libstdc++-v3/testsuite/23_containers/multimap/operations/2.cc b/libstdc++-v3/testsuite/23_containers/multimap/operations/2.cc
index 67c3bfd60a3..eef6ee4515d 100644
--- a/libstdc++-v3/testsuite/23_containers/multimap/operations/2.cc
+++ b/libstdc++-v3/testsuite/23_containers/multimap/operations/2.cc
@@ -54,6 +54,11 @@ test01()
   VERIFY( cit == cx.end() );
 
   VERIFY( Cmp::count == 0);
+
+  static_assert(std::is_same<decltype(it), test_type::iterator>::value,
+		"find returns iterator");
+  static_assert(std::is_same<decltype(cit), test_type::const_iterator>::value,
+		"const find returns const_iterator");
 }
 
 void
diff --git a/libstdc++-v3/testsuite/23_containers/multiset/operations/2.cc b/libstdc++-v3/testsuite/23_containers/multiset/operations/2.cc
index ff2748f713a..4bea719160f 100644
--- a/libstdc++-v3/testsuite/23_containers/multiset/operations/2.cc
+++ b/libstdc++-v3/testsuite/23_containers/multiset/operations/2.cc
@@ -54,6 +54,11 @@ test01()
   VERIFY( cit == cx.end() );
 
   VERIFY( Cmp::count == 0);
+
+  static_assert(std::is_same<decltype(it), test_type::iterator>::value,
+		"find returns iterator");
+  static_assert(std::is_same<decltype(cit), test_type::const_iterator>::value,
+		"const find returns const_iterator");
 }
 
 void
diff --git a/libstdc++-v3/testsuite/23_containers/set/operations/2.cc b/libstdc++-v3/testsuite/23_containers/set/operations/2.cc
index 84ddd1f1ddc..6a68453ec7b 100644
--- a/libstdc++-v3/testsuite/23_containers/set/operations/2.cc
+++ b/libstdc++-v3/testsuite/23_containers/set/operations/2.cc
@@ -54,6 +54,11 @@ test01()
   VERIFY( cit == cx.end() );
 
   VERIFY( Cmp::count == 0);
+
+  static_assert(std::is_same<decltype(it), test_type::iterator>::value,
+		"find returns iterator");
+  static_assert(std::is_same<decltype(cit), test_type::const_iterator>::value,
+		"const find returns const_iterator");
 }
 
 void