author | H.J. Lu <hjl.tools@gmail.com> | 2016-03-07 14:44:37 -0800 |
---|---|---|
committer | H.J. Lu <hjl.tools@gmail.com> | 2016-04-22 12:13:38 -0700 |
commit | 8e3f061aef9f2e1a01d586de865a8c354d163e27 (patch) | |
tree | f8417157ab666f630bf67a9aa95a36c7ba584026 /gcc/testsuite/gcc.target/i386/pr70155-19.c | |
parent | 3009ef64f7d77f8b7382de1204143a1e2b9e9213 (diff) | |
download | gcc-8e3f061aef9f2e1a01d586de865a8c354d163e27.tar.gz |
Extend STV pass to 64-bit mode (hjl/pr70155/uros)
128-bit SSE load and store instructions can be used for load and store
of 128-bit integers if they are the only operations on 128-bit integers.
To convert load and store of 128-bit integers to 128-bit SSE load and
store, the original STV pass, which is designed to convert 64-bit integer
operations to SSE2 operations in 32-bit mode, is extended to 64-bit mode
in the following ways:
1. Class scalar_chain is turned into base class. The 32-bit specific
member functions are moved to the new derived class, scalar_chain_32.
The new derived class, scalar_chain_64, is added to convert load and
store of 128-bit integers to 128-bit SSE load and store.
2. Add the 64-bit version of scalar_to_vector_candidate_p and
remove_non_convertible_regs. Only TImode load and store are allowed
for conversion. If any instruction on a chain of dependent
instructions isn't a TImode load or store, that chain of instructions
won't be converted.
3. In 64-bit, we only convert from TImode to V1TImode, which have the
same size. The only difference is that V1TImode is allowed only in
vector registers, so that 128-bit SSE load and store instructions will
be used for load and store of 128-bit integers.
4. Put the 64-bit STV pass before the CSE pass so that instructions
changed or generated by the STV pass can be CSEed.
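The class split described in items 1 to 3 can be pictured with the minimal sketch below. It is not the code in config/i386/i386.c: the real member functions operate on RTL insns and registers, while here conversion_name and make_chain are hypothetical helpers, the gain value is a placeholder, and the bodies only record which conversion a chain would perform.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-in for the base class: the shared driver lives here and
// the mode-specific steps are pure virtual.
class scalar_chain
{
public:
  virtual ~scalar_chain () {}

  // Placeholder signature; the real function estimates the benefit on RTL.
  virtual int compute_convert_gain () const = 0;

  // Shared driver: convert the registers, then report what was done.
  std::string convert ()
  {
    assert (!converted_);	// a chain is converted at most once
    convert_registers ();
    converted_ = true;
    return conversion_name ();
  }

protected:
  virtual void convert_registers () = 0;
  // Hypothetical helper, present only so the sketch has observable output.
  virtual std::string conversion_name () const = 0;

private:
  bool converted_ = false;
};

// 32-bit mode: 64-bit integer operations become SSE2 operations.
class scalar_chain_32 : public scalar_chain
{
public:
  int compute_convert_gain () const override { return 1; }	// placeholder

protected:
  void convert_registers () override {}
  std::string conversion_name () const override
  { return "64-bit integer -> SSE2"; }
};

// 64-bit mode: only TImode load and store are converted, to V1TImode.
class scalar_chain_64 : public scalar_chain
{
public:
  int compute_convert_gain () const override { return 1; }	// placeholder

protected:
  void convert_registers () override {}
  std::string conversion_name () const override
  { return "TImode -> V1TImode"; }
};

// Hypothetical factory mirroring the choice of derived class by target mode.
std::unique_ptr<scalar_chain>
make_chain (bool target_64bit)
{
  if (target_64bit)
    return std::unique_ptr<scalar_chain> (new scalar_chain_64 ());
  return std::unique_ptr<scalar_chain> (new scalar_chain_32 ());
}
```

Calling make_chain (true)->convert () in this sketch yields the TImode to V1TImode description, and make_chain (false)->convert () the 32-bit one. The driver only ever sees a scalar_chain pointer.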
gcc/
PR target/70155
* config/i386/i386.c (scalar_to_vector_candidate_p): Renamed
to ...
(scalar_to_vector_candidate_p_32): This.
(scalar_to_vector_candidate_p_64): New function.
(scalar_to_vector_candidate_p): Likewise.
(check_non_convertible_regs_64): Likewise.
(remove_non_convertible_regs_64): Likewise.
(remove_non_convertible_regs): Likewise.
(remove_non_convertible_regs): Renamed to ...
(remove_non_convertible_regs_32): This.
(scalar_chain::~scalar_chain): Make it virtual.
(scalar_chain::compute_convert_gain): Make it pure virtual.
(scalar_chain::convert_insn): Likewise.
(scalar_chain::convert_registers): Likewise.
(scalar_chain::analyze_register_chain): Likewise.
(scalar_chain::add_to_queue): Make it protected.
(scalar_chain::emit_conversion_insns): Likewise.
(scalar_chain::mark_dual_mode_def): Moved to scalar_chain_32.
(scalar_chain::replace_with_subreg): Likewise.
(scalar_chain::replace_with_subreg_in_insn): Likewise.
(scalar_chain::convert_op): Likewise.
(scalar_chain::convert_reg): Likewise.
(scalar_chain::make_vector_copies): Likewise.
(scalar_chain::convert_registers): New pure virtual function.
(class scalar_chain_32): New class.
(class scalar_chain_64): Likewise.
(scalar_chain::mark_dual_mode_def): Renamed to ...
(scalar_chain_32::mark_dual_mode_def): This.
(scalar_chain::analyze_register_chain): Renamed to ...
(scalar_chain_32::analyze_register_chain): This.
(scalar_chain_64::analyze_register_chain): New function.
(scalar_chain::compute_convert_gain): Renamed to ...
(scalar_chain_32::compute_convert_gain): This.
(scalar_chain::replace_with_subreg): Renamed to ...
(scalar_chain_32::replace_with_subreg): This.
(scalar_chain::replace_with_subreg_in_insn): Renamed to ...
(scalar_chain_32::replace_with_subreg_in_insn): This.
(scalar_chain::make_vector_copies): Renamed to ...
(scalar_chain_32::make_vector_copies): This.
(scalar_chain::convert_reg): Renamed to ...
(scalar_chain_32::convert_reg): This.
(scalar_chain::convert_op): Renamed to ...
(scalar_chain_32::convert_op): This.
(scalar_chain::convert_insn): Renamed to ...
(scalar_chain_32::convert_insn): This.
(scalar_chain_64::convert_insn): New function.
(scalar_chain_32::convert_registers): Likewise.
(scalar_chain::convert): Call convert_registers.
(convert_scalars_to_vector): Change to scalar_chain pointer to
use scalar_chain_64 in 64-bit mode and scalar_chain_32 in 32-bit
mode. Delete scalar_chain pointer. Call free_dominance_info in
64-bit mode.
(pass_stv::gate): Remove TARGET_64BIT check.
(ix86_option_override): Put the 64-bit STV pass before the CSE
pass.
(standard_sse_constant_p): Allow V1TImode for all 1s.
gcc/testsuite/
PR target/70155
* gcc.target/i386/pr55247-2.c: Updated to check movti_internal
and movv1ti_internal patterns.
* gcc.target/i386/pr70155-1.c: New test.
* gcc.target/i386/pr70155-2.c: Likewise.
* gcc.target/i386/pr70155-3.c: Likewise.
* gcc.target/i386/pr70155-4.c: Likewise.
* gcc.target/i386/pr70155-5.c: Likewise.
* gcc.target/i386/pr70155-6.c: Likewise.
* gcc.target/i386/pr70155-7.c: Likewise.
* gcc.target/i386/pr70155-8.c: Likewise.
* gcc.target/i386/pr70155-9.c: Likewise.
* gcc.target/i386/pr70155-10.c: Likewise.
* gcc.target/i386/pr70155-11.c: Likewise.
* gcc.target/i386/pr70155-12.c: Likewise.
* gcc.target/i386/pr70155-13.c: Likewise.
* gcc.target/i386/pr70155-14.c: Likewise.
* gcc.target/i386/pr70155-15.c: Likewise.
* gcc.target/i386/pr70155-16.c: Likewise.
* gcc.target/i386/pr70155-17.c: Likewise.
* gcc.target/i386/pr70155-18.c: Likewise.
* gcc.target/i386/pr70155-19.c: Likewise.
* gcc.target/i386/pr70155-20.c: Likewise.
* gcc.target/i386/pr70155-21.c: Likewise.
* gcc.target/i386/pr70155-22.c: Likewise.
Diffstat (limited to 'gcc/testsuite/gcc.target/i386/pr70155-19.c')
-rw-r--r-- | gcc/testsuite/gcc.target/i386/pr70155-19.c | 12 |
1 files changed, 12 insertions, 0 deletions
diff --git a/gcc/testsuite/gcc.target/i386/pr70155-19.c b/gcc/testsuite/gcc.target/i386/pr70155-19.c
new file mode 100644
index 00000000000..e2e73aabafa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70155-19.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -msse2 -mtune=generic -dp" } */
+
+extern char *src, *dst;
+
+char *
+foo1 (void)
+{
+  return __builtin_mempcpy (dst, src, 16);
+}
+
+/* { dg-final { scan-assembler-times "movv1ti_internal" 2 } } */
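For illustration only, the 16-byte copy in the test above has the same shape as one 128-bit integer load followed by one 128-bit integer store, with no other operation on the value. That is the pattern the commit message describes as convertible. The sketch below writes it out explicitly. It is not part of the commit: copy16 is a hypothetical name, unsigned __int128 is a GCC/Clang extension assumed to be available on the 64-bit target, and whether the compiler emits SSE moves for it depends on the compiler version and options.

```cpp
#include <cassert>
#include <cstring>

// Assumption: the compiler provides the __int128 extension (64-bit GCC/Clang).
typedef unsigned __int128 u128;

// Copy 16 bytes as a single 128-bit integer load and store, and return the
// address just past the destination, as mempcpy does.
char *
copy16 (char *dst, const char *src)
{
  assert (dst != nullptr && src != nullptr);
  u128 tmp;
  std::memcpy (&tmp, src, sizeof tmp);	// 128-bit integer load
  std::memcpy (dst, &tmp, sizeof tmp);	// 128-bit integer store
  return dst + sizeof tmp;
}
```

Compiling such a function with the options from the test (-O2 -msse2 -mtune=generic -dp) and inspecting the assembly is one way to see which move patterns were selected.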