[gcc]

2016-05-18 Michael Meissner <meissner@linux.vnet.ibm.com> PR target/70915 * config/rs6000/constraints.md (wE constraint): New constraint for a vector constant that can be loaded with XXSPLTIB. (wM constraint): New constraint for a vector constant of a 1's. (wS constraint): New constraint for a vector constant that can be loaded with XXSPLTIB and a vector sign extend instruction. * config/rs6000/predicates.md (xxspltib_constant_split): New predicates for wE/wS constraints. (xxspltib_constant_nosplit): Likewise. (easy_vector_constant): Add support for constants that can be loaded via XXSPLTIB. (all_ones_constant): New predicate for vector constant with all 1's set. (splat_input_operand): Add support for ISA 3.0 word splat operations. * config/rs6000/rs6000.c (xxspltib_constant_p): New function to return if a constant can be loaded with the ISA 3.0 XXSPLTIB instruction and possibly with a sign extension. (output_vec_const_move): Add support for XXSPLTIB. If we are loading up 0/-1 into Altivec registers, prefer using VSPLTISW instead of XXLXOR/XXLORC. (rs6000_expand_vector_init): Add support for ISA 3.0 word splat operations. (rs6000_legitimize_reload_address): Likewise. (rs6000_output_move_128bit): Use output_vec_const_move to emit constants. * config/rs6000/vsx.md (VSX_M): Add TImode (if -mvsx-timode) and combine VSX_M and VSX_M2 into one iterator. (VSX_M2): Likewise. (VSINT_84): New iterators for loading constants with XXSPLTIB. (VSINT_842): Likewise. (UNSPEC_VSX_SIGN_EXTEND): New UNSPEC. (xxspltib_v16qi): New insns to load up constants with the ISA 3.0 XXSPLTIB instruction. (xxspltib_<mode>_nosplit): Likewise. (xxspltib_<mode>_split): New insn to load up constants with XXSPLTIB and a sign extend instruction. (vsx_mov<mode>): Replace single move that handled all vector types with separate 32-bit and 64-bit moves. Combine the movti_<bit> moves (when -mvsx-timode is in effect) into the main vector moves. Eliminate separate moves for <VSr> <VSa>, where the preferred register class (<VSr>) is listed first, and the secondary register class (<VSa>) is listed second with a '?' to discourage use. Prefer loading 0/-1 in any VSX register for ISA 3.0, and Altivec registers for ISA 2.06/2.07 (PR target/70915) so that if the register was involved in a slow operation, the clear/set operation does not wait for the slow operation to finish. Adjust the length attributes for 32-bit mode. Use rs6000_output_move_128bit and drop the use of the string instructions for 32-bit movti when -mvsx-timode is in effect. Use spacing so that the alternatives and attributes don't generate long lines, and put things in columns, so that it is easier to match up the operands and attributes with the insn alternatives. (vsx_mov<mode>_64bit): Likewise. (vsx_mov<mode>_32bit): Likewise. (vsx_movti_64bit): Fold movti into normal vector moves. (vsx_movti_32bit): Likewise. (vsx_splat_<mode>, V4SI/V4SF modes): Add support for ISA 3.0 word spat instructions. (vsx_splat_v4si_internal): Likewise. (vsx_splat_v4sf_internal): Likewise. (vector fusion peepholes): Use VSX_M instead of VSX_M2. (vsx_sign_extend_qi_<mode>): New ISA 3.0 instructions to sign extend vector elements. (vsx_sign_extend_hi_<mode>): Likewise. (vsx_sign_extend_si_v2di): Likewise. * config/rs6000/rs6000-protos.h (xxspltib_constant_p): Add declaration. * doc/md.texi (PowerPC constraints): Document the wE, wM, and wS constraints. Add trailing period to wL documentation. [gcc/testsuite] 2016-05-18 Michael Meissner <meissner@linux.vnet.ibm.com> * gcc.target/powerpc/p9-splat-1.c: New tests for ISA 3.0 word splat operations and the XXSPLTIB instruction. * gcc.target/powerpc/p9-splat-2.c: Likewise. * gcc.target/powerpc/p9-splat-3.c: Likewise. * gcc.target/powerpc/pr47755.c: Allow vspltisw in addition to xxlxor to clear a register. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@236394 138bc75d-0d04-0410-961f-82ee72b054a4
author: meissner <meissner@138bc75d-0d04-0410-961f-82ee72b054a4> 2016-05-18 14:04:32 +0000
committer: meissner <meissner@138bc75d-0d04-0410-961f-82ee72b054a4> 2016-05-18 14:04:32 +0000
commit: c93e99319b7a5abd2385df8a17144b3cbc1a9e3c (patch)
tree: 1422c79324b1451c85b95d071bf6ce225e401840 /gcc/config/rs6000/rs6000.c
parent: 8ba6639c945969f51e51f9650f0ffb5a22245fc0 (diff)
download: gcc-c93e99319b7a5abd2385df8a17144b3cbc1a9e3c.tar.gz
1 files changed, 201 insertions, 22 deletions
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3f721c67cd5..a1841bce765 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6241,6 +6241,128 @@ gen_easy_altivec_constant (rtx op)
   gcc_unreachable ();
 }
 
+/* Return true if OP is of the given MODE and can be synthesized with ISA 3.0
+   instructions (xxspltib, vupkhsb/vextsb2w/vextb2d).
+
+   Return the number of instructions needed (1 or 2) into the address pointed
+   via NUM_INSNS_PTR.
+
+   If NOSPLIT_P, only return true for constants that only generate the XXSPLTIB
+   instruction and can go in any VSX register.  If !NOSPLIT_P, only return true
+   for constants that generate XXSPLTIB and need a sign extend operation, which
+   restricts us to the Altivec registers.
+
+   Allow either (vec_const [...]) or (vec_duplicate <const>).  If OP is a valid
+   XXSPLTIB constant, return the constant being set via the CONST_PTR
+   pointer.  */
+
+bool
+xxspltib_constant_p (rtx op,
+		     machine_mode mode,
+		     int *num_insns_ptr,
+		     int *constant_ptr)
+{
+  size_t nunits = GET_MODE_NUNITS (mode);
+  size_t i;
+  HOST_WIDE_INT value;
+  rtx element;
+
+  /* Set the returned values to out of bound values.  */
+  *num_insns_ptr = -1;
+  *constant_ptr = 256;
+
+  if (!TARGET_P9_VECTOR)
+    return false;
+
+  if (mode == VOIDmode)
+    mode = GET_MODE (op);
+
+  else if (mode != GET_MODE (op))
+    return false;
+
+  /* Handle (vec_duplicate <constant>).  */
+  if (GET_CODE (op) == VEC_DUPLICATE)
+    {
+      if (mode != V16QImode && mode != V8HImode && mode != V4SImode
+	  && mode != V2DImode)
+	return false;
+
+      element = XEXP (op, 0);
+      if (!CONST_INT_P (element))
+	return false;
+
+      value = INTVAL (element);
+      if (!IN_RANGE (value, -128, 127))
+	return false;
+    }
+
+  /* Handle (const_vector [...]).  */
+  else if (GET_CODE (op) == CONST_VECTOR)
+    {
+      if (mode != V16QImode && mode != V8HImode && mode != V4SImode
+	  && mode != V2DImode)
+	return false;
+
+      element = CONST_VECTOR_ELT (op, 0);
+      if (!CONST_INT_P (element))
+	return false;
+
+      value = INTVAL (element);
+      if (!IN_RANGE (value, -128, 127))
+	return false;
+
+      for (i = 1; i < nunits; i++)
+	{
+	  element = CONST_VECTOR_ELT (op, i);
+	  if (!CONST_INT_P (element))
+	    return false;
+
+	  if (value != INTVAL (element))
+	    return false;
+	}
+
+      /* See if we could generate vspltisw/vspltish directly instead of
+	 xxspltib + sign extend.  Special case 0/-1 to allow getting
+         any VSX register instead of an Altivec register.  */
+      if (!IN_RANGE (value, -1, 0) && EASY_VECTOR_15 (value)
+	  && (mode == V4SImode || mode == V8HImode))
+	return false;
+    }
+
+  /* Handle integer constants being loaded into the upper part of the VSX
+     register as a scalar.  If the value isn't 0/-1, only allow it if
+     the mode can go in Altivec registers.  */
+  else if (CONST_INT_P (op))
+    {
+      if (!SCALAR_INT_MODE_P (mode))
+	return false;
+
+      value = INTVAL (op);
+      if (!IN_RANGE (value, -128, 127))
+	return false;
+
+      if (!IN_RANGE (value, -1, 0)
+	  && (reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID) == 0)
+	return false;
+    }
+
+  else
+    return false;
+
+  /* Return # of instructions and the constant byte for XXSPLTIB.  */
+  if (mode == V16QImode)
+    *num_insns_ptr = 1;
+
+  else if (IN_RANGE (value, -1, 0))
+    *num_insns_ptr = 1;
+
+  else
+    *num_insns_ptr = 2;
+
+  *constant_ptr = (int) value;
+  return true;
+}
+
 const char *
 output_vec_const_move (rtx *operands)
 {
@@ -6254,23 +6376,60 @@ output_vec_const_move (rtx *operands)
 
   if (TARGET_VSX)
     {
+      bool dest_vmx_p = ALTIVEC_REGNO_P (REGNO (dest));
+      int xxspltib_value = 256;
+      int num_insns = -1;
+
       if (zero_constant (vec, mode))
-	return "xxlxor %x0,%x0,%x0";
+	{
+	  if (TARGET_P9_VECTOR)
+	    return "xxspltib %x0,0";
+
+	  else if (dest_vmx_p)
+	    return "vspltisw %0,0";
+
+	  else
+	    return "xxlxor %x0,%x0,%x0";
+	}
+
+      if (all_ones_constant (vec, mode))
+	{
+	  if (TARGET_P9_VECTOR)
+	    return "xxspltib %x0,255";
 
-      if (TARGET_P8_VECTOR && vec == CONSTM1_RTX (mode))
-	return "xxlorc %x0,%x0,%x0";
+	  else if (dest_vmx_p)
+	    return "vspltisw %0,-1";
 
-      if ((mode == V2DImode || mode == V1TImode)
-	  && INTVAL (CONST_VECTOR_ELT (vec, 0)) == -1
-	  && INTVAL (CONST_VECTOR_ELT (vec, 1)) == -1)
-	return (TARGET_P8_VECTOR) ? "xxlorc %x0,%x0,%x0" : "vspltisw %0,-1";
+	  else if (TARGET_P8_VECTOR)
+	    return "xxlorc %x0,%x0,%x0";
+
+	  else
+	    gcc_unreachable ();
+	}
+
+      if (TARGET_P9_VECTOR
+	  && xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value))
+	{
+	  if (num_insns == 1)
+	    {
+	      operands[2] = GEN_INT (xxspltib_value & 0xff);
+	      return "xxspltib %x0,%2";
+	    }
+
+	  return "#";
+	}
     }
 
   if (TARGET_ALTIVEC)
     {
       rtx splat_vec;
+
+      gcc_assert (ALTIVEC_REGNO_P (REGNO (dest)));
       if (zero_constant (vec, mode))
-	return "vxor %0,%0,%0";
+	return "vspltisw %0,0";
+
+      if (all_ones_constant (vec, mode))
+	return "vspltisw %0,-1";
 
       /* Do we need to construct a value using VSLDOI?  */
       shift = vspltis_shifted (vec);
@@ -6543,6 +6702,15 @@ rs6000_expand_vector_init (rtx target, rtx vals)
       return;
     }
 
+  /* Word values on ISA 3.0 can use mtvsrws, lxvwsx, or vspltisw.  V4SF is
+     complicated since scalars are stored as doubles in the registers.  */
+  if (TARGET_P9_VECTOR && mode == V4SImode && all_same
+      && VECTOR_MEM_VSX_P (mode))
+    {
+      emit_insn (gen_vsx_splat_v4si (target, XVECEXP (vals, 0, 0)));
+      return;
+    }
+
   /* With single precision floating point on VSX, know that internally single
      precision is actually represented as a double, and either make 2 V2DF
      vectors, and convert these vectors to single precision, or do one
@@ -6551,14 +6719,23 @@ rs6000_expand_vector_init (rtx target, rtx vals)
     {
       if (all_same)
 	{
-	  rtx freg = gen_reg_rtx (V4SFmode);
-	  rtx sreg = force_reg (SFmode, XVECEXP (vals, 0, 0));
-	  rtx cvt  = ((TARGET_XSCVDPSPN)
-		      ? gen_vsx_xscvdpspn_scalar (freg, sreg)
-		      : gen_vsx_xscvdpsp_scalar (freg, sreg));
+	  rtx op0 = XVECEXP (vals, 0, 0);
+
+	  if (TARGET_P9_VECTOR)
+	    emit_insn (gen_vsx_splat_v4sf (target, op0));
 
-	  emit_insn (cvt);
-	  emit_insn (gen_vsx_xxspltw_v4sf_direct (target, freg, const0_rtx));
+	  else
+	    {
+	      rtx freg = gen_reg_rtx (V4SFmode);
+	      rtx sreg = force_reg (SFmode, op0);
+	      rtx cvt  = (TARGET_XSCVDPSPN
+			  ? gen_vsx_xscvdpspn_scalar (freg, sreg)
+			  : gen_vsx_xscvdpsp_scalar (freg, sreg));
+
+	      emit_insn (cvt);
+	      emit_insn (gen_vsx_xxspltw_v4sf_direct (target, freg,
+						      const0_rtx));
+	    }
 	}
       else
 	{
@@ -8326,12 +8503,16 @@ rs6000_legitimize_reload_address (rtx x, machine_mode mode,
 {
   bool reg_offset_p = reg_offset_addressing_ok_p (mode);
 
-  /* Nasty hack for vsx_splat_V2DF/V2DI load from mem, which takes a
-     DFmode/DImode MEM.  */
+  /* Nasty hack for vsx_splat_v2df/v2di load from mem, which takes a
+     DFmode/DImode MEM.  Ditto for ISA 3.0 vsx_splat_v4sf/v4si.  */
   if (reg_offset_p
       && opnum == 1
       && ((mode == DFmode && recog_data.operand_mode[0] == V2DFmode)
-	  || (mode == DImode && recog_data.operand_mode[0] == V2DImode)))
+	  || (mode == DImode && recog_data.operand_mode[0] == V2DImode)
+	  || (mode == SFmode && recog_data.operand_mode[0] == V4SFmode
+	      && TARGET_P9_VECTOR)
+	  || (mode == SImode && recog_data.operand_mode[0] == V4SImode
+	      && TARGET_P9_VECTOR)))
     reg_offset_p = false;
 
   /* We must recognize output that we have already generated ourselves.  */
@@ -20111,10 +20292,8 @@ rs6000_output_move_128bit (rtx operands[])
       if (dest_gpr_p)
 	return "#";
 
-      else if (TARGET_VSX && dest_vsx_p && zero_constant (src, mode))
-	return "xxlxor %x0,%x0,%x0";
-
-      else if (TARGET_ALTIVEC && dest_vmx_p)
+      else if ((dest_vmx_p && TARGET_ALTIVEC)
+	       || (dest_vsx_p && TARGET_VSX))
 	return output_vec_const_move (operands);
     }
author	meissner <meissner@138bc75d-0d04-0410-961f-82ee72b054a4>	2016-05-18 14:04:32 +0000
committer	meissner <meissner@138bc75d-0d04-0410-961f-82ee72b054a4>	2016-05-18 14:04:32 +0000
commit	c93e99319b7a5abd2385df8a17144b3cbc1a9e3c (patch)
tree	1422c79324b1451c85b95d071bf6ce225e401840 /gcc/config/rs6000/rs6000.c
parent	8ba6639c945969f51e51f9650f0ffb5a22245fc0 (diff)
download	gcc-c93e99319b7a5abd2385df8a17144b3cbc1a9e3c.tar.gz