author | Andreas Krebbel <krebbel@linux.vnet.ibm.com> | 2015-05-19 17:26:35 +0000
committer | Andreas Krebbel <krebbel@gcc.gnu.org> | 2015-05-19 17:26:35 +0000
commit | 085261c8042d644baaf594bc436b87326c1c0390 (patch)
tree | f2088fab028533e22683011130369154e968243d /gcc/config/s390/vector.md
parent | 55ac540cd6ec0bbdf76ba5fd57ddd67f17112609 (diff)
S/390 Vector base support.
gcc/
* config/s390/constraints.md (j00, jm1, jxx, jyy, v): New
constraints.
* config/s390/predicates.md (const0_operand, constm1_operand)
(consttable_operand): Accept vector operands.
* config/s390/s390-modes.def: Add supported vector modes.
* config/s390/s390-protos.h (s390_cannot_change_mode_class)
(s390_function_arg_vector, s390_contiguous_bitmask_vector_p)
(s390_bytemask_vector_p, s390_expand_vec_strlen)
(s390_expand_vec_compare, s390_expand_vcond)
(s390_expand_vec_init): Add prototypes.
* config/s390/s390.c (VEC_ARG_NUM_REG): New macro.
(s390_vector_mode_supported_p): New function.
(s390_contiguous_bitmask_p): Mask out the irrelevant bits.
(s390_contiguous_bitmask_vector_p): New function.
(s390_bytemask_vector_p): New function.
(s390_split_ok_p): Vector regs don't work either.
(regclass_map): Add VEC_REGS.
(s390_legitimate_constant_p): Handle vector constants.
(s390_cannot_force_const_mem): Handle CONST_VECTOR.
(legitimate_reload_vector_constant_p): New function.
(s390_preferred_reload_class): Handle CONST_VECTOR.
(s390_reload_symref_address): Likewise.
(s390_secondary_reload): Vector memory instructions only support
short displacements. Rename reload*_nonoffmem* to reload*_la*.
(s390_emit_ccraw_jump): New function.
(s390_expand_vec_strlen): New function.
(s390_expand_vec_compare): New function.
(s390_expand_vcond): New function.
(s390_expand_vec_init): New function.
(s390_dwarf_frame_reg_mode): New function.
(print_operand): Handle addresses with 'O' and 'R' constraints.
(NR_C_MODES, constant_modes): Add vector modes.
(s390_output_pool_entry): Handle vector constants.
(s390_hard_regno_mode_ok): Handle vector registers.
(s390_class_max_nregs): Likewise.
(s390_cannot_change_mode_class): New function.
(s390_invalid_arg_for_unprototyped_fn): New function.
(s390_function_arg_vector): New function.
(s390_function_arg_float): Remove size variable.
(s390_pass_by_reference): Handle vector arguments.
(s390_function_arg_advance): Likewise.
(s390_function_arg): Likewise.
(s390_return_in_memory): Vector values are returned in a VR if
possible.
(s390_function_and_libcall_value): Handle vector arguments.
(s390_gimplify_va_arg): Likewise.
(s390_call_saved_register_used): Consider the arguments named.
(s390_conditional_register_usage): Disable v16-v31 for non-vec
targets.
(s390_preferred_simd_mode): New function.
(s390_support_vector_misalignment): New function.
(s390_vector_alignment): New function.
(TARGET_STRICT_ARGUMENT_NAMING, TARGET_DWARF_FRAME_REG_MODE)
(TARGET_VECTOR_MODE_SUPPORTED_P)
(TARGET_INVALID_ARG_FOR_UNPROTOTYPED_FN)
(TARGET_VECTORIZE_PREFERRED_SIMD_MODE)
(TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT)
(TARGET_VECTOR_ALIGNMENT): Define target macros.
* config/s390/s390.h (FUNCTION_ARG_PADDING): Define macro.
(FIRST_PSEUDO_REGISTER): Increase value.
(VECTOR_NOFP_REGNO_P, VECTOR_REGNO_P, VECTOR_NOFP_REG_P)
(VECTOR_REG_P): Define macros.
(FIXED_REGISTERS, CALL_USED_REGISTERS)
(CALL_REALLY_USED_REGISTERS, REG_ALLOC_ORDER)
(HARD_REGNO_CALL_PART_CLOBBERED, REG_CLASS_NAMES)
(FUNCTION_ARG_REGNO_P, FUNCTION_VALUE_REGNO_P, REGISTER_NAMES):
Add vector registers.
(CANNOT_CHANGE_MODE_CLASS): Call C function.
(enum reg_class): Add VEC_REGS, ADDR_VEC_REGS, GENERAL_VEC_REGS.
(SECONDARY_MEMORY_NEEDED): Allow SF<->SI mode moves without
memory.
(DBX_REGISTER_NUMBER, FIRST_VEC_ARG_REGNO, LAST_VEC_ARG_REGNO)
(SHORT_DISP_IN_RANGE, VECTOR_STORE_FLAG_VALUE): Define macros.
* config/s390/s390.md (UNSPEC_VEC_*): New constants.
(VR*_REGNUM): New constants.
(ALL): New mode iterator.
(INTALL): Remove mode iterator.
Include vector.md.
(movti): Implement TImode moves for VRs.
Disable TImode splitter for VR targets.
Implement splitting TImode GPR<->VR moves.
(reload*_tomem_z10, reload*_toreg_z10): Replace INTALL with ALL.
(reload<mode>_nonoffmem_in, reload<mode>_nonoffmem_out): Rename to
reload<mode>_la_in, reload<mode>_la_out.
(*movdi_64, *movsi_zarch, *movhi, *movqi, *mov<mode>_64dfp)
(*mov<mode>_64, *mov<mode>_31): Add vector instructions.
(TD/TF mode splitter): Enable for GPRs only (formerly !FP).
(mov<mode> SF SD): Prefer lder, lde for loading.
Add lrl and strl instructions.
Add vector instructions.
(strlen<mode>): Rename old strlen<mode> to strlen_srst<mode>.
Call s390_expand_vec_strlen on z13.
(*cc_to_int): Change predicate to nonimmediate_operand.
(addti3): Rename to *addti3. New expander.
(subti3): Rename to *subti3. New expander.
* config/s390/vector.md: New file.
From-SVN: r223395
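
[Editorial note: a rough GNU C illustration, not part of the commit, of the kind of code the new patterns serve. The function name and the exact -march=z13 invocation are assumptions for the example.]

  typedef int v4si __attribute__ ((vector_size (16)));

  /* With the vector facility enabled (e.g. -march=z13), the addition
     should map to the new add<mode>3 pattern (vaf), the element-wise
     compare to *vec_cmp<code><mode>_nocc (vchf), and the mask select
     to vsel via the vec_sel splitters below.  */
  v4si
  max_plus (v4si a, v4si b, v4si c)
  {
    v4si sum = a + b;
    v4si mask = sum > c;   /* -1 in each element where sum > c, else 0 */
    return (sum & mask) | (c & ~mask);
  }
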
Diffstat (limited to 'gcc/config/s390/vector.md')
-rw-r--r-- | gcc/config/s390/vector.md | 1226
1 file changed, 1226 insertions, 0 deletions
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
new file mode 100644
index 00000000000..f07f5a70f18
--- /dev/null
+++ b/gcc/config/s390/vector.md
@@ -0,0 +1,1226 @@
+;;- Instruction patterns for the System z vector facility
+;; Copyright (C) 2015 Free Software Foundation, Inc.
+;; Contributed by Andreas Krebbel (Andreas.Krebbel@de.ibm.com)
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it under
+;; the terms of the GNU General Public License as published by the Free
+;; Software Foundation; either version 3, or (at your option) any later
+;; version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
+;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+;; for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+; All vector modes supported in a vector register
+(define_mode_iterator V
+  [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1SF
+   V2SF V4SF V1DF V2DF])
+(define_mode_iterator VT
+  [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1SF
+   V2SF V4SF V1DF V2DF V1TF V1TI TI])
+
+; All vector modes directly supported by the hardware having full vector reg size
+; V_HW2 is duplicate of V_HW for having two iterators expanding
+; independently e.g. vcond
+(define_mode_iterator V_HW  [V16QI V8HI V4SI V2DI V2DF])
+(define_mode_iterator V_HW2 [V16QI V8HI V4SI V2DI V2DF])
+; Including TI for instructions that support it (va, vn, ...)
+(define_mode_iterator VT_HW [V16QI V8HI V4SI V2DI V2DF V1TI TI])
+
+; All full size integer vector modes supported in a vector register + TImode
+(define_mode_iterator VIT_HW    [V16QI V8HI V4SI V2DI V1TI TI])
+(define_mode_iterator VI_HW     [V16QI V8HI V4SI V2DI])
+(define_mode_iterator VI_HW_QHS [V16QI V8HI V4SI])
+(define_mode_iterator VI_HW_HS  [V8HI V4SI])
+(define_mode_iterator VI_HW_QH  [V16QI V8HI])
+
+; All integer vector modes supported in a vector register + TImode
+(define_mode_iterator VIT [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1TI TI])
+(define_mode_iterator VI  [V2QI V4QI V8QI V16QI V2HI V4HI V8HI V2SI V4SI V2DI])
+(define_mode_iterator VI_QHS [V4QI V8QI V16QI V4HI V8HI V4SI])
+
+(define_mode_iterator V_8   [V1QI])
+(define_mode_iterator V_16  [V2QI V1HI])
+(define_mode_iterator V_32  [V4QI V2HI V1SI V1SF])
+(define_mode_iterator V_64  [V8QI V4HI V2SI V2SF V1DI V1DF])
+(define_mode_iterator V_128 [V16QI V8HI V4SI V4SF V2DI V2DF V1TI V1TF])
+
+; A blank for vector modes and a * for TImode.  This is used to hide
+; the TImode expander name in case it is defined already.  See addti3
+; for an example.
+(define_mode_attr ti* [(V1QI "") (V2QI "") (V4QI "") (V8QI "") (V16QI "")
+                       (V1HI "") (V2HI "") (V4HI "") (V8HI "")
+                       (V1SI "") (V2SI "") (V4SI "")
+                       (V1DI "") (V2DI "")
+                       (V1TI "*") (TI "*")])
+
+; The element type of the vector.
+(define_mode_attr non_vec [(V1QI "QI") (V2QI "QI") (V4QI "QI") (V8QI "QI") (V16QI "QI")
+                           (V1HI "HI") (V2HI "HI") (V4HI "HI") (V8HI "HI")
+                           (V1SI "SI") (V2SI "SI") (V4SI "SI")
+                           (V1DI "DI") (V2DI "DI")
+                           (V1TI "TI")
+                           (V1SF "SF") (V2SF "SF") (V4SF "SF")
+                           (V1DF "DF") (V2DF "DF")
+                           (V1TF "TF")])
+
+; The instruction suffix
+(define_mode_attr bhfgq [(V1QI "b") (V2QI "b") (V4QI "b") (V8QI "b") (V16QI "b")
+                         (V1HI "h") (V2HI "h") (V4HI "h") (V8HI "h")
+                         (V1SI "f") (V2SI "f") (V4SI "f")
+                         (V1DI "g") (V2DI "g")
+                         (V1TI "q") (TI "q")
+                         (V1SF "f") (V2SF "f") (V4SF "f")
+                         (V1DF "g") (V2DF "g")
+                         (V1TF "q")])
+
+; This is for vmalhw. It gets an 'w' attached to avoid confusion with
+; multiply and add logical high vmalh.
+(define_mode_attr w [(V1QI "") (V2QI "") (V4QI "") (V8QI "") (V16QI "")
+                     (V1HI "w") (V2HI "w") (V4HI "w") (V8HI "w")
+                     (V1SI "") (V2SI "") (V4SI "")
+                     (V1DI "") (V2DI "")])
+
+; Resulting mode of a vector comparison.  For floating point modes an
+; integer vector mode with the same element size is picked.
+(define_mode_attr tointvec [(V1QI "V1QI") (V2QI "V2QI") (V4QI "V4QI") (V8QI "V8QI") (V16QI "V16QI")
+                            (V1HI "V1HI") (V2HI "V2HI") (V4HI "V4HI") (V8HI "V8HI")
+                            (V1SI "V1SI") (V2SI "V2SI") (V4SI "V4SI")
+                            (V1DI "V1DI") (V2DI "V2DI")
+                            (V1TI "V1TI")
+                            (V1SF "V1SI") (V2SF "V2SI") (V4SF "V4SI")
+                            (V1DF "V1DI") (V2DF "V2DI")
+                            (V1TF "V1TI")])
+
+; Vector with doubled element size.
+(define_mode_attr vec_double [(V2QI "V1HI") (V4QI "V2HI") (V8QI "V4HI") (V16QI "V8HI")
+                              (V2HI "V1SI") (V4HI "V2SI") (V8HI "V4SI")
+                              (V2SI "V1DI") (V4SI "V2DI")
+                              (V2DI "V1TI")
+                              (V2SF "V1DF") (V4SF "V2DF")])
+
+; Vector with half the element size.
+(define_mode_attr vec_half [(V1HI "V2QI") (V2HI "V4QI") (V4HI "V8QI") (V8HI "V16QI")
+                            (V1SI "V2HI") (V2SI "V4HI") (V4SI "V8HI")
+                            (V1DI "V2SI") (V2DI "V4SI")
+                            (V1TI "V2DI")
+                            (V1DF "V2SF") (V2DF "V4SF")
+                            (V1TF "V1DF")])
+
+; The comparisons not setting CC iterate over the rtx code.
+(define_code_iterator VFCMP_HW_OP [eq gt ge])
+(define_code_attr asm_fcmp_op [(eq "e") (gt "h") (ge "he")])
+
+
+
+; Comparison operators on int and fp compares which are directly
+; supported by the HW.
+(define_code_iterator VICMP_HW_OP [eq gt gtu])
+; For int insn_cmp_op can be used in the insn name as well as in the asm output.
+(define_code_attr insn_cmp_op [(eq "eq") (gt "h") (gtu "hl") (ge "he")])
+
+; Flags for vector string instructions (vfae all 4, vfee only ZS and CS, vstrc all 4)
+(define_constants
+  [(VSTRING_FLAG_IN  8)   ; invert result
+   (VSTRING_FLAG_RT  4)   ; result type
+   (VSTRING_FLAG_ZS  2)   ; zero search
+   (VSTRING_FLAG_CS  1)]) ; condition code set
+
+; Full HW vector size moves
+(define_insn "mov<mode>"
+  [(set (match_operand:V_128 0 "nonimmediate_operand" "=v, v,QR,  v,  v,  v,  v,v,d")
+        (match_operand:V_128 1 "general_operand"      " v,QR, v,j00,jm1,jyy,jxx,d,v"))]
+  "TARGET_VX"
+  "@
+   vlr\t%v0,%v1
+   vl\t%v0,%1
+   vst\t%v1,%0
+   vzero\t%v0
+   vone\t%v0
+   vgbm\t%v0,%t1
+   vgm<bhfgq>\t%v0,%s1,%e1
+   vlvgp\t%v0,%1,%N1
+   #"
+  [(set_attr "op_type" "VRR,VRX,VRX,VRI,VRI,VRI,VRI,VRR,*")])
+
+(define_split
+  [(set (match_operand:V_128 0 "register_operand" "")
+        (match_operand:V_128 1 "register_operand" ""))]
+  "TARGET_VX && GENERAL_REG_P (operands[0]) && VECTOR_REG_P (operands[1])"
+  [(set (match_dup 2)
+        (unspec:DI [(subreg:V2DI (match_dup 1) 0)
+                    (const_int 0)] UNSPEC_VEC_EXTRACT))
+   (set (match_dup 3)
+        (unspec:DI [(subreg:V2DI (match_dup 1) 0)
+                    (const_int 1)] UNSPEC_VEC_EXTRACT))]
+{
+  operands[2] = operand_subword (operands[0], 0, 0, <MODE>mode);
+  operands[3] = operand_subword (operands[0], 1, 0, <MODE>mode);
+})
+
+; Moves for smaller vector modes.
+
+; In these patterns only the vlr, vone, and vzero instructions write
+; VR bytes outside the mode.  This should be ok since we disallow
+; formerly bigger modes being accessed with smaller modes via
+; subreg. Note: The vone, vzero instructions could easily be replaced
+; with vlei which would only access the bytes belonging to the mode.
+; However, this would probably be slower.
+
+(define_insn "mov<mode>"
+  [(set (match_operand:V_8 0 "nonimmediate_operand" "=v,v,d, v,QR,  v,  v,  v,  v,d,  Q,  S,  Q,  S,  d,  d,d,d,d,R,T")
+        (match_operand:V_8 1 "general_operand"      " v,d,v,QR, v,j00,jm1,jyy,jxx,d,j00,j00,jm1,jm1,j00,jm1,R,T,b,d,d"))]
+  ""
+  "@
+   vlr\t%v0,%v1
+   vlvgb\t%v0,%1,0
+   vlgvb\t%0,%v1,0
+   vleb\t%v0,%1,0
+   vsteb\t%v1,%0,0
+   vzero\t%v0
+   vone\t%v0
+   vgbm\t%v0,%t1
+   vgm\t%v0,%s1,%e1
+   lr\t%0,%1
+   mvi\t%0,0
+   mviy\t%0,0
+   mvi\t%0,-1
+   mviy\t%0,-1
+   lhi\t%0,0
+   lhi\t%0,-1
+   lh\t%0,%1
+   lhy\t%0,%1
+   lhrl\t%0,%1
+   stc\t%1,%0
+   stcy\t%1,%0"
+  [(set_attr "op_type" "VRR,VRS,VRS,VRX,VRX,VRI,VRI,VRI,VRI,RR,SI,SIY,SI,SIY,RI,RI,RX,RXY,RIL,RX,RXY")])
+
+(define_insn "mov<mode>"
+  [(set (match_operand:V_16 0 "nonimmediate_operand" "=v,v,d, v,QR,  v,  v,  v,  v,d,  Q,  Q,  d,  d,d,d,d,R,T,b")
+        (match_operand:V_16 1 "general_operand"      " v,d,v,QR, v,j00,jm1,jyy,jxx,d,j00,jm1,j00,jm1,R,T,b,d,d,d"))]
+  ""
+  "@
+   vlr\t%v0,%v1
+   vlvgh\t%v0,%1,0
+   vlgvh\t%0,%v1,0
+   vleh\t%v0,%1,0
+   vsteh\t%v1,%0,0
+   vzero\t%v0
+   vone\t%v0
+   vgbm\t%v0,%t1
+   vgm\t%v0,%s1,%e1
+   lr\t%0,%1
+   mvhhi\t%0,0
+   mvhhi\t%0,-1
+   lhi\t%0,0
+   lhi\t%0,-1
+   lh\t%0,%1
+   lhy\t%0,%1
+   lhrl\t%0,%1
+   sth\t%1,%0
+   sthy\t%1,%0
+   sthrl\t%1,%0"
+  [(set_attr "op_type" "VRR,VRS,VRS,VRX,VRX,VRI,VRI,VRI,VRI,RR,SIL,SIL,RI,RI,RX,RXY,RIL,RX,RXY,RIL")])
+
+(define_insn "mov<mode>"
+  [(set (match_operand:V_32 0 "nonimmediate_operand" "=f,f,f,R,T,v,v,d, v,QR,  f,  v,  v,  v,  v,  Q,  Q,  d,  d,d,d,d,d,R,T,b")
+        (match_operand:V_32 1 "general_operand"      " f,R,T,f,f,v,d,v,QR, v,j00,j00,jm1,jyy,jxx,j00,jm1,j00,jm1,b,d,R,T,d,d,d"))]
+  "TARGET_VX"
+  "@
+   lder\t%v0,%v1
+   lde\t%0,%1
+   ley\t%0,%1
+   ste\t%1,%0
+   stey\t%1,%0
+   vlr\t%v0,%v1
+   vlvgf\t%v0,%1,0
+   vlgvf\t%0,%v1,0
+   vlef\t%v0,%1,0
+   vstef\t%1,%0,0
+   lzer\t%v0
+   vzero\t%v0
+   vone\t%v0
+   vgbm\t%v0,%t1
+   vgm\t%v0,%s1,%e1
+   mvhi\t%0,0
+   mvhi\t%0,-1
+   lhi\t%0,0
+   lhi\t%0,-1
+   lrl\t%0,%1
+   lr\t%0,%1
+   l\t%0,%1
+   ly\t%0,%1
+   st\t%1,%0
+   sty\t%1,%0
+   strl\t%1,%0"
+  [(set_attr "op_type" "RRE,RXE,RXY,RX,RXY,VRR,VRS,VRS,VRX,VRX,RRE,VRI,VRI,VRI,VRI,SIL,SIL,RI,RI,
+                        RIL,RR,RX,RXY,RX,RXY,RIL")])
+
+(define_insn "mov<mode>"
+  [(set (match_operand:V_64 0 "nonimmediate_operand"
+         "=f,f,f,R,T,v,v,d, v,QR,  f,  v,  v,  v,  v,  Q,  Q,  d,  d,f,d,d,d, d,RT,b")
+        (match_operand:V_64 1 "general_operand"
+         " f,R,T,f,f,v,d,v,QR, v,j00,j00,jm1,jyy,jxx,j00,jm1,j00,jm1,d,f,b,d,RT, d,d"))]
+  "TARGET_ZARCH"
+  "@
+   ldr\t%0,%1
+   ld\t%0,%1
+   ldy\t%0,%1
+   std\t%1,%0
+   stdy\t%1,%0
+   vlr\t%v0,%v1
+   vlvgg\t%v0,%1,0
+   vlgvg\t%0,%v1,0
+   vleg\t%v0,%1,0
+   vsteg\t%v1,%0,0
+   lzdr\t%0
+   vzero\t%v0
+   vone\t%v0
+   vgbm\t%v0,%t1
+   vgm\t%v0,%s1,%e1
+   mvghi\t%0,0
+   mvghi\t%0,-1
+   lghi\t%0,0
+   lghi\t%0,-1
+   ldgr\t%0,%1
+   lgdr\t%0,%1
+   lgrl\t%0,%1
+   lgr\t%0,%1
+   lg\t%0,%1
+   stg\t%1,%0
+   stgrl\t%1,%0"
+  [(set_attr "op_type" "RRE,RX,RXY,RX,RXY,VRR,VRS,VRS,VRX,VRX,RRE,VRI,VRI,VRI,VRI,
+                        SIL,SIL,RI,RI,RRE,RRE,RIL,RR,RXY,RXY,RIL")])
+
+
+; vec_load_lanes?
+
+; vec_store_lanes?
+
+; FIXME: Support also vector mode operands for 1
+; FIXME: A target memory operand seems to be useful otherwise we end
+; up with vl vlvgg vst.  Shouldn't the middle-end be able to handle
+; that itself?
+(define_insn "*vec_set<mode>"
+  [(set (match_operand:V 0 "register_operand" "=v, v,v")
+        (unspec:V [(match_operand:<non_vec> 1 "general_operand"               "d,QR,K")
+                   (match_operand:DI        2 "shift_count_or_setmem_operand" "Y, I,I")
+                   (match_operand:V         3 "register_operand"              "0, 0,0")]
+                  UNSPEC_VEC_SET))]
+  "TARGET_VX"
+  "@
+   vlvg<bhfgq>\t%v0,%1,%Y2
+   vle<bhfgq>\t%v0,%1,%2
+   vlei<bhfgq>\t%v0,%1,%2"
+  [(set_attr "op_type" "VRS,VRX,VRI")])
+
+; vec_set is supposed to *modify* an existing vector so operand 0 is
+; duplicated as input operand.
+(define_expand "vec_set<mode>"
+  [(set (match_operand:V 0 "register_operand" "")
+        (unspec:V [(match_operand:<non_vec> 1 "general_operand" "")
+                   (match_operand:SI 2 "shift_count_or_setmem_operand" "")
+                   (match_dup 0)]
+                  UNSPEC_VEC_SET))]
+  "TARGET_VX")
+
+; FIXME: Support also vector mode operands for 0
+; FIXME: This should be (vec_select ..) or something but it does only allow constant selectors :(
+; This is used via RTL standard name as well as for expanding the builtin
+(define_insn "vec_extract<mode>"
+  [(set (match_operand:<non_vec> 0 "nonimmediate_operand" "=d,QR")
+        (unspec:<non_vec> [(match_operand:V 1 "register_operand" " v, v")
+                           (match_operand:SI 2 "shift_count_or_setmem_operand" " Y, I")]
+                          UNSPEC_VEC_EXTRACT))]
+  "TARGET_VX"
+  "@
+   vlgv<bhfgq>\t%0,%v1,%Y2
+   vste<bhfgq>\t%v1,%0,%2"
+  [(set_attr "op_type" "VRS,VRX")])
+
+(define_expand "vec_init<V_HW:mode>"
+  [(match_operand:V_HW 0 "register_operand" "")
+   (match_operand:V_HW 1 "nonmemory_operand" "")]
+  "TARGET_VX"
+{
+  s390_expand_vec_init (operands[0], operands[1]);
+  DONE;
+})
+
+; Replicate from vector element
+(define_insn "*vec_splat<mode>"
+  [(set (match_operand:V_HW 0 "register_operand" "=v")
+        (vec_duplicate:V_HW
+         (vec_select:<non_vec>
+          (match_operand:V_HW 1 "register_operand" "v")
+          (parallel
+           [(match_operand:QI 2 "immediate_operand" "C")]))))]
+  "TARGET_VX"
+  "vrep<bhfgq>\t%v0,%v1,%2"
+  [(set_attr "op_type" "VRI")])
+
+(define_insn "*vec_splats<mode>"
+  [(set (match_operand:V_HW 0 "register_operand" "=v,v,v,v")
+        (vec_duplicate:V_HW (match_operand:<non_vec> 1 "general_operand" "QR,I,v,d")))]
+  "TARGET_VX"
+  "@
+   vlrep<bhfgq>\t%v0,%1
+   vrepi<bhfgq>\t%v0,%1
+   vrep<bhfgq>\t%v0,%v1,0
+   #"
+  [(set_attr "op_type" "VRX,VRI,VRI,*")])
+
+; vec_splats is supposed to replicate op1 into all elements of op0
+; This splitter first sets the rightmost element of op0 to op1 and
+; then does a vec_splat to replicate that element into all other
+; elements.
+(define_split
+  [(set (match_operand:V_HW 0 "register_operand" "")
+        (vec_duplicate:V_HW (match_operand:<non_vec> 1 "register_operand" "")))]
+  "TARGET_VX && GENERAL_REG_P (operands[1])"
+  [(set (match_dup 0)
+        (unspec:V_HW [(match_dup 1) (match_dup 2) (match_dup 0)] UNSPEC_VEC_SET))
+   (set (match_dup 0)
+        (vec_duplicate:V_HW
+         (vec_select:<non_vec>
+          (match_dup 0) (parallel [(match_dup 2)]))))]
+{
+  operands[2] = GEN_INT (GET_MODE_NUNITS (<MODE>mode) - 1);
+})
+
+(define_expand "vcond<V_HW:mode><V_HW2:mode>"
+  [(set (match_operand:V_HW 0 "register_operand" "")
+        (if_then_else:V_HW
+         (match_operator 3 "comparison_operator"
+                         [(match_operand:V_HW2 4 "register_operand" "")
+                          (match_operand:V_HW2 5 "register_operand" "")])
+         (match_operand:V_HW 1 "nonmemory_operand" "")
+         (match_operand:V_HW 2 "nonmemory_operand" "")))]
+  "TARGET_VX && GET_MODE_NUNITS (<V_HW:MODE>mode) == GET_MODE_NUNITS (<V_HW2:MODE>mode)"
+{
+  s390_expand_vcond (operands[0], operands[1], operands[2],
+                     GET_CODE (operands[3]), operands[4], operands[5]);
+  DONE;
+})
+
+(define_expand "vcondu<V_HW:mode><V_HW2:mode>"
+  [(set (match_operand:V_HW 0 "register_operand" "")
+        (if_then_else:V_HW
+         (match_operator 3 "comparison_operator"
+                         [(match_operand:V_HW2 4 "register_operand" "")
+                          (match_operand:V_HW2 5 "register_operand" "")])
+         (match_operand:V_HW 1 "nonmemory_operand" "")
+         (match_operand:V_HW 2 "nonmemory_operand" "")))]
+  "TARGET_VX && GET_MODE_NUNITS (<V_HW:MODE>mode) == GET_MODE_NUNITS (<V_HW2:MODE>mode)"
+{
+  s390_expand_vcond (operands[0], operands[1], operands[2],
+                     GET_CODE (operands[3]), operands[4], operands[5]);
+  DONE;
+})
+
+; We only have HW support for byte vectors.  The middle-end is
+; supposed to lower the mode if required.
+(define_insn "vec_permv16qi" + [(set (match_operand:V16QI 0 "register_operand" "=v") + (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v") + (match_operand:V16QI 2 "register_operand" "v") + (match_operand:V16QI 3 "register_operand" "v")] + UNSPEC_VEC_PERM))] + "TARGET_VX" + "vperm\t%v0,%v1,%v2,%v3" + [(set_attr "op_type" "VRR")]) + +; vec_perm_const for V2DI using vpdi? + +;; +;; Vector integer arithmetic instructions +;; + +; vab, vah, vaf, vag, vaq + +; We use nonimmediate_operand instead of register_operand since it is +; better to have the reloads into VRs instead of splitting the +; operation into two DImode ADDs. +(define_insn "<ti*>add<mode>3" + [(set (match_operand:VIT 0 "nonimmediate_operand" "=v") + (plus:VIT (match_operand:VIT 1 "nonimmediate_operand" "v") + (match_operand:VIT 2 "nonimmediate_operand" "v")))] + "TARGET_VX" + "va<bhfgq>\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +; vsb, vsh, vsf, vsg, vsq +(define_insn "<ti*>sub<mode>3" + [(set (match_operand:VIT 0 "nonimmediate_operand" "=v") + (minus:VIT (match_operand:VIT 1 "nonimmediate_operand" "v") + (match_operand:VIT 2 "nonimmediate_operand" "v")))] + "TARGET_VX" + "vs<bhfgq>\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +; vmlb, vmlhw, vmlf +(define_insn "mul<mode>3" + [(set (match_operand:VI_QHS 0 "register_operand" "=v") + (mult:VI_QHS (match_operand:VI_QHS 1 "register_operand" "v") + (match_operand:VI_QHS 2 "register_operand" "v")))] + "TARGET_VX" + "vml<bhfgq><w>\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +; vlcb, vlch, vlcf, vlcg +(define_insn "neg<mode>2" + [(set (match_operand:VI 0 "register_operand" "=v") + (neg:VI (match_operand:VI 1 "register_operand" "v")))] + "TARGET_VX" + "vlc<bhfgq>\t%v0,%v1" + [(set_attr "op_type" "VRR")]) + +; vlpb, vlph, vlpf, vlpg +(define_insn "abs<mode>2" + [(set (match_operand:VI 0 "register_operand" "=v") + (abs:VI (match_operand:VI 1 "register_operand" "v")))] + "TARGET_VX" + "vlp<bhfgq>\t%v0,%v1" + [(set_attr "op_type" "VRR")]) + + +; Vector sum across + +; Sum across DImode parts of the 1st operand and add the rightmost +; element of 2nd operand +; vsumgh, vsumgf +(define_insn "*vec_sum2<mode>" + [(set (match_operand:V2DI 0 "register_operand" "=v") + (unspec:V2DI [(match_operand:VI_HW_HS 1 "register_operand" "v") + (match_operand:VI_HW_HS 2 "register_operand" "v")] + UNSPEC_VEC_VSUMG))] + "TARGET_VX" + "vsumg<bhfgq>\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +; vsumb, vsumh +(define_insn "*vec_sum4<mode>" + [(set (match_operand:V4SI 0 "register_operand" "=v") + (unspec:V4SI [(match_operand:VI_HW_QH 1 "register_operand" "v") + (match_operand:VI_HW_QH 2 "register_operand" "v")] + UNSPEC_VEC_VSUM))] + "TARGET_VX" + "vsum<bhfgq>\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +;; +;; Vector bit instructions (int + fp) +;; + +; Vector and + +(define_insn "and<mode>3" + [(set (match_operand:VT 0 "register_operand" "=v") + (and:VT (match_operand:VT 1 "register_operand" "v") + (match_operand:VT 2 "register_operand" "v")))] + "TARGET_VX" + "vn\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + + +; Vector or + +(define_insn "ior<mode>3" + [(set (match_operand:VT 0 "register_operand" "=v") + (ior:VT (match_operand:VT 1 "register_operand" "v") + (match_operand:VT 2 "register_operand" "v")))] + "TARGET_VX" + "vo\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + + +; Vector xor + +(define_insn "xor<mode>3" + [(set (match_operand:VT 0 "register_operand" "=v") + (xor:VT (match_operand:VT 1 "register_operand" "v") + (match_operand:VT 2 "register_operand" "v")))] + 
"TARGET_VX" + "vx\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + + +; Bitwise inversion of a vector - used for vec_cmpne +(define_insn "*not<mode>" + [(set (match_operand:VT 0 "register_operand" "=v") + (not:VT (match_operand:VT 1 "register_operand" "v")))] + "TARGET_VX" + "vnot\t%v0,%v1" + [(set_attr "op_type" "VRR")]) + +; Vector population count + +(define_insn "popcountv16qi2" + [(set (match_operand:V16QI 0 "register_operand" "=v") + (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")] + UNSPEC_POPCNT))] + "TARGET_VX" + "vpopct\t%v0,%v1,0" + [(set_attr "op_type" "VRR")]) + +; vpopct only counts bits in byte elements. Bigger element sizes need +; to be emulated. Word and doubleword elements can use the sum across +; instructions. For halfword sized elements we do a shift of a copy +; of the result, add it to the result and extend it to halfword +; element size (unpack). + +(define_expand "popcountv8hi2" + [(set (match_dup 2) + (unspec:V16QI [(subreg:V16QI (match_operand:V8HI 1 "register_operand" "v") 0)] + UNSPEC_POPCNT)) + ; Make a copy of the result + (set (match_dup 3) (match_dup 2)) + ; Generate the shift count operand in a VR (8->byte 7) + (set (match_dup 4) (match_dup 5)) + (set (match_dup 4) (unspec:V16QI [(const_int 8) + (const_int 7) + (match_dup 4)] UNSPEC_VEC_SET)) + ; Vector shift right logical by one byte + (set (match_dup 3) + (unspec:V16QI [(match_dup 3) (match_dup 4)] UNSPEC_VEC_SRLB)) + ; Add the shifted and the original result + (set (match_dup 2) + (plus:V16QI (match_dup 2) (match_dup 3))) + ; Generate mask for the odd numbered byte elements + (set (match_dup 3) + (const_vector:V16QI [(const_int 0) (const_int 255) + (const_int 0) (const_int 255) + (const_int 0) (const_int 255) + (const_int 0) (const_int 255) + (const_int 0) (const_int 255) + (const_int 0) (const_int 255) + (const_int 0) (const_int 255) + (const_int 0) (const_int 255)])) + ; Zero out the even indexed bytes + (set (match_operand:V8HI 0 "register_operand" "=v") + (and:V8HI (subreg:V8HI (match_dup 2) 0) + (subreg:V8HI (match_dup 3) 0))) +] + "TARGET_VX" +{ + operands[2] = gen_reg_rtx (V16QImode); + operands[3] = gen_reg_rtx (V16QImode); + operands[4] = gen_reg_rtx (V16QImode); + operands[5] = CONST0_RTX (V16QImode); +}) + +(define_expand "popcountv4si2" + [(set (match_dup 2) + (unspec:V16QI [(subreg:V16QI (match_operand:V4SI 1 "register_operand" "v") 0)] + UNSPEC_POPCNT)) + (set (match_operand:V4SI 0 "register_operand" "=v") + (unspec:V4SI [(match_dup 2) (match_dup 3)] + UNSPEC_VEC_VSUM))] + "TARGET_VX" +{ + operands[2] = gen_reg_rtx (V16QImode); + operands[3] = force_reg (V16QImode, CONST0_RTX (V16QImode)); +}) + +(define_expand "popcountv2di2" + [(set (match_dup 2) + (unspec:V16QI [(subreg:V16QI (match_operand:V2DI 1 "register_operand" "v") 0)] + UNSPEC_POPCNT)) + (set (match_dup 3) + (unspec:V4SI [(match_dup 2) (match_dup 4)] + UNSPEC_VEC_VSUM)) + (set (match_operand:V2DI 0 "register_operand" "=v") + (unspec:V2DI [(match_dup 3) (match_dup 5)] + UNSPEC_VEC_VSUMG))] + "TARGET_VX" +{ + operands[2] = gen_reg_rtx (V16QImode); + operands[3] = gen_reg_rtx (V4SImode); + operands[4] = force_reg (V16QImode, CONST0_RTX (V16QImode)); + operands[5] = force_reg (V4SImode, CONST0_RTX (V4SImode)); +}) + +; Count leading zeros +(define_insn "clz<mode>2" + [(set (match_operand:V 0 "register_operand" "=v") + (clz:V (match_operand:V 1 "register_operand" "v")))] + "TARGET_VX" + "vclz<bhfgq>\t%v0,%v1" + [(set_attr "op_type" "VRR")]) + +; Count trailing zeros +(define_insn "ctz<mode>2" + [(set (match_operand:V 0 
"register_operand" "=v") + (ctz:V (match_operand:V 1 "register_operand" "v")))] + "TARGET_VX" + "vctz<bhfgq>\t%v0,%v1" + [(set_attr "op_type" "VRR")]) + + +; Vector rotate instructions + +; Each vector element rotated by a scalar +; verllb, verllh, verllf, verllg +(define_insn "rotl<mode>3" + [(set (match_operand:VI 0 "register_operand" "=v") + (rotate:VI (match_operand:VI 1 "register_operand" "v") + (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))] + "TARGET_VX" + "verll<bhfgq>\t%v0,%v1,%Y2" + [(set_attr "op_type" "VRS")]) + +; Each vector element rotated by the corresponding vector element +; verllvb, verllvh, verllvf, verllvg +(define_insn "vrotl<mode>3" + [(set (match_operand:VI 0 "register_operand" "=v") + (rotate:VI (match_operand:VI 1 "register_operand" "v") + (match_operand:VI 2 "register_operand" "v")))] + "TARGET_VX" + "verllv<bhfgq>\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + + +; Shift each element by scalar value + +; veslb, veslh, veslf, veslg +(define_insn "ashl<mode>3" + [(set (match_operand:VI 0 "register_operand" "=v") + (ashift:VI (match_operand:VI 1 "register_operand" "v") + (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))] + "TARGET_VX" + "vesl<bhfgq>\t%v0,%v1,%Y2" + [(set_attr "op_type" "VRS")]) + +; vesrab, vesrah, vesraf, vesrag +(define_insn "ashr<mode>3" + [(set (match_operand:VI 0 "register_operand" "=v") + (ashiftrt:VI (match_operand:VI 1 "register_operand" "v") + (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))] + "TARGET_VX" + "vesra<bhfgq>\t%v0,%v1,%Y2" + [(set_attr "op_type" "VRS")]) + +; vesrlb, vesrlh, vesrlf, vesrlg +(define_insn "lshr<mode>3" + [(set (match_operand:VI 0 "register_operand" "=v") + (lshiftrt:VI (match_operand:VI 1 "register_operand" "v") + (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))] + "TARGET_VX" + "vesrl<bhfgq>\t%v0,%v1,%Y2" + [(set_attr "op_type" "VRS")]) + + +; Shift each element by corresponding vector element + +; veslvb, veslvh, veslvf, veslvg +(define_insn "vashl<mode>3" + [(set (match_operand:VI 0 "register_operand" "=v") + (ashift:VI (match_operand:VI 1 "register_operand" "v") + (match_operand:VI 2 "register_operand" "v")))] + "TARGET_VX" + "veslv<bhfgq>\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +; vesravb, vesravh, vesravf, vesravg +(define_insn "vashr<mode>3" + [(set (match_operand:VI 0 "register_operand" "=v") + (ashiftrt:VI (match_operand:VI 1 "register_operand" "v") + (match_operand:VI 2 "register_operand" "v")))] + "TARGET_VX" + "vesrav<bhfgq>\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +; vesrlvb, vesrlvh, vesrlvf, vesrlvg +(define_insn "vlshr<mode>3" + [(set (match_operand:VI 0 "register_operand" "=v") + (lshiftrt:VI (match_operand:VI 1 "register_operand" "v") + (match_operand:VI 2 "register_operand" "v")))] + "TARGET_VX" + "vesrlv<bhfgq>\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +; Vector shift right logical by byte + +; Pattern used by e.g. 
+(define_insn "*vec_srb<mode>"
+  [(set (match_operand:V_HW 0 "register_operand" "=v")
+        (unspec:V_HW [(match_operand:V_HW 1 "register_operand" "v")
+                      (match_operand:<tointvec> 2 "register_operand" "v")]
+                     UNSPEC_VEC_SRLB))]
+  "TARGET_VX"
+  "vsrlb\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+
+; vmnb, vmnh, vmnf, vmng
+(define_insn "smin<mode>3"
+  [(set (match_operand:VI 0 "register_operand" "=v")
+        (smin:VI (match_operand:VI 1 "register_operand" "v")
+                 (match_operand:VI 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vmn<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmxb, vmxh, vmxf, vmxg
+(define_insn "smax<mode>3"
+  [(set (match_operand:VI 0 "register_operand" "=v")
+        (smax:VI (match_operand:VI 1 "register_operand" "v")
+                 (match_operand:VI 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vmx<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmnlb, vmnlh, vmnlf, vmnlg
+(define_insn "umin<mode>3"
+  [(set (match_operand:VI 0 "register_operand" "=v")
+        (umin:VI (match_operand:VI 1 "register_operand" "v")
+                 (match_operand:VI 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vmnl<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmxlb, vmxlh, vmxlf, vmxlg
+(define_insn "umax<mode>3"
+  [(set (match_operand:VI 0 "register_operand" "=v")
+        (umax:VI (match_operand:VI 1 "register_operand" "v")
+                 (match_operand:VI 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vmxl<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmeb, vmeh, vmef
+(define_insn "vec_widen_smult_even_<mode>"
+  [(set (match_operand:<vec_double> 0 "register_operand" "=v")
+        (unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand" "v")
+                              (match_operand:VI_QHS 2 "register_operand" "v")]
+                             UNSPEC_VEC_SMULT_EVEN))]
+  "TARGET_VX"
+  "vme<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmleb, vmleh, vmlef
+(define_insn "vec_widen_umult_even_<mode>"
+  [(set (match_operand:<vec_double> 0 "register_operand" "=v")
+        (unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand" "v")
+                              (match_operand:VI_QHS 2 "register_operand" "v")]
+                             UNSPEC_VEC_UMULT_EVEN))]
+  "TARGET_VX"
+  "vmle<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmob, vmoh, vmof
+(define_insn "vec_widen_smult_odd_<mode>"
+  [(set (match_operand:<vec_double> 0 "register_operand" "=v")
+        (unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand" "v")
+                              (match_operand:VI_QHS 2 "register_operand" "v")]
+                             UNSPEC_VEC_SMULT_ODD))]
+  "TARGET_VX"
+  "vmo<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmlob, vmloh, vmlof
+(define_insn "vec_widen_umult_odd_<mode>"
+  [(set (match_operand:<vec_double> 0 "register_operand" "=v")
+        (unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand" "v")
+                              (match_operand:VI_QHS 2 "register_operand" "v")]
+                             UNSPEC_VEC_UMULT_ODD))]
+  "TARGET_VX"
+  "vmlo<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vec_widen_umult_hi
+; vec_widen_umult_lo
+; vec_widen_smult_hi
+; vec_widen_smult_lo
+
+; vec_widen_ushiftl_hi
+; vec_widen_ushiftl_lo
+; vec_widen_sshiftl_hi
+; vec_widen_sshiftl_lo
+
+;;
+;; Vector floating point arithmetic instructions
+;;
+
+(define_insn "addv2df3"
+  [(set (match_operand:V2DF 0 "register_operand" "=v")
+        (plus:V2DF (match_operand:V2DF 1 "register_operand" "v")
+                   (match_operand:V2DF 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vfadb\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "subv2df3"
+  [(set (match_operand:V2DF 0 "register_operand" "=v")
+        (minus:V2DF (match_operand:V2DF 1 "register_operand" "v")
+                    (match_operand:V2DF 2 "register_operand" "v")))]
"register_operand" "v")))] + "TARGET_VX" + "vfsdb\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +(define_insn "mulv2df3" + [(set (match_operand:V2DF 0 "register_operand" "=v") + (mult:V2DF (match_operand:V2DF 1 "register_operand" "v") + (match_operand:V2DF 2 "register_operand" "v")))] + "TARGET_VX" + "vfmdb\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +(define_insn "divv2df3" + [(set (match_operand:V2DF 0 "register_operand" "=v") + (div:V2DF (match_operand:V2DF 1 "register_operand" "v") + (match_operand:V2DF 2 "register_operand" "v")))] + "TARGET_VX" + "vfddb\t%v0,%v1,%v2" + [(set_attr "op_type" "VRR")]) + +(define_insn "sqrtv2df2" + [(set (match_operand:V2DF 0 "register_operand" "=v") + (sqrt:V2DF (match_operand:V2DF 1 "register_operand" "v")))] + "TARGET_VX" + "vfsqdb\t%v0,%v1" + [(set_attr "op_type" "VRR")]) + +(define_insn "fmav2df4" + [(set (match_operand:V2DF 0 "register_operand" "=v") + (fma:V2DF (match_operand:V2DF 1 "register_operand" "v") + (match_operand:V2DF 2 "register_operand" "v") + (match_operand:V2DF 3 "register_operand" "v")))] + "TARGET_VX" + "vfmadb\t%v0,%v1,%v2,%v3" + [(set_attr "op_type" "VRR")]) + +(define_insn "fmsv2df4" + [(set (match_operand:V2DF 0 "register_operand" "=v") + (fma:V2DF (match_operand:V2DF 1 "register_operand" "v") + (match_operand:V2DF 2 "register_operand" "v") + (neg:V2DF (match_operand:V2DF 3 "register_operand" "v"))))] + "TARGET_VX" + "vfmsdb\t%v0,%v1,%v2,%v3" + [(set_attr "op_type" "VRR")]) + +(define_insn "negv2df2" + [(set (match_operand:V2DF 0 "register_operand" "=v") + (neg:V2DF (match_operand:V2DF 1 "register_operand" "v")))] + "TARGET_VX" + "vflcdb\t%v0,%v1" + [(set_attr "op_type" "VRR")]) + +(define_insn "absv2df2" + [(set (match_operand:V2DF 0 "register_operand" "=v") + (abs:V2DF (match_operand:V2DF 1 "register_operand" "v")))] + "TARGET_VX" + "vflpdb\t%v0,%v1" + [(set_attr "op_type" "VRR")]) + +(define_insn "*negabsv2df2" + [(set (match_operand:V2DF 0 "register_operand" "=v") + (neg:V2DF (abs:V2DF (match_operand:V2DF 1 "register_operand" "v"))))] + "TARGET_VX" + "vflndb\t%v0,%v1" + [(set_attr "op_type" "VRR")]) + +; Emulate with compare + select +(define_insn_and_split "smaxv2df3" + [(set (match_operand:V2DF 0 "register_operand" "=v") + (smax:V2DF (match_operand:V2DF 1 "register_operand" "v") + (match_operand:V2DF 2 "register_operand" "v")))] + "TARGET_VX" + "#" + "" + [(set (match_dup 3) + (gt:V2DI (match_dup 1) (match_dup 2))) + (set (match_dup 0) + (if_then_else:V2DF + (eq (match_dup 3) (match_dup 4)) + (match_dup 2) + (match_dup 1)))] +{ + operands[3] = gen_reg_rtx (V2DImode); + operands[4] = CONST0_RTX (V2DImode); +}) + +; Emulate with compare + select +(define_insn_and_split "sminv2df3" + [(set (match_operand:V2DF 0 "register_operand" "=v") + (smin:V2DF (match_operand:V2DF 1 "register_operand" "v") + (match_operand:V2DF 2 "register_operand" "v")))] + "TARGET_VX" + "#" + "" + [(set (match_dup 3) + (gt:V2DI (match_dup 1) (match_dup 2))) + (set (match_dup 0) + (if_then_else:V2DF + (eq (match_dup 3) (match_dup 4)) + (match_dup 1) + (match_dup 2)))] +{ + operands[3] = gen_reg_rtx (V2DImode); + operands[4] = CONST0_RTX (V2DImode); +}) + + +;; +;; Integer compares +;; + +(define_insn "*vec_cmp<VICMP_HW_OP:code><VI:mode>_nocc" + [(set (match_operand:VI 2 "register_operand" "=v") + (VICMP_HW_OP:VI (match_operand:VI 0 "register_operand" "v") + (match_operand:VI 1 "register_operand" "v")))] + "TARGET_VX" + "vc<VICMP_HW_OP:insn_cmp_op><VI:bhfgq>\t%v2,%v0,%v1" + [(set_attr "op_type" "VRR")]) + + +;; +;; Floating point compares +;; + +; 
+(define_insn "*vec_cmp<VFCMP_HW_OP:code>v2df_nocc"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+        (VFCMP_HW_OP:V2DI (match_operand:V2DF 1 "register_operand" "v")
+                          (match_operand:V2DF 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vfc<VFCMP_HW_OP:asm_fcmp_op>db\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; Expanders for not directly supported comparisons
+
+; UNEQ a u== b -> !(a > b | b > a)
+(define_expand "vec_cmpuneqv2df"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+        (gt:V2DI (match_operand:V2DF 1 "register_operand" "v")
+                 (match_operand:V2DF 2 "register_operand" "v")))
+   (set (match_dup 3)
+        (gt:V2DI (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))
+   (set (match_dup 0) (not:V2DI (match_dup 0)))]
+  "TARGET_VX"
+{
+  operands[3] = gen_reg_rtx (V2DImode);
+})
+
+; LTGT a <> b -> a > b | b > a
+(define_expand "vec_cmpltgtv2df"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+        (gt:V2DI (match_operand:V2DF 1 "register_operand" "v")
+                 (match_operand:V2DF 2 "register_operand" "v")))
+   (set (match_dup 3) (gt:V2DI (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))]
+  "TARGET_VX"
+{
+  operands[3] = gen_reg_rtx (V2DImode);
+})
+
+; ORDERED (a, b): a >= b | b > a
+(define_expand "vec_orderedv2df"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+        (ge:V2DI (match_operand:V2DF 1 "register_operand" "v")
+                 (match_operand:V2DF 2 "register_operand" "v")))
+   (set (match_dup 3) (gt:V2DI (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))]
+  "TARGET_VX"
+{
+  operands[3] = gen_reg_rtx (V2DImode);
+})
+
+; UNORDERED (a, b): !ORDERED (a, b)
+(define_expand "vec_unorderedv2df"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+        (ge:V2DI (match_operand:V2DF 1 "register_operand" "v")
+                 (match_operand:V2DF 2 "register_operand" "v")))
+   (set (match_dup 3) (gt:V2DI (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))
+   (set (match_dup 0) (not:V2DI (match_dup 0)))]
+  "TARGET_VX"
+{
+  operands[3] = gen_reg_rtx (V2DImode);
+})
+
+(define_insn "*vec_load_pairv2di"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+        (vec_concat:V2DI (match_operand:DI 1 "register_operand" "d")
+                         (match_operand:DI 2 "register_operand" "d")))]
+  "TARGET_VX"
+  "vlvgp\t%v0,%1,%2"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "vllv16qi"
+  [(set (match_operand:V16QI 0 "register_operand" "=v")
+        (unspec:V16QI [(match_operand:SI 1 "register_operand" "d")
+                       (match_operand:BLK 2 "memory_operand" "Q")]
+                      UNSPEC_VEC_LOAD_LEN))]
+  "TARGET_VX"
+  "vll\t%v0,%1,%2"
+  [(set_attr "op_type" "VRS")])
+
+; vfenebs, vfenehs, vfenefs
+; vfenezbs, vfenezhs, vfenezfs
+(define_insn "vec_vfenes<mode>"
+  [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
+        (unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "v")
+                           (match_operand:VI_HW_QHS 2 "register_operand" "v")
+                           (match_operand:QI 3 "immediate_operand" "C")]
+                          UNSPEC_VEC_VFENE))
+   (set (reg:CCRAW CC_REGNUM)
+        (unspec:CCRAW [(match_dup 1)
+                       (match_dup 2)
+                       (match_dup 3)]
+                      UNSPEC_VEC_VFENECC))]
+  "TARGET_VX"
+{
+  unsigned HOST_WIDE_INT flags = INTVAL (operands[3]);
+
+  gcc_assert (!(flags & ~(VSTRING_FLAG_ZS | VSTRING_FLAG_CS)));
+  flags &= ~VSTRING_FLAG_CS;
+
+  if (flags == VSTRING_FLAG_ZS)
+    return "vfenez<bhfgq>s\t%v0,%v1,%v2";
+  return "vfene<bhfgq>s\t%v0,%v1,%v2";
+}
+  [(set_attr "op_type" "VRR")])
+
+
+; Vector select
+
+; The following splitters simplify vec_sel for constant 0 or -1
constant 0 or -1 +; selection sources. This is required to generate efficient code for +; vcond. + +; a = b == c; +(define_split + [(set (match_operand:V 0 "register_operand" "") + (if_then_else:V + (eq (match_operand:<tointvec> 3 "register_operand" "") + (match_operand:V 4 "const0_operand" "")) + (match_operand:V 1 "const0_operand" "") + (match_operand:V 2 "constm1_operand" "")))] + "TARGET_VX" + [(set (match_dup 0) (match_dup 3))] +{ + PUT_MODE (operands[3], <V:MODE>mode); +}) + +; a = ~(b == c) +(define_split + [(set (match_operand:V 0 "register_operand" "") + (if_then_else:V + (eq (match_operand:<tointvec> 3 "register_operand" "") + (match_operand:V 4 "const0_operand" "")) + (match_operand:V 1 "constm1_operand" "") + (match_operand:V 2 "const0_operand" "")))] + "TARGET_VX" + [(set (match_dup 0) (not:V (match_dup 3)))] +{ + PUT_MODE (operands[3], <V:MODE>mode); +}) + +; a = b != c +(define_split + [(set (match_operand:V 0 "register_operand" "") + (if_then_else:V + (ne (match_operand:<tointvec> 3 "register_operand" "") + (match_operand:V 4 "const0_operand" "")) + (match_operand:V 1 "constm1_operand" "") + (match_operand:V 2 "const0_operand" "")))] + "TARGET_VX" + [(set (match_dup 0) (match_dup 3))] +{ + PUT_MODE (operands[3], <V:MODE>mode); +}) + +; a = ~(b != c) +(define_split + [(set (match_operand:V 0 "register_operand" "") + (if_then_else:V + (ne (match_operand:<tointvec> 3 "register_operand" "") + (match_operand:V 4 "const0_operand" "")) + (match_operand:V 1 "const0_operand" "") + (match_operand:V 2 "constm1_operand" "")))] + "TARGET_VX" + [(set (match_dup 0) (not:V (match_dup 3)))] +{ + PUT_MODE (operands[3], <V:MODE>mode); +}) + +; op0 = op3 == 0 ? op1 : op2 +(define_insn "*vec_sel0<mode>" + [(set (match_operand:V 0 "register_operand" "=v") + (if_then_else:V + (eq (match_operand:<tointvec> 3 "register_operand" "v") + (match_operand:<tointvec> 4 "const0_operand" "")) + (match_operand:V 1 "register_operand" "v") + (match_operand:V 2 "register_operand" "v")))] + "TARGET_VX" + "vsel\t%v0,%2,%1,%3" + [(set_attr "op_type" "VRR")]) + +; op0 = !op3 == 0 ? op1 : op2 +(define_insn "*vec_sel0<mode>" + [(set (match_operand:V 0 "register_operand" "=v") + (if_then_else:V + (eq (not:<tointvec> (match_operand:<tointvec> 3 "register_operand" "v")) + (match_operand:<tointvec> 4 "const0_operand" "")) + (match_operand:V 1 "register_operand" "v") + (match_operand:V 2 "register_operand" "v")))] + "TARGET_VX" + "vsel\t%v0,%1,%2,%3" + [(set_attr "op_type" "VRR")]) + +; op0 = op3 == -1 ? op1 : op2 +(define_insn "*vec_sel1<mode>" + [(set (match_operand:V 0 "register_operand" "=v") + (if_then_else:V + (eq (match_operand:<tointvec> 3 "register_operand" "v") + (match_operand:<tointvec> 4 "constm1_operand" "")) + (match_operand:V 1 "register_operand" "v") + (match_operand:V 2 "register_operand" "v")))] + "TARGET_VX" + "vsel\t%v0,%1,%2,%3" + [(set_attr "op_type" "VRR")]) + +; op0 = !op3 == -1 ? 
+(define_insn "*vec_sel1<mode>"
+  [(set (match_operand:V 0 "register_operand" "=v")
+        (if_then_else:V
+         (eq (not:<tointvec> (match_operand:<tointvec> 3 "register_operand" "v"))
+             (match_operand:<tointvec> 4 "constm1_operand" ""))
+         (match_operand:V 1 "register_operand" "v")
+         (match_operand:V 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vsel\t%v0,%2,%1,%3"
+  [(set_attr "op_type" "VRR")])
+
+
+
+; reduc_smin
+; reduc_smax
+; reduc_umin
+; reduc_umax
+
+; vec_shl vrep + vsl
+; vec_shr
+
+; vec_pack_trunc
+; vec_pack_ssat
+; vec_pack_usat
+; vec_pack_sfix_trunc
+; vec_pack_ufix_trunc
+; vec_unpacks_hi
+; vec_unpacks_low
+; vec_unpacku_hi
+; vec_unpacku_low
+; vec_unpacks_float_hi
+; vec_unpacks_float_lo
+; vec_unpacku_float_hi
+; vec_unpacku_float_lo
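
[Editorial note: a scalar C model of the byte-shift-and-add trick used by the popcountv8hi2 expander above, for a single halfword element. This sketch is not from the commit; the function name is made up for illustration.]

  #include <stdint.h>

  /* Per-byte popcounts as vpopct produces them (high byte first in the
     big-endian lane), then a one-byte logical right shift plus add
     (vsrlb + va) folds the high byte's count onto the low byte;
     masking with 0x00ff (vn against the 0,255,... constant vector)
     keeps only the odd-indexed byte, which now holds the halfword's
     full bit count.  */
  static uint16_t
  popcount_hi_model (uint16_t x)
  {
    uint16_t pb = (uint16_t) ((__builtin_popcount (x >> 8) << 8)
                              | __builtin_popcount (x & 0xff));
    pb += pb >> 8;       /* low byte now holds hi+lo (at most 16, no carry) */
    return pb & 0x00ff;  /* zero the even-indexed byte */
  }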