From cbf2e4d4f12f9b2be0caacb612bb519a6bec7c06 Mon Sep 17 00:00:00 2001
From: Harsha Jagasia
Date: Wed, 30 Sep 2009 00:00:45 +0000
Subject: config.gcc (i[34567]86-*-*): Include fma4intrin.h.

2009-09-29 Harsha Jagasia

	* config.gcc (i[34567]86-*-*): Include fma4intrin.h.
	(x86_64-*-*): Ditto.
	* config/i386/fma4intrin.h: New file, provide common x86 compiler
	intrinsics for FMA4.
	* config/i386/cpuid.h (bit_FMA4): Define FMA4 bit.
	* config/i386/x86intrin.h: Fix typo to SSE4A instead of SSE4a.
	Add FMA4 check and fma4intrin.h.
	* config/i386/i386-c.c (ix86_target_macros_internal): Check
	ISA_FLAG for FMA4.
	* config/i386/i386.h (TARGET_FMA4): New macro for FMA4.
	* config/i386/i386.md (UNSPEC_FMA4_INTRINSIC): Add new UNSPEC
	constant for FMA4 support.
	(UNSPEC_FMA4_FMADDSUB): Ditto.
	(UNSPEC_FMA4_FMSUBADD): Ditto.
	* config/i386/i386.opt (-mfma4): New switch for FMA4 support.
	* config/i386/i386-protos.h (ix86_fma4_valid_op_p): Add declaration.
	(ix86_expand_fma4_multiple_memory): Ditto.
	* config/i386/i386.c (OPTION_MASK_ISA_FMA4_SET): New.
	(OPTION_MASK_ISA_FMA4_UNSET): New.
	(OPTION_MASK_ISA_SSE4A_UNSET): Change definition to depend on FMA4.
	(OPTION_MASK_ISA_AVX_UNSET): Change definition to depend on FMA4.
	(ix86_handle_option): Handle -mfma4.
	(isa_opts): Handle -mfma4.
	(enum pta_flags): Add PTA_FMA4.
	(override_options): Add FMA4 support.
	(IX86_BUILTIN_VFMADDSS): New for FMA4 intrinsic.
	(IX86_BUILTIN_VFMADDSD): Ditto.
	(IX86_BUILTIN_VFMADDPS): Ditto.
	(IX86_BUILTIN_VFMADDPD): Ditto.
	(IX86_BUILTIN_VFMSUBSS): Ditto.
	(IX86_BUILTIN_VFMSUBSD): Ditto.
	(IX86_BUILTIN_VFMSUBPS): Ditto.
	(IX86_BUILTIN_VFMSUBPD): Ditto.
	(IX86_BUILTIN_VFMADDSUBPS): Ditto.
	(IX86_BUILTIN_VFMADDSUBPD): Ditto.
	(IX86_BUILTIN_VFMSUBADDPS): Ditto.
	(IX86_BUILTIN_VFMSUBADDPD): Ditto.
	(IX86_BUILTIN_VFNMADDSS): Ditto.
	(IX86_BUILTIN_VFNMADDSD): Ditto.
	(IX86_BUILTIN_VFNMADDPS): Ditto.
	(IX86_BUILTIN_VFNMADDPD): Ditto.
	(IX86_BUILTIN_VFNMSUBSS): Ditto.
	(IX86_BUILTIN_VFNMSUBSD): Ditto.
	(IX86_BUILTIN_VFNMSUBPS): Ditto.
	(IX86_BUILTIN_VFNMSUBPD): Ditto.
	(IX86_BUILTIN_VFMADDPS256): Ditto.
	(IX86_BUILTIN_VFMADDPD256): Ditto.
	(IX86_BUILTIN_VFMSUBPS256): Ditto.
	(IX86_BUILTIN_VFMSUBPD256): Ditto.
	(IX86_BUILTIN_VFMADDSUBPS256): Ditto.
	(IX86_BUILTIN_VFMADDSUBPD256): Ditto.
	(IX86_BUILTIN_VFMSUBADDPS256): Ditto.
	(IX86_BUILTIN_VFMSUBADDPD256): Ditto.
	(IX86_BUILTIN_VFNMADDPS256): Ditto.
	(IX86_BUILTIN_VFNMADDPD256): Ditto.
	(IX86_BUILTIN_VFNMSUBPS256): Ditto.
	(IX86_BUILTIN_VFNMSUBPD256): Ditto.
	(enum multi_arg_type): New enum for describing the various FMA4
	intrinsic argument types.
	(bdesc_multi_arg): New table for FMA4 intrinsics.
	(ix86_init_mmx_sse_builtins): Add FMA4 intrinsic support.
	(ix86_expand_multi_arg_builtin): New function for creating FMA4
	intrinsics.
	(ix86_expand_builtin): Add FMA4 intrinsic support.
	(ix86_fma4_valid_op_p): New function to validate FMA4 3 and 4
	operand instructions.
	(ix86_expand_fma4_multiple_memory): New function to split the second
	memory reference from FMA4 instructions.
	* config/i386/sse.md (ssemodesuffixf4): New mode attribute for FMA4.
	(ssemodesuffixf2s): Ditto.
	(fma4_fmadd4): Add FMA4 floating point multiply/add instructions.
	(fma4_fmsub4): Ditto.
	(fma4_fnmadd4): Ditto.
	(fma4_fnmsub4): Ditto.
	(fma4_vmfmadd4): Ditto.
	(fma4_vmfmsub4): Ditto.
	(fma4_vmfnmadd4): Ditto.
	(fma4_vmfnmsub4): Ditto.
	(fma4_fmadd4256): Ditto.
	(fma4_fmsub4256): Ditto.
	(fma4_fnmadd4256): Ditto.
	(fma4_fnmsub4256): Ditto.
	(fma4_fmaddsubv8sf4): Ditto.
	(fma4_fmaddsubv4sf4): Ditto.
	(fma4_fmaddsubv4df4): Ditto.
	(fma4_fmaddsubv2df4): Ditto.
	(fma4_fmsubaddv8sf4): Ditto.
	(fma4_fmsubaddv4sf4): Ditto.
	(fma4_fmsubaddv4df4): Ditto.
	(fma4_fmsubaddv2df4): Ditto.
	(fma4i_fmadd4): Add FMA4 floating point multiply/add instructions
	for intrinsics.
	(fma4i_fmsub4): Ditto.
	(fma4i_fnmadd4): Ditto.
	(fma4i_fnmsub4): Ditto.
	(fma4i_vmfmadd4): Ditto.
	(fma4i_vmfmsub4): Ditto.
	(fma4i_vmfnmadd4): Ditto.
	(fma4i_vmfnmsub4): Ditto.
	(fma4i_fmadd4256): Ditto.
	(fma4i_fmsub4256): Ditto.
	(fma4i_fnmadd4256): Ditto.
	(fma4i_fnmsub4256): Ditto.
	(fma4i_fmaddsubv8sf4): Ditto.
	(fma4i_fmaddsubv4sf4): Ditto.
	(fma4i_fmaddsubv4df4): Ditto.
	(fma4i_fmaddsubv2df4): Ditto.
	(fma4i_fmsubaddv8sf4): Ditto.
	(fma4i_fmsubaddv4sf4): Ditto.
	(fma4i_fmsubaddv4df4): Ditto.
	(fma4i_fmsubaddv2df4): Ditto.
	* doc/invoke.texi (-mfma4): Add documentation.
	* doc/extend.texi (x86 intrinsics): Add FMA4 intrinsics.
	* gcc.target/i386/fma4-check.h
	* gcc.target/i386/fma4-fma.c
	* gcc.target/i386/fma4-maccXX.c
	* gcc.target/i386/fma4-msubXX.c
	* gcc.target/i386/fma4-nmaccXX.c
	* gcc.target/i386/fma4-nmsubXX.c
	* gcc.target/i386/fma4-vector.c
	* gcc.target/i386/fma4-256-maccXX.c
	* gcc.target/i386/fma4-256-msubXX.c
	* gcc.target/i386/fma4-256-nmaccXX.c
	* gcc.target/i386/fma4-256-nmsubXX.c
	* gcc.target/i386/fma4-256-vector.c
	* gcc.target/i386/funcspec-2.c: New file.
	* gcc.target/i386/funcspec-4.c: Test error conditions
	related to FMA4.
	* gcc.target/i386/funcspec-5.c
	* gcc.target/i386/funcspec-6.c
	* gcc.target/i386/funcspec-8.c: Add FMA4.
	* gcc.target/i386/funcspec-9.c: New file.
	* gcc.target/i386/i386.exp: Add check_effective_target_fma4.
	* gcc.target/i386/isa-10.c
	* gcc.target/i386/isa-11.c
	* gcc.target/i386/isa-12.c
	* gcc.target/i386/isa-13.c
	* gcc.target/i386/isa-2.c
	* gcc.target/i386/isa-3.c
	* gcc.target/i386/isa-4.c
	* gcc.target/i386/isa-7.c
	* gcc.target/i386/isa-8.c
	* gcc.target/i386/isa-9.c: New file.
	* gcc.target/i386/isa-14.c
	* gcc.target/i386/isa-1.c
	* gcc.target/i386/isa-5.c
	* gcc.target/i386/isa-6.c: Add FMA4.
	* gcc.target/i386/sse-12.c
	* gcc.target/i386/sse-13.c
	* gcc.target/i386/sse-14.c
	* gcc.target/i386/sse-22.c: New file.
	* g++.dg/other/i386-2.C
	* g++.dg/other/i386-3.C
	* g++.dg/other/i386-5.C
	* g++.dg/other/i386-6.C: Add -mfma4 in dg-options.

From-SVN: r152311
---
 gcc/doc/extend.texi | 45 +++++++++++++++++++++++++++++++++++++++++++++
 gcc/doc/invoke.texi |  6 ++++--
 2 files changed, 49 insertions(+), 2 deletions(-)

(limited to 'gcc/doc')

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 993863f284a..6f0955577c3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3168,6 +3168,11 @@ Enable/disable the generation of the sse4.2 instructions.
 @cindex @code{target("sse4a")} attribute
 Enable/disable the generation of the SSE4A instructions.
 
+@item fma4
+@itemx no-fma4
+@cindex @code{target("fma4")} attribute
+Enable/disable the generation of the FMA4 instructions.
+
 @item ssse3
 @itemx no-ssse3
 @cindex @code{target("ssse3")} attribute
@@ -8888,6 +8893,46 @@ v2di __builtin_ia32_insertq (v2di, v2di)
 v2di __builtin_ia32_insertqi (v2di, v2di, const unsigned int, const unsigned int)
 @end smallexample
 
+The following built-in functions are available when @option{-mfma4} is used.
+All of them generate the machine instruction that is part of the name
+with MMX registers.
+
+@smallexample
+v2df __builtin_ia32_fmaddpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmaddps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fmaddsd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmaddss (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fmsubpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmsubps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fmsubsd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmsubss (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fnmaddpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fnmaddps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fnmaddsd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fnmaddss (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fnmsubpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fnmsubps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fnmsubsd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fnmsubss (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fmaddsubpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmaddsubps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fmsubaddpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmsubaddps (v4sf, v4sf, v4sf)
+v4df __builtin_ia32_fmaddpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fmaddps256 (v8sf, v8sf, v8sf)
+v4df __builtin_ia32_fmsubpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fmsubps256 (v8sf, v8sf, v8sf)
+v4df __builtin_ia32_fnmaddpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fnmaddps256 (v8sf, v8sf, v8sf)
+v4df __builtin_ia32_fnmsubpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fnmsubps256 (v8sf, v8sf, v8sf)
+v4df __builtin_ia32_fmaddsubpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fmaddsubps256 (v8sf, v8sf, v8sf)
+v4df __builtin_ia32_fmsubaddpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fmsubaddps256 (v8sf, v8sf, v8sf)
+
+@end smallexample
+
 The following built-in functions are available when @option{-m3dnow} is used.
 All of them generate the machine instruction that is part of the name.
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4ae8a024a58..e12241c97c1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -592,7 +592,7 @@ Objective-C and Objective-C++ Dialects}.
 -mcld -mcx16 -msahf -mmovbe -mcrc32 -mrecip @gol
 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
 -maes -mpclmul @gol
--msse4a -m3dnow -mpopcnt -mabm @gol
+-msse4a -m3dnow -mpopcnt -mabm -mfma4 @gol
 -mthreads -mno-align-stringops -minline-all-stringops @gol
 -minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol
 -mpush-args -maccumulate-outgoing-args -m128bit-long-double @gol
@@ -11727,6 +11727,8 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
 @itemx -mno-pclmul
 @itemx -msse4a
 @itemx -mno-sse4a
+@itemx -mfma4
+@itemx -mno-fma4
 @itemx -m3dnow
 @itemx -mno-3dnow
 @itemx -mpopcnt
@@ -11740,7 +11742,7 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
 @opindex m3dnow
 @opindex mno-3dnow
 These switches enable or disable the use of instructions in the MMX,
-SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, SSE4A, ABM or
+SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, SSE4A, FMA4, ABM or
 3DNow!@: extended instruction sets.
 These extensions are also available as built-in functions: see
 @ref{X86 Built-in Functions}, for details of the functions enabled and
--
cgit v1.2.1
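
Reviewer note (not part of the patch): the extend.texi hunk lists the FMA4 built-ins by name only, so a scalar reference model may help when reading the tests named in the ChangeLog. The sketch below is an assumption-laden illustration, not code from this commit. The `ref_*` names are hypothetical, the operation meanings are inferred from the mnemonics (fmadd = a*b+c, fmsub = a*b-c, fnmadd = -(a*b)+c, fnmsub = -(a*b)-c), and the lane convention for fmaddsub/fmsubadd (even lanes subtract for fmaddsub, even lanes add for fmsubadd) should be checked against the AMD instruction reference. It computes in double to approximate the fused operation without calling the hardware instructions, so it runs on any target.

```c
/* Scalar reference model for the four basic FMA4 operations.
   All names are hypothetical and exist only for illustration.
   The float product is exact in double; the final conversion to float
   may round a second time, so this only approximates a true fused op. */
static float ref_fmadd(float a, float b, float c)
{
    return (float)((double)a * (double)b + (double)c);   /*  a*b + c */
}

static float ref_fmsub(float a, float b, float c)
{
    return (float)((double)a * (double)b - (double)c);   /*  a*b - c */
}

static float ref_fnmadd(float a, float b, float c)
{
    return (float)(-((double)a * (double)b) + (double)c); /* -(a*b) + c */
}

static float ref_fnmsub(float a, float b, float c)
{
    return (float)(-((double)a * (double)b) - (double)c); /* -(a*b) - c */
}

/* Assumed lane convention: even-indexed lanes subtract, odd-indexed add. */
static void ref_fmaddsub_ps(const float a[4], const float b[4],
                            const float c[4], float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (i % 2) ? ref_fmadd(a[i], b[i], c[i])
                         : ref_fmsub(a[i], b[i], c[i]);
}

/* Assumed lane convention: even-indexed lanes add, odd-indexed subtract. */
static void ref_fmsubadd_ps(const float a[4], const float b[4],
                            const float c[4], float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (i % 2) ? ref_fmsub(a[i], b[i], c[i])
                         : ref_fmadd(a[i], b[i], c[i]);
}
```

A model like this could serve as the expected-value side of a comparison against the vector built-ins on hardware that supports FMA4; it deliberately avoids `-mfma4` so it does not fault on processors without the extension.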