author | Harsha Jagasia <harsha.jagasia@amd.com> | 2009-09-30 00:00:45 +0000 |
---|---|---|
committer | Harsha Jagasia <hjagasia@gcc.gnu.org> | 2009-09-30 00:00:45 +0000 |
commit | cbf2e4d4f12f9b2be0caacb612bb519a6bec7c06 (patch) | |
tree | a6738f5a8818475b5816bfd870277e6deb35e967 /gcc/doc | |
parent | f8fd49b54950059ad08d6a8cc291ec07fed15108 (diff) | |
download | gcc-cbf2e4d4f12f9b2be0caacb612bb519a6bec7c06.tar.gz |
config.gcc (i[34567]86-*-*): Include fma4intrin.h.
2009-09-29 Harsha Jagasia <harsha.jagasia@amd.com>
* config.gcc (i[34567]86-*-*): Include fma4intrin.h.
(x86_64-*-*): Ditto.
* config/i386/fma4intrin.h: New file, provide common x86 compiler
intrinsics for FMA4.
* config/i386/cpuid.h (bit_FMA4): Define FMA4 bit.
* config/i386/x86intrin.h: Fix typo to SSE4A instead of SSE4a.
Add FMA4 check and fma4intrin.h.
* config/i386/i386-c.c (ix86_target_macros_internal): Check
ISA_FLAG for FMA4.
* config/i386/i386.h (TARGET_FMA4): New macro for FMA4.
* config/i386/i386.md (UNSPEC_FMA4_INTRINSIC): Add new UNSPEC
constant for FMA4 support.
(UNSPEC_FMA4_FMADDSUB): Ditto.
(UNSPEC_FMA4_FMSUBADD): Ditto.
* config/i386/i386.opt (-mfma4): New switch for FMA4 support.
* config/i386/i386-protos.h (ix86_fma4_valid_op_p): Add
declaration.
(ix86_expand_fma4_multiple_memory): Ditto.
* config/i386/i386.c (OPTION_MASK_ISA_FMA4_SET): New.
(OPTION_MASK_ISA_FMA4_UNSET): New.
(OPTION_MASK_ISA_SSE4A_UNSET): Change definition to
depend on FMA4.
(OPTION_MASK_ISA_AVX_UNSET): Change definition to
depend on FMA4.
(ix86_handle_option): Handle -mfma4.
(isa_opts): Handle -mfma4.
(enum pta_flags): Add PTA_FMA4.
(override_options): Add FMA4 support.
(IX86_BUILTIN_VFMADDSS): New for FMA4 intrinsic.
(IX86_BUILTIN_VFMADDSD): Ditto.
(IX86_BUILTIN_VFMADDPS): Ditto.
(IX86_BUILTIN_VFMADDPD): Ditto.
(IX86_BUILTIN_VFMSUBSS): Ditto.
(IX86_BUILTIN_VFMSUBSD): Ditto.
(IX86_BUILTIN_VFMSUBPS): Ditto.
(IX86_BUILTIN_VFMSUBPD): Ditto.
(IX86_BUILTIN_VFMADDSUBPS): Ditto.
(IX86_BUILTIN_VFMADDSUBPD): Ditto.
(IX86_BUILTIN_VFMSUBADDPS): Ditto.
(IX86_BUILTIN_VFMSUBADDPD): Ditto.
(IX86_BUILTIN_VFNMADDSS): Ditto.
(IX86_BUILTIN_VFNMADDSD): Ditto.
(IX86_BUILTIN_VFNMADDPS): Ditto.
(IX86_BUILTIN_VFNMADDPD): Ditto.
(IX86_BUILTIN_VFNMSUBSS): Ditto.
(IX86_BUILTIN_VFNMSUBSD): Ditto.
(IX86_BUILTIN_VFNMSUBPS): Ditto.
(IX86_BUILTIN_VFNMSUBPD): Ditto.
(IX86_BUILTIN_VFMADDPS256): Ditto.
(IX86_BUILTIN_VFMADDPD256): Ditto.
(IX86_BUILTIN_VFMSUBPS256): Ditto.
(IX86_BUILTIN_VFMSUBPD256): Ditto.
(IX86_BUILTIN_VFMADDSUBPS256): Ditto.
(IX86_BUILTIN_VFMADDSUBPD256): Ditto.
(IX86_BUILTIN_VFMSUBADDPS256): Ditto.
(IX86_BUILTIN_VFMSUBADDPD256): Ditto.
(IX86_BUILTIN_VFNMADDPS256): Ditto.
(IX86_BUILTIN_VFNMADDPD256): Ditto.
(IX86_BUILTIN_VFNMSUBPS256): Ditto.
(IX86_BUILTIN_VFNMSUBPD256): Ditto.
(enum multi_arg_type): New enum for describing the various FMA4
intrinsic argument types.
(bdesc_multi_arg): New table for FMA4 intrinsics.
(ix86_init_mmx_sse_builtins): Add FMA4 intrinsic support.
(ix86_expand_multi_arg_builtin): New function for creating FMA4
intrinsics.
(ix86_expand_builtin): Add FMA4 intrinsic support.
(ix86_fma4_valid_op_p): New function to validate FMA4 3 and 4
operand instructions.
(ix86_expand_fma4_multiple_memory): New function to split the
second memory reference from FMA4 instructions.
* config/i386/sse.md (ssemodesuffixf4): New mode attribute for FMA4.
(ssemodesuffixf2s): Ditto.
(fma4_fmadd<mode>4): Add FMA4 floating point multiply/add
instructions.
(fma4_fmsub<mode>4): Ditto.
(fma4_fnmadd<mode>4): Ditto.
(fma4_fnmsub<mode>4): Ditto.
(fma4_vmfmadd<mode>4): Ditto.
(fma4_vmfmsub<mode>4): Ditto.
(fma4_vmfnmadd<mode>4): Ditto.
(fma4_vmfnmsub<mode>4): Ditto.
(fma4_fmadd<mode>4256): Ditto.
(fma4_fmsub<mode>4256): Ditto.
(fma4_fnmadd<mode>4256): Ditto.
(fma4_fnmsub<mode>4256): Ditto.
(fma4_fmaddsubv8sf4): Ditto.
(fma4_fmaddsubv4sf4): Ditto.
(fma4_fmaddsubv4df4): Ditto.
(fma4_fmaddsubv2df4): Ditto.
(fma4_fmsubaddv8sf4): Ditto.
(fma4_fmsubaddv4sf4): Ditto.
(fma4_fmsubaddv4df4): Ditto.
(fma4_fmsubaddv2df4): Ditto.
(fma4i_fmadd<mode>4): Add FMA4 floating point multiply/add
instructions for intrinsics.
(fma4i_fmsub<mode>4): Ditto.
(fma4i_fnmadd<mode>4): Ditto.
(fma4i_fnmsub<mode>4): Ditto.
(fma4i_vmfmadd<mode>4): Ditto.
(fma4i_vmfmsub<mode>4): Ditto.
(fma4i_vmfnmadd<mode>4): Ditto.
(fma4i_vmfnmsub<mode>4): Ditto.
(fma4i_fmadd<mode>4256): Ditto.
(fma4i_fmsub<mode>4256): Ditto.
(fma4i_fnmadd<mode>4256): Ditto.
(fma4i_fnmsub<mode>4256): Ditto.
(fma4i_fmaddsubv8sf4): Ditto.
(fma4i_fmaddsubv4sf4): Ditto.
(fma4i_fmaddsubv4df4): Ditto.
(fma4i_fmaddsubv2df4): Ditto.
(fma4i_fmsubaddv8sf4): Ditto.
(fma4i_fmsubaddv4sf4): Ditto.
(fma4i_fmsubaddv4df4): Ditto.
(fma4i_fmsubaddv2df4): Ditto.
* doc/invoke.texi (-mfma4): Add documentation.
* doc/extend.texi (x86 intrinsics): Add FMA4 intrinsics.
* gcc.target/i386/fma4-check.h
* gcc.target/i386/fma4-fma.c
* gcc.target/i386/fma4-maccXX.c
* gcc.target/i386/fma4-msubXX.c
* gcc.target/i386/fma4-nmaccXX.c
* gcc.target/i386/fma4-nmsubXX.c
* gcc.target/i386/fma4-vector.c
* gcc.target/i386/fma4-256-maccXX.c
* gcc.target/i386/fma4-256-msubXX.c
* gcc.target/i386/fma4-256-nmaccXX.c
* gcc.target/i386/fma4-256-nmsubXX.c
* gcc.target/i386/fma4-256-vector.c
* gcc.target/i386/funcspec-2.c: New file.
* gcc.target/i386/funcspec-4.c: Test error conditions
related to FMA4.
* gcc.target/i386/funcspec-5.c
* gcc.target/i386/funcspec-6.c
* gcc.target/i386/funcspec-8.c: Add FMA4.
* gcc.target/i386/funcspec-9.c: New file.
* gcc.target/i386/i386.exp: Add check_effective_target_fma4.
* gcc.target/i386/isa-10.c
* gcc.target/i386/isa-11.c
* gcc.target/i386/isa-12.c
* gcc.target/i386/isa-13.c
* gcc.target/i386/isa-2.c
* gcc.target/i386/isa-3.c
* gcc.target/i386/isa-4.c
* gcc.target/i386/isa-7.c
* gcc.target/i386/isa-8.c
* gcc.target/i386/isa-9.c: New file.
* gcc.target/i386/isa-14.c
* gcc.target/i386/isa-1.c
* gcc.target/i386/isa-5.c
* gcc.target/i386/isa-6.c: Add FMA4.
* gcc.target/i386/sse-12.c
* gcc.target/i386/sse-13.c
* gcc.target/i386/sse-14.c
* gcc.target/i386/sse-22.c: New file.
* g++.dg/other/i386-2.C
* g++.dg/other/i386-3.C
* g++.dg/other/i386-5.C
* g++.dg/other/i386-6.C: Add -mfma4 in dg-options.
From-SVN: r152311
Diffstat (limited to 'gcc/doc')
-rw-r--r-- | gcc/doc/extend.texi | 45 |
-rw-r--r-- | gcc/doc/invoke.texi | 6 |
2 files changed, 49 insertions, 2 deletions
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 993863f284a..6f0955577c3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3168,6 +3168,11 @@ Enable/disable the generation of the sse4.2 instructions.
 @cindex @code{target("sse4a")} attribute
 Enable/disable the generation of the SSE4A instructions.
 
+@item fma4
+@itemx no-fma4
+@cindex @code{target("fma4")} attribute
+Enable/disable the generation of the FMA4 instructions.
+
 @item ssse3
 @itemx no-ssse3
 @cindex @code{target("ssse3")} attribute
@@ -8888,6 +8893,46 @@ v2di __builtin_ia32_insertq (v2di, v2di)
 v2di __builtin_ia32_insertqi (v2di, v2di, const unsigned int, const unsigned int)
 @end smallexample
 
+The following built-in functions are available when @option{-mfma4} is used.
+All of them generate the machine instruction that is part of the name
+with MMX registers.
+
+@smallexample
+v2df __builtin_ia32_fmaddpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmaddps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fmaddsd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmaddss (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fmsubpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmsubps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fmsubsd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmsubss (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fnmaddpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fnmaddps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fnmaddsd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fnmaddss (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fnmsubpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fnmsubps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fnmsubsd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fnmsubss (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fmaddsubpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmaddsubps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_fmsubaddpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_fmsubaddps (v4sf, v4sf, v4sf)
+v4df __builtin_ia32_fmaddpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fmaddps256 (v8sf, v8sf, v8sf)
+v4df __builtin_ia32_fmsubpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fmsubps256 (v8sf, v8sf, v8sf)
+v4df __builtin_ia32_fnmaddpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fnmaddps256 (v8sf, v8sf, v8sf)
+v4df __builtin_ia32_fnmsubpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fnmsubps256 (v8sf, v8sf, v8sf)
+v4df __builtin_ia32_fmaddsubpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fmaddsubps256 (v8sf, v8sf, v8sf)
+v4df __builtin_ia32_fmsubaddpd256 (v4df, v4df, v4df)
+v8sf __builtin_ia32_fmsubaddps256 (v8sf, v8sf, v8sf)
+
+@end smallexample
+
 The following built-in functions are available when @option{-m3dnow} is used.
 All of them generate the machine instruction that is part of the name.
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4ae8a024a58..e12241c97c1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -592,7 +592,7 @@ Objective-C and Objective-C++ Dialects}.
 -mcld -mcx16 -msahf -mmovbe -mcrc32 -mrecip @gol
 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
 -maes -mpclmul @gol
--msse4a -m3dnow -mpopcnt -mabm @gol
+-msse4a -m3dnow -mpopcnt -mabm -mfma4 @gol
 -mthreads -mno-align-stringops -minline-all-stringops @gol
 -minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol
 -mpush-args -maccumulate-outgoing-args -m128bit-long-double @gol
@@ -11727,6 +11727,8 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
 @itemx -mno-pclmul
 @itemx -msse4a
 @itemx -mno-sse4a
+@itemx -mfma4
+@itemx -mno-fma4
 @itemx -m3dnow
 @itemx -mno-3dnow
 @itemx -mpopcnt
@@ -11740,7 +11742,7 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
 @opindex m3dnow
 @opindex mno-3dnow
 These switches enable or disable the use of instructions in the MMX,
-SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, SSE4A, ABM or
+SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, SSE4A, FMA4, ABM or
 3DNow!@: extended instruction sets.
 These extensions are also available as built-in functions: see
 @ref{X86 Built-in Functions}, for details of the functions enabled and
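The built-in names documented above encode the arithmetic each instruction performs (fmadd, fmsub, fnmadd, fnmsub, fmaddsub, fmsubadd). The following is a minimal scalar sketch of those sign conventions in portable C, not the code this patch adds. The `ref_*` function names are hypothetical and exist only for illustration. The sketch uses a plain multiply followed by an add, so it rounds twice, whereas the hardware instructions are fused and round once; it models which terms are added or subtracted, not the exact rounding. The real instructions operate on 128-bit XMM and 256-bit YMM registers and require FMA4-capable hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference models (names are NOT part of GCC).
   Unfused arithmetic: illustrates signs only, not single rounding. */
static double ref_fmadd(double a, double b, double c)  { return  (a * b) + c; }
static double ref_fmsub(double a, double b, double c)  { return  (a * b) - c; }
static double ref_fnmadd(double a, double b, double c) { return -(a * b) + c; }
static double ref_fnmsub(double a, double b, double c) { return -(a * b) - c; }

/* fmaddsub: even-indexed elements subtract c, odd-indexed elements add c. */
static void ref_fmaddsub(const double *a, const double *b, const double *c,
                         double *out, size_t n)
{
    assert(a && b && c && out);
    for (size_t i = 0; i < n; i++)
        out[i] = (i % 2 == 0) ? ref_fmsub(a[i], b[i], c[i])
                              : ref_fmadd(a[i], b[i], c[i]);
}

/* fmsubadd: the mirror image; even-indexed add c, odd-indexed subtract c. */
static void ref_fmsubadd(const double *a, const double *b, const double *c,
                         double *out, size_t n)
{
    assert(a && b && c && out);
    for (size_t i = 0; i < n; i++)
        out[i] = (i % 2 == 0) ? ref_fmadd(a[i], b[i], c[i])
                              : ref_fmsub(a[i], b[i], c[i]);
}
```

A model like this can serve as the expected-value side of a test that compares intrinsic results element by element, provided the inputs are chosen so that fused and unfused rounding agree (small integers, for example).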