path: root/compiler/nativeGen
Commit message (Author, Age, Files, Lines)
...
* Fix a popular typo in comments (Gabor Greif, 2014-02-01, 2 files, -2/+2)
|
* Disable -fregs-graph (#7679, #8657) (Simon Marlow, 2014-01-16, 1 file, -2/+4)
|
* nativeGen: Fix spelling in comment (Ben Gamari, 2014-01-07, 1 file, -1/+1)
|   Signed-off-by: Austin Seipp <austin@well-typed.com>
* Typos (Krzysztof Gogolewski, 2013-10-12, 1 file, -1/+1)
|
* Future-proof code for upcoming `array-0.5.0.0` (Herbert Valerio Riedel, 2013-10-11, 1 file, -4/+4)
|   This way CPP conditionals can be avoided for the transition period.
|
|   Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>
* Add support for prefetch with locality levels. (Austin Seipp, 2013-10-01, 5 files, -7/+56)
|   This patch adds support for several new primitive operations which support using processor-specific instructions to help guide data and cache locality decisions. We have levels ranging from [0..3].
|
|   For LLVM, we generate llvm.prefetch intrinsics at the proper locality level (similar to GCC).
|
|   For x86 we generate prefetch{NTA, t2, t1, t0} instructions. On SPARC and PowerPC, the locality levels are ignored.
|
|   This closes #8256.
|
|   Authored-by: Carter Tazio Schonwald <carter.schonwald@gmail.com>
|   Signed-off-by: Austin Seipp <austin@well-typed.com>
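The level-to-instruction mapping described above can be sketched as a small lookup. This is only an illustration, not GHC's code: `prefetch_mnemonic` is a hypothetical helper, and it assumes the instructions listed in the commit message are given in order of locality level 0 through 3, matching the convention of GCC's `__builtin_prefetch` locality argument.

```c
#include <stddef.h>

/* Hypothetical helper (not part of GHC): map a locality level in [0..3]
 * to an x86 prefetch mnemonic. The order is an assumption taken from the
 * order in which the commit message lists the instructions. */
const char *prefetch_mnemonic(int level)
{
    static const char *const names[] = {
        "prefetchnta", /* level 0: no temporal locality */
        "prefetcht2",  /* level 1 */
        "prefetcht1",  /* level 2 */
        "prefetcht0",  /* level 3: highest temporal locality */
    };

    if (level < 0 || level > 3)
        return NULL;   /* outside the supported range */
    return names[level];
}
```

A caller would treat a NULL result as "no prefetch instruction for this level".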
* Globally replace "hackage.haskell.org" with "ghc.haskell.org" (Simon Marlow, 2013-10-01, 20 files, -20/+20)
|
* Discard unreachable code in the register allocator (#7574) (Simon Marlow, 2013-09-23, 3 files, -15/+48)
|   The problem with unreachable code is that it might refer to undefined registers. This happens accidentally: a block can be orphaned by an optimisation, for example when the result of a comparison becomes known.
|
|   The register allocator panics when it finds an undefined register, because they shouldn't occur in generated code. So we need to also discard unreachable code to prevent this panic being triggered by optimisations.
|
|   The register allocator already does a strongly-connected component analysis, so it ought to be easy to make it discard unreachable code as part of that traversal. It turns out that we need a different variant of the scc algorithm to do that (see Digraph), however the new variant also generates slightly better code by putting the blocks within a loop in a better order for register allocation.
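The idea of dropping orphaned blocks can be illustrated with a plain depth-first reachability pass from the entry block. Note that this is a simpler technique than the SCC variant in Digraph that the commit refers to, and every name here (`struct cfg`, `mark_reachable`, `MAX_BLOCKS`, `MAX_SUCCS`) is hypothetical. The sketch assumes at most two successors per block, that `n_blocks` does not exceed `MAX_BLOCKS`, and that successor indices are valid block numbers.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 64
#define MAX_SUCCS  2

/* Hypothetical control-flow graph: each block has up to two successors;
 * -1 marks an unused successor slot. */
struct cfg {
    int n_blocks;
    int succ[MAX_BLOCKS][MAX_SUCCS];
};

/* Mark every block reachable from `entry` with an iterative depth-first
 * search. Blocks left unmarked are the orphans that could be discarded. */
void mark_reachable(const struct cfg *g, int entry, bool reachable[])
{
    int stack[MAX_BLOCKS];
    int top = 0;

    for (int i = 0; i < g->n_blocks; i++)
        reachable[i] = false;

    reachable[entry] = true;
    stack[top++] = entry;

    while (top > 0) {
        int b = stack[--top];
        for (int k = 0; k < MAX_SUCCS; k++) {
            int s = g->succ[b][k];
            if (s >= 0 && !reachable[s]) {
                reachable[s] = true; /* mark before pushing, so each block is pushed once */
                stack[top++] = s;
            }
        }
    }
}
```

An orphaned block may still branch into live code; it is unreachable because nothing reachable branches to it, which is why the search runs forward from the entry block.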
* SIMD primops are now generated using schemas that are polymorphic in (Geoffrey Mainland, 2013-09-22, 1 file, -0/+2)
|   width and element type.
|
|   SIMD primops are now polymorphic in vector size and element type, but only internally to the compiler. More specifically, utils/genprimopcode has been extended so that it "knows" about SIMD vectors. This allows us to, for example, write a single definition for the "add two vectors" primop in primops.txt.pp and have it instantiated at many vector types. This generates a primop in GHC.Prim for each vector type at which "add two vectors" is instantiated, but only one data constructor for the PrimOp data type, so the code generator is much, much simpler.
* Fix AMP warnings. (Austin Seipp, 2013-09-11, 3 files, -1/+27)
|   Authored-by: David Luposchainsky <dluposchainsky@gmail.com>
|   Signed-off-by: Austin Seipp <austin@well-typed.com>
* Remove dead code (Jan Stolarek, 2013-09-10, 2 files, -7/+1)
|
* Add basic support for GHCJS (Austin Seipp, 2013-09-06, 5 files, -0/+13)
|   This patch encompasses most of the basic infrastructure for GHCJS. It includes:
|
|    * A new extension, -XJavaScriptFFI
|    * A new architecture, ArchJavaScript
|    * Parser and lexer support for 'foreign import javascript', only available under -XJavaScriptFFI, using ArchJavaScript.
|    * As a knock-on, there is also a new 'WayCustom' constructor in DynFlags, so clients of the GHC API can add custom 'tags' to their built files. This should be useful for other users as well.
|
|   The remaining changes are really just the resulting fallout, making sure all the cases are handled appropriately for DynFlags and Platform.
|
|   Authored-by: Luite Stegeman <stegeman@gmail.com>
|   Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Add support for byte endian swapping for Word 16/32/64. (Austin Seipp, 2013-07-17, 6 files, -0/+39)
|    * Exposes bSwap{,16,32,64}# primops
|    * Add a new machop: MO_BSwap
|    * Use a Stg implementation (hs_bswap{16,32,64}) for the other implementations in the NCG.
|    * Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits, bswap+shr instead of using xchg.
|    * Generate llvm.bswap intrinsics in llvm codegen.
|
|   Authored-by: Vincent Hanquez <tab@snarc.org>
|   Signed-off-by: Austin Seipp <aseipp@pobox.com>
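The 16-bit case (bswap followed by shr) can be sketched in portable C. This is an illustration of the arithmetic only, not the code GHC emits: `bswap32` stands in for the x86 `bswap` instruction, and both function names are hypothetical.

```c
#include <stdint.h>

/* Portable stand-in for the 32-bit x86 bswap instruction:
 * reverse the order of the four bytes. */
uint32_t bswap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u)
         | (x << 24);
}

/* Swap the two bytes of a 16-bit value using a 32-bit byte swap followed
 * by a logical shift right by 16, as the commit message describes. */
uint16_t bswap16_via_bswap32(uint16_t x)
{
    uint32_t wide = x;                 /* zero-extend: 0x0000abcd */
    uint32_t swapped = bswap32(wide);  /* now 0xcdab0000 */
    return (uint16_t)(swapped >> 16);  /* now 0x0000cdab */
}
```

After the 32-bit swap the two interesting bytes sit in the upper half of the register, so the shift brings them back down into the low 16 bits.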
* Fix many ASSERT uses under Clang. (Austin Seipp, 2013-06-18, 2 files, -2/+2)
|   Clang doesn't like whitespace between macro and arguments.
|
|   Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Revert "Add support for byte endian swapping for Word 16/32/64." (Simon Peyton Jones, 2013-06-11, 6 files, -29/+0)
|   This reverts commit 1c5b0511a89488f5280523569d45ee61c0d09ffa.
* Add support for byte endian swapping for Word 16/32/64. (Ian Lynagh, 2013-06-09, 6 files, -0/+29)
|    * Exposes bSwap{,16,32,64}# primops
|    * Add a new machop: MO_BSwap
|    * Use a Stg implementation (hs_bswap{16,32,64}) for the other implementations in the NCG.
|    * Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits, bswap+shr instead of using xchg.
|    * Generate llvm.bswap intrinsics in llvm codegen.
|
|   Patch from Vincent Hanquez.
* Make the current module available to labelDynamic (Ian Lynagh, 2013-05-13, 3 files, -59/+78)
|   It doesn't actually use it yet.
* Use NatM_State record fields, rather than matching/constructing the whole type (Ian Lynagh, 2013-05-13, 1 file, -12/+7)
|
* Refactor cmmMakeDynamicReference (Ian Lynagh, 2013-05-13, 5 files, -20/+27)
|   It now has its own class, and the addImport function is defined in that class, rather than needing to be passed as an argument.
* Remove redundant cmmMakeDynamicReference' wrapper (Ian Lynagh, 2013-05-13, 1 file, -4/+2)
|
* No need to map over all blocks, setting up PIC. (Gabor Greif, 2013-04-13, 1 file, -12/+4)
|   Darwin x86 has an inconsistent PIC base register, so splitting (which happened before) ensures that each cmm procedure only has one entry point (namely the first block).
* Make explicit that there can be only one entry point (Gabor Greif, 2013-04-12, 1 file, -3/+3)
|   per cmm procedure on Darwin/PPC, because of splitting.
|
|   x86 should be treated the same way; I'll come back to that later.
* There can be several blocks in a PPC/ELF cmm proc (Gabor Greif, 2013-04-08, 1 file, -8/+15)
|   add FETCHPC to all of them (this fixes #7814).
* Remove tabs (M-x untabify) (Gabor Greif, 2013-04-07, 1 file, -99/+91)
|
* Fix typos (Gabor Greif, 2013-04-07, 3 files, -6/+6)
|
* Typos (Gabor Greif, 2013-04-07, 1 file, -1/+1)
|
* Detab modules with tabs on 5 lines or fewer (Ian Lynagh, 2013-04-06, 2 files, -25/+11)
|
* Fix typos (Gabor Greif, 2013-04-06, 3 files, -3/+3)
|
* Simplify away some old -dynamic-too stuff from the previous approach (Ian Lynagh, 2013-03-09, 1 file, -48/+33)
|
* x86: promote arguments to C functions according to the ABI (#7383) (Simon Marlow, 2013-02-23, 1 file, -6/+14)
|   I don't think the x86-64 version is quite right, but this ought to be enough to pass cgrun071.
|
|   This code is terrible and needs a complete refactor. There's a lot of duplication, and we ought to be specifying the ABI in a much more abstract way (like LLVM).
* allocMoreStack: we should be retargeting table jumps too. (Simon Marlow, 2013-02-11, 1 file, -3/+3)
|   Thanks to @PHO on #7498 for pointing this out.
* Fix bugs in PPC.Instr.allocMoreStack (#7498) (PHO, 2013-02-11, 1 file, -39/+85)
|   This patch is ported from #7510, which fixes the same bug in the x86 nativeGen.
* AsmCodeGen.NcgImpl.ncgMakeFarBranches should take account of info tables (#709) (PHO, 2013-02-02, 2 files, -11/+13)
|   We have to reduce the maximum number of instructions to jump over depending on the number of info tables in a proc.
* Move AsmCodeGen.makeFarBranches to PPC.Instr (#709) (PHO, 2013-02-02, 2 files, -39/+40)
|   Its implementation is totally specific to PPC.
* Add prefetch primops. (Geoffrey Mainland, 2013-02-01, 3 files, -0/+5)
|
* Add support for passing SSE vectors in registers. (Geoffrey Mainland, 2013-02-01, 1 file, -41/+47)
|   This patch adds support for 6 XMM registers on x86-64 which overlap with the F and D registers and may hold 128-bit wide SIMD vectors. Because there is not a good way to attach type information to STG registers, we aggressively bitcast in the LLVM back-end.
* Add the Int32X4# primitive type and associated primops. (Paul Monday, 2013-02-01, 1 file, -0/+18)
|
* Add the Float32X4# primitive type and associated primops. (Geoffrey Mainland, 2013-02-01, 1 file, -1/+35)
|   This patch lays the groundwork needed for primop support for SIMD vectors. In addition to the groundwork, we add support for the FloatX4# primitive type and associated primops.
|
|    * Add the FloatX4# primitive type and associated primops.
|    * Add CodeGen support for Float vectors.
|    * Compile vector operations to LLVM vector operations in the LLVM code generator.
|    * Make the x86 native backend fail gracefully when encountering vector primops.
|    * Only generate primop wrappers for vector primops when using LLVM.
* typos (Gabor Greif, 2013-01-30, 1 file, -1/+1)
|
* Merge branch 'master' of https://github.com/ghc/ghc (Johan Tibell, 2013-01-11, 1 file, -31/+24)
|\
| * Update a panic message (Ian Lynagh, 2013-01-11, 1 file, -1/+1)
| |   I don't actually know if suggesting -fllvm as a workaround is useful advice, but -fvia-C certainly won't help as it doesn't do anything any more.
| * Whitespace only in nativeGen/SPARC/Base.hs (Ian Lynagh, 2013-01-11, 1 file, -31/+24)
| |
* | Add preprocessor defines when SSE is enabled (Johan Tibell, 2013-01-10, 1 file, -10/+2)
|/
|   This will add the following preprocessor defines when Haskell source files are compiled:
|
|    * __SSE__ - If any version of SSE is enabled
|    * __SSE2__ - If SSE2 or greater is enabled
|    * __SSE4_2__ - If SSE4.2 is enabled
|
|   Note that SSE2 is enabled by default on x86-64.
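Defines like these are typically consumed with preprocessor conditionals. The sketch below shows the pattern in C, where GCC and Clang define macros with the same names (the SSE4.2 one spelled `__SSE4_2__`); the same `#if defined(...)` form would apply in a Haskell file compiled with CPP. The function name `sse_level` is hypothetical, and the result depends on the flags the file is compiled with.

```c
/* Report the highest SSE level advertised through preprocessor defines.
 * Checks run from the most specific define to the least specific, since
 * a build with SSE4.2 enabled also has __SSE2__ and __SSE__ defined. */
const char *sse_level(void)
{
#if defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#elif defined(__SSE__)
    return "SSE";
#else
    return "none";
#endif
}
```

Testing the macros in the opposite order would always stop at `__SSE__`, which is why the most capable level is checked first.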
* Fix bugs in allocMoreStack (#7498, #7510) (Simon Marlow, 2013-01-07, 3 files, -42/+99)
|   There were four bugs here. Clearly I didn't test this enough to expose the bugs - it appeared to work on x86/Linux, but completely by accident it seems.
|
|   1. the delta was wrong by a factor of the slot size (as noted on #7498)
|
|   2. we weren't correctly aligning the stack pointer (sp needs to be 16-byte aligned on x86/x86_64)
|
|   3. we were doing the adjustment multiple times in the case of a block that was both a return point and a local branch target. To fix this I had to add new shim blocks to adjust the stack pointer, and retarget the original branches. See comment for details.
|
|   4. we were doing the adjustment for CALL instructions, which is unnecessary and wrong; only JMPs should be preceded by a stack adjustment.
|
|   (Someone with a PPC box will need to update the PPC version of allocMoreStack to fix the above bugs, using the x86 version as a guide.)
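Bugs 1 and 2 come down to simple arithmetic, sketched below. This is not the GHC implementation: `round_up` and `stack_delta_bytes` are hypothetical helpers, and the sketch assumes the stack pointer is already 16-byte aligned before the adjustment, so rounding the delta itself to a multiple of 16 keeps it aligned.

```c
#include <stddef.h>

/* Round n up to the next multiple of align.
 * Assumes align is a power of two. */
size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: bytes by which to move the stack pointer to make
 * room for extra_slots spill slots of slot_size bytes each.
 * Bug 1: the delta must be in bytes, i.e. slots multiplied by slot size.
 * Bug 2: the delta is rounded up so sp stays 16-byte aligned. */
size_t stack_delta_bytes(size_t extra_slots, size_t slot_size)
{
    return round_up(extra_slots * slot_size, 16);
}
```

For example, three extra 8-byte slots need 24 bytes, which rounds up to a 32-byte adjustment.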
* Small refactoring: makes it easier to see what nativeCodeGen actually does (Ian Lynagh, 2012-12-16, 2 files, -65/+75)
|
* PPC: Implement stack resizing for the linear register allocator. (Erik de Castro Lopo, 2012-12-16, 2 files, -15/+59)
|   Fixes #7498.
* De-tab compiler/nativeGen/PPC/Instr.hs. (Erik de Castro Lopo, 2012-12-16, 1 file, -273/+266)
|
* Implement word2Float# and word2Double# (Johan Tibell, 2012-12-13, 4 files, -1/+28)
|
* Small code tidy-up (Ian Lynagh, 2012-12-12, 1 file, -8/+7)
|
* typo (Gabor Greif, 2012-12-12, 1 file, -1/+1)
|