summaryrefslogtreecommitdiff
path: root/compiler/cmm
diff options
context:
space:
mode:
authorBen Gamari <bgamari.foss@gmail.com>2015-07-09 02:08:01 +0200
committerBen Gamari <ben@smart-cactus.org>2015-07-09 02:08:01 +0200
commitb29633f5cf310824f3e34716e9261162ced779d3 (patch)
tree5bfb753fb27c55a52c4cdc1303dee150015278a3 /compiler/cmm
parent4f9d6008c04b71fc9449b3dc10861f757539ed0f (diff)
downloadhaskell-b29633f5cf310824f3e34716e9261162ced779d3.tar.gz
Bitmap: Fix thunk explosion
Previously we would build up another `map (-N)` thunk for every word in the bitmap. Now we strictly accumulate the position and carry out a single ``map (`subtract` accum)``. `Bitmap.intsToBitmap` showed up in the profile while compiling a testcase of #7450 (namely a program containing a record type with large number of fields which derived `Read`). The culprit was `CmmBuildInfoTables.procpointSRT.bitmap`. On the testcase (with 4096 fields), the profile previously looked like, ``` total time = 307.94 secs (307943 ticks @ 1000 us, 1 processor) total alloc = 336,797,868,056 bytes (excludes profiling overheads) COST CENTRE MODULE %time %alloc lintAnnots CoreLint 17.2 25.8 procpointSRT.bitmap CmmBuildInfoTables 11.3 25.2 FloatOutwards SimplCore 7.5 1.6 flatten.lookup CmmBuildInfoTables 4.0 3.9 ... ``` After this fix it looks like, ``` total time = 256.88 secs (256876 ticks @ 1000 us, 1 processor) total alloc = 255,033,667,448 bytes (excludes profiling overheads) COST CENTRE MODULE %time %alloc lintAnnots CoreLint 20.3 34.1 FloatOutwards SimplCore 9.1 2.1 flatten.lookup CmmBuildInfoTables 4.8 5.2 pprNativeCode AsmCodeGen 3.7 4.3 simplLetUnfolding Simplify 3.6 2.2 StgCmm HscMain 3.6 2.1 ``` Signed-off-by: Ben Gamari <ben@smart-cactus.org> Test Plan: Validate Reviewers: austin, simonpj Reviewed By: simonpj Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1041 GHC Trac Issues: #7450
Diffstat (limited to 'compiler/cmm')
-rw-r--r--compiler/cmm/Bitmap.hs84
1 files changed, 64 insertions, 20 deletions
diff --git a/compiler/cmm/Bitmap.hs b/compiler/cmm/Bitmap.hs
index e7aa072063..22ec6ee238 100644
--- a/compiler/cmm/Bitmap.hs
+++ b/compiler/cmm/Bitmap.hs
@@ -1,4 +1,4 @@
-{-# LANGUAGE CPP #-}
+{-# LANGUAGE CPP, BangPatterns #-}
--
-- (c) The University of Glasgow 2003-2006
@@ -45,31 +45,75 @@ chunkToBitmap dflags chunk =
-- eg. @[0,1,3], size 4 ==> 0xb@.
--
-- The list of @Int@s /must/ be already sorted.
-intsToBitmap :: DynFlags -> Int -> [Int] -> Bitmap
-intsToBitmap dflags size slots{- must be sorted -}
- | size <= 0 = []
- | otherwise =
- (foldr (.|.) (toStgWord dflags 0) (map (toStgWord dflags 1 `shiftL`) these)) :
- intsToBitmap dflags (size - wORD_SIZE_IN_BITS dflags)
- (map (\x -> x - wORD_SIZE_IN_BITS dflags) rest)
- where (these,rest) = span (< wORD_SIZE_IN_BITS dflags) slots
+intsToBitmap :: DynFlags
+ -> Int -- ^ size in bits
+ -> [Int] -- ^ sorted indices of ones
+ -> Bitmap
+intsToBitmap dflags size = go 0
+ where
+ word_sz = wORD_SIZE_IN_BITS dflags
+ oneAt :: Int -> StgWord
+ oneAt i = toStgWord dflags 1 `shiftL` i
+
+ -- It is important that we maintain strictness here.
+ -- See Note [Strictness when building Bitmaps].
+ go :: Int -> [Int] -> Bitmap
+ go !pos slots
+ | size <= pos = []
+ | otherwise =
+ (foldr (.|.) (toStgWord dflags 0) (map (\i->oneAt (i - pos)) these)) :
+ go (pos + word_sz) rest
+ where
+ (these,rest) = span (< (pos + word_sz)) slots
-- | Make a bitmap where the slots specified are the /zeros/ in the bitmap.
-- eg. @[0,1,3], size 4 ==> 0x4@ (we leave any bits outside the size as zero,
-- just to make the bitmap easier to read).
--
-- The list of @Int@s /must/ be already sorted and duplicate-free.
-intsToReverseBitmap :: DynFlags -> Int -> [Int] -> Bitmap
-intsToReverseBitmap dflags size slots{- must be sorted -}
- | size <= 0 = []
- | otherwise =
- (foldr xor (toStgWord dflags init) (map (toStgWord dflags 1 `shiftL`) these)) :
- intsToReverseBitmap dflags (size - wORD_SIZE_IN_BITS dflags)
- (map (\x -> x - wORD_SIZE_IN_BITS dflags) rest)
- where (these,rest) = span (< wORD_SIZE_IN_BITS dflags) slots
- init
- | size >= wORD_SIZE_IN_BITS dflags = -1
- | otherwise = (1 `shiftL` size) - 1
+intsToReverseBitmap :: DynFlags
+ -> Int -- ^ size in bits
+ -> [Int] -- ^ sorted indices of zeros free of duplicates
+ -> Bitmap
+intsToReverseBitmap dflags size = go 0
+ where
+ word_sz = wORD_SIZE_IN_BITS dflags
+ oneAt :: Int -> StgWord
+ oneAt i = toStgWord dflags 1 `shiftL` i
+
+ -- It is important that we maintain strictness here.
+ -- See Note [Strictness when building Bitmaps].
+ go :: Int -> [Int] -> Bitmap
+ go !pos slots
+ | size <= pos = []
+ | otherwise =
+ (foldr xor (toStgWord dflags init) (map (\i->oneAt (i - pos)) these)) :
+ go (pos + word_sz) rest
+ where
+ (these,rest) = span (< (pos + word_sz)) slots
+ remain = size - pos
+ init
+ | remain >= word_sz = -1
+ | otherwise = (1 `shiftL` remain) - 1
+
+{-
+
+Note [Strictness when building Bitmaps]
+========================================
+
+One of the places where @Bitmap@ is used is in in building Static Reference
+Tables (SRTs) (in @CmmBuildInfoTables.procpointSRT@). In #7450 it was noticed
+that some test cases (particularly those whose C-- have large numbers of CAFs)
+produced large quantities of allocations from this function.
+
+The source traced back to 'intsToBitmap', which was lazily subtracting the word
+size from the elements of the tail of the @slots@ list and recursively invoking
+itself with the result. This resulted in large numbers of subtraction thunks
+being built up. Here we take care to avoid passing new thunks to the recursive
+call. Instead we pass the unmodified tail along with an explicit position
+accumulator, which get subtracted in the fold when we compute the Word.
+
+-}
{- |
Magic number, must agree with @BITMAP_BITS_SHIFT@ in InfoTables.h.