ghc/docs/users_guide/parallel.vsgml


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316

% both concurrent and parallel
%************************************************************************
%*                                                                      *
<sect1>Concurrent and Parallel Haskell
<label id="concurrent-and-parallel">
<p>
<nidx>Concurrent Haskell</nidx>
<nidx>Parallel Haskell</nidx>
%*                                                                      *
%************************************************************************

Concurrent and Parallel Haskell are Glasgow extensions to Haskell
which let you structure your program as a group of independent
`threads'.

Concurrent and Parallel Haskell have very different purposes.

Concurrent Haskell is for applications which have an inherent
structure of interacting, concurrent tasks (i.e. `threads').  Threads
in such programs may be <em>required</em>.  For example, if a concurrent
thread has been spawned to handle a mouse click, it isn't
optional---the user wants something done!

A Concurrent Haskell program implies multiple `threads' running within
a single Unix process on a single processor.

You will find at least one paper about Concurrent Haskell hanging off
of <url name="Simon Peyton Jones's Web page"
url="http://www.dcs.gla.ac.uk/~simonpj/">.

Parallel Haskell is about <em>speed</em>---spawning threads onto multiple
processors so that your program will run faster.  The `threads'
are always <em>advisory</em>---if the runtime system thinks it can
get the job done more quickly by sequential execution, then fine.

A Parallel Haskell program implies multiple processes running on
multiple processors, under a PVM (Parallel Virtual Machine) framework.

Parallel Haskell is still relatively new; it is more about ``research
fun'' than about ``speed.'' That will change.

Again, check Simon's Web page for publications about Parallel Haskell
(including ``GUM'', the key bits of the runtime system).

Some details about Concurrent and Parallel Haskell follow.

%************************************************************************
%*                                                                      *
<sect2>Language features specific to Concurrent Haskell
<label id="concurrent-haskell">
<p>
<nidx>Concurrent Haskell---features</nidx>
%*                                                                      *
%************************************************************************

%************************************************************************
%*                                                                      *
<sect3>The @Concurrent@ interface (recommended)
<label id="concurrent-interface">
<p>
<nidx>Concurrent interface</nidx>
%*                                                                      *
%************************************************************************

GHC provides a @Concurrent@ module, a common interface to a
collection of useful concurrency abstractions, including those
mentioned in the ``concurrent paper''.

Just add the flag @-syslib concurrent@ to your GHC command line and
put @import Concurrent@ into your modules, and away you go.  To create
a ``required thread'':

<tscreen><verb>
forkIO :: IO () -> IO ThreadId
</verb></tscreen>

where @ThreadId@ is an abstract type representing a handle to the
newly created thread.  Threads may also be killed:

<tscreen><verb>
killThread :: ThreadId -> IO ()
</verb></tscreen>

this terminates the given thread.  Any work already done by the thread
isn't lost: the computation is suspended until required by another
thread.  The memory used by the thread will be garbage collected if it
isn't referenced from anywhere else.

More generally, an arbitrary exception may be raised in any thread for
which we have a <tt/ThreadId/, with <tt/raiseInThread/:

<tscreen><verb>
raiseInThread :: ThreadId -> Exception -> IO ()
</verb></tscreen>

Actually <tt/killThread/ just raises the <tt/ThreadKilled/ exception
in the target thread, the normal action of which is to just terminate
the thread.  The target thread will stop whatever it was doing (even
if it was blocked on an <tt/MVar/ or other computation) and handle the
exception.

The <tt/ThreadId/ for the current thread can be obtained with
<tt/myThreadId/:

<tscreen><verb>
myThreadId :: IO ThreadId
</verb></tscreen>

NOTE: if you have a @ThreadId@, you essentially have a pointer to the
thread itself.  This means the thread itself can't be garbage
collected until you drop the @ThreadId@.  This misfeature will
hopefully be corrected at a later date.

The @Concurrent@ interface also provides access to ``M-Vars'', which
are <em>synchronising variables</em>.  

<nidx>synchronising variables (Glasgow extension)</nidx>
<nidx>concurrency -- synchronising variables</nidx>

@MVars@<nidx>MVars (Glasgow extension)</nidx> are rendezvous points,
mostly for concurrent threads.  They begin either empty or full, and
any attempt to read an empty @MVar@ blocks.  When an @MVar@ is
written, a single blocked thread may be freed.  Reading an @MVar@
toggles its state from full back to empty.  Therefore, any value
written to an @MVar@ may only be read once.  Multiple reads and writes
are allowed, but there must be at least one read between any two
writes. Interface:

<tscreen><verb>
newEmptyMVar :: IO (MVar a)
newMVar      :: a -> IO (MVar a)
takeMVar     :: MVar a -> IO a
putMVar      :: MVar a -> a -> IO ()
readMVar     :: MVar a -> IO a
swapMVar     :: MVar a -> a -> IO a
</verb></tscreen>

A <em>channel variable</em> (@CVar@) is a one-element channel, as
described in the paper:

<tscreen><verb>
data CVar a
newCVar :: IO (CVar a)
putCVar :: CVar a -> a -> IO ()
getCVar :: CVar a -> IO a
</verb></tscreen>

A @Channel@ is an unbounded channel:

<tscreen><verb>
data Chan a 
newChan         :: IO (Chan a)
putChan         :: Chan a -> a -> IO ()
getChan         :: Chan a -> IO a
dupChan         :: Chan a -> IO (Chan a)
unGetChan       :: Chan a -> a -> IO ()
getChanContents :: Chan a -> IO [a]
</verb></tscreen>

General and quantity semaphores:

<tscreen><verb>
data QSem
newQSem     :: Int   -> IO QSem
waitQSem    :: QSem  -> IO ()
signalQSem  :: QSem  -> IO ()

data QSemN
newQSemN    :: Int   -> IO QSemN
signalQSemN :: QSemN -> Int -> IO ()
waitQSemN   :: QSemN -> Int -> IO ()
</verb></tscreen>

Merging streams---binary and n-ary:

<tscreen><verb>
mergeIO  :: [a]   -> [a] -> IO [a]
nmergeIO :: [[a]] -> IO [a]
</verb></tscreen>

A <em>Sample variable</em> (@SampleVar@) is slightly different from a
normal @MVar@:

<itemize>
<item> Reading an empty @SampleVar@ causes the reader to block
    (same as @takeMVar@ on empty @MVar@).
<item> Reading a filled @SampleVar@ empties it and returns value.
    (same as @takeMVar@)
<item> Writing to an empty @SampleVar@ fills it with a value, and
potentially, wakes up a blocked reader  (same as for @putMVar@ on empty @MVar@).
<item> Writing to a filled @SampleVar@ overwrites the current value.
 (different from @putMVar@ on full @MVar@.)
</itemize>

<tscreen><verb>
type SampleVar a = MVar (Int, MVar a)

emptySampleVar :: SampleVar a -> IO ()
newSampleVar   :: IO (SampleVar a)
readSample     :: SampleVar a -> IO a
writeSample    :: SampleVar a -> a -> IO ()
</verb></tscreen>

Finally, there are operations to delay a concurrent thread, and to
make one wait:<nidx>delay a concurrent thread</nidx>
<nidx>wait for a file descriptor</nidx>
<tscreen><verb>
threadDelay     :: Int -> IO () -- delay rescheduling for N microseconds
threadWaitRead  :: Int -> IO () -- wait for input on specified file descriptor
threadWaitWrite :: Int -> IO () -- (read and write, respectively).
</verb></tscreen>

%************************************************************************
%*                                                                      *
<sect2>Features specific to Parallel Haskell
<nidx>Parallel Haskell---features</nidx>
<p>
%*                                                                      *
%************************************************************************

%************************************************************************
%*                                                                      *
<sect3>The @Parallel@ interface (recommended)
<nidx>Parallel interface</nidx>
<p>
%*                                                                      *
%************************************************************************

GHC provides two functions for controlling parallel execution, through
the @Parallel@ interface:

<tscreen><verb>
interface Parallel where
infixr 0 `par`
infixr 1 `seq`

par :: a -> b -> b
seq :: a -> b -> b
</verb></tscreen>

The expression @(x `par` y)@ <em>sparks</em> the evaluation of @x@
(to weak head normal form) and returns @y@.  Sparks are queued for
execution in FIFO order, but are not executed immediately.  At the
next heap allocation, the currently executing thread will yield
control to the scheduler, and the scheduler will start a new thread
(until reaching the active thread limit) for each spark which has not
already been evaluated to WHNF.

The expression @(x `seq` y)@ evaluates @x@ to weak head normal
form and then returns @y@.  The @seq@ primitive can be used to
force evaluation of an expression beyond WHNF, or to impose a desired
execution sequence for the evaluation of an expression.

For example, consider the following parallel version of our old
nemesis, @nfib@:

<tscreen><verb>
import Parallel

nfib :: Int -> Int
nfib n | n <= 1 = 1
       | otherwise = par n1 (seq n2 (n1 + n2 + 1))
                     where n1 = nfib (n-1) 
                           n2 = nfib (n-2)
</verb></tscreen>

For values of @n@ greater than 1, we use @par@ to spark a thread
to evaluate @nfib (n-1)@, and then we use @seq@ to force the
parent thread to evaluate @nfib (n-2)@ before going on to add
together these two subexpressions.  In this divide-and-conquer
approach, we only spark a new thread for one branch of the computation
(leaving the parent to evaluate the other branch).  Also, we must use
@seq@ to ensure that the parent will evaluate @n2@ <em>before</em>
@n1@ in the expression @(n1 + n2 + 1)@.  It is not sufficient to
reorder the expression as @(n2 + n1 + 1)@, because the compiler may
not generate code to evaluate the addends from left to right.

%************************************************************************
%*                                                                      *
<sect3>Underlying functions and primitives
<nidx>parallelism primitives</nidx>
<nidx>primitives for parallelism</nidx>
<p>
%*                                                                      *
%************************************************************************

The functions @par@ and @seq@ are wired into GHC, and unfold
into uses of the @par#@ and @seq#@ primitives, respectively.  If
you'd like to see this with your very own eyes, just run GHC with the
@-ddump-simpl@ option.  (Anything for a good time...)

%************************************************************************
%*                                                                      *
<sect3>Scheduling policy for concurrent/parallel threads
<nidx>Scheduling---concurrent/parallel</nidx>
<nidx>Concurrent/parallel scheduling</nidx>
<p>
%*                                                                      *
%************************************************************************

Runnable threads are scheduled in round-robin fashion.  Context
switches are signalled by the generation of new sparks or by the
expiry of a virtual timer (the timer interval is configurable with the
@-C[<num>]@<nidx>-C&lt;num&gt; RTS option (concurrent,
parallel)</nidx> RTS option).  However, a context switch doesn't
really happen until the current heap block is full.  You can't get any
faster context switching than this.

When a context switch occurs, pending sparks which have not already
been reduced to weak head normal form are turned into new threads.
However, there is a limit to the number of active threads (runnable or
blocked) which are allowed at any given time.  This limit can be
adjusted with the @-t<num>@<nidx>-t &lt;num&gt; RTS option (concurrent, parallel)</nidx>
RTS option (the default is 32).  Once the
thread limit is reached, any remaining sparks are deferred until some
of the currently active threads are completed.