summaryrefslogtreecommitdiff
path: root/compiler/GHC/StgToJS.hs
blob: e81aa1fbcf81875fd19c152915eb7df5e054a73c (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
module GHC.StgToJS
  ( stgToJS
  )
where

import GHC.StgToJS.CodeGen


-- Note [StgToJS design]
-- ~~~~~~~~~~~~~~~~~~~~~
--
-- StgToJS ("JS backend") is adapted from GHCJS [GHCJS2013].
--
-- Haskell to JavaScript
-- ~~~~~~~~~~~~~~~~~~~~~
-- StgToJS converts STG into a JavaScript AST (in GHC.JS) that has been adapted
-- from JMacro [JMacro].
--
-- Tail calls: translated code is tail call optimized through a trampoline,
-- since JavaScript implementations don't always support tail calls.
--  TODO: add GHCJS optimizer for this to be true
--
-- JavaScript ASTs are then optimized. A dataflow analysis is performed and then
-- dead code and redundant assignments are removed.
--
-- Primitives
-- ~~~~~~~~~~
--  See GHC.StgToJS.Types:VarType
--  - Addr#: represented with two fields: array (used as a namespace) and index
--  - StablePtr#: similar to Addr# with array fixed to h$stablePtrBuf
--  - Int64#/Word64#: represented with two fields: high, low
--  - Float#/Double#: both represented as Javascript Double (no Float!)
--  - JSVal#: any Javascript object (used to pass JS objects via FFI)
--  - TVar#, MVar#, etc. are represented with a JS object
--
-- Foreign JavaScript imports
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~
-- StgToJS supports inline JavaScript code. Example:
--
--    > foreign import javascript unsafe
--    >   "$1 + $2"
--    >   plus :: Int -> Int -> Int
--
-- The parser is inherited from JMacro and supports local variable declarations,
-- loops, etc. Local variables are converted to hygienic names to avoid capture.
--
-- TODO: argument order for multi-values primreps (Int64#, Word64#, Addr#)
-- TODO: "$c" safe call continuation?
--
-- Memory management
-- ~~~~~~~~~~~~~~~~~
-- Stack: the Haskell stack is implemented with a dynamically growing JavaScript
-- array ("h$stack").
--  TODO: does it shrink sometimes?
--  TODO: what are the elements of the stack? one JS object per stack frame?
--
-- Heap: objects on the heap ("closures") are represented as JavaScript objects
-- with the following fields:
--
--  { f: function -- entry function
--  , m: meta     -- meta data
--  , d1: x       -- closure specific fields
--  , d2: y
--  }
--
-- Every heap object has an entry function "f".
--
-- Similarly to info tables in native code generation, the JS function object
-- "f" also contains some metadata about the Haskell object:
--
--    { t: closure type
--    , a: constructor tag / fun arity
--    }
--
-- Note that functions in JS are objects so if "f" is a function we can:
--  - call it, e.g. "f(arg0,arg1...)"
--  - get/set its metadata, e.g. "var closureType = f.t"
--
-- THUNK =
--  { f  = returns the object reduced to WHNF
--  , m  = ?
--  , d1 = ?
--  , d2 = ?
--  }
--
-- FUN =
--  { f  = function itself
--  , m  = ?
--  , d1 = free variable 1
--  , d2 = free variable 2
--  }
--
-- There are two different kinds of partial application:
--  - pap_r   : pre-generated PAP that contains r registers
--  - pap_gen : generic PAP, contains any number of registers
--
-- PAP =
--  { f  = ?
--  , m  = ?
--  , d1 = function
--  , d2 =
--    { d1 & 0xff = number of args (PAP arity)
--    , d1 >> 8   = number of registers (r for h$pap_r)
--    , d2, d3... = args (r)
--    }
--  }
--
-- CON =
--  { f  = entry function of the datacon worker
--  , m  = 0
--  , d1 = first arg
--  , d2 = arity = 2: second arg
--         arity > 2: { d1, d2, ...} object with remaining args (starts with "d1 = x2"!)
--  }
--
-- BLACKHOLE =
--  { f  = h$blackhole
--  , m  = ?
--  , d1 = owning TSO
--  , d2 = waiters array
--  }
--
-- StackFrame closures are *not* represented as JS objects. Instead they are
-- "unpacked" in the stack, i.e. a stack frame occupies a few slots in the JS
-- array representing the stack ("h$stack").
--
-- When a shared thunk is entered, it is overriden with a black hole ("eager
-- blackholing") and an update frame is pushed on the stack.
--
-- Interaction with JavaScript's garbage collector
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- Using JS objects to represent Haskell heap objects means that JS's GC does
-- most of the memory management work.
--
-- However, GHC extends Haskell with features that rely on GC layer violation
-- (weak references, finalizers, etc.). To support these features, a heap scan
-- is can be performed (using TSOs, StablePtr, etc. as roots) to mark reachable
-- objects. Scanning the heap is an expensive operation, but fortunately it
-- doesn't need to happen too often and it can be disabled.
--
-- TODO: importance of eager blackholing
--
-- Concurrency
-- ~~~~~~~~~~~
-- The scheduler is implemented in JS and runs in a single JavaScript thread
-- (similarly to the C RTS not using `-threaded`).
--
-- The scheduler relies on callbacks/continuations to interact with other JS
-- codes (user interface, etc.). In particular, safe foreign import can use "$c"
-- as a continuation function to return to Haskell code.
--
-- TODO: is this still true since 2013 or are we using more recent JS features now?
-- TODO: synchronous threads
--
--
-- REFERENCES
--  * [GHCJS2013] "Demo Proposal: GHCJS, Concurrent Haskell in the Browser", Luite Stegeman,
--    2013 (https://www.haskell.org/haskell-symposium/2013/ghcjs.pdf)
--  * [JMacro] https://hackage.haskell.org/package/jmacro