                                IOEMU stubdom
                                =============

  This boosts HVM performance by putting ioemu in its own lightweight domain.

General Configuration
=====================

Due to a race between the creation of the IOEMU stubdomain itself and the
allocation of video memory for the HVM domain, you need to avoid ballooning,
for instance by using the hypervisor dom0_mem= option.
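
For example, dom0's memory can be fixed on the Xen command line in your
bootloader entry (a sketch; the amount is illustrative):

    multiboot /boot/xen.gz dom0_mem=4096M,max:4096M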

Using with XL
-------------

To enable IOEMU stub domains, set the following in your domain
config:

    device_model_stubdomain_override = 1

See xl.cfg(5) for more details on the xl domain configuration syntax
and https://wiki.xen.org/wiki/Device_Model_Stub_Domains for more
information on device model stub domains.
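
A minimal sketch of a complete HVM guest config using a stub domain (the name,
memory size and disk path are illustrative):

    name = "hvm-guest"
    type = "hvm"
    memory = 2048
    disk = [ 'phy:/dev/vg0/hvm-guest,xvda,w' ]
    device_model_stubdomain_override = 1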


Toolstack to MiniOS ioemu stubdomain protocol
---------------------------------------------

This section describes the communication protocol between the toolstack and
qemu-traditional running in a MiniOS stubdomain. The protocol includes
expectations of both qemu and the stubdomain itself.

Setup (done by toolstack, expected by stubdomain):
 - Block devices for target domain are connected as PV disks to stubdomain,
   according to configuration order, starting with xvda
 - Network devices for target domain are connected as PV nics to stubdomain,
   according to configuration order, starting with 0
 - if graphics output is expected, VFB and VKB devices are set for stubdomain
   (its backend is responsible for exposing them using appropriate protocol
   like VNC or Spice)
 - other target domain's devices are not connected at this point to stubdomain
   (may be hot-plugged later)
 - QEMU command line (space-separated arguments) is stored in the
   /vm/<target-uuid>/image/dmargs xenstore path
 - target domain id is stored in the /local/domain/<stubdom-id>/target xenstore
   path (see the sketch after this list)
?? - bios type is stored in /local/domain/<target-id>/hvmloader/bios
 - stubdomain's console 0 is connected to qemu log file
 - stubdomain's console 1 is connected to qemu save file (for saving state)
 - stubdomain's console 2 is connected to qemu save file (for restoring state)
 - next consoles are connected according to target guest's serial console configuration
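
For illustration, these entries can be inspected from dom0 with the standard
xenstore client tools (the stubdomain id 5 is hypothetical; the placeholder is
kept as in the list above):

    xenstore-read /local/domain/5/target
    xenstore-read /vm/<target-uuid>/image/dmargs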

Startup:
1. PV stubdomain is started with the ioemu-stubdom.gz kernel and no initrd
2. stubdomain initializes relevant devices
3. stubdomain signals readiness by writing "running" to the /local/domain/<stubdom-id>/device-model/<target-id>/state xenstore path
4. now the stubdomain is considered running

Runtime control (hotplug etc):
The toolstack can issue commands through xenstore. The sequence is, from the
toolstack's point of view (see the sketch after the command list below):
1. Write the parameter to /local/domain/<stubdom-id>/device-model/<target-id>/parameter.
2. Write the command to /local/domain/<stubdom-id>/device-model/<target-id>/command.
3. Wait for the command result in /local/domain/<stubdom-id>/device-model/<target-id>/state (a command-specific value).
4. Write "running" back to /local/domain/<stubdom-id>/device-model/<target-id>/state.

Defined commands:
 - "pci-ins" - PCI hot plug, results:
   - "pci-inserted" - success
   - "pci-insert-failed" - failure
 - "pci-rem" - PCI hot remove, results:
   - "pci-removed" - success
   - ??
 - "save" - save domain state to console 1, results:
   - "paused" - success
 - "continue" - resume domain execution, after loading state from console 2 (require -loadvm command argument), results:
   - "running" - success


Toolstack to Linux ioemu stubdomain protocol
--------------------------------------------

This section describes the communication protocol between the toolstack and
qemu-upstream running in a Linux stubdomain. The protocol includes
expectations of both the stubdomain and qemu.

Setup (done by toolstack, expected by stubdomain):
 - Block devices for target domain are connected as PV disks to stubdomain,
   according to configuration order, starting with xvda
 - Network devices for target domain are connected as PV nics to stubdomain,
   according to configuration order, starting with 0
 - [not implemented] if graphics output is expected, VFB and VKB devices are set for stubdomain
   (its backend is responsible for exposing them using appropriate protocol
   like VNC or Spice)
 - other target domain's devices are not connected at this point to stubdomain
   (may be hot-plugged later)
 - QEMU command line is stored in the
   /vm/<target-uuid>/image/dm-argv xenstore dir, each argument as a separate
   key in the form /vm/<target-uuid>/image/dm-argv/NNN, where NNN is the
   0-padded argument number (see the sketch after this list)
 - target domain id is stored in /local/domain/<stubdom-id>/target xenstore path
?? - bios type is stored in /local/domain/<target-id>/hvmloader/bios
 - stubdomain's console 0 is connected to qemu log file
 - stubdomain's console 1 is connected to qemu save file (for saving state)
 - stubdomain's console 2 is connected to qemu save file (for restoring state)
 - next consoles are connected according to target guest's serial console configuration
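
For illustration, the stored command line can be listed from dom0 with
xenstore-ls (the argument values below are made up):

    xenstore-ls /vm/<target-uuid>/image/dm-argv
    # 000 = "qemu-system-i386"
    # 001 = "-machine"
    # 002 = "xenfv"
    # ...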

Environment exposed by the stubdomain to qemu (needed to construct an appropriate qemu command line and later interact with qmp):
 - target domain's disks are available as /dev/xvd[a-z]
 - console 2 (incoming domain state) must be connected to an FD and the command
   line argument $STUBDOM_RESTORE_INCOMING_ARG must be replaced with fd:$FD to
   form "-incoming fd:$FD" (see the example after this list)
 - console 1 (saving domain state) is added over QMP to qemu as "fdset-id 1" (done by stubdomain, toolstack doesn't need to care about it)
 - nics are connected to relevant stubdomain PV vifs when available (qemu -netdev should specify ifname= explicitly)
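
For example (the fd number is made up): if the stubdomain opens console 2 as
file descriptor 9, the stored argument pair

    -incoming $STUBDOM_RESTORE_INCOMING_ARG

is rewritten before qemu is started as

    -incoming fd:9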

Startup:
1. toolstack starts a PV stubdomain with the stubdom-linux-kernel kernel and stubdom-linux-initrd initrd
2. stubdomain initializes relevant devices
3. stubdomain starts qemu with the requested command line, plus a few stubdomain-specific options - including local qmp access options
4. stubdomain starts a vchan server on /local/domain/<stubdom-id>/device-model/<target-id>/qmp-vchan, exposing the qmp socket to the toolstack
5. qemu signals readiness by writing "running" to the /local/domain/<stubdom-id>/device-model/<target-id>/state xenstore path
6. now the device model is considered running

QEMU can be controlled using QMP over the vchan at /local/domain/<stubdom-id>/device-model/<target-id>/qmp-vchan. Only one simultaneous connection is supported, and the toolstack needs to ensure that.
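
Once connected over the vchan, the standard QMP handshake applies; a sketch of
the exchange (greeting abbreviated; "->" is toolstack to qemu):

    <- { "QMP": { "version": { ... }, "capabilities": [] } }
    -> { "execute": "qmp_capabilities" }
    <- { "return": {} }
    -> { "execute": "query-status" }
    <- { "return": { "status": "running", ... } }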

Limitations:
 - PCI passthrough requires permissive mode
 - only one nic is supported
 - at most 26 emulated disks are supported (more are still available as PV disks)
 - graphics output (VNC/SDL/Spice) not supported


                                   PV-GRUB
                                   =======

  This replaces pygrub to boot domU images safely: it runs the regular grub
inside the created domain itself and uses regular domU facilities to read the
disk, fetch files from the network, etc.; it eventually loads the PV kernel
and chain-boots it.
  
Configuration
=============

In your PV config,

- use pv-grub.gz as kernel:

kernel = "pv-grub.gz"

- set the path to menu.lst, as seen from the domU, in extra:

extra = "(hd0,0)/boot/grub/menu.lst"

or you can provide the content of a menu.lst stored in dom0 by passing it as a
ramdisk:

ramdisk = "/boot/domU-1-menu.lst"

or you can also use a tftp path (dhcp will be automatically performed):

extra = "(nd)/somepath/menu.lst"

or you can set it in option 150 of your dhcp server and leave extra and ramdisk
empty (dhcp will be automatically performed).
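
Putting it together, a minimal sketch of a PV-GRUB guest config (name, memory
and disk path are illustrative):

    name = "domu-1"
    kernel = "pv-grub.gz"
    extra = "(hd0,0)/boot/grub/menu.lst"
    memory = 512
    disk = [ 'phy:/dev/vg0/domu-1,xvda,w' ]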

Limitations
===========

- You cannot boot a 64-bit kernel with a 32-bit-compiled PV-GRUB and vice versa.
To cross-compile a 32-bit PV-GRUB,

export XEN_TARGET_ARCH=x86_32

- bootsplash is supported, but the ioemu backend does not yet support restart
for use by the booted kernel.

- PV-GRUB doesn't support virtualized partitions. For instance:

disk = [ 'phy:hda7,hda7,w' ]

will be seen by PV-GRUB as (hd0), not (hd0,6), since GRUB will not see any
partition table.
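
To let PV-GRUB see the partition table (and thus (hd0,6)), export the whole
disk instead, for instance:

    disk = [ 'phy:hda,hda,w' ]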


                                Your own stubdom
                                ================

  By running

cd stubdom/
make c-stubdom

  or

cd stubdom/
make caml-stubdom

  you can compile examples of C or caml stub domain kernels.  You can use these
and the relevant Makefile rules as a basis to build your own stub domain kernel.
Available libraries are libc, libxc, libxs, zlib and libpci.
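
A hedged sketch of a minimal C stub domain payload, in the spirit of the
stubdom/c example (it becomes the whole workload of the stub domain, linked
against the libraries above by the Makefile rules):

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Print a heartbeat to the stubdomain's console forever. */
        for (;;) {
            printf("Hello from my stub domain\n");
            sleep(2);
        }
        return 0;
    }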