======================
Live-migrate instances
======================

Live-migrating an instance means moving its virtual machine to a different
OpenStack Compute server while the instance continues running.  Before starting
a live-migration, review the chapter
:ref:`section_configuring-compute-migrations`. It covers the configuration
settings required to enable live-migration, as well as reasons for migrating
and non-live-migration options.

The instructions below cover shared-storage and volume-backed migration.  To
block-migrate instances, add the command-line option
``--block-migrate`` to the :command:`nova live-migration` command,
and ``--block-migration`` to the :command:`openstack server migrate`
command.

.. _section-manual-selection-of-dest:

Manual selection of the destination host
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

#. Obtain the ID of the instance you want to migrate:

   .. code-block:: console

      $ openstack server list

      +--------------------------------------+------+--------+-----------------+------------+
      | ID                                   | Name | Status | Networks        | Image Name |
      +--------------------------------------+------+--------+-----------------+------------+
      | d1df1b5a-70c4-4fed-98b7-423362f2c47c | vm1  | ACTIVE | private=a.b.c.d | ...        |
      | d693db9e-a7cf-45ef-a7c9-b3ecb5f22645 | vm2  | ACTIVE | private=e.f.g.h | ...        |
      +--------------------------------------+------+--------+-----------------+------------+

#. Determine on which host the instance is currently running. In this example,
   ``vm1`` is running on ``HostB``:

   .. code-block:: console

      $ openstack server show d1df1b5a-70c4-4fed-98b7-423362f2c47c

      +----------------------+--------------------------------------+
      | Field                | Value                                |
      +----------------------+--------------------------------------+
      | ...                  | ...                                  |
      | OS-EXT-SRV-ATTR:host | HostB                                |
      | ...                  | ...                                  |
      | addresses            | a.b.c.d                              |
      | flavor               | m1.tiny                              |
      | id                   | d1df1b5a-70c4-4fed-98b7-423362f2c47c |
      | name                 | vm1                                  |
      | status               | ACTIVE                               |
      | ...                  | ...                                  |
      +----------------------+--------------------------------------+

#. Select the compute node the instance will be migrated to. In this example,
   we will migrate the instance to ``HostC``, because ``nova-compute`` is
   running on it:

   .. code-block:: console

      $ openstack compute service list

      +----+------------------+-------+----------+---------+-------+----------------------------+
      | ID | Binary           | Host  | Zone     | Status  | State | Updated At                 |
      +----+------------------+-------+----------+---------+-------+----------------------------+
      |  3 | nova-conductor   | HostA | internal | enabled | up    | 2017-02-18T09:42:29.000000 |
      |  4 | nova-scheduler   | HostA | internal | enabled | up    | 2017-02-18T09:42:26.000000 |
      |  5 | nova-compute     | HostB | nova     | enabled | up    | 2017-02-18T09:42:29.000000 |
      |  6 | nova-compute     | HostC | nova     | enabled | up    | 2017-02-18T09:42:29.000000 |
      +----+------------------+-------+----------+---------+-------+----------------------------+

#. Check that ``HostC`` has enough resources for migration:

   .. code-block:: console

      $ openstack host show HostC

      +-------+------------+-----+-----------+---------+
      | Host  | Project    | CPU | Memory MB | Disk GB |
      +-------+------------+-----+-----------+---------+
      | HostC | (total)    |  16 |     32232 |     878 |
      | HostC | (used_now) |  22 |     21284 |     422 |
      | HostC | (used_max) |  22 |     21284 |     422 |
      | HostC | p1         |  22 |     21284 |     422 |
      | HostC | p2         |  22 |     21284 |     422 |
      +-------+------------+-----+-----------+---------+

   - ``cpu``: Number of CPUs

   - ``memory_mb``: Total amount of memory, in MB

   - ``disk_gb``: Total amount of space for NOVA-INST-DIR/instances, in GB

   In this table, the first row shows the total amount of resources available
   on the physical server. The second row shows the resources currently in
   use. The third row shows the maximum resources used. The fourth and
   subsequent rows show the resources available for each project.

#. Migrate the instance:

   .. code-block:: console

      $ openstack server migrate d1df1b5a-70c4-4fed-98b7-423362f2c47c --live-migration --host HostC

#. Confirm that the instance has been migrated successfully:

   .. code-block:: console

      $ openstack server show d1df1b5a-70c4-4fed-98b7-423362f2c47c

      +----------------------+--------------------------------------+
      | Field                | Value                                |
      +----------------------+--------------------------------------+
      | ...                  | ...                                  |
      | OS-EXT-SRV-ATTR:host | HostC                                |
      | ...                  | ...                                  |
      +----------------------+--------------------------------------+

   If the instance is still running on ``HostB``, the migration failed. The
   ``nova-scheduler`` and ``nova-conductor`` log files on the controller and
   the ``nova-compute`` log file on the source compute host can help pinpoint
   the problem.
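
The resource check in step 4 can be approximated with a little arithmetic. The
following Python sketch is illustrative only: it copies the ``(total)`` and
``(used_now)`` rows from the example output, and the instance requirements
(1 vCPU, 512 MB of memory, 1 GB of disk) are assumed values, not taken from
this guide. The helper names ``headroom`` and ``shortfalls`` are hypothetical.
A negative CPU figure reflects overcommitted vCPUs, and the scheduler applies
its own allocation ratios, so treat the result as a rough indication rather
than a placement decision.

```python
# Rows copied from the example "openstack host show HostC" output.
TOTAL = {"cpu": 16, "memory_mb": 32232, "disk_gb": 878}
USED_NOW = {"cpu": 22, "memory_mb": 21284, "disk_gb": 422}

# Assumed requirements of the instance to migrate (hypothetical values).
NEEDED = {"cpu": 1, "memory_mb": 512, "disk_gb": 1}


def headroom(total, used):
    """Return total minus used for every resource key in ``total``."""
    return {key: total[key] - used[key] for key in total}


def shortfalls(free, needed):
    """Return the resource names whose requirement exceeds the free amount."""
    return [key for key, amount in needed.items() if amount > free[key]]


if __name__ == "__main__":
    free = headroom(TOTAL, USED_NOW)
    print("free:", free)
    print("short of:", shortfalls(free, NEEDED))
```

With the example figures, memory and disk have spare capacity, while the CPU
column is already overcommitted.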

.. _auto_selection_of_dest:

Automatic selection of the destination host
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To leave the selection of the destination host to the Compute service, use the
nova command-line client.

#. Obtain the instance ID as shown in step 1 of the section
   :ref:`section-manual-selection-of-dest`.

#. Leave out the host selection steps 2, 3, and 4.

#. Migrate the instance:

   .. code-block:: console

      $ nova live-migration d1df1b5a-70c4-4fed-98b7-423362f2c47c

Monitoring the migration
~~~~~~~~~~~~~~~~~~~~~~~~

#. Confirm that the instance is migrating:

   .. code-block:: console

      $ openstack server show d1df1b5a-70c4-4fed-98b7-423362f2c47c

      +----------------------+--------------------------------------+
      | Field                | Value                                |
      +----------------------+--------------------------------------+
      | ...                  | ...                                  |
      | status               | MIGRATING                            |
      | ...                  | ...                                  |
      +----------------------+--------------------------------------+

#. Check progress:

   Use the nova command-line client for nova's migration monitoring feature.
   First, obtain the migration ID:

   .. code-block:: console

      $ nova server-migration-list d1df1b5a-70c4-4fed-98b7-423362f2c47c
      +----+-------------+-----------+ (...)
      | Id | Source Node | Dest Node | (...)
      +----+-------------+-----------+ (...)
      | 2  | -           | -         | (...)
      +----+-------------+-----------+ (...)

   For readability, most output columns were removed. Only the first column,
   **Id**, is relevant.  In this example, the migration ID is 2. Use this to
   get the migration status.

   .. code-block:: console

      $ nova server-migration-show d1df1b5a-70c4-4fed-98b7-423362f2c47c 2
      +------------------------+--------------------------------------+
      | Property               | Value                                |
      +------------------------+--------------------------------------+
      | created_at             | 2017-03-08T02:53:06.000000           |
      | dest_compute           | controller                           |
      | dest_host              | -                                    |
      | dest_node              | -                                    |
      | disk_processed_bytes   | 0                                    |
      | disk_remaining_bytes   | 0                                    |
      | disk_total_bytes       | 0                                    |
      | id                     | 2                                    |
      | memory_processed_bytes | 65502513                             |
      | memory_remaining_bytes | 786427904                            |
      | memory_total_bytes     | 1091379200                           |
      | server_uuid            | d1df1b5a-70c4-4fed-98b7-423362f2c47c |
      | source_compute         | compute2                             |
      | source_node            | -                                    |
      | status                 | running                              |
      | updated_at             | 2017-03-08T02:53:47.000000           |
      +------------------------+--------------------------------------+

   The output shows that the migration is running. Progress is measured by the
   number of memory bytes that remain to be copied. If this number is not
   decreasing over time, the migration may be unable to complete, and it may be
   aborted by the Compute service.

   .. note::

      The command reports that no disk bytes are processed, even in the event
      of block migration.
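
To make the progress check concrete, the following Python sketch derives a
rough percentage from ``memory_total_bytes`` and ``memory_remaining_bytes``
and flags a migration whose remaining byte count has stopped decreasing. It is
a minimal illustration under stated assumptions: the function names and the
three-sample window are hypothetical, and the samples would have to be
collected by running :command:`nova server-migration-show` repeatedly.

```python
def memory_progress_percent(total_bytes, remaining_bytes):
    """Share of instance memory that is no longer waiting to be copied."""
    if total_bytes <= 0:
        return 0.0
    return 100.0 * (total_bytes - remaining_bytes) / total_bytes


def is_stalled(remaining_samples, window=3):
    """Return True if the remaining byte count did not decrease over the
    last ``window`` samples (oldest sample first)."""
    if len(remaining_samples) < window:
        return False  # not enough data to judge
    recent = remaining_samples[-window:]
    return recent[-1] >= recent[0]


if __name__ == "__main__":
    # Values taken from the example output above.
    pct = memory_progress_percent(1091379200, 786427904)
    print(f"{pct:.1f}% of memory no longer outstanding")
    # Hypothetical follow-up samples of memory_remaining_bytes.
    print(is_stalled([786427904, 790000000, 788000000]))
```

Because the instance keeps dirtying pages, the remaining count can rise
between samples, so a single increase is not by itself a sign of failure.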

What to do when the migration times out
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

During the migration process, the instance may write to a memory page after
that page has been copied to the destination. When that happens, the same page
has to be copied again. The instance may write to memory pages faster than they
can be copied, so that the migration cannot complete. There are two optional
actions, controlled by
:oslo.config:option:`libvirt.live_migration_timeout_action`, which can be
taken against a VM after
:oslo.config:option:`libvirt.live_migration_completion_timeout` is reached:

1. ``abort`` (default): The live migration operation will be cancelled after
   the completion timeout is reached. This is similar to using API
   ``DELETE /servers/{server_id}/migrations/{migration_id}``.

2. ``force_complete``: The compute service will either pause the VM or trigger
   post-copy, depending on whether post-copy is enabled and available
   (:oslo.config:option:`libvirt.live_migration_permit_post_copy` is set to
   ``True``). This is similar to using API
   ``POST /servers/{server_id}/migrations/{migration_id}/action (force_complete)``.

You can also read the
:oslo.config:option:`libvirt.live_migration_timeout_action`
configuration option help for more details.
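
The two timeout actions described above can be summarized as a small decision
function. This Python sketch only models the behavior as described in this
section. It is not the Compute service's implementation, and the function name
and return strings are hypothetical.

```python
def timeout_outcome(timeout_action, post_copy_available):
    """Model what happens once the completion timeout is reached.

    ``timeout_action`` mirrors libvirt.live_migration_timeout_action and
    ``post_copy_available`` stands for "post-copy is enabled and available".
    """
    if timeout_action == "abort":
        return "cancel migration"
    if timeout_action == "force_complete":
        if post_copy_available:
            return "switch to post-copy"
        return "pause instance"
    raise ValueError(f"unknown timeout action: {timeout_action!r}")


if __name__ == "__main__":
    for action in ("abort", "force_complete"):
        for post_copy in (False, True):
            print(action, post_copy, "->", timeout_outcome(action, post_copy))
```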

The following remarks assume the KVM/Libvirt hypervisor.

How to know that the migration timed out
----------------------------------------

To determine that the migration timed out, inspect the ``nova-compute`` log
file on the source host. The following log entry shows that the migration timed
out:

.. code-block:: console

   # grep WARNING.*d1df1b5a-70c4-4fed-98b7-423362f2c47c /var/log/nova/nova-compute.log
   ...
   WARNING nova.virt.libvirt.migration [req-...] [instance: ...]
   live migration not completed after 1800 sec
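
The same search can be scripted. The following Python sketch scans log lines
for the warning text shown above and extracts the reported number of seconds.
It assumes the message wording matches the example exactly. The helper name
``find_timeouts`` and the sample lines are illustrative.

```python
import re

# Matches the warning text shown in the example log entry.
TIMEOUT_RE = re.compile(r"live migration not completed after (\d+) sec")


def find_timeouts(lines):
    """Return the timeout value, in seconds, from every matching line."""
    seconds = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            seconds.append(int(match.group(1)))
    return seconds


if __name__ == "__main__":
    sample = [
        "INFO nova.compute.manager [req-...] [instance: ...] unrelated entry",
        "WARNING nova.virt.libvirt.migration [req-...] [instance: ...] "
        "live migration not completed after 1800 sec",
    ]
    print(find_timeouts(sample))
```

To scan a real log, pass an open file object instead of the sample list, for
example ``find_timeouts(open("/var/log/nova/nova-compute.log"))``.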

Addressing migration timeouts
-----------------------------

To stop the migration from putting load on infrastructure resources like
network and disks, you may opt to cancel it manually.

.. code-block:: console

   $ nova live-migration-abort INSTANCE_ID MIGRATION_ID

To make live-migration succeed, you have several options:

- **Manually force-complete the migration**

  .. code-block:: console

     $ nova live-migration-force-complete INSTANCE_ID MIGRATION_ID

  The instance is paused until memory copy completes.

  .. caution::

     Since the pause impacts time keeping on the instance and not all
     applications tolerate incorrect time settings, use this approach with
     caution.

- **Enable auto-convergence**

  Auto-convergence is a Libvirt feature. Libvirt detects that the migration is
  unlikely to complete and slows down the instance's CPU until the memory copy
  process is faster than the instance's memory writes.

  To enable auto-convergence, set
  ``live_migration_permit_auto_converge=true`` in ``nova.conf`` and restart
  ``nova-compute``. Do this on all compute hosts.

  .. caution::

     One possible downside of auto-convergence is the slowing down of the
     instance.

- **Enable post-copy**

  This is a Libvirt feature. Libvirt detects that the migration does not
  progress and responds by activating the virtual machine on the destination
  host before all its memory has been copied. Accesses to missing memory pages
  result in page faults that are satisfied from the source host.

  To enable post-copy, set ``live_migration_permit_post_copy=true`` in
  ``nova.conf`` and restart ``nova-compute``. Do this on all compute hosts.

  When post-copy is enabled, manual force-completion does not pause the
  instance but switches to the post-copy process.

  .. caution::

     Possible downsides:

     - When the network connection between source and destination is
       interrupted, page faults cannot be resolved anymore, and the virtual
       machine is rebooted.

     - Post-copy may lead to an increased page fault rate during migration,
       which can slow the instance down.

If live migrations routinely time out or fail during cleanup operations due
to the user token timing out, consider configuring nova to use
:ref:`service user tokens <user_token_timeout>`.