.. _raid:

==================
RAID Configuration
==================

Overview
========
Ironic supports RAID configuration for bare metal nodes.  It allows operators
to specify the desired RAID configuration via the OpenStackClient CLI or REST
API.  The desired RAID configuration is applied on the bare metal during manual
cleaning.

The examples described here use the OpenStackClient CLI; please see the
`REST API reference <https://developer.openstack.org/api-ref/baremetal/>`_
for their corresponding REST API requests.

Prerequisites
=============
The bare metal node needs to use a hardware type that supports RAID
configuration. RAID interfaces may implement RAID configuration either in-band
or out-of-band.

In-band RAID configuration is done using the Ironic Python Agent
ramdisk. For in-band RAID configuration using agent ramdisk, a hardware
manager which supports RAID should be bundled with the ramdisk.
Whether a node supports RAID configuration can be checked with the CLI
command ``openstack baremetal node validate <node-uuid>``.

Build agent ramdisk which supports RAID configuration
=====================================================

For doing in-band hardware RAID configuration, Ironic needs an agent ramdisk
bundled with a hardware manager which supports RAID configuration for your
hardware. For example, the :ref:`DIB_raid_support` should be used for HPE
ProLiant Servers.

.. note::
    For in-band software RAID, the agent ramdisk does not need to be bundled
    with a hardware manager as the generic hardware manager in the Ironic
    Python Agent already provides (basic) support for software RAID.

RAID configuration JSON format
==============================
The desired RAID configuration and current RAID configuration are represented
in JSON format.

Target RAID configuration
-------------------------
This is the desired RAID configuration on the bare metal node.  Using the
OpenStackClient CLI (or REST API), the operator sets the ``target_raid_config``
field of the node. The target RAID configuration will be applied during manual
cleaning.

Target RAID configuration is a dictionary having ``logical_disks``
as the key. The value for the ``logical_disks`` is a list of JSON
dictionaries. It looks like::

  {
   "logical_disks": [
                     {<desired properties of logical disk 1>},
                     {<desired properties of logical disk 2>},
                     .
                     .
                     .
                    ]
  }

If ``target_raid_config`` is set to an empty dictionary, any value stored by
a previous RAID configuration of the node is unset.
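
As a minimal sketch, the structure above can be assembled and serialized in
Python before it is passed to the CLI. The helper name
``build_target_raid_config`` is an illustrative assumption and not part of
Ironic; the logical disks use only the mandatory ``size_gb`` and
``raid_level`` properties described later in this section.

```python
import json


def build_target_raid_config(logical_disks):
    """Wrap logical disk dictionaries in the expected top-level key.

    Hypothetical helper for illustration; it is not an Ironic API.
    """
    return {"logical_disks": list(logical_disks)}


# Two logical disks described only by their mandatory properties.
config = build_target_raid_config([
    {"size_gb": 100, "raid_level": "1"},
    {"size_gb": 500, "raid_level": "5"},
])

# The serialized text is what would be saved to a file or piped to the CLI.
serialized = json.dumps(config, indent=2)
print(serialized)
```

Building the dictionary in code rather than by hand keeps the top-level
``logical_disks`` key consistent when configurations are generated for many
nodes.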

Each logical disk dictionary contains the desired properties of that logical
disk, as supported by the hardware type. These properties are discoverable by::

    openstack baremetal --os-baremetal-api-version 1.15 driver raid property list <driver name>

The RAID feature is available in ironic API version 1.15 and above.
If ``--os-baremetal-api-version`` is not passed to the CLI, the command fails
with the following message::

   No API version was specified and the requested operation was not
   supported by the client's negotiated API version 1.9. Supported
   version range is: 1.1 to ...

where the "..." in the above error message is the maximum version
supported by the service.

The RAID properties can be split into 4 different types:

#. Mandatory properties. These properties must be specified for each logical
   disk and have no default values.

   - ``size_gb`` - Size (Integer) of the logical disk to be created in GiB.
     ``MAX`` may be specified if the logical disk should use all of the
     remaining space available. This can be used only when backing physical
     disks are specified (see below).

   - ``raid_level`` - RAID level for the logical disk. Ironic supports the
     following RAID levels: 0, 1, 2, 5, 6, 1+0, 5+0, 6+0.

#. Optional properties. These properties have default values and
   they may be overridden in the specification of any logical disk.

   - ``volume_name`` - Name of the volume. Should be unique within the Node.
     If not specified, volume name will be auto-generated.

   - ``is_root_volume`` - Set to ``true`` if this is the root volume. At
     most one logical disk can have this set to ``true``; the other
     logical disks must have this set to ``false``. The
     ``root device hint`` will be saved, if the RAID interface is capable of
     retrieving it. This is ``false`` by default.

#. Backing physical disk hints. These hints are specified for each logical
   disk to let Ironic find the desired disks for RAID configuration. This is
   machine-independent information.  This serves the use-case where the
   operator doesn't want to provide individual details for each bare metal
   node.

   - ``share_physical_disks`` - Set to ``true`` if this logical disk can
     share physical disks with other logical disks. The default value is
     ``false``.

   - ``disk_type`` - ``hdd`` or ``ssd``. If this is not specified, disk type
     will not be a criterion to find backing physical disks.

   - ``interface_type`` - ``sata`` or ``scsi`` or ``sas``. If this is not
     specified, interface type will not be a criterion to
     find backing physical disks.

   - ``number_of_physical_disks`` - Integer, number of disks to use for the
     logical disk. Defaults to minimum number of disks required for the
     particular RAID level.

#. Backing physical disks. This is the actual machine-dependent
   information. This is suitable for environments where the operator wants
   to automate the selection of physical disks with a 3rd-party tool based
   on a wider range of attributes (e.g. S.M.A.R.T. status, physical location).
   The values for these properties are hardware dependent.

   - ``controller`` - The name of the controller as read by the RAID interface.
     In order to trigger the setup of a Software RAID via the Ironic Python
     Agent, the value of this property needs to be set to ``software``.
   - ``physical_disks`` - A list of physical disks to use as read by the
     RAID interface.

.. note::
    If properties from both "Backing physical disk hints" and
    "Backing physical disks" are specified, they should be consistent with
    each other.  If they are not consistent, then the RAID configuration
    will fail (because the appropriate backing physical disks could
    not be found).

.. note::
    For software RAID as provided by the generic hardware manager that ships
    with the Ironic Python Agent, only the mandatory properties (plus the
    required ``controller`` property) are currently supported.
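
The property rules above can be checked locally before the configuration is
sent to Ironic. The following is a minimal sketch under stated assumptions:
the function name ``check_target_raid_config`` is hypothetical, the
requirement that ``size_gb`` be a positive integer is an assumption (the text
only says "Integer"), and the check does not replace whatever validation
Ironic itself performs.

```python
SUPPORTED_RAID_LEVELS = {"0", "1", "2", "5", "6", "1+0", "5+0", "6+0"}
MANDATORY_PROPERTIES = ("size_gb", "raid_level")


def check_target_raid_config(config):
    """Return a list of human-readable problems found in ``config``."""
    problems = []
    disks = config.get("logical_disks", [])
    for index, disk in enumerate(disks):
        # Mandatory properties have no defaults and must always be present.
        for name in MANDATORY_PROPERTIES:
            if name not in disk:
                problems.append(f"logical disk {index}: missing {name}")
        # size_gb is either the literal string "MAX" or an integer
        # (assumed here to be positive).
        size = disk.get("size_gb")
        if size is not None and size != "MAX":
            if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
                problems.append(f"logical disk {index}: invalid size_gb {size!r}")
        level = disk.get("raid_level")
        if level is not None and level not in SUPPORTED_RAID_LEVELS:
            problems.append(
                f"logical disk {index}: unsupported raid_level {level!r}")
    # At most one logical disk may be marked as the root volume.
    roots = sum(1 for disk in disks if disk.get("is_root_volume") is True)
    if roots > 1:
        problems.append("more than one logical disk has is_root_volume set")
    return problems


good = {"logical_disks": [
    {"size_gb": 100, "raid_level": "5", "is_root_volume": True},
]}
bad = {"logical_disks": [
    {"raid_level": "7", "is_root_volume": True},
    {"size_gb": 10, "raid_level": "1", "is_root_volume": True},
]}
print(check_target_raid_config(good))
print(check_target_raid_config(bad))
```

Returning a list of problems instead of raising on the first one lets an
operator fix a whole configuration file in a single pass.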

Examples for ``target_raid_config``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*Example 1*. Single RAID disk of RAID level 5 with all of the space
available. Make this the root volume to which Ironic deploys the image::

  {
   "logical_disks": [
                     {
                      "size_gb": "MAX",
                      "raid_level": "5",
                      "is_root_volume": true
                     }
                    ]
  }

*Example 2*. Two RAID disks. The first is a 100 GiB RAID level 5 root volume
on SSDs. The second is a 500 GiB RAID level 1 volume on HDDs::

  {
   "logical_disks": [
                     {
                      "size_gb": 100,
                      "raid_level": "5",
                      "is_root_volume": true,
                      "disk_type": "ssd"
                     },
                     {
                      "size_gb": 500,
                      "raid_level": "1",
                      "disk_type": "hdd"
                     }
                    ]
  }

*Example 3*. Single RAID disk. The operator knows which disks and controller
to use::

  {
   "logical_disks": [
                     {
                      "size_gb": 100,
                      "raid_level": "5",
                      "controller": "Smart Array P822 in Slot 3",
                      "physical_disks": ["6I:1:5", "6I:1:6", "6I:1:7"],
                      "is_root_volume": true
                     }
                    ]
  }

*Example 4*. Using backing physical disks::

  {
    "logical_disks":
      [
        {
          "size_gb": 50,
          "raid_level": "1+0",
          "controller": "RAID.Integrated.1-1",
          "volume_name": "root_volume",
          "is_root_volume": true,
          "physical_disks": [
                             "Disk.Bay.0:Encl.Int.0-1:RAID.Integrated.1-1",
                             "Disk.Bay.1:Encl.Int.0-1:RAID.Integrated.1-1"
                            ]
        },
        {
          "size_gb": 100,
          "raid_level": "5",
          "controller": "RAID.Integrated.1-1",
          "volume_name": "data_volume",
          "physical_disks": [
                             "Disk.Bay.2:Encl.Int.0-1:RAID.Integrated.1-1",
                             "Disk.Bay.3:Encl.Int.0-1:RAID.Integrated.1-1",
                             "Disk.Bay.4:Encl.Int.0-1:RAID.Integrated.1-1"
                            ]
        }
      ]
  }

*Example 5*. Software RAID with two RAID devices::

  {
   "logical_disks": [
                     {
                      "size_gb": 100,
                      "raid_level": "1",
                      "controller": "software"
                     },
                     {
                      "size_gb": "MAX",
                      "raid_level": "0",
                      "controller": "software"
                     }
                    ]
  }

Current RAID configuration
--------------------------
After target RAID configuration is applied on the bare metal node, Ironic
populates the current RAID configuration.  This is populated in the
``raid_config`` field in the Ironic node. This contains the details about
every logical disk after they were created on the bare metal node. It
contains details like RAID controller used, the backing physical disks used,
WWN of each logical disk, etc. It also contains information about each
physical disk found on the bare metal node.

To get the current RAID configuration::

    openstack baremetal --os-baremetal-api-version 1.15 node show <node-uuid-or-name>
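
Once the node has been exported as JSON (for example with the client's
``-f json`` output formatter), the ``raid_config`` field can be summarized
with a few lines of Python. This is a sketch only: the sample document is
illustrative rather than real output, the function name
``summarize_raid_config`` is hypothetical, and the keys inside each logical
disk are assumed to mirror those of the target configuration, which may not
hold for every RAID interface.

```python
import json

# Illustrative node JSON. Real output contains many more fields, and the exact
# keys inside raid_config depend on the RAID interface in use.
SAMPLE_NODE_JSON = """
{
  "uuid": "hypothetical-node",
  "raid_config": {
    "logical_disks": [
      {"size_gb": 100, "raid_level": "5",
       "controller": "Smart Array P822 in Slot 3",
       "physical_disks": ["6I:1:5", "6I:1:6", "6I:1:7"]}
    ]
  }
}
"""


def summarize_raid_config(node_json):
    """Return one summary line per logical disk in the node's raid_config."""
    node = json.loads(node_json)
    # raid_config may be missing or empty before any RAID configuration.
    raid_config = node.get("raid_config") or {}
    lines = []
    for disk in raid_config.get("logical_disks", []):
        lines.append(
            "RAID {level}, {size} GiB, controller={ctrl}, disks={count}".format(
                level=disk.get("raid_level", "?"),
                size=disk.get("size_gb", "?"),
                ctrl=disk.get("controller", "?"),
                count=len(disk.get("physical_disks", [])),
            )
        )
    return lines


summary = summarize_raid_config(SAMPLE_NODE_JSON)
print("\n".join(summary))
```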

Workflow
========

* Operator configures the bare metal node with a hardware type that has
  a ``RAIDInterface`` other than ``no-raid``. For instance, for Software RAID,
  this would be ``agent``.

* For in-band RAID configuration, operator builds an agent ramdisk which
  supports RAID configuration by bundling the hardware manager with the
  ramdisk. See `Build agent ramdisk which supports RAID configuration`_ for
  more information.

* Operator prepares the desired target RAID configuration as mentioned in
  `Target RAID configuration`_. The target RAID configuration is set on
  the Ironic node::

      openstack baremetal node set <node-uuid-or-name> \
         --target-raid-config <JSON file containing target RAID configuration>

  The CLI command can also read the target RAID configuration from standard
  input::

      openstack baremetal node set <node-uuid-or-name> \
         --target-raid-config -

* Create a JSON file with the RAID clean steps for manual cleaning. Add other
  clean steps as desired::


    [{
      "interface": "raid",
      "step": "delete_configuration"
    },
    {
      "interface": "raid",
      "step": "create_configuration"
    }]

  .. note::
    ``create_configuration`` doesn't remove existing disks.  It is recommended
    to add ``delete_configuration`` before ``create_configuration`` to make
    sure that only the desired logical disks exist in the system after
    manual cleaning.

* Bring the node to ``manageable`` state and do a ``clean`` action to start
  cleaning on the node::

      openstack baremetal node clean <node-uuid-or-name> \
         --clean-steps <JSON file containing clean steps created above>

* After manual cleaning is complete, the current RAID configuration is
  reported in the ``raid_config`` field when running::

      openstack baremetal node show <node-uuid-or-name>
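
The clean steps file used in the workflow above can also be generated rather
than written by hand. The sketch below is one possible way to do so; the
function name ``raid_clean_steps`` and the file name ``raid-clean-steps.json``
are illustrative assumptions. Only the two step names and the ``raid``
interface come from this document.

```python
import json
import os
import tempfile


def raid_clean_steps(delete_first=True):
    """Return the RAID clean steps, optionally deleting existing disks first."""
    steps = []
    if delete_first:
        # Deleting first ensures only the desired logical disks remain.
        steps.append({"interface": "raid", "step": "delete_configuration"})
    steps.append({"interface": "raid", "step": "create_configuration"})
    return steps


# Write the steps to a file in a temporary directory (location is arbitrary).
path = os.path.join(tempfile.mkdtemp(), "raid-clean-steps.json")
with open(path, "w") as handle:
    json.dump(raid_clean_steps(), handle, indent=2)

print(path)
```

The resulting file is what would be passed to ``--clean-steps`` in the
``openstack baremetal node clean`` command.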

Limitations of Software RAID
============================

There are certain limitations to be aware of when setting up a Software RAID via the
Ironic Python Agent:

* There is no way to select the disks which are used to set up the software RAID,
  so the Ironic Python Agent will use all available disks. This seems appropriate
  for servers with 2 or 4 disks, but needs to be considered when disk arrays are
  attached.

* The number of created Software RAID devices must be 1 or 2. If there is only one
  Software RAID device, it has to be a RAID-1. If there are two, the first one has
  to be a RAID-1, while the RAID level for the second one can be 0, 1, or 1+0. As
  the first RAID device will be the deployment device, enforcing a RAID-1 reduces
  the risk of ending up with a non-booting node in case of a disk failure.

* There is no support for partition images, only whole-disk images are supported with
  Software RAID.
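
The device-count and RAID-level limitations can be expressed as a short
pre-check. This is a sketch under stated assumptions, not the validation the
Ironic Python Agent performs: the function name ``check_software_raid`` is
hypothetical, and the rules encoded are only those listed above plus the
required ``controller`` value of ``software``.

```python
def check_software_raid(logical_disks):
    """Return a list of problems for a software RAID logical disk list."""
    problems = []
    # Only one or two software RAID devices may be created.
    if len(logical_disks) not in (1, 2):
        problems.append("software RAID requires 1 or 2 logical disks")
        return problems
    for index, disk in enumerate(logical_disks):
        if disk.get("controller") != "software":
            problems.append(
                f"logical disk {index}: controller must be 'software'")
    # The first device is the deployment device and has to be RAID-1.
    if logical_disks[0].get("raid_level") != "1":
        problems.append("the first logical disk must be RAID-1")
    # A second device, if present, is limited to RAID 0, 1, or 1+0.
    if len(logical_disks) == 2:
        if logical_disks[1].get("raid_level") not in ("0", "1", "1+0"):
            problems.append("the second logical disk must be RAID 0, 1, or 1+0")
    return problems


# The logical disks from Example 5 pass the check.
example_5 = [
    {"size_gb": 100, "raid_level": "1", "controller": "software"},
    {"size_gb": "MAX", "raid_level": "0", "controller": "software"},
]
print(check_software_raid(example_5))

# A single RAID-5 device violates the RAID-1 requirement.
print(check_software_raid(
    [{"size_gb": 100, "raid_level": "5", "controller": "software"}]))
```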

Using RAID in nova flavor for scheduling
========================================

The operator can specify the ``raid_level`` capability in a nova flavor so that
a matching node is selected during scheduling::

  nova flavor-key my-baremetal-flavor set capabilities:raid_level="1+0"

Developer documentation
=======================
In-band RAID configuration is done using the IPA ramdisk. The IPA ramdisk
supports pluggable hardware managers, implemented as stevedore plugins, which
can be used to extend the functionality it offers.  For more
information, see Ironic Python Agent `Hardware Manager`_ documentation.

.. _`Hardware Manager`: https://docs.openstack.org/ironic-python-agent/latest/install/index.html#hardware-managers

The hardware manager that supports RAID configuration should do the following:

#. Implement a method named ``create_configuration``. This method creates
   the RAID configuration as given in ``target_raid_config``. After successful
   RAID configuration, it returns the current RAID configuration information
   which ironic uses to set ``node.raid_config``.

#. Implement a method named ``delete_configuration``. This method deletes
   all the RAID disks on the bare metal node.

#. Return these two clean steps in the ``get_clean_steps`` method with a
   priority of 0. Example::

        return [{'step': 'create_configuration',
                 'interface': 'raid',
                 'priority': 0},
                {'step': 'delete_configuration',
                 'interface': 'raid',
                 'priority': 0}]
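
The three requirements above can be put together in a skeleton like the
following. It is a sketch only: the class name ``ExampleRAIDHardwareManager``
is hypothetical, and a plain class is used so that the example runs without
the Ironic Python Agent installed, whereas a real manager would subclass the
agent's hardware manager base class. The ``(node, ports)`` method arguments
and the assumption that ``node`` is a dictionary are not confirmed by this
document, and the in-memory list stands in for real controller calls.

```python
class ExampleRAIDHardwareManager:
    """Hypothetical hardware manager skeleton with RAID clean steps."""

    def __init__(self):
        # Stand-in for controller state; a real manager would query hardware.
        self._logical_disks = []

    def get_clean_steps(self, node, ports):
        # Priority 0 means the steps run only during manual cleaning.
        return [{'step': 'create_configuration',
                 'interface': 'raid',
                 'priority': 0},
                {'step': 'delete_configuration',
                 'interface': 'raid',
                 'priority': 0}]

    def create_configuration(self, node, ports):
        # Read the desired configuration from the node (assumed to be a dict).
        target = node.get('target_raid_config') or {}
        self._logical_disks = [dict(d) for d in target.get('logical_disks', [])]
        # Return the current RAID configuration for ironic to store.
        return {'logical_disks': list(self._logical_disks)}

    def delete_configuration(self, node, ports):
        # Remove every logical disk known to this (simulated) controller.
        self._logical_disks = []


manager = ExampleRAIDHardwareManager()
node = {'target_raid_config': {
    'logical_disks': [{'size_gb': 100, 'raid_level': '1'}]}}
current = manager.create_configuration(node, ports=[])
print(current)
manager.delete_configuration(node, ports=[])
```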