author    | Lee Yarwood <lyarwood@redhat.com> | 2020-03-06 11:00:49 +0000
committer | Elod Illes <elod.illes@est.tech> | 2022-08-16 22:24:37 +0200
commit    | fa540fabce020cade91adb52800197f688c843d9 (patch)
tree      | 675e211253052239ba7812768ee7c4df76469787
parent    | e4f8dec28b74729fee2d5659ba39a2c63de567a4 (diff)
download  | nova-fa540fabce020cade91adb52800197f688c843d9.tar.gz
[CI] Fix gate by using zuulv3 live migration and grenade jobs
This patch is a combination of several legacy-to-zuulv3 job patches to
unblock the gate: with the latest Ceph release the legacy grenade jobs
started to fail with the following error (on branches back to Ussuri):
'Error EPERM: configuring pool size as 1 is disabled by default.'
The patch is an almost clean backport of the job configuration.
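For reference, the EPERM comes from newer Ceph releases (Octopus and
later) refusing to configure single-replica pools unless the cluster
explicitly opts in, which is what the legacy ceph.sh hook does when
CEPH_REPLICAS is 1. A minimal sketch of the failure and of the upstream
opt-in (the pool name is illustrative; this patch moves to the zuulv3
jobs instead of opting in):

    # Octopus+ monitors reject single-replica pools by default:
    ceph osd pool create vms 8 8
    ceph osd pool set vms size 1
    # -> Error EPERM: configuring pool size as 1 is disabled by default.

    # Upstream opt-in, shown for completeness only (not used by this patch):
    ceph config set global mon_allow_pool_size_one true
    ceph osd pool set vms size 1 --yes-i-really-mean-it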
Conflicts:
.zuul.yaml
gate/live_migration/hooks/ceph.sh
gate/live_migration/hooks/run_tests.sh
gate/live_migration/hooks/utils.sh
playbooks/legacy/nova-grenade-multinode/run.yaml
playbooks/legacy/nova-live-migration/run.yaml
NOTE(melwitt): The .zuul.yaml conflict is because change
I4b2d321b7243ec149e9445035d1feb7a425e9a4b (Skip to run all integration
jobs for policy-only changes.) and change
I86f56b0238c72d2784e62f199cfc7704b95bbcf2 (FUP to
Ie1a0cbd82a617dbcc15729647218ac3e9cd0e5a9) are not in Train.
The gate/live_migration/hooks/ceph.sh conflict is because change
Id565a20ba3ebe2ea1a72b879bd2762ba3e655658 (Convert legacy
nova-live-migration and nova-multinode-grenade to py3) is not in Train
and change I1d029ebe78b16ed2d4345201b515baf3701533d5 ([stable-only]
gate: Pin CEPH_RELEASE to nautilus in LM hook) is only in Train.
The gate/live_migration/hooks/run_tests.sh conflict is because change
Id565a20ba3ebe2ea1a72b879bd2762ba3e655658 (Convert legacy
nova-live-migration and nova-multinode-grenade to py3) is not in Train.
The gate/live_migration/hooks/utils.sh conflict is because change
Iad2d198c58512b26dc2733b97bedeffc00108656 was added only in Train.
The playbooks/legacy/nova-grenade-multinode/run.yaml conflict is
because change Id565a20ba3ebe2ea1a72b879bd2762ba3e655658 (Convert
legacy nova-live-migration and nova-multinode-grenade to py3) and
change Icac785eec824da5146efe0ea8ecd01383f18398e (Drop
neutron-grenade-multinode job) are not in Train.
The playbooks/legacy/nova-live-migration/run.yaml conflict is because
change Id565a20ba3ebe2ea1a72b879bd2762ba3e655658 (Convert legacy
nova-live-migration and nova-multinode-grenade to py3) is not in Train.
NOTE(lyarwood): An additional change was required to the
run-evacuate-hook as we are now running against Bionic-based hosts.
These hosts only run a single libvirtd service, so only that service is
stopped and started during an evacuation run.
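In bash terms, the adjusted hook now does roughly the following around
the evacuation tests (paths and the service name are taken from the
role added below; older hosts ran more than one libvirt unit, so each
had to be handled):

    # Fail the libvirt driver on the target host so the forced, negative
    # evacuation hits an error, then restore it for the positive tests.
    # The role exports CONTROLLER_HOSTNAME for both scripts.
    sudo systemctl stop libvirtd
    /opt/stack/nova/roles/run-evacuate-hook/files/test_negative_evacuate.sh
    sudo systemctl start libvirtd
    /opt/stack/nova/roles/run-evacuate-hook/files/test_evacuate.sh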
List of included patches:
1. zuul: Start to migrate nova-live-migration to zuulv3
2. zuul: Replace nova-live-migration with zuulv3 jobs
Closes-Bug: #1901739
Change-Id: Ib342e2d3c395830b4667a60de7e492d3b9de2f0a
(cherry picked from commit 4ac4a04d1843b0450e8d6d80189ce3e85253dcd0)
(cherry picked from commit 478be6f4fbbbc7b05becd5dd92a27f0c4e8f8ef8)
3. zuul: Replace grenade and nova-grenade-multinode with grenade-multinode
Change-Id: I02b2b851a74f24816d2f782a66d94de81ee527b0
(cherry picked from commit 91e53e4c2b90ea57aeac4ec522dd7c8c54961d09)
(cherry picked from commit c45bedd98d50af865d727b7456c974c8e27bff8b)
(cherry picked from commit 2af08fb5ead8ca1fa4d6b8ea00f3c5c3d26e562c)
Change-Id: Ibbb3930a6e629e93a424b3ae048f599f11923be3
(cherry picked from commit 1c733d973015999ee692ed48fb10a282c50fdc49)
(cherry picked from commit 341ba7aa175a0a082fec6e5360ae3afa2596ca95)
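The replacement jobs select tests through devstack's standard Tempest
tox environment rather than the legacy run_tempest wrapper; a rough
local equivalent of the new nova-live-migration selection (assuming a
standard /opt/stack/tempest checkout) is:

    cd /opt/stack/tempest
    # Mirrors tox_envlist: all and tempest_test_regex from the zuulv3 job.
    tox -e all -- '^tempest\.api\.compute\.admin\.(test_live_migration|test_migration)'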
-rw-r--r-- | .zuul.yaml | 66
-rwxr-xr-x | gate/live_migration/hooks/ceph.sh | 208
-rwxr-xr-x | gate/live_migration/hooks/nfs.sh | 50
-rwxr-xr-x | gate/live_migration/hooks/run_tests.sh | 75
-rwxr-xr-x | gate/live_migration/hooks/utils.sh | 21
-rw-r--r-- | playbooks/legacy/nova-grenade-multinode/post.yaml | 15
-rw-r--r-- | playbooks/legacy/nova-grenade-multinode/run.yaml | 58
-rw-r--r-- | playbooks/legacy/nova-live-migration/post.yaml | 15
-rw-r--r-- | playbooks/legacy/nova-live-migration/run.yaml | 59
-rw-r--r-- | playbooks/nova-evacuate/run.yaml | 8
-rw-r--r-- | playbooks/nova-live-migration/post-run.yaml | 10
-rw-r--r-- | roles/run-evacuate-hook/README.rst | 1
-rwxr-xr-x | roles/run-evacuate-hook/files/setup_evacuate_resources.sh | 34
-rwxr-xr-x | roles/run-evacuate-hook/files/test_evacuate.sh | 55
-rwxr-xr-x | roles/run-evacuate-hook/files/test_negative_evacuate.sh | 37
-rw-r--r-- | roles/run-evacuate-hook/tasks/main.yaml | 64
16 files changed, 253 insertions, 523 deletions
diff --git a/.zuul.yaml b/.zuul.yaml index 69ec4712f7..8e62b42c25 100644 --- a/.zuul.yaml +++ b/.zuul.yaml @@ -131,15 +131,36 @@ - job: name: nova-live-migration - parent: nova-dsvm-multinode-base + parent: tempest-multinode-full-py3 description: | - Run tempest live migration tests against both local storage and shared - storage using ceph (the environment is reconfigured for ceph after the - local storage tests are run). Also runs simple evacuate tests. - Config drive is forced on all instances. - run: playbooks/legacy/nova-live-migration/run.yaml - post-run: playbooks/legacy/nova-live-migration/post.yaml + Run tempest live migration tests against local qcow2 ephemeral storage + and shared LVM/iSCSI cinder volumes. irrelevant-files: *dsvm-irrelevant-files + vars: + tox_envlist: all + tempest_test_regex: (^tempest\.api\.compute\.admin\.(test_live_migration|test_migration)) + devstack_local_conf: + test-config: + $TEMPEST_CONFIG: + compute-feature-enabled: + volume_backed_live_migration: true + block_migration_for_live_migration: true + block_migrate_cinder_iscsi: true + post-run: playbooks/nova-live-migration/post-run.yaml + +# TODO(lyarwood): The following jobs need to be written as part of the +# migration to zuulv3 before nova-live-migration can be removed: +# +#- job: +# name: nova-multinode-live-migration-ceph +# description: | +# Run tempest live migration tests against ceph ephemeral storage and +# cinder volumes. +#- job: +# name: nova-multinode-evacuate-ceph +# description: | +# Verifiy the evacuation of instances with ceph ephemeral disks +# from down compute hosts. - job: name: nova-lvm @@ -282,21 +303,24 @@ - job: name: nova-grenade-multinode - parent: nova-dsvm-multinode-base + parent: grenade-multinode description: | - Multi-node grenade job which runs gate/live_migration/hooks tests. - In other words, this tests live and cold migration and resize with - mixed-version compute services which is important for things like - rolling upgrade support. + Run a multinode grenade job and run the smoke, cold and live migration + tests with the controller upgraded and the compute on the older release. The former names for this job were "nova-grenade-live-migration" and "legacy-grenade-dsvm-neutron-multinode-live-migration". - run: playbooks/legacy/nova-grenade-multinode/run.yaml - post-run: playbooks/legacy/nova-grenade-multinode/post.yaml - required-projects: - - openstack/grenade - - openstack/devstack-gate - - openstack/nova irrelevant-files: *dsvm-irrelevant-files + vars: + devstack_local_conf: + test-config: + $TEMPEST_CONFIG: + compute-feature-enabled: + live_migration: true + volume_backed_live_migration: true + block_migration_for_live_migration: true + block_migrate_cinder_iscsi: true + tox_envlist: all + tempest_test_regex: ((tempest\.(api\.compute|scenario)\..*smoke.*)|(^tempest\.api\.compute\.admin\.(test_live_migration|test_migration))) - job: name: nova-multi-cell @@ -395,7 +419,6 @@ # code; we don't need to run this on all changes, nor do we run # it in the gate. 
- ^(?!nova/network/.*)(?!nova/virt/libvirt/vif.py).*$ - - nova-grenade-multinode - nova-live-migration - nova-lvm - nova-multi-cell @@ -408,7 +431,7 @@ irrelevant-files: *dsvm-irrelevant-files - tempest-slow-py3: irrelevant-files: *dsvm-irrelevant-files - - grenade-py3: + - nova-grenade-multinode: irrelevant-files: *dsvm-irrelevant-files - tempest-ipv6-only: irrelevant-files: *dsvm-irrelevant-files @@ -416,7 +439,6 @@ nodeset: ubuntu-bionic gate: jobs: - - nova-grenade-multinode - nova-live-migration - nova-tox-functional - nova-tox-functional-py36 @@ -427,7 +449,7 @@ irrelevant-files: *dsvm-irrelevant-files - tempest-slow-py3: irrelevant-files: *dsvm-irrelevant-files - - grenade-py3: + - nova-grenade-multinode: irrelevant-files: *dsvm-irrelevant-files - tempest-ipv6-only: irrelevant-files: *dsvm-irrelevant-files diff --git a/gate/live_migration/hooks/ceph.sh b/gate/live_migration/hooks/ceph.sh deleted file mode 100755 index c588f7c9b2..0000000000 --- a/gate/live_migration/hooks/ceph.sh +++ /dev/null @@ -1,208 +0,0 @@ -#!/bin/bash - -function prepare_ceph { - git clone https://opendev.org/openstack/devstack-plugin-ceph /tmp/devstack-plugin-ceph - source /tmp/devstack-plugin-ceph/devstack/settings - source /tmp/devstack-plugin-ceph/devstack/lib/ceph - install_ceph - configure_ceph - #install ceph-common package on compute nodes - $ANSIBLE subnodes --become -f 5 -i "$WORKSPACE/inventory" -m raw -a "executable=/bin/bash - export CEPH_RELEASE=nautilus - source $BASE/new/devstack/functions - source $BASE/new/devstack/functions-common - git clone https://opendev.org/openstack/devstack-plugin-ceph /tmp/devstack-plugin-ceph - source /tmp/devstack-plugin-ceph/devstack/lib/ceph - install_ceph_remote - " - - #copy ceph admin keyring to compute nodes - sudo cp /etc/ceph/ceph.client.admin.keyring /tmp/ceph.client.admin.keyring - sudo chown ${STACK_USER}:${STACK_USER} /tmp/ceph.client.admin.keyring - sudo chmod 644 /tmp/ceph.client.admin.keyring - $ANSIBLE subnodes --become -f 5 -i "$WORKSPACE/inventory" -m copy -a "src=/tmp/ceph.client.admin.keyring dest=/etc/ceph/ceph.client.admin.keyring owner=ceph group=ceph" - sudo rm -f /tmp/ceph.client.admin.keyring - #copy ceph.conf to compute nodes - $ANSIBLE subnodes --become -f 5 -i "$WORKSPACE/inventory" -m copy -a "src=/etc/ceph/ceph.conf dest=/etc/ceph/ceph.conf owner=root group=root" - - start_ceph -} - -function _ceph_configure_glance { - GLANCE_API_CONF=${GLANCE_API_CONF:-/etc/glance/glance-api.conf} - sudo ceph -c ${CEPH_CONF_FILE} osd pool create ${GLANCE_CEPH_POOL} ${GLANCE_CEPH_POOL_PG} ${GLANCE_CEPH_POOL_PGP} - sudo ceph -c ${CEPH_CONF_FILE} auth get-or-create client.${GLANCE_CEPH_USER} \ - mon "allow r" \ - osd "allow class-read object_prefix rbd_children, allow rwx pool=${GLANCE_CEPH_POOL}" | \ - sudo tee ${CEPH_CONF_DIR}/ceph.client.${GLANCE_CEPH_USER}.keyring - sudo chown ${STACK_USER}:$(id -g -n $whoami) ${CEPH_CONF_DIR}/ceph.client.${GLANCE_CEPH_USER}.keyring - - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${GLANCE_API_CONF} section=DEFAULT option=show_image_direct_url value=True" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${GLANCE_API_CONF} section=glance_store option=default_store value=rbd" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${GLANCE_API_CONF} section=glance_store option=stores value='file, http, rbd'" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${GLANCE_API_CONF} section=glance_store 
option=rbd_store_ceph_conf value=$CEPH_CONF_FILE" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${GLANCE_API_CONF} section=glance_store option=rbd_store_user value=$GLANCE_CEPH_USER" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${GLANCE_API_CONF} section=glance_store option=rbd_store_pool value=$GLANCE_CEPH_POOL" - - sudo ceph -c ${CEPH_CONF_FILE} osd pool set ${GLANCE_CEPH_POOL} size ${CEPH_REPLICAS} - if [[ $CEPH_REPLICAS -ne 1 ]]; then - sudo ceph -c ${CEPH_CONF_FILE} osd pool set ${GLANCE_CEPH_POOL} crush_ruleset ${RULE_ID} - fi - - #copy glance keyring to compute only node - sudo cp /etc/ceph/ceph.client.glance.keyring /tmp/ceph.client.glance.keyring - sudo chown $STACK_USER:$STACK_USER /tmp/ceph.client.glance.keyring - $ANSIBLE subnodes --become -f 5 -i "$WORKSPACE/inventory" -m copy -a "src=/tmp/ceph.client.glance.keyring dest=/etc/ceph/ceph.client.glance.keyring" - sudo rm -f /tmp/ceph.client.glance.keyring -} - -function configure_and_start_glance { - _ceph_configure_glance - echo 'check processes before glance-api stop' - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "ps aux | grep glance-api" - - # restart glance - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "systemctl restart devstack@g-api" - - echo 'check processes after glance-api stop' - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "ps aux | grep glance-api" -} - -function _ceph_configure_nova { - #setup ceph for nova, we don't reuse configure_ceph_nova - as we need to emulate case where cinder is not configured for ceph - sudo ceph -c ${CEPH_CONF_FILE} osd pool create ${NOVA_CEPH_POOL} ${NOVA_CEPH_POOL_PG} ${NOVA_CEPH_POOL_PGP} - NOVA_CONF=${NOVA_CPU_CONF:-/etc/nova/nova.conf} - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${NOVA_CONF} section=libvirt option=rbd_user value=${CINDER_CEPH_USER}" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${NOVA_CONF} section=libvirt option=rbd_secret_uuid value=${CINDER_CEPH_UUID}" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${NOVA_CONF} section=libvirt option=inject_key value=false" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${NOVA_CONF} section=libvirt option=inject_partition value=-2" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${NOVA_CONF} section=libvirt option=disk_cachemodes value='network=writeback'" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${NOVA_CONF} section=libvirt option=images_type value=rbd" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${NOVA_CONF} section=libvirt option=images_rbd_pool value=${NOVA_CEPH_POOL}" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=${NOVA_CONF} section=libvirt option=images_rbd_ceph_conf value=${CEPH_CONF_FILE}" - - sudo ceph -c ${CEPH_CONF_FILE} auth get-or-create client.${CINDER_CEPH_USER} \ - mon "allow r" \ - osd "allow class-read object_prefix rbd_children, allow rwx pool=${CINDER_CEPH_POOL}, allow rwx pool=${NOVA_CEPH_POOL},allow rwx pool=${GLANCE_CEPH_POOL}" | \ - sudo tee ${CEPH_CONF_DIR}/ceph.client.${CINDER_CEPH_USER}.keyring > /dev/null - sudo chown ${STACK_USER}:$(id -g -n $whoami) ${CEPH_CONF_DIR}/ceph.client.${CINDER_CEPH_USER}.keyring - - #copy cinder keyring to compute only node - sudo cp /etc/ceph/ceph.client.cinder.keyring 
/tmp/ceph.client.cinder.keyring - sudo chown stack:stack /tmp/ceph.client.cinder.keyring - $ANSIBLE subnodes --become -f 5 -i "$WORKSPACE/inventory" -m copy -a "src=/tmp/ceph.client.cinder.keyring dest=/etc/ceph/ceph.client.cinder.keyring" - sudo rm -f /tmp/ceph.client.cinder.keyring - - sudo ceph -c ${CEPH_CONF_FILE} osd pool set ${NOVA_CEPH_POOL} size ${CEPH_REPLICAS} - if [[ $CEPH_REPLICAS -ne 1 ]]; then - sudo ceph -c ${CEPH_CONF_FILE} osd pool set ${NOVA_CEPH_POOL} crush_ruleset ${RULE_ID} - fi -} - -function _wait_for_nova_compute_service_state { - source $BASE/new/devstack/openrc admin admin - local status=$1 - local attempt=1 - local max_attempts=24 - local attempt_sleep=5 - local computes_count=$(openstack compute service list | grep -c nova-compute) - local computes_ready=$(openstack compute service list | grep nova-compute | grep $status | wc -l) - - echo "Waiting for $computes_count computes to report as $status" - while [ "$computes_ready" -ne "$computes_count" ]; do - if [ "$attempt" -eq "$max_attempts" ]; then - echo "Failed waiting for computes to report as ${status}, ${computes_ready}/${computes_count} ${status} after ${max_attempts} attempts" - exit 4 - fi - echo "Waiting ${attempt_sleep} seconds for ${computes_count} computes to report as ${status}, ${computes_ready}/${computes_count} ${status} after ${attempt}/${max_attempts} attempts" - sleep $attempt_sleep - attempt=$((attempt+1)) - computes_ready=$(openstack compute service list | grep nova-compute | grep $status | wc -l) - done - echo "All computes are now reporting as ${status} after ${attempt} attempts" -} - -function configure_and_start_nova { - - echo "Checking all n-cpu services" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "pgrep -u stack -a nova-compute" - - # stop nova-compute - echo "Stopping all n-cpu services" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "systemctl stop devstack@n-cpu" - - # Wait for the service to be marked as down - _wait_for_nova_compute_service_state "down" - - _ceph_configure_nova - - #import secret to libvirt - _populate_libvirt_secret - - # start nova-compute - echo "Starting all n-cpu services" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "systemctl start devstack@n-cpu" - - echo "Checking all n-cpu services" - # test that they are all running again - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "pgrep -u stack -a nova-compute" - - # Wait for the service to be marked as up - _wait_for_nova_compute_service_state "up" -} - -function _ceph_configure_cinder { - sudo ceph -c ${CEPH_CONF_FILE} osd pool create ${CINDER_CEPH_POOL} ${CINDER_CEPH_POOL_PG} ${CINDER_CEPH_POOL_PGP} - sudo ceph -c ${CEPH_CONF_FILE} osd pool set ${CINDER_CEPH_POOL} size ${CEPH_REPLICAS} - if [[ $CEPH_REPLICAS -ne 1 ]]; then - sudo ceph -c ${CEPH_CONF_FILE} osd pool set ${CINDER_CEPH_POOL} crush_ruleset ${RULE_ID} - fi - - CINDER_CONF=${CINDER_CONF:-/etc/cinder/cinder.conf} - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$CINDER_CONF section=ceph option=volume_backend_name value=ceph" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$CINDER_CONF section=ceph option=volume_driver value=cinder.volume.drivers.rbd.RBDDriver" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$CINDER_CONF section=ceph option=rbd_ceph_conf value=$CEPH_CONF_FILE" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$CINDER_CONF 
section=ceph option=rbd_pool value=$CINDER_CEPH_POOL" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$CINDER_CONF section=ceph option=rbd_user value=$CINDER_CEPH_USER" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$CINDER_CONF section=ceph option=rbd_uuid value=$CINDER_CEPH_UUID" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$CINDER_CONF section=ceph option=rbd_flatten_volume_from_snapshot value=False" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$CINDER_CONF section=ceph option=rbd_max_clone_depth value=5" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$CINDER_CONF section=DEFAULT option=default_volume_type value=ceph" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$CINDER_CONF section=DEFAULT option=enabled_backends value=ceph" - -} - -function configure_and_start_cinder { - _ceph_configure_cinder - - # restart cinder - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "systemctl restart devstack@c-vol" - - source $BASE/new/devstack/openrc - - export OS_USERNAME=admin - export OS_PROJECT_NAME=admin - lvm_type=$(cinder type-list | awk -F "|" 'NR==4{ print $2}') - cinder type-delete $lvm_type - openstack volume type create --os-volume-api-version 1 --property volume_backend_name="ceph" ceph -} - -function _populate_libvirt_secret { - cat > /tmp/secret.xml <<EOF -<secret ephemeral='no' private='no'> - <uuid>${CINDER_CEPH_UUID}</uuid> - <usage type='ceph'> - <name>client.${CINDER_CEPH_USER} secret</name> - </usage> -</secret> -EOF - - $ANSIBLE subnodes --become -f 5 -i "$WORKSPACE/inventory" -m copy -a "src=/tmp/secret.xml dest=/tmp/secret.xml" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "virsh secret-define --file /tmp/secret.xml" - local secret=$(sudo ceph -c ${CEPH_CONF_FILE} auth get-key client.${CINDER_CEPH_USER}) - # TODO(tdurakov): remove this escaping as https://github.com/ansible/ansible/issues/13862 fixed - secret=${secret//=/'\='} - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "virsh secret-set-value --secret ${CINDER_CEPH_UUID} --base64 $secret" - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m file -a "path=/tmp/secret.xml state=absent" - -} diff --git a/gate/live_migration/hooks/nfs.sh b/gate/live_migration/hooks/nfs.sh deleted file mode 100755 index acadb36d6c..0000000000 --- a/gate/live_migration/hooks/nfs.sh +++ /dev/null @@ -1,50 +0,0 @@ -#!/bin/bash - -function nfs_setup { - if uses_debs; then - module=apt - elif is_fedora; then - module=yum - fi - $ANSIBLE all --become -f 5 -i "$WORKSPACE/inventory" -m $module \ - -a "name=nfs-common state=present" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m $module \ - -a "name=nfs-kernel-server state=present" - - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=/etc/idmapd.conf section=Mapping option=Nobody-User value=nova" - - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=/etc/idmapd.conf section=Mapping option=Nobody-Group value=nova" - - for SUBNODE in $SUBNODES ; do - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m lineinfile -a "dest=/etc/exports line='/opt/stack/data/nova/instances $SUBNODE(rw,fsid=0,insecure,no_subtree_check,async,no_root_squash)'" - done - - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "exportfs -a" - $ANSIBLE primary --become -f 5 
-i "$WORKSPACE/inventory" -m service -a "name=nfs-kernel-server state=restarted" - GetDistro - if [[ ! ${DISTRO} =~ (xenial) ]]; then - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m service -a "name=idmapd state=restarted" - fi - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "iptables -A INPUT -p tcp --dport 111 -j ACCEPT" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "iptables -A INPUT -p udp --dport 111 -j ACCEPT" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "iptables -A INPUT -p tcp --dport 2049 -j ACCEPT" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "iptables -A INPUT -p udp --dport 2049 -j ACCEPT" - $ANSIBLE subnodes --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "mount -t nfs4 -o proto\=tcp,port\=2049 $primary_node:/ /opt/stack/data/nova/instances/" -} - -function nfs_configure_tempest { - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$BASE/new/tempest/etc/tempest.conf section=compute-feature-enabled option=block_migration_for_live_migration value=False" -} - -function nfs_verify_setup { - $ANSIBLE subnodes --become -f 5 -i "$WORKSPACE/inventory" -m file -a "path=/opt/stack/data/nova/instances/test_file state=touch" - if [ ! -e '/opt/stack/data/nova/instances/test_file' ]; then - die $LINENO "NFS configuration failure" - fi -} - -function nfs_teardown { - #teardown nfs shared storage - $ANSIBLE subnodes --become -f 5 -i "$WORKSPACE/inventory" -m shell -a "umount -t nfs4 /opt/stack/data/nova/instances/" - $ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m service -a "name=nfs-kernel-server state=stopped" -}
\ No newline at end of file diff --git a/gate/live_migration/hooks/run_tests.sh b/gate/live_migration/hooks/run_tests.sh deleted file mode 100755 index 7a5027f2c9..0000000000 --- a/gate/live_migration/hooks/run_tests.sh +++ /dev/null @@ -1,75 +0,0 @@ -#!/bin/bash -# Live migration dedicated ci job will be responsible for testing different -# environments based on underlying storage, used for ephemerals. -# This hook allows to inject logic of environment reconfiguration in ci job. -# Base scenario for this would be: -# -# 1. test with all local storage (use default for volumes) -# 2. test with NFS for root + ephemeral disks -# 3. test with Ceph for root + ephemeral disks -# 4. test with Ceph for volumes and root + ephemeral disk - -set -xe -cd $BASE/new/tempest - -source $BASE/new/devstack/functions -source $BASE/new/devstack/functions-common -source $BASE/new/devstack/lib/nova -source $WORKSPACE/devstack-gate/functions.sh -source $BASE/new/nova/gate/live_migration/hooks/utils.sh -source $BASE/new/nova/gate/live_migration/hooks/nfs.sh -source $BASE/new/nova/gate/live_migration/hooks/ceph.sh -primary_node=$(cat /etc/nodepool/primary_node_private) -SUBNODES=$(cat /etc/nodepool/sub_nodes_private) -SERVICE_HOST=$primary_node -STACK_USER=${STACK_USER:-stack} - -echo '1. test with all local storage (use default for volumes)' -echo 'NOTE: test_volume_backed_live_migration is skipped due to https://bugs.launchpad.net/nova/+bug/1524898' -run_tempest "block migration test" "^.*test_live_migration(?!.*(test_volume_backed_live_migration))" - -# TODO(mriedem): Run $BASE/new/nova/gate/test_evacuate.sh for local storage - -#all tests bellow this line use shared storage, need to update tempest.conf -echo 'disabling block_migration in tempest' -$ANSIBLE primary --become -f 5 -i "$WORKSPACE/inventory" -m ini_file -a "dest=$BASE/new/tempest/etc/tempest.conf section=compute-feature-enabled option=block_migration_for_live_migration value=False" - -echo '2. NFS testing is skipped due to setup failures with Ubuntu 16.04' -#echo '2. test with NFS for root + ephemeral disks' - -#nfs_setup -#nfs_configure_tempest -#nfs_verify_setup -#run_tempest "NFS shared storage test" "live_migration" -#nfs_teardown - -# The nova-grenade-multinode job also runs resize and cold migration tests -# so we check for a grenade-only variable. -if [[ -n "$GRENADE_NEW_BRANCH" ]]; then - echo '3. test cold migration and resize' - run_tempest "cold migration and resize test" "test_resize_server|test_cold_migration|test_revert_cold_migration" -else - echo '3. cold migration and resize is skipped for non-grenade jobs' -fi - -echo '4. test with Ceph for root + ephemeral disks' -# Discover and set variables for the OS version so the devstack-plugin-ceph -# scripts can find the correct repository to install the ceph packages. -# NOTE(lyarwood): Pin the CEPH_RELEASE to nautilus here as was the case -# prior to https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/777232 -# landing in the branchless plugin, we also have to pin in ceph.sh when -# configuring ceph on a remote node via ansible. -export CEPH_RELEASE=nautilus -GetOSVersion -prepare_ceph -GLANCE_API_CONF=${GLANCE_API_CONF:-/etc/glance/glance-api.conf} -configure_and_start_glance - -configure_and_start_nova -run_tempest "Ceph nova&glance test" "^.*test_live_migration(?!.*(test_volume_backed_live_migration))" - -set +e -#echo '5. 
test with Ceph for volumes and root + ephemeral disk' - -#configure_and_start_cinder -#run_tempest "Ceph nova&glance&cinder test" "live_migration" diff --git a/gate/live_migration/hooks/utils.sh b/gate/live_migration/hooks/utils.sh deleted file mode 100755 index e494ae03f8..0000000000 --- a/gate/live_migration/hooks/utils.sh +++ /dev/null @@ -1,21 +0,0 @@ -#!/bin/bash - -function run_tempest { - local message=$1 - local tempest_regex=$2 - - # NOTE(gmann): Set upper constraint for Tempest run so that it matches - # with what devstack is using and does not recreate the tempest virtual - # env. - TEMPEST_VENV_UPPER_CONSTRAINTS=$(set +o xtrace && - source $BASE/new/devstack/stackrc && - echo $TEMPEST_VENV_UPPER_CONSTRAINTS) - export UPPER_CONSTRAINTS_FILE=$TEMPEST_VENV_UPPER_CONSTRAINTS - echo "using $UPPER_CONSTRAINTS_FILE for tempest run" - - sudo -H -u tempest UPPER_CONSTRAINTS_FILE=$UPPER_CONSTRAINTS_FILE tox -eall -- $tempest_regex --concurrency=$TEMPEST_CONCURRENCY - exitcode=$? - if [[ $exitcode -ne 0 ]]; then - die $LINENO "$message failure" - fi -} diff --git a/playbooks/legacy/nova-grenade-multinode/post.yaml b/playbooks/legacy/nova-grenade-multinode/post.yaml deleted file mode 100644 index e07f5510ae..0000000000 --- a/playbooks/legacy/nova-grenade-multinode/post.yaml +++ /dev/null @@ -1,15 +0,0 @@ -- hosts: primary - tasks: - - - name: Copy files from {{ ansible_user_dir }}/workspace/ on node - synchronize: - src: '{{ ansible_user_dir }}/workspace/' - dest: '{{ zuul.executor.log_root }}' - mode: pull - copy_links: true - verify_host: true - rsync_opts: - - --include=/logs/** - - --include=*/ - - --exclude=* - - --prune-empty-dirs diff --git a/playbooks/legacy/nova-grenade-multinode/run.yaml b/playbooks/legacy/nova-grenade-multinode/run.yaml deleted file mode 100644 index ffdced6cc8..0000000000 --- a/playbooks/legacy/nova-grenade-multinode/run.yaml +++ /dev/null @@ -1,58 +0,0 @@ -- hosts: primary - name: nova-grenade-multinode - tasks: - - - name: Ensure legacy workspace directory - file: - path: '{{ ansible_user_dir }}/workspace' - state: directory - - - shell: - cmd: | - set -e - set -x - cat > clonemap.yaml << EOF - clonemap: - - name: openstack/devstack-gate - dest: devstack-gate - EOF - /usr/zuul-env/bin/zuul-cloner -m clonemap.yaml --cache-dir /opt/git \ - https://opendev.org \ - openstack/devstack-gate - executable: /bin/bash - chdir: '{{ ansible_user_dir }}/workspace' - environment: '{{ zuul | zuul_legacy_vars }}' - - - shell: - cmd: | - set -e - set -x - export PROJECTS="openstack/grenade $PROJECTS" - export PYTHONUNBUFFERED=true - export DEVSTACK_GATE_CONFIGDRIVE=0 - export DEVSTACK_GATE_NEUTRON=1 - export DEVSTACK_GATE_TEMPEST_NOTESTS=1 - export DEVSTACK_GATE_GRENADE=pullup - # By default grenade runs only smoke tests so we need to set - # RUN_SMOKE to False in order to run live migration tests using - # grenade - export DEVSTACK_LOCAL_CONFIG="RUN_SMOKE=False" - # LIVE_MIGRATE_BACK_AND_FORTH will tell Tempest to run a live - # migration of the same instance to one compute node and then back - # to the other, which is mostly only interesting for grenade since - # we have mixed level computes. 
- export DEVSTACK_LOCAL_CONFIG+=$'\n'"LIVE_MIGRATE_BACK_AND_FORTH=True" - export BRANCH_OVERRIDE=default - export DEVSTACK_GATE_TOPOLOGY="multinode" - if [ "$BRANCH_OVERRIDE" != "default" ] ; then - export OVERRIDE_ZUUL_BRANCH=$BRANCH_OVERRIDE - fi - function post_test_hook { - /opt/stack/new/nova/gate/live_migration/hooks/run_tests.sh - } - export -f post_test_hook - cp devstack-gate/devstack-vm-gate-wrap.sh ./safe-devstack-vm-gate-wrap.sh - ./safe-devstack-vm-gate-wrap.sh - executable: /bin/bash - chdir: '{{ ansible_user_dir }}/workspace' - environment: '{{ zuul | zuul_legacy_vars }}' diff --git a/playbooks/legacy/nova-live-migration/post.yaml b/playbooks/legacy/nova-live-migration/post.yaml deleted file mode 100644 index e07f5510ae..0000000000 --- a/playbooks/legacy/nova-live-migration/post.yaml +++ /dev/null @@ -1,15 +0,0 @@ -- hosts: primary - tasks: - - - name: Copy files from {{ ansible_user_dir }}/workspace/ on node - synchronize: - src: '{{ ansible_user_dir }}/workspace/' - dest: '{{ zuul.executor.log_root }}' - mode: pull - copy_links: true - verify_host: true - rsync_opts: - - --include=/logs/** - - --include=*/ - - --exclude=* - - --prune-empty-dirs diff --git a/playbooks/legacy/nova-live-migration/run.yaml b/playbooks/legacy/nova-live-migration/run.yaml deleted file mode 100644 index dd60e38a64..0000000000 --- a/playbooks/legacy/nova-live-migration/run.yaml +++ /dev/null @@ -1,59 +0,0 @@ -- hosts: primary - name: nova-live-migration - tasks: - - - name: Ensure legacy workspace directory - file: - path: '{{ ansible_user_dir }}/workspace' - state: directory - - - shell: - cmd: | - set -e - set -x - cat > clonemap.yaml << EOF - clonemap: - - name: openstack/devstack-gate - dest: devstack-gate - EOF - /usr/zuul-env/bin/zuul-cloner -m clonemap.yaml --cache-dir /opt/git \ - https://opendev.org \ - openstack/devstack-gate - executable: /bin/bash - chdir: '{{ ansible_user_dir }}/workspace' - environment: '{{ zuul | zuul_legacy_vars }}' - - - name: Configure devstack - shell: - # Force config drive. 
- cmd: | - set -e - set -x - cat << 'EOF' >>"/tmp/dg-local.conf" - [[local|localrc]] - FORCE_CONFIG_DRIVE=True - - EOF - executable: /bin/bash - chdir: '{{ ansible_user_dir }}/workspace' - environment: '{{ zuul | zuul_legacy_vars }}' - - - shell: - cmd: | - set -e - set -x - export PYTHONUNBUFFERED=true - export DEVSTACK_GATE_CONFIGDRIVE=0 - export DEVSTACK_GATE_TEMPEST=1 - export DEVSTACK_GATE_TEMPEST_NOTESTS=1 - export DEVSTACK_GATE_TOPOLOGY="multinode" - function post_test_hook { - /opt/stack/new/nova/gate/live_migration/hooks/run_tests.sh - $BASE/new/nova/gate/test_evacuate.sh - } - export -f post_test_hook - cp devstack-gate/devstack-vm-gate-wrap.sh ./safe-devstack-vm-gate-wrap.sh - ./safe-devstack-vm-gate-wrap.sh - executable: /bin/bash - chdir: '{{ ansible_user_dir }}/workspace' - environment: '{{ zuul | zuul_legacy_vars }}' diff --git a/playbooks/nova-evacuate/run.yaml b/playbooks/nova-evacuate/run.yaml new file mode 100644 index 0000000000..35e330a6de --- /dev/null +++ b/playbooks/nova-evacuate/run.yaml @@ -0,0 +1,8 @@ +--- +- hosts: all + roles: + - orchestrate-devstack + +- hosts: controller + roles: + - run-evacuate-hook diff --git a/playbooks/nova-live-migration/post-run.yaml b/playbooks/nova-live-migration/post-run.yaml new file mode 100644 index 0000000000..845a1b15b2 --- /dev/null +++ b/playbooks/nova-live-migration/post-run.yaml @@ -0,0 +1,10 @@ +--- +- hosts: tempest + become: true + roles: + - role: fetch-subunit-output + zuul_work_dir: '{{ devstack_base_dir }}/tempest' + - role: process-stackviz +- hosts: controller + roles: + - run-evacuate-hook diff --git a/roles/run-evacuate-hook/README.rst b/roles/run-evacuate-hook/README.rst new file mode 100644 index 0000000000..e423455aee --- /dev/null +++ b/roles/run-evacuate-hook/README.rst @@ -0,0 +1 @@ +Run Nova evacuation tests against a multinode environment. 
diff --git a/roles/run-evacuate-hook/files/setup_evacuate_resources.sh b/roles/run-evacuate-hook/files/setup_evacuate_resources.sh new file mode 100755 index 0000000000..c8c385d7ff --- /dev/null +++ b/roles/run-evacuate-hook/files/setup_evacuate_resources.sh @@ -0,0 +1,34 @@ +#!/bin/bash +source /opt/stack/devstack/openrc admin +set -x +set -e + +image_id=$(openstack image list -f value -c ID | awk 'NR==1{print $1}') +flavor_id=$(openstack flavor list -f value -c ID | awk 'NR==1{print $1}') +network_id=$(openstack network list --no-share -f value -c ID | awk 'NR==1{print $1}') + +echo "Creating ephemeral test server on subnode" +openstack --os-compute-api-version 2.74 server create --image ${image_id} --flavor ${flavor_id} \ +--nic net-id=${network_id} --host $SUBNODE_HOSTNAME --wait evacuate-test + +# TODO(lyarwood) Use osc to launch the bfv volume +echo "Creating boot from volume test server on subnode" +nova --os-compute-api-version 2.74 boot --flavor ${flavor_id} --poll \ +--block-device id=${image_id},source=image,dest=volume,size=1,bootindex=0,shutdown=remove \ +--nic net-id=${network_id} --host ${SUBNODE_HOSTNAME} evacuate-bfv-test + +echo "Forcing down the subnode so we can evacuate from it" +openstack --os-compute-api-version 2.11 compute service set --down ${SUBNODE_HOSTNAME} nova-compute + +count=0 +status=$(openstack compute service list --host ${SUBNODE_HOSTNAME} --service nova-compute -f value -c State) +while [ "${status}" != "down" ] +do + sleep 1 + count=$((count+1)) + if [ ${count} -eq 30 ]; then + echo "Timed out waiting for subnode compute service to be marked as down" + exit 5 + fi + status=$(openstack compute service list --host ${SUBNODE_HOSTNAME} --service nova-compute -f value -c State) +done diff --git a/roles/run-evacuate-hook/files/test_evacuate.sh b/roles/run-evacuate-hook/files/test_evacuate.sh new file mode 100755 index 0000000000..bdf8d92441 --- /dev/null +++ b/roles/run-evacuate-hook/files/test_evacuate.sh @@ -0,0 +1,55 @@ +#!/bin/bash +# Source tempest to determine the build timeout configuration. +source /opt/stack/devstack/lib/tempest +source /opt/stack/devstack/openrc admin +set -x +set -e + +# Wait for the controller compute service to be enabled. +count=0 +status=$(openstack compute service list --host ${CONTROLLER_HOSTNAME} --service nova-compute -f value -c Status) +while [ "${status}" != "enabled" ] +do + sleep 1 + count=$((count+1)) + if [ ${count} -eq 30 ]; then + echo "Timed out waiting for controller compute service to be enabled" + exit 5 + fi + status=$(openstack compute service list --host ${CONTROLLER_HOSTNAME} --service nova-compute -f value -c Status) +done + +function evacuate_and_wait_for_active() { + local server="$1" + + nova evacuate ${server} + # Wait for the instance to go into ACTIVE state from the evacuate. + count=0 + status=$(openstack server show ${server} -f value -c status) + while [ "${status}" != "ACTIVE" ] + do + sleep 1 + count=$((count+1)) + if [ ${count} -eq ${BUILD_TIMEOUT} ]; then + echo "Timed out waiting for server ${server} to go to ACTIVE status" + exit 6 + fi + status=$(openstack server show ${server} -f value -c status) + done +} + +evacuate_and_wait_for_active evacuate-test +evacuate_and_wait_for_active evacuate-bfv-test + +# Make sure the servers moved. +for server in evacuate-test evacuate-bfv-test; do + host=$(openstack server show ${server} -f value -c OS-EXT-SRV-ATTR:host) + if [[ ${host} != ${CONTROLLER_HOSTNAME} ]]; then + echo "Unexpected host ${host} for server ${server} after evacuate." 
+ exit 7 + fi +done + +# Cleanup test servers +openstack server delete --wait evacuate-test +openstack server delete --wait evacuate-bfv-test diff --git a/roles/run-evacuate-hook/files/test_negative_evacuate.sh b/roles/run-evacuate-hook/files/test_negative_evacuate.sh new file mode 100755 index 0000000000..b1f5f7a4af --- /dev/null +++ b/roles/run-evacuate-hook/files/test_negative_evacuate.sh @@ -0,0 +1,37 @@ +#!/bin/bash +# Source tempest to determine the build timeout configuration. +source /opt/stack/devstack/lib/tempest +source /opt/stack/devstack/openrc admin +set -x +set -e + +# Now force the evacuation to the controller; we have to force to bypass the +# scheduler since we killed libvirtd which will trigger the libvirt compute +# driver to auto-disable the nova-compute service and then the ComputeFilter +# would filter out this host and we'd get NoValidHost. Normally forcing a host +# during evacuate and bypassing the scheduler is a very bad idea, but we're +# doing a negative test here. + +function evacuate_and_wait_for_error() { + local server="$1" + + echo "Forcing evacuate of ${server} to local host" + # TODO(mriedem): Use OSC when it supports evacuate. + nova --os-compute-api-version "2.67" evacuate --force ${server} ${CONTROLLER_HOSTNAME} + # Wait for the instance to go into ERROR state from the failed evacuate. + count=0 + status=$(openstack server show ${server} -f value -c status) + while [ "${status}" != "ERROR" ] + do + sleep 1 + count=$((count+1)) + if [ ${count} -eq ${BUILD_TIMEOUT} ]; then + echo "Timed out waiting for server ${server} to go to ERROR status" + exit 4 + fi + status=$(openstack server show ${server} -f value -c status) + done +} + +evacuate_and_wait_for_error evacuate-test +evacuate_and_wait_for_error evacuate-bfv-test diff --git a/roles/run-evacuate-hook/tasks/main.yaml b/roles/run-evacuate-hook/tasks/main.yaml new file mode 100644 index 0000000000..f6c80bcb6b --- /dev/null +++ b/roles/run-evacuate-hook/tasks/main.yaml @@ -0,0 +1,64 @@ +- name: Setup resources and mark the subnode as forced down + become: true + become_user: stack + shell: "/opt/stack/nova/roles/run-evacuate-hook/files/setup_evacuate_resources.sh" + environment: + SUBNODE_HOSTNAME: "{{ hostvars['compute1']['ansible_hostname'] }}" + +- name: Fence subnode by stopping q-agt and n-cpu + delegate_to: compute1 + become: true + systemd: + name: "{{ item }}" + state: stopped + with_items: + - devstack@q-agt + - devstack@n-cpu + +- name: Register running domains on subnode + delegate_to: compute1 + become: true + virt: + command: list_vms + state: running + register: subnode_vms + +- name: Destroy running domains on subnode + delegate_to: compute1 + become: true + virt: + name: "{{ item }}" + state: destroyed + with_items: "{{ subnode_vms.list_vms }}" + +- name: Stop libvirtd on "{{ inventory_hostname }}" + become: true + systemd: + name: "{{ item }}" + state: stopped + enabled: no + with_items: + - libvirtd + +- name: Run negative evacuate tests + become: true + become_user: stack + shell: "/opt/stack/nova/roles/run-evacuate-hook/files/test_negative_evacuate.sh" + environment: + CONTROLLER_HOSTNAME: "{{ hostvars['controller']['ansible_hostname'] }}" + +- name: Start libvirtd on "{{ inventory_hostname }}" + become: true + systemd: + name: "{{ item }}" + state: started + enabled: yes + with_items: + - libvirtd + +- name: Run evacuate tests + become: true + become_user: stack + shell: "/opt/stack/nova/roles/run-evacuate-hook/files/test_evacuate.sh" + environment: + CONTROLLER_HOSTNAME: "{{ 
hostvars['controller']['ansible_hostname'] }}" |