path: root/cloudinit/sources/DataSourceAzure.py
Commit message [Author, Age, Files, Lines]
* azure/errors: introduce reportable errors for imds (#3647) [Chris Patterson, 2023-05-12, 1 file, -10/+25]
  Always report failure to host, but report failure to fabric only outside of _check_if_nic_is_primary() which is expected to fail if nic is not primary.
  Add two types of reportable errors for IMDS metadata:
  - add ReportableErrorImdsUrlError() for url errors.
  - add ReportableErrorImdsMetadataParsingException() for parsing errors.
  Tweak ReportableError repr to be a bit friendlier.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* azure/errors: add host reporting for dhcp errors (#2167) [Chris Patterson, 2023-05-11, 1 file, -7/+29]
  - Add host_only flag to _report_failure() to allow caller to only report the failure to host. This is for cases where we don't want _report_failure() to attempt DHCP or we expect that we may recover from the reported error (there is no issue reporting multiple times to host, whereas fabric reports will immediately fail the VM provisioning).
  - Add ReportableErrorDhcpLease() to report lease failures.
  - Add ReportableErrorDhcpInterfaceNotFound() to report errors where the DHCP interface hasn't been found yet.
  - Add TestReportFailure class with new test coverage. Will migrate other _report_failure() tests in the future as they currently depend on TestAzureDataSource/CiTestCase.
  Future work will add the interface name to supporting data, but as that information is not available with iface=None, another PR will explicitly add a call to net.find_fallback_nic() to specify it.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* net: purge blacklist_drivers across net and azure (#2160) [Chris Patterson, 2023-05-10, 1 file, -26/+3]
  It was only used by Hyper-V which now has a filtering mechanism that does not require the use of a denylist.
  This exposed some issues with tests misspelling "hv_netvsc" and using unmatched mac addresses. This fixes those to work with the current filter that does not rely on the driver name.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* Remove mount NTFS error message (#2134) [Ksenija Stanojevic, 2023-05-09, 1 file, -0/+1]
  Provide an option to suppress error logging from mount_cb, as some errors are expected and can be handled appropriately by DataSources. For example: failure to mount NTFS volumes on VMs that do not have NTFS drivers.
* sources/azure: report success to host and introduce kvp module (#2141) [Chris Patterson, 2023-04-28, 1 file, -17/+6]
  Add success reporting to the host via KVP.
  - Move _report_failure_to_host() into kvp module.
  - Tweak error description to use result=error instead of PROVISIONING_ERROR: ...
  - Use result=success for the successful ("ready") reports.
  - report_x_via_kvp => report_x_to_host for consistency with fabric.
  ReportableError.as_description() => as_encoded_report()
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: report failures to host via kvp (#2136) [Chris Patterson, 2023-04-25, 1 file, -1/+15]
  Azure can report provisioning failures via the Wireserver health endpoint. However, in the event of networking failures or Wireserver issues, this report cannot be made, the VM hits an OS provisioning timeout, and a generic error is presented to the user.
  Report the failure via KVP using the "PROVISIONING_REPORT" key so that the host can relay the provisioning error report to the user when the VM fails to provision. The format used is subject to change and/or removal.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* azure/imds: retry fetching metadata up to 300 seconds (#2121) [Chris Patterson, 2023-04-19, 1 file, -3/+6]
  Instead of a fixed number of retries, allow up to 5 minutes to fetch metadata from IMDS.
  The current approach allows for up to 11 attempts depending on the path. Given the timeout setting, this can vary from ~11 seconds up to ~32 seconds depending on whether or not read/connection timeouts are encountered.
  Delaying boot on the rare occasion that IMDS is delayed is better than ignoring the metadata as it ensures the VM is configured as expected. This is a very conservative timeout and may be reduced in the future.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
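  The switch from a fixed retry count to a time budget, as described in the commit message above, can be sketched roughly as follows. This is only an illustration, not the code from the commit: fetch_with_deadline(), its parameters, and the choice of OSError as the retryable error are all hypothetical.

```python
import time
from typing import Callable, Optional


def fetch_with_deadline(
    fetch: Callable[[], dict],
    max_seconds: float = 300.0,
    delay: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until it succeeds or the time budget is exhausted.

    Hypothetical helper: the retry is bounded by elapsed time
    (max_seconds) rather than by a number of attempts.
    """
    deadline = clock() + max_seconds
    last_error: Optional[Exception] = None
    while True:
        try:
            return fetch()
        except OSError as error:  # assumption: OSError marks a retryable failure
            last_error = error
        # Give up if another sleep would take us past the deadline.
        if clock() + delay > deadline:
            raise TimeoutError("metadata not fetched in time") from last_error
        sleep(delay)
```

  The clock and sleep parameters exist only so that the loop can be exercised without real waiting.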
* [1/2] DHCP: Refactor dhcp client code (#2122) [Brett Holman, 2023-04-19, 1 file, -3/+1]
  Move isc-dhclient code to dhcp.py
  In support of the upcoming deprecation of isc-dhcp-client, this code refactors current dhcp code into classes in dhcp.py. The primary user-visible change should be the addition of the following log:
    dhcp.py[DEBUG]: DHCP client selected: dhclient
  This code lays groundwork to enable alternate implementations to live side by side in the codebase to be selected with distro-defined priority fallback. Note that maybe_perform_dhcp_discovery() now selects which dhcp client to call, and then runs the corresponding client's dhcp_discovery() method.
  Currently only class IscDhclient is implemented, however a yet-to-be-implemented class Dhcpcd exists to test fallback behavior and this will be implemented in part two of this series.
  Part of this refactor includes shifting dhclient service management from hardcoded calls to the distro-defined manage_service() method in the *BSDs. Future work is required in this area to support multiple clients via select_dhcp_client().
* azure/errors: introduce reportable errors (#2129) [Chris Patterson, 2023-04-19, 1 file, -9/+19]
  When provisioning failures occur on Azure, a generic description is used in the report and ultimately returned to the user. To improve the user experience, report details of the failure in a manner that is parsable, readable and succinct.
  The current approach is to use csv with a custom delimiter ("|") and quote character ("'"). This format may change in the future.
  Gracefully handle reportable errors thrown while crawling metadata and treat other exceptions as ReportableErrorUnhandledException. Future work will introduce more reportable errors to handle the expected failure cases.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
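  The csv encoding with a "|" delimiter and "'" quote character mentioned above can be illustrated with Python's csv module. This is a minimal sketch under assumptions: encode_report() and the key=value field layout are hypothetical and are not taken from the commit, whose exact format the message says may change.

```python
import csv
import io
from typing import Dict


def encode_report(fields: Dict[str, str]) -> str:
    """Encode key=value fields as one csv row using '|' and "'".

    Hypothetical helper for illustration only.
    """
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter="|",
        quotechar="'",
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
    )
    writer.writerow([f"{key}={value}" for key, value in fields.items()])
    # csv.writer appends a line terminator; drop it for a one-line report.
    return buffer.getvalue().rstrip("\r\n")
```

  A field that itself contains "|" is wrapped in the quote character, so a reader configured with the same delimiter and quote character recovers the original fields.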
* azure: introduce identity module (#2116) [Chris Patterson, 2023-04-17, 1 file, -20/+11]
  - Add query_system_uuid() for getting system uuid from dmi in normalized (lower-cased) form.
  - Add byte_swap_system_uuid() to convert a system uuid for gen1 instances to the compute.vmId as presented by IMDS.
  - Add convert_system_uuid_to_vm() to convert system uuid to vm id depending on whether it is gen1 or gen2.
  - Add is_vm_gen1() to determine if VM is Azure's gen1 by checking for availability of EFI (used in gen2).
  - Add query_vm_id() helper to get VM id without system uuid.
  - Move ChassisAssetTag from Azure helpers into identity.
  - Update DataSourceAzure._iid() to use this module.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
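  The byte swap described above can be sketched with the standard uuid module, assuming that the swap means reordering the first three UUID fields between big-endian and little-endian (which is what uuid.UUID.bytes_le provides). The function bodies and the to_vm_id() name are illustrative assumptions, not the module's code.

```python
import uuid


def byte_swap_system_uuid(system_uuid: str) -> str:
    """Swap the byte order of the first three UUID fields.

    Assumption: this mirrors the gen1 conversion named in the commit.
    """
    original = uuid.UUID(system_uuid)
    return str(uuid.UUID(bytes=original.bytes_le))


def to_vm_id(system_uuid: str, is_gen1: bool) -> str:
    """Hypothetical wrapper: swap for gen1, pass through for gen2."""
    if is_gen1:
        return byte_swap_system_uuid(system_uuid)
    return system_uuid.lower()
```

  Applying the swap twice returns the original value, which makes the conversion easy to sanity-check.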
* sources/azure: move pps handling out of _poll_imds() (#2075) [Chris Patterson, 2023-03-29, 1 file, -85/+83]
  Pull out remaining PPS handling bits from _poll_imds() and add two explicit methods for the overloaded path:
  - _wait_for_pps_running_reuse() for running PPS logic.
  - _wait_for_pps_unknown_reuse() for unknown and recovery PPS logic.
  For consistency:
  - Rename _wait_for_all_nics_ready() -> _wait_for_pps_savable_reuse().
  - Move reporting ready logic into _wait_for_pps_os_disk_shutdown().
  Drop several impacted tests as coverage already exists in TestProvisioning, and update the rest to handle the +/- 1 DHCP attempt due to varying assumptions around PPS state and DHCP.
* datasource: Optimize datasource detection, fix bugs (#2060) [Brett Holman, 2023-03-19, 1 file, -16/+15]
  Commit d1ffbea556a06105 enabled skipping python datasource detection on OpenStack when no other datasources (besides DataSourceNone) can be discovered. This allowed one to override detection, which is a requirement for OpenStack Ironic which does not advertise itself to cloud-init.
  Since no further datasources can be detected at this stage in the code, this pattern can be generalized to other datasources to facilitate troubleshooting or providing a general workaround to runtime detection bugs.
  Additionally, this pattern can be extended to kernel commandline datasource definition. Since kernel commandline is highest priority of the configurations, it makes sense to override python code datasource detection as well.
  Include an integration test on LXD for this behavior that configures kernel commandline and reboots to verify that the specified datasource is forced.
* sources/azure: add networking check for all source PPS (#2061) [Chris Patterson, 2023-03-16, 1 file, -7/+6]
  There is a networking check in _poll_imds() which will attempt DHCP again if networking is not up for source PPS. With the previous change to wait at least 20 minutes during provisioning for DHCP, this additional round is not necessary.
  Report failure if networking is not up for any mode of source PPS. In practice, this is very unlikely as provisioning will typically time out within the 20 minute window the VM is attempting DHCP and the source PPS VM will be deleted.
  This fixes an (unobserved) issue where Savable PPS does not have networking prior to _wait_for_all_nics_ready().
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: fix regressions in IMDS behavior (#2041) [Chris Patterson, 2023-03-01, 1 file, -20/+8]
  There are effectively two regressions in the recent IMDS refactor:
  1. The metadata check len(imds_md["interface"]) in _check_if_nic_is_primary() is no longer correct as the refactor switched URLs and did not update this call to account for the fact that this metadata now lives under "network".
  2. Network metadata was fetched with infinite=True and is now limited to ten retries. This callback had the twist of only allowing up to ten connection errors but otherwise would retry indefinitely.
  For check_if_nic_is_primary():
  - Drop the interface count check for _check_if_nic_is_primary(), we don't need it anyways.
  - Fix/update the unit tests mocks that allowed the tests to pass, adding another test to verify max retries for http and connection errors.
  - Use 300 retries. We do not want to hit a case where we spin forever, but this should be more than enough time for IMDS to respond in the Savable PPS case (~5 minutes).
  For IMDS:
  - Consolidate IMDS retry handlers into a new ReadUrlRetryHandler class that supports the options required for each variant of request.
  - Minor tweaks to log and expand logging checks in unit tests.
  - Move all unit tests to mocking via mock_requests_session_request and replace mock_readurl fixture with wrapped_readurl to improve consistency between tests.
  Note that this change drops usage of `retry_on_url_exc`, which can probably be removed altogether as it is no longer used AFAICT.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* dhcp: Cleanup unused kwarg (#2037) [Brett Holman, 2023-02-28, 1 file, -1/+0]
  Usage was dropped in de7851b93c5a2d4658.
* sources/azure: refactor imds handler into own module (#1977) [Chris Patterson, 2023-02-16, 1 file, -298/+28]
  Create new azure package for better organization and move IMDS logic for fetching into it.
  Future work will clean up the test_azure.py tests a little further thanks to these changes, but wanted to minimize churn here to make changes fairly visible.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: retry on connection error when fetching metadata (#1968) [Chris Patterson, 2023-01-17, 1 file, -1/+4]
  Early attempts to fetch metadata on Azure may fail with connection errors. While this class of errors is not ideal to retry on, the impact is minimal given that:
  1. Retries are fairly limited (10).
  2. Persistent connection errors would indicate that cloud-init is using a non-primary NIC, which is a rare case of failure that will be addressed in the future.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: minor refactor for metadata source detection logic (#1936) [Chris Patterson, 2023-01-12, 1 file, -29/+21]
  - Initialize md and cfg to the fallback used when no OVF is found and IMDS is required.
  - Rename metadata_source -> ovf_source and drop usage of "IMDS" as a valid value.
  - Set `self.seed` to "IMDS" when ovf_source is unset.
  - Remove late check for metadata source. This is already done by the earlier check where we'll fail with "No OVF or IMDS available".
  - Move "Found provisioning metadata" diagnostic up to where we read OVF. Suggesting it was "IMDS" prior to querying IMDS is misleading.
  - Add warning when falling back to IMDS-only provisioning.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: drop description for report_failure_to_fabric() (#1934) [Chris Patterson, 2023-01-05, 1 file, -12/+4]
  The same default description is used for all error cases. Remove this parameter in favor of assuming the default in all cases.
  Future work will allow for error reporting with a customizable description using a different interface.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: fix device driver matching for net config (#1914) [Chris Patterson, 2023-01-03, 1 file, -6/+29]
  The ordering of NICs provided by IMDS may not match the order enumerated by kernel. As such, we do not have any guarantee that the nic we're checking the driver for is the nic we think it is.
  Instead of making any assumptions about how the nics are named, check all interfaces by mac address. If there is an interface using "hv_netvsc", match against that. If there is only one interface driver that is not blacklisted, use that (in case it is not "hv_netvsc"), but log a debug event. If there are multiple hits, don't match against any of the names and report a warning.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
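  The selection rules in the commit message above can be sketched as a small function. This is an illustration under assumptions: select_driver(), the (mac, driver) tuple input, and the blacklist argument are hypothetical stand-ins for whatever the datasource actually queries.

```python
import logging
from typing import FrozenSet, Iterable, Optional, Tuple

LOG = logging.getLogger(__name__)


def select_driver(
    interfaces: Iterable[Tuple[str, str]],
    mac: str,
    blacklist: FrozenSet[str] = frozenset(),
) -> Optional[str]:
    """Pick the driver to match on for a MAC address.

    interfaces is a hypothetical iterable of (mac, driver) pairs.
    """
    drivers = [
        driver
        for iface_mac, driver in interfaces
        if iface_mac.lower() == mac.lower() and driver not in blacklist
    ]
    if "hv_netvsc" in drivers:
        return "hv_netvsc"  # preferred whenever it is present
    if len(drivers) == 1:
        LOG.debug("using non-hv_netvsc driver %s for %s", drivers[0], mac)
        return drivers[0]
    if drivers:
        LOG.warning("multiple drivers %s for %s, not matching", drivers, mac)
    return None
```

  Returning None in the ambiguous case leaves the caller free to emit a network config without a driver match.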
* sources/azure: ensure instance id is always correct (#1727) [Chris Patterson, 2022-09-13, 1 file, -0/+5]
  Currently, get_instance_id() assumes that the instance ID is in the metadata. If not found, it falls back to a hardcoded string "iid-datasource". Override this behavior to query the instance id as needed.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* azure: define new attribute for pre-22.3 pickles (#1725) [Brett Holman, 2022-09-12, 1 file, -0/+3]
  A new attribute was added to DataSourceAzure[1]. Since the base class uses CloudInitPickleMixin, we need to define this new attribute in _unpickle().
  Add multiple tests to improve pickle coverage.
  [1] https://github.com/canonical/cloud-init/pull/1523
* net: Ensure a tmp with exec permissions for dhcp (#1690) [Alberto Contreras, 2022-09-01, 1 file, -1/+3]
  In case cloudinit.temp_utils points to a fs mounted as noexec and needs_exe=True, fall back to using os.path.join(Distro.usr_lib_exec, "cloud-init/clouddir"), which will be mounted with exec perms.
  LP: #1962343
* sources/azure: handle network unreachable errors for savable PPS (#1642) [Chris Patterson, 2022-08-12, 1 file, -6/+35]
  Upon reporting ready for Savable PPS, the VM may be suspended before we see the http request complete. When the VM resumes, http_with_retries will keep retrying even though it sees "Network is unreachable" errors due to the unplugged NIC (and perhaps a new unconfigured one) or "Read timed out" raised.
  - Do not retry when "Network is unreachable", this will not resolve itself in any case.
  - Ignore all url errors for Savable PPS. Worst case scenario is we failed to report ready anyways (for whatever reason) and the source PPS VM will soon be discarded.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: add experimental support for preprovisioned os disks (#1622) [Chris Patterson, 2022-08-04, 1 file, -2/+21]
  Some pre-provisioning scenarios require that the VM be started and shut back down as part of preparing the VM for future use.
  When the PPS type is PreprovisionedOSDisk, report ready and wait for host to shut down the VM. Provisioning will resume normally on next boot, so do not write a reported ready marker file.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: don't set cfg["password"] for default user pw (#1592) [Chris Patterson, 2022-07-20, 1 file, -3/+1]
  Use `hashed_passwd` instead of `passwd`. The password is still set for the default (admin) user but isn't immediately expired as a result of this change: https://github.com/canonical/cloud-init/pull/1577
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: refactor chassis asset tag handling (#1574) [Chris Patterson, 2022-07-15, 1 file, -25/+17]
  In preparation for supporting a new tag, move the current DMI chassis asset tag into an enum. No change in behavior should be present other than reporting.
  - Create ChassisAssetTag enum for containing all Azure DMI chassis asset tags and logic to query system for it.
  - Add current DMI asset tag to enum as AZURE_CLOUD.
  - Reporting: drop event frame and report valid asset tag.
  - Update tests for platform viability to pytest.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: refactor ovf-env.xml parsing (#1550) [Chris Patterson, 2022-07-08, 1 file, -246/+32]
  * Use ElementTree instead of minidom
  * Use namespaces and case sensitive names
  * Decouple parsing from usage in config/metadata dictionaries
  * More clearly distinguish between NonAzureDataSource() and BrokenAzureDataSource() exceptions. Only raise NonAzureDataSource() exception if the ProvisioningSection in the windowsazure namespace is not found. Any other parsing failures will result in BrokenAzureDataSource() being raised.
  * Streamline log messages
  * Move logic into Azure helper module
  There should be no effective change in behavior unless some bad XML is in the wild and being ignored or failing silently.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
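  Namespace-aware, case-sensitive lookup with ElementTree, along with the two exception cases listed above, might look like the sketch below. The exception classes are local stand-ins, find_provisioning_section() is a hypothetical name, and the namespace URI is an assumption about what "the windowsazure namespace" refers to.

```python
import xml.etree.ElementTree as ET

# Assumption: URI believed to identify the windowsazure namespace.
NAMESPACES = {"wa": "http://schemas.microsoft.com/windowsazure"}


class NonAzureDataSource(Exception):
    """Stand-in: the document is not an Azure ovf-env.xml."""


class BrokenAzureDataSource(Exception):
    """Stand-in: the document could not be parsed."""


def find_provisioning_section(xml_text: str) -> ET.Element:
    """Return the namespaced ProvisioningSection element."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        raise BrokenAzureDataSource(f"invalid xml: {error}") from error
    # Prefix "wa" is resolved through NAMESPACES; names are case sensitive.
    section = root.find("./wa:ProvisioningSection", NAMESPACES)
    if section is None:
        raise NonAzureDataSource("ProvisioningSection not found")
    return section
```

  Because ElementTree matches on the namespace URI rather than the prefix, the document may use any prefix for that namespace.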
* cloud-config: honor cloud_dir setting (#1523) [Alberto Contreras, 2022-06-22, 1 file, -7/+16]
  Ensure cloud_dir setting is respected rather than hardcoding "/var/lib/cloud"
  - Modules affected: cmd.main, apport, devel.logs (collect-logs), cc_snap, sources.DataSourceAzure, sources.DataSourceBigstep, util:fetch_ssl_details.
  - testing: Extend and port to pytest unit tests, add integration test.
  LP: #1976564
* sources/azure: remove unused encoding support for customdata (#1526) [Chris Patterson, 2022-06-17, 1 file, -6/+1]
  It is unused and unsupported in Azure's ovf-env.xml. Remove it.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: remove unused metadata captured when parsing ovf (#1524) [Chris Patterson, 2022-06-16, 1 file, -9/+1]
  - seedfrom is not used by Azure ovf-env.xml, remove it.
  - azure_data is capturing arbitrary keys and we already have a redacted ovf-env.xml if we need to inspect any (unused) properties.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: remove dscfg parsing from ovf-env.xml (#1522) [Chris Patterson, 2022-06-16, 1 file, -9/+0]
  This property is not found in Azure's ovf-env.xml. Remove relevant code merging it into datasource config and unit tests.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: remove unused userdata property from ovf (#1516) [Chris Patterson, 2022-06-14, 1 file, -3/+1]
  Azure does not populate ovf-env.xml with UserData, just CustomData. Update tests accordingly.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: minor refactoring to network config generation (#1497) [Chris Patterson, 2022-06-14, 1 file, -43/+39]
  - Replace parse_network_config() with _generate_network_config() instance method and consolidate cache checks into network_config.
  - Update _generate_network_config_from_imds_metadata() to take just network metadata portion of instance metadata and rename to generate_network_config_from_instance_network_metadata().
  - Consolidate relevant unit tests and refactor to pytest.
  - Update net-convert to use generate_network_config_from_instance_network_metadata().
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* net: Implement link-local ephemeral ipv6 [Brett Holman, 2022-06-10, 1 file, -1/+1]
  Also refactor network context managers into net.ephemeral
  Currently EC2 is the only IMDS to make use of this. IPv6 requires a link local address on interfaces. A link local address is sufficient for the EC2 IMDS, so no dhcp6 assignment is required for early boot IMDS queries. The kernel assigns this address using RFC 4291 [1] during link initialization, so all cloud-init needs to do is ensure that link is up. This means that even if dhcp4 fails, an ipv6-enabled instance may still succeed at crawling metadata.
  [1] https://datatracker.ietf.org/doc/html/rfc4291#section-2.5.6
* Remove xenial references (#1472) [Alberto Contreras, 2022-06-08, 1 file, -6/+3]
  - Remove references and dead code to Xenial, Eoan, Python < 3.7
  - cc_ubuntu_drivers: Use python3-debconf instead of shell script
  - add integration test for ubuntu_drivers
  - bump pycloudlib for OCI subnet/jammy fixes
* Oracle ds changes (#1474) [Alberto Contreras, 2022-06-08, 1 file, -1/+1]
  For primary network config:
  - Use `iSCSI` config if some `/run/net*` file exists, even if `/run/initramfs/open-iscsi.interface` does not.
  - If the instance is not an `iSCSI` one, then crawl the network config from `IMDS` instead of falling back to "best guess".
  - Remove unnecessary conditional use of dhcp.EphemeralDHCPv4 and use it always to crawl `IMDS`.
  - Migrate tests to pytest.
  - Extend unit test coverage.
  - Add some types for mypy.
  LP: #1967942
* Drop mypy excluded files (#1454) [Alberto Contreras, 2022-05-23, 1 file, -5/+5]
  - Add types to let mypy pass.
  - Add mypy flags:
    - detect unused ignores
    - redundant casts
  - Drop support of `ConfigParser` in Python 2
  - Harden DataSourceLXD.network_config
  - Convert old-style commented types to proper types.
* sources/azure: remove reprovisioning marker (#1414) [Chris Patterson, 2022-05-09, 1 file, -58/+36]
  If we haven't reported ready for source PPS then we can treat the recovery boot like any other. The metadata on the OVF and IMDS will indicate the PPS type correctly as the state hasn't changed.
  If we have reported ready for source PPS, we continue to fall into _poll_imds() by way of setting pps_type to UNKNOWN if the REPORTED_READY_MARKER is present and will not attempt to report ready again.
  This fixes a potential issue when recovering on Savable PPS. If a recovery boot occurs after the recovery marker is created, and without reporting ready, the subsequent boot will assume pps type UNKNOWN and attempt to report ready in _poll_imds() using the Running PPS netlink operations.
  Add unit test coverage for complete recovery scenario.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* Misc module cleanup (#1418) [Brett Holman, 2022-04-29, 1 file, -0/+0]
  - move datasource helpers to dedicated directory
  - drop unnecessary executable bit on shebangless python files
* sources/azure: retry dhcp for failed processes (#1401) [Chris Patterson, 2022-04-25, 1 file, -0/+13]
  Commands may fail in the process of setting up DHCP, e.g.: udevadm settle, ip link set dev eth0 up, etc. Report these failures and retry until timeout.
* Fix provisioning dhcp timeout to 20 minutes (#1394) [Chris Patterson, 2022-04-21, 1 file, -1/+1]
  While this was a previously intended change, the actual logic was backwards. Try for 20 minutes during provisioning, only 5 minutes otherwise.
  Add test coverage to verify the timeout for provisioning scenarios.
* sources/azure: only wait for primary nic to be attached during restore (#1378) [Anh Vo, 2022-04-18, 1 file, -21/+11]
  Currently DS Azure waits for all nics to be up and running during the restore phase of save-restore VMs. This change alters the behavior so that it only waits for the primary nic. This new behavior is consistent with non-preprovisioning and running types.
* Return a namedtuple from subp() (#1376) [Brett Holman, 2022-04-12, 1 file, -1/+1]
  This provides a minor readability improvement.
    subp.subp(cmd)[0] -> subp.subp(cmd).stdout
    subp.subp(cmd)[1] -> subp.subp(cmd).stderr
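  The readability gain from returning a namedtuple can be shown with a small stand-alone sketch. SubpResult and run_command() are hypothetical names built on the standard subprocess module; they are not cloud-init's subp implementation.

```python
import subprocess
import sys
from typing import NamedTuple, Sequence


class SubpResult(NamedTuple):
    """Hypothetical result type with named stdout/stderr fields."""

    stdout: str
    stderr: str


def run_command(cmd: Sequence[str]) -> SubpResult:
    """Run cmd and return its captured output as a named tuple."""
    completed = subprocess.run(
        cmd, capture_output=True, text=True, check=True
    )
    return SubpResult(completed.stdout, completed.stderr)


# Index access keeps working, so existing callers are unaffected,
# while new callers can use the more readable attribute names.
result = run_command([sys.executable, "-c", "print('hello')"])
```

  Since a namedtuple is still a tuple, result[0] and result.stdout refer to the same value, which is why such a change can be made without touching existing call sites.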
* sources/azure: remove bind/unbind logic for hot attached nic (#1332) [Chris Patterson, 2022-04-04, 1 file, -61/+15]
  Wait up to 10 seconds for link to come up before continuing. This typically takes just a few seconds once the NIC is hotplugged.
  If it takes longer than 10 seconds for whatever reason, dhclient should eventually succeed on its next attempt after the link does come online.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: move get_ip_from_lease_value out of shim (#1324) [Chris Patterson, 2022-03-24, 1 file, -5/+3]
  Just a minor refactoring to clean up the shim. Update tests to use pytest parametrization.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: remove lease file parsing (#1302) [Chris Patterson, 2022-03-08, 1 file, -9/+9]
  With reporting ready now happening in local phase, we have access to ephemeral DHCP lease options and no longer need to parse DHCP lease files.
  - Switch from tracking wireserver endpoint in its encoded form to the IP string, parsing it only when read from lease options.
  - Drop fallback_lease_file and dhcp_options parameters in favor of processed endpoint string.
  - Add some minor type information for mypy.
  - Update various tests.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
* sources/azure: prevent tight loops for DHCP retries (#1285) [Chris Patterson, 2022-03-02, 1 file, -12/+36]
  With debug logging, tight loops may result in huge log file sizes, e.g.: "Unable to find fallback nic"
  1. Raise NoDHCPLeaseMissingDhclientError to caller if no dhclient found instead of retrying DHCP, retrying will not fix a missing dhclient.
  2. For other DHCP failures, retry after sleeping one second.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
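  The two rules above (fail fast on a missing dhclient, sleep one second between other retries) can be sketched as follows. The exception classes are defined locally as stand-ins that only borrow the names from the commit message, and obtain_lease() with its parameters is hypothetical.

```python
import time
from typing import Callable


class NoDHCPLeaseError(Exception):
    """Stand-in for a generic DHCP failure."""


class NoDHCPLeaseMissingDhclientError(NoDHCPLeaseError):
    """Stand-in for the 'no dhclient found' failure."""


def obtain_lease(
    attempt: Callable[[], dict],
    timeout_seconds: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Retry attempt() once per second until the timeout expires."""
    deadline = clock() + timeout_seconds
    while True:
        try:
            return attempt()
        except NoDHCPLeaseMissingDhclientError:
            raise  # retrying cannot fix a missing dhclient
        except NoDHCPLeaseError:
            if clock() >= deadline:
                raise
            sleep(1)  # avoid a tight loop that floods the log
```

  The more specific exception is listed first so that it is not swallowed by the handler for its base class.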
* sources/azure: ensure retries on IMDS request failure (#1271) [Chris Patterson, 2022-02-17, 1 file, -38/+55]
  There are two issues with IMDS retries:
  1. IMDS_VER_WANT will never be attempted if retries=0, such as when fetching network metadata with infinite=True.
  2. get_imds_data_with_api_fallback() will attempt one request with IMDS_VER_WANT. If the connection fails due to a timeout, connection issue, or error code other than 400, an empty dictionary will be returned without attempting the requested number of retries.
  This PR:
  - Updates get_imds_data_with_api_fallback() to invoke get_metadata_from_imds() with the specified retries and infinite parameters.
  - Updates retry_on_url_exc to take a configurable set of HTTP error codes and exception types to retry on.
  - Add IMDS_RETRY_CODES set to retry with when fetching data from IMDS:
    - 404 not found (yet)
    - 410 gone / unavailable (yet)
    - 429 rate-limited/throttled
    - 500 server error
  - Replace default callback with imds_readurl_exception_callback, which configures retry_on_url_exc() with these error codes and instances.
  - Add new pytests for IMDS to eventually replace the unittest equivalents and improve existing coverage.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
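  A retry predicate built around the status codes listed above might look like the sketch below. Only the four codes come from the commit message; HttpError, should_retry(), and the choice of TimeoutError and ConnectionError as retryable exception types are assumptions made for illustration.

```python
# The four status codes named in the commit message.
IMDS_RETRY_CODES = frozenset({404, 410, 429, 500})


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, code: int) -> None:
        super().__init__(f"http error {code}")
        self.code = code


def should_retry(error: Exception) -> bool:
    """Decide whether a failed request is worth retrying."""
    if isinstance(error, HttpError):
        return error.code in IMDS_RETRY_CODES
    # Assumption: timeouts and connection problems are also retryable.
    return isinstance(error, (TimeoutError, ConnectionError))
```

  Keeping the codes in a named set makes it simple to pass a different set for other request variants.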
* sources/azure: removed unused savable PPS paths (#1268) [Chris Patterson, 2022-02-17, 1 file, -43/+4]
  If the VM is rebooted during provisioning, the PPS type will be determined to be UNKNOWN and will poll for reprovision data. Given that we will never enter _wait_for_all_nics_ready() in any other condition than a fresh source instance in Savable PPS, we can safely remove the now-unused code paths.
  Signed-off-by: Chris Patterson <cpatterson@microsoft.com>