author David Hankins <dhankins@isc.org> 2006-05-05 20:12:38 +0000
committer David Hankins <dhankins@isc.org> 2006-05-05 20:12:38 +0000
commit 14baf5cd1adbe429efa804a0d13cee4c07237e84
tree f2bb0a02734a725981aed593507fdddafc20b465 /doc
parent 0b17f049ed230a77ecb32330533013a1b9fb5dc8
Removed -7 but did not add -12 to repo.
Diffstat (limited to 'doc')
-rw-r--r-- doc/draft-ietf-dhc-failover-12.txt 7451
1 files changed, 7451 insertions, 0 deletions
diff --git a/doc/draft-ietf-dhc-failover-12.txt b/doc/draft-ietf-dhc-failover-12.txt
new file mode 100644
index 00000000..6d632e08
--- /dev/null
+++ b/doc/draft-ietf-dhc-failover-12.txt
@@ -0,0 +1,7451 @@
+
+
+
+
+
+
+Network Working Group Ralph Droms
+INTERNET DRAFT Kim Kinnear
+ Mark Stapp
+ Cisco Systems
+
+ Bernie Volz
+ Ericsson
+
+ Steve Gonczi
+ Relicore
+
+ Greg Rabil
+ Lucent Technologies
+
+ Michael Dooley
+ Diamond IP Technologies
+
+ Arun Kapur
+ K5 Networks
+
+ March 2003
+ Expires September 2003
+
+
+ DHCP Failover Protocol
+ <draft-ietf-dhc-failover-12.txt>
+
+Status of this Memo
+
+ This document is an Internet-Draft and is in full conformance with
+ all provisions of Section 10 of RFC2026.
+
+ Internet-Drafts are working documents of the Internet Engineering
+ Task Force (IETF), its areas, and its working groups. Note that
+ other groups may also distribute working documents as Internet-
+ Drafts.
+
+ Internet-Drafts are draft documents valid for a maximum of six months
+ and may be updated, replaced, or obsoleted by other documents at any
+ time. It is inappropriate to use Internet-Drafts as reference
+ material or to cite them other than as "work in progress."
+
+ The list of current Internet-Drafts can be accessed at
+ http://www.ietf.org/ietf/1id-abstracts.txt
+
+ The list of Internet-Draft Shadow Directories can be accessed at
+ http://www.ietf.org/shadow.html.
+
+
+
+
+Droms, et. al. Expires September 2003 [Page 1]
+
+Internet Draft DHCP Failover Protocol March 2003
+
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (2003). All Rights Reserved.
+
+Abstract
+
+ DHCP [RFC 2131] allows for multiple servers to be operating on a
+ single network. Some sites are interested in running multiple
+ servers in such a way as to provide redundancy in case of server
+ failure. In order for this to work reliably, the cooperating primary
+ and secondary servers must maintain a consistent database of the
+ lease information. This implies that servers will need to coordinate
+ any and all lease activity so that this information is synchronized
+ in case of failover.
+
+ This document defines a protocol to provide such synchronization
+ between two servers. One server is designated the "primary" server,
+ the other is the "secondary" server. This document also describes a
+ way to integrate the failover protocol with the DHCP load balancing
+ approach.
+
+
+Table of Contents
+
+
+ 1. Introduction................................................. 4
+ 2. Terminology.................................................. 5
+ 2.1. Requirements terminology................................... 5
+ 2.2. DHCP and failover terminology.............................. 5
+ 3. Background and External Requirements......................... 9
+ 3.1. Key aspects of the DHCP protocol........................... 9
+ 3.2. BOOTP relay agent implementation........................... 11
+ 3.3. What does it mean if a server can't communicate with its partner? 12
+ 3.4. Challenging scenarios for a Failover protocol.............. 13
+ 3.5. Using TCP to detect partner server failure................. 14
+ 4. Design Goals................................................. 15
+ 4.1. Design goals for this protocol............................. 15
+ 4.2. Limitations of this protocol............................... 17
+ 5. Protocol Overview............................................ 17
+ 5.1. Messages and States........................................ 18
+ 5.2. Fundamental guarantees..................................... 20
+ 5.3. Load balancing............................................. 27
+ 5.4. IP address allocations between servers..................... 28
+ 5.5. Operating in NORMAL state.................................. 30
+ 5.6. Operating in COMMUNICATIONS-INTERRUPTED state.............. 31
+ 5.7. Operating in PARTNER-DOWN state............................ 31
+ 5.8. Operating in RECOVER state................................. 31
+ 5.9. Operating in STARTUP state................................. 31
+ 5.10. Time synchronization between servers...................... 32
+ 5.11. IP address binding-status................................. 33
+ 5.12. DNS dynamic update considerations......................... 36
+ 5.13. Reservations and failover................................. 41
+ 5.14. Dynamic BOOTP and failover................................ 42
+ 5.15. Guidelines for selecting MCLT............................. 43
+ 5.16. What is sent in response to an UPDREQ or UPDREQALL message? 43
+ 5.17. How do you determine that your partner is "up to date" for 45
+ 6. Common Message Format........................................ 45
+ 6.1. Message header format...................................... 46
+ 6.2. Common option format....................................... 48
+ 6.3. Batching multiple binding update transactions in one BNDUPD message 49
+ 7. Protocol Messages............................................ 51
+ 7.1. BNDUPD message [3]......................................... 51
+ 7.2. BNDACK message [4]......................................... 62
+ 7.3. UPDREQ message [9]......................................... 65
+ 7.4. UPDREQALL message [7]...................................... 66
+ 7.5. UPDDONE message [8]........................................ 67
+ 7.6. POOLREQ message [1]........................................ 68
+ 7.7. POOLRESP message [2]....................................... 69
+ 7.8. CONNECT message [5]........................................ 70
+ 7.9. CONNECTACK message [6]..................................... 74
+ 7.10. STATE message [10]........................................ 78
+ 7.11. CONTACT message [11]...................................... 79
+ 7.12. DISCONNECT message [12]................................... 80
+ 8. Connection Management........................................ 81
+ 8.1. Connection granularity..................................... 81
+ 8.2. Creating the TCP connection................................ 81
+ 8.3. Using the TCP connection for determining communications status 83
+ 8.4. Using the TCP connection for binding data.................. 85
+ 8.5. Using the TCP connection for control messages.............. 85
+ 8.6. Losing the TCP connection.................................. 85
+ 9. Failover Endpoint States..................................... 86
+ 9.1. Server Initialization...................................... 86
+ 9.2. Server State Transitions................................... 86
+ 9.3. STARTUP state.............................................. 90
+ 9.4. PARTNER-DOWN state......................................... 93
+ 9.5. RECOVER state.............................................. 95
+ 9.6. RECOVER-WAIT state......................................... 97
+ 9.7. RECOVER-DONE state......................................... 98
+ 9.9. COMMUNICATIONS-INTERRUPTED State........................... 101
+ 9.10. POTENTIAL-CONFLICT state.................................. 105
+ 9.11. RESOLUTION-INTERRUPTED state.............................. 107
+ 9.12. CONFLICT-DONE state....................................... 108
+ 9.13. PAUSED state.............................................. 108
+ 9.14. SHUTDOWN state............................................ 109
+ 10. Safe Period................................................. 110
+ 11. Security.................................................... 111
+ 11.1. Simple shared secret...................................... 112
+ 11.2. TLS....................................................... 113
+ 12. Failover Options............................................ 113
+ 12.1. addresses-transferred..................................... 114
+ 12.2. assigned-IP-address....................................... 114
+ 12.3. binding-status............................................ 114
+ 12.4. client-identifier......................................... 115
+ 12.5. client-hardware-address................................... 115
+ 12.6. client-last-transaction-time.............................. 115
+ 12.7. client-reply-options...................................... 116
+ 12.8. client-request-options.................................... 116
+ 12.9. DDNS...................................................... 117
+ 12.10. delayed-service-parameter................................ 118
+ 12.11. hash-bucket-assignment................................... 118
+ 12.12. IP-flags................................................. 119
+ 12.13. lease-expiration-time.................................... 120
+ 12.14. max-unacked-bndupd....................................... 120
+ 12.15. MCLT..................................................... 120
+ 12.16. message.................................................. 121
+ 12.17. message-digest........................................... 121
+ 12.18. potential-expiration-time................................ 122
+ 12.19. receive-timer............................................ 122
+ 12.20. protocol-version......................................... 122
+ 12.21. reject-reason............................................ 123
+ 12.22. relationship-name........................................ 124
+ 12.23. server-flags............................................. 124
+ 12.24. server-state............................................. 125
+ 12.25. start-time-of-state...................................... 125
+ 12.26. TLS-reply................................................ 126
+ 12.27. TLS-request.............................................. 126
+ 12.28. vendor-class-identifier.................................. 126
+ 12.29. vendor-specific-options.................................. 127
+ 13. IANA Considerations......................................... 127
+ 14. Acknowledgments............................................. 127
+ 15. References.................................................. 129
+ 16. Author's information........................................ 131
+ 17. Full Copyright Statement.................................... 132
+
+
+1. Introduction
+
+ DHCP [RFC 2131] allows for multiple servers to be operating on a sin-
+ gle network. Some sites are interested in running multiple servers
+ in such a way as to provide redundancy in case of server failure,
+ since the DHCP subsystem is in many cases a critical part of the
+ network infrastructure.
+
+ This document defines a protocol to provide synchronization between
+ two servers in order that each can take over for the other should
+ either one fail or become unreachable.
+
+ One server is designated the "primary" server, the other is the
+ "secondary" server, and most DHCP client requests are sent to both
+ servers (see section 3.1.1 for details).
+
+ In order to provide a high availability DHCP service, these
+ cooperating primary and secondary servers must maintain a consistent
+ database of lease information. This implies that servers will need
+ to coordinate all lease activity so that this information is syn-
+ chronized in case failover is required. The protocol messages and
+ processing techniques required to maintain a consistent database are
+ specified in the protocol described here.
+
+ The failover protocol also contains a way to integrate the DHCP load-
+ balancing algorithm described in [RFC 3074] with the failover proto-
+ col.
+
+2. Terminology
+
+ This section discusses both the generic requirements terminology com-
+ mon to many IETF protocol specifications and the specialized
+ terminology specific to DHCP and the failover protocol.
+
+2.1. Requirements terminology
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in RFC 2119 [RFC 2119].
+
+
+2.2. DHCP and failover terminology
+
+ This document uses the following terms:
+
+ o "available IP address"
+
+ An IP address is "available" if it may be allocated by a
+ specific DHCP server. An IP address is considered (for the
+ purposes of this document) to be available to a single server
+ for allocation unless otherwise noted. An IP address available
+ for allocation on a primary server has state FREE, and an IP
+ address available for allocation on a secondary server has
+ state BACKUP.
+
+ o "binding"
+
+ A binding is a collection of configuration parameters, includ-
+ ing at least an IP address, associated with or "bound to" a
+ DHCP client. Bindings are managed by DHCP servers.
+
+ o "binding database"
+
+ The collection of bindings managed by a primary and a secondary
+ server.
+
+ o "binding update transaction"
+
+ A binding update transaction refers to the set of information
+ (contained in options) necessary to perform a binding update
+ for a single IP address. It comprises the assigned-IP-address
+ option and the binding-status option, along with other options
+ as appropriate.
+
+ o "binding-status"
+
+ The binding-status is the status of an IP address with respect
+ to its association with a client. There are specific binding-
+ status values defined for use by the failover protocol, e.g.,
+ ACTIVE, FREE, RELEASED, ABANDONED, etc. These are designed to
+ map more or less directly onto the binding-status values used
+ internally in most DHCP server implementations. The term
+ binding-status refers to the concept also sometimes known as
+ "lease state" or "IP address state", but in this document the
+ term "state" is reserved for the failover state of a failover
+ endpoint, and binding-status is always used to refer to the
+ state associated with an IP address or lease.
+
+ o "DHCP client" or "client"
+
+ A DHCP client is an Internet host using DHCP to obtain confi-
+ guration parameters such as a network address. The term
+ "client" used within this document always means a DHCP client,
+ and never one of the two failover servers.
+
+ o "DHCP server" or "server"
+
+ A DHCP server is an Internet host that returns configuration
+ parameters to DHCP clients.
+
+ o "DDNS"
+
+ An abbreviation for "Dynamic DNS", which refers to the capabil-
+ ity to update a DNS server's name (actually resource record)
+ database using an on-the-wire protocol defined in [RFC 2136].
+
+ o "DNS"
+
+ An abbreviation for "Domain Name System", a scheme where a cen-
+ tral name repository is used to map names to IP addresses and IP
+ addresses to names.
+
+ o "failover endpoint"
+
+ The failover protocol allows for there to be a unique failover
+ endpoint per partner per role per relationship (where role is
+ primary or secondary and the relationship is defined by the
+ relationship-name option). This failover endpoint can take
+ actions and hold unique states. Typically, there is one failover
+ endpoint per partner, although there may be more.
+
+ o "FQDN"
+
+ An FQDN is a "fully qualified domain name". A fully qualified
+ domain name generally is a host name with at least one zone
+ name; for example, "www.dhcp.org" is a fully qualified domain
+ name.
+
+ o "lazy update"
+
+ Lazy update refers to the requirement placed on a server imple-
+ menting a failover protocol to update its failover partner when-
+ ever the binding database changes. A failover protocol which
+ didn't support lazy update would require the failover partner
+ update to be complete before a DHCP server could respond to a
+ DHCP client request with a DHCPACK. A failover protocol which
+ does support lazy update places no such restriction on the
+ update of the failover partner server, and so a server can allo-
+ cate an IP address or extend a lease on an IP address and then
+ update its failover partner as time permits. A failover proto-
+ col which supports lazy update not only removes the requirement
+ to update the failover partner prior to responding to a DHCP
+ client with a DHCPACK, but also allows gathering up batches of
+ updates from one failover server to its partner.
+
+ o "MCLT"
+
+ The MCLT refers to maximum client lead time. This time is con-
+ figured on the primary server and transmitted from the primary
+ to the secondary server in the CONNECT message. It is the max-
+ imum amount of time that one server can extend a lease for a
+ client's binding beyond the time known by the partner server.
+ See section 5.2.1 for details.
+
+ o "partner"
+
+ A "partner", for the purposes of this document, refers to a
+ failover server, typically the other failover server. In many
+ (if not most) cases, the failover protocol is symmetric with
+ respect to the primary or secondary nature of the servers, and
+ so it is often appropriate to discuss "updating the partner
+ server", since it could be a primary server updating a secondary
+ server or a secondary server updating a primary server.
+
+ o "Primary server" or "Primary"
+
+ A DHCP server configured to provide primary service to a set of
+ DHCP clients for a particular set of subnet address pools.
+
+ o "RR"
+
+ "RR" is an abbreviation for "resource record". All records in
+ the DNS are resource records. The resource records of most
+ relevance to this document are the "A" resource record, which
+ maps a DNS name to a particular IP address, the "PTR" resource
+ record, which allows a "reverse map", from the IP address back
+ to a DNS name, and the "KEY" resource record, which is used in
+ ways defined in [FQDN] to tag a DNS name with the identity of
+ the DHCP client with which it is associated.
+
+ o "Secondary server" or "Secondary"
+
+ A DHCP server configured to act as backup to a primary server
+ for a particular set of subnet address pools.
+
+ o "stable storage"
+
+ Every DHCP server is assumed to have some form of what is called
+ "stable storage". Stable storage is used to hold information
+ concerning IP address bindings (among other things) so that this
+ information is not lost in the event of a server failure which
+ requires restart of the server.
+
+ o "state"
+
+ In this document, the term "state" refers exclusively to the
+ state of a failover endpoint, for example: NORMAL,
+ COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN. It is not used to
+ refer to any attributes of an IP address or a binding of an IP
+ address. See "binding-status".
+
+ o "subnet address pool"
+
+ A subnet address pool is the set of IP addresses which is asso-
+ ciated with a particular network number and subnet mask. In the
+ simple case, there is a single network number and subnet mask
+ and a set of IP addresses. In the more complex case (sometimes
+ called "secondary subnets", sometimes "superscopes"), several
+ (apparently unrelated) network number and subnet mask combina-
+ tions with their associated IP addresses may all be configured
+ together into one subnet address pool.
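+
+ The relationship between "available IP address" and "binding-status"
+ defined above can be sketched in Python. This is an illustration
+ only and not part of the protocol: the names Role, BindingStatus,
+ and is_available are hypothetical, and only a subset of the
+ binding-status values is listed.
+
```python
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


class BindingStatus(Enum):
    # A subset of the binding-status values named in this document.
    FREE = "FREE"
    BACKUP = "BACKUP"
    ACTIVE = "ACTIVE"
    RELEASED = "RELEASED"
    ABANDONED = "ABANDONED"


def is_available(role: Role, status: BindingStatus) -> bool:
    """Return True if a server in `role` may allocate an address in `status`."""
    if role is Role.PRIMARY:
        # Addresses available to the primary have binding-status FREE.
        return status is BindingStatus.FREE
    # Addresses available to the secondary have binding-status BACKUP.
    return status is BindingStatus.BACKUP
```
+
+ Under these assumptions, an address is available to exactly one of
+ the two servers at a time, and the failover "state" of an endpoint
+ (NORMAL, PARTNER-DOWN, and so on) is tracked separately from the
+ binding-status of any address.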
+
+
+3. Background and External Requirements
+
+ This section highlights key aspects of the DHCP protocol on which the
+ failover protocol depends. It also discusses the requirements that
+ the failover protocol places on other aspects of the network infras-
+ tructure, and some general issues surrounding server failure detec-
+ tion. Some failure scenarios that provide particular challenges to a
+ failover protocol are discussed. Finally, the challenges inherent in
+ using a TCP connection as a means to detect failure of a partner
+ server are elaborated.
+
+3.1. Key aspects of the DHCP protocol
+
+ The failover protocol is designed to augment the DHCP protocol as
+ described in RFC 2131 [RFC 2131]. There are several key aspects of
+ the DHCP protocol which are required by the failover protocol in
+ order to successfully meet its design goals.
+
+3.1.1. Broadcast behavior
+
+ There are two aspects of the broadcast behavior of the DHCP protocol
+ which are key to making the failover protocol operate successfully.
+ The first is simply that the DHCP protocol requires a DHCP client to
+ broadcast all DHCPDISCOVER and DHCPREQUEST/INIT-REBOOT messages.
+ Because of this requirement, a DHCP client that was communicating with
+ one server will automatically be able to communicate with another
+ server if one is available.
+
+ The second aspect of broadcast behavior is similar to the first, but
+ involves the distinction between a DHCPREQUEST/RENEW and
+ DHCPREQUEST/REBINDING. A DHCPREQUEST/RENEW is the message that a
+ DHCP client uses to extend its lease. It is unicast to the DHCP
+ server from which it acquired the lease. However, the DHCP protocol
+ was (in a farsighted move) explicitly designed so that in the event
+ that a DHCP client cannot contact the server from which it received a
+ lease on an IP address using a DHCPREQUEST/RENEW, the client is
+ required to broadcast its renewal using a DHCPREQUEST/REBINDING to
+ any available DHCP server. Since all DHCP clients are required to
+ implement this algorithm, the failover protocol can let a server
+ other than the one that initially granted a lease renew that
+ lease. Thus, one server can take over for another with no
+ interruption in the service as experienced by the DHCP client or its
+ associated applications software.
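+
+ The broadcast behavior described in this section can be sketched as
+ follows. This is a minimal illustration, not protocol text: the
+ function servers_reached and the server labels are hypothetical.
+
```python
# Message kinds as named in this section; the grouping below is a sketch.
BROADCAST_KINDS = {
    "DHCPDISCOVER",
    "DHCPREQUEST/INIT-REBOOT",
    "DHCPREQUEST/REBINDING",
}
UNICAST_KINDS = {"DHCPREQUEST/RENEW"}


def servers_reached(kind, granting_server, reachable_servers):
    """Return the set of servers that would receive a client message.

    kind              -- one of the message kinds above
    granting_server   -- the server that granted the client's lease
    reachable_servers -- servers the client can currently reach
    """
    reachable = set(reachable_servers)
    if kind in UNICAST_KINDS:
        # A RENEW is unicast to the server that granted the lease.
        return reachable & {granting_server}
    if kind in BROADCAST_KINDS:
        # Broadcasts reach every server the client can contact.
        return reachable
    raise ValueError(f"unknown message kind: {kind}")
```
+
+ If the granting server is down, a RENEW reaches no server, but the
+ later REBINDING reaches the surviving partner, which is what lets
+ that partner renew the lease.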
+
+3.1.2. Client responsibility
+
+ In the DHCP protocol the DHCP clients are entrusted with a consider-
+ able responsibility. In particular, after they are granted a lease
+ on an IP address, they are enjoined to only use that IP address while
+ their lease is valid. Every DHCP client is expected to stop using an
+ IP address if the expiration time on the lease has passed and if it
+ cannot get an extension on the lease for that IP address from some
+ DHCP server. Thus, the correct behavior of every DHCP client in this
+ regard is required to ensure the integrity of the DHCP service. On
+ the other hand, incorrect behavior by a client in this area will tend
+ to adversely affect at most one other DHCP client.
+
+ Furthermore, any DHCP client which sends in a DHCPREQUEST/RENEW or
+ DHCPREQUEST/REBINDING to a DHCP server (either unicast for a RENEW or
+ broadcast for a REBINDING) MUST still have time to run on the lease
+ for that IP address. The DHCP server sends the DHCPACK back unicast
+ to the IP address from which the RENEW or REBINDING originated.
+
+ Given the existing responsibility placed on the client to only use an
+ IP address when the lease is valid, and to only send in a RENEW or
+ REBINDING if the lease is valid, the failover protocol relies on DHCP
+ clients to perform responsibly and will, in the absence of conflict-
+ ing information, believe a DHCP client that is attempting to RENEW or
+ REBIND a lease on an IP address is the legitimate owner of that IP
+ address.
+
+ If clients do not follow these rules, it is possible for an address
+ to be in use by more than one client. For a single server, this hap-
+ pens because the server has leased the expired address to another
+ client and the original client is also attempting to use the address.
+ The server would NAK the renewal request. This is made slightly worse
+ in the failover protocol if the two servers are unable to communicate
+ with each other and one server leases an available address to a new
+ client while the other server receives a renewal from a different
+ client. In this case, both servers lease the same address to
+ different clients for the duration of the MCLT.
+
+ One troublesome issue is that of the DHCP client responsibility when
+ sending in DHCPREQUEST/INIT-REBOOT requests. While the original DHCP
+ RFC was written to require a DHCP client to have time left to run on
+ the lease for an IP address if the client is sending an INIT-REBOOT
+ request, it was sufficiently unclear that some client vendors didn't
+ realize this until recently. Since the INIT-REBOOT request was sent
+ with the IP address in the dhcp-requested-address option and not in
+ the ciaddr (for perfectly good reasons), the similarity to the RENEW
+ and REBINDING case was lost on many people.
+
+ At present, the failover protocol does not assume that a client send-
+ ing in an INIT-REBOOT request necessarily has a valid lease on the IP
+ address appearing in the dhcp-requested-address option in the INIT-
+ REBOOT request.
+
+ The implications of this are as follows: Assume that there is a DHCP
+ client that gets a lease from one server while that server is unable
+ to communicate with its failover partner. Then, assume that after
+ that client reboots it is able only to communicate with the other
+ failover server. If the failover servers have not been able to com-
+ municate with each other during this process, then the DHCP client
+ will get a new IP address instead of being able to continue to use
+ its existing IP address. This will affect no applications on the DHCP
+ client, since it is rebooting. However, it will use up an additional
+ IP address in this marginal case.
+
+3.1.3. Stable storage update before DHCPACK
+
+ The DHCP protocol allocates resources, and in order to operate
+ correctly it requires that a DHCP server update some form of stable
+ storage prior to sending a DHCPACK to a DHCP client in order to grant
+ that client a lease on an IP address.
+
+ One of the goals of the failover protocol is that it not add signifi-
+ cant additional time to this already time-consuming requirement to
+ update stable storage prior to a DHCPACK. In particular, adding a
+ requirement to communicate with another server prior to sending a
+ DHCPACK would greatly simplify the failover protocol, but it would
+ unacceptably limit the potential scalability of any DHCP server which
+ employed the failover protocol.
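+
+ The ordering constraint above, combined with the "lazy update"
+ defined in section 2.2, can be sketched as follows. The Server class
+ and its method names are hypothetical, and the in-memory dictionary
+ merely stands in for real stable storage.
+
```python
from collections import deque


class Server:
    """Hypothetical DHCP server that updates its partner lazily."""

    def __init__(self):
        self.stable_storage = {}   # ip -> (client_id, expiry)
        self.pending = deque()     # binding updates not yet sent to partner
        self.log = []              # ordered record of actions, for illustration

    def grant_lease(self, ip, client_id, expiry):
        # 1. Commit the binding to stable storage first.
        self.stable_storage[ip] = (client_id, expiry)
        self.log.append("stable-storage")
        # 2. Answer the client without waiting for the partner.
        self.log.append("DHCPACK")
        # 3. Queue the partner update to be sent as time permits.
        self.pending.append((ip, client_id, expiry))

    def flush_updates(self):
        """Send all queued updates to the partner as one batch."""
        batch = list(self.pending)
        self.pending.clear()
        if batch:
            self.log.append(f"partner-update x{len(batch)}")
        return batch
```
+
+ In this sketch the DHCPACK never waits on the partner; the cost is
+ that the partner may briefly lack the newest binding, which is the
+ situation discussed in section 3.4.1.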
+
+3.2. BOOTP relay agent implementation
+
+ Many DHCP clients are not resident on the same network segment as a
+ DHCP server. In order to support this form of network architecture,
+ most contemporary routers implement something known as a BOOTP Relay
+ Agent. This capability inside of a router listens for all broadcasts
+ at the DHCP port, port 67, and will relay any broadcasts that it
+ receives on to a DHCP server. The IP address of the DHCP server must
+ have been previously configured into the router. As part of the
+ relay process, the relay agent will place the address of the inter-
+ face on which it received the broadcast into the giaddr field of the
+ DHCP packet.
+
+ Since the failover protocol requires two DHCP servers to receive any
+ broadcast DHCP messages, in order to work with DHCP clients which are
+ not local to the DHCP server, the BOOTP relay agent on the router
+ closest to the DHCP client must be configured to point at more than
+ one DHCP server.
+
+ Most BOOTP relay agent implementations allow this duplication of
+ packets.
+
+ If this is not possible, an administrator might be able to configure
+ the relay agent with a subnet broadcast address, but in this case the
+ primary and secondary DHCP servers in a failover pair must both
+ reside on the same subnet.
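+
+ A relay agent configured with more than one DHCP server can be
+ sketched as follows. The DhcpPacket type and the relay function are
+ hypothetical, and the addresses in the test values are documentation
+ examples only.
+
```python
from dataclasses import dataclass, replace

UNSET = "0.0.0.0"


@dataclass(frozen=True)
class DhcpPacket:
    kind: str
    giaddr: str = UNSET


def relay(packet, interface_addr, server_addrs):
    """Return one (server, packet) pair per configured DHCP server."""
    # Record the receiving interface in giaddr if no relay has done so
    # yet (an assumption of this sketch).
    if packet.giaddr == UNSET:
        packet = replace(packet, giaddr=interface_addr)
    # Duplicate the broadcast to every configured server so that both
    # failover partners see it.
    return [(server, packet) for server in server_addrs]
```
+
+ With two servers configured, each client broadcast yields two
+ relayed copies carrying the same giaddr.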
+
+3.3. What does it mean if a server can't communicate with its partner?
+
+ In any protocol designed to allow one server to take over some
+ responsibilities from a partner server in the event of "failure" of
+ that partner server, there is an inherent difficulty in determining
+ when that partner server has failed.
+
+ In fact, it is fundamentally impossible for one server to distinguish
+ a network communications failure from the outright failure of the
+ server with which it is trying to communicate. In the case where each
+ server is handing out resources (in this case IP addresses) to a
+ client community, mistaking an inability to communicate with a
+ partner server for failure of that partner server could easily cause
+ both servers to be handing out the same IP addresses to different
+ clients.
+
+ One way that this is sometimes handled is for there to be more than
+ two servers. In the case of an odd number of servers, the servers
+ that can still communicate with a majority of other servers will con-
+ sider themselves operational, and any server which can't communicate
+ with a majority of other servers must immediately cease operations.
+
+ While this technique works in some domains, having the only server
+ with which a DHCP client can communicate voluntarily shut itself down
+ seems like something worth avoiding.
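+
+ The majority technique described above can be sketched as a single
+ check. The function name has_majority is hypothetical; the sketch
+ counts the server itself together with the peers it can reach.
+
```python
def has_majority(reachable_peers, total_servers):
    """True if this server, plus the peers it can reach, form a majority."""
    return (reachable_peers + 1) * 2 > total_servers
```
+
+ With only two servers, a server cut off from its partner reaches no
+ peers and fails this check, so under a majority rule both servers
+ would stop serving clients. This is the outcome the failover
+ protocol is designed to avoid.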
+
+ The failover protocol will operate correctly while both servers are
+ unable to communicate, whether they are both running or not. At some
+ point there may be resource contention, and if one of the servers is
+ actually down, then the operator can inform the operational server
+ and the operational server will be able to use all of the failed
+ server's resources.
+
+ The protocol also allows detection of an orderly shutdown of a parti-
+ cipating server.
+
+3.4. Challenging scenarios for a Failover protocol
+
+ There exist two failure scenarios which provide particular challenges
+ to the correctness guarantees of a failover protocol.
+
+3.4.1. Primary Server crash before "lazy" update:
+
+ In the case where the primary server sends a DHCPACK to a client for
+ a newly allocated IP address and then crashes prior to sending the
+ corresponding update to the secondary server, the secondary server
+ will have no record of the IP address allocation. When the secondary
+ server takes over, it may well try to allocate that IP address to a
+ different client. In the case where the first client to receive the
+ IP address is not on the net at the time (yet while there was still
+ time to run on its lease), an ICMP echo (i.e., ping) will not prevent
+ the secondary server from allocating that IP address to a different
+ client.
+
+ The failover protocol deals with this situation by having the primary
+ and secondary servers allocate addresses for new clients from dis-
+ joint address pools. See section 5.4 for details.
+
+ A more likely (in that DHCPREQUEST/RENEWs are presumably more common
+ than DHCPDISCOVERs) and more subtle version of this problem is where
+ the primary server crashes after extending a client's lease time, and
+ before updating the secondary with a new time using a lazy update.
+ After the secondary takes over, if the client is not connected to the
+ network the secondary will believe the client's lease has expired
+ when, in fact, it has not. In this case as well, the IP address
+ might be reallocated to a different client while the first client is
+ still using it.
+
+ This scenario is handled by the failover protocol through control of
+ the lease time and the use of the maximum client lead time (MCLT).
+ See section 5.2.1 for details.
+
+3.4.2. Network partition where DHCP servers can't communicate but each
+can talk to clients:
+
+ Several conditions are required for this situation to occur. First,
+ due to a network failure, the primary and secondary servers cannot
+ communicate. As well, some of the DHCP clients must be able to
+ communicate with the primary server, and some of the clients must now
+ only be able to communicate with the secondary server. When this
+ condition occurs, both primary and secondary servers could attempt to
+ allocate IP addresses for new clients from the same pool of available
+ addresses. At some point, then, two clients will end up being allo-
+ cated the same IP address. This will cause problems when the network
+ failure that created this situation is corrected.
+
+ The failover protocol deals with this situation by having the primary
+ and secondary servers allocate addresses for new clients from dis-
+ joint address pools. See section 5.5 for details.
+
+3.5. Using TCP to detect partner server failure
+
+ Several characteristics of TCP are important to the functioning of
+ the failover protocol, which uses one TCP connection both for bulk
+ data transfer and to assess communications integrity with the other
+ server. Reliable and ordered message delivery are chief among these
+ characteristics.
+
+ It would be convenient to rely on the capabilities built into TCP to
+ determine whether communications integrity exists with the failover
+ partner, but this strategy has some problems which require
+ analysis. There exist three fundamental cases for an open TCP con-
+ nection that must be examined.
+
+ 1. When no data is being sent on a TCP connection, the TCP layer
+ also does not exchange any signaling messages to assure that
+ the peer is still up.
+
+ 2. When data is queued to be sent, and the receiver has not
+ blocked the sending of additional data, then messages are
+ flowing across the TCP connection containing the application's
+ data.
+
+ 3. When data is queued to be sent, and the receiver has blocked
+ the transmission of additional data, then persist (window
+ probe) messages are flowing from the sender to the receiver to
+ ensure that the sender doesn't miss the receiver opening the
+ window for further transmissions.
+
+ The first case can be turned into the second case by sending
+ application-level keep-alive messages periodically when there is no
+ other data queued to be sent. Note that TCP keep-alive messages might
+ be used as well, but they present additional problems.
+
+ Thus, we can ensure that the TCP connection has messages flowing
+ periodically across the connection fairly easily. The question
+ remains as to what TCP will do if the other end of the connection
+ fails to respond (either because of network partition or because the
+ receiving server crashes). TCP will attempt to retransmit a message
+ with an exponential backoff, and will eventually time out that
+ retransmission. However, the length of that timeout cannot, in gen-
+ eral, be set on a per-connection basis, and is frequently as long as
+ nine minutes, though in some cases it may be as short as two minutes.
+ On some systems it can be set system-wide, while on other systems it
+ cannot be changed at all.
+
+ A value for this timeout that would be appropriate for the failover
+ protocol, say less than 1 minute, could have unpleasant side-effects
+ on other applications running on the same server, assuming that it
+ could be changed at all on the host operating system.
+
+ Nine minutes is a long time for the DHCP service to be unavailable to
+ any new clients that were being served by the server which has
+ crashed, when there is another server running that could respond to
+ them as soon as it determines that its partner is not operational.
+
+ The conclusion drawn from this analysis is that TCP provides very
+ useful support for the failover protocol in the areas of reliable and
+ ordered message delivery, but cannot by itself be relied upon to
+ detect partner server failure in a fashion acceptable to the needs of
+ the failover protocol. Additional failover protocol capabilities
+ have been created to support timely detection of partner server
+ failure. See section 8.3 for details on this mechanism.
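+
+ The conclusion above can be illustrated with a small sketch of an
+ application-level liveness check. This is not the mechanism of
+ section 8.3; the class name PartnerMonitor, its methods, and the two
+ interval parameters are assumptions made only for illustration. The
+ idea is that the failover layer, rather than TCP's retransmission
+ timer, decides when the partner is out of contact.
+

```python
class PartnerMonitor:
    """Tracks traffic on one failover connection (hypothetical helper)."""

    def __init__(self, send_interval, receive_timeout, now=0.0):
        self.send_interval = send_interval      # longest silence before we send a keep-alive
        self.receive_timeout = receive_timeout  # longest silence before contact is considered lost
        self.last_sent = now
        self.last_received = now

    def note_sent(self, now):
        """Record that any message was written to the connection."""
        self.last_sent = now

    def note_received(self, now):
        """Record that any message arrived from the partner."""
        self.last_received = now

    def should_send_keepalive(self, now):
        """True when no other traffic has been sent for send_interval."""
        return now - self.last_sent >= self.send_interval

    def partner_unreachable(self, now):
        """True when nothing has been heard for receive_timeout."""
        return now - self.last_received >= self.receive_timeout
```

+ With a send interval of 10 seconds and a receive timeout of 30
+ seconds (both values chosen arbitrarily here), loss of contact is
+ noticed within 30 seconds of the last message received, independent
+ of the host's TCP retransmission settings.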
+
+4. Design Goals
+
+ This section lists the design goals and the limitations of the fail-
+ over protocol.
+
+4.1. Design goals for this protocol
+
+ The following is a list of goals that are met by this protocol. They
+ are listed in priority order.
+
+ 1. Implementations of this protocol must work with existing DHCP
+ client implementations based on the DHCP protocol [RFC 2131].
+
+ 2. Implementations of the protocol must work with existing BOOTP
+ relay agent implementations.
+
+ 3. The protocol must provide failover redundancy between servers
+ that are not located on the same subnet.
+
+ 4. Provide for continued service to DHCP clients through an
+ automated mechanism in the event of failure of the primary
+ server.
+
+ 5. Avoid binding an IP address to a client while that binding is
+ currently valid for another client. In other words, do not
+ allocate the same IP address to two clients.
+
+ 6. Minimize any need for manual administrative intervention.
+
+ 7. Introduce no additional delays in server response time as a
+ result of the network communications required to implement the
+ failover protocol, i.e., don't require communications with the
+ partner between the receipt of a DHCPREQUEST and the
+ corresponding DHCPACK.
+
+ 8. Share IP address ranges between primary and secondary servers;
+ i.e., impose no requirement that the pool of available
+ addresses be manually or permanently divided between servers.
+
+ 9. Continue to meet the goals and objectives of this protocol in
+ the event of server failure or network partition.
+
+ 10. Provide graceful reintegration of full protocol service after
+ server failure or network partition.
+
+ 11. Allow for one computer to act as a secondary server for multi-
+ ple primary servers. The protocol must allow failover primary
+ and secondary configuration choices to be made at a granular-
+ ity smaller than "all of the subnets served by a single
+ server", though individual implementations may choose not to
+ allow such flexibility.
+
+ 12. Ensure that an existing client can keep its existing IP
+ address binding if it can communicate with either the primary
+ or secondary DHCP server implementing this protocol, not just
+ the server that originally offered it the binding.
+
+ 13. Ensure that a new client can get an IP address from some
+ server. Ensure that in the face of partition, where servers
+ continue to run but cannot communicate with each other, the
+ above goals and requirements may be met. In addition, when
+ the partition condition is removed, allow graceful automatic
+ re-integration without requiring human intervention.
+
+ 14. If either primary or secondary server loses all of the infor-
+ mation that it has stored in stable storage, ensure that it is
+ able to refresh its stable storage from the other server.
+
+ 15. Support load balancing between the primary and secondary
+ servers, and allow configuration of the percentage of the
+ client population served by each with a moderately fine granu-
+ larity.
+
+
+4.2. Limitations of this protocol
+
+ The following are explicit limitations of this protocol.
+
+ 1. This protocol provides only one level of redundancy through a
+ single secondary server for each primary server.
+
+ 2. A subset of the address pool is reserved for secondary server
+ use. In order to handle the failure case where both servers
+ are able to communicate with DHCP clients, but unable to com-
+ municate with each other, a subset of the IP address pool must
+ be set aside as a private address pool for the secondary
+ server. The secondary can use these to service newly arrived
+ DHCP clients during such a period. The required size of this
+ private pool is based only on the arrival rate of new DHCP
+ clients and the length of expected downtime, and is not influ-
+ enced in any way by the total number of DHCP clients supported
+ by the server pair.
+
+ The failover protocol can be used in a mode where both the
+ primary and secondary servers can share the load between them
+ when both are operating. In this load balancing mode, the
+ addresses allocated by the primary server to the secondary
+ server are not unused, but are used instead to service the
+ portion of the client base to which the secondary server is
+ required to respond. See section 5.3 for more information on
+ load balancing.
+
+ 3. The primary and secondary servers do not respond to client
+ requests at all while recovering from a failure that could
+ have resulted in duplicate IP assignments (that is, while
+ synchronizing in POTENTIAL-CONFLICT state).
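+
+ Limitation 2 states that the secondary's private pool depends only
+ on the arrival rate of new clients and the expected downtime. One
+ plausible reading, assumed here and not stated by the protocol, is
+ a simple product of the two. The function name and its parameters
+ are hypothetical.
+

```python
import math


def private_pool_size(new_clients_per_hour, expected_downtime_hours):
    """Estimate how many addresses the secondary needs in its private pool.

    Assumes every newly arriving client consumes one address and that
    arrivals are steady over the outage; rounds up to a whole address.
    """
    return math.ceil(new_clients_per_hour * expected_downtime_hours)
```

+ For example, 12 new clients per hour over a 4 hour outage would
+ call for 48 addresses, whether the server pair supports a few
+ hundred clients or many thousands.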
+
+
+5. Protocol Overview
+
+ This section discusses the failover protocol at a relatively high
+ level. In the event that a description in this section
+ conflicts (or appears to conflict due to the overview nature of this
+ section) with information in later sections of this draft, the infor-
+ mation in the later sections should be considered authoritative.
+
+5.1. Messages and States
+
+ This protocol is centered around the message exchange used by one
+ server to update the other server of binding database changes result-
+ ing from DHCP client activity:
+
+ o Communication of binding database changes
+
+ The binding update (BNDUPD) message is used to send the binding
+ database changes to the partner server, and the partner server
+ responds with a binding acknowledgement (BNDACK) message when it
+ has successfully committed those changes to its own stable
+ storage.
+
+ All of the other messages involve ancillary issues:
+
+ o Management of available IP addresses
+
+ The pool request (POOLREQ) message is used by the secondary
+ server to request an allocation of IP addresses from the primary
+ server. The pool response (POOLRESP) message is used by the
+ primary server to inform the secondary server how many IP
+ addresses were allocated to the secondary server as the result
+ of the pool request.
+
+ o Synchronization of the binding databases between the servers
+ after they've been out of communications
+
+ The update request (UPDREQ) message is used by one server to
+ request that its partner send it all binding database informa-
+ tion that it has not already seen. The update request all
+ (UPDREQALL) message is used by one server to request that all
+ binding database information be sent in order to recover from a
+ total loss of its binding database by the requesting server.
+ The update done (UPDDONE) message is used by the responding
+ server to indicate that all requested updates have been sent by
+ the responding server and acked by the requesting server.
+
+ o Connection establishment
+
+ The connect (CONNECT) message is used by the primary server to
+ establish a high level connection with the other server, and to
+ transmit several important configuration data items between the
+ servers. The connect acknowledgement message (CONNECTACK) is
+ used by the secondary server to respond to a CONNECT message
+ from the primary server. The disconnect (DISCONNECT) message is
+ used by either server when closing a connection.
+
+ o Server synchronization
+
+ The state change (STATE) message is used by either server to
+ inform the other server of a change of failover state.
+
+ o Connection integrity management
+
+ The contact (CONTACT) message is used by either server to ensure
+ that the other server continues to see the connection as opera-
+ tional. It MUST be transmitted periodically over every esta-
+ blished connection if other message traffic is not flowing, and
+ it MAY be sent at any time.
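+
+ The message names introduced in this section can be summarized as a
+ small lookup table. The sketch below groups them by the purposes
+ listed above; the Python identifiers FailoverMessage and PURPOSE are
+ hypothetical, and no wire-format type codes are implied.
+

```python
from enum import Enum


class FailoverMessage(Enum):
    """Message names from this section; values are descriptions, not type codes."""
    BNDUPD = "binding update"
    BNDACK = "binding acknowledgement"
    POOLREQ = "pool request"
    POOLRESP = "pool response"
    UPDREQ = "update request"
    UPDREQALL = "update request all"
    UPDDONE = "update done"
    CONNECT = "connect"
    CONNECTACK = "connect acknowledgement"
    DISCONNECT = "disconnect"
    STATE = "state change"
    CONTACT = "contact"


M = FailoverMessage
# Grouping follows the bullet headings of this section.
PURPOSE = {
    "communication of binding database changes": [M.BNDUPD, M.BNDACK],
    "management of available IP addresses": [M.POOLREQ, M.POOLRESP],
    "synchronization of the binding databases": [M.UPDREQ, M.UPDREQALL, M.UPDDONE],
    "connection establishment": [M.CONNECT, M.CONNECTACK, M.DISCONNECT],
    "server synchronization": [M.STATE],
    "connection integrity management": [M.CONTACT],
}
```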
+
+5.1.1. Failover endpoints
+
+ The proper operation of the failover protocol requires more than the
+ transmission of messages between one server and the other. Each end-
+ point might seem to be a single DHCP server, but in fact there are
+ many situations where additional flexibility in configuration is use-
+ ful.
+
+ For instance, there might be several servers which are each primary
+ for a distinct set of address pools, and one server which is secon-
+ dary for all of those address pools. The situation with the pri-
+ maries is straightforward, but the secondary will need to maintain a
+ separate failover state, partner state, and communications up/down
+ status for each of the separate primary servers for which it is act-
+ ing as a secondary.
+
+ The failover protocol SHOULD be configured with one failover rela-
+ tionship between each pair of failover servers. In this case there is
+ one failover endpoint for that relationship on each partner. This
+ failover relationship MUST have a unique name, which is communicated
+ using the relationship-name option in the CONNECT and CONNECTACK mes-
+ sages.
+
+ There is typically little need for additional relationships between
+ any two servers, but there MAY be more than one failover relationship
+ between two servers. In that case, each MUST have a unique relationship
+ name (stored in the relationship-name option).
+
+ Any failover endpoint can take actions and hold unique states.
+
+ Thus, in the case where there are two primary servers A and B each
+ backed up by a single common secondary server C, there is one fail-
+ over endpoint on each of A and B, and two different failover end-
+ points on C. The two different failover endpoints on C each have
+ unique states, unique relationship names, and independent TCP
+ connections.
+
+ This document frequently describes the behavior of the protocol in
+ terms of primary and secondary servers, not primary and secondary
+ failover endpoints. However, it is important to remember that every
+ 'server' described in this document is in reality a failover endpoint
+ that resides in a particular process, and that many failover end-
+ points may reside in the same server process.
+
+ It is not the case that there is a unique failover endpoint for each
+ subnet address pool that participates in a failover relationship. On
+ one server, there is (typically) one failover endpoint per partner,
+ regardless of how many subnet address pools are managed by that com-
+ bination of partner and role. Conversely, on a particular server,
+ any given subnet address pool will be associated with exactly one
+ failover endpoint.
+
+ When a connection is received from the partner, the unique failover
+ endpoint to which the message is directed is determined solely by the
+ IP address of the partner, the relationship-name, and the role of the
+ receiving server. See section 8.2.
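+
+ The endpoint selection rule above can be sketched as a table keyed
+ on the three values named in the text. The class and field names
+ below are hypothetical, and the addresses and relationship names in
+ the usage note are made-up examples; section 8.2 remains the
+ authoritative description.
+

```python
from dataclasses import dataclass


@dataclass
class FailoverEndpoint:
    partner_ip: str
    relationship_name: str
    local_role: str          # role of the receiving server: "primary" or "secondary"
    state: str = "unknown"   # each endpoint holds its own failover state
    comms_up: bool = False   # and its own communications up/down status


class EndpointTable:
    """All failover endpoints that live in one server process."""

    def __init__(self):
        self._by_key = {}

    def add(self, endpoint):
        key = (endpoint.partner_ip, endpoint.relationship_name, endpoint.local_role)
        if key in self._by_key:
            raise ValueError(f"duplicate failover endpoint: {key}")
        self._by_key[key] = endpoint

    def lookup(self, partner_ip, relationship_name, local_role):
        """Return the unique endpoint for an incoming connection, or None."""
        return self._by_key.get((partner_ip, relationship_name, local_role))
```

+ A server C that is secondary for both A and B would register two
+ endpoints, for instance ("192.0.2.1", "a-c", "secondary") and
+ ("192.0.2.2", "b-c", "secondary"), and each lookup returns exactly
+ one of them.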
+
+5.2. Fundamental guarantees
+
+ There are several fundamental restrictions this protocol places on what
+ one server can do in the absence of knowledge of the other server.
+ Operating within these restrictions allows certain guarantees to be
+ made to the partner server, and these are key to the correct opera-
+ tion of the protocol.
+
+5.2.1. Control of lease time
+
+ The key problem with lazy update is that when a server fails after
+ updating a client with a particular lease time and before updating
+ its partner, the partner will believe that a lease has expired even
+ though the client still retains a valid lease on that IP address.
+
+ In order to handle this problem, a period of time known as the "Max-
+ imum Client Lead Time" (MCLT) is defined and must be known to both
+ the primary and secondary servers. Proper use of this time interval
+ places an upper bound on the difference allowed between the lease
+ time provided to a DHCP client by a server and the lease time known
+ by that server's partner. However, the MCLT is typically much less
+ than the lease time that a server has been configured to offer a
+ client, and so some strategy must exist to allow a server to offer
+ the configured lease time to a client. During a lazy update the
+ updating server typically updates its partner with a potential
+ expiration time which is longer than the lease time previously given
+ to the client and which is longer than the lease time that the server
+ has been configured to give a client. This allows that server to
+ give a longer lease time to the client the next time the client
+ renews its lease, since the time that it will give to the client will
+ not exceed the potential expiration time acknowledged by its partner
+ by more than the MCLT.
+
+ The PARTNER-DOWN state exists so that a server can be sure that its
+ partner is, indeed, down. Correct operation while in that state
+ requires (generally) that the server wait the MCLT after anything
+ that happened prior to its transition into PARTNER-DOWN state (or,
+ more accurately, when the other server went down if that is known).
+ Thus, the server MUST wait the MCLT after the partner server went
+ down before allocating any of the partner's addresses which were
+ available for allocation. In the event the partner was not in com-
+ munication prior to going down, it might have allocated one or more
+ of its FREE addresses to a DHCP client and been unable to inform the
+ server entering PARTNER-DOWN prior to going down itself. By waiting
+ the MCLT after the time the partner went down, the server in
+ PARTNER-DOWN state ensures that any clients which have a lease on one
+ of the partner's FREE addresses will either time out or contact the
+ server in PARTNER-DOWN by the time that period ends.
+
+ In addition, once a server has made a transition to PARTNER-DOWN
+ state, it MUST NOT reallocate an IP address from one client to
+ another client until the longer of the following two times:
+
+ o The MCLT after the time the partner server went down (see
+ above).
+
+ o An additional MCLT interval after the lease by the original
+ client expires. (Actually, until the maximum client lead time
+ after what it believes to be the lease expiration time of the
+ client.)
+
+ Some optimizations exist for this restriction, in that it only
+ applies to leases that were issued BEFORE entering PARTNER-DOWN. Once
+ a server has entered PARTNER-DOWN and it leases out an address, it
+ need not wait this time as long as it has never communicated with the
+ partner since the lease was given out.
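+
+ The "longer of the following two times" rule can be written as a
+ one-line calculation. This is a sketch only: the function name is
+ hypothetical, times are taken to be seconds on a common clock, and
+ it covers only leases issued before entering PARTNER-DOWN state.
+

```python
def earliest_reallocation_time(partner_down_time, believed_lease_expiry, mclt):
    """Earliest time an address may be reallocated to a different client.

    partner_down_time     -- when the partner went down (or, if unknown,
                             when PARTNER-DOWN state was entered)
    believed_lease_expiry -- what this server believes to be the lease
                             expiration time of the original client
    mclt                  -- maximum client lead time
    """
    return max(partner_down_time + mclt, believed_lease_expiry + mclt)
```

+ With an MCLT of 3600 and a partner that went down at time 1000, a
+ lease believed to expire at 500 may be reallocated at 4600, while a
+ lease believed to expire at 5000 must wait until 8600.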
+
+ The fundamental relationship on which much of the correctness of this
+ protocol depends is that the lease expiration time known to a DHCP
+ client MUST NOT exceed the potential expiration time known to the
+ server's partner by more than the maximum client lead time.
+
+ The remainder of this section makes the above fundamental relation-
+ ship more explicit.
+
+ This protocol requires a DHCP server to deal with several different
+ lease intervals and places specific restrictions on their relation-
+ ships. The purpose of these restrictions is to allow the other server
+ in the pair to be able to make certain assumptions in the absence of
+ an ability to communicate between servers.
+
+ The different lease times are:
+
+ o desired lease interval
+
+ The desired lease interval is the lease interval that a DHCP server
+ would like to give to a DHCP client in the absence of any restric-
+ tions imposed by the Failover protocol. Its determination is out-
+ side of the scope of this protocol. Typically this is the result of
+ external configuration of a DHCP server.
+
+ o actual lease interval
+
+ The actual lease interval is the lease interval that a DHCP server
+ gives out to a DHCP client in the dhcp-lease-time option of a
+ DHCPACK packet. It may be shorter than the desired client lease
+ interval (as explained below).
+
+ o potential lease interval
+
+ The potential lease interval is the lease expiration interval the
+ local server communicates to its partner in the potential-expiration-time
+ option of a BNDUPD message.
+
+ o acknowledged potential lease interval
+
+ The acknowledged potential lease interval is the potential lease
+ interval the partner server has most recently acknowledged in the
+ potential-expiration-time option of a BNDACK message.
+
+ The key restriction (and guarantee) that any server makes with
+ respect to lease intervals is that the actual client lease interval
+ never exceeds the acknowledged potential lease interval (if any) by
+ more than a fixed amount. This fixed amount is called the "Maximum
+ Client Lead Time" (MCLT).
+
+ The MCLT MAY be configurable on the primary server, but for correct
+ server operation it MUST be the same and known to both the primary
+ and secondary servers. The secondary server determines the MCLT from
+ the MCLT option sent from the primary server to the secondary server
+ in the CONNECT message.
+
+ A server MUST record in its stable storage both the actual lease
+ interval and the most recently acknowledged potential lease interval
+ for each IP address binding. It is assumed that the desired client
+ lease interval can be determined through techniques outside of the
+ scope of this protocol. See section 7.1.5 for more details concern-
+ ing the times that the server MUST record in its stable storage and
+ the way that they interact with the lease time that may be offered to
+ a DHCP client.
+
+ Again, the fundamental relationship among these times which MUST be
+ maintained is:
+
+ actual lease interval <=
+ ( acknowledged potential lease interval + MCLT )
+
+
+ Figure 5.2.1-1 illustrates an initial lease to a client using the
+ rules discussed in the example which follows it. Note that this is
+ only one example -- as long as the fundamental relationship is
+ preserved, the actual times used could be quite different.
+
+            DHCP                   Primary                 Secondary
+    time   Client                  Server                   Server
+
+              | (time in intervals)   | (absolute time)        |
+              |                       |                        |
+              | >-DHCPDISCOVER->      |                        |
+              | <---DHCPOFFER-<       |                        |
+              |   lease-time=MCLT     |                        |
+              |                       |                        |
+              | >-DHCPREQUEST->       |                        |
+              |   (selecting)         |                        |
+              |                       |                        |
+    t         | <--------DHCPACK-<    |                        |
+              |   lease-time=MCLT     |                        |
+              |                       | >-BNDUPD-->            |
+              |                       |  lease-expiration=t+MCLT
+              |                       |  potential-expiration=t+(MCLT/2)+X
+              |                       |                        |
+              |                       | <-BNDACK-<             |
+              |                       |  potential-expiration=t+(MCLT/2)+X
+             ...                     ...                      ...
+              |                       |                        |
+    t+MCLT/2  | >-DHCPREQUEST->       |                        |
+              |   (renew)             |                        |
+              |                       |                        |
+    t1        | <--------DHCPACK-<    |                        |
+              |   lease-time=X        |                        |
+              |                       | >-BNDUPD-->            |
+              |                       |  lease-expiration=t1+X
+              |                       |  potential-expiration=t1+(X/2)+X
+              |                       |                        |
+              |                       | <-BNDACK-<             |
+              |                       |  potential-expiration=t1+(X/2)+X
+             ...                     ...                      ...
+
+ Figure 5.2.1-1: Lazy Update Message Traffic
+ X = Desired Lease Interval
+ Assumes renewal interval = lease interval / 2
+
+
+ DISCUSSION:
+
+ This protocol mandates only that the above fundamental relation-
+ ship concerning lease intervals is preserved.
+
+ In the interests of clarity, however, let's examine a specific
+ example. The MCLT in this case is 1 hour. The desired lease
+ interval is 3 days, and its renewal time is half the lease inter-
+ val.
+
+ The rules for this example are:
+
+ o What to tell the client:
+
+ Take the remainder of the acknowledged potential lease interval.
+ If this is a new lease, then this value will be zero. If this
+ remainder plus the MCLT is greater than the desired lease inter-
+ val, give the client the desired lease interval; otherwise, give
+ the client the remainder plus the MCLT.
+
+ o What to tell the failover partner server:
+
+ Take the renewal interval (typically half of the actual client
+ lease interval), add to it the desired lease interval, and add
+ it to the current time to yield the value that goes into the
+ potential-expiration-time option.
+
+ Also tell the failover partner the actual lease interval by
+ adding it to the current time to yield the value that goes into
+ the lease-expiration option.
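+
+ The two rules above can be sketched in a few lines. The function
+ and parameter names are hypothetical, all times are in seconds, and
+ the renewal fraction of one half is the assumption used in this
+ example, not a protocol requirement.
+

```python
def actual_lease_interval(now, acked_potential_expiration, desired, mclt):
    """What to tell the client: an interval in seconds.

    acked_potential_expiration is the absolute potential expiration time
    most recently acknowledged by the partner, or None for a new lease.
    """
    if acked_potential_expiration is None:
        remainder = 0
    else:
        remainder = max(0, acked_potential_expiration - now)
    # Never lead the partner's acknowledged time by more than the MCLT.
    return min(desired, remainder + mclt)


def potential_expiration(now, actual, desired, renewal_fraction=0.5):
    """What to tell the partner in the potential-expiration-time option."""
    return now + actual * renewal_fraction + desired


def lease_expiration(now, actual):
    """What to tell the partner in the lease-expiration option."""
    return now + actual
```

+ With an MCLT of 3600 and a desired lease interval of 259200 (3
+ days), a new lease at time 0 yields an actual lease interval of
+ 3600 and a potential expiration time of 261000. A renewal at time
+ 1800, after that value has been acknowledged, yields the full
+ 259200.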
+
+ In operation this might work as follows:
+
+ When a server makes an offer for a new lease on an IP address to a
+ DHCP client, it determines the desired lease interval (in this
+ case, 3 days). It then examines the acknowledged potential lease
+ interval (which in this case is zero) and determines the remainder
+ of the time left to run, which is also zero. To this it adds the
+ MCLT. Since the actual lease interval cannot be allowed to exceed
+ the remainder of the current acknowledged potential lease interval
+ plus the MCLT, the offer made to the client is for the remainder
+ of the current acknowledged potential lease interval (i.e., zero)
+ plus the MCLT. Thus, the actual lease interval is 1 hour.
+
+ Once the server has performed the DHCPACK to the DHCP client, it
+ will update the secondary server with the lease information. How-
+ ever, the desired potential lease interval will be composed of one
+ half of the current actual lease interval added to the desired
+ lease interval. Thus, the secondary server is updated with a
+ BNDUPD with a lease interval of 3 days + 1/2 hour specified in the
+ potential-expiration-time option.
+
+ When the primary server receives a BNDACK to its update of the
+ secondary server's (partner's) potential lease interval, it
+ records that as the acknowledged potential lease interval. A
+ server MUST NOT send a BNDACK in response to a BNDUPD message
+ until it is sure that the information in the BNDUPD message
+ resides in its stable storage. Thus, the primary server in this
+ case can be sure that the secondary server has recorded the poten-
+ tial lease interval in its stable storage when the primary server
+ receives a BNDACK message from the secondary server.
+
+ When the DHCP client attempts to renew at T1 (approximately one
+ half an hour from the start of the lease), the primary server
+ again determines the desired lease interval, which is still 3
+ days. It then compares this with the remaining acknowledged
+ potential lease interval (3 days + 1/2 hour) and adjusts for the
+ time passed since the secondary was last updated (1/2 hour). Thus
+ the time remaining of the acknowledged potential lease interval is
+ 3 days. Adding the MCLT to this yields 3 days plus 1 hour, which
+ is more than the desired lease interval of 3 days. So the client
+ is renewed for the desired lease interval -- 3 days.
+
+ When the primary DHCP server updates the secondary DHCP server
+ after the DHCP client's renewal ACK is complete, it will calculate
+ the desired potential lease interval as the T1 fraction of the
+ actual client lease interval (1/2 of 3 days this time = 1.5 days).
+ To this it will add the desired client lease interval of 3 days,
+ yielding a total desired partner server lease interval of 4.5
+ days. In this way, the primary attempts to have the secondary
+ always "lead" the client in its understanding of the client's
+ lease interval so as to be able to always offer the client the
+ desired client lease interval.
+
+ Once the initial actual client lease interval of the MCLT is past,
+ the protocol operates effectively like the DHCP protocol does
+ today in its behavior concerning lease intervals. However, the
+ guarantee that the actual client lease interval will never exceed
+ the remaining acknowledged partner server lease interval by more
+ than the MCLT allows full recovery from a variety of failures.
+
+5.2.2. Controlled re-allocation of IP addresses
+
+ When in PARTNER-DOWN state there is a waiting period after which an
+ IP address can be re-allocated to another client. For IP addresses
+ which are available when the server enters PARTNER-DOWN state, the
+ period is the MCLT from entry into PARTNER-DOWN state. For IP
+ addresses which are not available when the server enters PARTNER-DOWN
+ state, the period is the MCLT after the IP address becomes available.
+ See section 9.4.2 for more details.
+
+ In any other state, a server cannot reallocate an address from one
+ client to another without first notifying its partner (through a
+ BNDUPD message) and receiving acknowledgement (through a BNDACK mes-
+ sage) that its partner is aware that that first client is not using
+ the address.
+
+ This could be modeled in the following way. Though this specific
+ implementation is in no way required, it may serve to better illus-
+ trate the concept.
+
+ An "available" IP address on a server may be allocated to any client.
+ An IP address which was leased to a client and which expired or was
+ released by that client would take on a new state, EXPIRED or
+ RELEASED respectively. The partner server would then be notified
+ that this IP address was EXPIRED or RELEASED through a BNDUPD. When
+ the sending server received the BNDACK for that IP address showing it
+ was FREE, it would move the IP address from EXPIRED or RELEASED to
+ FREE, and it would be available for allocation by the primary server
+ to any clients.
+
+ A server MAY reallocate an IP address in the EXPIRED or RELEASED
+ state to the same client with no restrictions provided it has not
+ sent a BNDUPD message to its partner. This situation would exist if
+ the lease expired or was released after the transition into PARTNER-
+ DOWN state, for instance.
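+
+ The model described above can be expressed as a small state
+ machine. As with the model itself, this is illustrative only. The
+ class and method names are hypothetical, the name ACTIVE for a
+ leased address is an assumption, and refusing same-client
+ reallocation once a BNDUPD has been sent is a conservative reading
+ of the text.
+

```python
class AddressBinding:
    """One IP address as seen by one server (hypothetical model)."""

    def __init__(self):
        self.state = "FREE"
        self.client = None         # current or most recent client
        self.bndupd_sent = False   # partner told about EXPIRED/RELEASED?

    def may_allocate_to(self, client):
        if self.state == "FREE":
            return True            # available: any client may get it
        if self.state in ("EXPIRED", "RELEASED"):
            # Same client only, and only before the partner was notified.
            return client == self.client and not self.bndupd_sent
        return False

    def lease(self, client):
        if not self.may_allocate_to(client):
            raise ValueError("address may not be allocated to this client")
        self.state, self.client, self.bndupd_sent = "ACTIVE", client, False

    def end_lease(self, how):
        if how not in ("EXPIRED", "RELEASED") or self.state != "ACTIVE":
            raise ValueError("invalid transition")
        self.state = how

    def send_bndupd(self):
        """Notify the partner that the address is EXPIRED or RELEASED."""
        self.bndupd_sent = True

    def receive_bndack_free(self):
        """Partner acknowledged the address as FREE."""
        if self.state in ("EXPIRED", "RELEASED") and self.bndupd_sent:
            self.state, self.client, self.bndupd_sent = "FREE", None, False
```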
+
+
+5.3. Load balancing
+
+ In order to implement load balancing between a primary and secondary
+ server pair, each server must respond to DHCPDISCOVER requests from
+ some clients and not from other clients. In order to do this suc-
+ cessfully, each server must be able to determine immediately upon
+ receipt of a DHCP client request whether it is to service this
+ request or to ignore it in order to allow the other server to service
+ the request.
+
+ In addition, it should be possible to configure the percentage of
+ clients which will be serviced by either the primary or secondary
+ server. This configuration should be more or less continuous, from
+ all clients serviced by the primary through an even split with half
+ serviced by each, to all clients serviced by the secondary.
+
+ The technique chosen to support these goals is described in [RFC
+ 3074].
+
+ A bitmap-style Hash Bucket Assignment (as described in [RFC 3074]) is
+ used to determine which DHCP clients can be processed. There are two
+ potential HBAs in a failover server -- a server HBA and a failover
+ HBA. The way that a server acquires a server HBA is outside of the
+ scope of the failover protocol, but both servers in a failover pair
+ MUST have the same server HBA. The failover HBA (which specifies the
+ clients that the secondary is supposed to process) is sent by the
+ primary server to the secondary server whenever a connection is esta-
+ blished, using the hash-bucket-assignment option defined in section
+ 12.11.
+
+ When the server HBA (if any) and the failover HBA (if any) are used to
+ decide whether to process a DHCP request, the server HBA always
+ applies in every failover state, and the failover HBA (which MUST be
+ a subset of the server HBA) is used by the secondary server to decide
+ which packets to process when in NORMAL state.
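+
+ As a non-normative illustration, the decision described above might
+ be sketched as follows. The bit ordering within the HBA, the role and
+ state strings, and the rule that the primary serves the complement of
+ the failover HBA in NORMAL state are assumptions of this sketch, and
+ stand_in_hash is a placeholder, not the hash defined in [RFC 3074].

```python
from typing import Callable, Optional

HBA_OCTETS = 32  # 256 hash buckets, one bit per bucket


def stand_in_hash(client_key: bytes) -> int:
    """Placeholder only: NOT the RFC 3074 hash. Returns a bucket 0..255."""
    return sum(client_key) % 256


def bucket_is_set(hba: bytes, bucket: int) -> bool:
    """Assumed layout: bucket n is bit (n % 8) of octet (n // 8)."""
    return bool(hba[bucket // 8] & (1 << (bucket % 8)))


def should_process(
    client_key: bytes,
    role: str,                      # "primary" or "secondary" (hypothetical)
    state: str,                     # e.g. "NORMAL", "PARTNER-DOWN"
    server_hba: Optional[bytes],
    failover_hba: Optional[bytes],
    hash_fn: Callable[[bytes], int] = stand_in_hash,
) -> bool:
    bucket = hash_fn(client_key)
    # The server HBA, if any, applies in every failover state.
    if server_hba is not None and not bucket_is_set(server_hba, bucket):
        return False
    # The failover HBA names the secondary's clients; it matters in NORMAL.
    if state == "NORMAL" and failover_hba is not None:
        for_secondary = bucket_is_set(failover_hba, bucket)
        # Assumption: the primary serves whatever the secondary does not.
        return for_secondary if role == "secondary" else not for_secondary
    return True
```

+ Passing the hash as a parameter keeps the sketch independent of the
+ hash function, which a real server would take from [RFC 3074].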
+
+5.4. IP address allocations between servers
+
+ The failover protocol allows a DHCP server which implements it to
+ operate correctly in spite of the uncertainty over whether its
+ partner has failed or whether the communications link to its partner
+ has failed. This is made possible in part by the existence of
+ separate address pools on each server for allocation to newly arrived
+ DHCP clients.
+
+ Thus, each server has its own pool of available IP addresses. Note
+ that an IP address is not "owned" by a particular server throughout
+ its entire lifetime. Only an IP address which is available is
+ "owned" by a particular server -- once it has been leased to a DHCP
+ client, it is not owned by either failover partner. When it finally
+ becomes available again, it will be owned initially by the primary
+ server, and it may or may not be allocated to the secondary server by
+ the primary server.
+
+ So, the flow of IP address ownership is as follows: initially an IP
+ address is owned by the primary server. It may be allocated to the
+ secondary server if it is available, and then it is owned by the
+ secondary server. Either server can allocate available IP addresses
+ which it owns to DHCP clients, in which case it ceases to own them.
+ When the DHCP client releases the address or the lease on it expires,
+ it will again become available and will be owned by the primary.
+
+ An IP address will not become owned by the server which allocated it
+ initially when it is released or the lease expires because, in gen-
+ eral, that server will have had to replenish its pool of available
+ addresses well in advance of any likely lease expirations. Thus,
+ having a particular IP address cycle back to the secondary might well
+ put the secondary more out of balance with respect to the primary
+ instead of enhancing the balance of available addresses between them.
+
+ These address pools are used when in COMMUNICATIONS-INTERRUPTED state
+ and while waiting for the MCLT expiration in PARTNER-DOWN state. In
+ addition, when using load balancing, these pools are used when in
+ NORMAL state as well.
+
+ The allocation and maintenance of these address pools is an area of
+ some sensitivity, since the goal is to maintain a more or less con-
+ stant ratio of available addresses between the two servers.
+
+ The initial allocation when the servers first integrate is triggered
+ by the POOLREQ message from the secondary to the primary. This is
+ followed by the POOLRESP message where the primary tells the secon-
+ dary how many IP addresses it allocated to the secondary. Then, the
+ primary sends the allocated IP addresses to the secondary via BNDUPD
+ messages. The POOLREQ/POOLRESP exchange is a trigger to the primary
+ to perform a scan of its database and to ensure that the secondary
+ has enough IP addresses (based on some configured ratio).
+
+ The actual IP addresses are sent to the secondary using the BNDUPD
+ message with a state of BACKUP, which indicates the IP address is now
+ available for allocation by the secondary. Once the message is sent,
+ the primary MUST NOT use these addresses for allocation to DHCP
+ clients.
+
+ The POOLREQ/POOLRESP message exchange initiated by the secondary is
+ valid at any time, and the primary server SHOULD, whenever it
+ receives the POOLREQ message, scan its database of address pools and
+ determine if the secondary needs more IP addresses from any of the IP
+ address pools.
+
+ However, in order to support a reasonably dynamic balance of the IP
+ addresses between the failover partners, the primary server needs to
+ do additional work to ensure that the secondary server has as many IP
+ addresses as it needs (but that it doesn't have *more* than it needs
+ either).
+
+ The primary server SHOULD examine the balance of available addresses
+ between the primary and secondary for a particular address pool when-
+ ever the number of available addresses for either the primary or
+ secondary changes. The primary server SHOULD adjust the available
+ address balance as required to ensure the configured address balance,
+ except that the primary server SHOULD apply some threshold mechanism
+ to such balance adjustments in order to minimize the overhead of
+ maintaining this balance.
+
+ An example of a threshold approach is: do not attempt to re-balance
+ the available pools on the primary and secondary until the out of
+ balance value exceeds a configured value.
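+
+ A minimal sketch of such a threshold check follows. The function
+ name, the integer-percentage form of the configured ratio, and the
+ sign convention of the result are assumptions made for illustration;
+ the draft leaves the mechanism to the implementation.

```python
def rebalance_delta(primary_free: int, secondary_free: int,
                    secondary_percent: int, threshold: int) -> int:
    """Return how many available addresses should move.

    Positive: the primary should send that many addresses to the
    secondary (BNDUPD with state BACKUP). Negative: the primary should
    try to reclaim that many (BNDUPD with state FREE). Zero: the pools
    are within the configured threshold, so nothing is done.
    """
    total = primary_free + secondary_free
    # Number of available addresses the secondary should hold.
    target = total * secondary_percent // 100
    delta = target - secondary_free
    # Ignore small imbalances to limit the overhead of rebalancing.
    if abs(delta) <= threshold:
        return 0
    return delta
```

+ For example, with 90 available addresses on the primary, 10 on the
+ secondary, an even split, and a threshold of 5, the sketch asks the
+ primary to send 40 addresses to the secondary.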
+
+ The primary server can, at any time, send an available IP address to
+ the secondary using a BNDUPD with the state BACKUP. The primary
+ server can attempt to take an available IP address away from the
+ secondary by sending a BNDUPD with the state FREE. If the secondary
+ accepts the BNDUPD, then the IP address is available to the primary
+ and no longer available to the secondary. The secondary MUST reject
+ that BNDUPD if it has already used that IP address for a DHCP client.
+
+ Whenever the primary server examines the possible available IP
+ addresses which it could send to the secondary server, the primary
+ server SHOULD take into account whether load balancing is in use, and
+ it SHOULD attempt to send to the secondary any IP addresses whose
+ most recent client would be processed by the secondary under the
+ current load balancing regime in use. Likewise, when removing avail-
+ able IP addresses from the secondary server when load balancing is in
+ use, the primary server SHOULD first remove those IP addresses whose
+ most recent client would be processed by the primary server under the
+ current load balancing regime in use.
+
+5.5. Operating in NORMAL state
+
+ When in NORMAL state, each server services DHCPDISCOVERs and all
+ other DHCP requests other than DHCPREQUEST/RENEWAL or
+ DHCPREQUEST/REBINDING from the client set defined by the load
+ balancing algorithm [RFC 3074]. Each server services
+ DHCPREQUEST/RENEWAL or DHCPREQUEST/REBINDING requests from any client.
+
+ In general, whenever the binding database is changed in stable
+ storage (other than a change resulting from receiving a BNDUPD from
+ the failover partner), then a BNDUPD message is sent with the con-
+ tents of that change to the partner server. The partner server then
+ writes the information about that binding in its bindings database in
+ stable storage and replies with a BNDACK message.
+
+ The binding database in a DHCP server would normally be changed as a
+ result of DHCP protocol activity with a DHCP client (e.g., granting
+ a lease to a DHCP client through the familiar
+ DISCOVER/OFFER/REQUEST/ACK cycle or extending a lease due to a
+ renewal from a DHCP client) or possibly (on some servers) because a
+ lease has expired or undergone another state change that must be
+ recorded in the DHCP binding database. These are the state changes
+ that would be communicated to the partner server using a BNDUPD mes-
+ sage. Of course, receipt of a BNDUPD message itself will normally
+ cause an update of the binding database for all of the IP addresses
+ contained in the BNDUPD, and a binding database change such as this
+ MUST NOT trigger a corresponding BNDUPD message to the partner.
+
+5.6. Operating in COMMUNICATIONS-INTERRUPTED state
+
+ When operating in COMMUNICATIONS-INTERRUPTED state, each server is
+ operating independently, but does not assume that its partner is not
+ operating. The partner server might be operating and simply unable
+ to communicate with this server, or might not be operating.
+
+ Each server responds to the full range of DHCP client messages that
+ it receives (subject to server load balancing [RFC 3074]), but in
+ such a way that graceful reintegration is always possible when its
+ partner comes back into contact with it.
+
+5.7. Operating in PARTNER-DOWN state
+
+ When operating in PARTNER-DOWN state, a server assumes that its
+ partner is not currently operating, but does make allowances for the
+ possibility that that server was operating in the past, though possi-
+ bly out of communications with this server. It responds to all DHCP
+ client requests in PARTNER-DOWN state (subject to server load balanc-
+ ing [RFC 3074]).
+
+5.8. Operating in RECOVER state
+
+ A server operating in RECOVER state assumes that it is reintegrating
+ with a server that has been operating in PARTNER-DOWN state, and that
+ it needs to update its bindings database before it services DHCP
+ client requests.
+
+ A server may also operate in RECOVER state in order to fully recover
+ its bindings database from its partner server.
+
+5.9. Operating in STARTUP state
+
+ A server operating in STARTUP state assumes that failover is opera-
+ tional, and it spends a short time whenever it comes up attempting to
+ contact the partner. During this short time, the server is unrespon-
+ sive to DHCP client requests. This period exists in order to give a
+ server a chance to determine that its partner has changed state since
+ it was last in communications, and to react to that changed state (if
+ any) prior to responding to DHCP client requests.
+
+ The startup period SHOULD be conditioned on the length of time the
+ server has been down (if that can be determined). If the server has
+ been down less than the MCLT then it can wait only a few (say 5 or
+ 10) seconds. If it has been down a longer time (such that the
+ partner may well have moved to PARTNER-DOWN state), a considerably
+ longer startup period of 30 to 60 seconds may be warranted, since the
+ consequences of running while the partner is in PARTNER-DOWN state
+ are unpleasant.
+
+ The period of time a server remains in STARTUP state SHOULD be long
+ enough to ensure that it will connect to the other server if that
+ server is available for connections.
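+
+ The guidance above can be sketched as a small helper. The default
+ values are taken from the ranges suggested in the text; the function
+ name and the choice to use the long wait when the downtime cannot be
+ determined are assumptions of this sketch.

```python
from typing import Optional


def startup_wait_seconds(down_seconds: Optional[float],
                         mclt_seconds: float,
                         short_wait: float = 5.0,
                         long_wait: float = 60.0) -> float:
    """Pick how long to stay in STARTUP before serving clients."""
    if down_seconds is None:
        # Downtime unknown: assume the partner may be in PARTNER-DOWN.
        return long_wait
    if down_seconds < mclt_seconds:
        # Down for less than the MCLT: a few seconds is enough.
        return short_wait
    # Down long enough that the partner may have moved to PARTNER-DOWN.
    return long_wait
```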
+
+5.10. Time synchronization between servers
+
+ The failover protocol is designed to operate between two servers
+ which have time values which differ by an arbitrarily large amount.
+ A particular implementation MAY choose to only support servers whose
+ time values differ by an arbitrarily small amount.
+
+ Note that if an implementation that requires time synchronization
+ between servers encounters a case where the time is not synchronized
+ to its satisfaction between two servers, then this failure will prob-
+ ably prevent the two servers from reaching communications OK status.
+ If this occurs, and if both servers continue to operate and deal with
+ clients, potentially troublesome things can happen. For instance, if
+ there is a safe period configured on either server, then it will
+ eventually go into PARTNER-DOWN state, but in this case the partner
+ will not be down. This will almost certainly create problems. Thus,
+ some method to prevent this sort of situation SHOULD exist in imple-
+ mentations that can be configured to require time synchronization.
+
+ In any event, whether large or only small differences in time values
+ are supported, every message that is sent MUST include the time and
+ every packet that is received MUST be tagged with a time value as
+ soon as possible after receipt. This time value is used along with
+ the time value that is sent in every message between the failover
+ partners to develop a delta time between the servers. This delta
+ time is used during the connection process to establish a baseline
+ delta time between the servers, and upon receipt of each message, the
+ delta time for that message is used to refine the delta time for the
+ server pair.
+
+ While the algorithm for this refinement of delta time is not speci-
+ fied as part of this protocol, a server SHOULD allow the delta time
+ value for a pair of failover servers to be periodically updated to
+ account for time drift. In addition, the delta time value between
+ servers SHOULD be smoothed in some fashion, so that transient network
+ delays will not cause it to vary wildly.
+
+ A server SHOULD treat a drastic change in the delta time value as an
+ event to be signaled to a network administrator, and SHOULD also
+ reset the delta time between the failover partners.
+
+ The specific definitions of a minor or drastic change in delta time
+ as well as the algorithm used to smooth minor changes into the run-
+ ning delta time are implementation issues and are not further
+ addressed in this document.
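+
+ Purely as an illustration, one possible smoothing scheme is an
+ exponentially weighted moving average with a reset on drastic change.
+ This protocol does not specify the algorithm; the class name, the
+ weight alpha, and the drastic_seconds limit are assumptions of this
+ sketch.

```python
from typing import Optional


class DeltaTime:
    """Running estimate of (partner clock offset) in seconds."""

    def __init__(self, alpha: float = 0.125,
                 drastic_seconds: float = 60.0) -> None:
        self.alpha = alpha                  # weight given to each new sample
        self.drastic_seconds = drastic_seconds
        self.value: Optional[float] = None  # None until the first message

    def update(self, sent_time: float, received_time: float) -> bool:
        """Fold in one message; return True if an administrator
        should be notified of a drastic change."""
        sample = received_time - sent_time
        if self.value is None:
            # Baseline established during the connection process.
            self.value = sample
            return False
        if abs(sample - self.value) > self.drastic_seconds:
            # Drastic change: reset the delta and signal the event.
            self.value = sample
            return True
        # Minor change: smooth it into the running delta time.
        self.value += self.alpha * (sample - self.value)
        return False
```

+ Here sent_time is the time value carried in the message and
+ received_time is the local tag applied on receipt.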
+
+5.11. IP address binding-status
+
+ In most DHCP servers an IP address can take on several different
+ binding-status values, sometimes also called states. While probably
+ no two DHCP servers have exactly the same set of binding-status
+ values, the DHCP RFC enforces some commonality among the general
+ semantics of the binding-status values used by various DHCP server
+ implementations.
+
+ In order to transmit binding database updates between one server and
+ another using the failover protocol, some common denominator
+ binding-status values must be defined. It is not expected that these
+ binding-status values correspond with any actual implementation of
+ the DHCP protocol in a DHCP server, but rather that the binding-
+ status values defined in this document should be a common denominator
+ of those in use by many DHCP server implementations. It is a goal of
+ this protocol that any DHCP server can map the various IP address
+ binding-status values that it uses internally into these failover IP
+ address binding-status values on transmission of binding database
+ updates to its partner, and likewise that it can map any failover IP
+ address binding-status values it received in a binding update into
+ its internal IP address binding-status values.
+
+ The IP address binding-status values defined for the failover proto-
+ col are listed below. Unless otherwise noted below, there MAY be
+ client information associated with each of these binding-status
+ values.
+
+ o ACTIVE -- Lease is assigned to a client. Client identification
+ MUST appear.
+
+ o EXPIRED -- indicates that a client's binding on an IP address
+ has expired. When the partner server ACK's the BNDUPD of an
+ EXPIRED IP address, the server sets its internal state to FREE.
+ It is then available for allocation to any client of the primary
+ server. It may be allocated to the same client on the server
+ where the lease expired if a BNDUPD containing the EXPIRED state
+ has not yet been sent to the partner (e.g., in the event that
+ the servers are not in communication). Client identification
+ SHOULD appear.
+
+ o RELEASED -- indicates that a DHCP client sent in a DHCPRELEASE
+ message. When the partner server ACK's the BNDUPD of a
+ RELEASED IP address, the server sets its internal state to FREE,
+ and it is available for allocation by the primary server to any
+ DHCP client. It may be allocated to the same client if a BNDUPD
+ has not yet been sent to the partner. Client identification
+ SHOULD appear.
+
+ o FREE -- is used when a DHCP server needs to communicate that an
+ IP address is unused by any DHCP client, but it was not just
+ released, expired, or reset by a network administrator. When
+ the partner server ACK's the BNDUPD of a FREE IP address, the
+ server sets its internal state such that it is available for
+ allocation by the primary DHCP server to any DHCP client. (Note
+ that in PARTNER-DOWN state, after waiting the MCLT, the IP
+ address MAY be allocated to a DHCP client by the secondary
+ server.)
+
+ Note that when an IP address that was allocated by the secondary
+ reverts to the FREE state, it must (like any other IP address)
+ be assigned to the secondary through the POOLREQ/BNDUPD process
+ before the secondary can reallocate it.
+
+ Client identification MAY appear.
+
+ o ABANDONED -- indicates that an IP address is considered unusable
+ by the DHCP subsystem. An IP address for which a valid PING
+ response was received SHOULD be set to ABANDONED. An IP address
+ for which a DHCPDECLINE was received should be set to ABANDONED.
+ Client identification MUST NOT appear.
+
+ o RESET -- indicates that this IP address was made available by
+ operator command. This is a distinct state so that the reason
+ that the IP address became FREE can be determined. Client iden-
+ tification MAY appear.
+
+ o BACKUP -- indicates that this IP address can be allocated by the
+ secondary server to a DHCP client at any time. When the MCLT has
+ passed after its time of entry into PARTNER-DOWN state, the IP
+ address may be allocated by the primary to any DHCP client.
+ Client identification MAY appear.
+
+ These binding-status values are communicated from one failover
+ partner to another using the binding-status option (see section 12.3
+ for details of this option). Unless otherwise noted above, there MAY
+ be client information associated with each of these binding-status
+ values.
+
+ An IP address will move between these binding-status values using the
+ following state transition diagram:
+
+                                   DHCP client DECLINE or
+                                   server detected problem
+                                       from any state
+                                              |
+                                              V
+                      +----------+         +--+------+
+       External >---->|  RESET   |   (3)   |ABANDONED|
+       command        |          +<--------+         |
+                      +----------+         +---------+
+                           |
+                   Comm w/Partner(1)
+                           V
+ +---------+ Comm(1)  +----------+       Comm(1)       +---------+
+ | EXPIRED |--------->|   FREE   |<--------------------| RELEASED|
+ |         | w/Partner|          |      w/Partner      |         |
+ +---------+          +----------+                     +---------+
+   ^     ^              |      | +------------------+       ^
+   |     |              |      |                    |       |
+   | Exp. grace        IP      | IP addr alloc.  IP addr    |
+   | period ends     address   | to sec.(2)      reserved   |
+   |     |           leased    V                    |       |
+   |     |             by    +----------+           |       |
+   |     |           primary |  BACKUP  |<----------+       |
+   | wait for           |    |          |                   |
+   | grace period       |    +----------+                   |
+   |     |              |      |                            |
+   |     |              |      | IP addr leased by          |
+   | Expired grace      |      | secondary                  |
+   | period exists      V      V                            |
+   |     |            +----------+                          |
+   |     | Lease on   |  ACTIVE  |  DHCPRELEASE             |
+   +-----+-IP addr----|          |--------------------------+
+           expires    +----------+
+
+
+ Figure 5.11-1: Transitions between binding-status values.
+
+ (1) This transition MAY also occur if the server is in
+ PARTNER-DOWN state and the MCLT has passed since the entry
+ in the RELEASED, EXPIRED, or RESET states.
+
+ (2) This transition MAY occur if the server is the secondary
+ and the MCLT has passed since its entry into PARTNER-DOWN state.
+
+ (3) This transition MAY occur due to an implementation specific
+ handling of ABANDONED IP addresses.
+
+ Again, note that a DHCP server implementing the failover protocol
+ does not have to implement either this state machine or use these
+ particular binding-status values in its normal operation of allocat-
+ ing IP addresses to DHCP clients. It only needs to map its internal
+ binding-status values onto these "standard" binding-status values,
+ and map these "standard" binding-status values back into its internal
+ binding-status values. For example, a server which implements a
+ grace period for an IP address binding SHOULD simply wait to update
+ its partner server until the grace period on that binding has run
+ out.
+
+ The process of setting an IP address to FREE deserves some detailed
+ discussion. When an IP address is moved to the EXPIRED, RELEASED, or
+ RESET binding-status on a server, that server will send a BNDUPD
+ with the binding-status of EXPIRED, RELEASED, or RESET to its
+ partner. If the partner agrees that the update is acceptable (see
+ sections 7.1.2 and 7.1.3 concerning why a server might not accept a
+ BNDUPD), it will return a BNDACK with no reject-reason, signifying
+ that it accepted the update.
+ As part of the BNDUPD processing, the server returning the BNDACK
+ will set the binding-status of the IP address to FREE, and upon
+ receipt of the BNDACK the server which sent the BNDUPD will set the
+ binding-status of the IP address to FREE. Thus, the EXPIRED,
+ RELEASED, or RESET binding-status is something of a transitory state.
+ This process is encoded in the transition diagram above by "Comm
+ w/Partner".
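+
+ The exchange just described can be modeled with two small handlers,
+ one for each side. The dictionary-based binding databases, the
+ function names, and the acceptable flag (standing in for the checks
+ of sections 7.1.2 and 7.1.3) are hypothetical and serve only to show
+ the order of the state changes.

```python
from typing import Dict

# Binding-status values that are transitory on the way to FREE.
TRANSITORY = {"EXPIRED", "RELEASED", "RESET"}


def receive_bndupd(partner_db: Dict[str, str], ip: str, status: str,
                   acceptable: bool = True) -> bool:
    """Partner side: process a BNDUPD and report whether it was accepted.

    Returning True models a BNDACK with no reject-reason.
    """
    if not acceptable:
        return False  # BNDACK carrying a reject-reason
    # As part of BNDUPD processing the partner records FREE directly.
    partner_db[ip] = "FREE" if status in TRANSITORY else status
    return True


def receive_bndack(local_db: Dict[str, str], ip: str, accepted: bool) -> None:
    """Sending side: complete the transition once the BNDACK arrives."""
    if accepted and local_db.get(ip) in TRANSITORY:
        local_db[ip] = "FREE"
```

+ If the partner rejects the update, the sending server keeps the
+ transitory binding-status in this model.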
+
+5.12. DNS dynamic update considerations
+
+ DHCP servers (and clients) can use DNS Dynamic Updates as described
+ in [RFC 2136] to maintain DNS name-mappings as they maintain DHCP
+ leases. Many different administrative models for DHCP-DNS integra-
+ tion are possible. Descriptions of several of these models, and
+ guidelines that DHCP servers and clients should follow in carrying
+ them out, are laid out in [FQDN]. The nature of the DHCP failover
+ protocol introduces some issues concerning dynamic DNS updates that
+ are not part of non-failover DHCP environments. This section
+ describes these issues, and defines the information which failover
+ partners should exchange and the protocol which they should follow in
+ order to ensure consistent behavior. The presence of this section
+ should not be interpreted as requiring that implementations of the
+ DHCP failover protocol must also support DDNS updates. The purpose
+ of this discussion is to clarify the areas where the DHCP failover
+ and DHCP-DDNS protocols intersect for the benefit of implementations
+ which support both protocols, not to introduce a new requirement into
+ the DHCP failover protocol. Thus, a DHCP server which implements the
+ failover protocol MAY also support dynamic DNS updates, but if it
+ does support dynamic DNS updates it SHOULD utilize the techniques
+ described here in order to correctly distribute them between the
+ failover partners. See [FQDN], [DNSRES], and [DHCID] for details of
+ how DHCP servers update DNS.
+
+ From the standpoint of the failover protocol, there is no reason why
+ a server which is utilizing the DDNS protocol to update a DNS server
+ should not be a partner with a server which is not utilizing the DDNS
+ protocol to update a DNS server. However, a server which is not able
+ to support DDNS or is not configured to support DDNS SHOULD output a
+ warning message when it receives BNDUPD messages which indicate that
+ its failover partner is configured to support the DDNS protocol to
+ update a DNS server. An implementation MAY consider this an error
+ and refuse to operate, or it MAY choose to operate anyway, having
+ warned the user of the problem in some way.
+
+5.12.1. Relationship between failover and dynamic DNS update
+
+ The failover protocol describes the conditions under which each fail-
+ over server may renew a lease to its current DHCP client, and
+ describes the conditions under which it may grant a lease to a new
+ DHCP client. An analogous set of conditions determines when a fail-
+ over server should initiate a DDNS update, and when it should attempt
+ to remove records from the DNS. The failover protocol's conditions
+ are based on the desired external behavior: avoiding duplicate
+ address assignments; allowing clients to continue using leases which
+ they obtained from one failover partner even if they can only commun-
+ icate with the other partner; allowing the backup DHCP server to
+ grant new leases even if it is unable to communicate with the primary
+ server. The desired external DDNS behavior for DHCP failover servers
+ is:
+
+ 1. Allow timely DDNS updates from the server which grants a
+ client a lease. Recognize that there is often a DDNS update
+ lifecycle which parallels the DHCP lease lifecycle. This is
+ likely to include the addition of records when the lease is
+ granted, and the removal of DNS records when the lease is sub-
+ sequently made available for allocation to a different client.
+
+ 2. Communicate enough information between the two failover
+ servers to allow one to complete the DDNS update 'lifecycle'
+ even if the other server originally granted the lease.
+
+ 3. Avoid redundant or overlapping DDNS updates, where both fail-
+ over servers are attempting to perform DDNS updates for the
+ same lease-client binding. Avoid situations where one partner
+ is attempting to add RRs related to a lease binding while the
+ other partner is attempting to remove RRs related to the same
+ lease binding.
+
+5.12.2. Use of the DDNS option
+
+ In order for either server to be able to complete a DDNS update, or
+ to remove DNS records which were added by its partner, both servers
+ need to know the FQDN associated with the lease-client binding. The
+ FQDN associated with the client's A RR and PTR RR SHOULD be communi-
+ cated from the server which adds records into the DNS to its partner.
+ The initiating server SHOULD use the DDNS option in the BNDUPD mes-
+ sages to inform the partner server of the status of any DDNS updates
+ associated with a lease binding. Failover servers MAY choose not to
+ include the DDNS option in BNDUPD messages if there has been no
+ change in the status of any DDNS update related to the lease binding.
+ The partner server receiving BNDUPD messages containing the DDNS
+ option SHOULD compare the status flags and the FQDN contained in the
+ option data with the current DDNS information it has associated with
+ the lease binding, and update its notion of the DDNS status accord-
+ ingly.
+
+ The initiating server MAY send a BNDUPD to its partner before the
+ DDNS update has been successfully completed. If it does so, it SHOULD
+ leave the 'C' bit in the Flags field clear, to indicate to the
+ partner that the DDNS update may not be complete. When the DDNS
+ update has been successfully acknowledged by the DNS server, the ini-
+ tiating DHCP server SHOULD include the DDNS option in its next BNDUPD
+ message about the binding, so that the partner server will be able to
+ record the final status of the DDNS update. The initiating server
+ SHOULD set the 'C' bit in the DDNS option if the DDNS update was suc-
+ cessfully accepted by the DNS server.
+
+ Some implementations will choose to send a BNDUPD without waiting for
+ the DDNS update to complete, and then will send a second BNDUPD once
+ the DDNS update is complete. Other implementations will delay sending
+ the partner a BNDUPD until the DDNS update has been acknowledged by
+ the DNS server, or until some time-limit has elapsed, in order to
+ avoid sending a second BNDUPD.
+
+ The Domain Name field in the DDNS option contains the FQDN that will
+ be associated with the A RR (if the server is performing an A RR
+ update for the client) and the PTR RR. This FQDN may be composed in
+ any of several ways, depending on server configuration and the infor-
+ mation provided by the client in its DHCP messages. The client may
+ supply a hostname which it would like the server to use in forming
+ the FQDN, or it may supply the entire FQDN. The server may be config-
+ ured to attempt to use the information the client supplies, it may be
+ configured with an FQDN to use for the client, or it may be
+ configured to synthesize an FQDN. The responsive server SHOULD
+ include the FQDN that it will be using in DDNS updates it initiates
+ when it sends the DDNS option.
+
+ Since the responsive server may not have completed the DDNS update at
+ the time it sends the first BNDUPD about the lease binding, there may
+ be cases where the FQDN in later BNDUPD messages does not match the
+ FQDN included in earlier messages. For example, the responsive
+ server may be configured to handle situations where two or more DHCP
+ client FQDNs are identical by modifying the most-specific label in
+ the FQDNs of some of the clients in an attempt to generate unique
+ FQDNs for them (a process sometimes called "disambiguation"). Alter-
+ natively, at sites which use some or all of the information which
+ clients supply to form the FQDN, it's possible that a client's confi-
+ guration may be changed so that it begins to supply new data. The
+ responsive server may react by removing the DNS records which it ori-
+ ginally added for the client, and replacing them with records that
+ refer to the client's new FQDN. In such cases, the responsive server
+ SHOULD include the actual FQDN that was used in subsequent DDNS
+ options. The responsive server SHOULD include relevant client-option
+ data in the client-request-options option in its BNDUPD messages.
+ This information may be necessary in order to allow the non-
+ responsive partner to detect client configuration changes that change
+ the hostname or FQDN data which the client includes in its DHCP
+ requests.
+
+5.12.3. Adding RRs to the DNS
+
+ A failover server which is going to perform DDNS updates SHOULD ini-
+ tiate the DDNS update when it grants a new lease to a client. The
+ non-responsive partner SHOULD NOT initiate a DDNS update when it
+ receives the BNDUPD after the lease has been granted. The failover
+ protocol ensures that only one of the partners will grant a lease to
+ any individual client, so it follows that this requirement will
+ prevent both partners from initiating updates simultaneously. The
+ server initiating the update SHOULD follow the protocol in [FQDN].
+ The server may be configured to perform an A RR update on behalf of
+ its clients, or not. Ordinarily, a failover server will not initiate
+ DDNS updates when it renews leases. In two cases, however, a failover
+ server MAY initiate a DDNS update when it renews a lease to its
+ existing client:
+
+ 1. When the lease was granted before the server was configured to
+ perform DDNS updates, the server MAY be configured to perform
+ updates when it next renews existing leases. Since both
+ servers are responsive to renewals in NORMAL state, it is not
+ enough to simply require the non-responsive server to avoid a
+ DNS update in this case. The server which would be responsive
+ to a DHCPDISCOVER from this client (even though the current
+ request is a DHCPREQUEST/RENEW) is the server which should
+ initiate the DDNS update.
+
+ 2. If a server is in PARTNER-DOWN state, it can conclude that its
+ partner is no longer attempting to perform an update for the
+ existing client. If the remaining server has not recorded that
+ an update for the binding has been successfully completed, the
+ server MAY initiate a DDNS update. It MAY initiate this
+ update immediately upon entry to PARTNER-DOWN state, it MAY
+ perform this update in the background, or it MAY initiate this update
+ upon next hearing from the DHCP client.
+
+5.12.4. Deleting RRs from the DNS
+
+ The failover server which makes an IP address FREE SHOULD initiate
+ any DDNS deletes, if it has recorded that DNS records were added on
+ behalf of the client.
+
+ A server not in PARTNER-DOWN state "makes an IP address FREE" when it
+ initiates a BNDUPD with a binding-status of FREE, EXPIRED, or
+ RELEASED. Its partner confirms this status by acking that BNDUPD,
+ and upon receipt of the ACK the server has "made the IP address
+ FREE". Conversely, a server in PARTNER-DOWN state "makes an IP
+ address FREE" when it sets the binding-status to FREE, since in
+ PARTNER-DOWN state no communication with the partner is required.
+
+ It is at this point that the server should initiate the DDNS
+ operations to delete RRs from the DNS. Its partner SHOULD NOT
+ initiate DDNS deletes for DNS records related to the lease binding
+ as part of sending the BNDACK message. The partner MAY have issued
+ BNDUPD messages with a binding-status of FREE, EXPIRED, or RELEASED
+ previously, but the other server will have NAKed these BNDUPD
+ messages.
+
+ The failover protocol ensures that only one of the two partner
+ servers will be able to make a lease FREE. The server making the
+ lease FREE may be doing so while it is in NORMAL communication with
+ its partner, or it may be in PARTNER-DOWN state. If a server is in
+ PARTNER-DOWN state, it may be performing DDNS deletes for RRs which
+ its partner added originally. This allows a single remaining partner
+ server to assume responsibility for all of the DDNS activity which
+ the two servers were undertaking.
+
+ Another implication of this approach is that no DDNS RR deletes will
+ be performed while either server is in COMMUNICATIONS-INTERRUPTED
+ state, since no IP addresses are moved into the FREE state during
+ that period.
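+
+ The definition of "makes an IP address FREE" can be sketched as
+ follows. This is an illustration under assumptions: the state and
+ status names are plain strings, and both function names and their
+ parameters are hypothetical.

```python
FREEING_STATUSES = {"FREE", "EXPIRED", "RELEASED"}


def has_made_address_free(server_state, binding_status, bndupd_acked=False):
    """True once this server has "made the IP address FREE" (sketch)."""
    if server_state == "PARTNER-DOWN":
        # No communication with the partner is required.
        return binding_status == "FREE"
    if server_state == "COMMUNICATIONS-INTERRUPTED":
        # No IP addresses move into the FREE state in this state.
        return False
    # Otherwise the partner must have acked a BNDUPD carrying one of
    # the freeing binding-status values.
    return binding_status in FREEING_STATUSES and bndupd_acked


def should_delete_rrs(server_state, binding_status, bndupd_acked,
                      dns_records_recorded):
    """Only the server that freed the address deletes recorded RRs."""
    return dns_records_recorded and has_made_address_free(
        server_state, binding_status, bndupd_acked)
```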
+
+5.13. Reservations and failover
+
+ Some DHCP servers support a capability to offer specific
+ pre-configured IP addresses to DHCP clients. These are real DHCP
+ clients that perform the entire DHCP protocol, but these servers
+ always offer the client a specific pre-configured IP address, and
+ they offer that IP address to no other clients. Such a capability
+ has several names, but it is sometimes called a "reservation", in
+ that the IP address is reserved for a particular DHCP client.
+
+ In a situation where there are two DHCP servers serving the same
+ subnet without using failover, the two DHCP servers need to have
+ disjoint IP address pools, but identical reservations for the DHCP
+ clients.
+
+ In a failover context, both servers need to be configured with the
+ proper reservations in an identical manner, but if we stop there
+ problems can occur around the edge conditions where reservations are
+ made for an IP address that has already been leased to a different
+ client. Different servers handle this conflict in different ways,
+ but the goal of the failover protocol is to allow correct operation
+ with any server's approach to the normal processing of the DHCP pro-
+ tocol.
+
+ The general solution with regard to reservations is as follows.
+ Whenever a reserved IP address becomes FREE (i.e., when first config-
+ ured or whenever a client frees it or it expires or is reset), the
+ primary server MUST show that IP address as FREE (and thus available
+ for its own allocation) and it MUST send it to the secondary server
+ with the R bit set in the IP-flags option and the binding-status
+ BACKUP.
+
+ Note that this implies that a reserved IP address goes through the
+ normal state changes from FREE to ACTIVE (and possibly back to FREE).
+ The failover protocol supports this approach to reservations, i.e.,
+ where the IP address undergoes the normal state changes of any IP
+ address, but it can only be offered to the client for which it is
+ reserved. Other approaches to the support of reservations exist in
+ some DHCP server implementations (e.g., where the IP address is
+ apparently leased to a particular client forever, without any expira-
+ tion). The goal is for the failover protocol to support any of the
+ usual approaches to reservations, both those that allow an IP address
+ to go through different states when reserved, and those that don't.
+
+ From the above, it follows that a reservation solely on the secondary
+ will not necessarily allow the secondary to offer that address to the
+ client to which it is reserved. The reservation must also appear on
+ the primary for the secondary to be able to offer the IP
+ address to the client to which it is reserved.
+
+ When the reservation on an IP address is cancelled, if the IP address
+ is currently FREE and the server is the primary, or BACKUP and the
+ server is the secondary, the server MUST send a BNDUPD to the other
+ server with the binding-status FREE and the R bit clear.
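+
+ The two BNDUPD rules for reservations can be sketched as shown here.
+ This is an illustration only: the dictionary layout and the function
+ names are hypothetical, and "R" stands for the R bit of the IP-flags
+ option.

```python
def bndupd_when_reserved_address_becomes_free(role):
    """What the primary sends when a reserved address becomes FREE."""
    if role != "primary":
        return None
    # The primary itself shows the address as FREE, but sends it to
    # the secondary as BACKUP with the R bit set.
    return {"local-status": "FREE", "binding-status": "BACKUP", "R": True}


def bndupd_when_reservation_cancelled(role, binding_status):
    """What a server sends when the reservation is cancelled."""
    if (role, binding_status) in {("primary", "FREE"),
                                  ("secondary", "BACKUP")}:
        return {"binding-status": "FREE", "R": False}
    return None
```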
+
+5.14. Dynamic BOOTP and failover
+
+ Some DHCP servers support a capability to offer IP addresses to BOOTP
+ clients without having a particular address previously allocated for
+ those clients. This capability is often called something like
+ "dynamic BOOTP". It is discussed briefly in RFC 1534 [RFC 1534].
+
+ This capability has a negative interaction with the fundamental ele-
+ ments of the failover protocol, in that an address handed out to a
+ BOOTP device has no term (or effectively no term, in that usually
+ they are considered leases for "forever"). There is no opportunity
+ to hand out a lease which is only the MCLT long when first hearing
+ from a BOOTP device, because they may only interact once with the
+ DHCP server and they have no notion of a lease expiration time. Thus
+ the entire concept of the MCLT and waiting the MCLT after entering
+ PARTNER-DOWN state is defeated when dealing with BOOTP devices.
+
+ With some restrictions, however, dynamic BOOTP devices can be sup-
+ ported in a server on a subnet where failover is supported. The only
+ restriction (and it is not small) is that on any portion of the sub-
+ net (in any address pool) where dynamic BOOTP devices can be allo-
+ cated IP addresses, a DHCP server MUST NOT ever use any of the IP
+ addresses which were previously available for allocation by its fail-
+ over partner. Thus, the addresses allocated by the primary to the
+ secondary for allocation that might have been allocated to BOOTP dev-
+ ices MUST NOT ever be used by the primary server even if it is in
+ PARTNER-DOWN state and has waited the MCLT after entering that state.
+ Conversely, addresses available for allocation by the primary MUST
+ NOT be used by the secondary even if it is in PARTNER-DOWN state. The
+ reason for this is that one of those IP addresses could have been
+ allocated by the secondary server to a BOOTP device, and the primary
+ server would have no way of ever knowing that happened.
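+
+ A minimal sketch of this restriction is shown here. The function
+ and its parameters are hypothetical; "allocatable_by" names the
+ server for which the address was available for allocation.

```python
def may_allocate(server_role, allocatable_by, pool_allows_dynamic_bootp,
                 partner_down_and_mclt_elapsed):
    """May this server allocate the address to a client? (sketch)"""
    if allocatable_by == server_role:
        return True
    if pool_allows_dynamic_bootp:
        # The partner may have given the address to a BOOTP device
        # without this server ever learning of it, so it is never used.
        return False
    # Ordinary pools: the partner's addresses become usable only in
    # PARTNER-DOWN state once the MCLT has passed.
    return partner_down_and_mclt_elapsed
```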
+
+ Whenever a server sends a BNDUPD message to its partner, if the client
+ associated with the IP address is a BOOTP client, then the server
+ MUST set the B bit in the IP-flags option.
+
+ There is a very slight possibility that a BOOTP client could get an
+ IP address on each server of a failover pair. When these two servers
+ eventually attempt to resolve this conflict, they SHOULD agree to
+ disagree, since it is not possible to know which IP address the BOOTP
+ client will actually use -- indeed, it could use both. Operator
+ intervention will, in general, be required to rectify this situation.
+ Fortunately, it is extremely unlikely to ever actually occur.
+
+5.15. Guidelines for selecting MCLT
+
+ There is no one correct value for the MCLT. There is an explicit
+ tradeoff between various factors in selecting an MCLT value.
+
+5.15.1. Short MCLT
+
+ A short MCLT value will mean that after entering PARTNER-DOWN state,
+ a server will only have to wait a short time before it can start
+ allocating its partner's IP addresses to DHCP clients. Furthermore,
+ it will only have to wait a short time after the expiration of a
+ lease on an IP address before it can reallocate that IP address to
+ another DHCP client.
+
+ However, the downside of a short MCLT value is that the initial lease
+ interval that will be offered to every new DHCP client will be short,
+ which will cause increased traffic as those clients will need to send
+ in their first renew after half of that short MCLT. In addition,
+ the lease extensions that a server in COMMUNICATIONS-INTERRUPTED
+ state can give will be only the MCLT after the server has been in
+ COMMUNICATIONS-INTERRUPTED for around the desired client lease
+ period. If a server stays in COMMUNICATIONS-INTERRUPTED for that
+ long, then the leases it hands out will be short and that will
+ increase the load on that server, possibly causing difficulty.
+
+5.15.2. Long MCLT
+
+ A long MCLT value will mean that the initial lease period will be
+ longer and the time that a server in COMMUNICATIONS-INTERRUPTED state
+ will be able to extend leases (after it has been in COMMUNICATIONS-
+ INTERRUPTED state for around the desired client lease period) will be
+ longer.
+
+ However, a server entering PARTNER-DOWN state will have to wait the
+ longer MCLT before being able to allocate its partner's IP addresses
+ to new DHCP clients. This may mean that additional IP addresses are
+ required in order to cover this time period. Further, the server in
+ PARTNER-DOWN will have to wait the longer MCLT from every lease
+ expiration before it can reallocate an IP address to a different DHCP
+ client.
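+
+ The tradeoff can be made concrete with a little arithmetic. The
+ helper below is a hypothetical illustration that simply restates the
+ quantities named in sections 5.15.1 and 5.15.2 for a given MCLT in
+ seconds.

```python
def mclt_consequences(mclt_seconds):
    """Times (in seconds) that follow directly from the chosen MCLT."""
    return {
        # Lease interval offered to every new DHCP client.
        "initial_lease": mclt_seconds,
        # New clients send their first renew after half of that lease.
        "first_renewal_after": mclt_seconds / 2,
        # Wait after entering PARTNER-DOWN before using the partner's
        # addresses, and after a lease expires before reallocating it.
        "partner_down_wait": mclt_seconds,
        "reuse_wait_after_expiry": mclt_seconds,
    }
```

+ For example, a ten minute MCLT gives first renewals after five
+ minutes and a ten minute wait in PARTNER-DOWN, while a one hour MCLT
+ gives first renewals after thirty minutes and a one hour wait.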
+
+5.16. What is sent in response to an UPDREQ or UPDREQALL message?
+
+ In section 7.3, the UPDREQ message is defined, and it says that the
+ receiving server sends to the requesting server "all of the binding
+ database information that it has not already seen". In section
+ 7.4.2, the UPDREQALL message is defined, and it says that the receiv-
+ ing server sends to the requesting server "all binding database
+ information".
+
+ Both of these statements need further elaboration.
+
+ First, for the UPDREQ message, the information to be sent in BNDUPD
+ messages concerns "all of the binding database information it has not
+ already seen". Since every BNDUPD is acked by the receiving server,
+ the sending server need only keep track of which IP addresses have
+ binding database changes not yet seen by the partner, and when they
+ are finally acked by the partner it can record that. Thus, at any
+ time, it knows which IP addresses have unacked binding database
+ information. This is less simple when, across reconfigurations of
+ the servers, an IP address can change the failover partner to which
+ it is associated. In that case, it is important to reset the indica-
+ tion that the partner has seen this binding information. See section
+ 5.17, below, for a more complete discussion of this issue.
+
+ Second, in the event that a failover server's binding database infor-
+ mation is restored from a backup, it will be partially out of date.
+ In this case, its partner's indication of which binding database
+ information the restored server has seen will also be out of date.
+
+ The solution to this problem is for a server which is connecting with
+ its partner to check the partner's last communicated time, and if it
+ is very much ahead of its own last communicated time, go into
+ RECOVER state and transmit an UPDREQALL to allow it to refresh its
+ state. See section 9.3.2, step 5. If the partner's last communi-
+ cated time is very much behind its own record of when it last commun-
+ icated with the partner, then it SHOULD invalidate its information on
+ which binding database information the partner server knows, so that
+ it will send all of its relevant binding database information to the
+ partner.
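+
+ The comparison of last communicated times might be sketched as
+ follows. The text does not quantify "very much", so the threshold
+ here is an assumed, implementation-chosen parameter; the function
+ name and the returned labels are likewise hypothetical.

```python
def reconnect_action(own_last_comm, partner_last_comm, threshold):
    """Compare last communicated times (seconds) on reconnect (sketch)."""
    if partner_last_comm - own_last_comm > threshold:
        # Our database looks restored from a backup: refresh it.
        return "enter-RECOVER-and-send-UPDREQALL"
    if own_last_comm - partner_last_comm > threshold:
        # The partner looks restored: forget what we believe it knows
        # so that all relevant binding information is sent again.
        return "invalidate-partner-knowledge"
    return "continue"
```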
+
+ Third, in the event that a server receives an UPDREQALL message, what
+ constitutes "all binding database information"? At first glance this
+ would seem to be information on every configured IP address in the
+ server. While this would be technically correct, it may impose a
+ serious and unacceptable performance penalty on servers which have
+ millions of configured IP addresses. What can be done to lessen the
+ data that must be sent for an UPDREQALL?
+
+ When sending "all binding database information", if the sending
+ server sends only information concerning IP addresses which have been
+ at some time associated with clients, it will send enough information
+ to satisfy the needs of the failover protocol. It need not send
+ information on any IP addresses that have never been used, since
+ presumably they will be initialized as available to the primary
+ server (i.e. FREE) on any server employing failover.
+
+5.17. How do you determine that your partner is "up to date" for a
+specific binding?
+
+ Throughout this document, one server is assumed to know for each IP
+ address binding whether or not its partner is "up to date" for that
+ binding. There are some subtle issues involved in recording this "up
+ to date" information about a specific binding.
+
+ In a steady state world, it would suffice to have a single bit in the
+ binding database to represent the information about whether the
+ partner was or was not up to date.
+
+ In a more complex environment a configuration change affecting a par-
+ ticular IP address may change the failover endpoint with which it is
+ associated, and if this should happen, any "up to date" bit which is
+ written into the bindings database will be accurate for only the pre-
+ vious failover endpoint, but not the current failover endpoint. If
+ failover is disabled and then re-enabled (and the "up to date" bits,
+ if used, are not cleared) problems can also occur.
+
+ A server MUST be able to relate the "up to date" condition to a
+ particular failover endpoint and even a particular instantiation of
+ that failover endpoint. The techniques to do this are implementation
+ dependent.
+
+ In addition, section 7.4 requires that a server be able to remember
+ that an UPDREQALL message has been received and to treat every UPDREQ
+ message as an UPDREQALL message until the first UPDDONE message is
+ sent. One way to do this is to clear all of the "up to date" indica-
+ tions for an entire failover endpoint upon receipt of an UPDREQALL
+ message, thereby ensuring that every active binding will be sent to
+ the partner whether through the completion of this UPDREQALL or
+ through processing of a subsequent UPDREQ message. This is actually
+ better than remembering that an UPDREQALL was received and turning
+ every UPDREQ into an UPDREQALL, since any information sent in an
+ incomplete UPDREQALL (or subsequent UPDREQ messages turned into "all"
+ messages) will be remembered and not re-sent.
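+
+ Since the techniques are implementation dependent, the following is
+ only one possible sketch. It assumes that each failover endpoint has
+ a name and a generation counter (both hypothetical), so that bumping
+ the counter clears every "up to date" indication for that endpoint
+ at once, whether on receipt of an UPDREQALL or on re-instantiation.

```python
class UpToDateTracker:
    """Tracks whether the partner is up to date for each binding."""

    def __init__(self):
        self._generation = {}  # endpoint name -> generation number
        self._acked = {}       # ip -> (endpoint name, generation) at ack

    def _current(self, endpoint):
        return (endpoint, self._generation.get(endpoint, 0))

    def binding_changed(self, ip):
        # A local change means the partner has not seen this binding.
        self._acked.pop(ip, None)

    def bndack_received(self, ip, endpoint):
        self._acked[ip] = self._current(endpoint)

    def is_up_to_date(self, ip, endpoint):
        # Stale if acked by another endpoint or an older instantiation.
        return self._acked.get(ip) == self._current(endpoint)

    def invalidate_endpoint(self, endpoint):
        # Call on UPDREQALL, reconfiguration, or failover re-enable.
        self._generation[endpoint] = self._generation.get(endpoint, 0) + 1
```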
+
+6. Common Message Format
+
+ This section discusses the message format that all failover
+ messages have in common, including the message header format as well
+ as the common option format. See section 12 for the definitions
+ of the specific options used in the failover protocol.
+
+6.1. Message header format
+
+ The options contained in the payload data section of the failover
+ message all use a two byte option number and two byte length format.
+
+ All failover protocol messages are sent over the TCP connection
+ between failover endpoints and encoded using a message format
+ specific to the failover protocol.
+
+ There exists a common message format for all failover messages, which
+ utilizes the options in a way similar to the DHCP protocol. For each
+ message type, some options are required and some are optional. In
+ addition, when a message is received any options that are not under-
+ stood by the receiving server MUST be ignored.
+
+ All of the fields in the fixed portion of the message MUST be filled
+ with correct data in every message sent.
+
+  0                   1                   2                   3
+  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |      message length (2)       | msg type (1)  |payload off (1)|
+ +---------------+---------------+---------------+---------------+
+ |                           time (4)                            |
+ +---------------------------------------------------------------+
+ |                            xid (4)                            |
+ +---------------------------------------------------------------+
+ |         0 or more additional header bytes (variable)          |
+ +---------------------------------------------------------------+
+ |                    payload data (variable)                    |
+ |                                                               |
+ |                formatted as DHCP-style options                |
+ |       using a two byte option code and two byte length        |
+ |                 See section 6.2 for details.                  |
+ +---------------------------------------------------------------+
+
+
+
+ message length - 2 bytes, network byte order
+
+ This is the length of the message in bytes. It includes the two byte
+ message length itself. The maximum length is 2048 bytes. The
+ minimum length is 12.
+
+ msg type - 1 byte
+
+ The message type field is used to distinguish between messages.
+
+ The following message types are defined:
+
+ Value  Message Type
+ -----  ------------
+    0   reserved     not used
+    1   POOLREQ      request allocation of addresses
+    2   POOLRESP     respond with allocation count
+    3   BNDUPD       update partner with binding info
+    4   BNDACK       acknowledge receipt of binding update
+    5   CONNECT      establish connection with the secondary
+    6   CONNECTACK   respond to attempt to establish connection
+                     with partner
+    7   UPDREQALL    request full transfer of binding info
+    8   UPDDONE      ack send and ack of req'd binding info
+    9   UPDREQ       request transfer of un-acked binding info
+   10   STATE        inform partner of current state or state change
+   11   CONTACT      probe communications integrity with partner
+   12   DISCONNECT   close a connection
+
+
+ New message types should be defined in one of two ranges, 0-127 or
+ 128-255. The range of 0-127 is used for messages that MUST be
+ supported by every server, and if a server receives a message in the
+ range of 0-127 that it doesn't understand, it MUST close the TCP
+ connection. The range of 128-255 is used for messages which MAY be
+ supported but are not required, and if a server receives a message in
+ this range that it does not understand it SHOULD ignore the message.
+
+ payload offset - 1 byte
+
+ The byte offset of the Payload Data, from the beginning of the
+ failover message header. The value for the current protocol version
+ (version 1) is 12.
+
+ time - 4 bytes, network byte order
+
+ The absolute time in GMT when the message was transmitted,
+ represented as seconds elapsed since Jan 1, 1970 (i.e., similar to
+ the ANSI C time_t time value representation). While the ANSI C
+ time_t value is signed, the value used in this specification is
+ unsigned.
+
+ A server SHOULD set this time as close to the actual transmission of
+ the message as possible.
+
+ xid - 4 bytes, network byte order
+
+ This is the transaction id of the failover message. The sender of a
+ failover protocol message is responsible for setting this number, and
+ the receiver of the message copies the number over into any response
+ message, treating it as opaque data. The sender MUST ensure that
+ every message sent from a particular failover endpoint over the
+ associated TCP connection has a unique transaction id.
+
+ For failover messages that have no corresponding response message,
+ the XID value is meaningless, but MUST be supplied. The XID value is
+ used solely by the receiver of a response message to determine the
+ corresponding request message.
+
+ Request messages where the XID is used in the corresponding response
+ messages are: POOLREQ, BNDUPD, CONNECT, UPDREQALL, and UPDREQ. The
+ corresponding response messages are POOLRESP, BNDACK, CONNECTACK,
+ UPDDONE, and UPDDONE, respectively.
+
+ As requests/responses don't survive connection reestablishment, XIDs
+ only need to be unique during a specific connection.
+
+
+ payload data - variable length
+
+ The options are placed after the header, after skipping payload
+ offset bytes from beginning of the message. The payload data options
+ are not preceded by a "cookie" value.
+
+ The payload data is formatted as DHCP style options using two byte
+ option codes and two byte option lengths. The option codes are in a
+ namespace which is unique to the failover protocol.
+
+ The maximum length of the payload data in octets is 2048 less the
+ size of the header, i.e., the maximum message length is 2048 octets.
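+
+ As a non-normative illustration, the header described in this
+ section can be packed and parsed as shown here. The sketch assumes
+ that the payload starts directly after the 12 bytes of fixed fields
+ shown in the diagram (no additional header bytes) when sending, and
+ it reads the payload offset from the header when receiving. All
+ function and constant names are hypothetical.

```python
import struct
import time

# message length (2), msg type (1), payload offset (1), time (4), xid (4),
# all in network byte order: 12 bytes of fixed fields.
HEADER = struct.Struct("!HBBII")
MAX_MESSAGE_LENGTH = 2048
MESSAGE_TYPES = {
    1: "POOLREQ", 2: "POOLRESP", 3: "BNDUPD", 4: "BNDACK", 5: "CONNECT",
    6: "CONNECTACK", 7: "UPDREQALL", 8: "UPDDONE", 9: "UPDREQ",
    10: "STATE", 11: "CONTACT", 12: "DISCONNECT",
}


def pack_message(msg_type, xid, payload=b"", sent_at=None):
    """Build one failover message; sent_at defaults to the current time."""
    sent_at = int(time.time()) if sent_at is None else sent_at
    length = HEADER.size + len(payload)  # includes the length field itself
    if length > MAX_MESSAGE_LENGTH:
        raise ValueError("message exceeds 2048 bytes")
    return HEADER.pack(length, msg_type, HEADER.size, sent_at, xid) + payload


def unpack_message(data):
    """Return (msg_type, time, xid, payload) for one complete message."""
    if len(data) < HEADER.size:
        raise ValueError("shorter than the minimum length of 12")
    length, msg_type, offset, sent_at, xid = HEADER.unpack_from(data)
    if not HEADER.size <= length <= MAX_MESSAGE_LENGTH or length != len(data):
        raise ValueError("bad message length")
    if not HEADER.size <= offset <= length:
        raise ValueError("bad payload offset")
    # Skipping to the offset ignores any additional header bytes.
    return msg_type, sent_at, xid, data[offset:length]


def action_for_type(msg_type):
    """How to treat a received message type, per the two ranges."""
    if msg_type in MESSAGE_TYPES:
        return "process"
    return "close-connection" if msg_type <= 127 else "ignore"
```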
+
+6.2. Common option format
+
+ The options contained in the payload data section of the failover
+ message all use a two byte option number and two byte length format.
+
+ The option numbers are drawn from an option number space unique to
+ the failover protocol. All of the message types share a common
+ option number space and common options definitions, though not all
+ options are required or meaningful for every message.
+
+ In contrast to the options which appear in DHCP client and server
+ messages, the options in failover messages are ordered. That is, for
+ some messages the order in which the options appear in the payload
+ data area is significant. The messages for which option ordering is
+ significant explicitly describe the ordering requirements. If no
+ ordering requirements are mentioned, then the order is not signifi-
+ cant for that message.
+
+ All options which refer to time use an absolute time in GMT. Time
+ synchronization has already been achieved between the source and the
+ target server using the CONNECT message and is updated and refined
+ using the time in every packet.
+
+ The time value is an unsigned 32 bit integer in network byte order
+ giving the number of seconds since 00:00 UTC, 1st January 1970. This
+ can be converted to an NTP timestamp by adding decimal 2208988800.
+ This time format will not wrap until the year 2106. Until sometime
+ in 2038, it is equal to the ANSI C time_t value (which is a signed 32
+ bit value and will overflow into a negative number in 2038).
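+
+ The time encoding and the NTP conversion can be illustrated with a
+ short sketch. The helper names are hypothetical; only the constant
+ 2208988800 and the unsigned 32 bit network byte order encoding come
+ from the text above.

```python
import struct
from datetime import datetime, timedelta, timezone

NTP_EPOCH_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_time(seconds):
    """Encode as an unsigned 32 bit integer in network byte order."""
    return struct.pack("!I", seconds)  # struct.error if out of range


def decode_time(data):
    return struct.unpack("!I", data)[0]


def to_ntp_seconds(seconds):
    """Seconds since 1900, as used by NTP timestamps."""
    return seconds + NTP_EPOCH_OFFSET


def to_datetime(seconds):
    return EPOCH + timedelta(seconds=seconds)
```

+ With these helpers, to_datetime(2**32 - 1) falls in the year 2106
+ and to_datetime(2**31 - 1) falls in 2038, matching the wrap points
+ described above.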
+
+ Options should appear only once in each message (except for BNDUPD
+ and BNDACK messages where batching is used; see section 6.3 for
+ details). An option that appears twice is not concatenated, but
+ treated as an error.
+
+ Specific option values are described in section 12.
+
+ See section 13 for how to define additional options.
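+
+ A minimal sketch of a parser for this option format is shown here.
+ It is an illustration only: the function name and the allow_repeats
+ parameter (intended for batched BNDUPD and BNDACK messages) are
+ hypothetical, and option codes are returned as plain integers
+ without interpretation.

```python
import struct


def parse_options(payload, allow_repeats=False):
    """Split payload data into an ordered list of (code, value) pairs."""
    options, seen, pos = [], set(), 0
    while pos < len(payload):
        if pos + 4 > len(payload):
            raise ValueError("truncated option header")
        # Two byte option code, two byte length, network byte order.
        code, length = struct.unpack_from("!HH", payload, pos)
        pos += 4
        if pos + length > len(payload):
            raise ValueError("truncated option data")
        if code in seen and not allow_repeats:
            # A repeated option is an error, not a concatenation.
            raise ValueError("option %d appears twice" % code)
        seen.add(code)
        options.append((code, payload[pos:pos + length]))
        pos += length
    return options
```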
+
+6.3. Batching multiple binding update transactions in one BNDUPD mes-
+sage
+
+ Implementations of this protocol MAY send multiple binding update
+ transactions in one BNDUPD message, where a binding update transac-
+ tion is defined as the set of options which are associated with the
+ update of a single IP address. All implementations of this protocol
+ MUST be prepared to receive BNDUPD messages which contain multiple
+ binding update transactions and respond correctly to them, including
+ replying with a BNDACK message which contains status for the multiple
+ binding update transactions contained in the BNDUPD message.
+
+ In the discussion of sending and receiving BNDUPD messages in section
+ 7.1 and BNDACK messages in section 7.2, each BNDUPD message and
+ BNDACK message is assumed to contain a single binding update transac-
+ tion in order to reduce the complexity of the discussions in section
+ 7.
+
+ Multiple binding update transactions MAY be batched together in one
+ BNDUPD protocol message with the data sets for the individual tran-
+ sactions delimited by the assigned-IP-address option, which MUST
+ appear first in the option set for each transaction. Ordering of
+ options between the assigned-IP-address options is not significant.
+ This is illustrated in the following schematic representation:
+
+
+ Non-IP Address/Non-client specific options first
+ assigned-IP-address option for the first IP address
+ Options pertaining to first address, including at least the
+ binding-status option and others as required.
+ assigned-IP-address option for the second IP address
+ Options pertaining to second address, including at least the
+ binding-status option and others as required.
+ ...
+ Trailing options (message digest).
+
+
+ There MUST be a one-to-one correspondence between BNDUPD and BNDACK
+ messages, and every BNDACK message MUST contain status for all of the
+ binding update transactions in the corresponding BNDUPD message.
+
+ The BNDACK message corresponding to a BNDUPD message MUST contain
+ assigned-IP-address options for all of the binding update transac-
+ tions in the BNDUPD message. Thus, every BNDACK message contains
+ exactly the same assigned-IP-address options as does its correspond-
+ ing BNDUPD message. The order of the assigned-IP-address options
+ MAY, however, be different. Here is a schematic representation of a
+ BNDACK:
+
+
+ Non-IP Address/Non-client specific options first
+ assigned-IP-address option for the first IP address
+ If rejected, reject-reason option and message option.
+ assigned-IP-address option for the second IP address
+ If rejected, reject-reason option and message option.
+ ...
+ Trailing options (message digest).
+
+
+ In case the server chooses to reject some or all of the IP address
+ binding information in a BNDUPD message in a BNDACK reply, the BNDACK
+ message MUST contain a reject-reason option following every failed
+ assigned-IP-address option in order to indicate that the binding
+ update transaction for that IP address was not accepted and why. As
+ with a BNDACK message containing a single binding update transaction,
+ an assigned-IP-address option without any associated reject-reason
+ option indicates a successful binding update transaction.
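+
+ The batching rules can be sketched as follows. The sketch assumes
+ that options have already been decoded into (name, value) pairs and
+ that the message digest is the only trailing option; the name
+ "message-digest", the function names, and the reject-reason values
+ passed in by the caller are all hypothetical.

```python
TRAILING_OPTIONS = {"message-digest"}


def split_transactions(options):
    """Split an ordered option list into binding update transactions."""
    leading, transactions, trailing = [], [], []
    for name, value in options:
        if name in TRAILING_OPTIONS:
            trailing.append((name, value))
        elif name == "assigned-IP-address":
            # This option MUST come first and so opens a transaction.
            transactions.append([(name, value)])
        elif transactions:
            transactions[-1].append((name, value))
        else:
            leading.append((name, value))
    return leading, transactions, trailing


def bndack_options(transactions, rejections):
    """Per-address BNDACK options; rejections maps ip -> (reason, text)."""
    out = []
    for transaction in transactions:
        ip = transaction[0][1]
        out.append(("assigned-IP-address", ip))
        if ip in rejections:
            reason, text = rejections[ip]
            out.append(("reject-reason", reason))
            out.append(("message", text))
        # No reject-reason means the transaction was accepted.
    return out
```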
+
+7. Protocol Messages
+
+ This section contains the detailed definition of the protocol mes-
+ sages, including the information to include when sending the message,
+ as well as the actions to take upon receiving the message. The mes-
+ sage type for each message appears as [n] in the heading for the mes-
+ sage (see section 6.1).
+
+7.1. BNDUPD message [3]
+
+ The binding update (BNDUPD) message is used to send the binding data-
+ base changes (known as binding update transactions) to the partner
+ server, and the partner server responds with a binding acknowledge-
+ ment (BNDACK) message when it has successfully committed those
+ changes to its own stable storage.
+
+ The rest of the failover protocol exists to determine whether the
+ partner server is able to communicate or not, and to enable the
+ partners to exchange BNDUPD/BNDACK messages in order to keep their
+ binding databases in stable storage synchronized.
+
+ The rest of this section is written as though every BNDUPD message
+ contains only a single binding update transaction in order to reduce
+ the complexity of the discussion. See section 6.3 for information on
+ how to create and process BNDUPD and BNDACK messages which contain
+ multiple binding update transactions. Note that while a server MAY
+ generate BNDUPD messages with multiple binding update transactions,
+ every server MUST be able to process a BNDUPD message which contains
+ multiple binding update transactions and generate the corresponding
+ BNDACK messages with status for multiple binding update transactions.
+
+ The following table summarizes the various options for the BNDUPD
+ message.
+
+                            binding-status                      BACKUP
+                                                                RESET
+                                                                ABANDONED
+ Option                     ACTIVE      EXPIRED     RELEASED    FREE
+ ------                     ------      -------     --------    ----
+ assigned-IP-address (3)    MUST        MUST        MUST        MUST
+ IP-flags                   MUST(4)     MUST(4)     MUST(4)     MUST(4)
+ binding-status             MUST        MUST        MUST        MUST
+ client-identifier          MAY         MAY         MAY         MAY(2)
+ client-hardware-address    MUST        MUST        MUST        MAY(2)
+ lease-expiration-time      MUST        MUST NOT    MUST NOT    MUST NOT
+ potential-expiration-time  MUST        MUST NOT    MUST NOT    MUST NOT
+ start-time-of-state        SHOULD      SHOULD      SHOULD      SHOULD
+ client-last-trans.-time    MUST        SHOULD      MUST        MAY
+ DDNS(1)                    SHOULD      SHOULD      SHOULD      SHOULD
+ client-request-options     SHOULD      SHOULD NOT  SHOULD      SHOULD NOT
+ client-reply-options       SHOULD      SHOULD NOT  SHOULD NOT  SHOULD NOT
+
+ (1) MUST if server is performing dynamic DNS for this IP address, else
+ MUST NOT.
+ (2) MUST NOT if binding-status is ABANDONED.
+ (3) assigned-IP-address MUST be the first option for an IP address
+ (4) IP-flags option MUST appear if any flags are non-zero, else it
+ MAY appear.
+
+ Table 7.1-1: Options used in a BNDUPD message
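+
+ As an illustration, the unconditional MUST and MUST NOT entries of
+ Table 7.1-1 can be checked mechanically. The sketch below leaves out
+ the conditional entries (IP-flags and DDNS) as well as all SHOULD
+ and MAY entries; the function name and the spelled-out option name
+ "client-last-transaction-time" are assumptions.

```python
ALWAYS = {"assigned-IP-address", "binding-status"}
EXPIRY = {"lease-expiration-time", "potential-expiration-time"}
CLIENT = {"client-identifier", "client-hardware-address"}

MUST = {
    "ACTIVE": ALWAYS | EXPIRY | {"client-hardware-address",
                                 "client-last-transaction-time"},
    "EXPIRED": ALWAYS | {"client-hardware-address"},
    "RELEASED": ALWAYS | {"client-hardware-address",
                          "client-last-transaction-time"},
}


def check_transaction(binding_status, present):
    """Return a list of violations for one binding update transaction."""
    present = set(present)
    # FREE, BACKUP, RESET and ABANDONED share the last table column.
    must = MUST.get(binding_status, ALWAYS)
    must_not = set() if binding_status == "ACTIVE" else set(EXPIRY)
    if binding_status == "ABANDONED":
        must_not |= CLIENT  # note (2) of the table
    problems = ["missing " + name for name in sorted(must - present)]
    problems += ["forbidden " + name for name in sorted(must_not & present)]
    return problems
```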
+
+
+7.1.1. Sending the BNDUPD message
+
+ A BNDUPD message SHOULD be generated whenever any binding changes. A
+ change might be in the binding-status, the lease-expiration-time, or
+ even just the last-transaction-time. In general, any time a DHCP
+ server writes its stable storage, a BNDUPD message SHOULD be gen-
+ erated. This will often be the result of the processing of a DHCP
+ client request, but it might also be the result of a successful
+ dynamic DNS update operation. Stable storage updates due to BNDUPD
+ or BNDACK messages SHOULD NOT result in additional BNDUPD messages.
+
+ BNDUPD (and BNDACK) messages refer to the binding-status of the IP
+ address, and this protocol defines a series of binding-statuses, dis-
+ cussed in more detail below. Some servers may not support all of
+ these binding-statuses, and so in those cases they will not be sent.
+ Upon receipt of a BNDUPD message which contains an unsupported
+ binding-status, a reasonable interpretation should be made (see sec-
+ tion 5.10).
+
+ All BNDUPD messages MUST contain the IP address of the binding update
+ transaction in the assigned-IP-address option.
+
+ All binding update transactions MUST contain an IP-flags option if
+ the value of any of the flags would be non-zero. The IP-flags option
+ MAY be omitted if all of the flags that it contains are zero. The
+ IP-flags option contains a flag which indicates if the IP address is
+ currently reserved on the server sending the BNDUPD message. It also
+ contains a flag which indicates that the lease is associated with a
+ client that used the BOOTP protocol (as opposed to the DHCP protocol)
+ to interact with the DHCP server.
+
+ All binding update transactions contain a binding-status option, and
+ it will have one of the values found in section 5.11. Client infor-
+ mation consists of client-hardware-address and possibly a client-
+ identifier, and is explained in more detail later in this section.
+ The following table indicates whether client information should or
+ should not appear with each binding-status in a binding update tran-
+ saction:
+
+
+ binding-status    includes client information
+ ------------------------------------------------
+ ACTIVE            MUST
+ EXPIRED           SHOULD
+ RELEASED          SHOULD
+ FREE              MAY
+ ABANDONED         MUST NOT
+ RESET             MAY
+ BACKUP            MAY
+
+ Table 7.1.1-1: Client information required by various
+ binding-status values.
+
+
+ The ACTIVE binding-status requires some options to indicate the
+ length of the binding:
+
+
+ o lease-expiration-time
+
+ The lease-expiration-time option MUST appear, and be set to the
+ expiration time most recently ACKed to the DHCP client. Note
+ that the time ACKed to a DHCP client is a lease duration in
+ seconds, while the lease-expiration-time option in a BNDUPD mes-
+ sage is an absolute time value.
+
+ o potential-expiration-time
+
+ The potential-expiration-time option MUST appear, and be set to
+ a value beyond that of the lease-expiration time. This is the
+ value that is ACKed by the BNDACK message. A server sending a
+ BNDUPD message MUST be able to recover the potential-
+ expiration-time sent in every BNDUPD, not just those that
+ receive a corresponding BNDACK, in order to be able to protect
+ against possible duplicate allocation of IP addresses after
+ transitioning to PARTNER-DOWN state. See section 5.2.1 for
+ details as to why the potential-expiration-time exists and
+ guidelines for how to decide on the value.
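+
+ The relationship between the two options can be sketched as
+ follows. This is a hedged illustration, not a rule for choosing the
+ values: the function name active_bndupd_times and the extra_seconds
+ parameter (how far the potential-expiration-time is placed beyond
+ the lease-expiration-time) are assumptions; section 5.2.1 gives the
+ actual guidelines.
+

```python
def active_bndupd_times(ack_time, lease_duration, extra_seconds):
    """Return (lease_expiration_time, potential_expiration_time).

    ack_time        absolute time (seconds) at which the client was ACKed
    lease_duration  lease duration in seconds, as ACKed to the DHCP client
    extra_seconds   assumed margin; must be positive so that the
                    potential-expiration-time lies beyond the
                    lease-expiration-time
    """
    if extra_seconds <= 0:
        raise ValueError("potential-expiration-time must be beyond "
                         "lease-expiration-time")
    # The client sees a duration; the BNDUPD carries an absolute time.
    lease_expiration_time = ack_time + lease_duration
    potential_expiration_time = lease_expiration_time + extra_seconds
    return lease_expiration_time, potential_expiration_time
```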
+
+ The following option information applies to all BNDUPD messages,
+ regardless of the value of the binding-status, unless otherwise
+ noted.
+
+ o Identifying the client
+
+ For many of the binding-status values a client MUST appear while
+ for others a client MAY appear, and for some a client MUST NOT
+ appear.
+
+ A client is identified in a BNDUPD message by at least one and pos-
+ sibly two options. The client-hardware-address option MUST appear
+ any time that a client appears in a BNDUPD message, and contains
+ the hardware type and chaddr information from the DHCP request
+ packet. A failover client-identifier option MUST appear any time
+ that a client appears in a BNDUPD message if and only if that
+ client used a DHCP client-identifier option when communicating with
+ the DHCP server. See sections 12.5 and 12.4 for details of how to
+ construct these two options from a DHCP request packet.
+
+ o start-time-of-state
+
+ The start-time-of-state SHOULD appear. It is set to the time at
+ which this IP address first took on the state that corresponds to
+ the current value of binding-status.
+
+ o client-last-transaction-time
+
+ The client-last-transaction-time value SHOULD appear. This is the
+ time at which this DHCP server last received a packet from the DHCP
+ client referenced by the client-identifier or client-hardware-address
+ that was associated with the IP address referenced by the
+ assigned-IP-address.
+
+ o DDNS
+
+ If the DHCP server is performing dynamic DNS operations on behalf
+ of the DHCP client represented by the client-identifier or client-
+ hardware-address, then it should include a DDNS option containing
+ the domain name and status of any dynamic DNS operations enabled.
+
+ o client-request-options
+
+ If the BNDUPD was triggered by a request from a DHCP client (typi-
+ cally those with binding-status of ACTIVE and RELEASED), then the
+ server SHOULD include options of interest to a failover partner
+ from the client's request packet in the client-request-options for
+ transmission to its partner (see section 12.8).
+
+ A server sending a BNDUPD SHOULD remember the "interesting" options
+ or the information that would appear in an "interesting" option for
+ transmission at a time when the BNDUPD is not closely associated
+ with a DHCP client request.
+
+ A server SHOULD send the following "interesting" options. It MAY
+ send any DHCP client options. As new options are defined, the RFC
+ defining these options SHOULD include information that they are
+ "interesting to failover servers" if they should be sent as part of
+ a BNDUPD.
+
+
+   option    option
+   number    name
+   -----------------------------------------
+
+   12        host-name
+   81        client-FQDN [FQDN]
+   82        relay-agent-information [RFC 3046]
+   77        user-class [RFC 3004]
+   60        vendor-class-identifier
+   118       subnet-selection [RFC 3011]
+
+ Table 7.1.1-2: Options which SHOULD be sent in
+ the client-request-options option in a BNDUPD message.
+
+
+ o client-reply-options
+
+ If the BNDUPD was triggered by a request from a DHCP client (typi-
+ cally those with binding-status of ACTIVE and RELEASED), then the
+ server SHOULD include options of interest to a failover partner
+ from the server's DHCP reply packet in the client-reply-options for
+ transmission to its partner (see section 12.7).
+
+ A server sending a BNDUPD SHOULD remember the "interesting" options
+ or the information that would appear in an "interesting" option for
+ transmission at a time when the BNDUPD is not closely associated
+ with a DHCP client request.
+
+ A server SHOULD send the following "interesting" options. It MAY
+ send any DHCP client options. As new options are defined, the RFC
+ defining these options SHOULD include information that they are
+ "interesting to failover servers" if they should be sent as part of
+ a BNDUPD.
+
+
+   option    option
+   number    name
+   -----------------------------------------
+
+   58        renewal-time
+   59        rebinding-time
+
+ Table 7.1.1-3: Options which SHOULD be sent in
+ the client-reply-options option in a BNDUPD message.
+
+
+ The BNDUPD message SHOULD be sent as soon as possible after the DHCP
+ client has received a response and the lease bindings database has
+ been written to stable storage.
+
+7.1.2. Receiving the BNDUPD message
+
+ When a server receives a BNDUPD message, it needs to decide how to
+ process the binding update transaction it contains and whether that
+ transaction represents a conflict of any sort. The conflict resolu-
+ tion process MUST be used on the receipt of every BNDUPD message, not
+ just those that are received while in POTENTIAL-CONFLICT state, in
+ order to increase the robustness of the protocol.
+
+ There are three sorts of conflicts:
+
+ o Two clients, one IP address conflict
+
+ This is the duplicate IP address allocation conflict. There are
+ two different clients each allocated the same address. See sec-
+ tion 7.1.3 for how to resolve this conflict.
+
+ o Two IP addresses, one client conflict
+
+ This conflict exists when a client on one server is associated
+ with one IP address, and on the other server with a different
+ IP address in the same or a related subnet. This does not refer
+ to the case where a single client has addresses in multiple
+ different subnets or administrative domains, but rather the case
+ where on the same subnet the client has a lease on one IP
+ address in one server and on a different IP address on the other
+ server.
+
+ This conflict may or may not be a problem for a given DHCP
+ server implementation. In the event that a DHCP server requires
+ that a DHCP client have only one outstanding lease for an IP
+ address on one subnet, this conflict should be resolved by
+ accepting the lease information which has the latest client-
+ last-transaction-time.
+
+ o binding-status conflict
+
+ This is the normal conflict, where one server is updating the other
+ with newer information. See section 7.1.3 for details of how to
+ resolve these conflicts.
+
+7.1.3. Deciding whether to accept the binding update transaction in a
+BNDUPD message
+
+ When analyzing a BNDUPD message from a partner server, if there is
+ insufficient information in the BNDUPD to process it, then reject the
+ BNDUPD with reject-reason 3: "Missing binding information".
+
+ If the IP address in the BNDUPD is not an IP address associated with
+ the failover endpoint which received the BNDUPD message, then reject
+ it with reject-reason 1: "Illegal IP address (not part of any address
+ pool)".
+
+ IP addresses undergo binding status changes for several reasons,
+ including receipt and processing of DHCP client requests, administra-
+ tive inputs and receipt of BNDUPD messages. Every DHCP server needs
+ to respond to DHCP client requests and administrative inputs with
+ changes to its internal record of the binding-status of an IP
+ address, and this response is not in the scope of the failover proto-
+ col. However, the receipt of BNDUPD messages implies at least a pos-
+ sible change of the binding-status for an IP address, and must be
+ discussed here. See section 7.1.2 for general actions to take upon
+ receipt of a BNDUPD message.
+
+ Every BNDUPD message SHOULD contain a client-last-transaction-time
+ option, which MUST, if it appears, be the time that the server last
+ interacted with the DHCP client. It MUST NOT be, for instance, the
+ time that the lease on an IP address expired. If there has been no
+ interaction with the DHCP client in question (or there is no DHCP
+ client presently associated with this IP address), then there will be
+ no client-last-transaction-time option in the BNDUPD message.
+
+ The list in Figure 7.1.3-1 is indexed by the binding-status that a
+ server receives in a BNDUPD message. In many cases, the
+ binding-status of an IP address within the receiving server's data
+ storage will have an effect upon the checks performed prior to
+ accepting the new binding-status in a BNDUPD message.
+
+ In Figure 7.1.3-1, to "accept" a BNDUPD means to update the server's
+ bindings database with the information contained in the BNDUPD and
+ once that update is complete, send a BNDACK message corresponding to
+ the BNDUPD message. To "reject" a BNDUPD means to respond to the
+ BNDUPD with a BNDACK with a reject-reason option included.
+
+ When interpreting the information in the following table (Figure
+ 7.1.3-1), for those rules that are listed with "time" -- if a BNDUPD
+ doesn't have a client-last-transaction-time value, then it MUST NOT
+ be considered later than the client-last-transaction-time in the
+ receiving server's binding. If the BNDUPD contains a client-last-
+ transaction-time value and the receiving server's binding does not,
+ then the client-last-transaction-time value in the BNDUPD MUST be
+ considered later than the server's.
+
+
+                     binding-status in received BNDUPD
+   binding-status
+   in receiving                                   FREE       RESET
+   server        ACTIVE     EXPIRED    RELEASED   BACKUP     ABANDONED
+
+   ACTIVE        accept(5)  time(2)    time(1)    time(2)    accept
+   EXPIRED       time(1)    accept     accept     accept     accept
+   RELEASED      time(1)    time(1)    accept     accept     accept
+   FREE/BACKUP   accept     accept     accept     accept     accept
+   RESET         time(3)    accept     accept     accept     accept
+   ABANDONED     reject(4)  reject(4)  reject(4)  reject(4)  accept
+
+ time(1): If the client-last-transaction-time in the BNDUPD
+ is later than the client-last-transaction-time in the
+ receiving server's binding, accept it, else reject it.
+
+ time(2): If the current time is later than the receiving
+ server's lease-expiration-time, accept it, else reject it.
+
+ time(3): If the client-last-transaction-time in the BNDUPD
+ is later than the start-time-of-state in the receiving server's
+ binding, accept it, else reject it.
+
+ (1,2,3): If rejecting, use reject reason 15: "Outdated binding
+ information".
+
+ (4): Use reject reason 16: "Less critical binding information".
+
+ (5): If the clients in a BNDUPD message and in a receiving
+ server's binding differ, then if the receiving server is a
+ secondary accept it, else reject it with a reject reason of 2:
+ "Fatal conflict exists: address in use by other client".
+
+ Figure 7.1.3-1: Accepting BNDUPD messages
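+
+ The decision procedure of Figure 7.1.3-1, together with the rules
+ for a missing client-last-transaction-time, can be sketched in code.
+ This is an illustrative reading of the figure rather than a
+ normative implementation: the Binding class, the decide function and
+ the (accepted, reject-reason) return convention are assumptions made
+ for this example, and the reserved-address check (reject reason 19)
+ is not covered.
+

```python
from dataclasses import dataclass
from typing import Optional

ACCEPT = (True, None)


@dataclass
class Binding:
    """Hypothetical record for one IP address (field names are assumed)."""
    status: str
    client: Optional[str] = None
    cltt: Optional[int] = None        # client-last-transaction-time
    lease_expiration_time: int = 0
    start_time_of_state: int = 0


# Column of Figure 7.1.3-1 selected by the binding-status in the BNDUPD.
COLUMN = {"ACTIVE": 0, "EXPIRED": 1, "RELEASED": 2,
          "FREE": 3, "BACKUP": 3, "RESET": 4, "ABANDONED": 4}

# Rows of Figure 7.1.3-1, keyed by the receiving server's binding-status.
ALL_ACCEPT = ("accept",) * 5
RULES = {
    "ACTIVE":    ("accept5", "time2", "time1", "time2", "accept"),
    "EXPIRED":   ("time1", "accept", "accept", "accept", "accept"),
    "RELEASED":  ("time1", "time1", "accept", "accept", "accept"),
    "FREE":      ALL_ACCEPT,
    "BACKUP":    ALL_ACCEPT,
    "RESET":     ("time3", "accept", "accept", "accept", "accept"),
    "ABANDONED": ("reject4",) * 4 + ("accept",),
}


def cltt_is_later(update_cltt, local_cltt):
    """Apply the rules for a missing client-last-transaction-time."""
    if update_cltt is None:
        return False      # a missing value is never considered later
    if local_cltt is None:
        return True       # a present value beats a missing local one
    return update_cltt > local_cltt


def decide(local, update, now, receiver_is_secondary):
    """Return (accepted, reject_reason) for one binding update."""
    rule = RULES[local.status][COLUMN[update.status]]
    if rule == "accept":
        return ACCEPT
    if rule == "accept5":
        # Note (5): differing clients are fatal unless we are secondary.
        if local.client != update.client and not receiver_is_secondary:
            return (False, 2)
        return ACCEPT
    if rule == "reject4":
        return (False, 16)
    if rule == "time1":
        ok = cltt_is_later(update.cltt, local.cltt)
    elif rule == "time2":
        ok = now > local.lease_expiration_time
    else:  # time3
        ok = (update.cltt is not None
              and update.cltt > local.start_time_of_state)
    return ACCEPT if ok else (False, 15)
```

+ For example, with a local ACTIVE binding whose lease expires at time
+ 500, a received EXPIRED update is rejected with reason 15 at time
+ 400 and accepted at time 600.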
+
+
+
+ If the IP address in the BNDUPD message has the R flag set in the
+ IP-flags option, indicating it is a reserved IP address, and if the
+ binding-status in the BNDUPD is BACKUP, then if the receiving server
+ does not show the IP address as reserved, the receiving server SHOULD
+ reject the BNDUPD using reject reason 19: "IP not reserved on this
+ server".
+
+7.1.4. Accepting the BNDUPD message
+
+ When accepting a BNDUPD message, the information contained in the
+ client-request-options and client-reply-options SHOULD be examined
+ for any information of interest to this server. For instance, a
+ server which wished to detect changes in client specified host names
+ might want to examine and save information from the host-name or
+ client-FQDN options. Servers which expect to utilize information
+ from the relay-agent-information option SHOULD store this informa-
+ tion.
+
+7.1.5. Time values related to the BNDUPD message
+
+ There are four time values that MAY be sent in a BNDUPD message.
+
+ o lease-expiration-time
+
+ The time that the server gave to the client, i.e., the time that
+ the server believes that the client's lease will expire.
+
+ o potential-expiration-time
+
+ The time that the server wants to be sure its partner waits
+ (added to the MCLT) before assuming that this lease has expired.
+ Typically some time beyond the desired client lease time.
+
+ o client-last-transaction-time
+
+ The time that the client last interacted with this server.
+
+ o start-time-of-state
+
+ The time at which the binding first went into the current state.
+
+ As discussed in section 5.2, each server knows what its partner has
+ ACKed with regard to potential-expiration time. In addition, each
+ server needs to remember what it has told its partner as the
+ potential-expiration-time. Moreover, each server must remember what
+ it has acked to the *other* server as the most recent potential-
+ expiration-time from that server.
+
+ Remember that each server sends a potential-expiration-time and
+ receives an ACK for that as well as receiving a potential-
+ expiration-time and needing to remember what it has acked for that.
+
+ While they don't have to be named in any particular way, the times
+ that a server needs to remember for every IP address in order to
+ implement the failover protocol are:
+
+ o lease-expiration-time
+
+ The time that a server gave to the DHCP client. A DHCP server
+ needs to remember this time already, just to be a DHCP server.
+ A server SHOULD update this time with the lease-expiration time
+ received from a partner in a BNDUPD if the received lease-
+ expiration time is later than the lease-expiration time recorded
+ for this binding.
+
+ o sent-potential-expiration-time
+
+ The latest time sent to the partner for a potential-expiration-
+ time.
+
+ o acked-potential-expiration-time
+
+ The latest time that the partner has acked for a potential
+ expiration time. Typically the same as sent-potential-
+ expiration-time if there is not a BNDUPD outstanding.
+
+ o received-potential-expiration-time
+
+ The latest time that this server has ever received as a
+ potential-expiration-time from its partner in a BNDUPD that this
+ server ACKed.
+
+ So, a server has to remember two additional times concerning BNDUPD
+ messages that it has initiated, and one additional time concerning
+ BNDUPD messages that it has received. How are these times used?
+
+ First, let's look at the time that a DHCP server can offer to a DHCP
+ client. A server can offer to a DHCP client a time that is no longer
+ than the MCLT beyond the max( received-potential-expiration-time,
+ acked-potential-expiration-time). One might think that the server
+ should be able to offer only the MCLT beyond the acked-potential-
+ expiration-time, and while that is certainly simple and easy to
+ understand, it has negative consequences in actual operation.
+
+ To illustrate this, in the simple case where the primary updates the
+ secondary for a while and then fails, if the secondary can then renew
+ the client for only the MCLT beyond the acked-potential-expiration-
+ time, then the secondary will only be able to renew the client for
+ the MCLT, because the secondary has never sent a BNDUPD packet to the
+ primary concerning this IP address and client, and so its acked-
+ potential-expiration-time is zero.
+
+ However, since the secondary is allowed to renew the client with the
+ MCLT beyond the max( received-potential-expiration-time, acked-
+ potential-expiration-time), then the secondary can usually renew the
+ client for the full lease period, at least for the first renew it
+ sees from the client, since the received-potential-expiration-time is
+ generally longer than the client's desired lease interval. The
+ difference in renew times could make a big difference in server load
+ on the secondary in this case.
+
+ What are the consequences of allowing a server to offer a DHCP client
+ a lease term of the MCLT beyond the max( received-potential-
+ expiration-time, acked-potential-expiration-time)? The consequences
+ appear whenever a server enters PARTNER-DOWN state, and affect how
+ long that server has to wait before reallocating expired leases.
+ With this approach, when a server goes into PARTNER-DOWN state, it
+ must wait the MCLT beyond the max( lease-expiration-time, sent-
+ potential-expiration-time, acked-potential-expiration-time,
+ received-potential-expiration-time ) for each IP address before it
+ can reallocate that IP address to another DHCP client. One might
+ normally think that it needed to wait only the MCLT beyond the max(
+ lease-expiration-time, received-potential-expiration-time ), i.e.,
+ beyond what it has told the client and what it has explicitly acked
+ to the other server. But with the optimization discussed above --
+ where either server can offer the DHCP client a lease term of the
+ MCLT beyond the max( received-potential-expiration-time, acked-
+ potential-expiration-time), then the additional times sent-
+ potential-expiration-time and acked-potential-expiration-time must be
+ added into the expression, since the partner could have used those
+ times as part of its own lease time calculation.
+
+ Thus this optimization may require a longer waiting time when enter-
+ ing PARTNER-DOWN state, but will generally allow servers to operate
+ considerably more effectively when running in COMMUNICATIONS-
+ INTERRUPTED state.
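+
+ The two expressions discussed above can be written out directly.
+ The sketch below is illustrative only; the function names are
+ assumptions, all times are absolute times in seconds, and a time
+ that has never been set is represented as zero, as in the example
+ of the secondary that has never sent a BNDUPD.
+

```python
def latest_offerable_expiration(mclt, received_pet, acked_pet):
    """Latest lease expiration a server may offer to a DHCP client.

    received_pet  received-potential-expiration-time
    acked_pet     acked-potential-expiration-time
    """
    return max(received_pet, acked_pet) + mclt


def partner_down_reallocation_time(mclt, lease_expiration_time,
                                   sent_pet, acked_pet, received_pet):
    """Earliest time an address may be reallocated in PARTNER-DOWN.

    All four remembered times enter the expression because the partner
    could have used any of them in its own lease time calculation.
    """
    return max(lease_expiration_time, sent_pet,
               acked_pet, received_pet) + mclt
```

+ With an MCLT of 3600 seconds, a received-potential-expiration-time
+ of 10000 and an acked-potential-expiration-time of zero, the
+ secondary may offer a lease that expires as late as 13600, rather
+ than only the MCLT beyond zero.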
+
+7.2. BNDACK message [4]
+
+ A server sends a binding acknowledgement (BNDACK) message when it has
+ processed a BNDUPD message and after it has successfully committed to
+ stable storage any binding database changes made as a result of
+ processing the BNDUPD message. A BNDACK message is used either to
+ accept or to reject a BNDUPD message. A BNDACK message which contains
+ a reject-reason option is a rejection of the corresponding BNDUPD
+ message.
+
+ In order to reduce the complexity of the discussion, the rest of this
+ section is written as though every BNDUPD message contains only a
+ single binding update transaction and thus every corresponding BNDACK
+ message would also contain reply information about only a single
+ binding update transaction. See section 6.3 for information on how
+ to create and process BNDUPD and BNDACK messages which contain multi-
+ ple binding update transactions.
+
+ Note that while a server MAY generate BNDUPD messages with multiple
+ binding update transactions, every server MUST be able to process a
+ BNDUPD message which contains multiple binding update transactions
+ and generate the corresponding BNDACK messages with status for multi-
+ ple binding update transactions. If a server does not ever create
+ BNDUPD messages which contain multiple binding update transactions,
+ then it does not need to be able to process a received BNDACK message
+ with multiple binding update transactions. However, all servers MUST
+ be able to create BNDACK messages which deal with multiple binding
+ update transactions received in a BNDUPD message.
+
+ Every BNDUPD message that is received by a server MUST be responded
+ to with a corresponding BNDACK message. The receiving server SHOULD
+ respond quickly to every BNDUPD message but it MAY choose to respond
+ preferentially to DHCP client requests instead of BNDUPD messages,
+ since there is no absolute time period within which a BNDACK must be
+ sent in response to a BNDUPD message, while DHCP clients frequently
+ have strict time constraints.
+
+ A BNDACK message can only be sent in response to a BNDUPD message
+ using the same TCP connection from which the BNDUPD message was
+ received, since the XID's in BNDUPD messages are guaranteed unique
+ only during the life of a single TCP connection. When a connection
+ to a partner server goes down, a server with unprocessed BNDUPD mes-
+ sages MAY simply drop all of those messages, since it can be sure
+ that the partner will resend them when they are next in communica-
+ tions (albeit with a different XID), or it MAY instead choose to pro-
+ cess those BNDUPD messages, but it MUST NOT send any BNDACK messages
+ in response.
+
+ The following table summarizes the options for the BNDACK message.
+
+
+   Option                       accept        reject
+   ------                       ------        ------
+   assigned-IP-address (1)      MUST          MUST
+   IP-flags                     SHOULD NOT    SHOULD NOT
+   binding-status               SHOULD NOT    SHOULD NOT
+   client-identifier            SHOULD NOT    SHOULD NOT
+   client-hardware-address      SHOULD NOT    SHOULD NOT
+   reject-reason                SHOULD NOT    MUST
+   message                      SHOULD NOT    SHOULD
+   lease-expiration-time        SHOULD NOT    SHOULD NOT
+   potential-expiration-time    SHOULD NOT    SHOULD NOT
+   start-time-of-state          SHOULD NOT    SHOULD NOT
+   client-last-trans.-time      SHOULD NOT    SHOULD NOT
+   DDNS(1)                      SHOULD NOT    SHOULD NOT
+
+ (1) assigned-IP-address MUST be the first option for an IP address
+
+ Table 7.2-1: Options used in a BNDACK message
+
+
+7.2.1. Sending the BNDACK message
+
+ The BNDACK message MUST contain the same xid as the corresponding
+ BNDUPD message.
+
+ The assigned-IP-address option from the BNDUPD message MUST be
+ included in the BNDACK message. Any additional options from the
+ BNDUPD message SHOULD NOT appear in the BNDACK message. Note that
+ any information sent in options (e.g., a later lease-expiration time)
+ in the BNDACK message MUST NOT be assumed to necessarily be recorded
+ in the stable storage of the server who receives the BNDACK message
+ because there is no corresponding ACK of the BNDACK message. Any
+ information that SHOULD be recorded in the partner server's stable
+ storage MUST be transmitted in a subsequent BNDUPD.
+
+ If the server is accepting the BNDUPD, the BNDACK message includes
+ only the assigned-IP-address option. If the server is rejecting the
+ BNDUPD, the additional option reject-reason MUST appear in the BNDACK
+ message, and the message option SHOULD appear in this case containing
+ a human-readable error message describing in some detail the reason
+ for the rejection of the BNDUPD message.
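+
+ The option rules for an accepting and a rejecting BNDACK can be
+ sketched as follows. The dictionary layout and the function name
+ build_bndack are assumptions made for illustration and do not
+ describe the wire format; the message type value 4 is taken from
+ the heading of section 7.2.
+

```python
BNDACK = 4  # message type, from the heading "BNDACK message [4]"


def build_bndack(xid, assigned_ip, reject_reason=None, message=None):
    """Build a BNDACK for one binding update transaction.

    xid must be the xid of the corresponding BNDUPD.  An accept carries
    only assigned-IP-address; a reject adds reject-reason (MUST) and a
    human-readable message (SHOULD).
    """
    # assigned-IP-address is always the first option for an IP address.
    options = [("assigned-IP-address", assigned_ip)]
    if reject_reason is not None:
        options.append(("reject-reason", reject_reason))
        if message:
            options.append(("message", message))
    return {"type": BNDACK, "xid": xid, "options": options}
```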
+
+ If the server rejects the BNDUPD message with a BNDACK and a reject-
+ reason option, it may be because the server believes that it has
+ binding information that the other server should know. A server
+ which is rejecting a BNDUPD may initiate a BNDUPD of its own in order
+ to update its partner with what it believes is better binding infor-
+ mation, but it MUST ensure through some means that it will not end up
+ in a situation where each server is sending BNDUPD messages as fast
+ as possible because they can't agree on which server has better bind-
+ ing data. Placing a considerable delay on the initiation of a BNDUPD
+ message after sending a BNDACK with a reject-reason would be one way
+ to ensure this situation doesn't occur.
+
+7.2.2. Receiving the BNDACK message
+
+ When a server receives a BNDACK message, if it doesn't contain a
+ reject-reason option that means that the BNDUPD message was accepted,
+ and the server which sent the BNDUPD SHOULD update its stable storage
+ with the potential-expiration-time value sent in the BNDUPD message.
+
+ If the BNDACK message contains a reject-reason option, that means
+ that the BNDUPD was rejected. There SHOULD be a message option in
+ the BNDACK giving a text reason for the rejection, and the server
+ SHOULD log the message in some way. The server MUST NOT immediately
+ try to resend the BNDUPD message as there is no reason to believe the
+ partner won't reject it a second time. However a server MAY choose
+ to send another BNDUPD at some future time, for instance when the
+ server next processes an update request from its partner.
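+
+ A minimal sketch of this handling is shown below, assuming the
+ sender keeps, per binding, the acked-potential-expiration-time
+ described in section 7.1.5. The dictionary key, the logger name and
+ the function name handle_bndack are hypothetical; writing to stable
+ storage is represented only by updating the dictionary.
+

```python
import logging

log = logging.getLogger("failover")  # hypothetical logger name


def handle_bndack(binding, sent_pet, reject_reason=None, message=None):
    """Process a BNDACK for a BNDUPD that carried sent_pet.

    binding   dict holding "acked_potential_expiration_time"
    sent_pet  potential-expiration-time sent in the BNDUPD
    Returns True if the BNDUPD was accepted, False if it was rejected.
    """
    if reject_reason is None:
        # Accepted: record what the partner has now acknowledged.
        binding["acked_potential_expiration_time"] = max(
            binding["acked_potential_expiration_time"], sent_pet)
        return True
    # Rejected: log the reason; the caller must not resend immediately.
    log.warning("BNDUPD rejected, reason %s: %s", reject_reason, message)
    return False
```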
+
+7.3. UPDREQ message [9]
+
+ The update request (UPDREQ) message is used by one server to request
+ that its partner send it all of the binding database information that
+ it has not already seen. Since each server is required to keep
+ track at all times of the binding information the other server has
+ ACKed, one server can request transmission of all un-ACKed binding
+ database information held by the other server by using the UPDREQ
+ message.
+
+ The UPDREQ message is used whenever the sending server cannot proceed
+ before it has processed all previously un-ACKed binding update infor-
+ mation, since the UPDREQ message should yield a corresponding UPDDONE
+ message. The UPDDONE message is not sent until the server that sent
+ the UPDREQ message has responded to all of the BNDUPD messages gen-
+ erated by the UPDREQ message with BNDACK messages (they may either be
+ accepted or rejected by the BNDACK messages, but they MUST have been
+ responded to). Thus, the sender of the UPDREQ message can be sure
+ upon receipt of an UPDDONE message that it has received and committed
+ to stable storage all outstanding binding database updates.
+
+ See section 9, Failover Endpoint States, for the details of when the
+ UPDREQ message is sent.
+
+7.3.1. Sending the UPDREQ message
+
+ The UPDREQ message has no message specific options.
+
+7.3.2. Receiving the UPDREQ message
+
+ A server receiving an UPDREQ message MUST send all binding database
+ changes that have not yet been ACKed by the sending server. These
+ changes are sent as undistinguished BNDUPD messages.
+
+ However, the server which received and is processing the UPDREQ mes-
+ sage MUST track the BNDACK messages that correspond to the BNDUPD
+ messages triggered by the UPDREQ message and, when they are all
+ received, the server MUST send an UPDDONE message.
+
+ The server processing the UPDREQ message and sending BNDUPD messages
+ to its partner SHOULD only track the BNDUPD and BNDACK message pairs
+ for unACKed binding database changes that were present upon the
+ receipt of the UPDREQ message. A server which has received an UPDREQ
+ message SHOULD send BNDUPD messages for binding database changes that
+ occur after receipt of the UPDREQ message, but it SHOULD NOT include
+ those additional BNDUPD messages and their corresponding BNDACK mes-
+ sages in the accounting necessary to consider the UPDREQ complete and
+ subsequently send the UPDDONE message. If some additional binding
+ database changes end up becoming part of the set of BNDUPD messages
+ considered as part of the UPDREQ (due to whatever algorithm the
+ server uses to scan its bindings database for unacked changes) it
+ will probably not cause any difficulty, but a server MUST NOT attempt
+ to include all such later BNDUPD messages in the accounting for the
+ UPDREQ in order to be able to transmit an UPDDONE message.
+
+ When queuing up the BNDUPD messages for transmission to the sender of
+ the UPDREQ message, the server processing the UPDREQ message MUST
+ honor the value returned in the max-unacked-bndupd option in the CON-
+ NECT or CONNECTACK message that set up the connection with the send-
+ ing server. It MUST NOT send more BNDUPD messages without receiving
+ corresponding BNDACKs than the value returned in max-unacked-bndupd.
+ (See section 8 for more details.)
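+
+ The accounting described in this section can be sketched as a small
+ tracker. This is one possible arrangement, not the required one: the
+ class name UpdreqTracker is an assumption, and outstanding BNDUPD
+ messages are keyed by IP address for brevity, whereas an
+ implementation would more likely key them by the xid of each BNDUPD.
+

```python
from collections import deque


class UpdreqTracker:
    """Track the BNDUPD/BNDACK pairs triggered by one UPDREQ."""

    def __init__(self, updreq_xid, unacked_ips, max_unacked_bndupd):
        # The UPDDONE must reuse the xid of the UPDREQ.
        self.updreq_xid = updreq_xid
        # Snapshot of unACKed changes present when the UPDREQ arrived;
        # later changes are sent too but are not counted here.
        self.queue = deque(unacked_ips)
        self.in_flight = set()
        self.max_unacked = max_unacked_bndupd

    def next_batch(self):
        """Return the IPs whose BNDUPD may be sent now.

        Never lets the number of unacknowledged BNDUPD messages exceed
        the partner's max-unacked-bndupd value.
        """
        batch = []
        while self.queue and len(self.in_flight) < self.max_unacked:
            ip = self.queue.popleft()
            self.in_flight.add(ip)
            batch.append(ip)
        return batch

    def is_complete(self):
        """True when every tracked BNDUPD has been answered."""
        return not self.queue and not self.in_flight

    def on_bndack(self, ip):
        """Record a BNDACK (accept or reject); True means send UPDDONE."""
        if ip not in self.in_flight:
            return False  # not part of this UPDREQ's accounting
        self.in_flight.remove(ip)
        return self.is_complete()
```

+ A BNDACK that rejects a BNDUPD is counted in the same way as one
+ that accepts it, since only a response is required before the
+ UPDDONE message is sent.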
+
+7.4. UPDREQALL message [7]
+
+ The update request all (UPDREQALL) message is used by one server to
+ request that its partner send it all of the binding database informa-
+ tion. This message is used to allow one server to recover from a
+ failure of stable storage and to restore its binding database in its
+ entirety from the other server.
+
+ A server which sends an UPDREQALL message cannot proceed until all of
+ its binding update information is restored, and it knows that all of
+ that information is restored when an UPDDONE message is received.
+
+ See section 9, Failover Endpoint States, for the details of when
+ the UPDREQALL message is sent.
+
+ The UPDREQALL message has no message specific options.
+
+7.4.1. Sending the UPDREQALL message
+
+ The UPDREQALL is sent.
+
+7.4.2. Receiving the UPDREQALL message
+
+ A server receiving an UPDREQALL message MUST send all binding data-
+ base information to the sending server. See section 5.16 for details
+ of what might actually comprise "all binding database information".
+
+ A server receiving an UPDREQALL message MUST remember that such a
+ message has been received, and ensure that all binding information
+ extant at that point is sent to the partner prior to any UPDDONE
+ message being sent to that partner. One way to do this is to
+ remember the receipt of an UPDREQALL message and to treat every
+ subsequent UPDREQ message as an UPDREQALL message until it sends the
+ first UPDDONE message after receipt of the UPDREQALL message. This
+ requirement exists because communications may fail and become
+ re-established between the two servers, and the specific conditions
+ which provoked the UPDREQALL message may no longer exist even though
+ the UPDREQALL message may not yet have completed. See section 5.17
+ for information on a more efficient way to meet the above
+ requirement.
+
+ These changes are sent as undistinguished BNDUPD messages. Otherwise
+ the processing is the same as for the UPDREQ message. See section
+ 7.3.2 for details.
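+
+ The simple approach described above, in which the receipt of an
+ UPDREQALL message is remembered until the first UPDDONE message has
+ been sent, can be sketched as follows. The class and method names
+ are assumptions, and the flag would need to be kept where it
+ survives a loss of the connection to the partner.
+

```python
class UpdateRequestState:
    """Remember an UPDREQALL until the first UPDDONE has been sent."""

    def __init__(self):
        self.updreqall_pending = False

    def scope_for(self, message_type):
        """Return "all" or "unacked" for an UPDREQ or UPDREQALL."""
        if message_type == "UPDREQALL":
            self.updreqall_pending = True
        # While an UPDREQALL is pending, an UPDREQ is treated like one.
        return "all" if self.updreqall_pending else "unacked"

    def on_upddone_sent(self):
        """Clear the flag once an UPDDONE message has been sent."""
        self.updreqall_pending = False
```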
+
+7.5. UPDDONE message [8]
+
+ The update done (UPDDONE) message is used by a server receiving an
+ UPDREQ or UPDREQALL message to signify that it has sent all of the
+ BNDUPD messages requested by the UPDREQ or UPDREQALL request and that
+ it has received a BNDACK for each of those messages.
+
+ While a BNDACK message MUST have been received for each BNDUPD
+ message prior to the transmission of the UPDDONE message, this
+ doesn't necessarily mean that all of the BNDUPD messages were
+ accepted, only that all of them were responded to with a BNDACK
+ message. Thus, a NAK (comprised of a BNDACK message containing a
+ reject-reason option) could be used to reject a BNDUPD, but for the
+ purposes of the UPDDONE message, such a NAK would count as a response
+ to the associated BNDUPD message, and would not block the eventual
+ transmission of the UPDDONE message.
+
+ The xid in an UPDDONE message MUST be identical to the xid in the
+ UPDREQ or UPDREQALL message that initiated the update process.
+
+ The UPDDONE message has no message specific options.
+
+7.5.1. Sending the UPDDONE message
+
+ The UPDDONE message SHOULD be sent as soon as the last BNDACK message
+ corresponding to a BNDUPD message requested by the UPDREQ or
+ UPDREQALL is received from the server which sent the UPDREQ or
+ UPDREQALL. The XID of the UPDDONE message MUST be the same as the
+ XID of the corresponding UPDREQ or UPDREQALL message.
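+
+ A minimal sketch of the completion rule, assuming that each BNDUPD
+ carries its own xid which the matching BNDACK echoes. The names
+ UpdateSession and all_queued are hypothetical. A NAK clears the
+ outstanding entry just as an accepting BNDACK does.
+
```python
class UpdateSession:
    """Outstanding BNDUPDs for one UPDREQ/UPDREQALL, keyed by BNDUPD xid."""

    def __init__(self, request_xid):
        self.request_xid = request_xid
        self.unacked = set()
        self.all_queued = False  # set True once every requested BNDUPD is sent

    def bndupd_sent(self, xid):
        self.unacked.add(xid)

    def bndack_received(self, xid, rejected=False):
        # A NAK (BNDACK carrying reject-reason) still counts as a response.
        self.unacked.discard(xid)

    def upddone_xid(self):
        """Return the xid to use for UPDDONE if it may be sent, else None."""
        if self.all_queued and not self.unacked:
            return self.request_xid
        return None
```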
+
+7.5.2. Receiving the UPDDONE message
+
+ A server receiving the UPDDONE message knows that all of the informa-
+ tion that it requested by sending an UPDREQ or UPDREQALL message has
+ now been sent and that it has recorded this information in its stable
+ storage. It typically uses the receipt of an UPDDONE message to move
+ to a different failover state. See sections 9.5.2 and 9.8.3 for
+ details.
+
+7.6. POOLREQ message [1]
+
+ The pool request (POOLREQ) message is used by the secondary server to
+ request an allocation of IP addresses from the primary server. It
+ MUST be sent by a secondary server to a primary server to request IP
+ address allocation by the primary. The IP addresses allocated are
+ transmitted using normal BNDUPD messages from the primary to the
+ secondary.
+
+ The POOLREQ message SHOULD be sent from the secondary to the primary
+ whenever the secondary makes a transition into NORMAL state. It
+ SHOULD periodically be resent in order that any change in the number
+ of available IP addresses on the primary be reflected in the pool on
+ the secondary. The period may be influenced by the secondary
+ server's leasing activity.
+
+ The POOLREQ message has no message specific options.
+
+7.6.1. Sending the POOLREQ message
+
+ The POOLREQ message is sent.
+
+7.6.2. Receiving the POOLREQ message
+
+ When a primary server receives a POOLREQ message it SHOULD examine
+ the binding database and determine how many IP addresses the secon-
+ dary server should have, and set these IP addresses to BACKUP state.
+ It SHOULD then send BNDUPD messages concerning all of these IP
+ addresses to the secondary server.
+
+ Servers frequently have several kinds of IP addresses available on a
+ particular network segment. The failover protocol assumes that both
+ primary and secondary servers are configured in such a way that each
+ knows the type and number of IP addresses on every network segment
+ participating in the failover protocol. The primary server is
+ responsible for allocating the secondary server the correct propor-
+ tion of available IP addresses of each kind, and the secondary server
+ is responsible for being configured in such a way that it can tell
+ the kind of every IP address based solely on the IP address itself.
+
+ A primary server MUST keep track of how many IP addresses were allo-
+ cated as a result of processing the POOLREQ message, and send that
+ number in the POOLRESP message.
+
+ A primary server MAY choose to defer processing a POOLREQ message
+ until a more convenient time to process it, but it should not depend
+ on the secondary server to resend the POOLREQ message in that case.
+
+ If a secondary server receives a POOLREQ message it SHOULD report an
+ error.
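+
+ One way a primary might process a POOLREQ is sketched below. The
+ target_backup parameter, the "FREE" state name, and the dictionary
+ layout are assumptions made for illustration; how the primary
+ decides the number of addresses is left to the implementation.
+
```python
def handle_poolreq(bindings, target_backup):
    """Move FREE addresses to BACKUP until the secondary holds target_backup.

    bindings maps an IP address string to its state.  Returns the list of
    addresses moved; its length is the addresses-transferred value, and
    each entry still needs a BNDUPD sent to the secondary.
    """
    have = sum(1 for state in bindings.values() if state == "BACKUP")
    moved = []
    for ip in sorted(bindings):
        if have + len(moved) >= target_backup:
            break
        if bindings[ip] == "FREE":
            bindings[ip] = "BACKUP"
            moved.append(ip)
    return moved
```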
+
+7.7. POOLRESP message [2]
+
+ A primary server sends a POOLRESP message to a secondary server after
+ the allocation process for available addresses to the secondary
+ server is complete. Typically this message will precede some of the
+ BNDUPD messages that the primary uses to send the actual allocated IP
+ addresses to the secondary.
+
+ The xid in the POOLRESP message MUST be identical to the xid in the
+ POOLREQ message for which this POOLRESP is a response.
+
+
+7.7.1. Sending the POOLRESP message
+
+ The POOLRESP message MUST contain the same xid as the corresponding
+ POOLREQ message.
+
+ Exactly one option MUST appear in a POOLRESP message:
+
+ o addresses-transferred
+
+ The number of addresses allocated to the secondary server by the
+ primary server as a result of a POOLREQ is contained in the
+ addresses-transferred option in a POOLRESP message. Note this
+ is the number of addresses that are transferred to the secondary
+ in the primary's binding database as a result of the correspond-
+ ing POOLREQ message, and that it may be some time before they
+ can all be transmitted to the secondary server through the use
+ of BNDUPD messages.
+
+7.7.2. Receiving the POOLRESP message
+
+ When a secondary server receives a POOLRESP message, it SHOULD send
+ another POOLREQ message if the value of the addresses-transferred
+ option is non-zero.
+
+ Typically, no other action is taken on the reception of a POOLRESP
+ message.
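+
+ The rule above amounts to a one-line check on the secondary. In this
+ sketch, send_poolreq is a hypothetical callback supplied by the
+ caller.
+
```python
def on_poolresp(addresses_transferred, send_poolreq):
    """Ask again while the primary is still handing over addresses."""
    if addresses_transferred != 0:
        send_poolreq()
        return True
    return False
```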
+
+7.8. CONNECT message [5]
+
+ The connect (CONNECT) message is used to establish an
+ application-level connection over a newly created TCP connection.
+ It gives the source information for the connection and critical
+ configuration information. Either server can initiate a TCP
+ connection, but the CONNECT message MUST be sent only by the
+ primary server.
+
+ The CONNECT message MUST be the first message sent down a newly esta-
+ blished connection, and it MUST be sent only by the primary server.
+
+ The following table summarizes the options that are associated with
+ the CONNECT message:
+
+ Option
+ ------
+ relationship-name MUST
+ max-unacked-bndupd MUST
+ receive-timer MUST
+ vendor-class-identifier MUST
+ protocol-version MUST
+ TLS-request MUST (1)
+ MCLT MUST
+ hash-bucket-assignment MUST
+
+ (1) MUST NOT if CONNECT is being sent over a TLS connection
+
+ Table 7.8-1: Options used in a CONNECT message
+
+
+7.8.1. Sending the CONNECT message
+
+ The CONNECT message MUST be the first message sent by the primary
+ server after the establishment of a new TCP connection with a secon-
+ dary server participating in the failover protocol.
+
+ The xid of the CONNECT message is not related to any previous xid
+ sequence, but initiates the sequence for this connection.
+
+ The name of the failover relationship MUST be placed in the
+ relationship-name option. This information is placed in an option
+ inside of the message in order to allow the identity of the sender to
+ be covered by a shared secret.
+
+ The number of BNDUPD messages the primary server can accept without
+ blocking the TCP connection MUST be placed in the max-unacked-bndupd
+ option. This MUST be a number equal to or greater than 1, SHOULD be
+ a number greater than 10, and SHOULD be a number less than 100.
+
+ The length of the receive timer (tReceive, see section 8.3) MUST be
+ placed in the receive-timer option.
+
+ The MCLT MUST be placed in the MCLT option.
+
+ The hash-bucket-assignment option MUST be included in the CONNECT
+ message. In the event that load balancing is not configured for this
+ server, the hash-bucket-assignment option will indicate that. The
+ value of the hash-bucket-assignment option is determined from the
+ specific buckets that the primary server has determined that the
+ secondary server MUST service as part of the load-balancing
+ algorithm. The way in which the primary server determines this
+ information is outside the scope of this protocol definition. The
+ primary server SHOULD be configured with a percentage of clients that
+ the secondary server will be instructed to service, and the primary
+ server SHOULD use the algorithm in [RFC 3074] to generate a Hash
+ Bucket Assignment which it sends to the secondary server.
+
+ The vendor class identifier MUST be placed in the vendor-class-
+ identifier option.
+
+ The protocol-version option MUST be included in every CONNECT mes-
+ sage. The current value of the protocol version is 1.
+
+ The TLS-request option MUST be sent and contains the desired TLS
+ connection request as well as information concerning whether TLS is
+ supported. If this CONNECT message is being sent over an already
+ created TLS connection, the TLS-request option MUST NOT appear.
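+
+ The option rules of this section can be gathered into one helper.
+ This is a sketch only: it builds a plain dictionary rather than the
+ wire encoding, and the function and parameter names are
+ hypothetical.
+
```python
import warnings


def build_connect_options(relationship_name, max_unacked_bndupd, receive_timer,
                          vendor_class, mclt, hash_buckets, tls_request,
                          over_tls=False):
    """Assemble the CONNECT options as a plain dict (no wire encoding)."""
    if max_unacked_bndupd < 1:
        raise ValueError("max-unacked-bndupd MUST be >= 1")
    if not 10 < max_unacked_bndupd < 100:
        warnings.warn("max-unacked-bndupd SHOULD be > 10 and < 100")
    options = {
        "relationship-name": relationship_name,
        "max-unacked-bndupd": max_unacked_bndupd,
        "receive-timer": receive_timer,
        "vendor-class-identifier": vendor_class,
        "protocol-version": 1,
        "MCLT": mclt,
        "hash-bucket-assignment": hash_buckets,
    }
    if not over_tls:
        # TLS-request MUST NOT appear on an already secured connection.
        options["TLS-request"] = tls_request
    return options
```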
+
+7.8.2. Receiving the CONNECT message
+
+ When a server establishes a TCP connection on a failover port, if it
+ is a primary server it should send a CONNECT message, and if it is a
+ secondary server it should wait for a CONNECT message before sending
+ any messages. To avoid denial of service attacks, a secondary should
+ only wait for a CONNECT message on a new connection for a limited
+ amount of time and close the connection if none is received during
+ that time.
+
+ When a secondary server receives a CONNECT message it should:
+
+ 1. Record the time at which the message was received.
+
+ 2. Examine the protocol-version option, and decide if this server
+ is capable of interoperating with another server running that
+ protocol version. If not, send the CONNECTACK message with
+ the reject reason 14: "Protocol version mismatch". The server
+ MUST include its protocol-version in the CONNECTACK message.
+
+ 3. Examine the TLS-request option. Figure out the TLS-reply
+ value based on the capabilities and configuration of this
+ server. If the result for the TLS-reply value is a 1 and the
+ connection is accepted, indicating use of TLS, then immedi-
+ ately send the CONNECTACK message and go into TLS negotiation.
+ If the TLS-reply value implies rejection of the connection,
+ then immediately send the CONNECTACK message with the TLS-
+ reply value and the appropriate reject-reason option value.
+ In all other cases, save the TLS-reply option information for
+ the eventual CONNECTACK message.
+
+ The possibilities for TLS-request and TLS-reply are:
+
+ CONNECT  CONNECTACK
+ TLS      TLS
+ request  reply       Reject
+ t1       t1          Reason  Comments
+ -------  ----------  ------  --------
+ 0        0                   no TLS used
+ 0        1           11      primary won't use TLS, secondary requires TLS
+ 1        0                   primary desires TLS, secondary doesn't
+ 1        1                   primary desires TLS, secondary will use TLS
+ 2        0           9, 10   primary requires TLS and secondary won't
+ 2        1                   primary requires TLS and secondary will use TLS
+
+
+
+ 4. Check to see if there is a message-digest option in the CON-
+ NECT message. If there was, and the server does not support
+ message-digests, then reject the connection with reject reason
+ 12: "Message digest not supported" in the CONNECTACK. If the
+ server does support message-digests, then check this message
+ for validity based on the message-digest, and reject it if the
+ digest indicates the message was altered with reject reason
+ 20: "Message digest failed to compare".
+
+ 5. Determine if the sender (from the relationship-name option)
+ and the implicit role of the sender (i.e., primary) represents
+ a server with which the receiver was configured to engage in
+ failover activity. This is performed after any TLS or message
+ digest processing so that it occurs after a secure connection
+ is created, to ensure that there is no tampering with the
+ relationship name of the partner. In the absence of any other
+ security capability (i.e., when TLS or a message digest is not
+ used), the server MAY wish to be configured with the IP
+ address of the partner and check the source-ip of the CONNECT
+ message against that IP address as a weak form of security.
+
+ If the sender is not a configured partner, then the receiving
+ server should reject the CONNECT request by sending a
+ CONNECTACK message with a reject-reason value of 8, invalid
+ failover partner.
+
+ If it is, then the receiving failover endpoint should be
+ determined.
+
+ 6. Decide if the time delta between the sending of the message,
+ in the time field, and the receipt of the message, recorded in
+ step 1 above, is acceptable. A server MAY require an
+ arbitrarily small delta in time values in order to set up a
+ failover connection with another server. See section 5.10 for
+ information on time synchronization.
+
+ If the delta between the time values is too great, the server
+ should reject the CONNECT request by sending a CONNECTACK mes-
+ sage with a reject-reason of 4, time mismatch too great.
+
+ If the time mismatch is not considered too great then the
+ receiving server MUST record the delta between the servers.
+ The receiving server MUST use this delta to correct all of the
+ absolute times received from the other server in all time-valued
+ options. Note that servers can participate in failover with
+ arbitrarily great time mismatches, as long as the mismatch is
+ more or less constant.
+
+ 7. Examine the MCLT option in the CONNECT request and use the
+ value of the MCLT as the MCLT for this failover endpoint.
+
+ The secondary server SHOULD be able to operate with any MCLT
+ sent by the primary, but if it cannot, then it should send a
+ CONNECTACK with a reject-reason of 5, MCLT mismatch. In the
+ event that the MCLT from the primary does not match that con-
+ figured on the secondary, and the secondary will run with the
+ primary's value, then the secondary MUST save the MCLT in
+ secondary storage since it will need it even if it cannot con-
+ tact the primary. The secondary MUST NOT use a different MCLT
+ value than it received from the primary even if it cannot con-
+ tact the primary.
+
+ 8. The server MUST store the hash-bucket-assignment option for use
+ while processing in NORMAL state. If this hash bucket
+ assignment conflicts with the secondary server's configured
+ hash bucket assignment for use in other than NORMAL state, the
+ secondary server should send a CONNECTACK with a reject reason
+ of 19, Hash bucket assignment conflict.
+
+ 9. The receiving server MAY use the vendor-class-identifier to do
+ vendor specific processing.
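+
+ Some of the checks above can be sketched as a single function that
+ returns the first applicable reject-reason. This sketch covers only
+ steps 2, 5, 6 and 7 (TLS, message-digest and hash bucket handling
+ are omitted), and the cfg layout and function name are hypothetical.
+
```python
def check_connect(opts, sent_time, recv_time, cfg):
    """Return (reject_reason, delta); reject_reason is None on success."""
    if opts["protocol-version"] not in cfg["supported_versions"]:
        return 14, None                      # protocol version mismatch
    if opts["relationship-name"] not in cfg["relationships"]:
        return 8, None                       # invalid failover partner
    delta = recv_time - sent_time            # step 1 time minus time field
    if abs(delta) > cfg["max_time_delta"]:
        return 4, None                       # time mismatch too great
    if not cfg["accept_mclt"](opts["MCLT"]):
        return 5, None                       # MCLT mismatch
    return None, delta
```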
+
+7.9. CONNECTACK message [6]
+
+ The CONNECTACK message is sent to accept or reject a CONNECT message.
+ It is sent by the secondary server which received a CONNECT message.
+
+ Attempting immediately to reconnect after either receiving a CONNEC-
+ TACK with a reject-reason or after sending a CONNECTACK with a
+ reject-reason could yield unwanted looping behavior, since the reason
+ that the connection was rejected may well not have changed since the
+ last attempt. A simple suggested solution is to wait a minute or two
+ after sending or receiving a CONNECTACK message with a reject-reason
+ before attempting to reestablish communication.
+
+ The following table summarizes the options associated with the CON-
+ NECTACK message:
+
+
+ Option                   accept    reject
+ ------                   ------    ------
+ relationship-name        MUST      MUST
+ max-unacked-bndupd       MUST      MUST NOT
+ receive-timer            MUST      MUST NOT
+ vendor-class-identifier  MUST      MUST NOT
+ protocol-version         MUST      MUST
+ TLS-reply                (1)       (2)
+ reject-reason            MUST NOT  MUST
+ message                  MUST NOT  SHOULD
+ MCLT                     MUST NOT  MUST NOT
+ hash-bucket-assignment   MUST NOT  MUST NOT
+
+ (1) MUST NOT if sending CONNECTACK after TLS negotiation, MUST
+ if TLS-request in CONNECT, else MUST NOT.
+ (2) MUST if TLS-request in CONNECT message, else MUST NOT.
+
+ Table 7.9-1: Options used in a CONNECTACK message
+
+
+7.9.1. Sending the CONNECTACK message
+
+ The xid of the CONNECTACK message MUST be that of the corresponding
+ CONNECT message.
+
+ The name of the relationship MUST be placed in the relationship-name
+ option. This information is placed in an option inside of the mes-
+ sage in order to allow the identity of the sender to be covered by a
+ shared secret.
+
+ The protocol-version option MUST be included in every CONNECTACK mes-
+ sage. The current value of the protocol version is 1.
+
+ If the connection has been rejected, the reject-reason option MUST be
+ placed in the CONNECTACK message with an appropriate reason, and a
+ message option SHOULD be included with a human-readable error message
+ describing the reason for the rejection in some detail. If the
+ reject-reason option appears, then the remaining options listed below
+ do not appear. The sending server should close the connection after
+ sending the CONNECTACK if the connection was rejected.
+
+ The results of the TLS negotiation MUST be placed in the TLS-reply
+ option. If this CONNECTACK message is being sent over an already TLS
+ secured connection, then there MUST NOT be a TLS-reply option.
+
+ If there was a message-digest option in the CONNECT message, then
+ there MUST be a message-digest in the CONNECTACK message and any sub-
+ sequent messages if the CONNECTACK does not contain a reject-reason.
+
+ The number of BNDUPD messages the server can accept without blocking
+ the TCP connection MUST be placed in the max-unacked-bndupd option.
+ This SHOULD be a number greater than 10, and SHOULD be a number less
+ than 100.
+
+ The length of the receive timer (tReceive, see section 8.3) MUST be
+ placed in the receive-timer option.
+
+ The vendor class identifier MUST be placed in the vendor-class-
+ identifier option.
+
+ After a connection is created (either by sending a CONNECTACK message
+ to the first CONNECT message, or sending a CONNECTACK message to a
+ CONNECT message received over a TLS connection), the server MUST send
+ a STATE message.
+
+ After a connection is created, the server MUST start two timers for
+ the connection: tSend and tReceive. The tSend timer SHOULD be
+ approximately 33 percent of the time in the receive-timer option in
+ the corresponding CONNECT message. The tReceive timer SHOULD be the
+ time sent in the receive-timer option in the CONNECTACK message.
+
+ The tReceive timer is reset whenever a message is received from this
+ TCP connection. If it ever expires, the TCP connection is dropped
+ and communications with this partner is considered not ok. The
+ reject reason 17: "No traffic within sufficient time" is placed in
+ the DISCONNECT message sent prior to dropping the TCP connection.
+
+ The tSend timer is reset whenever a message is sent over this connec-
+ tion. When it expires, a CONTACT message MUST be sent.
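+
+ A minimal sketch of the two timers as seen by the server sending the
+ CONNECTACK, assuming the caller supplies the current time in seconds
+ from any monotonic clock. The class name, the poll() method, and the
+ string return values are hypothetical.
+
```python
class ConnectionTimers:
    """tSend/tReceive bookkeeping; `now` values are seconds from any clock."""

    def __init__(self, partner_receive_timer, own_receive_timer, now):
        self.t_send = partner_receive_timer / 3.0   # about 33 percent
        self.t_receive = own_receive_timer
        self.last_sent = now
        self.last_received = now

    def message_sent(self, now):
        self.last_sent = now          # resets tSend

    def message_received(self, now):
        self.last_received = now      # resets tReceive

    def poll(self, now):
        """Return the action due at time `now`, if any."""
        if now - self.last_received >= self.t_receive:
            return "DISCONNECT(reject-reason=17)"
        if now - self.last_sent >= self.t_send:
            return "CONTACT"
        return None
```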
+
+7.9.2. Receiving the CONNECTACK message
+
+ If a CONNECTACK message is received with a different XID from the one
+ in the CONNECT that was sent, it SHOULD be ignored. To avoid denial
+ of service attacks, a primary should only wait for a CONNECTACK mes-
+ sage on a new connection for a limited amount of time and close the
+ connection if none is received during that time.
+
+ When a CONNECTACK message is received, the following actions should
+ be taken:
+
+ 1. Record the time the message was received.
+
+ 2. Check to see if the xid on the CONNECTACK matches an outstand-
+ ing CONNECT message on this TCP connection.
+
+ 3. Check to see if there is a reject-reason option in the
+ CONNECTACK message. If not, continue with step 4. If there is a
+ reject-reason option, the server SHOULD report the error code.
+ If a message option appears, a server SHOULD display the string
+ from the message option in a user visible way. The server
+ MUST close the connection if a reject-reason option appears.
+
+ 4. Check the value of the TLS-reply option (if any, which there
+ won't be if this CONNECT is taking place utilizing TLS), and
+ if it was 1, then skip processing of the rest of the CONNEC-
+ TACK message, and immediately enter into TLS connection setup.
+
+ This step occurs prior to steps 5 and 6 in order to allow
+ creation of a secure connection (if required) prior to pro-
+ cessing the protocol version and IP address information.
+
+ 5. Examine the value of the protocol-version option. If this
+ server is able to establish connections with another server
+ running this protocol version, then continue, else close the
+ connection.
+
+ 6. Decide if the time delta between the sending of the message,
+ in the time field, and the receipt of the message, recorded in
+ step 1 above, is acceptable. A server MAY require an arbi-
+ trarily small delta in time values in order to set up a fail-
+ over connection with another server.
+
+ If the delta between the time values is too great, the server
+ should drop the TCP connection (see section 7.12).
+
+ If the time mismatch is not considered too great then the
+ receiving server MUST record the delta between the servers.
+ The receiving server MUST use this delta to correct all of the
+ absolute times received from the other server in all time-
+ valued options. Note that the failover protocol is con-
+ structed so that two servers can be failover partners with
+ arbitrarily great time mismatches.
+
+ 7. The receiving server MAY use the vendor-class-identifier to do
+ vendor specific processing.
+
+ 8. After accepting a CONNECTACK message, the server MUST send a
+ STATE message.
+
+ After receiving a CONNECTACK message, the server MUST start
+ two timers for the connection: tSend and tReceive. The tSend
+ timer SHOULD be approximately 20 percent of the time in the
+ receive-timer option in the corresponding CONNECTACK message.
+ The tReceive timer SHOULD be set to the time sent in the
+ receive-timer option in the CONNECT message.
+
+ The tReceive timer is reset whenever a message is received
+ from this TCP connection. If it ever expires, the TCP connec-
+ tion is dropped and communications with this partner is con-
+ sidered not ok. The reject reason 17: "No traffic within suf-
+ ficient time" is placed in the DISCONNECT message sent prior
+ to dropping the TCP connection.
+
+ The tSend timer is reset whenever a message is sent over this
+ connection. When it expires, a CONTACT message MUST be sent.
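+
+ The time correction described in step 6 above can be sketched as
+ follows, assuming times are plain seconds and ignoring transit delay.
+ The function names are hypothetical.
+
```python
def record_delta(time_field, received_at):
    """Offset to add to the partner's clock to express it in local time."""
    return received_at - time_field


def to_local_time(partner_absolute_time, delta):
    """Correct an absolute time taken from a time-valued option."""
    return partner_absolute_time + delta
```
+
+ For example, if the partner's time field reads 1000 when the local
+ clock reads 1300, the delta is 300 and a partner lease expiration of
+ 5000 corresponds to 5300 in local time.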
+
+7.10. STATE message [10]
+
+ The state (STATE) message is used to communicate the current failover
+ state to the partner server.
+
+ The STATE message MUST be sent after sending a CONNECTACK message
+ that didn't contain a reject-reason option, and MUST be sent after
+ receiving a CONNECTACK message without a reject-reason option.
+
+ A STATE message MUST be sent whenever the failover endpoint changes
+ its failover state and a connection exists to the partner.
+
+ The STATE message requires no response from the failover partner.
+
+ The following table shows the options that MUST appear in a STATE
+ message:
+
+
+ Option
+ ------
+ server-state MUST
+ server-flags MUST
+ start-time-of-state MUST
+
+ Table 7.10-1: Options used in a STATE message
+
+7.10.1. Sending the STATE message
+
+ The current failover state is placed in the server-state option and
+ the current state of the STARTUP flag is placed in the server-flags
+ option.
+
+ The message is sent with a unique xid.
+
+ A server SHOULD send the STATE message only when the connection is
+ created (i.e., after sending or receiving a CONNECTACK message with
+ no reject-reason option), or when there is a change from the values
+ sent in a previous STATE message.
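+
+ A sketch of this sending rule, assuming a caller-supplied send
+ callback and a simple counter as the source of unique xids. Both are
+ hypothetical, as are the class and method names.
+
```python
import itertools


class StateAnnouncer:
    """Sends STATE only on connection setup or when the values change."""

    def __init__(self, send):
        self.send = send                 # callable(xid, state, flags)
        self.last = None
        self.xids = itertools.count(1)   # hypothetical xid source

    def connection_established(self, state, flags):
        self.last = None                 # force a STATE on a new connection
        self.update(state, flags)

    def update(self, state, flags):
        if (state, flags) != self.last:
            self.send(next(self.xids), state, flags)
            self.last = (state, flags)
```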
+
+7.10.2. Receiving the STATE message
+
+ Every STATE message SHOULD indicate a change in state or a change in
+ the flags.
+
+ When a STATE message is received, any state transitions specified in
+ section 9 are taken.
+
+ No response to a STATE message is required.
+
+7.11. CONTACT message [11]
+
+ The contact (CONTACT) message is sent to verify communications
+ integrity with a failover partner. The CONTACT message is sent when
+ no messages have been sent to the failover partner for a specified
+ period of time. This is determined by the tSend timer expiring (see
+ section 8.3).
+
+ The CONTACT message has no message specific options.
+
+7.11.1. Sending the CONTACT message
+
+ The CONTACT message is sent.
+
+7.11.2. Receiving the CONTACT message
+
+ When a CONTACT message is received, the tReceive timer is reset (as
+ it is with any message that is received).
+
+ A server SHOULD use the time in the time field and the time the mes-
+ sage was received to refine the delta time calculations between the
+ servers.
+
+7.12. DISCONNECT message [12]
+
+ The DISCONNECT is the last message sent over a connection before
+ dropping an established connection (note that an established connec-
+ tion is one where a CONNECTACK has been sent without a reject rea-
+ son).
+
+ After sending or receiving a DISCONNECT message, a server needs to
+ have some mechanism to prevent an error loop. Simply reconnecting to
+ the partner immediately is not the best option, especially after
+ several consecutive attempts.
+
+ A simple suggested solution is to wait a minute or two after sending
+ or receiving a DISCONNECT before attempting to reestablish communica-
+ tion.
+
+ The DISCONNECT message MUST be the last message sent down a connec-
+ tion before it is closed.
+
+ The following table summarizes the options that are associated with
+ the DISCONNECT message:
+
+
+ Option
+ ------
+ reject-reason MUST
+ message SHOULD
+
+ Table 7.12-1: Options used in a DISCONNECT message
+
+
+
+7.12.1. Sending the DISCONNECT message
+
+ The DISCONNECT message MUST be the last message sent by a server
+ which is dropping a TCP connection.
+
+ The xid of the DISCONNECT message must be unique.
+
+ The reject-reason option MUST appear giving a reason why the connec-
+ tion was dropped. A message option SHOULD appear giving a human
+ readable error message with possibly more details.
+
+7.12.2. Receiving the DISCONNECT message
+
+ When a server receives a DISCONNECT message it should log the message
+ if there was one and possibly raise an alarm of some sort if the
+ reject reason was one that was sufficiently serious.
+
+8. Connection Management
+
+ Servers participating in the failover protocol communicate over TCP
+ connections. These TCP connections are used both to transmit bind-
+ ing information from one server to another as well as to allow each
+ server to determine whether communications is possible with the other
+ server.
+
+ Central to the operation of the failover protocol is a notion of
+ "communications okay" or "communications failed". Failover state
+ transitions are taken in many cases when the status of communications
+ with the partner changes, and the existence or non-existence of a TCP
+ connection between failover endpoints is used to determine if
+ communications is "okay" or "failed".
+
+ A single TCP connection exists which connects two failover endpoints.
+
+8.1. Connection granularity
+
+ There exists one TCP connection between each set of failover end-
+ points. See section 5.1.1 for an explanation of failover endpoints.
+
+ Typically there is one failover endpoint for each end of a failover
+ relationship between two servers, and only a single relationship
+ between any two servers. Given the integration of loadbalancing into
+ the failover protocol, there is little value in having more than one
+ failover relationship between two servers, though the protocol will
+ support multiple relationships between two servers.
+
+ Each failover relationship MUST have a unique relationship-name, and
+ the relationship-name option is used to communicate this name in the
+ CONNECT and CONNECTACK messages.
+
+8.2. Creating the TCP connection
+
+ All failover TCP connections are initiated over port 647. Every
+ server implementing the failover protocol MUST listen on port 647.
+
+ Every server implementing the failover protocol SHOULD attempt to
+ connect to all of its partners periodically, where the period is
+ implementation dependent and SHOULD be configurable. In the event
+ that a connection has been rejected by a CONNECTACK message with a
+ reject-reason option contained in it or a DISCONNECT message, a
+ server SHOULD reduce the frequency with which it attempts to connect
+ to that server but it SHOULD continue to attempt to connect periodi-
+ cally.
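+
+ One possible retry policy is sketched below. The base_period and
+ slow_factor parameters are configuration assumptions; the protocol
+ only asks that attempts continue, less often after a rejection.
+
```python
def next_attempt_delay(base_period, rejected, slow_factor=4):
    """Seconds to wait before the next connection attempt.

    `rejected` is True when the last attempt ended with a CONNECTACK
    carrying a reject-reason option or with a DISCONNECT message.
    """
    return base_period * slow_factor if rejected else base_period
```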
+
+ When a connection attempt succeeds, if the server generating the
+ connection attempt is a primary server for that relationship, then it
+ MUST send a CONNECT message down the connection. If it is not a pri-
+ mary server for the relationship, then it MUST just drop the connec-
+ tion and wait for the primary server to connect to it.
+
+ When a connection attempt is received on port 647, the only informa-
+ tion that the receiving server has is the IP address of the partner
+ initiating a connection. It also knows whether it has the primary
+ role for any failover relationships with the connecting server. If
+ it has any relationships for which it is a primary server, it should
+ initiate a connection of its own to port 647 of the partner server,
+ one for each primary relationship it has with that server.
+
+ If it has any relationships with the connecting server for which it
+ is a secondary server, it should just await the CONNECT message to
+ determine which relationship this connection is to serve.
+
+ If it has no secondary relationships with the connecting server, it
+ SHOULD drop the connection.
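+
+ The handling of an incoming connection described above can be
+ sketched as a decision function. The roles list (this server's role
+ in each relationship it has with the connecting address) and the
+ action strings are hypothetical.
+
```python
def on_incoming_connection(roles):
    """Decide what to do with a connection accepted on port 647.

    roles holds "primary" or "secondary" for each relationship this
    server has with the connecting IP address.
    """
    actions = []
    primaries = sum(1 for role in roles if role == "primary")
    if primaries:
        # One outgoing connection per relationship where we are primary.
        actions.append("connect to peer port 647 x%d" % primaries)
    if any(role == "secondary" for role in roles):
        actions.append("await CONNECT on this connection")
    else:
        actions.append("drop this connection")
    return actions
```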
+
+ To summarize -- a primary server MUST use a connection that it has
+ initiated in order to send a CONNECT message. Every server that is a
+ secondary server in a relationship attempts to create a connection to
+ the server which is primary in the relationship, but that connection
+ is only used to stimulate the primary server into recognizing that
+ the secondary server is ready for operation. The reason behind this
+ is that the secondary server has no way to communicate to the primary
+ server which relationship a connection is designed to serve.
+
+ A server which has multiple secondary relationships with a primary
+ server SHOULD only send one stimulus connection attempt to the pri-
+ mary server.
+
+ Once a connection is established, the primary server MUST send a CON-
+ NECT message across the connection. A secondary server MUST wait for
+ the CONNECT message from a primary server. If the secondary server
+ doesn't receive a CONNECT message from the primary server in an ins-
+ tallation dependent amount of time, it MAY drop the connection and
+ send another stimulus connection attempt to the primary server.
+
+ Every CONNECT message includes a TLS-request option, and if the CON-
+ NECTACK message does not reject the CONNECT message and the TLS-reply
+ option says TLS MUST be used, then the servers will immediately enter
+ into TLS negotiation.
+
+ Once TLS negotiation is complete, the primary server MUST resend the
+ CONNECT message on the newly secured TLS connection and then wait for
+ the CONNECTACK message in response. The TLS-request and TLS-reply
+ options MUST NOT appear in either this second CONNECT or its associ-
+ ated CONNECTACK message as they had in the first messages.
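+
+ The outcome of the first CONNECT/CONNECTACK exchange follows the
+ TLS-request and TLS-reply table in section 7.8.2, which can be
+ expressed as a lookup. The names TLS_OUTCOMES and tls_outcome are
+ hypothetical.
+
```python
# (TLS-request t1, TLS-reply t1) -> (use_tls, possible reject reasons)
TLS_OUTCOMES = {
    (0, 0): (False, ()),        # no TLS used
    (0, 1): (False, (11,)),     # primary won't use TLS, secondary requires it
    (1, 0): (False, ()),        # primary desires TLS, secondary doesn't
    (1, 1): (True, ()),         # primary desires TLS, secondary will use it
    (2, 0): (False, (9, 10)),   # primary requires TLS, secondary won't
    (2, 1): (True, ()),         # primary requires TLS, secondary will use it
}


def tls_outcome(request_t1, reply_t1):
    """Return (use_tls, possible reject-reason codes) for one exchange."""
    return TLS_OUTCOMES[(request_t1, reply_t1)]
```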
+
+ The second message sent over a new connection (either a bare TCP con-
+ nection or a connection utilizing TLS) is a STATE message. Upon the
+ receipt of this message, the receiver can consider communications up.
+
+ It is entirely possible that two servers will attempt to make connec-
+ tions to each other essentially simultaneously, and in this case the
+ secondary server will be waiting for a CONNECT message on each con-
+ nection. The primary server MUST send a CONNECT message over one
+ connection and it MUST close the other connection.
+
+ A secondary server MUST NOT respond to the closing of a TCP connec-
+ tion with a blind attempt to reconnect -- there may be another TCP
+ connection to the same failover partner already in use.
+
+8.3. Using the TCP connection for determining communications status
+
+ The TCP connection is used to determine the communications status of
+ the other server, i.e., communications-ok, or communications-
+ interrupted.
+
+ Three things must happen for a server to consider that communications
+ are ok with respect to another server:
+
+
+ 1. A TCP connection must be established to the other server.
+
+ 2. A CONNECT message must be received and a CONNECTACK message
+ sent in response. The CONNECT message is used to determine
+ the identity of the failover endpoint at the other end of the
+ TCP connection -- without it, the failover endpoint cannot be
+ uniquely determined. Without knowledge of the failover
+ endpoint, the entity with which communications is ok is
+ undetermined.
+
+ 3. A STATE message must be received from the other server over
+ the connection. This STATE message initializes important
+ information necessary to the operation of the state machine
+ that governs the behavior of this failover endpoint.
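+
+ As a minimal sketch only, the three conditions above can be tracked in
+ a small per-connection record. The class and attribute names are
+ hypothetical and are not defined by this protocol; condition 2 is
+ written from the point of view of the server that receives the
+ CONNECT message.
+
```python
from dataclasses import dataclass


@dataclass
class ConnectionStatus:
    """Hypothetical per-connection record of the three conditions."""
    tcp_established: bool = False    # 1: TCP connection exists
    connect_exchanged: bool = False  # 2: CONNECT received, CONNECTACK sent
    state_received: bool = False     # 3: STATE received from the partner

    def communications_ok(self) -> bool:
        # Communications are "ok" only when all three conditions hold.
        return (self.tcp_established
                and self.connect_exchanged
                and self.state_received)

    def connection_lost(self) -> None:
        # Everything learned on a connection is void once it is gone.
        self.tcp_established = False
        self.connect_exchanged = False
        self.state_received = False
```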
+
+ There are two ways that a server can determine that communications
+ has failed:
+
+
+ 1. The TCP connection can go down, yielding an error when
+ attempting to send or receive a message. This will happen at
+ least as often as the period of the tSend timer.
+
+ 2. The tReceive timer can expire.
+
+ In either of these cases, communications is considered interrupted.
+
+ If the tReceive timer expires, the connection MUST be dropped. The
+ reject reason 17: "No traffic within sufficient time" is placed in
+ the DISCONNECT message sent prior to dropping the TCP connection.
+
+ Several difficulties arise when trying to use one TCP connection both
+ for bulk data transfer and to sense the communications status
+ of the other server. One aspect of the problem stems from the
+ different requirements of the two uses. The bulk data transfer is of
+ course critically important to the protocol, but the speed with which
+ it is processed is not terribly significant. It might well be
+ minutes before a BNDUPD message is processed, and while not optimal,
+ such an occasional delay doesn't compromise the correctness of the
+ protocol. However, the speed with which one server detects the other
+ server is up (or, more importantly, down) is more highly constrained.
+ Generally one server should be able to detect that the other server
+ is not communicating within a minute or less.
+
+ These differing time constraints make it difficult to use the same
+ TCP connection both for data transfer and to sense communications
+ integrity. See section 3.5 for additional details on TCP.
+
+ The solution to this problem is to require that some message be
+ received by each end of the connection within a limited time;
+ otherwise the connection is considered down. If a server has sent
+ no messages recently, it sends a CONTACT message.
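+
+ The timer behavior described in this section can be sketched as
+ follows. This is an illustration under stated assumptions, not a
+ definitive implementation: the class name, the method names, and the
+ tuple return values are hypothetical, and the timer periods are
+ supplied by the caller. Only the tSend and tReceive timers, the
+ CONTACT and DISCONNECT messages, and reject reason 17 come from the
+ text.
+
```python
REJECT_NO_TRAFFIC = 17  # "No traffic within sufficient time"


class LivenessTimers:
    """Hypothetical bookkeeping for the tSend and tReceive timers."""

    def __init__(self, t_send: float, t_receive: float, now: float):
        self.t_send = t_send        # longest quiet time before a CONTACT
        self.t_receive = t_receive  # longest time with nothing received
        self.last_sent = now
        self.last_received = now

    def note_sent(self, now: float) -> None:
        self.last_sent = now

    def note_received(self, now: float) -> None:
        self.last_received = now

    def poll(self, now: float):
        """Return the action that is due at time `now`."""
        if now - self.last_received >= self.t_receive:
            # tReceive expired: send DISCONNECT with reason 17, then
            # drop the TCP connection.
            return ("DISCONNECT", REJECT_NO_TRAFFIC)
        if now - self.last_sent >= self.t_send:
            # Nothing sent recently: a CONTACT message keeps the
            # partner's tReceive timer from expiring.
            return ("CONTACT", None)
        return ("NONE", None)
```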
+
+ In the case where there is no data queued to be sent, this is not a
+ problem, but in the case where there is data queued to be sent to the
+ partner, then the CONTACT message will not actually be transmitted
+ until the queued data is sent. Section 3.5 explains why waiting for
+ TCP to determine that the connection is down is not acceptable, and
+ leads to a requirement that the receiving server never block the
+ sending server from sending CONTACT messages.
+
+ In order to meet this requirement, each server tells the other server
+ the number of outstanding BNDUPD messages that it will accept. The
+ receiving server is required to always be able to accept that many
+ BNDUPD messages off of the connection's input queue even if it cannot
+ process them immediately, and to accept all other messages immedi-
+ ately.
+
+ Thus, the sending server's TCP is never blocked from sending a
+ message except for very short periods, less than a few seconds unless
+ the network connection itself has problems. In this case, if the
+ CONTACT messages don't make it to the partner then the partner will
+ close the connection.
+
+ DISCUSSION:
+
+ When implementing this capability, one needs to be careful when
+ sending any message on the TCP connection as TCP can easily block
+ the server if the local TCP send buffers are full. This can't be
+ prevented because if the receiver is not reachable (via the net-
+ work), the sending TCP can't send and thus it will be unable to
+ empty the local TCP send buffers. So, all send operations either
+ need to assume they may block for some time or non-blocking sends
+ must be used carefully.
+
+8.4. Using the TCP connection for binding data
+
+ Binding data, in the form of BNDUPD messages and BNDACK messages to
+ respond to them, are sent across the TCP connection.
+
+ In order to support timely detection of any failure in the partner
+ server, the TCP connection MUST NOT block for more than a very short
+ time, on the order of a few seconds. Therefore, a server that is
+ sending BNDUPD messages MUST send only a restricted number before
+ receiving BNDACK messages about previous messages sent.
+
+ The number of outstanding BNDUPD messages that each server will
+ accept without causing TCP to block transmission of additional data
+ (i.e., CONTACT messages) is sent by each server in the CONNECT and
+ CONNECTACK messages in the max-unacked-bndupd option.
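+
+ A minimal sketch of a sender that honors this limit is shown here.
+ The class and method names are hypothetical; only the
+ max-unacked-bndupd value and the BNDUPD and BNDACK messages come from
+ the protocol. The sketch tracks messages by XID and releases a queued
+ BNDUPD only when an acknowledgement frees a slot.
+
```python
from collections import deque


class BndupdWindow:
    """Hypothetical sender-side limit on unacknowledged BNDUPDs."""

    def __init__(self, max_unacked_bndupd: int):
        # Value the partner advertised in its max-unacked-bndupd option.
        self.limit = max_unacked_bndupd
        self.unacked = set()    # XIDs sent but not yet acknowledged
        self.pending = deque()  # XIDs waiting for room in the window

    def submit(self, xid: int) -> list:
        """Queue a BNDUPD; return the XIDs that may be sent now."""
        self.pending.append(xid)
        return self._drain()

    def on_bndack(self, xid: int) -> list:
        """Record a BNDACK; return the XIDs the freed slot allows."""
        self.unacked.discard(xid)
        return self._drain()

    def _drain(self) -> list:
        sendable = []
        while self.pending and len(self.unacked) < self.limit:
            xid = self.pending.popleft()
            self.unacked.add(xid)
            sendable.append(xid)
        return sendable
```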
+
+8.5. Using the TCP connection for control messages
+
+ The TCP connection is used for control messages: POOLREQ, UPDREQ,
+ STATE, CONTACT, UPDREQALL and the corresponding reply messages: POOL-
+ RESP, UPDDONE. A server MUST immediately accept all of these mes-
+ sages from the TCP connection. A server MUST immediately accept any
+ BNDACK which is received as well.
+
+8.6. Losing the TCP connection
+
+ When the TCP connection is lost, then communications is not ok with
+ the other server. A server which has lost communications SHOULD
+ immediately attempt to reconnect to the other server, and should
+ retry these connection attempts periodically.
+
+ An acknowledgement message (BNDACK, POOLRESP, UPDDONE) can
+ only be sent in response to a request message (BNDUPD, POOLREQ,
+ UPDREQ, UPDREQALL) on the same TCP connection from which the request
+ was received, in part since the XID's in the request messages are
+ guaranteed unique only during the life of a single TCP connection.
+
+ When a connection to a partner server goes down, a server with unpro-
+ cessed request messages MAY simply drop all of those messages, since
+ it can be sure that the partner will resend them when they are next
+ in communications. A server with unprocessed BNDUPD messages when a
+ TCP connection goes down MAY instead choose to process those BNDUPD
+ messages, but it MUST NOT send any BNDACK messages in response (again
+ because of the issues surrounding XID uniqueness).
+
+ When the TCP connection is closed explicitly, the DISCONNECT message
+ with a reject-reason option (and, ideally, a message option) MUST be
+ sent over the TCP connection.
+
+9. Failover Endpoint States
+
+ This section discusses the various states that a failover endpoint
+ may take, and the server actions required when entering the state,
+ operating in the state, and leaving the state, as well as the events
+ that cause transitions out of the state into another state.
+
+ The state transition diagram in Figure 9.2-1 is relevant for this
+ section. This is the common state transition diagram for both servers
+ in a failover pair. In the event that the textual description of a
+ state differs from the state transition diagram, the textual descrip-
+ tion is to be considered authoritative.
+
+9.1. Server Initialization
+
+ When a server starts, it begins in STARTUP state. See section 9.3
+ below for details.
+
+9.2. Server State Transitions
+
+ Whenever a server makes a transition into a new state, it MUST record
+ the state and the time at which it entered that state in stable
+ storage. If communications is "ok", it MUST also send a STATE mes-
+ sage to its failover partner.
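+
+ A minimal sketch of this rule follows, assuming that a dictionary
+ stands in for stable storage and that a caller-supplied send_state
+ callback transmits the STATE message. Both are hypothetical, as is
+ the function name.
+
```python
import time


def enter_state(new_state, stable_storage, comms_ok, send_state, now=None):
    """Record a state transition and notify the partner if reachable."""
    entered = time.time() if now is None else now
    # Record the new state and its entry time. The dict is a stand-in
    # for real stable storage, which would have to survive a restart.
    stable_storage["state"] = new_state
    stable_storage["start-time-of-state"] = entered
    if comms_ok:
        # A STATE message is sent only when communications is "ok".
        send_state(new_state, entered)
    return entered
```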
+
+ Figure 9.2-1 is the diagram of the server state transitions. The
+ remainder of this section contains information important to the
+ understanding of that diagram.
+
+ The server stays in the current state until all of the actions speci-
+ fied on the state transition are complete. If communications fails
+ during one of the actions, the server simply stays in the current
+ state and attempts a transition whenever the conditions for a transi-
+ tion are later fulfilled.
+
+ In the state transition diagram below, the "+" or "-" in the upper
+ right corner of each state is a notation about whether communication
+ is ongoing with the other server.
+
+ The legend "responsive", "balanced", or "unresponsive" in each state
+ indicates whether the server is responsive to all DHCP client
+ requests, running in load balanced mode, or totally unresponsive in
+ the respective state. The terms "responsive" and "unresponsive" have
+ the obvious meanings, while "balanced" means that a DHCP server may
+ respond to all DHCPREQUEST messages that are RENEWAL or REBINDING,
+ and to all other client messages to which the load balancing
+ algorithm indicates that it MUST respond. See sections 5.3 and
+ 9.8.2 for details on load balancing.
+
+ Note that in situations where a server does not respond to a DHCP
+ client message, it MUST NOT remember any of the information from that
+ message.
+
+ In the state transition diagram below, when communication is reesta-
+ blished between the two servers, each must record the state of the
+ partner when communication was restored. State transitions on one
+ server in some cases imply state transitions on the partner server,
+ so a record of the current state of the partner server must be kept
+ by each server.
+
+ If the state of the partner changes while communicating, a server
+ moves through the communications-failed transition and into whatever
+ state results. It then immediately moves through whatever state
+ transition is appropriate given the current state of the partner
+ server. A server performing this operation SHOULD NOT close the TCP
+ connection to its partner.
+
+ DISCUSSION:
+
+ The point of this technique is simplicity, both in explanation of
+ the protocol and in its implementation. The alternative to this
+ technique of memory of partner state and automatic state transi-
+ tion on change of partner state is to have every state in the fol-
+ lowing diagram have a state transition for every possible state of
+ the partner. With the approach adopted, only the states in which
+ communications are reestablished require a state transition for
+ each possible partner state.
+
+ The current state of a server MUST be recorded in stable storage and
+ thus be available to the server after a server restart.
+
+ A transition into SHUTDOWN or PAUSED state is not represented in the
+ following figure, since other than sending that state to its partner,
+ the remaining actions involved look just like the server halting in
+ its otherwise current state, which then becomes the previous state
+ upon server restart.
+
+
+ [State diagram: the states and their legends are STARTUP
+ (unresponsive), RECOVER (unresponsive), RECOVER-WAIT (unresponsive),
+ RECOVER-DONE (unresponsive), PARTNER-DOWN (responsive),
+ POTENTIAL-CONFLICT (unresponsive), RESOLUTION-INTERRUPTED
+ (responsive), CONFLICT-DONE (responsive), NORMAL (balanced), and
+ COMMUNICATIONS-INTERRUPTED (responsive).]
+
+ Figure 9.2-1: Server state diagram.
+
+
+9.3. STARTUP state
+
+ The STARTUP state affords an opportunity for a server to probe its
+ partner server, before starting to service DHCP clients.
+
+ DISCUSSION:
+
+ Without the STARTUP state, a server would likely start in a state
+ derived from its previously stored state (held in stable storage),
+ if any. However, this may be inconsistent with the current state
+ of the partner. The STARTUP state affords the opportunity for a
+ server to potentially learn the partner's state and determine if
+ that state is consistent with its derived starting state or
+ whether some significant state change has occurred at the partner
+ that forces the server to start in another state. This is
+ especially critical if significant time has elapsed while the
+ server was down.
+
+
+9.3.1. Operation while in STARTUP state
+
+ Whenever a server is in STARTUP state, it MUST be unresponsive to
+ DHCP client requests, and so the time spent in the STARTUP state is
+ necessarily short, typically on the order of a few seconds to a few
+ tens of seconds. The exact time spent in the STARTUP state is imple-
+ mentation dependent, and the primary and secondary server are not
+ required to spend the same amount of time in the STARTUP state. See
+ section 5.9 for some guidelines on the time to spend in STARTUP
+ state.
+
+ Whenever a STATE message is sent to the partner while in STARTUP
+ state the STARTUP bit MUST be set in the server-flags option and the
+ previously recorded failover state MUST be placed in the server-state
+ option.
+
+
+9.3.2. Transition out of STARTUP state
+
+ Each server starts out in STARTUP state every time it initializes
+ itself, and performs the following algorithm as part of its initiali-
+ zation:
+
+ 1. Is there any record in stable storage of a previous failover
+ state? If yes, set previous-state to the last recorded state
+ in stable storage, and continue with step 2.
+
+ Is there any configuration information that indicates that
+ this server was previously running but lost its stable
+ storage? Such information must typically come from some
+ administrative intervention, since it is difficult for a
+ server to distinguish first startup from a startup after it
+ has lost its stable storage. If yes, then set the previous-
+ state to RECOVER, and set the time-of-failure to whatever time
+ was configured, and go on to step 2. This time-of-failure
+ will be used in the transition out of the RECOVER-WAIT state
+ into the RECOVER-DONE state, below.
+
+ If there is no record of any previous failover state in stable
+ storage for this server, then set the previous-state to
+ RECOVER and set the time-of-failure to a time earlier than the
+ current time minus the maximum-client-lead-time. If using
+ standard POSIX times, 0 would typically do quite well. This will
+ allow two servers which already have lease information to
+ synchronize themselves prior to operating.
+
+ Note that neither server is responsive to DHCP client requests
+ while in the RECOVER state. If both servers can communicate,
+ however, they will come out of the RECOVER state and progress
+ through RECOVER-WAIT to RECOVER-DONE and thence to NORMAL or
+ COMMUNICATIONS-INTERRUPTED state quickly. If both have state,
+ then they will exchange information. If only one has state,
+ then the one that does not will complete its update of its
+ partner quickly (since it has nothing to send).
+
+ In some cases, an existing server will be commissioned as a
+ failover server and brought back into operation where its
+ partner is not yet available. In this case, the newly commis-
+ sioned failover server will not operate until its partner
+ comes online -- but it has operational responsibilities as a
+ DHCP server nonetheless. To properly handle this situation, a
+ server SHOULD be configurable in such a way as to move
+ directly into PARTNER-DOWN state after the startup period
+ expires if it has been unable to contact its partner during
+ the startup period.
+
+ 2. If the previous state is one where communications was "OK",
+ then set the previous state to the state that is the result of
+ the communications failed state transition in Figure 9.2-1 (if
+ such transition is shown -- some states don't have a communi-
+ cations failed state transition, since they allow both commun-
+ ications OK and failed).
+
+ 3. Start the STARTUP state timer. The time that a server remains
+ in the STARTUP state (absent any communications with its
+ partner) is implementation dependent and SHOULD be
+ configurable. It SHOULD be long enough for a TCP connection
+ to be created to a heavily loaded partner across a slow net-
+ work.
+
+ 4. Attempt to create a TCP connection to the failover partner.
+ See section 8.2.
+
+ 5. Wait for "communications okay", i.e., the process discussed in
+ section 8.2 "Creating the TCP Connection", to complete,
+ including the receipt of a STATE message from the partner.
+
+ When and if communications become "okay", clear the STARTUP
+ flag, and set the current state to the previous-state.
+
+ If the partner is in PARTNER-DOWN state, and if the time at
+ which it entered PARTNER-DOWN state (as received in the
+ start-time-of-state option in the STATE message) is later than
+ the last recorded time of operation of this server, then set
+ the current state to RECOVER. If the time at which it entered
+ PARTNER-DOWN state is earlier than the last recorded time of
+ operation of this server, then set the current state to
+ POTENTIAL-CONFLICT.
+
+ Then, transition to the current state and take the "communica-
+ tions okay" state transition based on the current state of
+ this server and the partner.
+
+ 6. If the startup time expires, take an implementation dependent
+ action: The server MAY go to the previous-state, or the
+ server MAY wait.
+
+ Reasons to go to previous-state and begin processing:
+
+ If the current server is the only operational server, then if
+ it waits, there will be no operational DHCP servers. This
+ situation could occur very easily where one server fails and
+ then the other crashes and reboots. If the rebooting server
+ doesn't start processing DHCP client requests without first
+ being in communication with the other server, then the level
+ of DHCP redundancy is not particularly high. This is an
+ appropriate approach if the possibility of partition is low,
+ or if the safe period expiration time is well beyond the time
+ at which an operator would notice and react to a partition
+ situation. It is also quite appropriate if the safe period
+ will never expire.
+
+ Reasons to wait:
+
+ If the current server has been down for longer than the
+ maximum-client-lead-time, and it is partitioned from the other
+ server, then when it returns it will attempt to use its own
+ available addresses to allocate to new DHCP clients, and the
+ other server may well be in PARTNER-DOWN state and may have
+ already allocated some of those available addresses to DHCP
+ clients. In cases where the possibility of partition is high,
+ and the safe period expiration time is less than the likely
+ operator reaction time, this is a good approach to use.
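+
+ Steps 1 and 5 of the algorithm above can be sketched as two small
+ functions. This is an illustration, not a definitive implementation:
+ the function and parameter names are hypothetical, times are plain
+ numbers such as POSIX seconds, and step 2 (the communications failed
+ mapping from Figure 9.2-1) is omitted. The text does not say what to
+ do when the partner's PARTNER-DOWN entry time equals this server's
+ last recorded time of operation; the sketch assumes the
+ previous-state is kept in that case.
+
```python
def initial_previous_state(stored_state, lost_stable_storage,
                           configured_failure_time=None):
    """Step 1: derive (previous-state, time-of-failure) at startup.

    time-of-failure is None when a stored state exists.
    """
    if stored_state is not None:
        return stored_state, None
    if lost_stable_storage:
        # Configuration says stable storage was lost.
        return "RECOVER", configured_failure_time
    # First startup: POSIX time 0 is more than MCLT in the past.
    return "RECOVER", 0


def state_after_contact(previous_state, partner_state,
                        partner_state_start, last_operation_time):
    """Step 5: choose the current state once communications are okay."""
    if partner_state == "PARTNER-DOWN":
        if partner_state_start > last_operation_time:
            return "RECOVER"
        if partner_state_start < last_operation_time:
            return "POTENTIAL-CONFLICT"
        # Equal times are not covered by the text (assumption: fall
        # through and keep the previous-state).
    return previous_state
```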
+
+9.4. PARTNER-DOWN state
+
+ PARTNER-DOWN state is a state either server can enter. When in this
+ state, the server does not assume that the other server could still
+ be operating and servicing a different set of clients, but instead
+ assumes that it is the only server operating. If one server is in
+ PARTNER-DOWN state, the other server MUST NOT be operating.
+
+
+9.4.1. Upon entry to PARTNER-DOWN state
+
+ No special actions are required when entering PARTNER-DOWN state.
+
+ The server should continue to attempt to connect to the partner
+ periodically.
+
+
+9.4.2. Operation while in PARTNER-DOWN state
+
+ A server in PARTNER-DOWN state MUST respond to DHCP client requests.
+ It will allow renewal of all outstanding leases on IP addresses, and
+ will allocate IP addresses from its own pool, and after a fixed
+ period of time (the MCLT interval) has elapsed from entry into
+ PARTNER-DOWN state, it will allocate IP addresses from the set of all
+ available IP addresses.
+
+ Once a server has entered NORMAL state, the PARTNER-DOWN state is
+ entered only on command of an external agency (typically an adminis-
+ trator of some sort) or after the expiration of an externally config-
+ ured minimum safe-time after the beginning of COMMUNICATIONS-
+ INTERRUPTED state.
+
+ Any IP address tagged as available for allocation by the other server
+ (at entry to PARTNER-DOWN state) MUST NOT be allocated to a new
+ client until the maximum-client-lead-time beyond the entry into
+ PARTNER-DOWN state has elapsed.
+
+ A server in PARTNER-DOWN state MUST NOT allocate an IP address to a
+ DHCP client different from that to which it was allocated at the
+ entrance to PARTNER-DOWN state until the maximum-client-lead-time
+ beyond the maximum of the following times: client expiration time,
+ most recently transmitted potential-expiration-time, most recently
+ received ack of potential-expiration-time from the partner, and most
+ recently acked potential-expiration-time to the partner. See section
+ 7.1.5 for details. If this time would be earlier than the current
+ time plus the maximum-client-lead-time, then the time the server
+ entered PARTNER-DOWN state plus the maximum-client-lead-time is used.
+
+ Two options exist for lease times given out while in PARTNER-DOWN
+ state, with different ramifications flowing from each.
+
+ If the server wishes the Failover protocol to protect it from loss of
+ stable storage in PARTNER-DOWN state, then it should ensure that the
+ MCLT based lease time restrictions in section 5.1 are maintained,
+ even in PARTNER-DOWN state.
+
+ If the server wishes to forgo the protection of the Failover protocol
+ in the event of loss of stable storage, then it need recognize no
+ restrictions on actual client lease times while in PARTNER-DOWN
+ state.
+
+ A server in PARTNER-DOWN state MUST continue to attempt to establish
+ communications and synchronization with its partner.
+
+9.4.3. Transitions out of PARTNER-DOWN state
+
+ When a server in PARTNER-DOWN state succeeds in establishing a con-
+ nection to its partner, its actions are conditional on the state and
+ flags received in the STATE message from the other server as part of
+ the process of establishing the connection.
+
+ If the STARTUP bit is set in the server-flags option of a received
+ STATE message, a server in PARTNER-DOWN state MUST NOT take any state
+ transitions based on reestablishing communications. Essentially, if a
+ server is in PARTNER-DOWN state, it ignores all STATE messages from
+ its partner that have the STARTUP bit set in the server-flags option
+ of the STATE message.
+
+ If the STARTUP bit is not set in the server-flags option of a STATE
+ message received from its partner, then a server in PARTNER-DOWN
+ state takes the following actions based on the value of the server-
+ state option in the received STATE message (either immediately after
+ establishing communications or at any time later when a new state is
+ received):
+
+ o partner in NORMAL, COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN,
+ POTENTIAL-CONFLICT, RESOLUTION-INTERRUPTED, or CONFLICT-DONE
+ state
+
+ transition to POTENTIAL-CONFLICT state
+
+ o partner in RECOVER, RECOVER-WAIT, SHUTDOWN, or PAUSED state
+
+ stay in PARTNER-DOWN state
+
+ o partner in RECOVER-DONE state
+
+ transition into NORMAL state
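+
+ The transition rules of this section can be written as a lookup, as
+ sketched here. The function and constant names are hypothetical; the
+ state names and the handling of the STARTUP bit follow the text
+ above.
+
```python
TO_POTENTIAL_CONFLICT = {
    "NORMAL", "COMMUNICATIONS-INTERRUPTED", "PARTNER-DOWN",
    "POTENTIAL-CONFLICT", "RESOLUTION-INTERRUPTED", "CONFLICT-DONE",
}
STAY_IN_PARTNER_DOWN = {"RECOVER", "RECOVER-WAIT", "SHUTDOWN", "PAUSED"}


def next_state_from_partner_down(partner_state, startup_bit):
    """Return the state a PARTNER-DOWN server moves to on a STATE message."""
    if startup_bit:
        # STATE messages with the STARTUP bit set are ignored.
        return "PARTNER-DOWN"
    if partner_state in TO_POTENTIAL_CONFLICT:
        return "POTENTIAL-CONFLICT"
    if partner_state == "RECOVER-DONE":
        return "NORMAL"
    if partner_state in STAY_IN_PARTNER_DOWN:
        return "PARTNER-DOWN"
    raise ValueError("unknown partner state: " + partner_state)
```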
+
+9.5. RECOVER state
+
+ This state indicates that the server has no information in its stable
+ storage or that it is re-integrating with a server in PARTNER-DOWN
+ state after it has been down. A server in this state MUST attempt to
+ refresh its stable storage from the other server.
+
+9.5.1. Operation in RECOVER state
+
+ A server in RECOVER MUST NOT respond to DHCP client requests.
+
+ A server in RECOVER state will attempt to reestablish communications
+ with the other server.
+
+9.5.2. Transitions out of RECOVER state
+
+ If the other server is in POTENTIAL-CONFLICT, RESOLUTION-INTERRUPTED,
+ or CONFLICT-DONE state when communications are reestablished, then
+ the server in RECOVER state will move to POTENTIAL-CONFLICT state
+ itself.
+
+ If the other server is in any other state, then the server in RECOVER
+ state will request an update of missing binding information. If the
+ server has been instructed (through configuration or other external
+ agency) that it has lost its stable storage, or if it has deduced
+ that from the fact that it has no record of ever having talked to its
+ partner while its partner does have a record of communicating with
+ it, it MUST send an UPDREQALL message; otherwise it MUST send an
+ UPDREQ message. See Figure 9.5.2-1.
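+
+ The choice made by a server in RECOVER state when communications are
+ reestablished can be sketched as follows. The function name, the
+ parameter names, and the tuple return values are hypothetical; how a
+ server learns whether its partner has a record of it is not shown.
+
```python
CONFLICT_STATES = {
    "POTENTIAL-CONFLICT", "RESOLUTION-INTERRUPTED", "CONFLICT-DONE",
}


def recover_action(partner_state, told_storage_lost,
                   have_record_of_partner, partner_has_record_of_us):
    """Decide what a server in RECOVER state does on reconnection."""
    if partner_state in CONFLICT_STATES:
        return ("transition", "POTENTIAL-CONFLICT")
    # Loss of stable storage is deduced when the partner remembers
    # this server but this server has no record of the partner.
    deduced_loss = (not have_record_of_partner) and partner_has_record_of_us
    if told_storage_lost or deduced_loss:
        return ("send", "UPDREQALL")
    return ("send", "UPDREQ")
```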
+
+ It will wait for an UPDDONE message, and upon receipt of that message
+ it will transition to RECOVER-WAIT state.
+
+ If communications fails during the reception of the results of the
+ UPDREQ or UPDREQALL message, the server will remain in RECOVER state,
+ and will re-issue the UPDREQ or UPDREQALL when communications are
+ re-established. (See section 5.17).
+
+ If an UPDDONE message isn't received within an implementation depen-
+ dent amount of time, and no BNDUPD messages are being received, the
+ connection SHOULD be dropped.
+
+
+
+
+ A B
+ Server Server
+
+ | |
+ RECOVER PARTNER-DOWN
+ | |
+ | >--UPDREQ--------------------> |
+ | |
+ | <---------------------BNDUPD--< |
+ | >--BNDACK--------------------> |
+ ... ...
+ | |
+ | <---------------------BNDUPD--< |
+ | >--BNDACK--------------------> |
+ | |
+ | <--------------------UPDDONE--< |
+ | |
+ RECOVER-WAIT |
+ | |
+ | >--STATE-(RECOVER-WAIT)------> |
+ | |
+ | |
+ Wait MCLT from last known |
+ time of failover operation |
+ | |
+ RECOVER-DONE |
+ | |
+ | >--STATE-(RECOVER-DONE)------> |
+ | NORMAL
+ | <-------------(NORMAL)-STATE--< |
+ NORMAL |
+ | >--STATE-(NORMAL)------------> |
+ | |
+ | |
+
+ Figure 9.5.2-1: Transition out of RECOVER state
+
+
+ If, at any time while a server is in RECOVER state, communications
+ fails, the server will stay in RECOVER state. When communications are
+ restored, it will restart the process of transitioning out of RECOVER
+ state.
+
+9.6. RECOVER-WAIT state
+
+ This state indicates that the server has done an UPDREQ or UPDREQALL
+ and has received the UPDDONE message indicating that it has received
+ all outstanding binding update information. In the RECOVER-WAIT
+ state the server will wait for the MCLT in order to ensure that any
+ processing that this server might have done prior to losing its
+ stable storage will not cause future difficulties.
+
+9.6.1. Operation in RECOVER-WAIT state
+
+ A server in RECOVER-WAIT MUST NOT respond to DHCP client requests.
+
+9.6.2. Transitions out of RECOVER-WAIT state
+
+ Upon entry to RECOVER-WAIT state the server MUST start a timer whose
+ expiration is set to a time equal to the time the server went down
+ (if known) or the time the server started (if the down-time is
+ unknown) plus the maximum-client-lead-time. When this timer goes
+ off, the server will transition into RECOVER-DONE state.
+
+ This allows clients holding IP addresses that this server allocated
+ prior to the loss of its client binding information in stable storage
+ either to contact the other server or to have their leases time out.
+
+ If this is the first time this server has run failover -- as
+ determined by the information received from the partner, not
+ necessarily only as determined by this server's stable storage (as
+ that may have been lost), then the waiting time discussed above may
+ be skipped, and the server may transition immediately to RECOVER-DONE
+ state.
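+
+ The timer computation can be sketched as follows, assuming times are
+ plain numbers such as POSIX seconds. The function and parameter names
+ are hypothetical. When the server is known to be running failover for
+ the first time, the sketch returns the start time, so that the wait
+ is effectively skipped.
+
```python
def recover_wait_expiry(mclt, start_time, down_time=None,
                        first_failover_run=False):
    """Return the time at which the RECOVER-WAIT timer expires."""
    if first_failover_run:
        # The wait may be skipped when failover has never run before.
        return start_time
    # Use the time the server went down if known, else its start time.
    base = down_time if down_time is not None else start_time
    return base + mclt
```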
+
+ See Figure 9.5.2-1.
+
+ DISCUSSION:
+
+ The actual requirement on this wait period in RECOVER is that it
+ start no earlier than the time the recovering server went down, not
+ necessarily when it came back up. If the time when the recovering server
+ failed is known, it could be communicated to the recovering server
+ (perhaps through actions of the network administrator), and the
+ wait period could be reduced to the maximum-client-lead-time less
+ the difference between the current time and the time the server
+ failed. In this way, the waiting period could be minimized.
+ Various heuristics could be used to estimate this time. For
+ example, if the recovering server periodically updates stable
+ storage with a time stamp, the wait period could be calculated to
+ start at the time of the last update of stable storage plus the
+ time required for the next update (which never occurred). This
+ estimate is later than the time the server went down, but probably
+ not too much later.
+
+ If the server has never before run failover, then there is no need
+ to wait in this state -- but, again, to determine if this server
+ has run failover it is vital that the information provided by the
+ partner be utilized, since the stable storage of this server may
+ have been lost.
+
+ If communications fails while a server is in RECOVER-WAIT state, it
+ has no effect on the operation of this state. The server SHOULD
+ continue to operate its timer, and if the timer goes off during the
+ period where communications with the other server have failed, then
+ the server SHOULD transition to RECOVER-DONE state. This is rare --
+ failover state transitions are not usually made while communications
+ are interrupted, but in this case there is no reason to inhibit the
+ timer. A server MAY stay in RECOVER-WAIT state even after expiry of
+ the timer and transition to RECOVER-DONE state upon re-establishing
+ communications with the partner if desired. The key point here is to
+ allow the timer to continue to operate, not whether or not the state
+ transition is made before or after communications are re-established.
+
+
+9.7. RECOVER-DONE state
+
+ This state exists to allow an interlocked transition for one server
+ from RECOVER state and another server from PARTNER-DOWN or
+ COMMUNICATIONS-INTERRUPTED state into NORMAL state.
+
+9.7.1. Operation in RECOVER-DONE state
+
+ A server in RECOVER-DONE state MUST respond only to
+ DHCPREQUEST/RENEWAL and DHCPREQUEST/REBINDING DHCP messages.
+
+9.7.2. Transitions out of RECOVER-DONE state
+
+ When a server in RECOVER-DONE state determines that its partner
+ server has entered NORMAL or RECOVER-DONE state, then it will transi-
+ tion into NORMAL state.
+
+ If communications fails while in RECOVER-DONE state, a server will
+ stay in RECOVER-DONE state.
+
+
+9.8. NORMAL state
+
+ NORMAL state is the state used by a server when it is communicating
+ with the other server, and any required resynchronization has been
+ performed. While some bindings database synchronization is performed
+ in NORMAL state, potential conflicts are resolved prior to entry into
+ NORMAL state as is binding database data loss.
+
+
+9.8.1. Upon entry to NORMAL state
+
+ When entering NORMAL state, a server will send to the other server
+ all currently unacknowledged binding updates as BNDUPD messages.
+
+ When the above process is complete, if the server entering NORMAL
+ state is a secondary server, then it will request IP addresses for
+ allocation using the POOLREQ message.
+
+
+9.8.2. Processing DHCP client requests and load balancing
+
+ In NORMAL state, a server MUST process every DHCPREQUEST/RENEWAL or
+ DHCPREQUEST/REBINDING request it receives. It processes other
+ requests only for those clients dictated by the load balancing
+ algorithm specified in [RFC 3074].
+
+ As discussed in section 5.3, each server will take the client-
+ identifier from each DHCP client request (or the client-hardware-
+ address, i.e., the chaddr if no client-identifier is present in the
+ request) and use it as the 'Request ID' specified in [RFC 3074].
+ After applying the algorithm specified in [RFC 3074] and comparing
+ the result with the hash bucket assignment (performed during connect
+ processing between failover servers), each failover server will be
+ able to unambiguously determine if it should process the DHCP client
+ request.
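+
+ A minimal sketch of this decision is shown below. It does not
+ reproduce the hash defined in [RFC 3074]; the hash is passed in as a
+ parameter, and every name in the sketch is a hypothetical chosen for
+ illustration.

```python
from typing import Callable, Optional, Set

# Message kinds that a server in NORMAL state always processes.
RENEW_OR_REBIND = {"DHCPREQUEST/RENEWAL", "DHCPREQUEST/REBINDING"}


def request_id(client_identifier: Optional[bytes], chaddr: bytes) -> bytes:
    """Pick the 'Request ID': client-identifier if present, else chaddr."""
    return client_identifier if client_identifier else chaddr


def should_process(message_kind: str,
                   client_identifier: Optional[bytes],
                   chaddr: bytes,
                   my_buckets: Set[int],
                   bucket_hash: Callable[[bytes], int]) -> bool:
    """Decide whether this server answers a client request in NORMAL state.

    bucket_hash stands in for the RFC 3074 hash (an assumption of this
    sketch); my_buckets is the hash bucket assignment for this server.
    """
    if message_kind in RENEW_OR_REBIND:
        return True
    bucket = bucket_hash(request_id(client_identifier, chaddr))
    return bucket in my_buckets
```

+ Because both servers apply the same hash to the same Request ID and
+ hold complementary bucket assignments, exactly one of them answers
+ each request that is not a renewal or rebinding.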
+
+9.8.3. Operation in NORMAL state
+
+ When in NORMAL state, for every DHCP client request that it
+ processes, as determined by the algorithm described in section 9.8.2,
+ above, a server will operate in the following manner:
+
+ o Lease time calculations
+
+ As discussed in section 5.2.1, "Control of lease time", the
+ lease interval given to a DHCP client can never be more than the
+ MCLT greater than the most recently received potential-
+ expiration-time from the failover partner or the current time,
+ whichever is later.
+
+ As long as a server adheres to this constraint, the specifics of
+ the lease interval that it gives to a DHCP client or the value
+ of the potential-expiration-time sent to its failover partner
+ are implementation dependent. One possible approach is dis-
+ cussed in section 5.2.1, but that particular approach is in no
+ way required by this protocol.
+
+ See section 7.1.5 for details concerning the storage of time
+ associated with IP addresses and how to use these times when
+ calculating lease times for DHCP clients.
+
+ o Lazy update of partner server
+
+ After a DHCPACK of an IP address binding, the server servicing a
+ DHCP client request attempts to update its partner with the new
+ binding information. The lease time used in the update of the
+ secondary MUST be at least that given to the DHCP client in the
+ DHCPACK, and the potential-expiration-time MUST be at least the
+ lease time, and SHOULD be considerably longer.
+
+ o Reallocation of IP addresses between clients
+
+ Whenever a client binding is released or expires, a BNDUPD mes-
+ sage must be sent to the partner, setting the binding state to
+ RELEASED or EXPIRED. However, until a BNDACK is received for
+ this message, the IP address cannot be allocated to another
+ client. It cannot be allocated to the same client again if a
+ BNDUPD was sent, otherwise it can. See section 5.2.2.
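+
+ The lease time constraint in the first item above can be sketched as
+ follows. Times are seconds since an arbitrary epoch, and the
+ function and parameter names are hypothetical.

```python
def max_lease_expiration(now: float,
                         partner_potential_expiration: float,
                         mclt: float) -> float:
    """Latest expiration time a server may give to a DHCP client.

    The bound is the MCLT beyond the later of the current time and the
    most recently received potential-expiration-time from the partner.
    """
    return max(partner_potential_expiration, now) + mclt


def clamp_lease_interval(desired_interval: float,
                         now: float,
                         partner_potential_expiration: float,
                         mclt: float) -> float:
    """Shorten the desired lease interval so that it respects the bound."""
    limit = max_lease_expiration(now, partner_potential_expiration, mclt) - now
    return min(desired_interval, limit)
```

+ For a new binding with no potential-expiration-time from the partner
+ yet, the client receives at most the MCLT; later renewals can grow
+ toward the desired lease interval.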
+
+ In NORMAL state, each server receives binding updates from its
+ partner server in BNDUPD messages. It records these in its client
+ binding database in stable storage and then sends a corresponding
+ BNDACK message to its partner server. It MUST ensure that the infor-
+ mation is recorded in stable storage prior to sending the BNDACK mes-
+ sage back to its partner.
+
+
+9.8.4. Transitions out of NORMAL state
+
+ If an external command is received by a server in NORMAL state
+ informing it that its partner is down, then transition into PARTNER-
+ DOWN state. Generally, this would be an unusual situation, where
+ some external agency knew the partner server was down. Using the
+ command in this case would be appropriate if the polling interval and
+ timeout were long.
+
+ If a server in NORMAL state fails to receive acks to messages sent to
+ its partner for an implementation dependent period of time, it MAY
+ move into COMMUNICATIONS-INTERRUPTED state. This situation might
+ occur if the partner server was capable of maintaining the TCP con-
+ nection between the servers and also capable of sending a CONTACT mes-
+ sage every tSend seconds, but was (for some reason) incapable of pro-
+ cessing BNDUPD messages.
+
+ If the communications is determined to not be "ok" (as defined in
+ section 8), then transition into COMMUNICATIONS-INTERRUPTED state.
+
+ If a server in NORMAL state receives any messages from its partner
+ where the partner has changed state from that expected by the server
+ in NORMAL state, then the server should transition into
+ COMMUNICATIONS-INTERRUPTED state and take the appropriate state tran-
+ sition from there. For example, it would be expected for the partner
+ to transition from POTENTIAL-CONFLICT into NORMAL state, but not for
+ the partner to transition from NORMAL into POTENTIAL-CONFLICT state.
+
+ If a server in NORMAL state receives any messages from its partner
+ where the partner has changed into PAUSED state, the server should
+ transition into COMMUNICATIONS-INTERRUPTED state. If a server in
+ NORMAL state receives any messages from its partner where the partner
+ has changed into SHUTDOWN state, the server should transition into
+ PARTNER-DOWN state.
+
+9.9. COMMUNICATIONS-INTERRUPTED State
+
+ A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is
+ unable to communicate with the other server. Primary and secondary
+ servers cycle automatically (without administrative intervention)
+ between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network
+ connection between them fails and recovers, or as the partner server
+ cycles between operational and non-operational. No duplicate IP
+ address allocation can occur while the servers cycle between these
+ states.
+
+
+9.9.1. Upon entry to COMMUNICATIONS-INTERRUPTED state
+
+ When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been
+ configured to support an automatic transition out of COMMUNICATIONS-
+ INTERRUPTED state and into PARTNER-DOWN state (i.e., a "safe period"
+ has been configured, see section 10), then a timer MUST be started
+ for the length of the configured safe period.
+
+ A server transitioning into the COMMUNICATIONS-INTERRUPTED state from
+ the NORMAL state SHOULD raise some alarm condition to alert adminis-
+ trative staff to a potential problem in the DHCP subsystem.
+
+
+9.9.2. Operation in COMMUNICATIONS-INTERRUPTED State
+
+ In this state a server MUST respond to all DHCP client requests, and
+ the algorithm for load balancing described in section 5.3 MUST NOT be
+ used. When allocating new IP addresses, each server allocates from
+ its own IP address pool, where the primary MUST allocate only FREE IP
+ addresses, and the secondary MUST allocate only BACKUP IP addresses.
+ When responding to renewal requests, each server will allow continued
+ renewal of a DHCP client's current lease on an IP address irrespec-
+ tive of whether that lease was given out by the receiving server or
+ not, although the renewal period MUST NOT exceed the maximum client
+ lead time (MCLT) beyond the latest of: 1) the potential-expiration-
+ time already acknowledged by the other server, or 2) the lease-
+ expiration-time, or 3) the potential-expiration-time received from
+ the partner server.
+
+ However, since the server cannot communicate with its partner in this
+ state, the acknowledged-potential-expiration time will not be updated
+ in any new bindings. This is likely to eventually cause the actual-
+ client-lease-times to be the current time plus the maximum-client-
+ lead-time (unless this is greater than the desired-client-lease-
+ time).
+
+ The server should continue to try to establish a connection with its
+ partner.
+
+
+9.9.3. Transition out of COMMUNICATIONS-INTERRUPTED State
+
+ If the safe period timer expires while a server is in the
+ COMMUNICATIONS-INTERRUPTED state, it will transition immediately into
+ PARTNER-DOWN state.
+
+ If an external command is received by a server in COMMUNICATIONS-
+ INTERRUPTED state informing it that its partner is down, it will
+ transition immediately into PARTNER-DOWN state.
+
+ If communications is restored with the other server, then the server
+ in COMMUNICATIONS-INTERRUPTED state will transition into another
+ state based on the state of the partner:
+
+ o partner in NORMAL or COMMUNICATIONS-INTERRUPTED
+
+ The partner SHOULD NOT be in NORMAL state here, since upon res-
+ toration of communications it MUST have created a new TCP con-
+ nection which would have forced it into COMMUNICATIONS-
+ INTERRUPTED state. Still, we should account for every state
+ just in case.
+
+ Transition into the NORMAL state.
+
+ o partner in RECOVER
+
+ Stay in COMMUNICATIONS-INTERRUPTED state.
+
+ o partner in RECOVER-DONE
+
+ Transition into NORMAL state.
+
+ o partner in PARTNER-DOWN, POTENTIAL-CONFLICT, CONFLICT-DONE, or
+ RESOLUTION-INTERRUPTED
+
+ Transition into POTENTIAL-CONFLICT state.
+
+ o partner in PAUSED
+
+ Stay in COMMUNICATIONS-INTERRUPTED state.
+
+ o partner in SHUTDOWN
+
+ Transition into PARTNER-DOWN state.
+
+ The following figure illustrates the transition from NORMAL to
+ COMMUNICATIONS-INTERRUPTED state and then back to NORMAL state again.
+
+          Primary                                 Secondary
+           Server                                  Server
+
+           NORMAL                                  NORMAL
+             | >--CONTACT------------------->         |
+             |        <--------------------CONTACT--< |
+             |         [TCP connection broken]        |
+      COMMUNICATIONS             :             COMMUNICATIONS
+        INTERRUPTED              :               INTERRUPTED
+             |      [attempt new TCP connection]      |
+             |         [connection succeeds]          |
+             |                                        |
+             | >--CONNECT------------------->         |
+             |        <-----------------CONNECTACK--< |
+             |                                     NORMAL
+             |        <-------------------STATE-----< |
+           NORMAL                                     |
+             | >--STATE--------------------->         |
+             |                                        |
+             | >--BNDUPD-------------------->         |
+             |        <---------------------BNDACK--< |
+             |                                        |
+             |        <---------------------BNDUPD--< |
+             | >------BNDACK---------------->         |
+            ...                                      ...
+             |                                        |
+             |        <--------------------POOLREQ--< |
+             | >--POOLRESP-(2)-------------->         |
+             |                                        |
+             | >--BNDUPD-(#1)--------------->         |
+             |        <---------------------BNDACK--< |
+             |                                        |
+             |        <--------------------POOLREQ--< |
+             | >--POOLRESP-(0)-------------->         |
+             |                                        |
+             | >--BNDUPD-(#2)--------------->         |
+             |        <---------------------BNDACK--< |
+             |                                        |
+
+ Figure 9.9.3-1: Transition from NORMAL to COMMUNICATIONS-
+ INTERRUPTED and back (example with 2
+ addresses allocated to secondary)
+
+9.10. POTENTIAL-CONFLICT state
+
+ This state indicates that the two servers are attempting to re-
+ integrate with each other, but at least one of them was running in a
+ state that did not guarantee automatic reintegration would be
+ possible. In POTENTIAL-CONFLICT state the servers may determine that
+ the same IP address has been offered and accepted by two different
+ DHCP clients.
+
+ It is a goal of this protocol to minimize the possibility that
+ POTENTIAL-CONFLICT state is ever entered.
+
+9.10.1. Upon entry to POTENTIAL-CONFLICT state
+
+ When a primary server enters POTENTIAL-CONFLICT state it should
+ request that the secondary send it all updates of which it is
+ currently unaware by sending an UPDREQ message to the secondary
+ server.
+
+ A secondary server entering POTENTIAL-CONFLICT state will wait for
+ the primary to send it an UPDREQ message.
+
+9.10.2. Operation in POTENTIAL-CONFLICT state
+
+ Any server in POTENTIAL-CONFLICT state MUST NOT process any incoming
+ DHCP requests.
+
+
+9.10.3. Transitions out of POTENTIAL-CONFLICT state
+
+ If communications fails with the partner while in POTENTIAL-CONFLICT
+ state, then the server will transition to RESOLUTION-INTERRUPTED
+ state.
+
+ Whenever either server receives an UPDDONE message from its partner
+ while in POTENTIAL-CONFLICT state, it MUST transition to a new state.
+ The primary MUST transition to CONFLICT-DONE state, and the secondary
+ MUST transition to NORMAL state. This will cause the primary server
+ to leave POTENTIAL-CONFLICT state prior to the secondary, since the
+ primary sends an UPDREQ message and receives an UPDDONE before the
+ secondary sends an UPDREQ message and receives its UPDDONE message.
+
+ When a secondary server receives an indication that the primary
+ server has made a transition from POTENTIAL-CONFLICT to CONFLICT-DONE
+ state, it SHOULD send an UPDREQ message to the primary server.
+
+          Primary                                 Secondary
+           Server                                  Server
+
+             |                                        |
+    POTENTIAL-CONFLICT                       POTENTIAL-CONFLICT
+             |                                        |
+             | >--UPDREQ-------------------->         |
+             |                                        |
+             |        <---------------------BNDUPD--< |
+             | >--BNDACK-------------------->         |
+            ...                                      ...
+             |                                        |
+             |        <---------------------BNDUPD--< |
+             | >--BNDACK-------------------->         |
+             |                                        |
+             |        <--------------------UPDDONE--< |
+       CONFLICT-DONE                                  |
+             | >--STATE--(CONFLICT-DONE)---->         |
+             |        <---------------------UPDREQ--< |
+             |                                        |
+             | >--BNDUPD-------------------->         |
+             |        <---------------------BNDACK--< |
+            ...                                      ...
+             | >--BNDUPD-------------------->         |
+             |        <---------------------BNDACK--< |
+             |                                        |
+             | >--UPDDONE------------------->         |
+             |                                     NORMAL
+             |        <------------STATE--(NORMAL)--< |
+           NORMAL                                     |
+             | >--STATE--(NORMAL)----------->         |
+             |                                        |
+             |        <--------------------POOLREQ--< |
+             | >------POOLRESP-(n)---------->         |
+             |        addresses                       |
+
+ Figure 9.10.3-1: Transition out of POTENTIAL-CONFLICT
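+
+ The transitions out of POTENTIAL-CONFLICT state described in this
+ section can be sketched as a small function. The role and event
+ strings are hypothetical labels chosen for this illustration.

```python
def potential_conflict_transition(role: str, event: str) -> str:
    """Next state for a server in POTENTIAL-CONFLICT state.

    role is "primary" or "secondary"; event is "UPDDONE" (an UPDDONE
    message arrived from the partner) or "COMMS_FAILED". Any other
    event leaves the server in POTENTIAL-CONFLICT state.
    """
    if event == "COMMS_FAILED":
        return "RESOLUTION-INTERRUPTED"
    if event == "UPDDONE":
        if role == "primary":
            return "CONFLICT-DONE"
        if role == "secondary":
            return "NORMAL"
        raise ValueError("role must be 'primary' or 'secondary'")
    return "POTENTIAL-CONFLICT"
```

+ The asymmetry between the two roles is what makes the primary leave
+ POTENTIAL-CONFLICT state before the secondary.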
+
+9.11. RESOLUTION-INTERRUPTED state
+
+ This state indicates that the two servers were attempting to re-
+ integrate with each other in POTENTIAL-CONFLICT state, but
+ communications failed prior to completion of re-integration.
+
+ If the servers remained in POTENTIAL-CONFLICT while communications
+ was interrupted, neither server would be responsive to DHCP client
+ requests, and if one server had crashed, then there might be no
+ server able to process DHCP requests.
+
+9.11.1. Upon entry to RESOLUTION-INTERRUPTED state
+
+ When a server enters RESOLUTION-INTERRUPTED state it SHOULD raise an
+ alarm condition to alert administrative staff of a problem in the
+ DHCP subsystem.
+
+9.11.2. Operation in RESOLUTION-INTERRUPTED state
+
+ In this state a server MUST respond to all DHCP client requests, and
+ any load balancing (described in section 5.3) MUST NOT be used. When
+ allocating new IP addresses, each server SHOULD allocate from its own
+ IP address pool (if that can be determined), where the primary SHOULD
+ allocate only FREE IP addresses, and the secondary SHOULD allocate
+ only BACKUP IP addresses. When responding to renewal requests, each
+ server will allow continued renewal of a DHCP client's current lease
+ on an IP address irrespective of whether that lease was given out by
+ the receiving server or not, although the renewal period MUST NOT
+ exceed the maximum client lead time (MCLT) beyond the latest of: 1)
+ the potential-expiration-time already acknowledged by the other
+ server or 2) the lease-expiration-time or 3) the potential-
+ expiration-time received from the partner server.
+
+ However, since the server cannot communicate with its partner in this
+ state, the acknowledged-potential-expiration time will not be updated
+ in any new bindings.
+
+
+9.11.3. Transitions out of RESOLUTION-INTERRUPTED state
+
+ If an external command is received by a server in RESOLUTION-
+ INTERRUPTED state informing it that its partner is down, it will
+ transition immediately into PARTNER-DOWN state.
+
+ If communications is restored with the other server, then the server
+ in RESOLUTION-INTERRUPTED state will transition into POTENTIAL-
+ CONFLICT state.
+
+9.12. CONFLICT-DONE state
+
+ This state indicates that during the process where the two servers
+ are attempting to re-integrate with each other, the primary server
+ has received all of the updates from the secondary server. It makes
+ a transition into CONFLICT-DONE state in order that it may be totally
+ responsive to the client load, as opposed to NORMAL state where it
+ would be in a "balanced" responsive state, running the load balancing
+ algorithm.
+
+9.12.1. Upon entry to CONFLICT-DONE state
+
+ A secondary server should never enter CONFLICT-DONE state.
+
+9.12.2. Operation in CONFLICT-DONE state
+
+ A primary server in CONFLICT-DONE state is fully responsive to all
+ DHCP clients (similar to the situation in COMMUNICATIONS-INTERRUPTED
+ state).
+
+ If communications fails, remain in CONFLICT-DONE state. If communi-
+ cations becomes OK, remain in CONFLICT-DONE state until the condi-
+ tions for transition out become satisfied.
+
+
+9.12.3. Transitions out of CONFLICT-DONE state
+
+ If communications fails with the partner while in CONFLICT-DONE
+ state, then the server will remain in CONFLICT-DONE state.
+
+ When a primary server determines that the secondary server has made a
+ transition into NORMAL state, the primary server will also transition
+ into NORMAL state.
+
+9.13. PAUSED state
+
+ This state exists to allow one server to inform another that it will
+ be out of service for what is predicted to be a relatively short
+ time, and to allow the other server to transition to COMMUNICATIONS-
+ INTERRUPTED state immediately and to begin servicing all DHCP clients
+ with no interruption in service to new DHCP clients.
+
+ A server which is aware that it is shutting down temporarily SHOULD
+ send a STATE message with the server-state option containing PAUSED
+ state and close the TCP connection.
+
+ While a server may or may not transition internally into PAUSED
+ state, the 'previous' state determined when it is restarted MUST be
+ the state the server was in prior to receiving the command to shut
+ down and restart and which precedes its entry into the PAUSED state.
+ See section 9.3.2 concerning the use of the previous state upon
+ server restart.
+
+9.13.1. Upon entry to PAUSED state
+
+ When entering PAUSED state, the server MUST store the previous state
+ in stable storage, and use that state as the previous state when it
+ is restarted.
+
+9.13.2. Transitions out of PAUSED state
+
+ A server makes a transition out of PAUSED state by being restarted.
+ At that time, the previous state MUST be the state the server was in
+ prior to entering the PAUSED state.
+
+
+9.14. SHUTDOWN state
+
+ This state exists to allow one server to inform another that it will
+ be out of service for what is predicted to be a relatively long time,
+ and to allow the other server to transition immediately to PARTNER-
+ DOWN state, and take over completely for the server going down.
+
+9.14.1. Upon entry to SHUTDOWN state
+
+ When entering SHUTDOWN state, the server MUST record the previous
+ state in stable storage for use when the server is restarted. It
+ also MUST record the current time as the last time operational.
+
+ A server which is aware that it is shutting down SHOULD send a STATE
+ message with the server-state option containing SHUTDOWN.
+
+9.14.2. Operation in SHUTDOWN state
+
+ A server in SHUTDOWN state MUST NOT respond to any DHCP client input.
+
+ If a server receives any message indicating that the partner has
+ moved to PARTNER-DOWN state while it is in SHUTDOWN state then it
+ MUST record RECOVER state as the previous state to be used when it is
+ restarted.
+
+ A server SHOULD wait for a few seconds after informing the partner of
+ entry into SHUTDOWN state (if communications are okay) to determine
+ if the partner entered PARTNER-DOWN state.
+
+9.14.3. Transitions out of SHUTDOWN state
+
+ A server makes a transition out of SHUTDOWN state by being restarted.
+
+10. Safe Period
+
+ Due to the restrictions imposed on each server while in
+ COMMUNICATIONS-INTERRUPTED state, long-term operation in this state
+ is not feasible for either server. One reason that this state
+ exists at all is to allow the servers to easily survive transient
+ network communications failures of a few minutes to a few days
+ (although the actual time periods will depend a great deal on the
+ DHCP activity of the network in terms of arrival and departure of
+ DHCP clients on the network).
+
+ Eventually, when the servers are unable to communicate, they will
+ have to move into a state where they no longer can re-integrate
+ without some possibility of a duplicate IP address allocation. There
+ are two ways that they can move into this state (known as PARTNER-
+ DOWN).
+
+ First, they can be informed by external command that, indeed, the
+ partner server is down. In this case, there is no difficulty in mov-
+ ing into the PARTNER-DOWN state since it is an accurate reflection of
+ reality and the protocol has been designed to operate correctly (even
+ during reintegration) as long as, when in PARTNER-DOWN state the
+ partner is, indeed, down.
+
+ The more difficult scenario is when the servers are running unat-
+ tended for extended periods, and in this case an option is provided
+ to configure something called a "safe-period" into each server. This
+ OPTIONAL safe-period is the period after which either the primary or
+ secondary server will automatically transition to PARTNER-DOWN from
+ COMMUNICATIONS-INTERRUPTED state. If this transition is completed
+ and the partner is not down, then the possibility of duplicate IP
+ address allocations will exist.
+
+ The goal of the "safe-period" is to allow network operations staff
+ some time to react to a server moving into COMMUNICATIONS-INTERRUPTED
+ state. During the safe-period the only requirement is that the net-
+ work operations staff determine if both servers are still running --
+ and if they are, to either fix the network communications failure
+ between them, or to take one of the servers down before the expira-
+ tion of the safe-period.
+
+ The length of the safe-period is installation dependent, and depends
+ in large part on the number of unallocated IP addresses within the
+ subnet address pool and the expected frequency of arrival of
+ previously unknown DHCP clients requiring IP addresses. Many
+ environments should be able to support safe-periods of several days.
+
+ During this safe period, either server will allow renewals from any
+ existing client. The only limitation concerns the need for IP
+ addresses for the DHCP server to hand out to new DHCP clients and the
+ need to re-allocate IP addresses to different DHCP clients.
+
+ The number of "extra" IP addresses required is equal to the expected
+ total number of new DHCP clients encountered during the safe period.
+ This is dependent only on the arrival rate of new DHCP clients, not
+ the total number of outstanding leases on IP addresses.
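+
+ As a rough worked example of this relationship, the sketch below
+ estimates the number of extra addresses for a given safe period, and
+ the longest safe period a given number of spare addresses supports.
+ The function names and the per-hour arrival rate are assumptions made
+ for illustration.

```python
import math


def extra_addresses_needed(new_clients_per_hour: float,
                           safe_period_hours: float) -> int:
    """Expected number of new DHCP clients during the safe period."""
    return math.ceil(new_clients_per_hour * safe_period_hours)


def max_safe_period_hours(spare_addresses: int,
                          new_clients_per_hour: float) -> float:
    """Longest safe period the spare addresses can support."""
    if new_clients_per_hour <= 0:
        # No new clients arrive, so the spare addresses never run out.
        return math.inf
    return spare_addresses / new_clients_per_hour
```

+ For instance, at 2.5 new clients per hour a 48 hour safe period calls
+ for about 120 spare addresses, regardless of how many leases are
+ already outstanding.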
+
+ In the unlikely event that a relatively short safe period of an hour
+ is all that can be used (given a dearth of IP addresses or a very
+ high arrival rate of new DHCP clients), even that can provide sub-
+ stantial benefits in allowing the DHCP subsystem to ride through
+ minor problems that could occur and be fixed within that hour. In
+ these cases, no possibility of duplicate IP address allocation
+ exists, and re-integration after the failure is solved will be
+ automatic and require no operator intervention.
+
+11. Security
+
+ The Failover protocol communicates DHCP lease activity and this data
+ is generally easily discovered via other means, such as by pinging
+ addresses and doing DNS lookups. Therefore, the need to encrypt the
+ data over the wire is likely not great (though some sites may feel
+ differently).
+
+ However, it is very desirable to assure the integrity of failover
+ partners and to thus ensure proper operation of the servers. For
+ example, denial of service attacks are possible by the communication
+ of invalid state information to one or both servers.
+
+ Therefore, the Failover protocol MUST be capable of being secured by
+ using a simple shared secret message digest which covers each mes-
+ sage. This provides authentication of the servers, but does not pro-
+ vide encryption of the data exchange.
+
+ The Failover protocol MAY also be secured by using TLS [RFC 2246]
+ (Transport Layer Security) if encryption of the data exchange is
+ desired. The use of the shared secret or TLS will not protect
+ against TCP or IP layer attacks (such as someone sending fake TCP RST
+ segments). IPsec [RFC 2401] SHOULD be used to protect against most
+ (if not all) of these kinds of attacks.
+
+11.1. Simple shared secret
+
+ Messages between the failover partners can be authenticated through
+ the use of a shared secret, which is never sent over the network and
+ must be known by each server. How each server is told about this
+ shared secret and secures its storage of the shared secret is outside
+ the scope of this document. If a server is configured with a shared
+ secret for a partner, it MUST send the message-digest option in ALL
+ messages to that partner and it MUST treat any messages received from
+ that partner without a message-digest option as failing authentica-
+ tion and reject them with reject reason 21: "Missing message digest".
+ Note that the message digest option MUST be the first option in the
+ message.
+
+ If a server is not configured with a shared secret for a partner, it
+ MUST NOT send the message-digest option in any message to that
+ partner and it MUST treat any messages received from that partner
+ with a message-digest option as failing authentication with reject
+ reason 13: "Message digest not configured".
+
+ The shared secret is used to calculate a 16 octet message-digest
+ which is sent in every failover message in the message-digest option.
+ See section 12.16. The message-digest contains a one-way 16 octet
+ HMAC-MD5 [RFC 2104] hash calculated over a stream of octets consist-
+ ing of the entire message concatenated with the shared secret.
+
+ For calculation, the message includes the message-digest option with
+ the message-digest data zeroed (16-octets of zero). Once the calcula-
+ tion is complete, these 16 octets of zero are replaced by the 16-
+ octet HMAC-MD5 hash and the message is sent.
+
+ For verification, the 16-octet message-digest is saved and replaced
+ with 16-octets of zero and calculated per above. The resulting HMAC-
+ MD5 hash is compared to the received hash and if they match, the mes-
+ sage is assumed authenticated.
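+
+ The calculation and verification steps above can be sketched as
+ follows. This sketch assumes that the shared secret is used as the
+ HMAC key in the sense of [RFC 2104], which is one reading of the
+ description above, and that the caller knows the offset of the 16
+ digest octets within the message. The function names and the
+ digest_offset parameter are hypothetical.

```python
import hashlib
import hmac

DIGEST_LEN = 16  # HMAC-MD5 produces 16 octets


def _zeroed(message: bytes, digest_offset: int) -> bytes:
    """Return the message with the 16 digest octets replaced by zeros."""
    end = digest_offset + DIGEST_LEN
    return message[:digest_offset] + bytes(DIGEST_LEN) + message[end:]


def compute_digest(secret: bytes, message: bytes, digest_offset: int) -> bytes:
    """HMAC-MD5 over the message with its digest field zeroed."""
    return hmac.new(secret, _zeroed(message, digest_offset), hashlib.md5).digest()


def sign(secret: bytes, message: bytes, digest_offset: int) -> bytes:
    """Fill in the digest field before the message is sent."""
    digest = compute_digest(secret, message, digest_offset)
    end = digest_offset + DIGEST_LEN
    return message[:digest_offset] + digest + message[end:]


def verify(secret: bytes, message: bytes, digest_offset: int) -> bool:
    """Check a received message against its embedded digest."""
    received = message[digest_offset:digest_offset + DIGEST_LEN]
    expected = compute_digest(secret, message, digest_offset)
    # Constant-time comparison avoids leaking how many octets matched.
    return hmac.compare_digest(received, expected)
```

+ Zeroing the field on both sides means the sender and the receiver
+ hash exactly the same octets, so the digest can travel inside the
+ message that it covers.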
+
+ A failover partner that fails to authenticate a received message or
+ receives a message without a message-digest option when configured
+ with a shared secret MUST close the connection immediately and take
+ steps to notify operators.
+
+ Every time a CONNECT message is received, the time at which that mes-
+ sage was sent by the partner (i.e., the time that actually appears in
+ the message itself) MUST be saved. If a CONNECT message is ever
+ received containing that time or containing a time before that time,
+ it MUST be rejected.
+
+ The XID (see section 6.1) of every message received at a failover
+ endpoint MUST be greater than that of the previous message received
+ on that failover endpoint or the message just received MUST be
+ rejected.
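+
+ The two replay checks above, on the CONNECT time and on the XID, can
+ be sketched as a small per-endpoint object. The class and method
+ names are hypothetical, times and XIDs are plain integers, and XID
+ wraparound is not handled in this sketch.

```python
from typing import Optional


class ReplayGuard:
    """Tracks the last CONNECT time and XID seen on one failover endpoint."""

    def __init__(self) -> None:
        self.last_connect_time: Optional[int] = None
        self.last_xid: Optional[int] = None

    def accept_connect(self, sent_time: int) -> bool:
        """Reject a CONNECT whose time is not later than the saved one."""
        if self.last_connect_time is not None and sent_time <= self.last_connect_time:
            return False
        self.last_connect_time = sent_time
        return True

    def accept_xid(self, xid: int) -> bool:
        """Reject a message whose XID is not greater than the previous one."""
        if self.last_xid is not None and xid <= self.last_xid:
            return False
        self.last_xid = xid
        return True
```

+ A rejected message leaves the saved values unchanged, so a replayed
+ message cannot move the window backwards.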
+
+ A server MAY operate with arbitrary time skew between servers (see
+ section 5.10), but when using a shared secret administrators MAY wish
+ to configure a maximum allowable time skew between a failover server
+ and its partner(s). Servers SHOULD allow an administrator to config-
+ ure a maximum allowable time skew between two failover partners.
+
+11.2. TLS
+
+ TLS, Transport Layer Security, as specified in [RFC 2246] MAY be
+ used. The use of TLS would be similar to the way it is used with
+ SMTP [RFC 2487] and IMAP/POP3/ACAP [RFC 2595].
+
+ To request the use of TLS, the primary MUST send the TLS-request
+ option as part of the CONNECT message. The secondary receiving the
+ TLS-request option MUST respond in the CONNECTACK message with a
+ TLS-reply option indicating its acceptance or rejection of the
+ TLS-request.
+
+ If the CONNECTACK message contained a TLS-reply of 1, then both
+ servers immediately begin TLS negotiation.
+
+ Upon completion of this negotiation, the primary server sends another
+ CONNECT message without any TLS-request option, and must wait for a
+ corresponding CONNECTACK.
+
+ Implementation of the TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [RFC 2246]
+ cipher suite is REQUIRED in Failover servers supporting TLS. This is
+ important as it assures that any two compliant implementations can be
+ configured to interoperate.
+
+12. Failover Options
+
+ This section lists all of the options that are currently defined to
+ be used with the failover protocol. See section 6.2 for details con-
+ cerning time values.
+
+
+12.1. addresses-transferred
+
+ A 32 bit unsigned long in network byte order. Reports the number of
+ addresses transferred by the primary to the secondary server
+ (addresses to be used for the secondary server's private address
+ pool).
+
+ Code Len Number of Addresses
+ +-----+-----+-----+-----+----+-----+-----+-----+
+ | 0 | 1 | 0 | 4 | n1 | n2 | n3 | n4 |
+ +-----+-----+-----+-----+----+-----+-----+-----+
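+
+ Every option in this section shares the layout drawn above: a two
+ byte code, a two byte length, then the data. The following Python
+ sketch encodes and decodes that layout. The function names are
+ hypothetical, and the decoder assumes the buffer holds only options
+ (no message header).
+
```python
import struct

def encode_option(code, payload):
    """Build one failover option: 2-byte code, 2-byte length, then the payload."""
    return struct.pack("!HH", code, len(payload)) + payload

def decode_options(data):
    """Split a buffer of back-to-back options into (code, payload) pairs."""
    options = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if len(data) - offset < length:
            raise ValueError("truncated option payload")
        options.append((code, data[offset:offset + length]))
        offset += length
    return options

# addresses-transferred (code 1) carrying the value 300 in network byte order
wire = encode_option(1, struct.pack("!I", 300))
```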
+
+
+12.2. assigned-IP-address
+
+ The DHCP managed IP address to which this message refers.
+
+ Code Len Address
+ +-----+-----+-----+-----+----+-----+-----+-----+
+ | 0 | 2 | 0 | 4 | a1 | a2 | a3 | a4 |
+ +-----+-----+-----+-----+----+-----+-----+-----+
+
+
+12.3. binding-status
+
+ This option is used to convey the current state of a binding.
+
+ Code Len Type
+ +-----+-----+-----+-----+-----+
+ | 0 | 3 | 0 | 1 | 1-7 |
+ +-----+-----+-----+-----+-----+
+
+ Legal values for this option are:
+
+ Value Binding Status
+ ----- ------------------------------------------------
+ 1 FREE Lease is currently available to the primary
+ 2 ACTIVE Lease is assigned to a client
+ 3 EXPIRED Lease has expired
+ 4 RELEASED Lease has been released by client
+ 5 ABANDONED A server or client flagged the address as unusable
+ 6 RESET Lease was freed by some external agent
+ 7 BACKUP Lease belongs to secondary's private address pool
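+
+ The legal values can be captured as an enumeration so that an
+ out-of-range byte is caught when the option is parsed. This is only a
+ sketch; the names BindingStatus and parse_binding_status are
+ illustrative.
+
```python
from enum import IntEnum

class BindingStatus(IntEnum):
    FREE = 1
    ACTIVE = 2
    EXPIRED = 3
    RELEASED = 4
    ABANDONED = 5
    RESET = 6
    BACKUP = 7

def parse_binding_status(payload):
    """Interpret the one-byte data field of a binding-status option."""
    if len(payload) != 1:
        raise ValueError("binding-status carries exactly one byte")
    # IntEnum lookup raises ValueError for anything outside 1-7.
    return BindingStatus(payload[0])
```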
+
+
+12.4. client-identifier
+
+ This is the client-identifier for the client associated with a
+ binding. The client-identifier data is subject to the same
+ conventions as DHCP option 61 [RFC 2132].
+
+ Code Len Client Identifier
+ +-----+-----+-----+-----+----+-----+---
+ | 0 | 4 | 0 | n | i1 | i2 | ...
+ +-----+-----+-----+-----+----+-----+--
+
+
+12.5. client-hardware-address
+
+ This is the hardware address for the client associated with a
+ binding. Byte t1 (type) MUST be set to the proper ARP hardware
+ address code, as defined in the ARP section of RFC 1700; it MUST NOT
+ be zero.
+
+ Code Len htype chaddr
+ +-----+-----+-----+-----+----+-----+-----+---
+ | 0 | 5 | 0 | n | t1 | c1 | c2 | ...
+ +-----+-----+-----+-----+----+-----+-----+---
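+
+ As an illustration, the option could be built as follows in Python.
+ The function name is hypothetical and the MAC address is a made-up
+ example; the non-zero check mirrors the requirement on byte t1.
+
```python
import struct

CLIENT_HARDWARE_ADDRESS = 5  # option code from this section

def encode_client_hardware_address(htype, chaddr):
    """Encode htype (ARP hardware type) followed by the hardware address."""
    if not 1 <= htype <= 255:
        raise ValueError("htype must be a non-zero one-byte ARP hardware code")
    payload = bytes([htype]) + bytes(chaddr)
    return struct.pack("!HH", CLIENT_HARDWARE_ADDRESS, len(payload)) + payload

# Ethernet is ARP hardware type 1; the address below is only an example.
option = encode_client_hardware_address(1, bytes.fromhex("00163e0a0b0c"))
```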
+
+
+12.6. client-last-transaction-time
+
+ The time at which this server last received a DHCP request from a
+ particular client expressed as an absolute time (see section 6.2).
+
+
+ Code Len client last transaction time
+ +-----+-----+-----+-----+----+-----+-----+-----+
+ | 0 | 6 | 0 | 4 | t1 | t2 | t3 | t4 |
+ +-----+-----+-----+-----+----+-----+-----+-----+
+
+
+12.7. client-reply-options
+
+ This option contains options from a DHCP server's reply to a DHCP
+ client request. It is sent in a BNDUPD message. The first 4 bytes
+ of the option contain the "magic number" of the option area from
+ which the DHCP reply options were taken and serve to define the
+ format of the rest of the sub-options contained in this option.
+ After the magic number, the options included are in the normal
+ options format appropriate for that magic number.
+
+ A server SHOULD NOT include all of the options in a DHCP server's
+ reply to a client's request in this option, but rather a server
+ SHOULD include only those options which are of likely interest to its
+ partner server. See section 7.1 for details.
+
+ Code Len Magic Number Embedded options
+ +-----+-----+-----+-----+----+----+----+----+----+----+--
+ | 0 | 7 | 0 | n | m1 | m2 | m3 | m4 | b1 | b2 | ...
+ +-----+-----+-----+-----+----+----+----+----+----+----+--
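+
+ A sketch of how the data field might be assembled, assuming the
+ embedded options come from a standard DHCP options area, whose magic
+ number is 99.130.83.99 and whose options use a one byte code and a
+ one byte length [RFC 2131]. The function name is illustrative.
+
```python
import struct

CLIENT_REPLY_OPTIONS = 7                      # option code from this section
DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])  # magic number of the DHCP options area

def encode_client_reply_options(dhcp_options):
    """dhcp_options: iterable of (code, value) pairs in DHCP option format."""
    body = DHCP_MAGIC_COOKIE
    for code, value in dhcp_options:
        # bytes([...]) raises ValueError if code or length exceeds one byte.
        body += bytes([code, len(value)]) + value
    return struct.pack("!HH", CLIENT_REPLY_OPTIONS, len(body)) + body

# Embed DHCP option 51 (IP address lease time) with a value of 3600 seconds.
option = encode_client_reply_options([(51, struct.pack("!I", 3600))])
```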
+
+
+12.8. client-request-options
+
+ This option contains options from a DHCP client's request. It is
+ sent in a BNDUPD message. The first 4 bytes of the option contain
+ the "magic number" of the option area from which the DHCP client's
+ request options were taken and serve to define the format of the
+ rest of the sub-options contained in this option. After the magic
+ number, the options included are in the normal options format
+ appropriate for that magic number.
+
+ A server SHOULD NOT include all of the options in a DHCP client
+ request in this option, but rather a server SHOULD include only those
+ options which are of likely interest to its partner server. See
+ section 7.1 for details.
+
+ Code Len Magic Number Embedded options
+ +-----+-----+-----+-----+----+----+----+----+----+----+--
+ | 0 | 8 | 0 | n | m1 | m2 | m3 | m4 | b1 | b2 | ...
+ +-----+-----+-----+-----+----+----+----+----+----+----+--
+
+
+12.9. DDNS
+
+ If an implementation supports Dynamic DNS updates, this option is
+ used to communicate the status of the DDNS update associated with a
+ particular lease binding. The Flags field conveys the types of DNS
+ RRs that are to be updated by the DHCP server, and the status of the
+ DDNS update. The Domain Name field conveys the DNS FQDN that the
+ DHCP server is using to refer to the client, in DNS encoding as
+ specified in [RFC 1035].
+
+ Code Len Flags Domain Name
+ +-----+-----+-----+-----+-----+------+------+-----+------
+ | 0 | 9 | 0 | n | flags | d1 | d2 | ...
+ +-----+-----+-----+-----+-----+------+------+-----+------
+
+ The Flags field is a 16-bit field; several bit positions are
+ specified here.
+
+ 1 1 1 1 1 1
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |C|A|D|P| MBZ |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ The bits (numbered from the least-significant bit in network
+ byte-order) are used as follows:
+
+ 0 (C): name to address (such as A RR) update successfully completed
+ 1 (A): Server is controlling A RR on behalf of the client
+ 2 (D): address to name (such as PTR RR) update successfully completed (Done)
+ 3 (P): Server is controlling PTR RR on behalf of the client
+ 4-15 : Must be zero
+
+ All of the unspecified bit positions SHOULD be set to 0 by servers
+ sending the Failover-DDNS option, and they MUST be ignored by servers
+ receiving the option.
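+
+ A hedged Python sketch of reading the Flags field. It takes the text
+ literally and treats bit 0 as the least-significant bit of the 16-bit
+ value; an implementation that reads the diagram with bit 0 as the
+ leftmost bit would use different masks. All names are illustrative.
+
```python
import struct

# Assumption: bit 0 is the least-significant bit of the 16-bit Flags field.
FLAG_C = 1 << 0  # name-to-address update successfully completed
FLAG_A = 1 << 1  # server is controlling the A RR
FLAG_D = 1 << 2  # address-to-name update successfully completed
FLAG_P = 1 << 3  # server is controlling the PTR RR

def parse_ddns(payload):
    """Return (flag dict, DNS-encoded domain name) from a DDNS option's data."""
    if len(payload) < 2:
        raise ValueError("DDNS option needs a 2-byte Flags field")
    (flags,) = struct.unpack_from("!H", payload, 0)
    decoded = {
        "forward_done": bool(flags & FLAG_C),
        "controls_a": bool(flags & FLAG_A),
        "reverse_done": bool(flags & FLAG_D),
        "controls_ptr": bool(flags & FLAG_P),
    }
    # Unspecified bit positions are ignored on receipt.
    return decoded, payload[2:]
```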
+
+
+12.10. delayed-service-parameter
+
+ The delayed-service-parameter is an optional load balancing tuning
+ parameter, defined in [RFC 3074]. If it is used, it MUST be sent in
+ the same message as the hash-bucket-assignment option (see section
+ 12.11).
+
+ Format :
+
+
+ Code Len Seconds
+ +-----+-----+-----+-----+----+
+ | 0 | 10 | 0 | 1 | S |
+ +-----+-----+-----+-----+----+
+
+ S is a one byte value, 1..255.
+
+
+12.11. hash-bucket-assignment
+
+ A set of load balancing hash values for the secondary server. A one
+ bit in the hash buckets indicates that the secondary is to service
+ that set of clients. See section 5.3 for more information on how
+ this option is used. This option is only sent from the primary to
+ the secondary.
+
+ The format and usage of the data in this option is defined in [RFC
+ 3074].
+
+ Code Len Hash Buckets
+ +-----+-----+-----+-----+-----+-----+-----+-----+
+ | 0 | 11 | 0 | 32 | b1 | b2 | ... | b32 |
+ +-----+-----+-----+-----+-----+-----+-----+-----+
+
+
+12.12. IP-flags
+
+ This option is used to convey the current flags of the assigned-IP-
+ address option preceding it.
+
+ Code Len IP Flags
+ +-----+-----+-----+-----+-----+-----+
+ | 0 | 12 | 0 | 2 | f1 | f2 |
+ +-----+-----+-----+-----+-----+-----+
+
+ The IP-flags field is a 16-bit field; two bit positions are
+ specified here.
+
+ 1 1 1 1 1 1
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |R|B| MBZ |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ The bits (numbered from the least-significant bit in network
+ byte-order) are used as follows:
+
+ 0 (R): RESERVED (this bit allocated and in use and named "RESERVED")
+ Bit 0 MUST be set to 1 whenever the IP address in the preceding
+ assigned-IP-address option is reserved on the server sending the
+ packet.
+ 1 (B): BOOTP
+ Bit 1 MUST be set to 1 whenever the IP address in the preceding
+ assigned-IP-address option is an IP address which has been
+ allocated due to an interaction with a BOOTP client (as opposed
+ to a DHCP client).
+ 2-15 : Must be zero
+
+
+12.13. lease-expiration-time
+
+ The lease expiration time is the lease interval that a DHCP server
+ has ACKed to a DHCP client added to the time at which that ACK was
+ transmitted -- expressed as an absolute time (see section 6.2).
+
+
+ Code Len Time
+ +-----+-----+-----+-----+----+-----+-----+-----+
+ | 0 | 13 | 0 | 4 | t1 | t2 | t3 | t4 |
+ +-----+-----+-----+-----+----+-----+-----+-----+
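+
+ As a worked example of the arithmetic, the sketch below adds the
+ lease interval to the time the ACK was sent and encodes the sum. It
+ assumes absolute times are unsigned 32-bit second counts; section 6.2
+ defines the actual representation. The function name is hypothetical.
+
```python
import struct

LEASE_EXPIRATION_TIME = 13  # option code from this section

def lease_expiration_option(ack_sent_at, lease_interval):
    """Expiration = time the ACK was transmitted + lease interval ACKed."""
    expiration = ack_sent_at + lease_interval
    return struct.pack("!HHI", LEASE_EXPIRATION_TIME, 4, expiration)

# An ACK sent at t = 1,000,000,000 that granted a one-hour lease.
option = lease_expiration_option(1_000_000_000, 3600)
```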
+
+
+12.14. max-unacked-bndupd
+
+ The maximum number of BNDUPD messages that this server is prepared to
+ accept over the TCP connection without causing the TCP connection to
+ block. A 32 bit unsigned integer value, in network byte order.
+
+
+ Code Len Maximum Unacked BNDUPD
+ +-----+-----+-----+-----+----+-----+-----+-----+
+ | 0 | 14 | 0 | 4 | n1 | n2 | n3 | n4 |
+ +-----+-----+-----+-----+----+-----+-----+-----+
+
+
+12.15. MCLT
+
+ Maximum Client Lead Time, an interval, in seconds. A 32 bit unsigned
+ integer value, in network byte order.
+
+ Code Len Time
+ +-----+-----+-----+-----+----+-----+-----+-----+
+ | 0 | 15 | 0 | 4 | t1 | t2 | t3 | t4 |
+ +-----+-----+-----+-----+----+-----+-----+-----+
+
+
+12.16. message
+
+ This option is used to supply a human readable message text. It may
+ be used in association with the Reject Reason Code to provide a human
+ readable error message for the reject.
+
+
+ Code Len Text
+ +-----+-----+-----+-----+------+-----+--
+ | 0 | 16 | 0 | n | c1 | c2 | ...
+ +-----+-----+-----+-----+------+-----+--
+
+
+12.17. message-digest
+
+ The message digest for this message.
+
+ This option consists of a variable number of bytes which contain the
+ message digest of the message prior to the inclusion of this option.
+
+ When this option appears in a message, it MUST appear as the first
+ option in the message. It MUST appear in every message if message
+ digests are required. The Type MUST be configurable (once additional
+ types are defined). When additional types are defined, they MUST be
+ specified as either optional (MAY be supported) or required (MUST be
+ supported). See the section on IANA considerations for more details.
+
+ Code Len Type Message Digest
+ +-----+-----+-----+-----+-----+-----+-----+--
+ | 0 | 17 | 0 | n | t | d1 | d2 | ...
+ +-----+-----+-----+-----+-----+-----+-----+--
+
+
+ Type: 0 Not Allowed
+ 1 HMAC-MD5
+ 2-255 Not Allowed
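+
+ A sketch of computing and checking a type 1 (HMAC-MD5) digest with
+ the Python standard library. It treats the message as opaque bytes
+ "prior to the inclusion of this option"; how the option is then
+ spliced into the message, and whether any header length field is
+ adjusted first, is not shown and is left to the message format
+ sections. The function names are illustrative.
+
```python
import hashlib
import hmac
import struct

MESSAGE_DIGEST = 17  # option code from this section
HMAC_MD5 = 1         # the only digest type currently allowed

def build_message_digest_option(secret, message):
    """Return the message-digest option for `message` (bytes without the option)."""
    digest = hmac.new(secret, message, hashlib.md5).digest()  # 16 bytes
    payload = bytes([HMAC_MD5]) + digest
    return struct.pack("!HH", MESSAGE_DIGEST, len(payload)) + payload

def verify_message_digest(secret, message, option):
    """Check a received message-digest option against the message it covers."""
    if len(option) != 4 + 1 + 16:
        return False
    code, length = struct.unpack_from("!HH", option, 0)
    if code != MESSAGE_DIGEST or length != 17 or option[4] != HMAC_MD5:
        return False
    expected = hmac.new(secret, message, hashlib.md5).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(option[5:], expected)
```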
+
+
+12.18. potential-expiration-time
+
+ The potential expiration time is the time that one server tells
+ another server that it may wish to grant in a lease to a DHCP client.
+ It is an absolute time. See section 6.2.
+
+
+ Code Len Time
+ +-----+-----+-----+-----+----+-----+-----+-----+
+ | 0 | 18 | 0 | 4 | t1 | t2 | t3 | t4 |
+ +-----+-----+-----+-----+----+-----+-----+-----+
+
+
+12.19. receive-timer
+
+ The number of seconds (an interval) within which the server must
+ receive a message from its partner, or it will assume that
+ communications with the partner are not ok. An unsigned 32 bit
+ integer in network byte order.
+
+ Code Len Receive Timer
+ +-----+-----+-----+-----+----+-----+-----+-----+
+ | 0 | 19 | 0 | 4 | s1 | s2 | s3 | s4 |
+ +-----+-----+-----+-----+----+-----+-----+-----+
+
+
+12.20. protocol-version
+
+ The protocol version being used by the server. It is only sent in the
+ CONNECT and CONNECTACK messages. The current value for the version
+ is 1.
+
+ Code Len Version
+ +-----+-----+-----+-----+-----+
+ | 0 | 20 | 0 | 1 | 1 |
+ +-----+-----+-----+-----+-----+
+
+
+12.21. reject-reason
+
+ This option is used to selectively reject binding updates. It MAY be
+ used in a BNDACK message or a CONNECTACK message, always associated
+ with an assigned-IP-address option, which contains the IP address of
+ the update being rejected.
+
+ Code Len Reason Code
+ +-----+-----+-----+-----+-----+
+ | 0 | 21 | 0 | 1 | R1 |
+ +-----+-----+-----+-----+-----+
+
+ Reason codes (section where referenced in parentheses):
+
+ 0 Reserved
+ 1 Illegal IP address (not part of any address pool). (7.1.3)
+ 2 Fatal conflict exists: address in use by other client. (7.1.3)
+ 3 Missing binding information. (7.1.3)
+ 4 Connection rejected, time mismatch too great. (7.8.2)
+ 5 Connection rejected, invalid MCLT. (7.8.2)
+ 6 Connection rejected, unknown reason. (not specifically referenced)
+ 7 Connection rejected, duplicate connection. (unused)
+ 8 Connection rejected, invalid failover partner. (7.8.2)
+ 9 TLS not supported. (7.8.2)
+ 10 TLS supported but not configured. (7.8.2)
+ 11 TLS required but not supported by partner. (7.8.2)
+ 12 Message digest not supported. (11.1)
+ 13 Message digest not configured. (11.1)
+ 14 Protocol version mismatch. (7.8.2)
+ 15 Outdated binding information. (7.1.3)
+ 16 Less critical binding information. (7.1.3)
+ 17 No traffic within sufficient time. (8.6)
+ 18 Hash bucket assignment conflict. (7.8.2)
+ 19 IP not reserved on this server. (7.1.3)
+ 20 Message digest failed to compare. (7.8.2)
+ 21 Missing message digest. (7.1.3)
+ 22-253 Reserved.
+ 254 Unknown: Error occurred but does not match any reason code.
+ 255 Reserved for code expansion.
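+
+ The sketch below shows one way to emit the options that reject a
+ single binding update: the assigned-IP-address option, the
+ reject-reason option, and optionally a message option with
+ explanatory text. The option order and the ASCII text encoding are
+ assumptions of this example, and the function name is hypothetical.
+
```python
import socket
import struct

ASSIGNED_IP_ADDRESS = 2
MESSAGE = 16
REJECT_REASON = 21

def reject_update_options(ip, reason, text=""):
    """Options naming the rejected address, the reason code, and optional text."""
    if not 0 <= reason <= 255:
        raise ValueError("reason code is a single byte")
    out = struct.pack("!HH", ASSIGNED_IP_ADDRESS, 4) + socket.inet_aton(ip)
    out += struct.pack("!HHB", REJECT_REASON, 1, reason)
    if text:
        encoded = text.encode("ascii")
        out += struct.pack("!HH", MESSAGE, len(encoded)) + encoded
    return out

# Reason 2: fatal conflict, address in use by another client.
options = reject_update_options("192.0.2.10", 2)
```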
+
+
+12.22. relationship-name
+
+ A string which is a unique identifier for the failover relationship.
+
+ Code Len Relationship Name
+ +-----+-----+-----+-----+----+-----+---
+ | 0 | 22 | 0 | n | c1 | c2 | ...
+ +-----+-----+-----+-----+----+-----+---
+
+
+12.23. server-flags
+
+ This option is used to convey the current flags of the failover
+ endpoint in the sending server.
+
+ Code Len Server Flags
+ +-----+-----+-----+-----+-------+
+ | 0 | 23 | 0 | 1 | flags |
+ +-----+-----+-----+-----+-------+
+
+ The flags field is an 8-bit field; one bit position is
+ specified here.
+
+
+ 0 1 2 3 4 5 6 7
+ +-+-+-+-+-+-+-+-+
+ |S| MBZ |
+ +-+-+-+-+-+-+-+-+
+
+ The bits (numbered from the least-significant bit in network
+ byte-order) are used as follows:
+
+ 0 (S): STARTUP,
+ Bit 0 MUST be set to 1 whenever the server is in STARTUP state,
+ and set to 0 otherwise. (Note that when in STARTUP state, the
+ state transmitted in the server-state option is usually the last
+ recorded state from stable storage, but see section 9.3 for
+ details.)
+ 1-7 : Must be zero
+
+
+12.24. server-state
+
+ This option is used to convey the current state of the failover
+ endpoint in the sending server.
+
+ Code Len Server State
+ +-----+-----+-----+-----+-----+
+ | 0 | 24 | 0 | 1 | 1-11|
+ +-----+-----+-----+-----+-----+
+
+ Legal values for this option are:
+
+ Value Server State
+ ----- -------------------------------------------------------------
+ 0 reserved
+ 1 STARTUP Startup state (1)
+ 2 NORMAL Normal state
+ 3 COMMUNICATIONS-INTERRUPTED Communication interrupted (safe)
+ 4 PARTNER-DOWN Partner down (unsafe mode)
+ 5 POTENTIAL-CONFLICT Synchronizing
+ 6 RECOVER Recovering bindings from partner
+ 7 PAUSED Shutting down for a short period.
+ 8 SHUTDOWN Shutting down for an extended
+ period.
+ 9 RECOVER-DONE Interlock state prior to NORMAL
+ 10 RESOLUTION-INTERRUPTED Comm. failed during resolution
+ 11 CONFLICT-DONE Primary has resolved its conflicts
+
+ (1) The STARTUP state is never sent to the partner server; it is
+ indicated by the STARTUP bit in the server-flags option (see section
+ 12.23).
+
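+ The interplay between the server-state and server-flags options can
+ be sketched as follows. While in STARTUP, this example advertises the
+ last recorded state and sets the STARTUP flag, which is the usual
+ case only (section 9.3 describes the exceptions). It assumes the
+ STARTUP bit is the least-significant bit of the flags byte. The
+ names are illustrative.
+
```python
import struct
from enum import IntEnum

SERVER_FLAGS = 23
SERVER_STATE = 24
STARTUP_FLAG = 0x01  # assumption: bit 0 is the least-significant bit

class ServerState(IntEnum):
    STARTUP = 1
    NORMAL = 2
    COMMUNICATIONS_INTERRUPTED = 3
    PARTNER_DOWN = 4
    POTENTIAL_CONFLICT = 5
    RECOVER = 6
    PAUSED = 7
    SHUTDOWN = 8
    RECOVER_DONE = 9
    RESOLUTION_INTERRUPTED = 10
    CONFLICT_DONE = 11

def state_options(current, last_recorded):
    """Encode server-flags followed by server-state for the sending endpoint."""
    if current is ServerState.STARTUP:
        # STARTUP itself is never sent; advertise the stored state instead.
        advertised, flags = last_recorded, STARTUP_FLAG
    else:
        advertised, flags = current, 0
    return (struct.pack("!HHB", SERVER_FLAGS, 1, flags)
            + struct.pack("!HHB", SERVER_STATE, 1, advertised))
```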
+
+12.25. start-time-of-state
+
+ This option is used for different states in different messages. In a
+ BNDUPD message it represents the start time of the state of the lease
+ in the BNDUPD message. In a STATE message, it represents the start
+ time of the partner server's failover state. In all cases it is an
+ absolute time.
+
+
+ Code Len Start Time of State
+ +-----+-----+-----+-----+----+-----+-----+-----+
+ | 0 | 25 | 0 | 4 | t1 | t2 | t3 | t4 |
+ +-----+-----+-----+-----+----+-----+-----+-----+
+
+
+12.26. TLS-reply
+
+ This option contains information relating to TLS security
+ negotiation. It is sent in a CONNECTACK message.
+
+ A t1 value of 0 indicates no TLS operation, a value of 1 indicates
+ that TLS operation is required.
+
+ Code Len TLS
+ +-----+-----+-----+-----+-----+
+ | 0 | 26 | 0 | 1 | t1 |
+ +-----+-----+-----+-----+-----+
+
+
+12.27. TLS-request
+
+ This option contains information relating to TLS security
+ negotiation. It is sent in a CONNECT message.
+
+ The t1 byte is the TLS request from the primary server. A value of 0
+ indicates no TLS operation (to communicate, the secondary server MUST
+ NOT require TLS), a value of 1 indicates that TLS operation is
+ desired but not required (to communicate, the secondary server MAY
+ utilize TLS), and a value of 2 indicates that TLS operation is
+ required (to communicate, the secondary server MUST utilize TLS).
+
+ Code Len TLS
+ +-----+-----+-----+-----+-----+
+ | 0 | 27 | 0 | 1 | t1 |
+ +-----+-----+-----+-----+-----+
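+
+ One possible way for a secondary to choose its TLS-reply value from
+ the three TLS-request values is sketched below. The pairing of
+ failures with reject-reason codes 9 and 10 is an assumption of this
+ example; the CONNECT processing rules in section 7.8.2 are
+ authoritative. The function and parameter names are hypothetical.
+
```python
TLS_NONE, TLS_DESIRED, TLS_REQUIRED = 0, 1, 2  # TLS-request values

REJECT_TLS_NOT_SUPPORTED = 9    # "TLS not supported"
REJECT_TLS_NOT_CONFIGURED = 10  # "TLS supported but not configured"

def answer_tls_request(request, tls_supported, tls_configured):
    """Return (tls_reply, reject_reason); reject_reason is None when accepted."""
    usable = tls_supported and tls_configured
    if request == TLS_NONE:
        return 0, None
    if request == TLS_DESIRED:
        # TLS is optional: use it when available, otherwise continue without it.
        return (1 if usable else 0), None
    if request == TLS_REQUIRED:
        if usable:
            return 1, None
        if tls_supported:
            return 0, REJECT_TLS_NOT_CONFIGURED
        return 0, REJECT_TLS_NOT_SUPPORTED
    raise ValueError("unknown TLS-request value")
```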
+
+
+12.28. vendor-class-identifier
+
+ A string which identifies the vendor of the failover protocol
+ implementation.
+
+ Code Len vendor class string
+ +-----+-----+-----+-----+----+-----+---
+ | 0 | 28 | 0 | n | c1 | c2 | ...
+ +-----+-----+-----+-----+----+-----+---
+
+
+12.29. vendor-specific-options
+
+ This option is used to convey options specific to a particular
+ vendor's implementation. The vendor class identifier is used to
+ specify which option space the embedded options are drawn from.
+ Every message that uses vendor specific options MUST have a vendor-
+ class-identifier option in it.
+
+ It functions similarly to the vendor class identifier and vendor
+ specific options in the DHCP protocol.
+
+ This option contains other options in the same two byte code, two
+ byte length format. If this option appears in a message without a
+ corresponding vendor class identifier, it MUST be ignored.
+
+ Code Len Embedded options
+ +-----+-----+-----+-----+----+-----+---
+ | 0 | 29 | 0 | n | c1 | c2 | ...
+ +-----+-----+-----+-----+----+-----+---
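+
+ A sketch of the receive-side rule: vendor-specific-options are
+ unpacked only when the same message also carries a
+ vendor-class-identifier, and are ignored otherwise. The input is
+ assumed to be a list of (code, data) pairs already decoded from one
+ message, and truncated embedded options are silently dropped. The
+ function name is illustrative.
+
```python
import struct

VENDOR_CLASS_IDENTIFIER = 28
VENDOR_SPECIFIC_OPTIONS = 29

def vendor_options(options):
    """Return (vendor class, embedded (code, data) pairs) for one message."""
    vendor_class = next(
        (data for code, data in options if code == VENDOR_CLASS_IDENTIFIER), None)
    if vendor_class is None:
        # No vendor class identifier: the vendor-specific options are ignored.
        return None, []
    embedded = []
    for code, data in options:
        if code != VENDOR_SPECIFIC_OPTIONS:
            continue
        offset = 0
        # Embedded options reuse the two byte code, two byte length format.
        while offset + 4 <= len(data):
            sub_code, sub_len = struct.unpack_from("!HH", data, offset)
            offset += 4
            if offset + sub_len > len(data):
                break
            embedded.append((sub_code, data[offset:offset + sub_len]))
            offset += sub_len
    return vendor_class, embedded
```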
+
+
+
+
+13. IANA Considerations
+
+ This document defines several number spaces (failover options, fail-
+ over message types, message digest types, and failover reject reason
+ codes). For all of these number spaces, certain values are defined in
+ this specification. New values may only be defined by IETF Con-
+ sensus, as described in [RFC 2434]. Basically, this means that they
+ are defined by RFCs approved by the IESG.
+
+
+14. Acknowledgments
+
+ Ralph Droms started it all, by sketching out an initial interserver
+ draft that embodied ideas from several past IETF meetings. In that
+ draft, he acknowledged contributions by Jeff Mogul, Greg Minshall,
+ Rob Stevens, Walt Wimer, Ted Lemon, and the DHC working group.
+
+ Kim Kinnear and Bob Cole each extended that draft, separately and
+ then together, until they created an interserver draft that supported
+ any number of servers. The complexity of that approach was just too
+ great, and that draft wasn't greeted with enthusiasm by many, includ-
+ ing its authors.
+
+ It did however lead to a much simpler approach embodied in the first
+ Failover draft by Greg Rabil, Mike Dooley, Arun Kapur and Ralph
+ Droms. This draft posited only two servers -- a primary and a secon-
+ dary.
+
+ Kim Kinnear then wrote the Safe Failover draft to layer on top of the
+ Failover Draft and increase its robustness in the face of certain
+ rare network failures.
+
+ At the spring 1998 IETF meeting in LA, the DHC working group said
+ that they wanted a merged Failover and Safe Failover draft. Steve
+ Gonczi and Bernie Volz stepped up and produced the raw material for
+ such a merged draft, along with a new message format designed around
+ DHCP options and other extensions and clarifications. Kim Kinnear
+ edited their work into draft format and made other changes in time
+ for the Summer Chicago IETF meeting.
+
+ Many people have reviewed the various earlier drafts that went into
+ this result. At American Internet, ideas were contributed by Brad
+ Parker. At Cisco Systems Paul Fox and Ellen Garvey contributed to
+ the design of the protocol.
+
+ During the summer and fall of 1998, two groups worked on separate
+ implementations of the UDP failover draft. Bernie Volz and Steve
+ Gonczi constituted one group, and Kim Kinnear, Mark Stapp and Paul
+ Fox made up the other. These two groups worked together to produce
+ considerable changes and simplifications of the protocol during that
+ period, and Steve Gonczi and Kim Kinnear edited those changes into
+ the -03 draft in time for submission to the December 1998 Orlando IETF
+ meeting.
+
+ In February of 1999 Kim Kinnear and Mark Stapp hosted a meeting of
+ people interested in the failover draft. During that meeting a gen-
+ eral agreement was reached to recast the failover protocol to use TCP
+ instead of UDP. In addition, the group together brainstormed a work-
+ able load-balancing technique. Kim Kinnear rewrote the entire draft
+ to include the changes made at that meeting as well as to restructure
+ the draft along guidelines suggested by Thomas Narten. The result
+ was the -04 draft, submitted prior to the Oslo IETF meeting.
+
+ The initial idea for a hash-based load balancing approach was offered
+ by Ted Lemon, and the determination of an algorithm and its integra-
+ tion into the draft was done by Steve Gonczi. The security section
+ was spearheaded by Bernie Volz. Both contributed considerably to the
+ ideas and text in the rest of the draft with several reviews.
+
+ In early October of 1999, three conference calls were held to discuss
+ the -04 draft. The -05 draft includes changes as a result of those calls,
+ perhaps the largest of which was to remove the load balancing
+ approach into a separate draft. Thanks to all of the many people
+ who participated in the conference calls. Changes were made because
+ of contributions by: Ted Lemon, David Erdmann, Richard Jones, Rob
+ Stevens, Thomas Narten, Diana Lane, and Andre Kostur.
+
+ Another conference call was held in mid-January of 2000, and the -06
+ draft was produced to tighten up the -05 draft both technically and
+ editorially.
+
+ The -07 draft was edited by Kim Kinnear and was based in part on
+ reviews by Richard Jones, Bernie Volz, and Steve Gonczi. It embodies
+ several technical updates as well as numerous editorial revisions
+ that enhanced both correctness and clarity.
+
+ The -08 draft was edited by Kim Kinnear and was based on the results
+ of two conference calls held in October and November of 2000. It
+ includes the correct second port number, a new state to synchronize
+ conflict resolution with load balancing, a generally accepted
+ approach to secondary pool allocation, and many other updates based
+ on both operational and implementation experience.
+
+ The -09 draft was edited by Kim Kinnear based on discussions held at
+ the Minneapolis IETF in December of 2000, as well as issues raised by
+ Ted Lemon based on implementation and deployment. The specific
+ changes were mailed to the dhcp-v4 list.
+
+ The -10 draft differed from the -09 draft in that figure 9.8.3-1 was
+ correctly relabeled figure 9.10.3-1, and it was updated to include
+ the CONFLICT-DONE message. One of the authors' affiliations was also
+ updated.
+
+ This, the -11 draft, differs only slightly from the -10 draft in
+ correcting another author affiliation.
+
+ These most recent changes have not been widely circulated among the
+ other authors prior to submission to the IETF.
+
+ Glenn Waters of Nortel Networks contributed ideas and enthusiasm to
+ make a Failover protocol that was both "safe" and "lazy".
+
+
+15. References
+
+
+ [DHCID] Stapp, M., Lemon, T., Gustafsson, A., "draft-ietf-dnsext-
+ dhcid-rr-02.txt", March, 2001.
+
+ [DNSRES] Stapp, M., "draft-ietf-dhc-dns-resolution-01.txt", March,
+ 2001.
+
+ [FQDN] Rekhter, Y., Stapp, M., "draft-ietf-dhc-fqdn-option-01.txt",
+ March, 2001.
+
+ [RFC 1035] Mockapetris, P., "Domain Names - Implementation and
+ Specification", November, 1987.
+
+ [RFC 1534] Droms, R., "Interoperation between DHCP and BOOTP", RFC
+ 1534, October 1993.
+
+ [RFC 2104] Krawczyk, H., Bellare, M., and Canetti, R., "HMAC: Keyed
+ Hashing for Message Authentication", RFC 2104, IBM T.J. Watson
+ Research Center, University of California at San Diego, February
+ 1997.
+
+ [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", RFC 2119.
+
+ [RFC 2131] Droms, R., "Dynamic Host Configuration Protocol", RFC
+ 2131, March 1997.
+
+ [RFC 2132] Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor
+ Extensions", Internet RFC 2132, March 1997.
+
+ [RFC 2136] P. Vixie, S. Thomson, Y. Rekhter, J. Bound, "Dynamic
+ Updates in the Domain Name System (DNS UPDATE)", RFC 2136, April
+ 1997.
+
+ [RFC 2139] Rigney, C., "Radius Accounting", RFC 2139, Livingston
+ Enterprises, April 1997.
+
+ [RFC 2246] Dierks, T., "The TLS Protocol, Version 1.0", RFC 2246,
+ January 1999.
+
+ [RFC 2401] Kent, S., Atkinson, R., "Security Architecture for the
+ Internet Protocol", RFC 2401, November 1998.
+
+ [RFC 2434] Alvestrand, H. and T. Narten, "Guidelines for Writing an
+ IANA Considerations Section in RFCs", BCP 26, RFC 2434, October
+ 1998.
+
+ [RFC 2487] Hoffman, P., "SMTP Service Extension for Secure SMTP over
+ TLS", RFC 2487, January 1999.
+
+ [RFC 2595] Newman, C., "Using TLS with IMAP, POP3, and ACAP", RFC
+ 2595, June 1999.
+
+ [RFC 3004] Stump, G., Droms, R., Gu, Y., Vyaghrapuri, R., Demirtjis,
+ A., Privat, J. "The User Class Option for DHCP", November 2000.
+
+ [RFC 3011] Waters, G., "The IPv4 Subnet Selection Option for DHCP",
+ November 2000.
+
+ [RFC 3046] Patrick, M., "DHCP Relay Agent Information Option", RFC
+ 3046, January 2001.
+
+ [RFC 3074] Volz, B., Gonczi, S., Lemon, T., Stevens, R., "DHC Load-
+ balancing Algorithm", February, 2001.
+
+16. Authors' information
+
+ Ralph Droms
+ Kim Kinnear
+ Mark Stapp
+ Cisco Systems
+ 250 Apollo Drive
+ Chelmsford, MA 01824
+
+ Phone: (978) 497-0000
+
+ EMail: rdroms@cisco.com
+ kkinnear@cisco.com
+ mjs@cisco.com
+
+
+
+ Bernie Volz
+ Ericsson
+ 959 Concord St.
+ Framingham, MA 01701
+
+ Phone: (508) 875-3162
+
+ EMail: bernie.volz@ericsson.com
+
+
+ Steve Gonczi
+ Relicore, Inc.
+ One Wall Street
+ Burlington, MA 01803
+
+ Phone: (781) 229-1122
+
+ EMail: steve@relicore.com
+
+
+ Greg Rabil
+ Lucent Technologies
+ 400 Lapp Road
+ Malvern, PA 19355
+
+ Phone: (800) 208-2747
+
+ EMail: grabil@lucent.com
+
+
+
+
+ Michael Dooley
+ Diamond IP Technologies
+ One E Uwchlan Ave, Suite 112
+ Exton, PA 19341
+
+ EMail: mdooley@diamondip.com
+
+
+
+
+ Arun Kapur
+ K5 Networks
+ 2 Toll House Lane
+ Colts Neck, NJ 07722
+
+ Phone: (732) 817-9475
+
+17. Full Copyright Statement
+
+Copyright (C) The Internet Society (2003). All Rights Reserved.
+
+This document and translations of it may be copied and furnished to oth-
+ers, and derivative works that comment on or otherwise explain it or
+assist in its implementation may be prepared, copied, published and dis-
+tributed, in whole or in part, without restriction of any kind, provided
+that the above copyright notice and this paragraph are included on all
+such copies and derivative works. However, this document itself may not
+be modified in any way, such as by removing the copyright notice or
+references to the Internet Society or other Internet organizations,
+except as needed for the purpose of developing Internet standards in
+which case the procedures for copyrights defined in the Internet Stan-
+dards process must be followed, or as required to translate it into
+languages other than English.
+
+The limited permissions granted above are perpetual and will not be
+revoked by the Internet Society or its successors or assigns.
+
+This document and the information contained herein is provided on an "AS
+IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
+FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
+LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
+INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FIT-
+NESS FOR A PARTICULAR PURPOSE.
+ \ No newline at end of file