Virtual Router Redundancy Protocol

From Citizendium

If an Internet Protocol (IP) subnet has only one local router, that router is a single point of failure. The Virtual Router Redundancy Protocol (VRRP) removes this weakness: at its most basic, it allows failover from one local router to a standby local router. Its function is similar to that of some earlier proprietary protocols, including Cisco's Hot Standby Router Protocol. It is widely implemented for Internet Protocol version 4; support for Internet Protocol version 6 was standardized later, in VRRP version 3 (RFC 5798).

The names of these protocols are subtle but important. Note that they use the word "router", not "routing". A given set of VRRP routers is seen by the hosts on a local subnet. The purpose of VRRP is not itself to send packets to destinations outside the local subnet, which the local versus remote principle of the Internet Protocol requires a router to do, but to find a running router that can send those packets.

Again, the terminology is subtle. The preceding paragraph did not say "discover a router". VRRP assumes that hosts on the subnet already know the IP address of a router, either because it was manually configured into the host or because the host learned it from a Dynamic Host Configuration Protocol (DHCP) server. VRRP also assumes there are at least two routers that can respond to the single router address known to local hosts. It is not an anycast method in which multiple routers provide the same service at once; rather, it is a way of making sure that one and only one router is available in basic situations.

VRRP's basic principle

To understand how the protocol works, assume there are two physical routers on the subnet. A VRRP router must be capable of recognizing two unicast addresses on its local interface: a unique IP address that can be used to diagnose and maintain that router, and an additional unicast address that is shared among the routers in a group of VRRP routers, which is assumed here to consist of two routers.

One router of the group is designated as primary. Whenever a periodic timer expires, the primary router multicasts a message to the other routers of its group. This message essentially says "I am working. Go back to sleep; you are not yet needed." The secondary router also has a timer. If that timer expires before a message arrives from the primary router, the secondary becomes active as the local router, and it remains active until it again hears the "I am working" message from the primary. There should never be more than one instance of the same router address active at any given time.
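The timer behavior described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the class name `BackupRouter`, the `simulate` helper, the one-second sampling step, and the three-second `dead_interval` are all hypothetical, and the sketch ignores details of the actual protocol such as priorities and the exact interval calculation.

```python
from dataclasses import dataclass


@dataclass
class BackupRouter:
    """Secondary router that takes over when the primary falls silent."""
    dead_interval: float      # seconds of silence tolerated (assumed value)
    last_heard: float = 0.0   # time of the last message from the primary
    active: bool = False      # True while answering for the shared address

    def on_advertisement(self, now: float) -> None:
        # The primary is working: note the time and return to standby.
        self.last_heard = now
        self.active = False

    def on_tick(self, now: float) -> None:
        # A full dead interval has passed in silence: become the local router.
        if now - self.last_heard >= self.dead_interval:
            self.active = True


def simulate(advert_times, end_time, dead_interval=3.0, step=1.0):
    """Return a list of (time, backup_is_active) sampled every `step` seconds."""
    backup = BackupRouter(dead_interval=dead_interval)
    adverts = set(advert_times)
    history = []
    t = 0.0
    while t <= end_time:
        if t in adverts:
            backup.on_advertisement(t)
        else:
            backup.on_tick(t)
        history.append((t, backup.active))
        t += step
    return history


if __name__ == "__main__":
    # The primary speaks at t=0,1,2, goes silent, then returns at t=8.
    for t, active in simulate([0, 1, 2, 8, 9], end_time=10):
        print(f"t={t:>4}: backup {'ACTIVE' if active else 'standby'}")
```

In this run the secondary stays in standby while messages arrive, takes over once three seconds have passed without one, and steps back as soon as the primary is heard again.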

The Layer 2 challenge

In IPv4, a host also needs the layer 2 address of a router before it can send to it. The IPv4 version of VRRP assumes that the local subnet is a broadcast multiaccess network using IEEE 802 addresses, and that the local hosts have discovered the layer 2 address of the local router using the Address Resolution Protocol (ARP). Just as there is a shared IP address, there is a shared MAC address.

Because the VRRP group shares a MAC address as well as an IP address, it is irrelevant to the end hosts which physical router is doing the routing. They simply send to the MAC address they learned, confident that the router at that address will respond.
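The article does not say how the shared MAC address is chosen. In the published VRRP specifications (RFC 3768 and RFC 5798) it is derived from the group's Virtual Router Identifier (VRID), using the prefix 00-00-5E-00-01 for IPv4 and 00-00-5E-00-02 for IPv6. A minimal sketch of that derivation follows; the function name `vrrp_virtual_mac` is hypothetical.

```python
def vrrp_virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """Build the shared (virtual) MAC address for a VRRP group.

    The last octet is the VRID; the fifth octet distinguishes
    IPv4 (01) from IPv6 (02) virtual routers.
    """
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1..255")
    family_octet = 0x02 if ipv6 else 0x01
    return f"00-00-5E-00-{family_octet:02X}-{vrid:02X}"


if __name__ == "__main__":
    print(vrrp_virtual_mac(1))              # IPv4 group with VRID 1
    print(vrrp_virtual_mac(10, ipv6=True))  # IPv6 group with VRID 10
```

Because the address depends only on the VRID, whichever router is currently active answers with the same MAC address, and the hosts' cached layer 2 entries remain valid across a failover.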

IPv6 uses a different mechanism from ARP, Neighbor Discovery, and there is much less implementation experience for VRRPv6.

Refinements

Multiple VRRP groups for load sharing

Assume there are two routers on a subnet, both connected to the same hierarchically higher networks. Rather than have one router remain idle in standby, it is often possible to create multiple VRRP groups on the subnet. Half the hosts might be configured with the default router address 192.198.0.1, while the other half are given 192.198.0.2 as their default router address. The first router is configured as primary router for the first group: it normally responds as 192.198.0.1, but is programmed to be the secondary router for the virtual address 192.198.0.2. The other router reverses the primary and secondary roles. When both routers are working, half the hosts send their traffic through the first router, and the other half send their traffic through the second router. Either router can fail, and all the hosts will still know of a router to accept their traffic.
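The two-group arrangement can be illustrated with a short sketch. The router names `router-a` and `router-b`, the `GROUPS` table, and both helper functions are hypothetical; the sketch models only which router answers for each virtual address, not the protocol exchange that decides it.

```python
# Each virtual address lists its routers in preference order:
# primary first, then secondary. Names are illustrative only.
GROUPS = {
    "192.198.0.1": ["router-a", "router-b"],
    "192.198.0.2": ["router-b", "router-a"],
}


def active_router(virtual_ip, up_routers, groups=GROUPS):
    """Return the first working router for a virtual address, or None."""
    for router in groups[virtual_ip]:
        if router in up_routers:
            return router
    return None


def assignments(up_routers, groups=GROUPS):
    """Map every virtual address to the router currently answering for it."""
    return {vip: active_router(vip, up_routers, groups) for vip in groups}


if __name__ == "__main__":
    print(assignments({"router-a", "router-b"}))  # traffic is split
    print(assignments({"router-b"}))              # router-a has failed
```

With both routers up, each virtual address is served by its own primary, so the traffic is split. If one router fails, the survivor answers for both addresses.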

Failover for other reasons than timer expiration

Using a local router and a local backup host for a normally remote service