Spanning Tree Protocol
The Layer 2 protocol used by LAN switches to locate and eliminate redundant paths is called the Spanning Tree Protocol (STP).
Spanning Tree Protocol Features
Both the original Spanning Tree Protocol and the more recent Rapid STP (RSTP) use the concept of a minimum-cost spanning tree to eliminate redundancy in a transparently bridged internetwork. They have the general characteristics that are listed below.
- They reconfigure a transparently bridged internetwork with redundant paths into a logical topology such that only one route exists between any two points in the network, thus eliminating loops.
- Only the switches participate in STP or RSTP. The operation of the both is transparent to the LAN adapters (i.e., end stations). In effect, the LAN adapters are unaware of the presence of switches in the internetwork. From the LAN adapter's perspective, it sees a single cohesive Ethernet network.
- Neither STP nor RSTP interfere with the switch's ability to carry out its primary functions of filtering and forwarding. They simply constrain the forwarding to specific ports, thereby eliminating the possibility of frame duplication.
- Both STP and RSTP offer a degree of fault tolerance by automatic reconfiguration of the topology in the wake of a network failure (e.g., switch or port).
What Is a Spanning Tree?
Understanding STP operation requires a basic understanding of spanning trees. The concept of a spanning tree actually derives from a branch of mathematics called graph theory. In this discipline, a graph is an abstract concept consisting of nodes and links. Think of it as a massive game of connect the dots. A spanning tree is defined as a subset of the graph that includes all of the nodes and only just enough links to interconnect them. The subset is essentially a tree-like structure that spans the entire graph, hence the term spanning tree.
Such a subset has no loops. It is literally a tree structure, and most trees (Banyan trees not withstanding) do not have loops; the branches fork outward into ever finer branches, but at no point do they come back together again.
The application to networking is clear. In a transparently bridged internetwork (i.e., one implemented using switches), the switches and collision domains are the nodes, and the ports on the switches are the links. The goal of the Spanning Tree Protocol is to eliminate enough of the switch ports to reduce the network to a loop-free configuration. The ports are not actually eliminated; they are simply closed to traffic.
A Spanning Tree
The name of the protocol comes from its effect on the network—STP takes a fully or partially meshed, switched network and logically reduces it to a tree-like configuration. To maintain connectivity, the tree must span the entire network, hence the expression “spanning tree.”
The visual depicts a network from the spanning tree’s point of view, after the protocol has done its job. This is a network with physical redundancies. The physically redundant ports have been logically blocked by STP and appear as dotted lines. Note that the remaining network is configured in a tree-like structure. Switch 42 is positioned as the “root” of the spanning tree. In fact, STP calls this the root bridge. Switch 93, Switch 57, and Switch 87 have all detected redundant ports and placed them in the blocked state. In this network, Switch 87 and Switch 57 are essentially the end of the line for their branch of the tree. They will forward frames out to any attached computers (not shown here), but they will never pass a frame onto their blocked port, nor accept a frame from that port. Switch 93 is still forwarding frames between Ports 2 and 3, but Port 1 is blocked.
In this network, a broadcast frame originating on any LAN is forwarded by each switch on its active ports (i.e., those ports deemed to be in the forwarding state). Note that a broadcast storm due to loops in the network is no longer possible. A broadcast frame appears once on each LAN.
Any internetwork with at least one redundant path has more than one possible spanning tree. In the network depicted on the visual, a perfectly valid spanning tree would result if Switch 93 blocked Port 3 instead of Port 1. In fact, this small internetwork has over 100 possible spanning trees!
Selecting a good spanning tree for deployment can be important, especially in larger internetworks. Some trees produce unnecessarily long paths across the internetwork. For example, the switches on the visual could have deployed a valid spanning tree if, instead of blocking the ports as depicted, they had elected to block Port 3 on Switch 63, Port 1 on Switch 93, and Port 3 on Switch 42. However, that would mean that frames traveling from Station A to Station B would be forced to take the longest possible route through the network!
Optimizing the Spanning Tree
The more richly a switched network is interconnected, the more spanning trees we can find in it, and all trees are not created equal. A network that disables links in order to arrange all of the active links in a straight line with no branches is technically a spanning tree (e.g., it has no loops), but it is arguably not a good one. Clearly there must be a way to evaluate different trees and pick a good one.
This is done by choosing a good root for the tree, and then selecting links to be included in the tree based on some defined value that represents the “goodness” of the link, what might be called a “link cost.” We would then search to find a tree that chooses the links that create the minimum possible total cost from any node back to the root of the tree.
The key to getting a tree lies in choosing a good root and establishing meaningful values to the links. A root that sits at the topological edge of the network is not a good choice. The tree will grow out from one edge of the network. Two stations, located on the opposite side of the network, that are connected to adjoining switches may find themselves on different branches of the tree because the link between those switches is blocked. Their traffic will travel all the way across the network to the root, and then cross back again on the other branch to get to their destination.
A good root is usually located near the topological center of the network, making it as close as possible to all edges of the network, and has the richest set of ports.
Setting link costs is also critical for links between switches. If all of the links are assigned a cost of one, the tree will deploy based on hop count (e.g., the number of links from the node to the root). If all of the links are the same speed, that is actually acceptable. However, if the links have different transmission rates, those differences will be ignored. A cost based on the transmit delay of the links would be better in that circumstance. It would include high-speed links and exclude low-speed ones.
Finding a Minimum Cost Spanning Tree
The Spanning Tree Protocol uses two primary criteria to deploy a tree. The first criterion is a unique bridge identifier; this is used to select the root bridge. The switch with the lowest valued bridge identifier serves as the root bridge. The second criterion is the port path cost assigned to each port on the switch. The term port path cost is actually a bit misleading, although it is the one used in the standard. It is actually the cost of that port or link, not the entire cost back to the root, which is referred to as the root path cost. The root path cost is calculated by the switches for each port and used to determine which port has the lowest cost path back to the root.
The configuration of these criteria is under the control of the network administrator. As the switches are deployed, the network administrator assigns a bridge identifier to each switch, and a port path cost to each port. Once powered up, the switches begin to transmit bridge protocol data units (BPDU) to one another. BPDUs are short messages (i.e., less than 64 octets) containing information used by the switches to discover and deploy the spanning tree. Among other things, they contain the identifier of the switch believed to be the root, and an indication of the total cost of the path from the transmitting switch to the switch determined to be serving as the root bridge.
When a switch is first powered up, it has no way of knowing which switch is serving as the root bridge. So it makes a simple assumption—it assumes that it is the root bridge! The BPDUs that it transmits list itself as the root bridge and announce a root path cost of zero (i.e., it is zero cost from itself). If this new switch is indeed the root, the other switches abandon the former root bridge and begin to calculate their cost to reach this new root. If it is not the root, one of the local switches sends a BPDU indicating the ID of the real root and the cost to reach it.
The cost to reach the root bridge is easily determined. When a switch advertises its cost to reach the root, a switch receiving that BPDU adds the cost of the port on which the BPDU is received to the root path cost listed in the BPDU. The switch now has a root path cost from itself to the root, via that port.
Based on this information, a switch can easily determine:
- Which of its ports has the shortest path back to the root; this becomes the root port.
- Which of its remaining ports attach to collision domains that have no other switch with a lower cost path back to the root; these ports become designated ports.
All remaining ports are blocked. The result is the elimination of all redundant paths through the switched network.
A Minimum Cost Spanning Tree
On the visual, Switch 42 announces that it is the root with a cost of zero to itself. Because no other switch has a lower bridge identifier, this claim goes unchallenged. Switch 93 receives a BPDU from Switch 42 on Port 3, and another on Port 1. Because Port 3 has been configured with a cost of three, and Port 1 has a cost of four, Switch 93 determines that its root port (i.e., its best path to the root) is via Port 3. It also decides to block Port 1 because there is another switch with a better path to the root connected to that collision domain (i.e., the root itself!). Finally, Switch 93 transmits a BPDU on Port 2 identifying Switch 42 as the root and a path with a cost of three to the root.
Switch 87 receives one BPDU from Switch 93 announcing a path with a cost of three to the root, and another BPDU from Switch 63 with a root path cost of four. Switch 87 determines that its best path to the root is via Port 1, which becomes the root port for Switch 87. This gives the total path from Switch 87 to the root a cost of 5 (i.e., the path reported by Switch 93 plus the cost of Port 1 on Switch 87). Because Switch 63 has a path with a cost of 4, Switch 87 elects to block Port 2.
In summary, each switch transmits BPDUs identifying the root and the known cost to that switch. This information is used by the switches to determine the lowest cost path to the root. For each switch, the port that has the lowest cost to the root is the root port. For each collision domain, the switch that has the lowest cost to the root is the designated bridge, and the port connecting the switch to collision domain is the designated port. All other ports in the network are blocked. The result is a minimum cost spanning tree. From any collision domain or switch, the path along the spanning tree to the root has the minimum possible cost. The network administrator controls the tree, ultimately deployed by the switches, by altering the bridge identifiers and/or port path costs assigned to each switch.
Reconfiguring the Tree
The Spanning Tree Protocol is also designed to be fault tolerant. Once the spanning tree is deployed, it is the responsibility of the root to generate a new BPDU every few seconds (default is at 2 second intervals). The BPDU contains information about the identity of, and cost to, the root. When a downstream switch receives a BPDU on its root port, it responds by generating BPDUs on its designated ports (no BPDUs are ever generated on a blocked port). As a result of this process, a “wave” of BPDUs traverses the spanning tree every few seconds.
These BPDUs are important. Although a switch can never learn or forward frames on a blocked port, it does actually receive frames on the port. Because the port is blocked, frames containing user data must be filtered by the switch. However, BPDUs that are received on that port are closely examined. Using the information contained in the BPDU, the switch can determine if there has been a change in the network. For example, a switch might be reconfigured with a lower cost for a given port. When that information is propagated, downstream switches might decide that there is a new, lower cost path to the root and change the status of their various ports. A port that was blocked might become the root port and a root port might be blocked.
More importantly, if a switch fails to see BPDUs arriving on its root port, it knows that something has failed. In the network depicted on the visual, Port 2 on Switch 42 has failed. This will also fail the Port 3 on Switch 93 because it is a point-to-point circuit. Switch 93 knows it has lost its root port. Based on the BPDUs it is still receiving on Port 1, this is the next best path to the root, with a cost of 4. This becomes the new root port and the port is reactivated.
When Port 2 on Switch 42 is restored to service, it will again begin to generate BPDUs. Switch 93 will detect that its port is online again, and that its original, better cost path to the root is available. Port 3 becomes the root port again, and Port 1 is blocked because there is another switch in that collision domain with a better path to the root (i.e., the root itself!).
Spanning Tree Terminology
Understanding the details of STP and the meaning of the BPDUs requires a grasp of some basic STP terminology. Below are a few of the key terms and their definitions.
- Root bridge: The switch that serves as the root of the spanning tree. In STP, the root is responsible for periodically (default is every two seconds) issuing a new BPDU to refresh the tree.
- Port path cost: The cost assigned to a particular port on a particular switch. The cost is used in calculating the total cost of the path back to the root.
- Root path cost: The calculated total cost from a port on a switch back to the root.
- Root port: On any switch (other than the one serving as the root), the port that has the lowest calculated root path cost.
- Designated port: In any collision domain, the port that connects that collision domain to the switch with the lowest cost path back to the root.
- Blocked port: A port that is neither root nor designated, and across which frames cannot be forwarded and learning cannot occur.
- Disabled port: A port that is administratively online and not a functional part of the network. It is, in effect, shut down.
Spanning Tree Rules
The operation of the Spanning Tree Protocol is fully distributed. Switches exchange information via Bridge Protocol Data Units (BPDU). Based on these messages, each switch makes local decisions about which port is the root port, which ports are designated, and which ports are blocked. The rules a bridge uses to make these decisions are as follows:
- If a switch believes it is the root, it transmits Configuration BPDUs on all of its ports on a regular basis (default is every two seconds). The frequency of these transmissions is governed by the value of the Hello Time field in the BPDU that is configured when the switch is installed. The switch assumes that all of its ports are designated ports. In the transmitted BPDUs, the switch identifies itself as the root, and indicates a root path cost of zero.
- When a switch receives Configuration BPDUs, it uses that information to build a profile for each port. The profile includes (among other things) the total root path cost via that port to the root, and the bridge identifier of the root. The port with the lowest recorded root bridge identifier and the lowest root path cost is established as the root port and is part of the spanning tree.
- For each of its remaining ports, if the switch determines it has the lowest root path cost of all other switches (if any) on the attached collision domains, it declares that port a Designated port, which is part of the spanning tree.
- Ports that are neither root nor designated are blocked and the switch is prohibited from forwarding frames across that port or learning on that port. These ports have been determined to be redundant and are not part of the deployed spanning tree.
- If a switch receives superior (or equal) root bridge information on its root port, it calculates and transmits Configuration BPDUs via all designated ports. The information transmitted includes the root bridge identifier, and the root path cost from this switch to the known root.
- If a switch receives inferior information on a designated port, it responds by transmitting its own Configuration BPDU to “correct” the errant transmitter.
Using these simple rules, a switch can configure its ports and block redundant ports.
STP Port States
While a switch is executing the Spanning Tree Protocol (STP), a switch port can be in any of five states: Blocking, Listening, Learning, Forwarding, or Disabled. These are explained below.
- Blocking: A switch port in this state does not participate in the relay of media access control (MAC) frames. In effect, a blocked port is a port that has been eliminated from the spanning tree. Ports that are neither designated ports nor root ports are blocked.
- Listening: When the spanning tree is reconfigured (i.e., due to network failure or the addition of new equipment or ports), a port that was blocked may be added to the spanning tree to bypass a network fault or to incorporate the new equipment into the spanning tree. A port moving from the blocking state first enters the listening state. The listening state is required because it may take some finite amount of time for the spanning tree to complete its deployment. As the spanning tree is deployed, it may shift a few times before assuming its final form. The listening state is simply a waiting period, allowing the spanning tree to fully deploy. This state is necessary to prevent a switch from learning addresses incorrectly, or from forwarding frames into a transient loop. From this state, the port will enter the learning state.
- Learning: A switch port in the learning state is building the filtering database by learning MAC addresses and their associated ports. A port enters the learning state from the listening state. While in the learning state, the port is not participating in MAC frame forwarding. This state is controlled by a protocol timer. When the timer expires, the port enters the forwarding state. This state is necessary to reduce the number of MAC frames forwarded unnecessarily or incorrectly.
- Forwarding: A switch port in this state is actively receiving and forwarding MAC frames. While in this state, the switch may actively update the filtering database with MAC address information learned on this port. By definition, root ports and designated ports will be in the forwarding state once the spanning tree is completely configured and MAC addresses have been learned. All other ports will be blocked or disabled.
- Disabled: A port in this state is not participating in relaying of MAC frames or in the STP. The port is, in essence, “turned off.” This state is initiated and terminated by a switch management function or by explicit action of the network administrator.
The listening and learning states prevent transient loops and the inevitable broadcast storm that would produce. However, they introduce the variable of time. It may take a minute or more for the switches to recover from a failure and place the necessary ports in forwarding state. In some applications, that may be an eternity.
|<mp3>http://podcast.hill-vt.com/podsnacks/2007q3/stp.mp3%7Cdownload</mp3> | Spanning Tree Protocol (STP)|