In the beginning there were routers. To have a router move your data, you need to cooperate with it… Then along came the Ethernet. Wonderful technology, but I’m annoyed at the inventors for calling it a network. They should have called it a multi-access link.Radia Perlman
The new edition of Radia Perlman’s bestselling book, Interconnections, encompasses the many changes in the networking field since the first edition was published in 1992. In the following interview, Perlman reveals her ability to frame this complex and fascinating technology in reader-friendly terms. She talks about her new book and shares the surprising story behind the invention of the spanning tree algorithm as well as its poetic counterpart, the algorhyme.
I used to be the routing architect at Digital Equipment Corp., and it was my work that formed the basis of today’s routing and bridging protocols. This was a natural topic for me to write about. Also, the field was getting more and more confusing. Committees reinvented the same stuff, with new terminology. Decisions were based on politics rather than understanding the implications of protocols.
Some big mistakes were made, and then patched. There also was a disturbing trend to treat the field as religion, where you weren’t allowed to question or look at certain things critically. It was a situation where anything coming out of committee X had to be bad and anything coming out of committee Y had to be good. I can’t imagine trying to understand the technology under these conditions, essentially from the specifications alone.
So I wrote the book as a friendly way to learn about the field. The idea is to help people understand it deeply; not limited to the simplistic good or bad, but taking apart a protocol into the various problems it’s trying to solve and evaluating various ways of resolving each problem.
It’s a complete rewrite, and now also significantly longer, since there’s a lot of extra material like IPv6, ATM, DCHP, VLANs, Fast Ethernet and switched Ethernet. The point of the book is still to understand the concepts rather than just give the details of a bunch of protocols. So instead of looking at a big protocol like IP all at once, I look at a specific sub-problem, like how to find the maximum packet size you can use on a path to the destination. Then I compare the solutions in all the deployed protocols and standards, as well as some variants I might suggest.
I also added a bunch of older protocols, like IPX, Appletalk, and DECnet. They are discussed partially because they still exist and it’s getting increasingly difficult to find documentation for them, but also because there are some interesting ideas, both good and bad, and this knowledge is useful for designing future protocols.
In the beginning there were routers. To have a router move your data, you need to cooperate with it. You need a layer 3 header (a protocol like IP, DECnet, IPX, etc.). There are all sorts of nice fields that help the router move the data safely, like a hop count to notice when a packet might be in a loop.
Then along came the Ethernet. Wonderful technology, but I’m annoyed at the inventors for calling it a network. They should have called it a multi-access link. When the Ethernet came along, I realized that routing protocols needed to be redesigned somewhat to accommodate potentially hundreds or thousands of neighbors. So I invented designated routers and other methods of making routing protocols work efficiently on shared media.
But the rest of the world got all excited about using Ethernet as the network rather than the complicated layer 3 stuff. For example, there was something known as LAT (local area terminal) being developed at Digital. They were proud of how many bytes they could save out of the header by eliminating layer 3. I unsuccessfully argued that they should work on top of layer 3, not just on top of Ethernet. This way, it would be possible to talk from one LAN to another.
Press people, anxious for a juicy quote, would call me and say,
Do you think Ethernet will replace DECnet? But Ethernet was a link in a network, not a network! We eventually needed to build a box that would interconnect LANs without the cooperation of the end stations. That’s what a bridge is, or rather what the transparent bridge is. The bridge was a kludge designed after the fact to work with stations that left out layer 3.
Oh yes, the short answer is a switch is just a new word for a bridge. You see, Ethernet was originally a bus topology. Then people decided it would be more convenient to wire buildings as a star topology, with a hub that was a multiport repeater. Then they noticed that by making the hub store packets &ndash at least until it knows which port(s) to forward them on, and then only forward onto those ports – you could get higher aggregate bandwidth. This is because two pairs of stations could talk to each other at full bandwidth.
Then they decided that it would be convenient to plug one of these boxes into another, which means you might wind up with loops, which means you need the spanning tree and then...voila...you’ve reinvented the bridge. People try to claim that a switch is hardware where a bridge is software or that a switch can have lots of ports whereas a bridge has only two ports. But none of this is true.
The concept of a bridge is a device that acts like a station on more than one link, learns the location of stations based on the layer 2 header, computes the spanning tree, and forwards packets on that topology. Then people said,
Gee, we can make our switches smarter by looking at the layer 3 header. And they called that a layer 3 switch, which happens to be a router. So the words may change, but the concepts haven’t.
No. For subtle reasons, the success of IP depends on bridges. With a protocol like CLNP or DECnet, where there’s a bottom level of hierarchy in which you can move around and keep your layer 3 address, bridges could just go away. But with IP, you need a different address on every link. So bridging forms the equivalent bottom level of hierarchy for IP.
It allows there to be a reasonable sized region in which stations can move and keep their IP address. The routes at that level of hierarchy would be better with a layer 3 protocol designed with a bottom layer to explicitly route to individual nodes, rather than bridging, since with bridging you have to prune the topology to a spanning tree. But since IP has no such concept at layer 3, bridging serves that purpose.
That’s a fun story. Not only will I tell you about the algorithm but about the poem too! Remember, the world got all confused and thought they could eliminate layer 3, and I failed to convince them otherwise. Well, years later, Tony Lauck (my manager) wandered into my office and said,
Radia, we need to design a box that will let packets move from one LAN to another. I said,
Err...we have such a box. It’s called....a router.
But I knew what he was talking about, which is that we had to design a box that worked without cooperation from the endnodes. The bridge implementers at Digital had the general idea: a box that listened promiscuously on each of its ports, learned station locations based on the source address in the layer 2 header, and forwarded based on destination address.
But this would only work if the topology was loop-free, so they needed a spanning tree algorithm. Coming up with distributed algorithms was where I was called into the project. Tony assumed it would be difficult to come up with such an algorithm. Then, I don’t know what made him do it, but he said:
While you’re at it, to make it more challenging, make the algorithm infinitely scalable. No matter how many bridges and how many LANs, the algorithm should take only a constant amount of memory and CPU in each bridge. It should only take a constant amount of bandwidth on each LAN.
No routing algorithms are like that! What was he thinking? They’re all n**2 or if you’re creative enough (as in source route bridging) exponential. Not constant! Then he went away for a week. That night I woke up realizing not only was a spanning tree algorithm trivial, but it had the properties my manager requested!
It was constant in memory, CPU, and bandwidth (with a little waving of hands...for an engineer it’s constant; for a mathematician, it’s actually log n, since I’m assuming a fixed size 48-bit ID for a bridge). I spent the next day writing it up very carefully. It took six pages, and was complete enough that the bridge implementers had no problem implementing it from the description.
Now for the poem. There was still almost a week left before Tony would return and I could show off the algorithm to him. I had all this nervous energy, anticipating blowing him away with its simplicity and having met his impossible criteria. I couldn’t concentrate on a different problem at this point. So I decided to try to write a poem based on Joyce Kilmer’s Trees poem.
So I did the poem, titled Algorhyme (on the suggestion of Mike Speciner, who coined the word and said,
Every algorithm ought to have an associated algorhyme).
And here it is:
Oh yes. I’m co-author of Network Security: Private Communication in a Public World, published by Prentice Hall. If you like Interconnections, you’ll like this book. It has the same level of technical depth while still being easy to read. And it’s funnier.