«Reliable Internet Routing Martin Suchara A Dissertation Presented to the Faculty of Princeton University in Candidacy for the Degree of Doctor of ...»
Reliable Internet Routing
Presented to the Faculty
of Princeton University
in Candidacy for the Degree
of Doctor of Philosophy
Recommended for Acceptance
By the Department of
Adviser: Professor Jennifer Rexford
c Copyright by Martin Suchara, 2011. All rights reserved.
Network routing algorithms responsible for selecting paths to destinations have a profound
impact on network reliability experienced by the network users. Unfortunately, performance of state-of-the-art routing algorithms often falls short of users’ expectations.
(i) The ﬂexibility with which operators of independently administered networks can choose their routing policies allows them to make selections that are “conﬂicting” and may lead to route oscillations. Oscillating routes have a negative impact on performance experienced by the user, and also cause overloading of the routers with control messages. (ii) Interdomain routing in the Internet is based on trust. As a result, false route announcements can be made by a malicious network operator. Such false announcements can be made even without knowledge of the network operator, e.g., due to accidentally misconﬁgurations or router hijacking. False route announcements may lead to denial of service, or worse yet, traﬃc can be intercepted without detection of both the sender and recipient. (iii) Even if network routes are stable and secure, unexpected equipment failures may cause performance degradation. It is diﬃcult to pre-conﬁgure current routing protocols with all possible failures in mind, and not enough ﬂexibility is oﬀered to balance load in the network evenly.
This thesis addresses these three challenging problems. (i) We provide a new theoretical model of interdomain routing and derive the necessary and suﬃcient conditions that determine which policy combinations lead to route oscillations. Moreover, we also provide a practical polynomial-time algorithm that allows network operators to verify the existence of such conﬂicts. (ii) To secure routing against malicious attacks, we oﬀer a new secure routing protocol that, unlike earlier attempts, is incrementally deployable. Our solution can protect both participants and non-participants if as few as 5–10 independently administered domains deploy our solution. (iii) To handle traﬃc engineering in the presence of failures, we propose a new architecture that optimizes load balancing for a wide range of failure scenarios. Our architecture supports ﬂexible splitting of traﬃc over multiple precomputed paths, with eﬃcient path-level failure detection and automatic load balancing over the remaining paths.
Collectively, the contributions of the dissertation provide tools that improve routing reliability and as a result network performance perceived by the user.
also the downs of my graduate career at Princeton. Words cannot express Jennifer’s passion for teaching and mentoring. She worked with me selﬂessly and was happy to discuss new ideas, give feedback about my writing style, presentations, and career advice. I admire her sharp intellect, professionalism with which she deals with all tasks, and ability to prioritize. I would also like to thank Jennifer for her support and encouragement to present my results at conferences and as a speaker at several universities.
I would like to thank Mung Chiang for being my co-advisor and mentor in my early years at Princeton. I highly value Mung’s expertise in optimization theory. His mentoring helped me to understand and tackle the complex optimization problems I faced when I worked on this dissertation. I would also like to thank all my thesis committee members, Mung Chiang, Gordon Wilfong, Ed Felten, and Mike Freedman for their feedback on my thesis.
I am grateful for the help I received from all the co-authors of the papers that I published. I am especially indebted to Alex Fabrikant, Ioannis Avramopoulos, and Jiayue He. Alex taught me how to think like a theorist, provided help during my job search, and was a great friend. I thank Ioannis and Jiayue for their creative ideas and co-authoring several major publications with me.
I am very lucky to have worked with a number of excellent researchers at AT&T Labs Research where I spent two summers. In particular, I am grateful to my mentors Robert Doverspike, and Dahai Xu for their mentoring and encouragement to work on practical problems with real-world impact. I also appreciate the numerous interactions I had with David Johnson. In addition, I owe thanks to Quynh Nguyen, Kostas Oikonomou, Rakesh Sinha, Kobus van der Merwe, and Jennifer Yates for help and advice with understanding the operation of the AT&T’s backbone, measurement, data collection, and data processing that I needed to ﬁnish this dissertation.
I thank Barbara Terhal and Sergey Bravyi at IBM research where I spent one summer working on quantum error correcting codes. I especially thank you for your time explaining me the concepts of a new research area I wasn’t familiar with.
I thank my fellow Cabernet group members: Minlan Yu, Yaping Zhu, Eric Keller, Wenjie (Joe) Jiang, Rob Harrison, Michael Schapira, Sharon Goldberg, Haakon Ringberg, Elliott Karpilovsky, Changhoon Kim, Yi Wang, Steven Ko, Nate Foster, Rui Zhang-Shen, Matthew Caesar, and others.
Melissa Lawson, the administrator in the Computer Science Department.
I was fortunate to work with and co-supervise two bright undergraduate students, Will Fisher and Umar Javed. Thank you for your hard work, you were great students.
Writing this dissertation would not be possible without the support of research grants from AFOSR, NSF and AT&T. In addition, my graduate study at Princeton was partly supported by the Gordon Wu Fellowship, and my two summer internships at AT&T by the DARPA CORONET Program.
I thank my parents for their love, support and encouragement. Without them this accomplishment would have been impossible.
2.5 Internal architecture of routers or ASes causes spurious updates and oscillations.. 28
2.6 Router architecture where loss of the most preferred route causes spurious announce
2.12 Sequence of activations that results in permanent oscillations............. 42
2.13 Reduction showing NP completeness of determining safety when router conﬁgura
3.3 Security performance with partial soBGP deployment with 25 large ISPs participating. 71
3.4 Security performance with ideal malicious route ﬁlters and with 25 large ISPs par
3.5 Security performance when participants use overlay routing to select valid routes.. 77
3.6 Security performance with overlay routing provided by SBone and up to 5 tier-1
3.11 Security performance when the adversary mounts a sub-preﬁx hijacking attack... 88
4.1 The network architecture consisting of a management system that precalculates
4.5 Existing tools and protocols that can be used to deploy our architecture....... 115 Chapter 1 Introduction Since the ﬁrst message was sent on the ARPANET network, the predecessor of the modern Internet, the network has experienced a dramatic growth. Once reserved for academic use, the Internet presently connects billions of users around the world who rely on the infrastructure in their daily lives.
The Internet has become much more than just a network used to access information. In the past two decades, new important applications have emerged, such as electronic commerce, voice over IP, social networking, and many more. In addition, many applications such as online banking or online trading are business critical and time sensitive. As a result, users and businesses who rely on the Internet infrastructure require a high degree of reliability from the operators of the network. Reliability encompasses the ability to oﬀer the network users high-bandwidth and lowlatency service in the presence of accidental hardware failures or planned maintenance, and ability to deliver data securely even in the presence of malicious attacks on the Internet infrastructure.
This thesis addresses these challenging problems.
Network routing, the selection of paths to destinations, is perhaps one of the most important features of the Internet that determines the performance, security, and reliability of the network.
For example, selecting the “right” paths can reduce congestion and decrease queuing and propagation delay, improving the performance for the users. Furthermore, dynamic rerouting is a critical operation that ensures that connectivity is re-established after a failure of a link or router.
Rerouting is also important when traﬃc demands of the users change. Finally, security of network routing protocols has implications on the reliability of the entire network – a malicious user may for example manipulate the routing protocol so that network traﬃc is forwarded to him, or to cause widespread connectivity disruptions.
In Section 1.1 we describe the basic operation of two main families of routing protocols – intradomain and interdomain routing protocols. Then, in Section 1.2 we describe the shortcomings of the current design and the impact on the safety, security and reliability of the network. We also outline the proposed solutions to address these important issues.
1.1 Routing in the Internet Interdomain routing concerns the problem of calculating the paths across domains that the traﬃc needs to traverse to reach the destination. Intradomain routing determines the path inside a single administrative domain that the traﬃc needs to take to reach the destination. The two problems are very diﬀerent. Intradomain routing is done in a single network and the owner of the network has a full control and information about the network topology, load, conﬁguration, etc. Interdomain routing concerns exchanging traﬃc between separate networks whose owners, who are business competitors, do not have full information about the other networks. For this reason, interdomain and intradomain routing rely on diﬀerent routing protocols and face diﬀerent challenges.
1.1.1 Interdomain Routing The Internet consists of tens of thousands of autonomous systems (ASes) that are independently owned and operated. To achieve global connectivity, ASes exchange information about reachability.
This information exchange is facilitated by the Border Gateway Protocol (BGP) . BGP is a path vector protocol, that is, when an AS uses BGP to announce a route to its neighbor, the announcement contains a list of all other ASes that the path traverses before reaching the destination. The adjacent ASes exchange the BGP messages between their edge routers, which are also sometimes referred to as BGP speakers.
ASes are typically Internet Service Providers who have business relationships with their neighboring ASes. These business relationships determine any transit fees. While business relationships are conﬁdential, a model  that is believed to correspond to reality classiﬁes business relationships into two categories: customer-provider and peer-peer. In customer-provider relationship, the customer has to pay the provider for all traﬃc that traverses the link between the ASes, no matter what the direction of the traﬃc. In peer-peer relationships, the peers forward traﬃc for each other free of charge.
The nature of business relationships determines which routes are preferred by ASes. For example, given the choice between a customer, peer and provider route, the AS will prefer the customer route which is the most proﬁtable. Business relationships also play a role even after a BGP speaker selects the single route to the destination that it prefers – a BGP speaker will not announce a provider route to another provider as it would have to pay to both providers for the transit traﬃc. For this reason ASes need the ﬂexibility to choose among multiple paths, and the option to announce the selected path to an arbitrary subset of their neighbors. BGP allows such ﬂexibility – if an AS learns about multiple routes from its neighbors, it can apply an arbitrary policy to choose the preferred path, and decide which neighbors to announce the path to.
BGP is a protocol based on trust. When a route announcement is received, autonomous systems cannot verify whether a path announced by a neighboring BGP speaker corresponds to an existing physical path, and whether that path is available to the neighbor. For this reason, BGP is extremely vulnerable to malicious attacks where an attacker compromises a router to make false routing announcements, and to misconﬁgurations where a speaker mistakenly announces an incorrect route.
1.1.2 Intradomain Routing Network operators need intradomain routing protocols that ensure network connectivity even as the network topology changes due to link additions, hardware failures, or during planned equipment maintenance. In addition, network operators desire to balance the load in their networks to avoid congestion. One protocol satisfying these goals is Open Shortest Path First (OSPF) . OSPF is a link state routing protocol, i.e., a protocol that collects information from routers about their connectivity (the state of their links). Then, the routers construct a graph representing the network, and traﬃc is sent on the shortest path according to link weights that were pre-assigned to each link. If a router ﬁnds multiple shortest paths, traﬃc is split evenly on the outgoing links.
The link state information is maintained by each router and if it changes, it is ﬂooded in the network. The beneﬁts of using OSPF include the ability to react to link failures – when a link fails the information is immediately ﬂooded in the network and all of the routers can compute new shortest paths that avoid the failed link. Furthermore, proper link weight assignment allows load balancing. However, OSPF only allows to split the traﬃc on paths of the same minimal cost.