USC Computer Science Technical Reports, no. 715 (1999)
Enabling Large-scale Network Simulations: A Selective Abstraction Approach

By Polly Huang

A Dissertation Presented to the Faculty of the Graduate School, University of Southern California, in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy (Computer Science)

December 1999

Copyright 1999 Polly Huang

Table of Contents

List of Figures
List of Tables
Dedication
Acknowledgements
Abstract
Introduction
  1.1. Thesis Statement
  1.2. Contributions
    1.2.1. Scaling Techniques
    1.2.2. Selection Guidelines
    1.2.3. Case Studies
  1.3. Organization
Background and Related Work
  2.1. Internet Protocols
    2.1.1. Unicast Protocols
    2.1.2. Multicast Protocols
    2.1.3. Scaling Dimensions of Network Protocols
  2.2. Network Simulations
    2.2.1. Network Simulation Development
    2.2.2. Parallel and Distributed Simulation
    2.2.3. Simulation Abstraction
    2.2.4. Hybrid Simulation
    2.2.5. Related Work Summary
Scaling Techniques
  3.1. Overview
  3.2. Abstraction Techniques
    3.2.1. Centralized Computation
    3.2.2. End-to-End Packet Delivery
    3.2.3. Algorithmic Routing
    3.2.4. Finite State Automata Modeling
  3.3. Hybrid Simulation
  3.4. Optimizations
    3.4.1. Packet Reference Count
    3.4.2. Virtual Classifier
  3.5. Selection Guidelines
Systematic Simulation Comparison
  4.1. Centralized Multicast & Session Multicast
  4.2. Algorithmic Routing
  4.3. FSA TCP
  4.4. Mixed Mode
  4.5. Multicast Packet Reference Count
  4.6. Virtual Classifier
Impacts on Simulation Studies
  5.1. Case Study: SRM
    5.1.1. SRM Mechanism
    5.1.2. Applying Session Multicast
    5.1.3. SRM Simulations
  5.2. Case Study: RAP
    5.2.1. RAP Mechanism
    5.2.2. Applying Mixed Mode
    5.2.3. RAP Simulation
  5.3. Case Study: Self-similar Traffic
    5.3.1. Self-similar Traffic Causality Study
    5.3.2. Applying FSA TCP
    5.3.3. Self-similar Traffic Simulation
Contributions and Future Work
  6.1. Thesis Statement
  6.2. Contributions
  6.3. Short-term Future Work
    6.3.1. Algorithmic Routing
    6.3.2. FSA TCP
  6.4. Long-term Future Work
    6.4.1. Router Characteristics
    6.4.2. Domain Abstraction
    6.4.3. Simulation Validation
  6.5. Conclusion
References

List of Figures

Figure 1. A comparison of detailed dense mode multicast and centralized multicast
Figure 2. ns-2 implementation of a multicast node
Figure 3. A comparison of detailed packet distribution and session multicast
Figure 4. An illustration of the causality problem
Figure 5. An example of error dependency in session multicast
Figure 6. Algorithmic lookup
Figure 7. Breadth-first search tree mapping
Figure 8. TCP sending a batch of packets per round trip time
Figure 9. FSA TCP: Reno and regular acknowledgement
Figure 10. FSA TCP: Tahoe and regular acknowledgement
Figure 11. FSA TCP: Reno and delayed acknowledgement
Figure 12. FSA TCP: Tahoe and delayed acknowledgement
Figure 13. Illustration of a mixed mode multicast session
Figure 14. Illustration of packet reference count
Figure 15. Comparison of original classifier (left) and virtual unicast classifier (right)
Figure 16. Selection guidelines assisting the entire experiment validation process
Figure 17. 100-node random transit-stub topology
Figure 18. Comparison of memory usage
Figure 19. Comparison of time usage
Figure 20. Difference between dense mode and centralized multicast
Figure 21. Distortion in end-to-end delay by session multicast
Figure 22. Memory consumption for flat, hierarchical, and algorithmic routing
Figure 23. Time consumption for flat, hierarchical, and algorithmic routing
Figure 24. Distortion by algorithmic routing: difference in route length
Figure 25. Example of route distortion using algorithmic routing
Figure 26. ISP-like environment
Figure 27. Memory consumption for FSA TCP
Figure 28. Time consumption for FSA TCP
Figure 29. Distortion by FSA TCP: % difference in throughput
Figure 30. Distortion by FSA TCP: difference in delay
Figure 31. Memory consumption for mixed mode
Figure 32. Time consumption for mixed mode
Figure 33. Distortion by mixed mode: % difference in throughput
Figure 34. Memory consumption for packet reference count
Figure 35. Time consumption for packet reference count
Figure 36. Memory consumption for virtual classifier
Figure 37. Time consumption for virtual classifier
Figure 38. Performance comparison for SRM and session SRM
Figure 39. Accuracy comparison for SRM and session SRM
Figure 40. RAP fairness: four combinations of detailed and mixed mode simulations
Figure 41. Global scaling plots: artificial Poisson and exact self-similar processes
Figure 42. Global scaling plots for measurement data
Figure 43. Global scaling plot: heavy-tailed object sizes vs. exponential
Figure 44. Almost identical global scaling plots: detailed TCP and FSA TCP

List of Tables

Table 1. Case studies
Table 2. Scaling dimensions and their effect on resource consumption
Table 3. A comparison of network simulators
Table 4. Implementation tuning and abstraction are complementary
Table 5. Large maximum and small median for route length % difference
Table 6. Scaling techniques and distortions summary
Table 7. Abstraction selection methodology summary
Table 8. Case studies summary

Dedication

To my parents, for their love and support.

Acknowledgements

I am deeply grateful to my advisors, Deborah Estrin and John Heidemann. I thank Deborah for her invaluable encouragement and support throughout my time at USC, as well as her support of my dream of becoming an astronaut. I thank John for his impeccably patient and consistent guidance throughout my time at ISI, especially his constant reminders not to under-sell this work. I am thankful to John Silvester, Michael Arbib, and Dave Wile for providing feedback on this work and for serving as members of my Qualifying Exam and Dissertation Committees. I appreciate the inspiring interactions among my fellow students and researchers in dgroup, the USC Network and Distributed System Lab, the ISI VINT project, and AT&T Labs-Research Florham Park. Thank you all. I am grateful for the financial support from various funding agencies: the National Science Foundation Infrastructure grant (award number CDA-9216321), the NSF PIM project grant, the DARPA VINT project grant (contract number ABT63-96-C-0054), Sun Microsystems Inc., and Cisco Systems Inc. I am especially thankful for my family, mom, dad, August, Amy, Tiffany, Coco, and Meowmeow Jr., for their love and support.

Abstract

The Internet research community widely uses network simulators for protocol evaluations. As the Internet's size and complexity double every year, simulation scalability becomes increasingly important. Just like designing a scalable Internet protocol, making a general-purpose network simulator scalable is challenging. Parallel and distributed simulation is one approach to improve simulation scale, but it can require expensive hardware, have high overhead, and is limited by the number of CPUs (10-100). In this thesis, we investigate a complementary solution -- abstraction. Just as a custom simulator includes only the details necessary for the task at hand, we show how a general simulator can support configurable levels of detail for different simulations, achieving 10-1000 times improvements in scale. We identify several general abstraction techniques. In particular, centralized computation and end-to-end packet delivery are used to abstract network and hop-by-hop transmission details, whereas algorithmic routing and finite state automata modeling are used to abstract network topology and traffic. Based upon these general concepts, we derive abstractions for various unicast and multicast simulations.
Our experiments show that each of these abstraction techniques helps to gain one order of magnitude of improvement. However, these abstraction techniques can introduce errors by distorting certain aspects of the network characteristics. For studies that do not concern the affected network characteristics, we are able to perform the same simulations much more efficiently while the simulation distortion results in negligible changes in the conclusions drawn. We show that for case studies in several important research areas (reliable multicast, multimedia congestion control, and self-similar data traffic), distortion is not a problem. Based on our experience in applying abstractions and managing distortions, we provide a set of general guidelines that help users systematically and progressively select the appropriate abstractions for their simulations. Furthermore, if no abstraction can be applied uniformly throughout a simulation, our hybrid simulation mode enables users to tune and select parts of their simulations at a finer granularity and thus achieve a desirable balance of efficiency and accuracy.

Chapter 1
Introduction

One of the most important steps in protocol design is evaluation. Implementation, analytical modeling, and simulation are traditionally the three methods for performance evaluation. Evaluations using actual implementations can capture equipment-related details (e.g., processing time), but can be extremely costly if large testbeds are required to study protocol scaling properties. Even if such a large testbed is available, repeated experiments over the testbed can be cumbersome when implementation or design flaws are discovered. By contrast, analytical modeling is much more conservative in terms of computing resource usage. However, in order to keep the models tractable for the size and complexity of today's Internet, a significant amount of detail has to be left out. For some studies, such as coarse-grain buffer size estimation, pure analytical models may work, but for others, such as comparisons of congestion control mechanisms, analytical models are often too abstract to provide meaningful results. While analysis and implementation will remain crucial methods for all designs, researchers have turned increasingly to simulations to complement these methods. Our focus in this thesis is to improve the scalability of general-purpose simulators and thus to support large-scale protocol evaluations.

General-purpose network simulators (such as ns-2 [49]) make constructing simulations easier by capturing characteristics of network components and providing a modular programming environment, often composed of links, nodes, and existing protocol suites. For instance, a link may contain transmission and propagation delay modules, and a node may contain routing tables, forwarding machinery, local agents, queuing objects, TTL objects, interface objects, and loss modules. These composable modules provide a flexible environment to simulate network behaviors, but depending on the level of detail desired, these details may or may not be required. Unfortunately, this modular structure can result in significant resource consumption, especially when the simulation scenarios grow.

Five minutes of activity on a network the size of today's Internet would require gigabytes of real memory and months of computation on today's 100 MIPS uniprocessors.
--- Ahn and Danzig [33]

One solution to the scalability problem is parallel and distributed simulation, in which simulation jobs are divided into parts and coordinated over a number of CPUs. Parallelism can improve simulation scale in proportion to the number of CPUs added, but this linear growth is not sufficient to provide the several orders of magnitude of scaling needed, since the Internet is growing at a much higher rate. Parallel simulation can also require hardware that is expensive or not widely available.

A complementary solution is to slim down the simulation by abstracting out details. The basic idea is to analyze a particular set of simulations, identify the bottleneck, and eliminate it by abstracting unnecessary details (i.e., making the simulator slim). Abstraction saves time and memory, but simulation results may be distorted. Users must be careful that conclusions drawn from abstract simulations are not affected. We address this problem by providing identical simulation interfaces for detailed and abstract simulations, allowing users to validate through side-by-side comparisons of their simulations at small scales. When the abstraction is validated, they can then use it for very large-scale simulations.

Simulation scaling involves a process of selective abstraction that is often application specific, thus raising two interesting questions: "what must be abstracted to make simulations scalable?" and "what is the effect on simulation results?" We examine these questions using various unicast and multicast simulations as examples. This thesis shows that abstracting details can speed up simulations and reduce memory consumption significantly while the results of the simulations are not crucially affected. Applications of our abstraction techniques may allow the research community to perform large-scale simulations that were impossible before, and to perform previously achievable large-scale simulations on much lower cost hardware. As a result, this work enables evaluation of protocol performance and design in large-scale scenarios.

1.1. Thesis Statement

Using abstraction, one can simulate large-scale networks with a substantial amount of traffic and yet retain significant accuracy in many experiments. To verify the above statement, we focus on the following tasks:

• Develop abstraction, hybrid, and optimization techniques
• Pinpoint the distortions caused by abstraction techniques
• Provide general guidelines to assist users with progressive and systematic abstraction selection
• Demonstrate the use of the selection guidelines through sample simulation studies

1.2. Contributions

There are two important components in improving simulation performance through selective abstraction: the abstraction techniques and the process of selecting appropriate techniques. In order to identify useful abstraction techniques, we classify network research problems and search for the commonality within each class. As a result, these abstraction techniques effectively improve simulation performance, but they are applicable only to network problems. In contrast, the proposed systematic guidelines for progressive abstraction selection may apply to many forms of abstraction-oriented simulation. The guidelines, when used properly, can lead users through the trade-off between abstraction and distortion and maintain the integrity of users' simulation studies.

1.2.1. Scaling Techniques

We propose four abstraction techniques.
For each technique, we analyze the simulation performance, pinpoint the aspects that are distorted, discuss potential impacts on the problems studied, and suggest suitable applications.

The first abstraction, called centralized computation, is designed to leave out the control message exchange details that are commonly found in network layer protocols. This technique can be applied to unicast routing, multicast routing, or even reservation protocols. Our experiments focus on centralized multicast, an application of centralized computation to multicast routing. We find significant memory usage and run time improvement but also identify behavioral differences in the measured control message overhead and route convergence time. These two distortion effects can impact studies that are concerned with low-level control bandwidth estimation or protocol dynamics during transients. Suitable applications include transport layer protocol studies that evaluate end-to-end performance metrics.

The second abstraction, called end-to-end packet delivery, leaves out details of the queuing delay calculated in hop-by-hop packet transmission. This technique can be applied to both unicast and multicast packet dissemination. Our experiments focus on session multicast, an application of end-to-end packet delivery to multicast packet dissemination. We find improvements in simulation performance but also observe an increasing difference in end-to-end packet delay as the degree of congestion grows. This distortion effect can impact studies that require congestion in the simulation scenarios. Appropriate applications of this technique include studies of reliable transport protocols such as reliable multicast.

The third abstraction, called algorithmic routing, is designed to leave out the effects of certain topology details so we can reduce the size of the routing table from O(N^2) to O(N). Algorithmic route generation is particularly useful for unicast routing with large numbers of nodes. Our simulation experiments show a dramatic decrease in memory consumption, and the results are consistent with our analytical findings. If the original topology is not a tree, distortion can occur in routes that cross 'cycles' in the original topology. While the average difference in route length is small, a few routes can be several times longer, depending on the size of the cycles in the topology. This distortion may impact studies that are concerned with absolute delay measurements over the same topologies. In addition, with the same topologies, algorithmic routing can change the degree of concentration at the bottlenecks, so congestion control studies should also avoid algorithmic routing. Suitable applications are simulations that use tree topologies or have single sources, as well as simulations concerned with the general traits of a protocol's scaling properties.
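As a rough illustration of the algorithmic routing idea (a minimal sketch, not the ns-2 implementation), the code below maps a topology onto a breadth-first-search spanning tree and answers next-hop queries from parent pointers alone, so routing state stays O(N) in total; the adjacency list, node numbering, and helper names are hypothetical.

    from collections import deque

    def bfs_tree(adj, root=0):
        # Map the topology onto a spanning tree via breadth-first search.
        # Only one parent pointer and one depth per node are kept: O(N) state.
        parent, depth = {root: None}, {root: 0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent:
                    parent[v], depth[v] = u, depth[u] + 1
                    queue.append(v)
        return parent, depth

    def next_hop(u, dst, parent, depth):
        # Route along the tree only; links that cross the tree (the 'cycles'
        # mentioned above) are ignored, which is the source of the distortion.
        if u == dst:
            return u
        b = dst
        while depth[b] > depth[u] + 1:
            b = parent[b]
        return b if parent.get(b) == u else parent[u]

    # 5-node ring: the direct link 2-3 is a cross link, so the tree route
    # from node 2 to node 3 detours through the root.
    adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
    parent, depth = bfs_tree(adj)
    print(next_hop(2, 3, parent, depth))   # -> 1 (first hop of 2-1-0-4-3)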
The fourth abstraction, called finite state automata modeling, is designed to leave out details in the modules that generate traffic. Simulations that require many of these traffic generation modules can consume a great deal of memory. This technique can be applied to TCP, UDP, and other timer-based traffic-regulating protocols. For instance, TCP waits a round-trip time to transmit the next batch of packets (i.e., the time for a data packet to reach the TCP sink and an acknowledgement to come back). We generate TCP finite state automata, FSA TCP, using a detailed TCP implementation and find that the memory requirement when using FSA TCP is reduced significantly. The more TCP connections are created, the more memory we save. The distortion is found in the small timing details. This distortion may impact detailed behavioral studies of TCP and its variants. However, FSA TCP is very useful for studies that introduce new traffic types into the Internet, for instance, control mechanisms for video, audio, and multicast traffic. These protocols require simulations with background TCP traffic to study their impact on existing traffic, but are not heavily dependent on TCP details. The two limitations are that our FSA TCP currently handles transfers up to 31 packets long and assumes that there is only one loss per connection. TCP connections that do not comply with these two constraints need to be simulated in detail.
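To make the finite-state-automata idea concrete, the toy sketch below (far simpler than the FSA TCP tables derived from the detailed implementation, and purely illustrative) keeps one integer of state per connection, the batch size sent in the current round-trip time, and advances it once per RTT on an "acked" or "loss" event; the table contents and helper names are hypothetical.

    # Toy Tahoe-like automaton: state = packets sent in the current RTT;
    # one dictionary lookup per RTT replaces a full TCP agent per connection.
    TAHOE_FSA = {
        (1, "acked"): 2, (2, "acked"): 4, (4, "acked"): 8, (8, "acked"): 8,
        (2, "loss"): 1, (4, "loss"): 1, (8, "loss"): 1,   # loss: back to 1
    }

    def run_connection(events, batch=1):
        # Replay one connection, yielding the batch size sent in each RTT.
        for ev in events:
            yield batch
            batch = TAHOE_FSA.get((batch, ev), 1)

    print(list(run_connection(["acked", "acked", "loss", "acked"])))
    # -> [1, 2, 4, 1]: slow-start doubling, a single loss, then restart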
Through our understanding of Internet protocols and their simulations, we are able to identify effective abstractions for various bottlenecks in network simulations, pinpoint distortions, reason about potential impacts, and suggest suitable applications. These abstraction techniques can be used together and thus enable flexible 'levels of abstraction' in network simulations.

Nevertheless, we find that in some cases one abstraction technique cannot be applied uniformly throughout a simulation. In these cases, often a large portion of the simulation is abstractable while a small inapplicable portion prohibits the use of the abstraction throughout. Thus, we propose a hybrid simulation technique that facilitates interactions between detailed and abstract modules in a single simulation instance. Our hybrid simulation technique, called mixed mode, is a general concept that allows mixtures of abstraction techniques and their detailed forms to run in the same simulation instance. As a result, we get tunable, fine-grain levels of abstraction. Hybrid simulation can be applied with any of the four abstraction techniques. Our experiments here focus on mixing session multicast and detailed hop-by-hop packet delivery, called mixed mode multicast. In doing so, we allow the congested areas to be simulated in detail, taking into account the necessary queuing delay information, while the non-congested areas are simulated in session mode, reducing memory and run time consumption. As expected, we observe some performance improvement, though not as much as if the abstraction were applied throughout, while the error in end-to-end per-packet delay is small enough (0.3%) to be statistically negligible. With a minor API addition, we extend this technique to mix session and detailed unicast packet delivery. This mixed mode packet delivery appears suitable for congestion control studies.

In addition to the abstraction and hybrid simulation techniques, we propose two optimization techniques. By optimization, we mean better, more efficient representations of data structures that produce exactly the same simulation results. Packet reference counts use a counter to track identical copies of the same packet while using little additional memory. Packet reference counts are proven effective for multicast simulations and could be potentially very useful for wireless simulations, where the medium is inherently multi-access. The virtual classifier is used to eliminate the forwarding entries that are duplicated from high-level routing tables in simulated nodes for the purpose of reflecting real-world node structure. In simulations, accessing high-level routing tables or low-level forwarding caches gives identical routes. Our virtual classifier technique sends route requests directly to the high-level routing tables and thus avoids allocating memory for the forwarding caches, while giving virtually the same simulation results.
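The packet-reference-count optimization can be pictured with the toy sketch below (illustrative only, not ns-2's packet class; the class and method names are hypothetical): copies fanned out to multiple receivers share one payload buffer, and a counter decides when that buffer can be reclaimed.

    class RefCountedPacket:
        # Copies share the payload; only a small counter is added per packet.
        def __init__(self, payload):
            self.payload = payload
            self.refs = 1

        def retain(self):            # one call per downstream branch
            self.refs += 1
            return self

        def release(self):           # called when a copy is delivered/dropped
            self.refs -= 1
            if self.refs == 0:
                self.payload = None  # stand-in for freeing the buffer

    pkt = RefCountedPacket(b"x" * 1000)
    copies = [pkt.retain() for _ in range(3)]   # fan out to three receivers
    for c in copies:
        c.release()
    pkt.release()
    print(pkt.refs)   # 0: the single shared buffer is reclaimed exactly once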
1.2.2. Selection Guidelines

The abstraction techniques improve simulation scale but also introduce errors. These errors may or may not be critical to a given simulation study. Users of abstraction techniques have to be careful to avoid drawing invalid conclusions. Our general guidelines can lead users through the process of selecting appropriate abstractions. We add memory and CPU time monitoring functions that help users identify performance bottlenecks. Consequently, users can select effective abstraction techniques to eliminate the bottleneck. In addition, we use APIs that allow nearly identical detailed and abstract simulation configurations. Users can conveniently compare results side-by-side to validate the use of abstraction techniques at small scale. To proceed with large-scale simulations using abstractions, users will have to determine whether certain distortions matter to the metrics being measured, or whether scaling factors enlarge the degree of distortion. This step remains highly dependent on the researcher's individual expertise. In the long term, we hope to gather networking expertise, categorize Internet research problems, define scaling factors and interesting measurement metrics, and finally correlate these scaling factors and measurement metrics with the various network characteristics that abstraction techniques might distort.

1.2.3. Case Studies

We demonstrate the use of our abstraction techniques and selection guidelines in three case studies, all active research topics. SRM (Scalable Reliable Multicast) [50] is a reliable multicast protocol that uses a timer-based request/response suppression mechanism. We apply session multicast and packet reference counts and show that we can improve the simulation performance while maintaining substantial accuracy. RAP (Rate-based Adaptation Protocol) [111] is a TCP-friendly congestion control for real-time media. We are not able to select any abstraction technique that can be applied throughout, so we apply the mixed mode packet delivery (session unicast plus detailed hop-by-hop packet delivery). The results are almost identical for full-detail and mixed mode simulations. Self-similar traffic analysis [99] is a study to identify data traffic's time scale phenomena, usually from the scale of 10 msec and up. Thus, we apply FSA TCP. The simulation results confirm that the scaling phenomena from the 10-msec scale and up are almost identical in either the detailed or the FSA TCP simulations.

Table 1. Case Studies
Technique | Application | Experiments | Case Studies
Centralized Computation (Section 3.2.1) | Centralized Multicast | Section 4.1 | --
End-to-end Packet Delivery (Section 3.2.2) | Session Multicast and Unicast | Section 4.1 | SRM (Section 5.1)
Algorithmic Routing (Section 3.2.3) | Algorithmic Unicast Routing | Section 4.2 | --
Finite State Automata Modeling (Section 3.2.4) | FSA TCP | Section 4.3 | Self-Similar Traffic (1) (Section 5.3)
Hybrid Simulation (Section 3.3) | Mixed Mode | Section 4.4 | RAP (2) (Section 5.2)
Packet Reference Count (Section 3.4.1) | Packet Reference Count for Multicast Packets | Section 4.5 | SRM (Section 5.1)
Virtual Classifier (Section 3.4.2) | Virtual Unicast Classifier | Section 4.6 | --

(1) Each simulation in the self-similar traffic study uses both detailed and FSA TCP connections. This is a form of hybrid simulation.
(2) Each simulation in the RAP study uses both detailed and session delivery modes. This is the mixed mode described and evaluated in Sections 3.3 and 4.4.

1.3. Organization

After a brief overview of Internet protocols and related work, we present the abstraction techniques applied to unicast and multicast simulations, and a set of selection guidelines for applying abstractions to simulations. We then discuss results comparing the performance and accuracy of abstract simulations to their detailed equivalents. Several active research topics, SRM (Scalable Reliable Multicast), RAP (Rate Adaptation Protocol), and self-similar data traffic, are re-examined to demonstrate the usefulness of the general guidelines and abstraction techniques. Finally, we conclude with contributions and future directions.

Chapter 2
Background and Related Work

This thesis focuses on improving simulation efficiency for Internet protocol evaluation. To design simulation techniques that can remove simulation performance bottlenecks, a thorough understanding of the Internet is necessary. We begin this chapter with an overview of the Internet, in particular its physical structure and layered protocol architecture. Characteristics of various routing, transport, and application-level protocols motivate our simulation technique designs and provide insights into possible distortions that might be introduced by abstractions. We continue the chapter with a brief description of relevant unicast and multicast protocols and the scaling factors associated with the Internet's size and complexity (Section 2.1). To compare and contrast our abstraction approach with other simulation techniques, we conclude with a survey of simulation methodologies that forms the core of our related work (Section 2.2).

2.1. Internet Protocols

The physical Internet is a collection of routers interconnected by links. Users usually access the network through end systems (or hosts) that are connected to routers through multi-access media, for example Ethernet. A contiguous collection of routers under one administrative authority is called an administrative domain (or simply a domain). From routers to domains to the Internet, there exists roughly a two-level hierarchy.

End users communicate through layers of protocols. From physical, link, network, and transport to application, each layer performs certain designated functions and hides details from the higher layer. This property is also known as 'layer transparency': higher-layer protocols are not required to know the underlying details and will work seamlessly with any lower-layer protocol design. This property is why end systems can still communicate no matter what operating system we use, what unicast routing we use, or what bandwidth links the data are transmitted through. This layered architecture allows the Internet's heterogeneity at different layers today. We will discuss unicast and multicast protocols separately.

2.1.1. Unicast Protocols

Unicast protocols include one-to-one routing, transport, and application protocols. On top of routing, transport protocols deal with important transmission properties such as reliability and flow control. Depending on the nature of the data to be transmitted, the degrees of reliability and flow control may vary. Above the transport and network layers, numerous applications interact with end users. During the past few years, applications have evolved in many dimensions, namely speed, volume, media type, and composition of traffic among media types.

Unicast Routing

Distance Vector (DV) [72]-[75] and Link State (LS) [78][79] are the two most popular intra-domain routing techniques, whereas the path-vector Border Gateway Protocol (BGP) [76][77] is currently used as the primary inter-domain routing protocol in the Internet. Each DV router sends a vector of distances from itself to every other router in the domain to its neighboring routers. After receiving these distance-vector reports, each router re-computes shortest routes to every other router and sends updated distance-vector reports again. These reports are sent periodically and eventually flood the entire routing domain. As a result, DV does not scale to large networks and is only suitable for intra-domain routing. Link State routing floods local topology information. Each router will eventually obtain information about the entire topology and compute shortest routes using that topology. As with DV, the flooding element in LS makes it only suitable for intra-domain routing. To route packets inter-domain, we currently rely on the Border Gateway Protocol. Border routers running BGP exchange routing information per domain (domain identifications are usually represented by the high bits of IP addresses), so aggregation can be used to reduce memory space consumption.
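As a concrete illustration of the distance-vector exchange described above, the sketch below runs Bellman-Ford style rounds in which every node re-computes its distance estimates from its neighbors' advertised vectors; the three-node topology, link costs, and function names are hypothetical.

    INF = float("inf")

    def dv_round(dist, neighbors):
        # dist[u][d]: u's current distance estimate to d.
        # neighbors[u]: {v: link_cost}.  One round of report exchange.
        updated = {}
        for u in dist:
            updated[u] = {}
            for d in dist:
                updated[u][d] = 0 if u == d else min(
                    (neighbors[u][v] + dist[v].get(d, INF) for v in neighbors[u]),
                    default=INF)
        return updated

    neighbors = {"A": {"B": 1}, "B": {"A": 1, "C": 2}, "C": {"B": 2}}
    dist = {u: {u: 0} for u in neighbors}   # initially, only routes to self
    for _ in range(2):                      # reports flood in a few rounds
        dist = dv_round(dist, neighbors)
    print(dist["A"]["C"])                   # -> 3 once the vectors converge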
Reliable Transmission and Congestion Control

TCP (Transmission Control Protocol) [83] is currently the most widely used unicast transport protocol. TCP guarantees reliable data transmission and reacts to network congestion. The two ends of a TCP connection keep state such as the packet sequence number, to detect packet losses and retransmit until the transmission is completed, and the sending window size, to control the speed with which a source injects data packets into the network. Acknowledgements from the receivers and packet losses signal network capacity availability and congestion, respectively. These signals in turn affect the sending window size and eventually the sending speed. Several flavors of TCP have been developed over the past few years. They are more or less confined to this general principle of linear increase and multiplicative decrease. One exception is TCP Vegas [82]. Instead of window-based flow control, TCP Vegas uses a rate-based mechanism, which essentially estimates the available bandwidth per connection and sends packets with the corresponding inter-packet interval. Most Internet applications nowadays use TCP for reliable transmission with delayed acknowledgement [109], where acknowledgements are sent every other packet.
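The linear-increase/multiplicative-decrease principle mentioned above can be sketched in a few lines (a schematic model, not any particular TCP flavor; the function name and round counts are hypothetical): the window grows by one segment per loss-free round trip and is halved when a loss signals congestion.

    def aimd(loss_rounds, rtts=10, cwnd=1):
        # Trace the congestion window (in segments) over a number of RTTs.
        trace = []
        for rtt in range(rtts):
            trace.append(cwnd)
            if rtt in loss_rounds:
                cwnd = max(1, cwnd // 2)   # multiplicative decrease on loss
            else:
                cwnd += 1                  # linear (additive) increase
        return trace

    print(aimd(loss_rounds={5}))
    # -> [1, 2, 3, 4, 5, 6, 3, 4, 5, 6]: the familiar sawtooth pattern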
Real-time Multimedia Congestion Control

Several congestion control designs have been proposed for real-time data transmission, as compared to TCP for bulk data transfer. Many kinds of real-time media by nature do not require 100% reliability. However, it is very important that real-time data be played back 'smoothly' after being sent through the network. Window-based flow control usually creates bursty data transfer, and thus more and more researchers have turned to rate-based control for multimedia congestion control. In the early 90s, several rate-based congestion control mechanisms [84]-[87] were proposed, but most of them require extra mechanisms inside the network. More recently, the need for congestion control for bulk file transfer and multimedia to coexist has become important. Thus the concept of TCP-friendliness [88] has become one of the major design principles for adding new services to the Internet. SCP (based on TCP Vegas) [89], VDP [90], and the congestion controls implemented for existing commercial media streaming applications [116] have either been shown not to be TCP-friendly, or have not been properly shown to be TCP-friendly. Several TCP-friendly congestion control schemes have surfaced in the network research community. In particular, it has been shown through simulations and scientific experimentation that RAP [111] and TCP each allocate an equal share of network bandwidth.

Unicast Applications

Recent measurement studies [108] show that HTTP (HyperText Transfer Protocol) traffic has displaced FTP as the most significant contributor to Internet traffic. The original idea of HTTP is simple. It allows clients to request a web object and the corresponding servers to respond as to whether they possess such a web object. If the servers do have a copy of the web object, they subsequently send the contents of the requested object using TCP. Since the World Wide Web (WWW) took off [96], several performance problems in the original HTTP design have been raised. For example, when a user clicks on a web page, several HTTP connections may be established to transmit the multiple web objects contained in the page. Some of these HTTP connections may experience unnecessary TCP slow start, because the window sizes in closing TCP connections may still be a good indication of network capacity availability. HTTP with persistent TCP connections [107] proposes to carry consecutive web object transfers over a persistent TCP connection and thus to avoid the unnecessary delay caused by TCP slow start. The WWW will need caching to further improve web performance in terms of delay and bandwidth consumption. Several web caching architectures have been proposed [113]-[115].

2.1.2. Multicast Protocols

Multicast means the routing support for transfer of information from one or more sources to multiple receivers. On top of multicast routing, applications may need support for reliably multicasting data and handling congestion. In this subsection, we briefly examine multicast routing, reliable multicast, multicast congestion control, and multicast application design.

Multicast Routing

Multicast supports the transfer of information from one or more sources to multiple receivers (multicast group members). Current Internet multicast protocols transfer information along distribution trees rooted at the group sources and spanning the group members. Various multicast routing protocols establish these multicast trees in different ways: broadcast and prune (e.g., DVMRP [37] and PIM-DM [38]), membership advertisement (e.g., MOSPF [46][47]), and rendezvous-based (e.g., CBT [39], PIM-SM [40]-[45], and BGMP [67]). Broadcast-and-prune multicast routing protocols flood data packets all over the network and then prune back branches that do not have members. They are suitable for groups with dense member distribution and are thus also called dense mode (DM) multicast routing protocols. Membership advertisement multicast routing protocols advertise membership and topology information everywhere in the network and calculate the multicast tree routes according to this information. Due to the domain-wide advertisement of membership and topology information, these multicast routing protocols do not scale to large networks and are most suitable for intra-domain multicast. Rendezvous-based multicast routing protocols build multicast trees by sending explicit membership join messages toward a well-known rendezvous point and the sources. These protocols have good scaling properties and are therefore suitable for inter-domain multicast routing and sparse member distribution. Much of the current IP multicast infrastructure uses dense mode multicast protocols such as DVMRP and PIM-DM, so we use a DVMRP-like dense mode as the standard detailed multicast for the simulations.
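A toy sketch of the broadcast-and-prune idea (illustrative only, not DVMRP or PIM-DM themselves; the tree layout and function name are hypothetical): data is first flooded down the whole distribution tree, then branches whose subtrees contain no group members are pruned back.

    def prune(tree, members, node):
        # tree: node -> list of children on the broadcast tree.
        # Returns the set of nodes kept after pruning memberless subtrees.
        kept = set()
        for child in tree.get(node, []):
            kept |= prune(tree, members, child)
        if kept or node in members:
            kept.add(node)
        return kept

    # Source at node 0; only nodes 3 and 4 have joined the group.
    tree = {0: [1, 2], 1: [3], 2: [4, 5]}
    print(sorted(prune(tree, members={3, 4}, node=0)))
    # -> [0, 1, 2, 3, 4]: the branch to node 5 has been pruned away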
Reliable Multicast

Multicast routing does not guarantee that all the group members successfully receive packets. Therefore, multicast transfer with some form of error recovery is needed, much as TCP is needed for unicast. Several reliable multicast transfer mechanisms have been proposed to suit various multicast applications. In particular, SRM [50] is designed for interactive multi-party applications, e.g., wb (whiteboard, an interactive conferencing tool) [57]. In SRM, the goal is to shorten the recovery delay as much as possible without generating too many recovery packets. RMTP [59] is designed to help distributed simulations that rely on multicast to transfer synchronization packets. In this kind of application, the recovery delay is even more essential. Because the distributed simulation environment is usually controllable, the recovery overhead can be compensated for by over-provisioning the communication bandwidth among the machines running the simulations. On the other hand, MFTP [61] is for multicast file transfer, an application that has a much looser delay constraint, and it is therefore designed to minimize the recovery overhead.

Multicast Congestion Control

As in unicast, congestion control is important to keep the network stable in the face of unknown resource demands. In multicast, the problem is harder due to the heterogeneity of a multicast group. Simply backing off when detecting a congestion signal is likely to reduce the source rate to zero. A few multicast congestion control protocols have been proposed to solve this zero-source-rate problem. One of them is RLM [51]. It suggests that application media can be encoded into several layers, each layer providing another degree of resolution. Together with multiple, consecutive multicast groups, receivers can selectively join or leave groups. Other proposals [52][54] suggest dynamically and adaptively subdividing the group into several sub-groups according to correlation. Each sub-group then uses a TCP-like connection between the sub-group representative and the source.

Multicast Applications

Multimedia conferencing applications have become increasingly popular during the past years. Tools like vic [56], vat [55], and wb have been ported to several platforms, including Microsoft Windows. As corporations and Internet Service Providers deploy mature multicast, the user population of conferencing and broadcast is expected to grow dramatically in the near future. Therefore, it is important to investigate the scalability of these applications.

2.1.3. Scaling Dimensions of Network Protocols

Scaling dimensions are the variables that affect resource consumption, such as memory and CPU usage. In the scope of Internet protocols, scaling dimensions can be categorized into topology and traffic. The number of nodes and links, link capacity (delay-bandwidth product), and buffer size are topology-related scaling factors; the number of sources, groups, and members, and the source data rate are traffic-related scaling factors.
Each of them affects resource consumption by requiring nodes and links themselves, using state in nodes, and creating packets on links and in queues. Table 2 lists all the factors and the resources affected.

Table 2. Scaling Dimensions and Their Effect on Resource Consumption
Class | Factor | Affected Resource | Growth
Topology | Node (N) | Nodes, control messages, and routing states | O(N^2)
Topology | Link (L) | Links, control messages, multicast routing states and packets (4) | O(L)
Topology | Link Capacity (C) | Packets on links | O(C)
Topology | Buffer Size (B) | Packets in queues | O(B)
Traffic | Source (S) | Packets on links, connection states, and multicast states | O(S)
Traffic | Group (G) | Packets on links, control messages, and multicast states | O(G)
Traffic | Member (M) | Packets on links, control messages, and multicast states (5) | O(M)
Traffic | Data Rate (D) | Packets on links | O(D)

(4) Potentially, multicast groups with the same number of members span larger trees in larger topologies.
(5) Overhead depends on the membership distribution.

Total Resource = O(N^2) + O(L) + O(C) + O(B) + O(S * G * M * D)   --- Equation 1

From Table 2, we obtain the above formula, which estimates the growth rate of resource consumption for a network simulation. The formula also reveals the degree of potential effectiveness for each factor, which is a useful reference for systematically developing abstraction techniques and setting their priority. Moreover, the constants in this resource consumption analysis may not be negligible. For instance, in ns-2, the memory consumption for an individual link is much greater than the memory consumption for a routing entry. Sometimes the total memory consumption for links, though on the order of O(L), can be larger than the cost of the routing table, on the order of O(N^2). Especially for topologies with a high connectivity degree L/N (i.e., L >> N), the total resource to create links, l * L, can be greater than the resource to maintain the routing table, r * N^2, where l and r are the memory consumption for a link and a routing entry, respectively.
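Equation 1 can be read as a back-of-the-envelope cost model; the helper below simply transcribes it, with the per-unit weights (r, l, c, b, t) as hypothetical placeholders rather than measured ns-2 constants.

    def resource_estimate(N, L, C, B, S, G, M, D,
                          r=1.0, l=1.0, c=1.0, b=1.0, t=1.0):
        # O(N^2) routing state + O(L) links + O(C) packets in flight
        # + O(B) queued packets + O(S*G*M*D) traffic-related state.
        return r * N**2 + l * L + c * C + b * B + t * S * G * M * D

    # Doubling the node count quadruples the routing-state term, which is
    # why flat routing tables tend to become the bottleneck first.
    print(resource_estimate(N=100, L=300, C=10, B=50, S=5, G=2, M=10, D=1))
    print(resource_estimate(N=200, L=600, C=10, B=50, S=5, G=2, M=10, D=1))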
Table 3. A Comparison of Network Simulators
Class | Simulator | Scale | Available Modules
Sequential, General | MIT NetSim | -- | --
Sequential, General | NEST | Multi-thread, 100 nodes | --
Sequential, General | REAL | 100 nodes | TCP, telnet, ftp, routing, scheduling
Sequential, General | VINT ns-2 | Abstraction, 50,000 nodes on one Pentium PC | Rich library of TCP, unicast routing, multicast transport, multicast, scheduling mechanisms and traffic sources
Commercial | OPNET | -- | Rich (7)
Commercial | COMNET III | -- | Rich (7)
Commercial | BONeS | -- | Rich (7)
Emulation | NIST Net | -- | Various network conditions (e.g., delay and loss)
Emulation | VINT ns-2 Emulation | -- | TCP and UDP interfaces between simulation and real world
Special-Purpose (Routing) | MaRS | -- | Routing, telnet, ftp, and Poisson sources, simplified TCP
Special-Purpose (Routing) | PIMSIM | -- | PIM-SM and PIM-DM
Special-Purpose (Routing) | MCRSIM | -- | Multicast routing, voice and video sources
Special-Purpose (TCP) | TCP with PTOLEMY | -- | TCP
Special-Purpose (TCP) | Netsim | -- | FIFO queue, simple traffic source
Special-Purpose (IP/ATM) | INSANE | -- | Basic IP transport and scheduling mechanisms, IP and ATM link layer, ATM queuing and signaling
Special-Purpose (IP/ATM) | ATM Simulator | -- | ATM switches, hosts, applications, link layer and virtual circuit routing
Abstraction | VINT ns-2 | 50,000 nodes on one Pentium PC | Rich library of TCP, unicast routing, multicast transport, multicast, scheduling mechanisms and traffic sources
Parallel & Distributed | DistREAL | 2000 nodes (8) | TCP, telnet, ftp, routing, scheduling
Parallel & Distributed | Parsec (Maisie) | -- | Wireless (9), TCP
Parallel & Distributed | S3 | 100,000 wireless nodes (10) | ATM, reliable multicast, wireless, radio, mobile network, TCP
Parallel & Distributed | VINT ns-2 and Georgia Tech | In progress | Rich library of unicast and multicast protocols and network components
Hybrid | OO Analytical Model | 1024 nodes | Routing, service time model queue (FIFO only)

(7) The commercial network simulators have strong library and service support. Most modules are available or can be easily created with the provided tools.
(8) The authors of DistREAL expect to run 2000-node simulations on 20 Sun workstations.
(9) Wireless network simulation is one of the major applications of Parsec. At this time, it does not support wired simulations.
(10) The authors of S3 claim that S3 is capable of 100,000-wireless-node simulations on thousands of machines.

2.2. Network Simulations

There has been a great deal of work on network simulations. Here we briefly summarize prior work in simulator development, parallel and distributed simulation, abstraction, and hybrid simulation. Table 3 compares them according to the following criteria: scale (e.g., sequential discrete-event simulation, distributed and parallel simulation, and abstraction techniques) and available modules (e.g., network topologies, source patterns, and protocols).

2.2.1. Network Simulation Development

During the past decade, many simulators were built to fit individual needs. In this section, we briefly review the major simulators and categorize them into packet simulators, emulation, and special-purpose simulators.

Packet Simulator

Previous sequential packet network simulators were not capable of large simulations containing more than hundreds of nodes. In this subsection, we briefly examine the architecture of some packet network simulators and the associated scaling issues.

NEST [21] is a general-purpose communication network simulator. It addresses the duplication-of-effort issue: instead of developing both simulation code and a real implementation, NEST allows actual code to be plugged in with only minor adjustments. It also contains a graphical environment for simulation and rapid prototyping of distributed networked systems and protocols. Its core, the simulation server, is designed so that multiple threads of execution are supported in a single process and are therefore efficient and lightweight. NEST scales up to hundreds of nodes in a workstation environment.

REAL [20], based on NEST, concentrates on evaluating transport protocols and packet scheduling mechanisms. TCP and many telnet/ftp sources are implemented for node entities; several routing and scheduling algorithms are implemented for gateway entities. A graphical interface and monitoring tool make it easy to draw graphs, set parameters, monitor node variables, and generate printable reports. REAL is capable of 100-node simulations on a Sun workstation.

Commercial simulators such as OPNET [12][68] of MIL3, COMNET III [13], and BONeS [11] are user-oriented. They provide convenient graphical tools (to name a few, scenario creation, model construction, and trace analysis) and rich protocol module libraries to assist users through the phases of their simulation study. Nevertheless, there is no specific focus on simulation scalability.

Most packet simulators in this category concentrate on providing a rich library of TCP/IP network component and protocol implementations. Especially in commercial simulators, network components and protocols are implemented in great detail. As a result, these simulators are limited to small or medium size simulations, up to a couple of hundred nodes. The ns-2 simulator is similar to these general packet simulators in providing a rich network component and protocol library.
However, the ns-2 simulator differs from other general simulators in providing levels of abstraction (or detail). Users can select the appropriate level of detail to meet the requirements of their simulation study. As a result, the ns-2 simulator is potentially capable of simulations with tens of thousands of nodes.

Network Emulator

Network emulation is an alternative approach to evaluating network protocols. This approach supports live-code testing and eliminates the duplicate work of developing both a simulation model and real code. More importantly, it is more realistic: less likely to inadvertently neglect things. However, live-code testing also implies a higher cost in hardware and is not practical for testing a network with more than hundreds of nodes.

NIST Net [32] is a more recent attempt to improve and integrate previous efforts on network emulation and live-code testing (e.g., packet drop emulation by Thomas Skibo, UIUC [32], packet delay emulation by Elliot Limin Yan, USC [116], and the X-kernel simulator by Lawrence Brakmo et al., University of Arizona [81]). Network emulation has the advantages of a controlled and reproducible environment and a high degree of "real-world behavior" from both sides of simulation and live-code testing. However, this approach may consume a larger amount of human effort during the development phase due to the intensive kernel programming. The NIST Net developers chose the widely available Linux 2.0.30 for the implementation and promise to provide a wide variety of network conditions (delay, reordering, loss, duplication, and bandwidth limitation) for live-code testing. The emulator architecture is built upon an intercept agent that intercepts incoming packets and decides whether to reschedule, drop, or forward them to user-supplied handlers. The NIST Net development group does not address scaling issues.

The ns-2 simulator also provides emulation interfaces to live TCP and UDP implementations for FreeBSD. These emulation interfaces allow interactions between live code and simulation models, as well as interactions between pieces of live code. The sizes of emulation experiments are usually small because these experiments require a high degree of detail. However, emulation is important and complementary to simulation, and it provides good opportunities for validation.

Special-purpose Simulators (Routing, TCP, Ethernet, and ATM)

Many other simulators are customized to suit individual needs. In this section, we introduce several simulators that are specifically developed to evaluate routing protocols, TCP, Ethernet, and ATM. However, like most simulators mentioned in the previous subsections, the focus is to develop realistic simulation models rather than to pursue large-scale simulations.

The next few simulators, MaRS [24], PIMSIM [31], and MCRSIM [29], are for the study of routing algorithms. Their basic structures are very similar. They are all discrete event simulators; each node represents a router or switch, and each link represents a connection between nodes. However, they provide different sets of routing algorithms, source patterns, and node models to suit the needs of individual studies. MaRS, for datagram network (IP) routing, uses file transfer (FTP), remote login (TELNET), and simple (Poisson) traffic patterns to drive the simulations, whereas MCRSIM, for circuit-switching network (ATM) routing, uses voice and video traffic patterns.
Each of MaRS and MCRSIM provided a variety of unicast and multicast routing algorithms, whereas PIMSIM, extended from MaRS, emphasized on the study of PIM- DM and PIM-SM. MaRS and PIMSIM's event handling and user interface routines come from a network simulator developed by MIT (NetSim [22]), and an optional graphical interface can be used to monitor changes in simulation states and many average or instaneous performance measures. MCRSIM, written in C++, uses Motif and X library for graphical user interface, which allows users to create or change their topologies graphically. There is no apparent effort in improving simulation scalability. A TCP Simulator built on PTOLEMY [30] [23] implements 4.3BSD Tahoe based TCP and a variant with fast retransmission. PTOLEMY is written in C++ with an object-oriented architecture. It is a general simulation engine that models and simulates the communication network, signal process, hardware and software design, parallel computing, and various other applications. The core of PTOLEMY development is its generic simulation construction framework that allows a broad range of system design study. Network simulations use the ‘discrete-event’ domain in PTOLEMY where a network is a combination of stars, galaxies and universes, representing the nodes, links, and clusters in communication network. A form of wormhole routing is used to route packets or messages inside the topology. Despite the wide range of applicability, PTOLEMY scaling property has not been studied thoroughly. 32 Netsim [25] concentrates on modeling Ethernet. It assumes the queues are strictly FIFO, and therefore it can use a less memory consuming post-scheduling to simulate the network queues. As a result, the simulator has better scaling properties when network size and amount of traffic grow. However, the size of the network is limited to 1000 stations to match the IEEE 802.3 Ethernet standard. ATM Network Simulator and INSANE are the two simulators designed for ATM related simulations. While ATM Network Simulator concentrates on ATM switch and application designs, INSANE concentrates on IP-over-ATM designs. Hence, ATM Network Simulator, based on MIT’s NetSim, provides more detailed implementation of ATM specific components, physical links, ATM switches, broadband terminal equipment, ATM applications, and variable or constant bit rate traffic source. In addition to common ATM components, INSANE provides a library of typical Internet applications (e.g., ftp, telnet, www browser, audio, video, and empirical traffic models derived from tcplib [70]) and the Internet protocol stack (e.g., TCP, UDP, and IP). Both simulators focus on the development of network components, as opposed to the investigation of simulation scalability. It is common to see special-purpose simulators leave out unnecessary details. For instance, many TCP and ATM simulators compute unicast routing tables using simple shortest path algorithms in a centralized fashion, as opposed to implementing the distributed distance vector or link state style unicast routing mechanisms. Other customized simulators may simplify the calculation of queuing delays because there will be no congestion in these simulations. These observations have motivated some of our abstraction techniques. Our work further generalizes and extends the use abstractions for network simulations in a larger scope. 33 2.2.2. Parallel and Distributed Simulation An alternative to sequential simulation is parallel and distributed simulation. 
It exploits the cost benefits of microprocessors and high-bandwidth interconnections by partitioning the simulation problem and distributing executions in parallel. The distributed simulations require techniques such as conservative and optimistic [6]-[8] synchronization mechanisms to maintain the correct event ordering. Consequently, the simulation efficiency may be degraded due to the overhead associated with these techniques. In addition, the typical simulation algorithm does not easily partition for parallel execution. Ohi and Preiss [4][5] investigated several block selection policies and found limited speedup and possibly degraded performance when there were a large number of unique event types. Although parallel and distributed simulation is useful when large computers are available, alternative techniques such as abstraction are needed to make very large simulations. Several parallel and distributed simulators are examined in the rest of the section. In DistREAL, simulation time is divided into many passes (the length of a pass is adjustable depending on the desired degree of parallelism), and a variant of barrier synchronization is applied to coordinate among REAL instances. This synchronization mechanism is easy to implement and there is no need to rollback, but the gain in execution time may be limited. It is shown that REAL is capable of 100-node simulations on a Sun workstation and is expected to run 2000-node simulations on 20 Sun workstations. Maisie (also known as Parsec) [9] is a language-based simulator (enhanced C with a few primitives). This type of simulator separates the simulation program from the underlying parallel simulation algorithms (e.g., sequential or parallel). Therefore, it is easy for properly written sequential simulations to be ported to parallel simulations. A major concern of Maisie is the efficiency of the simulations, so an execution-monitoring tool is conveniently provided for the purpose. Users can start with a fast design/implementation of a prototype simulation which models the existing system, and then use the provided transparent monitoring tool to identify 34 most frequently executed events, thus refine the prototype implementation. Ultimately, a parallel and distributed version of the prototype simulation can be easily created with a few adjustments. When switching to parallel mode, users can choose either optimistic or conservative algorithm. The parallel version scales linearly better. COMPOSE [28] is an object-oriented variant of Maisie. Like Maisie, it supports both conservative and optimistic algorithms for parallel and distributed execution. In addition, objects may dynamically change its mode of synchronization in COMPOSE. COMPOSE is implemented in C++ and the simulation facilities are provided as library routines instead of language extension. The composibility, reusability, and flexibility are improved but the conversion to equivalent parallel versions becomes more difficult. However, the linear improvement in performance is expected to stay the same. TED (Telecommunication Description Language) [14]-[18] is a language designed for modeling telecommunication networks. By combining a specification language (METATED) and an external language (e.g., C++), TED is able to separate protocol structure from behavior, emphasize on modularity, and yet retain the extendibility to parallel simulations. 
ATM PNNI routing simulations using TED has been observed a speedup factor of 5.05 with 8 processors (two hundred nodes, three hundred edges, and ten thousand call requests). Preliminary results of a multicast protocol, wireless network, and TCP on TED also suggest significant speedup. However, the authors also observe the range of improvement degrades gradually as the number of processors increases. S3 (Scalable Self-Organizing Simulation) [10] adapts the TED framework and emphasizes the continuing development of the extensible common description language and the efficient, dynamically adjustable parallel simulation system. The developers have set their goal on large- scale telecommunication systems, including an extended defense communication architecture targeting networks with 1,000,000 nodes. There have been simulations containing 100,000 35 mobile units distributed over thousands of cells, but further improvement will require hierarchical modeling techniques [10]. Previous works have shown that parallel and distributed simulators can indeed improve simulation scales, but the improvement is linear at best. The Internet is growing at a much higher rate. To conduct realistically large network simulations, we need solutions that can further improve performance already achieved by parallel and distributed techniques. Our abstraction approach is one such solution and complementary to parallel and distributed simulations. 2.2.3. Simulation Abstraction As a complementary alternative to parallel simulations, we propose simulation abstraction. Most customized simulator developers have been more or less abstracting details they deem not crucial to their study, but very few of them conduct formal studies on the efficiency and impact of abstraction techniques. Recently several simulation research groups expressed interests in the area of abstraction techniques and are conducting researches as this thesis is prepared. The following paragraphs briefly introduce previous effort, as well as work in progress. Flowsim [33] is one of the pioneer works in simulation abstraction. In 1996, Ahn and Danzig proposed to abstract packet streams (Flowsim) for packet network simulations and proved that Flowsim could be adjusted to the desired simulation granularity and help to study flow and congestion control algorithms more efficiently. However, Flowsim only abstracts one aspect of network simulation. This thesis presents four more abstraction techniques, thus expanding the scope of applying abstraction. Fast Network Simulation [34] considers sessions (i.e., flows) and ignores individual packet transmission. Although one extreme of Flowsim (the coarsest granularity) can be viewed as simulating only sessions, the Fast Network Simulation pursues the abstraction method differently. It is proposed to observe the effect of various simulation parameters on a flow, thereby replacing 36 individual packet transmission with light-weight session-level reactions to simulation condition change. The preliminary document of Fast Network Simulation also states the importance of the abstraction techniques proposed in this thesis as tools for the session cause-effect study. The development of the session-level abstraction is underway. Optimizing Simulation for Large Networks [36] is another related work in progress. It stresses an important concept -- dynamic changes in simulation ‘resolution’ (levels of details), which has been deployed in the Physics, Biology, and Chemistry community. 
Scientists with different purposes may look at ‘things’ at different depths, scopes and angles [3]. For instance, physicists look at the scale of light years (the universe), biologists look at the scale of millimeters (human cells), and chemists look at the scale of 10 -20 meter (molecules). Similarly, network studies with different purposes may look at ‘networks’ at different depths, scopes and angles. Consequently, simulations can be made more efficient by focusing on the hot spots with desired level of details and letting the rest run in less detail. The same argument is made to justify the abstraction techniques in this thesis, called levels of abstraction. There appear to be some common characteristics between the two efforts 2.2.4. Hybrid Simulation In hybrid simulation models [35], both discrete-event simulation and analytic techniques are combined to produce efficient yet accurate system models. There are examples of using hybrid simulations on hypothetical computer systems. For instance, discrete-event simulation is used to model the arrival and activation of jobs when a central-server, analytical queuing network is used to model the use of system processors. The accuracy and efficiency of the hybrid techniques are demonstrated by comparing the result and computational costs of the hybrid model of the example with those of an equivalent simulation-only model. Our abstract simulation can be thought of as an application of hybrid simulation to networking, where one layer is hybridized. 37 Mixed mode is a hybrid of detailed and abstract techniques, where one part of a network topology is hybridized. The object-oriented simulator (OO) [69] focuses on the study of heuristic routing strategies on large-scale communication networks. It is desired to have a large number and variety of entities, and dynamic behavior and interaction among them. The object-oriented approach can facilitate the study of heterogeneous communication network. In order to scale better, it is suggested that simulations should focus on those deemed essential for the analysis, and others should be abstracted and represented implicitly or left out completely, when increasing the network size and interaction complexity. The core of the simulator is an event-driven schedule list which seems more appropriate than a time-driven mechanism 11 , because the computer system and communication network components tend to schedule their actions at various time instants. The authors adapt a post-scheduling queuing mechanism which buffers only one event in the event list per queue, and a priori service time mechanism which generates the packet service time according to a default exponential distribution (could be a simple deterministic distribution to give constant service delay). It is reported that a routing experiment of a 32x32 grid topology (1024 nodes; each is a queue; packets generated in constant mean rate; sent to random destination) takes 4400 sec on a HP 712/80 workstation for 300 seconds of simulation time. 2.2.5. Related Work Summary In Table 3, we classify network simulators discussed above into several categories. OPNET, COMNETIII, and BONeS are the more popular commercial network simulators. Their vendors emphasize on providing user-friendly interfaces, and detailed network component and protocol modules. To discuss publicly available simulators, we further separate these simulators into general-purpose and special-purpose simulators. 
General-purpose simulators usually provide a 38 rich library of network- and transport-layer protocol modules as well as various network node or link component models. Based on the scaling methods these general-purpose simulators adapt, we further subdivide them into sequential, parallel and distributed, and hybrid simulators. NetSim, NEST, REAL, and ns-2 fall into the sequential simulator class. Using a selective abstraction approach described in this thesis, ns-2 is currently capable of 10,000s-node and 10,000s-packet-flow simulations. Parsec, S3, and ns-2 Georgia Tech extension fall into the parallel and distributed simulator class. Parsec and S3 claimed to be capable of 10,000s- to 100,000s-node wireless simulations on multiple machines. OO is one of the few hybrid simulators that use analytical models within packet level simulations. Other than these more general-purpose simulators, there exists a great deal of special-purpose simulators. Some are designed to study routing mechanisms, some are designed to study TCP performance, and others are designed to study IP over ATM issues. 11 Time-driven scheduling mechanism queries all components to decide whether a event needs to be executed which is often to be null events in environments such as computer systems. 39 Chapter 3 Scaling Techniques The focus of this chapter is the details of scaling techniques, including four abstractions, one kind of hybrid simulation, and two optimizations, as well as the details of applying these scaling techniques to various multicast and unicast simulations. Because ns-2 has been widely used for network research and richly supplied with various network protocols, we choose to implement our scaling techniques on ns-2 [49]. For each scaling technique and its application, we start by describing the original implementation in ns-2 and then the abstract version. Following the description of each technique, we qualitatively compare and contrast the details that are being left out and the distortions that might have been introduced. Detailed quantitative evaluations are presented in Chapter 4, where we systematically examine the efficiency of each abstraction in terms of memory and run time, as well as the associated distortions. In the last section, we present the general guidelines to systematically and progressively select scaling techniques for individual studies. Unlike the proposed scaling techniques that may be applicable only to data network simulations, our selection guidelines may be more generally applicable to other forms of ‘abstract’ simulations. 3.1. Overview Large-scale simulations are restricted because of resource constraints, typically in CPU consumption and memory usage. In the context of one-processor, sequential simulations, 40 memory usage is usually the bottleneck [102]. Anecdotal evidence suggests that memory is often also the bottleneck in parallel simulations. A simulation that requires more memory than a simulation machine can provide will never complete. Simulations that have working sets much larger than physical memory will take orders of magnitude longer to complete. The thesis shows that through abstractions we are able to reduce resource consumption, mainly memory, and thereby enable larger-scale simulations. Abstractions Abstraction is possible because most protocol architectures are layered such that one protocol makes use of another. 
In design or evaluation of a level-n protocol, we need information provided by level n-1 and below, but not necessarily (depending on the research questions) all the details of the lower level protocols. For instance, a multicast transport protocol may need multicast routing tables in order to forward multicast packets. The detailed exchange of messages to generate those routing tables is often not important. If we abstract away unnecessary details, the memory and time consumption that would have been used by these details can be conserved and used to simulate larger scale scenarios. Based on this reasoning, we proposed centralized computation and end-to-end packet delivery. These two techniques are aimed to abstract away, respectively, network layer and hop-by-hop transmission details that higher layer protocol studies may or may not require. Protocol complexity is only one dimension of the scaling problems. To be able to simulate very large networks and substantial amounts of traffic, we need to enhance those elements that construct network connectivity or generate traffic in the simulations. Currently, in most network simulators, the cost to construct network connectivity is proportional to the cost of creating nodes O(N), links O(L) and routing table O(N 2 ). For large topologies (N > 1000), routing table cost is the most significant. Hierarchical routing has the 41 potential of reducing the cost of routing information to O(NlogN) when the network hierarchy is fairly balanced. Raman, Shenker, and McCanne suggested [112] a routing mechanism for simulation that we call algorithmic routing, and we generalized it to apply to arbitrary topologies. This form of routing shifts some of the burden in maintaining routing information to computation and results in only O(N) memory consumption. Note that algorithmic routing is designed for the purpose of efficient network simulations; it is not suitable as a routing protocol for real networks. The cost of generating traffic comes from creating packet flows and scheduling packets. While Ahn and Danzig have focused [33] on abstracting details of simulating individual packets, we emphasize reducing the cost of creating individual traffic flows. According to recent measurement studies [108], Internet traffic is composed mainly of TCP flows and very few UDP flows. Furthermore, distribution of these TCP flows is heavy-tailed. In other words, most of these TCP flows are short and the transmissions are more likely to complete before TCP reaches the steady state, i.e., connections would complete in the slow start phase rather than the congestion avoidance phase. Based on this observation and our understanding of TCP’s roughly ‘per round trip time transmitting a batch of packets’ behavior [118], we propose to model short TCP connections using a finite state automata, also interpreted as a multi-state Markov chain. Using this finite state automata, the cost of a TCP flow is almost just a pointer to a state in the finite state automata, as opposed to all the variables and mechanism details that the regular TCP implementation has to maintain. Distortions The laws of physics often suggest that it is likely that we lose something whenever we gain something. The same principle applies to the study of abstract simulations. By leaving out certain details, we gain efficiency but also create distortions. However, it is not a bad thing to lose some details as long as these details will not affect the final conclusion drawn from the 42 simulation study. 
Moreover, this is not a new tradeoff or observation. People have long conducted simulation studies with their own customized simulators, and these customized simulators often implicitly adopt some form of abstraction. Some are more abstract than others, depending on the nature and assumptions of the studies. It has been and will always be primarily the researcher's responsibility to make sure their simulations are congruent with the assumptions of their studies. In other words, the process of validating a simulation study is application specific. To ease the process of validation, ns-2 provides nearly identical APIs to configure detailed and abstract simulations. This feature allows convenient comparison between detailed and abstract simulations. To systemize the process of selecting appropriate abstractions, we propose a set of selection guidelines. To fully utilize the selection guidelines, we also pinpoint the distortions introduced by each of the abstraction techniques. This is possible because we know the exact details that are left out. Consequently, anyone can go over his or her measurement metrics and decide whether the distortion will affect the final conclusion. The proposed systematic, progressive abstraction selection guidelines apply to any form of abstract simulation. The guidelines, when used properly, can assist users in sorting through the tradeoffs between abstractions and distortions and in maintaining the integrity of simulation studies and conclusions.
Hybrid simulations
For some studies, we may not find any abstraction that is applicable throughout the entire simulation. If an abstraction is applicable to a large portion of the simulation, the simulation can still take advantage of our hybrid simulation technique, which allows detailed and abstract modes to run at the same time. If applying the various abstraction techniques creates a simulation environment with 'levels of abstraction', this hybrid simulation technique, in a sense, enables users to tune their simulations at an even finer granularity.
Optimizations
There is a subtle difference between abstraction and implementation tuning. Implementation tuning means a faster and less memory-consuming implementation that retains exactly the same results as the original simulations, whereas abstraction means leaving out details, which may result in distorted results. We believe the two are orthogonal and complementary approaches to achieving simulation scalability. Table 4 gives an analogy to explain the subtle difference between abstraction and implementation tuning: an abstract simulation can be re-implemented in a more efficient manner, and, similarly, an optimized implementation can apply abstraction techniques for another degree of performance improvement.

Orthogonal Solutions     Without Implementation Tuning     Implementation Tuning
Without Abstraction      4.1 + 4.1 + 4.1                   4.1 * 3
Abstraction              4 + 4 + 4                         4 * 3
Table 4. Implementation Tuning and Abstraction are complementary

3.2. Abstraction techniques
Abstraction differs from the other scaling techniques in that each abstraction technique ignores certain aspects of the network characteristics. Our first abstraction technique, Centralized Computation, ignores the control messages that are typically sent throughout the network in order to establish protocol-specific states. Secondly, End-to-end Packet Delivery ignores the queuing aspects of end-to-end delay. Thirdly, Algorithmic Routing can be interpreted as ignoring links that form cycles in the original topologies.
Lastly, FSA Modeling is used to ignore connection level details for any transmission protocols that exhibit periodic behaviors. The following subsections describe these abstraction techniques and their example applications in detail. In order to compare and contrast the gain by applying abstractions, we also discuss the existing models implemented in the Network Simulation version 2 (ns-2). 44 3.2.1. Centralized Computation In a typical network layer protocol (e.g., routing), each network node propagates local information globally and computes the ‘states’ needed in order to perform the task of the protocol. For example in distance vector unicast routing, each router propagates its routing table globally and computes the best paths to all other nodes according to the routing tables it heard from other routers. To higher layer protocols, ‘ends’ of the lower layer protocols are important, but ‘means’ may not be so. Our observation on simulations using detailed network layer routing protocols suggests that the global message exchange and timer events are the major performance bottlenecks. In particular, we are able to simulate only 10s of nodes with the original, detailed dense mode multicast routing with 128 MB virtual memory. Our first abstraction technique, centralized computation, is to provide an option to ignore the global message exchange and to conserve resources for larger-scale simulations. Instead of setting timers, sending messages and calculating states in a distributed fashion, centralized computation technique use algorithms that will produce equivalent results, and update ‘states’ (e.g., routing table) instantly. Described in the paragraphs below, we implement this technique on multicast routing as an example. The same approach can be used for similar protocols (for example, unicast routing and the reservation protocol, RSVP [71]). The centralized computation technique conserves a significant amount of memory and time at the cost of a slight difference in route convergence latency when group membership or topology changes. Original Dense Mode Multicast Routing The ns-2 implementation of dense mode multicast closely follows real-world implementations in terms of message exchanges. Each dense mode multicast agent maintains a parent-child relationship to all other agents by a triggered check to neighbors’ routing tables. When a cache- miss occurs in a node (e.g., a source starts to send packets), this node upcalls its local dense mode 45 agent to install a multicast entry according to the source and group addresses in the packet header. The dense mode multicast agent will insert only the child links indicated in its parent-child relationship to be the outgoing interfaces. After the multicast forwarding entry is installed, packets are forwarded to the outgoing interfaces. When a multicast packet reaches a leaf node that does not have any local member, a prune message is sent upstream to start a prune timer in the upstream dense mode agent for the particular outgoing interface. Within this prune timeout period, multicast packets will not be forwarded on this outgoing interface. When a member joins a group and there exists a multicast forwarding entry for the group but no local or downstream members, a graft message is sent upstream to cancel a prune timer (if there’s any) so multicast packets can be received at this member. Similarly, when a member leaves a group and there are no local or downstream members, a prune message is sent upstream to prune off this branch. 
In this original dense mode multicast routing, we take into account the memory and CPU time required to maintain timers, build control message handlers, and transmit control messages.
Figure 1. A Comparison of Detailed Dense Mode Multicast and Centralized Multicast
Centralized Multicast
The multicast routing details described above were helpful in developing these protocols and validating the simulator, but they are often unnecessarily specific. Our centralized multicast abstraction eliminates much of this message exchange. The centralized multicast computation agent (Figure 1, right) keeps track of the source and member lists for a group. Therefore, when a member joins a group, a multicast tree branch is installed toward all sources for the group until it merges with the original tree. Similarly, when a member leaves a group, a multicast tree branch is pruned until it reaches the original tree. When a source starts to send packets to a group, the entire multicast tree is installed according to the member list of the group. There are no prune timers associated with outgoing interfaces, and whenever there is a topology change, all the multicast trees are re-installed. The periodic broadcast and prune, the parent-child relationship maintenance, and the message exchange used in dense mode multicast are omitted in centralized multicast. We look at the effect of this omission in Section 4.1 and its effect on simulations of SRM in Section 5.1.
3.2.2. End-to-End Packet Delivery
Centralized route computation allows us to scale to 100s of nodes, but at 1000s of nodes the overhead of general-purpose per-node and per-link data structures and hop-by-hop packet transmission becomes very large. To reduce the node and link structures, we developed end-to-end packet delivery, our second abstraction technique. End-to-end packet distribution can be used to abstract hop-by-hop transmission (i.e., traffic flows and multicast sessions) for transport-layer protocols. Instead of sending packets through a series of queues and links, end-to-end packet distribution computes delay and loss characteristics and directly schedules receive events at the group members according to these characteristics. Consequently, nodes and links become lightweight data structures. Applying this technique to multicast sessions, end-to-end multicast distribution sets up a direct connection between a source and its receivers, computes loss, delay, and TTL characteristics for each source and receiver pair, and schedules packet delivery events accordingly. Details of the abstract multicast distribution are presented in the following section. The same end-to-end packet distribution can be applied to unicast hop-by-hop transmission for transport protocol studies such as TCP. The end-to-end packet distribution technique significantly improves simulation performance with increased topology size, but with observable differences in packet end-to-end delay when the network is congested, as described in the next section.
Figure 2. ns-2 Implementation of a Multicast Node
Original Multicast Packet Delivery
Multicast packets are forwarded through a series of objects (Figure 2) that are linked to mimic the detailed packet forwarding machinery in a real network.
When packets enter a node, an entry classifier decides which classifier to go to next, depending on the destination address in the packet header. If the address is a unicast/multicast address (mask and mask length), packets are forwarded to a unicast/multicast classifier, which maintains a hash table of (source, destination, outgoing target, and, for multicast, incoming interface) tuples. These tuples are installed by the unicast/multicast routing protocol. The classifiers then hash on the source, destination, or incoming interface (if multicast) to decide which outgoing target to forward the packets to. In unicast classifiers, outgoing targets are typically the network interfaces onto which the packets are supposed to be forwarded. In multicast classifiers, however, outgoing targets are replicators, which copy the packets and forward them onto several outgoing network interfaces. Packet forwarding within a node is depicted in Figure 2. Each network interface is connected to a queue object, which holds packets inside the queue if the link is occupied. When the link is clear, packets are forwarded onto a transmission and propagation delay module, which delays the packets by the propagation delay and transmission delay given in the simulation configuration. Finally, the packets reach a TTL checker, which decrements the TTL field in the packet header and passes the packets to the receiving network interface. The receiving network interface labels the packet header and forwards the packets to the entry classifier of the receiving node. We see this machinery on the left side of Figure 3. In detailed packet delivery, we take into account the memory and CPU time required to construct this classification and forwarding machinery in detailed nodes and links, as well as the CPU time consumed by hop-by-hop processing.
Figure 3. A Comparison of Detailed Packet Distribution and Session Multicast
Figure 4. An Illustration of Causality Problem
Session Multicast
Another area with possibly unnecessary detail is hop-by-hop transmission. Our second abstraction, end-to-end packet distribution, eliminates this overhead. End-to-end multicast packet distribution avoids the set-up and maintenance of multicast forwarding entries altogether. Instead, a source and its members are directly connected with appropriate delay and loss characteristics. In this abstract version, accumulated propagation delay and bandwidth are calculated between a source and each of its members. Packets from the source are automatically duplicated for all members. At each member, the packets are scheduled to be received after the accumulated propagation delay and transmission delay (i.e., packet size divided by accumulated bandwidth).
Accumulated Propagation Delay(source, member) = Σ_{i: links on path source-member} Propagation Delay(i)
Accumulated Bandwidth(source, member) = ( Σ_{i: links on path source-member} Bandwidth(i)^-1 )^-1
Two details must be handled: preventing a causality problem and inserting the appropriate loss dependency for session multicast.
1. Without passing the packets through queuing objects, this delay calculation incurs a causality problem: a small packet may arrive at the receivers before a previous, large packet. In Figure 4, packet P1 is sent before P2, but P2 is so small that its transmission finishes before P1. The solution is to remember the arrival time of the previous packet and take the maximum of the calculated arrival time and the previous packet's arrival time plus the current transmission time (a short code sketch follows this list).
Arrival Time(P_i) = max{ Now + Propagation + Transmission(P_i), Arrival Time(P_{i-1}) + Transmission(P_i) }
2. Users may introduce loss into a simulation (through the use of loss modules in ns-2) to study protocol behavior. Packet loss in the detailed simulation must be reflected in the abstraction. In particular, losses on a link will correlate with losses at any recipients downstream of that link. We preserve this dependency in our abstract simulation. For example, the left diagram in Figure 5 presents a multicast tree with source n0 and members n1-n6, and an error module is inserted in each of the links n0-n1, n1-n3, and n2-n6. The right diagram in Figure 5 shows the equivalent error dependency retained for the multicast tree, where n6 is dependent on e2, n3 is dependent on e1, and n1, n4, and e1 are dependent on e0.
Figure 5. An Example of Error Dependency in Session Multicast
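The delay computation and the causality fix of item 1 can be captured in a few lines. The following Python sketch only illustrates the formulas above; the class and attribute names are hypothetical and do not correspond to the ns-2 session helper implementation.

```python
# Illustrative sketch (not ns-2 code): end-to-end packet delivery for one
# source-member pair, collapsing the links on the path into two values.

class SessionPath:
    def __init__(self, links):
        # links: list of (propagation_delay_sec, bandwidth_bits_per_sec)
        self.prop_delay = sum(d for d, _ in links)
        # Accumulated bandwidth is (sum of 1/b_i)^-1, so the total transmission
        # time equals the sum of the per-link transmission times.
        self.bandwidth = 1.0 / sum(1.0 / b for _, b in links)
        self.last_arrival = 0.0          # arrival time of the previous packet

    def schedule_arrival(self, now, packet_size_bits):
        transmission = packet_size_bits / self.bandwidth
        # Causality fix: a later, smaller packet must not overtake an earlier one.
        arrival = max(now + self.prop_delay + transmission,
                      self.last_arrival + transmission)
        self.last_arrival = arrival
        return arrival

# Example: two 1.5 Mb/s links with 10 ms propagation delay each.
path = SessionPath([(0.010, 1.5e6), (0.010, 1.5e6)])
print(path.schedule_arrival(now=0.0, packet_size_bits=8000))    # large packet
print(path.schedule_arrival(now=0.001, packet_size_bits=800))   # small packet, held back
```

In the example, the second (small) packet would otherwise arrive first; the max() term delays it until just after the large packet, exactly as the arrival-time formula requires.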
In end-to-end multicast packet distribution, all the replication, delay, loss, and forwarding machinery is combined into one compact 'session helper' agent (Figure 3, right). As a result, the original link structure (a sequence of interface, loss, queuing, delay, and TTL objects) is reduced to delay and bandwidth values, and the original node structure, a combination of classifiers, is reduced to a node id. We also add the error dependency trees. Our simple form of end-to-end multicast packet distribution ignores queuing delay and thus is not suitable for studies concerning queuing delay. The effect of abstract packet distribution on multicast packet transmission is examined in Section 4.1.
3.2.3. Algorithmic Routing
A bottleneck of simulating very large topologies is the routing tables. The routing table cost is usually O(N^2) for flat routing (i.e., 1-level routing). The implication of O(N^2) is that any scaling technique that improves performance only linearly will not cope with the growth of topology sizes. For example, with 4 bytes per routing table entry, 4000 nodes require 64 MB of memory. Hierarchical routing is better if the hierarchy tree is balanced; the cost can be reduced to the order of O(NlogN). If the routing table cost can be reduced a full order, from O(N^2) to O(N), we might be able to simulate an actual Internet topology someday; this would not be possible if the routing information cost remained O(N^2). Algorithmic Routing is designed exactly for this purpose: to reduce the memory consumed by routing information from O(N^2) to O(N). This routing method for network simulations is composed of two parts, lookup and mapping. The insight that makes algorithmic routing possible is that by carefully naming nodes we can easily compute next hops. If we look at a binary tree topology with node identifiers assigned in a regular fashion, say node k is the parent of nodes 2k+1 and 2k+2, the next hop node can be acquired through a simple algorithm. By doing this kind of algorithmic lookup, we do not need to maintain an O(N^2) routing table. This approach generalizes to k-ary trees.
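The lookup itself fits in a few lines. The sketch below follows the walk-up procedure of Figure 6, which is elaborated in the next paragraphs; it assumes nodes are numbered so that node m has children k*m+1 through k*m+k and the root is node 0 (k = 2 gives the binary tree of Figure 6), and it is an illustration of the technique rather than the ns-2 implementation.

```python
# A minimal sketch of the algorithmic next-hop lookup on a k-ary tree.

def next_hop(src, dst, k=2):
    """Next hop from src toward dst, computed with no routing table."""
    if src == dst:
        return dst
    node, prev = dst, dst
    while node != src and node != 0:
        prev = node
        node = (node - 1) // k          # walk up one level toward the root
    if node == src:
        return prev                     # the child of src on the path down to dst
    return (src - 1) // k               # dst is not below src: forward to src's parent

# The examples given in Figure 6:
assert next_hop(10, 44) == 21
assert next_hop(1, 45) == 4
assert next_hop(5, 43) == 2
```

Because the walk climbs one tree level per step, the lookup takes O(log N) time on a reasonably balanced tree while storing nothing per destination.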
This is the first part of algorithmic routing, lookup. However, not all topologies are k-ary trees. We need to convert an arbitrary topology into a tree topology and re-assign the node addresses so that they are labeled in a regular fashion and can be identified with a very simple formula. This is the second part of algorithmic routing, mapping. There exist several tree search algorithms, for instance Breadth First Search and Depth First Search, that can map an arbitrary topology into a tree. Different tree search algorithms may result in different tree topologies and thus different routes in the actual simulations. Within the scope of this thesis, we experimentally choose Breadth First Search because it creates more balanced trees than Depth First Search. We leave the discussion of which tree search algorithm to use, and what impact the choice might have on simulations, to the future work sections.
Figure 6. Algorithmic Lookup
Figure 7. Breadth First Search Tree Mapping
Original Unicast
In ns-2, the simplest form of unicast routing centrally computes the shortest routes using the Dijkstra algorithm [80]. Dijkstra routing generates a routing table in which each node has the next hop and cost information for every other node. The space complexity of Dijkstra unicast routing is O(N^2). In dynamic distance vector (DV) routing, each router exchanges distance vectors (updated route reports) and eventually obtains the next hop and route cost information. The total space complexity is the memory requirement for message exchange plus the routing table: O(kN) + O(N^2), where k is the connectivity of the topology and k ≤ N. In this formula, O(N^2) is the dominant factor. Our observations also confirm this analytical finding: memory consumption grows quadratically with the number of nodes in the topology. The O(N^2) factor will not scale and is the first obstacle to simulating networks the size of the Internet.
Algorithmic Unicast
We implement the concept of algorithmic routing in ns-2. The details can be described in three parts. The setup part is Breadth First Search (BFS) mapping. The current implementation picks the lowest-address node as the root of the BFS tree. The BFS algorithm traverses all the immediately connected nodes, and then, recursively starting from each of these nodes, traverses their immediately connected nodes until all the nodes are traversed. While the original topology is converted into such a tree, its rank (the maximal number of children per node) is recorded and used as the value of k for the k-ary tree. Subsequently, the node addresses are re-assigned to form a k-ary tree with possibly some empty leaves. See Figure 7 for a simple example. The run-time part is the algorithmic lookup. Figure 6 shows a binary tree and the algorithm to look up the next hop from any node A to any node B in the binary tree. Following the algorithm (Figure 6, right), we are able to acquire the next hop information in a binary tree. Basically, we continuously walk up the tree from B by calculating the value of (B-1)/2. In the meantime, we keep track of the last node we visited.
If we come across A while walking up the tree, we return the last node visited as the next hop from A to B. If we never come across A while walking up all the way to the root, we return (A-1)/2 as the next hop from A to B. This simple algorithm does not require maintaining a routing table, and its average complexity is O(logN). Finding the next hop from A to B in a k-ary tree is as easy as in a binary tree, except that we walk up the tree by calculating (B-1)/k, and in case we do not come across A while walking up, we return (A-1)/k. Finally, next hop lookup from node A_orig to node B_orig is elaborated into a three-step process. First, we find the mapped ids of nodes A_orig and B_orig in the k-ary tree, say A_tree and B_tree. Second, we find the next hop node from A_tree to B_tree, say C_tree. Third, we find the mapped id of C_tree from the k-ary tree back to the original topology, C_orig. Through the entire process, we spend only O(N) memory space to maintain the mapping between the original topology and its mapped k-ary tree. However, if the simulation topology is cyclic, i.e., there exist multiple paths between some pair of nodes A and B, the mapped k-ary tree will leave out some of the links in the original topology, resulting in sub-optimal routes. Sometimes, algorithmic routing may distort the simulation traffic so fundamentally that the results will not be valid for the research question that one would like to answer. In particular, mapping any topology into a tree also means that every node now has only one path to every other node. Thus, there will be a higher degree of traffic concentration and sometimes unintended network congestion. For example, for the study of congestion control protocols, simulations using algorithmic routing could result in designs of overly sensitive back-off mechanisms. However, for special cases with only one sender, we can start the BFS tree search with the sender as the root, and no sub-optimal routes will be introduced. Section 4.2 demonstrates the efficiency and accuracy of algorithmic routing.
3.2.4. Finite State Automata Modeling
Many transport layer protocols contain flow control mechanisms. These flow control mechanisms usually decide how fast packets/data are injected into the network. The dynamics of these control mechanisms can be characterized in many ways, depending on how much detail needs to be preserved. In the context of the current TCP/IP network, the closed-loop flow controls usually adjust the transmission rate once per round trip time; some operate at a finer granularity (e.g., RAP [111]). In the context of network simulation, the round trip time can be estimated through simple calculations. Thus, instead of keeping all the round trip time estimation details, as the real implementation must, we can replace them with a pre-calculated round trip time. It is also important to update the round trip time information as the simulated network becomes congested. Furthermore, the dynamics of sending rates per round trip time (or per rate adaptation interval) can be characterized [118]. One can fairly simply form finite state machines to model these flow control mechanisms at a slightly coarser granularity, while preserving the one most important property shared by almost every congestion control mechanism: adjusting rates according to congestion signals (packet drops or explicit congestion indication) from the network. The importance of preserving rate adaptation dynamics has been raised in recent studies on network measurement and traffic characterization [99].
The network as a whole is a dynamic system: congestion-controlled sources inject data into the network, which may cause the network to congest. When the network cannot digest all the data, it returns congestion signals to the sources, and the congestion-controlled sources react to these signals. In short, how a source releases data into the network will eventually affect its own behavior in the future, and this effect depends heavily on the specifics of the network. Thus, any model generated by curve-fitting certain traffic measurements will not be valid when applied to a different network, which is often the case in simulations because of limited knowledge about the measured networks.
Original TCP
The detailed TCP implementation in ns-2 maintains a great deal of state. Some of it is required to handle slow start's exponential increase in sending rate, some to handle the congestion avoidance phase's linear increase in sending rate, some to estimate the round trip time and the corresponding timeout interval, and some to support features such as Explicit Congestion Notification (ECN). When all of this state adds up, it becomes a bottleneck to create a large number of TCP connections.
Figure 8. TCP Sending a Batch of Packets Per Round Trip Time
FSA TCP
Our first application of FSA modeling is TCP. First, TCP possesses the property that its flow control adjusts the sending rate once per round trip time; this observation captures much of the dynamics of TCP behavior [118]. Figure 8 shows a typical TCP connection in slow start. Secondly, according to many studies of network traffic characteristics, the distribution of TCP file sizes is heavy-tailed with a small mean on the order of 8-18 kilobytes. Assuming that each TCP packet is 1 kilobyte, most TCP connections (90%+) will be 8-18 packets long. An elaborated finite state machine that models TCP connections of up to a few tens of packets is therefore sufficient to cover a high percentage of TCP connections. Long TCP connections, which are relatively rare, can be simulated in the original detailed mode (another application of hybrid simulation). The numbers of packets being sent are determined through a set of systematic stress tests. At this point, we manually extract the relevant information from the results of the stress tests; we plan to automate the process of generating FSA models, including stress test generation and relevant information extraction, in the future.
We have created four flavors of FSA TCP, corresponding to TCP Reno, Tahoe, Reno with delayed acknowledgement, and Tahoe with delayed acknowledgement (Figures 9-12). These FSA TCPs are only partial: they model TCP connections with at most one packet loss for the duration of the connection. Even so, a significant fraction of TCP connections can still be covered, and the simulation efficiency of this abstraction depends on the number of TCP connections that are FSA TCP applicable. In Figures 9-12, the number within a circle denotes the number of packets to be transmitted in that period. The bold lines represent timeouts and the thin lines represent round trip times. Each line with a packet sequence number attached denotes that the packet with that sequence number is dropped in that transition (or batch); each line with no number attached denotes no packet loss in that transition. Each two-element tuple shows the congestion window size (wnd) and the slow start threshold (ssh) at those transition states. Each FSA TCP connection starts from the lower left state with number 1. If no packet is lost in the batch, it transits to the state above with number 2 after one round trip time. If no packets are lost for the entire connection, the FSA TCP connection follows the solid lines all the way up to the point where a sufficient number of packets have been sent, as specified by the connection size. However, if a packet is dropped in a batch, the FSA TCP connection follows the line numbered X, where the lost packet is the Xth in sequence number. From the two-element tuples, we see that Reno and Tahoe TCP adjust their slow start threshold in the same fashion, whereas the two flavors decrease the congestion window size in slightly different manners. When a packet retransmission occurs due to duplicate acknowledgements, Reno TCP reduces its congestion window size to half of the current window size; Tahoe TCP always reduces its congestion window size to one.
Figure 9. FSA TCP: Reno and Regular Acknowledgement
Figure 10. FSA TCP: Tahoe and Regular Acknowledgement
Figure 11. FSA TCP: Reno and Delayed Acknowledgement
Figure 12. FSA TCP: Tahoe and Delayed Acknowledgement
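To make the idea concrete, the following Python sketch models only the loss-free path of a short connection with regular acknowledgements, where the per-round-trip batch sizes follow the slow-start schedule shown in Figures 9-12 (1, 2, 4, 8, 16, ...). The loss and timeout transitions of the actual automata are omitted, and the names are illustrative rather than the ns-2 FSA TCP code; the point is that each flow stores little more than an index into a shared schedule.

```python
# A simplified sketch of FSA TCP for short transfers, loss-free path only.

# Shared, precomputed slow-start schedule: packets sent per round trip.
BATCH_SCHEDULE = [1, 2, 4, 8, 16, 32]

class FsaTcpFlow:
    def __init__(self, flow_size_pkts, rtt):
        self.remaining = flow_size_pkts
        self.rtt = rtt
        self.state = 0                   # pointer into the shared schedule

    def next_batch(self):
        """Packets released in the next round trip, or 0 when the flow is done."""
        if self.remaining <= 0:
            return 0
        batch = min(BATCH_SCHEDULE[self.state], self.remaining)
        self.remaining -= batch
        if self.state < len(BATCH_SCHEDULE) - 1:
            self.state += 1
        return batch

    def completion_time(self):
        """Rough transfer duration: one round trip per non-empty batch."""
        rounds = 0
        while self.next_batch():
            rounds += 1
        return rounds * self.rtt

# A 12-packet transfer (typical of the 8-18 KB flows mentioned above) finishes
# in 4 round trips, sending batches of 1, 2, 4, and 5 packets.
print(FsaTcpFlow(flow_size_pkts=12, rtt=0.1).completion_time())
```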
Figure 13. Illustration of a Mixed Mode Multicast Session
3.3. Hybrid Simulation
Sometimes, one abstraction model is not applicable to the entire simulation (Section 3.5, Selection Guidelines), but is partially applicable. The memory and runtime improvements from partially applying the abstract models can help extend the explorable problem space. Hybrid simulation is a general concept for mixing detailed and abstract models within the same simulation, and it can be applied in many forms to various aspects of network simulations. We previously discussed earlier work in this area (Section 2.2.4, Hybrid Simulation). To illustrate the usefulness of hybrid methods in network simulations, we apply this concept to detailed packet forwarding and its abstract form, end-to-end packet delivery (Section 3.2.2). We call this particular form of hybrid simulation mixed mode. Mixed mode allows users to specify detailed and abstract regions that forward packets in the corresponding modes.
Unicast and multicast packets are transmitted through interleaved detailed and abstract regions. Thus, while we gain a performance improvement by passing packets through session regions, we preserve accuracy by including the effects of congestion in the detailed regions. As described in Section 3.2.2, session multicast ignores queuing delay, which significantly affects protocol studies that require testing scenarios with network congestion. Mixed mode can avoid this limitation, so we next describe an application of the mixed mode technique to multicast.
Mixed Mode Multicast
Unlike the four abstraction techniques described in Section 3.2, mixed mode is used to minimize the distortion that end-to-end packet delivery creates, as well as to provide flexible adjustment depending on the available computation resources and the desired accuracy. End-to-end packet delivery computes the delay between a source-receiver pair according to link and packet attributes such as bandwidth, propagation delay, and packet size; it entirely ignores any queuing delay that might occur. With the mixed mode technique, users can specify certain links and nodes to operate in detailed mode (i.e., detailed packet forwarding in nodes and detailed queuing and transmission in links). Hence, packet delays incurred by queuing and detailed processing can be taken into account when simulating a congested network. In particular, we have applied the mixed mode technique to detailed and session multicast transmission. The details are described below with an illustration, Figure 13. Initially, users define the detailed and abstract regions by specifying which nodes and links operate in detailed mode with a selected multicast protocol; the remainder operates in session (abstract) mode by default. A group join command operates differently depending on where the join occurs. If the join occurs in session mode, the session-join operation is performed until the join reaches the group source or the entry point of a detailed mode region (see the proxy source in Figure 13). If the join occurs in detailed mode, the detailed join operation is performed (i.e., hop-by-hop join message propagation) until the join reaches the group source or the entry point of a session mode region (see the proxy receiver in Figure 13).
3.4. Optimizations
Optimization techniques provide faster and less memory-consuming implementations that retain exactly the same results as the original simulations, as opposed to abstractions, which may introduce distortions. We emphasize devising optimization techniques that avoid allocating memory for duplicate information. The first optimization technique, packet reference count, is used to avoid physically duplicating packets for multicast or broadcast communication. The second technique, virtual classifier, is used to avoid installing forwarding caches in ns-2's detailed nodes. There are no distortions caused by these two optimization techniques. However, packet reference count is only applicable to duplicate packets that remain identical until their transmissions are completed.
Figure 14. Illustration of Packet Reference Count
3.4.1. Packet Reference Count
One simulation bottleneck is the number of packets in transit. Especially in multicast sessions, at each branching point exactly the same packets have to be duplicated and forwarded down the multicast tree. Rather than physically allocating memory for duplicate packets, we can logically duplicate packets if we keep track of how many logical packets a physical packet represents. In this way, we avoid allocating the memory that the physical duplicate packets would have used. To keep track of the number of logical duplicates, we attach a reference count to each physical packet. When a packet is replicated (i.e., a logical duplicate is created), the reference count of the associated physical packet increases, and vice versa. When the reference count reaches 0, the physical packet is freed. We implement the packet reference count technique for multicast transmission; the details are described in the next section. The same approach can be used in other cases where packets are often duplicated (e.g., LAN simulations). The packet reference count technique significantly improves simulation performance with increased topology size, source rate, and number of groups, but the reference count mechanism may not retain information that changes during the transmission (e.g., TTL or checksum). However, we can often add modules that compute or convey these changes to the end points; for example, we have implemented an end-to-end TTL checker.
Reference Counting Multicast Packets
Packet reference counting can reduce the overhead of duplicate multicast packets (Figure 14). Each multicast packet is assigned a reference count. At a branching point of a multicast tree, the reference count increases instead of memory being allocated for a duplicate. When the packet is dropped or received by a group member, its reference count decreases. If the reference count decreases to 0, the memory allocated to the packet is freed for later use. We quantify the time and memory conservation from the reference count technique in Section 4.5.
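The mechanism just described amounts to a small amount of bookkeeping per physical packet. The sketch below illustrates it in Python; the class and method names are hypothetical and do not correspond to the ns-2 packet implementation.

```python
# A sketch of packet reference counting for multicast branching points.

class Packet:
    """One physical packet shared by many logical copies in a multicast tree."""

    def __init__(self, payload_bytes):
        self.payload_bytes = payload_bytes
        self.ref_count = 1               # the original logical packet

    def replicate(self):
        # A branching point creates a logical copy: no new allocation,
        # just increase the reference count and reuse the same physical packet.
        self.ref_count += 1
        return self

    def release(self):
        # Called when a logical copy is received by a member or dropped.
        self.ref_count -= 1
        if self.ref_count == 0:
            self.payload_bytes = None    # the physical packet can now be freed

# A branching point feeding five downstream members shares one physical packet,
# matching the ref_count = 5 situation depicted in Figure 14.
pkt = Packet(payload_bytes=1000)
copies = [pkt.replicate() for _ in range(4)] + [pkt]
assert pkt.ref_count == 5
for c in copies:
    c.release()
assert pkt.ref_count == 0
```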
3.4.2. Virtual Classifier
The virtual classifier is more an optimization technique than an abstraction technique. In detailed simulations, each node keeps a classification table that takes care of packet forwarding. Just as in the real world, each router has a classification table in its kernel for fast packet forwarding, in addition to the routing table stored in the user domain. In a sense, there are two essentially identical copies of the routing information. Users with limited memory resources are sometimes willing to trade runtime for memory in larger simulation scenarios. The virtual classifier is therefore used to replace the low-level fast forwarding table with a pointer to the high-level routing table maintained by the routing protocol modules. However, simulation runtime will increase because the checks of the fast forwarding table are now replaced by upcalls that look into the routing table. This concept can be applied to both unicast and multicast routing implementations.
Figure 15. Comparison of Original Classifier (Left) and Virtual Unicast Classifier (Right)
Virtual Unicast Classifier
To illustrate how runtime and memory trade against each other when using a virtual classifier, we applied this concept to the ns-2 unicast routing implementation. In ns-2, the unicast routing modules perform the necessary (protocol-specific) actions to obtain the user domain routing table and update the forwarding caches (classifiers) at the kernel level (see the left side of Figure 15). Network dynamics are detected by the routing modules and the routing table is updated accordingly.
When packets arrive at nodes, they will be passed through the forwarding caches (classifiers) and continued on to the next hops, without going through the routing tables in the high-level routing modules. This architecture results in memory consumption for two identical copies of routing information. The virtual unicast classifier replaces the low-level forwarding caches (classifiers) with a pointer to the user-level routing table. See Figure 15 right plot. When O(N 2 ) Routing Table Node Address Classifier ... O(N 2 ) Routing Table Node O(1 ) Virtual Classifier O(N 2 ) 67 packets arrive at nodes, the virtual classifiers query the next hop information from the user-level routing table. By that, this optimization technique eliminates memory consumption for the low- level forwarding caches, O(N 2 ). 3.5. Selection Guidelines If it is fast and ugly, they will use it and curse you; if it is slow, they will not use it. -- David Cheriton [2], pp. 519 Our experience in preparing and designing the abstraction techniques can be generalized into the following guidelines. They will help users conduct their own abstract simulations: 1. Define scaling factors (varying input) and measurement metrics (wanted output) 2. Start from small-scale detailed simulations 3. Profile simulations to find bottleneck 4. Adapt techniques to avoid simulation bottleneck 5. Verify simulations in small-scale in detailed mode 6. If confident with one of the following conditions, proceed with larger-scale simulations using the improved version • The defined measurement metrics are not affected by the distortion at all • Distortion of the selected abstraction will not be enlarged when increasing the defined scaling factors If the improvement is not sufficient, repeat step 3 to step 6. In addition to the above guidelines, we provide memory and CPU time monitoring functions in ns-2 for identifying performance bottlenecks (step 3). Subsequently, users can select effective abstraction techniques (described in Section 3.2 to 3.4) to eliminate the bottleneck (step 4). Moreover, we use API’s that allow nearly identical detailed and abstract simulation configurations. Users can conveniently compare results side-by-side to validate the use of abstraction techniques in small scale (step 5). To proceed with large-scale simulations using abstractions, users will have to determine whether certain distortions matter to measuring metrics or scaling factors enlarge the degrees of distortion (step 6). This step remains highly dependent on the researcher’s individual expertise. In the future, we hope to gather networking expertise, 68 categorize Internet research problems, define scaling factors and interesting measurement metrics, and finally correlate these scaling factors and measurement metrics with various network characteristics that abstraction techniques might distort. This attempt to formalize/systemize the expertise-dependent step will increase the confidence of using abstractions in a wider base. Figure 16. Selection Guidelines Assisting the Entire Experiment Validation Process Figure 16 depicts a general experiment validation process. Small-scale testbed experiments and simulations are relatively easier to conduct. We can compare and contrast the experiment, detailed simulation, and abstract simulation results side by side. However, large-scale experiments and detailed simulations are difficult due to resource constraints. 
Using our selection guidelines, we can confidently conduct large-scale abstract simulations and project the results of large-scale detailed simulations and, perhaps, large-scale experiments. Unlike the scaling techniques described in Sections 3.2-3.4, our selection guidelines are more general and might be applicable to non-network simulations.

Chapter 4 Systematic Simulation Comparison

We evaluate our implementations of the scaling techniques described in Chapter 3. We apply each scaling technique to various multicast or unicast simulation components in ns-2. These scaling technique applications are evaluated for simulation efficiency and accuracy. For simulation efficiency, we measure both memory and runtime performance. For accuracy, we quantify the distortion aspects pointed out in Chapter 3. In each section, we detail our experiment design and the statistics involved in generating the figures. The evaluations of centralized and session multicast simulations are combined into one section because they are compared using the same sets of simulation configurations. Following this combined section, we examine the other scaling technique applications in this order: algorithmic routing, FSA TCP, mixed mode, packet reference count, and virtual classifier.

4.1. Centralized Multicast & Session Multicast

We compare original dense mode multicast, centralized multicast, and session multicast, finding that the abstraction techniques substantially improve the scaling properties of the simulations. Dense mode (Section 3.2.1) with detailed packet distribution (Section 3.2.2) scales the worst, centralized multicast (Section 3.2.1) with detailed packet distribution scales better, and session multicast (Section 3.2.2) scales the best. Abstract simulations have small differences (distortions) from detailed simulations. We illustrate some of the simulation-relevant differences between detailed dense mode and centralized multicast simulations, and between detailed packet delivery and session multicast simulations.

Figure 17. 100-node random transit-stub topology

Experiment Design

The numbers of nodes, members, and groups are the three major factors that determine the scale of multicast simulations. Thus, we design the following three sets of simulations to explore each of the three scaling dimensions from a standard case: 300 nodes (average degree 1.77), 30 groups, and 30 members. Memory and runtime are measured to compare the performance of dense mode, centralized, and session multicast.

Nodes: 30 groups, 30 members, and 100-500 nodes (average degree 1.77, transit-stub)
Members: 300 nodes, 30 groups, and 10 to 50 members
Groups: 300 nodes, 30 members, and 10 to 50 groups

In the set of simulations with increasing numbers of nodes, we use fixed numbers of groups and members. 30 group sources are randomly selected, and for each group 30 members are randomly selected as well; this random selection is sketched in the code example below. Several member agents may co-exist in the same node.
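A minimal sketch of this random placement, seeded so that the same configuration can be rerun under dense mode, centralized, and session multicast, might look as follows (hypothetical helper, not our actual simulation scripts):

```cpp
#include <random>
#include <vector>

struct MulticastScenario {
    std::vector<int> sources;               // one source node per group
    std::vector<std::vector<int>> members;  // per-group member nodes
};

// Randomly place G group sources and R members per group on an
// N-node topology. Members are drawn independently, so several
// member agents may end up on the same node, as in our setup.
MulticastScenario make_scenario(int num_nodes, int num_groups,
                                int members_per_group, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pick(0, num_nodes - 1);

    MulticastScenario s;
    for (int g = 0; g < num_groups; ++g) {
        s.sources.push_back(pick(rng));
        std::vector<int> grp;
        for (int r = 0; r < members_per_group; ++r)
            grp.push_back(pick(rng));       // duplicates allowed
        s.members.push_back(grp);
    }
    return s;
}

// Example: the standard case of 300 nodes, 30 groups, 30 members.
// MulticastScenario base = make_scenario(300, 30, 30, /*seed=*/1);
```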
In addition, the number of links increases proportionally to the number of nodes to maintain a constant degree of connectivity 1.77, and, using GT-ITM (Georgia Tech - Internetwork Topology Model) [48], we are able to create random transit-stub topologies that are closer to the real world network structure, shown in Figure 17. Typically, stub domains communicate with other stub domains through a few transit domain nodes. In the second set of simulations, the number of nodes and groups are fixed to 300 and 30, and, for each of these 30 groups, 10 to 50 members are selected randomly. Similarly, the number of nodes and members are fixed in the third set of simulations, and 10 to 50 group sources are randomly selected to send packets to 30 randomly selected members in a 300 node topology. All simulations are conducted on a 128 MB RAM (2 GB virtual memory), 200 MHz Pentium PC, running FreeBSD 2.2. 72 For each configuration, (N = number of nodes, G = number of groups, R = number of receivers per group), we run 10 instances of the simulation with different random seeds. Figure 18 and Figure 19 show the mean and 95% confidence range of memory and time consumption. Memory Usage Figure 18. Comparison of Memory Usage In the first set of simulations (Figure 18, left plot), the periodic flood and prune in dense mode multicast has a dramatic effect on the memory consumption (30 topology-wide floods and prunes per 0.5 second period). Especially when the number of nodes increases, the area for flood and prune increases as well, contributing to the high memory usage for dense mode multicast. Centralized multicast replaces flood-and-prune messages, eliminating this source of overhead (compare memory usage between dense mode and centralized in all three plots in Figure 18). However, centralized multicast still experiences a significant growth in memory consumption with the growth of topology size because of ns’s detailed representation of links and nodes (Figure 18, left plot). By abstracting this representation, we get both better absolute memory usage (compare the difference between session and the other simulations in all three plots in Memory Cons umption - Node 0 20 40 60 80 100 120 140 160 180 100 200 294 406 512 Number of Nodes Memory in MB Memory Cons umption - Group 0 20 40 60 80 100 120 10 20 30 40 50 Number of Groups Memory in MB Memory Cons umption - Member 0 20 40 60 80 100 10 20 30 40 50 Number of Members Memory in MB Dense Mode Centralized Session Dense Mode Centralized Session Dense Mode Centralized Session 73 Figure 18) and incrementally less cost as scale increases (compare the slopes of session to the others as number of nodes or groups change in left two plots in Figure 18). An exception to reduced incremental costs is shown as we increase the number of members (Figure 18, right plot). All three dense mode, centralized, and session multicast simulations grow at similar rates here (although the abstractions have far less absolute memory usage). This is due to the random selection of the members. If the member distribution is diverse (fairly random), then increased number of members does not necessarily imply the multicast tree spans a wider range. In another words, the area of flood and prune in dense mode does not necessarily grow or shrink. On average, the flood and prune area should stay constant. 
However, we expect centralized multicast to consume slightly more memory than dense mode when all or most nodes are members, because the flood and prune overhead in dense mode is reduced and centralized multicast carries extra global states required for route computation. The slight increases in all three versions of multicast simulations are the results of additional states for new members. In summary, we have shown that abstraction can save substantial amounts of memory, allowing larger simulations to be conducted. The improvement from detailed dense mode multicast to session multicast ranges from a factor of 3 to 16. The benefits of these approaches vary depending on what dimensions of scale is being pursued. We see large absolute and incremental improvements when numbers of nodes and multicast groups increase, but only absolute improvements when the number of members increases. Time Usage Dense mode scales poorly in time usage (Figure 19) due to the flood and prune over the entire topology which causes an enormous amount of packet event scheduling (proportional to the 74 Figure 19. Comparison of Time Usage number of nodes and groups). Centralized and session multicast scale so much better because they do not have the overhead of topology-wide flood and prune. Session multicast scales better than centralized multicast because session multicast further eliminates the overhead of hop-by- hop packet processing. Distortions The performance improvement described above is possible because abstractions remove some protocol or network related details. Users of centralized and session multicast must examine carefully if these details affect the validness of users’ simulation studies. In this section we characterize the distortions centralized and session multicast simulations introduce. In Section 5.1 we examine the effects these distortions may have on the study of reliable multicast (SRM). Dense Mode Multicast vs. Centralized Multicast The centralized multicast abstraction replaces routing message exchange with a central computation. Although the end results of the algorithms are identical (since both implement the Time Consumption - Node 0 3000 6000 9000 12000 15000 18000 21000 24000 27000 100 200 294 406 512 Number of Nodes User Time in Second Time Consumption - Group 0 3000 6000 9000 12000 10 20 30 40 50 Number of Groups User Time in Second Time Consumption - Member 0 3000 6000 9000 10 20 30 40 50 Number of Members User Time in Second Dense Mode Centralized Session Dense Mode Centralized Session Dense Mode Centralized 75 same reverse-shortest-path-tree algorithm), transient behavior can be different. In a detailed dense mode simulation messages propagate topology change information through the tree at link speeds, while with centralized abstraction topology changes become global known the instant they occur. To demonstrate the difference between the protocols, we simulate various group events (join, leave, link down, and link up) on a simple topology just to show the difference in the two multicast implementations’ behavior (Figure 20, right). During this simulation, we periodically count the numbers of data packets on the entire topology and show the collected data on the following figure, we referred to as the behavior chart. This behavior chart allows us to observe the protocol dynamics, traffic overhead as well as the event convergence latency, all in one 2-D graph. 
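The data behind such a behavior chart can be collected with a simple periodic sampler. The sketch below is a schematic stand-in for the monitoring hook, with hypothetical names and placeholder counts; the real counts come from the simulator core.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

// Collects (time, packets-in-flight) samples at periodic timer events;
// plotting the samples produces the behavior chart described above.
// The count itself comes from the simulator core, so it is passed in.
class BehaviorChart {
public:
    void sample(double now, int packets_in_flight) {
        samples_.emplace_back(now, packets_in_flight);
    }
    void dump(std::FILE* out) const {
        for (const auto& s : samples_)
            std::fprintf(out, "%.3f %d\n", s.first, s.second);
    }
private:
    std::vector<std::pair<double, int>> samples_;
};

int main() {
    BehaviorChart chart;
    // Placeholder counts sampled every 0.1 s of simulated time; in a real
    // run these would come from the simulator's packet bookkeeping.
    for (int i = 0; i < 20; ++i)
        chart.sample(0.1 * i, 3 + (i % 4));
    chart.dump(stdout);   // two columns: time, packets in flight
    return 0;
}
```

Plotting one such (time, count) series per multicast implementation yields the behavior chart of Figure 20.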
Figure 20. Difference Between Dense Mode and Centralized Multicast (total number of data packets in flight in the topology versus time in seconds)

Event schedule (time in seconds):
0.0 - source 0 starts
0.1 - DM's first periodic flood & prune
0.2 - member 1 joins
0.4 - member 4 joins
0.6 - member 5 joins
0.8 - link 0-1 goes down
1.0 - link 0-1 comes back up
1.2 - member 1 leaves
1.4 - member 4 leaves
1.6 - member 5 leaves

The dark dashed line represents dense mode multicast behavior. The light solid line represents the same simulation with the centralized multicast abstraction. Dark spikes (for example, at time 0.1, 0.4, and 0.8) correspond to periodic floods and prunes. In both lines, we can see that the steady state rises by a few packets when new members join the group (for example, at time 0.2, 0.4, and 0.6), and drops by a few packets when members leave the group (for example, at time 1.2, 1.4, and 1.6). Furthermore, when a link on the tree fails (for example, at 0.8), we can see that the lines drop a few packets lower as well, and, when the link comes back up, the lines jump back to the original level. Comparing the two lines, beyond the extra flood and prune overhead in dense mode, we also observe that the route convergence latencies for join/leave and network events differ, depending on the update sensitivity of the two multicast implementations (for example, the delay at time 1.4). From the above, we conclude that centralized multicast is not appropriate for experiments that examine detailed multicast behavior during transients (e.g., the amount of loss during transients or the amount of bandwidth consumed by flood and prune). However, centralized multicast should not change simulation results when transient delay or behavior is not an issue, for example in transport or application layer protocol studies with static membership and topology.

Detailed Packet Delivery vs. Session Multicast

Session multicast replaces hop-by-hop message forwarding with direct end-to-end channels with estimated end-to-end delays. These end-to-end delays do not include network queuing delays. We therefore expect session multicast simulations to match detailed packet delivery simulations when there is no congestion. In other words, session multicast simulations fail to model queuing delays when congestion occurs. To demonstrate this effect, we measure end-to-end delay in session multicast and detailed simulations on a 100-node transit-stub topology (Figure 17). We increase the number of multicast flows from 1, through 10 and 20, up to 80 to add more traffic into the simulated network. For each number of flows, we monitor end-to-end delays for all packets and calculate the differences between session multicast and detailed multicast packet delivery. Figure 21 depicts the mean ratio of the difference to end-to-end delay.

Figure 21. Distortion in End-to-End Delay by Session Multicast

When there is only one multicast flow, link capacity is high enough to carry the incoming traffic smoothly, so no queuing contributes to the end-to-end delay, which is therefore closely modeled by the end-to-end delay estimation in session multicast. As a result, we see almost no difference in end-to-end delay. However, when we increase the number of multicast flows, the network starts to experience congestion, causing differences in end-to-end delay. Multicast group sources and receivers are randomly selected.
Thus, the slight decrease in average end-to-end delay difference from 40 to 50 multicast session simulations may be a result of more evenly distributed traffic, as opposed to higher degrees of concentration at a few points in the simulated network. This example suggests that session multicast distribution must be used carefully when simulations involve very high source rates or cross traffic (i.e., congested network). For example, session multicast should not be used with congestion control protocols because there must be congested network components in order to exercise the congestion control mechanism. However, session multicast will be useful for studies that do not require congested networks, such as reliable multicast. Difference in End-to-end Delay 0 5 10 15 20 10 20 30 40 50 60 70 80 Number of Data Sources % of End-to-end Delay mean-stddev mean mean+stddev 78 4.2. Algorithmic Routing In this section, we present simulation results comparing flat routing (i.e., the Dijkstra shortest path algorithm), hierarchical routing and algorithmic routing. Through the set of experiments using GT-ITM random transit-stub topologies, we confirm the analytical results discussed in Section 3.2.3 that the flat, hierarchical, and algorithmic routing implemented in ns-2 scale in the order of O(N 2 ), O(NlogN), and O(N) respectively. The time requirement for hierarchical routing is surprisingly higher than the other two because in hierarchical routing each domain itself has to perform a smaller-scale Dijkstra shortest route computation. Both hierarchical and algorithmic routing can cause distortion. To quantify the distortion, we measure path lengths for all source and receiver pairs. Assuming all routes are symmetric, there are totally (N 2 -N)/2 pairs of distinct routes. The results shows that there are occasional large difference in path length but on average the difference in percentage, absolute difference divided by flat route length, is small, approximately 0.1%. We advise that users should avoid using algorithmic routing if the simulated networks are not trees and they have multiple senders. Otherwise, using algorithmic routing may over-estimate end-to-end delays and degrees of congestion at bottleneck points. Experiment Design The experiments are designed as follows. We vary sizes of the network topologies from 100 nodes to 500 nodes to show how each routing mechanism scales to the size of the topologies. These random topologies are GT-ITM transit-stub with connectivity approximately 1.77. We test one random topology per 100 node. The simulations end at the point where routing information is established (We don’t send data). All simulations are conducted in session mode where the memory and time requirements for nodes and links are much lower than those for detailed mode, 79 so we can highlight the memory and runtime consumption for routing. We use the same hardware as in Section 4.1. Memory Usage Figure 22. Memory Consumption for Flat, Hierarchical, and Algorithmic Routing Figure 22 shows the memory usage for one simulation because after repeating the experiment we find that the results are deterministic. As we increase number of nodes included in the topologies, the memory requirement for flat, hierarchical and algorithmic routing mechanisms increase as well. Although the expected growth of flat routing should be N 2 , it exhibits a somewhat faster than N 2 jump and then flattens out. This is due to the memory allocation policy in FreeBSD’s C library which interacts with ns-2’s flat routing. 
The number of entries in the routing table is always 2^k, where k is an integer. Thus, when the number of nodes increases to 2^k + 1, ns-2 will allocate 2^(k+1) entries, leaving 2^(k+1) - 2^k - 1 entries unused. This power-of-two artifact contributes to the "(2N)^2 jump then flattens out" behavior. We expect to see another jump between 500 and 600 nodes (jumping from 512 entries to 1024 entries). The memory requirement for hierarchical routing increases at a significantly slower rate. The memory requirement for algorithmic routing is lower still, on the order of O(N).

Time Usage

Figure 23. Time Consumption for Flat, Hierarchical, and Algorithmic Routing

Figure 23 shows the surprising result that flat routing, supposedly the least scalable routing mechanism, runs faster than hierarchical routing and only slightly slower than algorithmic routing. By investigating this unexpected phenomenon, we discover that simulation speed is closely related to two important factors: analytical complexity and programming language efficiency. Sometimes, language efficiency can be as crucial as analytical complexity in determining simulation speed. In ns-2, flat routing adopts the simplest form of Dijkstra's algorithm, whose computational complexity is O(N^3). The computational complexity of hierarchical routing is O(N log^2 N): each of the N nodes searches for a best route to each of its intra-domain nodes and neighboring domains (log N of them) through existing routes known to those intra-domain nodes and neighboring domains (another factor of log N). The computational complexity of algorithmic routing in session mode is only O(N), because of the tree search algorithm and address re-assignment. The computation cost of O(log N) per path lookup is shifted to the process of creating sessions, which is not included in the route establishment phase (the period of the simulation whose time usage we monitor for Figure 23).

From the analytical results, flat routing should scale the worst, on the order of O(N^3); hierarchical routing second, at O(N log^2 N); and algorithmic routing the best, at O(N). As a matter of fact, the simulation results (see Figure 23) for hierarchical and algorithmic routing agree reasonably well with the analytical results. However, flat routing scales surprisingly better than expected: much better than hierarchical routing and only slightly worse than algorithmic routing. This is because flat routing's Dijkstra algorithm is implemented in C++, a much more efficient language than OTcl [110], in which the hierarchical and algorithmic routing mechanisms are implemented. We plan to change this.

Distortions

Figure 24. Distortion by Algorithmic Routing: Difference in Route Length

Number of Nodes   Mean %    Median %   Max %
100               7.9541    0          400
200               9.236     0          600
300               10.314    0          700
400               13.368    12.5       600
500               13.539    12.5       700

Table 5. Large Maximum and Small Median for Route Length % Difference

By mapping arbitrary topologies into trees, we ignore certain links that otherwise are part of cycles in the original topologies. This indirectly results in different route paths under algorithmic routing as opposed to Dijkstra shortest path routing. Figure 24 depicts the average percentage difference in route length.
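The per-pair comparison behind Figure 24 and Table 5, described next, can be summarized by the following sketch (hypothetical helper; the hop counts are assumed to be produced by the flat and algorithmic routing modules):

```cpp
#include <cstdlib>
#include <vector>

struct RouteDistortion {
    double mean_hop_diff;   // average |algorithmic - flat| in hops
    double mean_pct_diff;   // average difference as % of the flat route length
};

// Compare path lengths over all distinct source-destination pairs,
// assuming symmetric routes, i.e. N*(N-1)/2 pairs.
// flat[i][j] and algo[i][j] hold hop counts from the two mechanisms.
RouteDistortion compare_routes(const std::vector<std::vector<int>>& flat,
                               const std::vector<std::vector<int>>& algo) {
    const int n = static_cast<int>(flat.size());
    double hop_sum = 0.0, pct_sum = 0.0;
    long pairs = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            const int diff = std::abs(algo[i][j] - flat[i][j]);
            hop_sum += diff;
            pct_sum += static_cast<double>(diff) / flat[i][j];
            ++pairs;
        }
    }
    if (pairs == 0) return {0.0, 0.0};
    return {hop_sum / pairs, 100.0 * pct_sum / pairs};
}
```

Assuming symmetric routes halves the work, matching the N*(N-1)/2 distinct pairs used in the text.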
For each transit-stub topology, we run the same simulation using flat and algorithmic routing mechanisms. For each source and destination pair, totally N*(N-1)/2 of them, we compute the difference (number of hops) in path length and the ratio of difference to the original flat routing path length. Each point in Figure 24 is the average of these differences in hop count and ratio. Figure 25. Example of Route Distortion Using Algorithmic Routing From Figure 24, we see that the average difference ratio remains very low, roughly 0.1 %. This nice property has a lot to do with the size of the cycles in the transit-stub topologies used in the evaluation. Typically, when the sizes of the cycles are small, the differences are small. Consider a ring topology with 5 nodes. See Figure 25. The shortest path from node 1 to 5 is to go directly through link 1-5 (Figure 25, left). After this ring topology is mapped into a string, the shortest route from node 1 to 5 becomes the path through link 1-2, 2-3, 3-4, and 4-5 (Figure 25, right). The route length difference is 3 hops and the ratio is 300%. Similarly, we hypothesize that usually the longer the cycles are in the topology, the larger the path difference. The transit-stub topologies used in this evaluation have short and few cycles concentrating in the transit areas and 1 5 4 3 2 1 5 4 3 2 Root to Start BSF Search Shortest Path Routing Algorithmic Routing 83 thus result in small route length difference. Table 5 shows that most routes have the same length (low median) and that few of the routes have very large differences (large maximum, up to 700%). Addition of stub-stub links will change this result. Consequently, there will be longer cycles and/or more cycles in the actual Internet topologies and thus route length difference is expected to be higher. Further analysis has to be done to verify that our hypothesis of longer or more cycles result in larger route length difference, and experiments have to be conducted to understand more on the effect of algorithmic routing to actual Internet topology. Simulations that have single sender can start tree mapping from the sender node to avoid this distortion. 4.3. FSA TCP In this section, we compare detailed TCP and FSA TCP implementations in ns-2. Simulation results show that FSA TCP is more efficient than detailed TCP in terms of memory consumption. In time consumption, FSA TCP is only slightly faster than detailed TCP. We measure TCP throughput and packet delay leaving the bottleneck queue. Our findings are that the average percentage difference in throughput is approximately 3% and the difference in per packet delay is in between 10 to 20 msec. We advise to use FSA TCP for the purpose of generating background traffic and avoid FSA TCP when fine-grain TCP details are required to draw valid conclusions. Experiment Design In these experiments, we use an ISP-like topology (Figure 26). To demonstrate the scaling property of FSA TCP, we vary number of TCP connections from 10 web sessions to 100 web sessions and each session contains about 200 TCP connections. These TCP connections arrive in Poisson random distribution and the connection sizes are Pareto (heavy-tailed) with average 10KB and alpha 1.2. For the set of FSA TCP simulations, we replace TCP connections that are shorter or equal to 31KB with FSA TCP and let the other longer connections run using original 84 Figure 26. ISP-like Environment: 420 Clients and 40 Servers Connected Through a Modem Pool (FDDI Ring) TCP implementation. 
All simulations run in detailed delivery mode and end after 4200 seconds of simulation time (slightly higher than an hour). We use a Pentium II 450MHz machine with 1GB physical memory, running FreeBSD 3.0. Memory Usage Figure 27. Memory Consumption for FSA TCP Memory 0 100 200 300 400 500 600 700 0 5000 10000 15000 20000 25000 Number of Connections MB All TCP FSA TCP * 420 Clients ... FDDI Server Network Bottleneck (always in detailed mode) 4 Intermediate Routers 40 Servers 85 Each point in Figure 27 is the result of one simulation. We repeat the same configuration (with the same the random seed) and find that simulation results (except time usage) are deterministic. Figure 27 shows FSA TCP improves memory usage significantly. The more TCP connections have to be created, the more memory FSA TCP saves. This evaluation also reveals that a significant amount of memory consumption is due to the details of ns-2’s original TCP implementation. FSA TCP abstracts away a great deal of detail and requires very little state, essentially a pointer to the current state in the finite state automata and a floating point number for round trip time. It turns out that reducing the amount of variables in ns-2’s TCP implementations can result in a large saving in memory usage 12 , especially for simulations that need to create a large number of connections. Time Usage Figure 28. Time Consumption for FSA TCP From Figure 28, we do not see significant improvement in runtime by using FSA TCP. This is because that simulation time is proportional to the size of the event scheduler list, which is 12 One probable reason of TCP implementation being memory consuming may be the artifact in binding OTcl and C++ variables. Time 0 500 1000 1500 2000 0 5000 10000 15000 20000 25000 Number of Connections Second All TCP FSA TCP 86 determined by the number of events or packets scheduled at times. Currently, our FSA TCP implementation in ns-2 generates the exact amount of individual packets as indicated in the finite state automata diagrams. Thus, the amounts of events or packets scheduled for detailed TCP and FSA TCP are the same, and so we do not see much improvement in simulation runtime. However, one potential improvement of FSA TCP implementation is to represent each batch of packets with a representative packet event, as suggested by Ahn and Danzig [33]. This avoids scheduling individual packets, subsequently reduces the size of the event queue, and eventually speeds up simulations. Distortions Figure 29. Distortion by FSA TCP: % Difference in Throughput FSA TCP ignores the round trip time estimation mechanism in detailed TCP, as well as the possible spacing between packets within a batch. These will result in differences in delay and delay-related metrics. To visualize the differences, we measure the connection throughput and the delay for each packet to leave the bottleneck queue. For each X-axis point (number of connections), we run two identical simulations. One uses FSA TCP and the other use detailed TCP. For each connection, we compute the difference ratio of throughput and the absolute % Difference in Throughput 0 2 4 6 8 10 0 5000 10000 15000 20000 Number of Connections % 87 Figure 30. Distortion by FSA TCP: Difference in Delay difference in delay at bottleneck queue. Figure 29 shows that FSA TCP differs from detailed TCP in throughput for about 3%. Figure 30 shows that FSA TCP differs from detailed TCP in per packet delay at the bottleneck queue for about 10-20 msec. 
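Both the memory savings and the delay distortions reported here trace back to how little per-connection state FSA TCP keeps: a pointer into the automaton and a single round-trip time estimate, with no per-packet RTT estimation. The fragment below is a schematic contrast with hypothetical type names; it is not the ns-2 TCP code, and the automaton itself is elided.

```cpp
// Schematic contrast of per-connection state (hypothetical names).

// Detailed TCP in the simulator tracks many protocol variables per
// connection: congestion window, slow-start threshold, sequence state,
// RTT estimator variables, timers, and so on.
struct DetailedTcpState {
    double cwnd, ssthresh;
    int    seqno, ackno, dupacks;
    double srtt, rttvar, rto;
    double backoff;
    // ... plus timers, bookkeeping, and OTcl-bound variables
};

// FSA TCP replaces all of this with a pointer into a precomputed
// finite state automaton plus a single round-trip time estimate.
struct FsaState;  // node in the precomputed automaton (elided here)

struct FsaTcpConnection {
    const FsaState* current;  // where this connection is in the automaton
    double          rtt;      // fixed end-to-end round-trip time estimate
};
```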
FSA TCP packets usually arrive and leave the bottleneck queue 10-20 msec earlier than detailed TCP packets. These results suggest that FSA TCP might be useful in generating background traffic where coarse-grain accuracy (up to 100s msec) is required. However, we should avoid using FSA TCP when comparing TCP behavior in fine-grain (below 10msec) time scale (See Section 5.3 for details). 4.4. Mixed Mode Mixed mode is a form of hybrid simulation. To allow detailed and end-to-end (i.e., session) packet deliveries (Section 3.1.2) running in the same simulation, we install proxy interfaces on the borders of detailed or session regions (R’ and S’ in Figure 13). These proxies will intercept packets coming through detailed hop-by-hop delivery or session channels, and then forward them onto session channels or detailed hop-by-hop forwarding machinery. Intuitively, the larger the detailed regions are the less the performance improvement will be. From a simple set of simulations, we show that not only the size of the detailed region degrades the performance, but Difference in Delay 0 5 10 15 20 25 0 5000 10000 15000 20000 Number of Connections msec 88 also the amount of connections crossing the detailed and session regions. The reason is that session delivery replaces detailed forwarding machinery with per session/connection states. Detailed mode may require a large amount of memory to create all the components in the detailed packet forwarding machinery, but the cost is one-time. No matter how many connections/sessions need to be created, only negligible amount of extra memory is required. However, in the session mode, every connection requires a substantial amount of memory, so the aggregated memory usage will grow linearly as the number of connections/sessions increases, eventually exceeds memory requirement for the same simulation in detailed packet delivery mode. Runtime for mixed mode simulations is usually worse than the detailed mode due to the extra proxy mechanism required to handle packets traversing between detailed and session regions. Distortion in mixed mode is quantified as the percentage difference in connection throughput. As long as the congested areas are configured to run in detailed mode, the error will be very small. Experiment Design The experiments are designed similarly to those in Section 4.3. We again use an ISP-like topology. The source connections arrive in Poisson and sizes are distributed in Pareto (a heavy- tailed distribution). To exercise mixed mode, we configure the bottleneck link to run in detailed mode and the rest in session mode. All simulations end at 4200 second simulation time. Because it is more obvious that the larger the detailed regions are, the less memory usage improvement will be, we vary the number of connections/sessions in the simulation to show another dimension of scaling limitation in mixed mode. All mixed mode simulations are conducted on the same hardware as in Section 4.3. 89 Memory Usage Figure 31. Memory Consumption for Mixed Mode To show the effect of number of connections to mixed mode’s scaling property, we plot Figure 31. In the figure, each point is one run of simulation because after repeating the simulation (with the same the random seed) we find that simulation results (except time usage) are deterministic. We increase the number of connections to see how mixed mode and original detailed mode scale to this scaling dimension. 
Figure 31 shows that mixed mode’s memory usage increases in a larger slope and the two lines tend to converge when the number of connections increases to a large value, around 2000. The reason is that when simulating a small number of connections, most memory used is to create the topology. In the set of mixed mode simulations, only the bottleneck link is configured in detailed mode the rest of the simulation topology is in session mode. Node and link structures are reduced to minimal in session mode, so the mixed mode simulation with this ISP-like simulation setup uses less memory to maintain topology information. However, session mode requires extra states to establish virtual channels (Section 3.2) between sources and receivers for all connections, as opposed to detailed simulations where packet forwarding is provided by the routing information maintained within detailed nodes and links. Therefore, when the number of connections increases, mixed mode simulations tend to use Memory 0 10 20 30 40 50 60 70 0 200 400 600 800 1000 Number of Connections MB Detailed Mixed Mode 90 Time 0 500 1000 1500 2000 0 200 400 600 800 1000 Number of Connections Second Detailed Mixed Mode more memory for per connection states in the session area. Although this evaluation focus on mixed mode’s scaling property to number of connections, it would not be surprising to see that when the area of detailed mode enlarges, mixed mode’s scaling property degrades, in order to include more memory consuming nodes and link in detailed mode. Time Usage Figure 32. Time Consumption for Mixed Mode The more packet events in a simulation, the slower mixed mode will perform. This is because of the large amount of special handling actions at the proxy interfaces in between detailed and session regions. Figure 32 shows that when the number of connections exceeds 200, mixed mode becomes a less efficient simulation mode in terms of simulation speed. It is expected that mixed mode simulations to be slow when a significant amount of packet events have to cross interface from detailed to session regions. When packets are forwarded from the detailed to session regions, mixed mode module needs to upcall OTcl functions to look for relevant proxy senders. All packets in this evaluation pass across one such interface. Figure 32 demonstrates a problem in the current mixed mode implementation , not in the idea. 91 Distortions Figure 33. Distortion by Mixed Mode: % Difference in Throughput For all simulations in this evaluation, congestion happens only at the bottleneck link. Thus, configuring the bottleneck link to run in detailed mode should give sufficiently accurate results. Figure 33 confirms this hypothesis and shows that the percentage difference in connection throughput is only 0.3%. We are confident to recommend mixed mode when the locations of congestion in a simulation can be projected a priori, so the congested regions can be specified to run in detailed mode during mixed mode’s simulation setup. 4.5. Multicast Packet Reference Count Instead of physically copying packets at multicast branching points, multicast packet reference count simply keeps a counter that tracks number of packets duplicated. This optimization technique eliminates memory requirement to maintain multiple copies of the same packet. Our simulations verify this hypothesis and show promising results. With packet reference count, the memory usage remains roughly the same when number of groups or number of members per group increases. 
Without physically allocating memory for duplicated packets and the related Difference in Throughput 0 0.1 0.2 0.3 0.4 0.5 0 200 400 600 800 1000 Number of Connections % 92 processing, simulation runtime is also improved. This optimization can not be used if packets need to be modified in flight, for example, the time to live (TTL) header field. Experiment Design The experiments and machine used for reference count evaluation are similar to those in Section 5.1 for the study of SRM. In this set of simulations, we use a 100-node random transit-stub topology (Figure 17). This topology is generated by GT-ITM (Georgia Tech - Internetwork Topology Model) [48], a random Internet topology generator. We increase the number of receivers and number of groups at the same time, so each X-axis value represents both numbers of receivers and groups in the simulation. Group sources and receivers are randomly selected. All simulations are conducted on the same hardware as in Section 4.1. Each point in Figure 34 and Figure 35 is the result of one simulation run. Simulation results are deterministic with the same random seed. Memory Usage Figure 34. Memory Consumption for Packet Reference Count Memory 0 100 200 300 400 500 0 20 40 60 80 100 Number of Groups and Receivers per Group MB Session Multicast With Reference Count 93 Figure 34 shows that when number of receivers and number of groups increase at the same time, session multicast without packet reference count consumes more memory, O(R*G) where R is the number of receivers per group and G is the number of groups in a simulation. However, when packet reference count is turned on, memory requirement does not increase as drastically. Instead, memory requirement increases in a very slow rate, linear in the scale of O(G). This also implies that a significant amount of memory is used to duplicate packets. Packet reference count can improve simulation memory usage tremendously for simulations that involve multicast or broadcast communication. Time Usage Figure 35. Time Consumption for Packet Reference Count Improvement in terms of runtime is not as drastic as that in memory usage. The reason is, again, that simulation time is determined greatly by the amount of events in the simulation event scheduler. In packet reference count, although content of duplicate packets will not be physically duplicated, the amount of scheduling for these duplicate packets remains the same. This results in a less significant effect on improving simulation runtime. The shape of the time usage is most likely to stay the same. Figure 35 confirms the rationale and shows that the scaling property in Time 0 100 200 300 400 500 0 20 40 60 80 100 Number of Groups and Receivers per Group Second Session Multicast With Reference Count 94 runtime roughly remains as O(R*G) where R is the number of receivers per group and G is the number of groups per simulation. Although packet reference count does not flat out the scaling curve, we still see a constant shift of speedup due to reduction of physically allocating memory in packet reference count. 4.6. Virtual Classifier Virtual classifier can be used to reduce memory requirement for routing information. Through a similar set of experiments used in Section 4.2, we show that this optimization technique does improve the memory usage significantly, so does time usage. There is no distortion introduced by this technique. 
Experiment Design The experiments to evaluate virtual classifier are essentially the same as Section 4.2 algorithmic routing, because virtual classifier is to improve memory requirement for routing related states as well. We vary the size of topology from 100 node to 500 node. All simulations use algorithmic routing and run in detailed delivery mode. Nodes and links are in the more complex construction where details of packet forwarding, queuing, transmission, and propagation delays are preserved. The two sets of simulations, one applies the virtual classifier technique and the other does not. These simulations are conducted on the same hardware as in Section 4.1. Each point in Figure 36 or Figure 37 is the result of one simulation run. Simulation results are deterministic with the same topology size. Virtual classifier eliminates N 2 forwarding entries in the detailed node structure. Figure 36 shows that memory usage’s scaling curve is brought down substantially when the size of topologies increases. The amount of memory, required for a 300-node algorithmic routing, detailed delivery mode simulation, is sufficient for a 600-node or larger simulation when using 95 Memory Usage Figure 36. Memory Consumption for Virtual Classifier virtual classifier. The more nodes there are in the simulation topology, the more improvement there is with virtual classifier technique. Time Usage Figure 37. Time Consumption for Virtual Classifier The simulation speedup shown in Figure 37 is due to the elimination of installing forwarding entries into detailed nodes. These installation actions do not consume much CPU time. Thus, time usage improvement is only marginal. Memory 0 10 20 30 40 50 60 0 200 400 600 Number of Nodes MB Original Classifier Virtual Classifier Time 0 50 100 150 200 250 300 0 200 400 600 Number of Nodes Second Original Classifier Virtual Classifier 96 Chapter 5 Impacts on Simulation Studies Abstractions can introduce distortions. One has to be careful when and what to use to avoid drawing incorrect conclusions. The progressive, systematic selection methodology (Section 3.5) is designed exactly for the purpose of helping potential users to choose scaling techniques wisely. In this section, we exercise this methodology and apply the scaling techniques. By that, we study the usefulness and the impacts that our proposed abstraction approach might have on Internet protocol studies. In previous discussion of suitability for each abstraction technique, we, in particular, identify that session multicast is suitable for reliable multicast studies, mix mode is suitable for congestion control studies, and FSA TCP is suitable for simulating aggregated large- scale TCP traffic. Thus, in this section, we choose one simulation study per research category. For each problem, we carefully follow the selection methodology, apply appropriate technique(s), and minimize distortions that might affect the final conclusion. As a result, we demonstrate that our abstraction approach has great potential in improving simulations for Internet protocol studies. 5.1. Case Study: SRM The goal of reliable multicast is to insure that all source data to a multicast group will be received by all members eventually. One can extend the reliability mechanism in TCP to enable reliable 97 multicast. For each data packet, source will receive M copies of the acknowledgements, where M is the number of members in the multicast group. This simple multicast extension of TCP will not scale when M increases. 
Thus, a key problem in any reliable multicast mechanism is how to minimize the overhead of control messages: members who detect losses should send as few retransmission requests as possible, and members who have received the data successfully should send as few repairs as possible. SRM (Scalable Reliable Multicast) [50] uses a combination of timers. When these timers expire, requesters or repairers send out requests and repairs; other requesters or repairers suppress their timers when they receive such requests. To effectively reduce the number of duplicate requests and repairs, SRM sets these timers according to the distances between members. SRM's scalability can be examined through simulations that inject artificial packet losses instead of congestion losses. We start this section with a brief overview of SRM. Subsequently, we show that abstraction can give much better performance for large simulations, but that, in some cases, it produces slightly different results. To determine whether the differences caused by abstraction would affect the end results of a real protocol study, we examine SRM timer behavior (previously explored in [50][63]-[66]) with both detailed and abstract simulations.

5.1.1. SRM Mechanism

SRM (Scalable Reliable Multicast) is a reliable multicast transport protocol. Each member detects losses individually and issues a multicast request for retransmission per loss. Any member who has successfully received the packets may multicast a retransmission to repair the loss for the requesting member. If members simply fire requests upon detecting losses, there will potentially be many duplicate requests traversing the multicast tree. To reduce the number of duplicate requests, members wait for some time before sending requests. This time has both deterministic and probabilistic parts. The deterministic part is decided according to the distance between the source and the requester, so that the requester closest to the source tends to fire requests first. The requests arriving at other losing members suppress duplicate requests. The probabilistic part suppresses requests under the circumstance that several requesters are equally distant from the source. Recoveries are handled analogously. The distance information used to decide the deterministic part of the timers is obtained by periodic exchange of session messages, similar to the NTP (Network Time Protocol) algorithm. Member A records T_sendB when sending a session message to member B. Member B records T_rcvA when receiving the session message from A. After a while, member B records T_sendA when sending another session message to member A, and, similarly, member A records T_rcvB upon receiving that session message. The one-way distance is estimated by:

(T_rcvB - T_sendA + T_rcvA - T_sendB) / 2

Details are in [50]; refinements are in [63]-[66]. A simplified code sketch of the distance estimate and the request timer is given below.

5.1.2. Applying Session Multicast

The three common metrics used to estimate the effectiveness of SRM are recovery delay, number of duplicate requests, and number of duplicate repairs. Recovery delay is the time from loss detection to recovery. The number of duplicate requests is the number of requests fired before request suppression (receiving requests from other members while the request timer is running); the number of duplicate repairs is the number of repairs fired before repair suppression (receiving repairs from other members while the repair timer is running). The scaling dimension is session size, i.e., the number of participants in a multicast group.
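Before turning to the simulations, the distance estimate and request timer of Section 5.1.1 can be made concrete with a short sketch. The constants C1 and C2 and the helper names are illustrative assumptions; the actual parameter settings and refinements are those of [50][63]-[66].

```cpp
#include <random>

// One-way distance estimate from the session-message timestamps
// described above: A sends at T_sendB, B receives at T_rcvA,
// B later sends at T_sendA, and A receives at T_rcvB.
double one_way_distance(double t_rcvB, double t_sendA,
                        double t_rcvA, double t_sendB) {
    return (t_rcvB - t_sendA + t_rcvA - t_sendB) / 2.0;
}

// Request timer with a deterministic part proportional to the distance
// d_sa between the source and this requester, plus a probabilistic part
// drawn uniformly to break ties among equally distant requesters.
// c1 and c2 are illustrative protocol constants.
double request_timer(double d_sa, double c1, double c2, std::mt19937& rng) {
    std::uniform_real_distribution<double> jitter(0.0, c2 * d_sa);
    return c1 * d_sa + jitter(rng);  // schedule the request this far ahead
}

// A request heard from another member before the timer fires suppresses
// this member's own pending request; repairs are handled analogously,
// using the distance to the requester instead of to the source.
```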
Because network dynamics (link failures) and queuing delay are not required to evaluate the above three metrics when changing session sizes, we conduct the simulations with session multicast. We also verify the results at the scale of 100 nodes. 99 5.1.3. SRM Simulations We simulate the SRM mechanism on top of detailed packet distribution with centralized multicast (referred to as SRM/detailed) and abstract multicast distribution (referred to as SRM/abstract). Session participants are randomly selected on a 100-node transit-stub topology (Figure 17). To study scaling behaviors of the detailed and abstract SRM simulations, we increase the session size from 20 to 90 and record the time and memory consumption. To study possible behavior differences, we measure the common SRM metrics - recovery delay, duplicate repair, and duplicate request. To make sure our results are reproducible, we examine 10 runs at each session size, randomly distributing the session participants in the same 100-node network Figure 38 compares memory and time for SRM/detailed (SRM in Figure 38) and SRM/abstract (Session SRM Figure 38). We see that abstraction improves memory and CPU usage for SRM simulations. SRM/abstract simulations experience a slower slope increase in memory consumption and lower time consumption when session size grows. To evaluate if abstraction changed the simulation results, we look at the three common metrics of reliable multicast. In Figure 39, we plot three common metrics of SRM performance. For each metric, we plot the mean observed value for detailed and abstract simulations and indicate the 95% confidence interval around this mean. As can be seen, these means are very close to each other and variation is much smaller than the randomness inherent in the experiments (as shown by the confidence intervals). This suggests session multicast yields about the same results. SRM is an error recovery scheme and so most simulations do not concern operationally congested networks. This suggests that session multicast works well for SRM simulations, and indeed our simulations yield useful results. However, there are other mechanisms (e.g., congestion control) that need to be studied in the presence of congestion. Can these mechanisms be studied using end-to-end packet delivery? The answer may be yes, depending on the levels of 100 details the mechanism requires and the availability of the models for congested links and nodes (e.g., input rate, output rate, and buffer size). As mentioned earlier, scaling simulations is application specific. Our hybrid simulation technique takes into account queuing behaviors for congested areas in the simulated network whereas the non-congested areas are running session multicast for better performance. The next case study is aimed at using this hybrid simulation technique for an active congestion control study and demonstrating the usefulness of proposed simulation scaling techniques. SRM Simulation Memory Consumption - Scaling with Session Size 0 5 10 15 20 25 30 35 40 45 50 0 20 40 60 80 100 Session Size Memory in MB SR M Session SRM SRM Simulation Time Consumption - Scaling with Session Size 0 50 100 150 200 250 300 350 0 20 40 60 80 100 Session Size User Time in Sec SR M Session SRM Figure 38. 
Performance Comparison for SRM and Session SRM Recovery Delay 0 5 10 15 20 25 30 20 50 80 Session Size Delay in RTT SRM Mean SRM Conf Low SRM Conf High Session Mean Session Conf Low Session Conf High Duplicate Repairs -1 1 3 5 7 9 11 13 15 20 50 80 Session Size Number of Repairs SRM Mean SRM Conf Low SRM Conf High Session Mean Session Conf Low Session Conf High Duplicate Requests 0 0.5 1 1.5 2 2.5 3 3.5 4 20 50 80 Session Size Number of Requests SRM Mean SRM Conf Low SRM Conf High Session Mean Session Conf Low Session Conf High Figure 39. Accuracy Comparison for SRM and Session SRM 101 5.2. Case Study: RAP Real-time media congestion control is the study of traffic control for media such as audio and video. Transmission of these media does not necessarily need 100% reliability. Usually, humans can still comprehend audio and video with some degree of noise inside. Therefore, the key issue for transmitting real-time media is rather the congestion control, given that audio and video sources are considerably high rate sources. To prevent these high rate multimedia sources from overtaking existing TCP traffic (i.e., the World Wide Web and many Email applications), it is desirable that real-time media control exhibits some form of TCP-friendliness [88]. In other words, if the same amount of data is sent through real-time congestion control mechanism or TCP, the two transmission controls should consume roughly the same share of available bandwidth, averaged of the duration of the transmissions (i.e., roughly equivalent connection throughputs). Rate Adaptation Protocol (RAP) [111] is designed with TCP friendliness in mind. It was previously evaluated through simulations that RAP flows and TCP flows share equal amounts of bandwidth assuming infinite flow sizes. We start this section by briefly introducing the RAP protocol, the process of selecting appropriate scaling techniques, and finally a new set of experiment design that remove the original assumption of infinite flow sizes. 5.2.1. RAP Mechanism The basic design principles behind RAP are to de-couple the reliability and flow control that currently co-exist in the TCP congestion control and to adopt a rate-based control, as opposed to TCP’s window-based control. In spite of these differences, RAP is designed to imitate TCP’s additive increase and multiplicative decrease rate control and per RTT adjustments. By that, the designers hope the multimedia traffic will not overtake existing TCP traffic and consume a fair or slightly lower share of bandwidth. 102 For real-time streaming application, error recovery is a decision that must be made by higher layer quality adaptation control. RAP only keeps track of losses, using a slightly complicated acknowledgement scheme, and reports losses upward to the quality adaptation layer. The higher layer ‘quality adaptation’ control will, based on application specific requirements, decide what to be recovered. Once decided, packets to be retransmitted will be stacked at RAP is input buffer and waiting to be sent. A more crucial part of RAP is its rate-based congestion control that attempts to emulate TCP’s coarse-grain additive increase and multiplicative decrease (AIMD) and a finer-grain RTT sensitive flow control. To achieve that, all three components are necessary. The first component is a congestion/loss detection algorithm. A packet is deemed lost if not acknowledged after a certain threshold, which is calculated according to the sending time and expected RTT. 
The second component is a rate adaptation algorithm. Inter-packet gap (IPG) is the one most important attribute to control the sending rate. All the speedup and slowdown decisions are to be reflected on the values of IPG. Thus data flows, sent using IPG, will exhibit appropriate additive increase and multiplicative decrease behavior. The final component is decision frequency. To be able to behave like TCP, the decision frequency is chosen to be the exact decision frequency in TCP – every RTT. 5.2.2. Applying Mixed Mode The motivation of RAP is to allow multimedia traffic to co-exist with TCP traffic and thus TCP friendliness is the most important evaluation metric for this type of protocol design. TCP friendliness is quantified as the ratio of throughputs if the same data flows are transmitted using RAP or TCP. The ideal TCP friendliness is 1:1 (i.e. 100%). A previous simulation study [111] has shown that, assuming infinitely long data sources and uniform random source arrival, RAP exhibits very close to 100% TCP friendliness when number of such infinite long data sources 103 increases, i.e. under a reasonably high degree of congestion on the bottleneck link. In this case study, we use a different set of experiments that do not assume infinitely long file sources. Based on observations from the Internet, we make our selection of file sizes and inter-arrival of connections on a heavy tailed and exponential distribution. In the total number of file transfers created, one half uses TCP, the other half uses RAP. We compare the average throughput of all connections and that of another set of simulations in which all files are transferred using TCP. The results show that although RAP does not achieve 100% TCP friendliness as claimed in the previous simulation study, it does share a fixed portion of bandwidth, roughly 70%; and RAP’s fairness remains at that ratio when the degree of congestion increases. One probable explanation is that the file sizes are heavy tailed and so most connections are short. Short TCP connections often end during the slow start stage where data packets are sent more aggressively than in the congestion avoidance stage. Thus TCP connections in this experiments have higher throughput than expected (thus lower RAP/TCP ratio). It is not clear what distribution the multimedia sources will be in the future. However, there are hints that if multimedia source sizes distribute similarly to those of existing Internet objects, we will see RAP behaves in roughly 70% TCP friendliness. We refer to above simulations of RAP and TCP running in detail as detailed RAP/detailed TCP. We re-ran the same sets of simulations in mixed mode where congested area transmits packets in detailed hop-by-hop delivery and the rest of the network transmits packets in session mode. We compared the result of average throughput in these three combinations over the base simulation results (detailed RAP/detailed TCP), mixed mode RAP/mixed mode TCP, mixed mode RAP/detailed TCP, and detailed RAP/mixed mode TCP. The results are depicted in Figure 40. All four sets of experiments lead to the same conclusion that RAP is about 70% TCP- friendly. It is shown that mix mode can be used effectively to study relevant network problems. 104 5.2.3. RAP Simulation The simulation setup in this case study is rather different from the SRM study where it is not clear yet what multicast topology and source models are appropriate. To simulate SRM, we use randomly generated topology and group membership. 
5.2.3. RAP Simulation

The simulation setup in this case study is rather different from that of the SRM study, where it was not yet clear what multicast topology and source models are appropriate; to simulate SRM, we use randomly generated topologies and group memberships. To simulate RAP, we adopt an Internet Service Provider (ISP)-like setup presented in [104]. The center of the ISP network is a modem pool. This modem pool connects on one side to 420 dialup modem users and on the other side to 40 web servers through a two-level tree hierarchy (see Figure 26). The 420 dialup clients start according to a Poisson arrival process and send requests randomly to the 40 web servers. The web servers send data back to the clients in amounts drawn from a second-form Pareto distribution. The link between the modem pool and the web servers is the bottleneck.

To obtain the TCP friendliness ratio, we need at least two sets of simulations: all TCP versus half TCP and half RAP. When detailed hop-by-hop packet delivery is used throughout the entire simulated network, we refer to the all-TCP simulations as detailed TCP and the half-TCP/half-RAP simulations as detailed RAP.

To improve the simulation scale, we apply the mixed mode technique. The bottleneck link, between the modem pool and the web servers, is configured to run in detailed mode, so queuing delays are taken into account. The rest of the network is configured to run in session mode, where queuing delays are ignored. The mixed mode equivalents of the original two sets of simulations are referred to as mixed mode TCP and mixed mode RAP. We increase the number of web session requests, i.e., the number of TCP and RAP flows, to create various degrees of congestion on the bottleneck link.

Figure 40 shows that the ratios of detailed RAP to detailed TCP, mixed mode RAP to detailed TCP, detailed RAP to mixed mode TCP, and mixed mode RAP to mixed mode TCP are almost identical. The average RAP/TCP ratio across all four combinations of experiments is 70.927%. This demonstrates that the scaling techniques, when used appropriately, can improve simulation performance while maintaining significant accuracy. RAP is a congestion control mechanism; when the congested area in the simulation can be pre-determined, we can apply the mixed mode technique so that the significance of queuing delays in the congested area is not ignored by session end-to-end packet delivery.

Figure 40. RAP Fairness: Four Combinations of Detailed and Mixed Mode Simulations

5.3. Case Study: Self-similar Traffic

Data network traffic is the result of Poisson human requests and heavy-tailed computer responses, to which the traditional memoryless analysis for telephone networks no longer applies. In fact, packet-level data network traffic exhibits self-similar or fractal behavior. The data network community, longing to know the nature of Internet traffic, is excited about this discovery. Some have started to analyze traffic using fractional Gaussian noise, a well-known fractal process. However, planning or analysis based on the empirical discovery alone may be misleading, which could ultimately lead to a disastrous breakdown in the future. Thus, this study [99] tried to explain the self-similar phenomena observed in data traffic. Through a set of detailed simulations, the study has provided strong evidence that 1) self-similarity over large time scales is due to user behavior and object/connection sizes, and 2) a pronounced change in scaling behavior around round-trip time scales is determined by the network topology and traffic load.
In this section, we begin with a succinct description of the wavelet analysis used to determine self-similarity in traffic traces and of how to interpret the scaling plots it generates. Then we re-run the same set of detailed TCP simulations with FSA TCP and show that we are able to produce almost identical global scaling plots, which are essential to this self-similar traffic study.

5.3.1. Self-similar Traffic Causality Study

This study takes the approach of replicating measured traffic in a simulation environment so that possible causes of interesting phenomena can be isolated from one another and conveniently studied. For simplicity, we narrow the problem scope down to verifying the conjecture that "self-similar scaling behavior over large time scales is mainly caused by user/session characteristics and has little to do with network-specific aspects". To visualize the scaling characteristics in network traffic, we apply a wavelet-based analysis technique whose ability to localize a set of network measurements in time and scale enables one to uncover relevant information about the time-scale dynamics of network traffic. Throughout this study, we rely on a set of high-quality measurements from an ISP environment and on various traces collected from a simulation environment that uses the ns-2 simulator [49]. The ISP traces serve as validation checks, while the ns-2-generated traces allow us to identify the effects that aspects of user/session characteristics or network configurations have on the dynamics of network traffic.

Wavelet Analysis and Measurement Data

We use the wavelet transform of a time series to study its global scaling properties. In particular, we examine the average energy contained in each scale of the trace and how that quantity changes as we move from coarser to finer scales. The average energy at scale $j$ is the average of the squared wavelet coefficients $d_{j,k}$, i.e., $E_j = \frac{1}{N_j} \sum_k d_{j,k}^2$, where $N_j$ is the number of coefficients at scale $j$. To determine the global scaling property of the data, we plot $\log E_j$ as a function of the scale $j$, from coarsest to finest scales, and determine qualitatively over what range of scales there exists a linear relationship between $\log E_j$ and $j$; that is, over what range of time scales there exists self-similar scaling (see [91][105] for more details).

Figure 41 shows the global scaling properties of two artificially generated traces. The first one is a Poisson process, and the second one is an exactly self-similar trace. In all of these plots, the scale $j$ is on the bottom axis and the corresponding time (in seconds) is plotted on the top axis for reference.

Figure 41. Global Scaling Plots: Artificial Poisson and Exact Self-Similar Processes

We validate our observations against two data sets, DIAL1 and DIAL2. They were collected on July 21, 1997 and January 22, 1999, between 22:00 and 23:00, at the same location: an FDDI ring connecting 420 modem clients to the rest of the Internet. Both traces have high time stamp accuracy, about 10-100 msec, and negligible dropped packets.
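To make the energy computation described above concrete, the following sketch performs a plain Haar wavelet decomposition of a packet- or byte-count time series and reports the per-scale average energy on a log2 scale. It is a simplified, illustrative stand-in for the wavelet tool used in the study, not the actual analysis code.

```python
import math

def haar_details(series, max_level=None):
    """Per-scale Haar detail coefficients d[j][k] of a 1-D series.
    Index 0 holds the finest scale; coarser scales follow. Odd tails are dropped."""
    approx = list(series)
    details = []
    while len(approx) >= 2 and (max_level is None or len(details) < max_level):
        n = len(approx) // 2
        details.append([(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2)
                        for i in range(n)])
        approx = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2)
                  for i in range(n)]
    return details

def scaling_plot_points(series):
    """(scale j, log2 of average energy E_j) pairs: a straight-line region in
    these points indicates self-similar scaling over those time scales."""
    points = []
    for j, d in enumerate(haar_details(series), start=1):
        energy = sum(c * c for c in d) / len(d)   # E_j = (1/N_j) * sum_k d_{j,k}^2
        points.append((j, math.log2(energy) if energy > 0 else float("-inf")))
    return points
```

Applied to a per-10-msec packet or byte count series, the resulting points correspond to the kind of global scaling plots shown in the figures of this section.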
To summarize the results of the scaling analysis of our two IP traffic traces, Figure 42 shows the global scaling behaviors for DIAL1 and DIAL2. Both data sets exhibit very similar global scaling behaviors, i.e., self-similar scaling over time scales larger than a few hundred msec (look for the linear regime on the left half of the plot) and a rather abrupt transition in scaling behavior over the "typical" round-trip time scales (look for the dips on the right half of the plot).

Figure 42. Global Scaling Plots for Measurement Data

Simulation Results

We explore the role that variability plays in determining the scaling properties of network traffic, in particular user/session-related variability (e.g., sizes of Web sessions or sizes of HTTP data transfers, number of requests per session). An ISP-like configuration (Figure 26) is used to illustrate how low user variability and high user variability contribute differently to the dynamics of the measured traffic. By high user variability, we mean that at least one of the "workload-specific" distributions (i.e., number of objects per page or object size) is chosen from the class of heavy-tailed distributions with infinite variance (e.g., Pareto-type tail behavior), while low user variability means that all of these distributions are either exponential or trivial (i.e., constant).

The results are shown in Figure 43. While the low user variability simulation yields a trivial global scaling plot (i.e., a horizontal line, consistent with the absence of long-range dependence), the high user variability setting gives rise to pronounced global scaling behavior over large time scales. This also illustrates what is meant by saying that self-similar scaling over large time scales is primarily caused by user/session characteristics and has little to do with network-specific aspects [99]. There is also a notable knee at time scales on the order of the round-trip time, indicating an abrupt transition from self-similarity to a possibly multifractal scaling behavior.

Figure 43. Global Scaling Plot: Heavy-tailed Object Sizes vs. Exponential

5.3.2. Applying FSA TCP

Our choices of network topologies and types of clients attempt to replicate a reasonably realistic ISP environment. Since roughly 60-80% of all packets and bytes measured are web traffic, our primary user is a consumer accessing the network through an ISP via a modem bank to browse the web. To accurately simulate HTTP transfers, we extend the existing ns-2 HTTP modules to accommodate the variability that is inherent in the Web.

In a typical HTTP 1.0 transaction, a web client sends a request to the Web server for a web object after establishing a TCP connection. The server responds with a reply header (sometimes attaching data) and then continues to send data. However, the original ns-2 TCP connection module did not send the connection set-up and tear-down packets; in fact, the TCP connection modules allow the transfer of data in only one direction (ns-2 currently also provides Full-TCP, which includes two-way traffic). We modified ns-2 to emulate the exchange of HTTP header information with two ns-2 TCP connections that have the same "port" numbers, which facilitates object identification for data analysis.

During a Web session a user usually requests several Web pages, and each page may contain several web objects (e.g., images or audio files). To capture this hierarchical structure and its inherent variability, we allow different probability distributions for the following user/session attributes: inter-session time, pages per session, inter-page time, objects per page, inter-object time, and object size (in KB). For each of these attributes, we can choose from the many built-in distributions (such as constant, uniform, exponential, Pareto, etc.) or define our own. We base our choice of distributions on the work surrounding SURGE [92], a web workload generator designed to generate realistic Web traffic patterns, and on [93]-[95].
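The following sketch illustrates the hierarchical session/page/object structure just described. It is not the extended ns-2 HTTP module; the distribution choices and parameter values are placeholders standing in for the SURGE-informed settings used in the study.

```python
import random

def pareto(mean, shape=1.2):
    """Heavy-tailed (Pareto) sample with the given mean; shape < 2 gives
    infinite variance, the property emphasized in the text."""
    scale = mean * (shape - 1.0) / shape
    u = 1.0 - random.random()              # uniform in (0, 1], avoids division by zero
    return scale / (u ** (1.0 / shape))

def web_session(high_variability=True):
    """Yield (time_offset_seconds, object_size_kb) pairs for one Web session.
    With high_variability, object sizes are Pareto; otherwise exponential.
    All parameter values below are illustrative, not the study's settings."""
    t = 0.0
    pages = 1 + int(random.expovariate(1.0 / 3.0))         # pages per session
    for _ in range(pages):
        t += random.expovariate(1.0 / 10.0)                # inter-page time (s)
        objects = 1 + int(random.expovariate(1.0 / 4.0))   # objects per page
        for _ in range(objects):
            t += random.expovariate(1.0 / 0.5)             # inter-object time (s)
            size = pareto(12.0) if high_variability else random.expovariate(1.0 / 12.0)
            yield (t, size)

# Example: enumerate the requests of one high-variability session.
for when, kb in web_session():
    print(f"t={when:7.2f}s  object {kb:8.1f} KB")
```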
The protocol stack, network topology (including delays and bandwidths), and the sequence of Web requests define a simulation. Since TCP Reno and HTTP 1.0 [96] are the predominant protocols in the measured traces, we emulated them in our simulations. To find out how various attributes of the network topology and web request sequence affect the traffic characteristics, we experimented with a set of network topologies. We concentrated on simulation environments that consist of a set of clients connected to an access network, which in turn provides connectivity to a set of servers, in effect creating a "dumbbell". The exact topology is shown in Figure 26.

The performance bottleneck of these simulations is really the number of TCP connections that has to be created. Given that the global scaling analysis is performed on a per-10-msec time series, FSA TCP seems to be a strong candidate abstraction. In fact, in the next subsection we verify that, to a certain extent, FSA TCP improves simulation performance and produces almost identical global scaling plots to those of the detailed TCP simulations.

5.3.3. Self-similar Traffic Simulation

We replace connections with FSA TCP whenever it is applicable, i.e., when a connection is short and has no more than one loss. The simulation that uses the detailed Reno TCP implementation is referred to as Traffic/TCP, and the simulation that replaces qualified connections with FSA TCP is Traffic/FSA. The global scaling plots are depicted in Figure 44. The two lines, one for Traffic/TCP and the other for Traffic/FSA, overlap through all time scales, which means our FSA TCP works well in preserving self-similarity in the aggregated traffic. Because this self-similarity is determined by the distribution of web object sizes, not the details of TCP, the minor delay distortion introduced by FSA TCP should not have crucial effects on self-similar traffic simulations. We can be confident in simulating larger-scale self-similar traffic using FSA TCP, for example to generate background traffic.

Figure 44. Almost Identical Global Scaling Plots: Detailed TCP and FSA TCP

With FSA TCP, we have been able to create realistic network traffic in a medium-scale scenario, i.e., an ISP environment. Along with abstraction techniques for creating large network topologies (e.g., algorithmic routing), we could potentially simulate a regional-area network. To simulate networks and traffic at the scale of today's Internet, more abstraction techniques that model various aspects of the Internet at even coarser granularity need to be developed. We continue the discussion of more and better scaling techniques in the next chapter.

Chapter 6
Contributions and Future Work

To conclude, we revisit the tasks needed to support the thesis statement. Below is a summary of what has been done and what remains to be done. We have provided the selective abstraction guideline and four abstraction, one hybrid, and two optimization techniques. Each abstraction technique abstracts away certain network aspects to conserve simulation resources.
We also pinpoint the distortions created by the abstraction techniques and suggest suitable applications. To study the potential impact of abstraction, we conduct three case studies: SRM, RAP, and self-similar traffic. The results demonstrate the applicability of the general selection guideline and the abstraction techniques. Currently, when algorithmic routing and FSA TCP are applicable, ns-2 is capable of simulating scenarios with 10,000s of nodes and 10,000s of flows on a FreeBSD Pentium PC with 1 GB of physical memory, as opposed to a maximum of 512 nodes and 40,000 TCP flows in fully detailed mode. More abstraction techniques need to be developed to enable simulation scenarios larger than 10,000s of nodes and 10,000s of flows. We briefly examine short- and long-term future work in separate subsections.

6.1. Thesis Statement

Using abstraction, one can simulate large-scale networks with a substantial amount of traffic and yet retain significant accuracy.

To verify the above statement, we focus on the following tasks.
• Develop abstraction, hybrid, and optimization techniques
• Pinpoint the distortions caused by abstraction techniques
• Provide general guidelines to assist users with progressive and systematic abstraction selection
• Demonstrate through sample simulation studies the use of the selection guidelines

6.2. Contributions

Table 6 summarizes all the scaling techniques described earlier. Among the seven techniques, centralized computation, end-to-end packet delivery, algorithmic routing, and finite state automata modeling are abstraction techniques; each of them leaves out significant details that may introduce distortions. Two optimization techniques, packet reference count and virtual classifier, attempt to conserve the memory and runtime spent duplicating identical simulation information. The mixed mode hybrid simulation technique enables partially abstract simulations; in a sense, it allows users to adjust the level of abstraction of their simulations at a finer granularity.

For the purpose of evaluation, we apply these more general scaling techniques to specific unicast and multicast simulation components. Through our systematic experiments, we identify the distortions associated with the abstraction techniques and suggest applicable network problem studies. In general, the applicability of the above scaling techniques is problem dependent; users have to be careful to avoid distortions that will lead to invalid results and conclusions. To help users effectively select the above scaling techniques, we propose the selection methodology in Table 7. This methodology, unlike the scaling techniques that are specific to network problems, is potentially applicable to other forms of abstract simulation. Table 8 summarizes the three case studies conducted to demonstrate the usefulness of this selection methodology and the proposed abstraction techniques.
Scaling techniques, listed with their conserved aspects, application, distortion, and applicable problems:

Abstraction:
• Centralized Computation (Section 3.2.1): conserves control messages and network state; application: Centralized Multicast; distortion: control message overhead and convergence delay; applicable problem: transport-layer multicast
• End-to-end Packet Delivery (Section 3.2.2): conserves routing states and node/link structures; application: Session Multicast; distortion: queuing delay; applicable problem: reliable multicast
• Algorithmic Routing (Section 3.2.3): conserves routing states for very large topologies; application: Algorithmic Unicast; distortion: route length; applicable problem: tree topologies and single-source scenarios
• Finite State Automata Modeling (Section 3.2.4): conserves end-to-end protocol states; application: FSA TCP; distortion: fine-grain connection delay and throughput; applicable problem: background traffic

Optimization:
• Packet Reference Count (Section 3.4.1): conserves packets in flight; application: Multicast Packet Reference Count; distortion: none; applicable problem: multicast or broadcast communication
• Virtual Classifier (Section 3.4.2): conserves routing states within detailed nodes; application: Virtual Unicast Classifier; distortion: none; applicable problem: any unicast simulation

Hybrid Simulation:
• Mixed Mode (Section 3.3): expands the application range of the above techniques; application: Mixed Session Packet Delivery; distortion: negligible; applicable problem: congestion control

Table 6. Scaling Techniques and Distortions Summary

1. Define scaling factors (varying input) and measurement metrics (wanted output)
2. Start from small-scale detailed simulations
3. Profile the simulations to find the bottleneck
4. Adapt techniques to avoid the simulation bottleneck
5. Verify the simulations at small scale in detailed mode
6. If confident of one of the following conditions, proceed with larger-scale simulations using the improved version:
• The defined measurement metrics are not affected by the distortion at all
• The distortion of the selected abstraction will not be enlarged when increasing the defined scaling factors

Table 7. Abstraction Selection Methodology Summary

Case studies and their applicable abstractions:
• Scalable Reliable Multicast: Session Multicast and Packet Reference Count
• Rate Adaptation Protocol: Mixed Mode and Session Multicast
• Self-similar Traffic Analysis: FSA TCP and a form of Hybrid Simulation

Table 8. Case Studies Summary

6.3. Short-term Future Work

Although the current algorithmic routing and FSA TCP show promising results in the previous discussion, there are potential ways to optimize these two abstraction techniques. With these optimizations, algorithmic routing might eliminate its limitation to tree topologies and be able to simulate arbitrary topologies of 10,000s of nodes, or the actual Internet topology. By automating the generation of TCP's finite state machine diagrams, more traffic flows will be candidates for the FSA TCP abstraction.

6.3.1. Algorithmic Routing

Algorithmic routing is composed of two parts, topology mapping and next-hop lookup, and its optimizations aim to improve one or the other of these two sub-mechanisms. The next-hop lookup algorithm currently requires O(log N) time on average. However, with careful node address assignment and arithmetic calculation, this O(log N) time consumption can be reduced to constant time, O(1) [106]. The topology mapping is essentially a tree search algorithm. This tree-mapping process is the origin of algorithmic routing's distortion: sub-optimal routes, or differences in route length. To reduce the degree of this distortion, we can apply a reversed minimum spanning tree (a maximum spanning tree) with equivalent link weights. This tree search guarantees to leave out a minimal number of links and thus tends to create fewer sub-optimal routes.
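The route-length distortion just described can be quantified with a small experiment of the following form. The sketch builds a spanning tree from a chosen root (a plain BFS tree here, standing in for the actual tree search used by algorithmic routing), routes all pairs over that tree, and compares the resulting hop counts with shortest paths on the full graph. It is an illustrative measurement harness, not the ns-2 implementation.

```python
from collections import deque

def bfs_parents(adj, root):
    """Parent pointers of a BFS spanning tree of `adj` (dict: node -> neighbor list)."""
    parent, queue = {root: None}, deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def hops(adj, src):
    """Hop counts from `src` to every node over the full graph (BFS)."""
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def tree_hops(parent, u, v):
    """Hop count between u and v when packets may only follow tree edges."""
    up = {}
    x, d = u, 0
    while x is not None:
        up[x] = d
        x, d = parent[x], d + 1
    x, d = v, 0
    while x not in up:
        x, d = parent[x], d + 1
    return d + up[x]

def average_inflation(adj, root):
    """Mean ratio of tree route length to shortest route length over all pairs."""
    parent = bfs_parents(adj, root)
    ratios = []
    for u in adj:
        shortest = hops(adj, u)
        ratios.extend(tree_hops(parent, u, v) / shortest[v]
                      for v in adj if v != u)
    return sum(ratios) / len(ratios)

# A five-node ring: tree routing must detour around the one link the tree
# leaves out, so some routes are inflated (cf. the ring example of Section 4.2).
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(average_inflation(ring, root=0))
```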
Selecting a root from which to start the tree search also affects the degree of distortion and deserves further investigation. Intuitively, the root of the tree search should correspond to the largest sender.

The above tree search optimizations only reduce the degree of distortion. To prevent sub-optimal routes entirely, we will need mechanisms such as creating multiple trees and/or shortest-route probing. Based on our observation of algorithmic routing's impact on a five-node ring topology, we hypothesize (Section 4.2) that the longer the cycles in a topology, the larger the route difference. Cycles appear to be one of the major sources of the sub-optimal route effect. When there are few cycles and the cycles are small, we can create multiple mappings, rooted at every node on the cycles. Although the memory requirement becomes O(kN), where k is the number of nodes within cycles, we completely eliminate the sub-optimal route distortion. However, this optimization is only helpful when there are very few cycles and the cycles are small; otherwise, the k in O(kN) could become too large and we would lose the scalability of algorithmic routing. An alternative solution is shortest-route probing. This mechanism marks the nodes at the two ends of the links left out by the tree search. Whenever a route is requested to or from these marked nodes, they probe their 'previously connected' neighboring nodes for shortest routes.

Although our evaluation study uses a set of transit-stub random topologies, it is also important to understand the impact of algorithmic routing on other types of topologies. Topologies that contain stub-stub links, in particular, potentially have longer cycles and can therefore produce more and worse sub-optimal routes. The actual Internet topology tends to have stub-stub links for network robustness, so algorithmic routing could cause a higher degree of distortion than observed in Section 4.2.

6.3.2. FSA TCP

The existing TCP finite state automata diagrams are generated manually from a simple set of selected error scenarios. Thus, they are applicable to TCP flows that are short or have only one packet loss throughout the transmission. When the degree of congestion is high, TCP flows tend to drop more packets, and thus fewer TCP flows can apply the FSA TCP abstraction. For these highly congested network simulation scenarios, we need fully extended TCP FSA models. These complete TCP FSA models, however, are cumbersome to generate manually. Automating FSA model generation is therefore crucial; it can potentially provide complete TCP FSAs for longer TCP flows.

Another dimension in which to improve FSA TCP is, instead of sending individual packets according to the number indicated in the current state, to send a single representative for the entire batch. This extension is expected to improve both simulation memory and time usage by conserving the number of physical packets in flight and by reducing the number of events in the scheduler queue.
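The FSA idea and the batch-representative extension can be illustrated with a toy model. The state table and transition structure below are ours, chosen only for illustration; they are not ns-2's FSA TCP implementation or its actual state diagrams.

```python
import random

# Toy finite-state model of a short TCP transfer: each state names how many
# packets the sender emits during the next RTT and which state follows,
# depending on whether any of those packets is lost (single-loss model).
FSA = {
    # state: (packets_this_rtt, next_state_if_no_loss, next_state_on_loss)
    "SLOW_START_1": (1, "SLOW_START_2", "RECOVERY"),
    "SLOW_START_2": (2, "SLOW_START_4", "RECOVERY"),
    "SLOW_START_4": (4, "DONE", "RECOVERY"),
    "RECOVERY":     (1, "DONE", "DONE"),
    "DONE":         (0, "DONE", "DONE"),
}

def fsa_transfer(rtt, loss_rate, batch_representative=False, seed=None):
    """Return a list of (send_time, packet_count) events for one modeled flow.
    With batch_representative=True, each RTT contributes a single event standing
    in for the whole batch (the extension suggested above); otherwise one event
    is generated per packet."""
    rng = random.Random(seed)
    events, state, t = [], "SLOW_START_1", 0.0
    while state != "DONE":
        count, next_ok, next_loss = FSA[state]
        lost = any(rng.random() < loss_rate for _ in range(count))
        if batch_representative:
            events.append((t, count))
        else:
            events.extend((t, 1) for _ in range(count))
        state = next_loss if lost else next_ok
        t += rtt
    return events

# Per-packet events versus per-RTT batch representatives for the same flow.
print(len(fsa_transfer(rtt=0.1, loss_rate=0.05, seed=1)))
print(len(fsa_transfer(rtt=0.1, loss_rate=0.05, batch_representative=True, seed=1)))
```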
6.4. Long-term Future Work

In this section, we discuss related issues that will require larger-scale collaboration and a longer period of study. We discuss these long-term problems in three subsections. The first two hint at advanced abstraction techniques that would enable larger-scale traffic and topology simulations, e.g., 1,000,000s of flows and 1,000,000s of nodes. The last subsection looks at related issues in validating abstract simulations.

6.4.1. Router Characteristics

When simulating a large network with a large amount of traffic, where almost 100% of the network capacity is consumed by packets in flight, the performance bottleneck becomes the number of packet events. To conduct simulations at the scale of 1,000,000s of flows and 1,000,000s of nodes, there must exist abstraction techniques or models that avoid generating a large number of packet events. One approach is to characterize the queue-length dynamics, given the specifications of the cross traffic.

6.4.2. Domain Abstraction

Another dimension of the problem of simulating larger-scale topologies is the memory requirement for routing information. Judging from the scale of algorithmic routing (10,000s of nodes) and the current Internet topology (100,000s of routers), there must exist techniques to improve the simulation scale by another order of magnitude or two. One potential solution is to abstract away some intra-domain details; we call this domain abstraction. Each domain is represented by a simulation node and communicates with other domains through inter-domain links, each represented by a simulation link. The intra-domain details can be approximated from previous domain-sized simulations.

6.4.3. Simulation Validation

Simulation validation has drawn an increasing amount of attention in the simulation research community. A majority of the community agrees that there are levels of validation: simulation kernel, model, and experiment are the three levels commonly identified in recent conversations. Among them, model validation is closely related to the abstraction issues presented in this thesis. Our proposed selection methodology helps in selecting valid models to a certain degree. However, the selection process is highly dependent on individual expertise. To simplify the process, we need to gather networking expertise, categorize Internet research problems, define scaling factors and interesting measurement metrics, and finally correlate these scaling factors and measurement metrics with the various network characteristics that abstraction techniques might distort.

6.5. Conclusion

With the size of today's Internet (100,000s of routers, and still growing), all Internet mechanisms should be tested at a similar scale to prevent a disastrous breakdown of the Internet. Network simulation is one of the few methods that comes close to providing such large-scale testing. Many previous works concentrate on designing and improving parallel and distributed simulators; few explore the use of abstraction and hybrid simulation techniques. In this thesis, we compare and contrast the concepts of abstraction, hybrid simulation, and optimization that have been used implicitly in previous simulator development. After clarifying the distinction among the three types of scaling techniques, we propose and evaluate several abstraction, hybrid simulation, and optimization techniques. Each of these techniques removes a performance bottleneck, for example link structures, routing tables, or control messages. To help users avoid critical distortions that may invalidate their conclusions, we systematize the selection of abstractions for relevant network studies. While our scaling techniques improve specific aspects of network simulations, our general guidelines are potentially applicable to selecting similar abstraction techniques or other forms of abstract simulation.
To demonstrate our scaling techniques and selection guidelines, we provide three case studies, all in active research areas, and successfully select the proper scaling techniques and avoid distortions that would critically affect the final conclusions. With the abstraction, hybrid simulation, and optimization techniques, we are capable of simulations with 10,000s of nodes and 100,000s of traffic connections in less than 1 GB of memory, compared to no more than 500 nodes and 40,000 connections in full detail. That is two orders of magnitude improvement in topology size and one order of magnitude improvement in data traffic. In the short term, we hope to expand the applicability of algorithmic routing, so that more experiments can take advantage of this memory-efficient routing mechanism, and to study the impact that sub-optimal routes might have on real research problems. In addition, we believe FSA TCP has great potential for better applicability and efficiency through automation of FSA generation and the batch-representative abstraction. In the long term, techniques such as domain abstraction and router characterization can further improve the topology sizes and amounts of data traffic allowed in network simulations. Another direction of future work is to ensure the validity of simulation results; this is an open research area that deserves more attention.

References

[1] Roger McHaney, Computer Simulation: A Practical Perspective, Academic Press Inc., 1991 [2] Raj Jain, The Art of Computer System Performance Analysis: Techniques for Experimental Design, Measurement, Simulation and Modeling, John Wiley & Sons, Inc., 1991 [3] Philip Morrison, Phylis Morrison, and the Office of Charles and Ray Eames, Powers of Ten: About the Relative Size of Things in the Universe, Scientific American Books, Inc., 1982 [4] Bruno R. Preiss, Ian D. MacIntyre, and Wayne M. Loucks, On the Trade-off between Time and Space in Optimistic Parallel Discrete-Event Simulation, In Proceedings of the 1992 Workshop on Parallel and Distributed Simulation, pages 33-42, Newport Beach, CA, January 1992. Society for Computer Simulation [5] James F. Ohi and Bruno Richard Preiss, Parallel Instance Discrete-Event Simulation Using a Vector Uniprocessor, In Proceedings of the 1991 Winter Simulation Conference, pages 93-601, Phoenix, AZ, December 1991. Society for Computer Simulation [6] Jayadev Misra, Distributed Discrete-Event Simulation, ACM Computing Surveys, Vol. 18 No. 1, March 1986, pp. 39-65 [7] David R. Jefferson, Virtual Time, ACM Transactions on Programming Languages and Systems, Vol. 7 No. 3, July 1985, pp. 404-425 [8] K. M. Chandy and Jayadev Misra, Asynchronous Distributed Simulation via a Sequence of Parallel Computations, Communications of the ACM, Vol 24 No. 11, April 1981, pp. 198-205 [9] Rajive L. Bagrodia and Wen-Toh Liao, Maisie: A Language for the Design of Efficient Discrete-Event Simulations, IEEE Transactions on Software Engineering, Vol. 20 No. 4, April 1994, pp. 225-238 [10] Scalable Self-Organizing Simulation (S3), Web Page http://www.dimacs.rutgers.edu/Projects/Simulations/darpa/ [11] BONeS Web Page http://www.cadence.com/alta/produces/bonesdat.html [12] OPNET Web Page http://www.mil3.com/home.html [13] COMNET III Web Page http://www.caciasl.com/comnetthree.html [14] K. Perumalla, R. Fujimoto, and A. Ogielski, TED – A Language for Modeling Telecommunication Networks, ACM SIGMETRICS Performance Evaluation Review, Vol 25 No 4, March 1998 [15] K. Perumalla, M. Andrews, and S.
Bhatt, TED Models for ATM Internetworks, ACM SIGMETRICS Performance Evaluation Review, Vol 25 No 4, March 1998 [16] D. Rubenstein, J. Kurose, and D. Towsley, Optimistic Parallel Simulation of Reliable Multicast Protocols, [17] J. Panchal, O. Kelly, J. Lai, N. Mandayam, A.T. Ogielski, and R. Yates, Parallel Simulation of Wireless networks with TED: Radio Propagation, Mobility, and Protocols, ACM SIGMETRICS Performance Evaluation Review, Vol 25 No 4, March 1998 [18] B. J. Premore, and D. M. Nicol, Transformation of ns TCP Models to TED, ACM SIGMETRICS Performance Evaluation Review, Vol 25 No 4, March 1998 [19] DARPA Global Mobile (GloMo) Information Systems program, Web Page http://glomo.sri.com/glomo/ [20] Srinivasan Keshav, REAL: A Network Simulator, Technical Report 88/472, University of California, Berkeley, December 1988, http://www.cs.cornell.edu/skeshav/real/overview.html [21] A. Dupuy, J. Schwartz, Y. Yemini, and D. Bacon, NEST: A Network Simulation and Prototyping Testbed, Communication of ACM, Vol 33 No. 10, October 1990, pp. 64-74 [22] Andrew Heybey, and Niel Robertson, The Network Simulator Version 3.1, MIT, May 1994, ftp://thyme.lcs.mit.edu/pub/netsim/ [23] A. Bhattacharyya, al. et., An Overview of the Ptolemy Project, Department of Electrical Engineering and Computer Science, University of California at Berkeley, March 1994, http://ptolemy.eecs.berkeley.edu [24] C. Alaettinoglu, A. U. Shankar, K. Dussa-Zieger, and I. Matta, Design and Implementation of MARS: A Routing Testbed, Journal of Internetworking Research and Experience, Vol 5 No 1, March 1994, pp. 17-41 [25] B. Lewis Barnett III, Netsim: A Network Performance Simulator, In Proceedings of the ACM SIGCSE, 1993, http://www.mathcs.urich.edu/barnett/netsim/Netsim_SIGCSE.ps [26] Bruce A. Mah, INSANE Users Manual, The Tenet Group Computer Science Division, University of California, Berkeley 94720, May 1996, http://HTTP.CS.Berkeley.EDU/bmah/Software/Insane/InsaneMan.ps [27] Nada Golmie, Alfred Koenig, and David Su, The NIST ATM Network Simulator: Operation and Programming Version 1.0, U.S. Department of Commerce Technology Administration National institute of Standards and Technology, Computer System Laboratory, Advanced Systems Division, Gaithersburg, MD 20899, August 1995, ftp://isdn.ncsl.nist.gov/atm-sim/sim_man.ps.Z 124 [28] Jay Martin and Rajive Bagrodia, COMPOSE: An Object Oriented Environment for Parallel Discrete-Event Simulation, In Proceedings of the 1995 Winter Simulation Conference, December 1995, ftp://may.cs.ucla.edu/pub/papers/wsc95-compose.ps.gz [29] Hussein Salama, MCRSIM User’s Manual, May 1995 [30] Dorgham Sisalem, A TCP Simulator with Ptolemy, June 1995, http://chinon.thomsoncsf.fr/ptolemy/papers/tcpSim/tcp_des.ps.gz [31] Liming Wei, The Design of The USC PIM SIMulator (PIMSIM), Technical Report 95-604, University of Southern California Computer Science, Los Angeles, CA 90089-0781, August 1995, http://catarina.usc.edu/lwei/TR-95-604.ps.gz [32] Mark Carson, Application and Protocol Testing Through Network Emulation, September 1997, http://snad.ncsl.nist.gov/itg/nistnet [33] Jong-Suk Ahn and Peter B. Danzig, Speedup vs. Simulation Granularity, IEEE/ACM Transaction on Networking, Vol 4 No 5, October 1996, pp. 743-757 [34] Lili Qiu, Fast Network Simulation, work in progress, http:// www.cs.cornell.edu/cnrg/session_sim.html [35] D. 
Schwetman, Hybrid Simulation Models of Computer Systems, Communication of the ACM, September 1978 [36] Nancy Cheung and Andrew Parker, Optimising Simulation for Large Networks, work in progress, http://ana-www.lcs.mit.edu/anaweb/mesh/mesh-proj-curr.html#sim [37] D. Waitzman, S. Deering, and C. Partridge, Distance Vector Multicast Routing Protocol, RFC1075, November 1988 [38] Estrin, D. Farinacci, A. Helmy, V. Jacobson, and L. Wei, Protocol Independent Multicast - Dense Mode (PIM-DM): Protocol Specification, Proposed Experimental RFC, September 1996 [39] A. J. Ballardie, P. F. Francis, and J. Crowcroft, Core Based Trees, In Proceedings of the ACM SIGCOMM, San Francisco, 1993 [40] S. Deering, D. Estrin, D. Farinacci, M. Handley, A. Helmy, V. Jacobson, C. Liu, P. Sharma, D. Thaler, and L. Wei, Protocol Independent Multicast - Sparse Mode (PIM-SM): Motivation and Architecture, Proposed Experimental RFC, September 1996 [41] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and L. Wei, An Architecture for Wide-Area Multicast Routing. In Proceedings of the ACM SIGCOMM, London 1994 [42] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and L. Wei, The PIM Architecture for Wide-Area Multicast Routing. ACM Transactions on Networks, April 1994 [43] D. Estrin, D. Farinacci, A. Helmy, D. Thaler, S. Deering, M. Handley, V. Jacobson, C. Liu, P. Sharma, and L. Wei, Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification, Proposed Experimental RFC, September 1996 125 [44] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, L. Wei, P. Sharma, and A. Helmy, Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification, Internet Draft, December 1995 [45] Deborah Estrin, Mark Handley, Ahmed Helmy, Polly Huang, David Thaler, A Dynamic Bootstrap Mechanism for Rendezvous-based Multicast Routing, Technical Report USC CS TR97-644, University of Southern California, 1997 [46] J. Moy, MOSPF: Analysis and Experience, RFC1585, March 1994 [47] J. Moy, Multicast Extensions to OSPF, RFC1584, March 1994 [48] E. Zegura, K. Calvert, and M. Donahoo, A Quantitative Comparison of Graph-based Models for Internet Topology, To Appear in Transactions on Networking, 1997 [49] S. Bajaj et al. Improving Simulation for Network Research. Technical Report 99-702, University of Southern California, 1999. Also submitted to CACM. UCB/LBNL/VINT Network Simulator - ns (version 2), http://www-mash.CS.Berkeley.EDU/ns/ [50] Sally Floyd, Van Jacobson, Steven McCanne, Ching-Gung Liu, and Lixia Zhang, A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing, Extended Report, LBNL Technical Report, pages 1-37, September 1995, also submitted for publication in IEEE/ACM Transactions on Networking. [51] S. McCanne, V. Jacobson, and M. Vetterli, Receiver-Driven Layered Multicast, ACM SIGCOMM, Stanford CA, August 1996, pp. 117-130 [52] Mark Handley, An Examination of MBone Performance, USC/ISI Research Report: ISI/RR-97-450, http://north.east.isi.edu/~mjh/mbone.ps [53] Steve Deering, IP Multicast and the MBone: Enabling Live, Multiparty, Multimedia Communication on the Internet, presentation slides, ftp://parcftp.xerox.com/pub/net- research/mbone/mbone-talk-dec95.ps [54] D. DeLucia and K. 
Obraczka, A Multicast Congestion Control Mechanism Using Representatives, Technical Report USC-CS TR 97-651, Department of Computer Science, University of Southern California, May 1997 [55] Van Jacobson and Steven McCanne, vat – LBNL Audio Conferencing Tool, http://www- nrg.ee.lbl.gov/vat [56] Steven McCanne and Van Jacobson, vic – LBNL Video Conferencing Tool, http://www- nrg.ee.lbl.gov/vic [57] Van Jacobson and Steven McCanne, wb – LBNL Whiteboard Tool, http://www- nrg.ee.lbl.gov/wb [58] S. Armstrong, A Freier, and K. Marzullo, Multicast Transport Protocol, RFC1301, February 1992 126 [59] John C. Lin and Sanjoy Paul, RMTP: A Reliable Multicast Transport Protocol, Proceedings of IEEE INFOCOM ‘96, pp. 1414-1424, April 1996 [60] R. Yavatkar, J. Griffioen, and M. Sudan, A Reliable Dissemination Protocol for Interactive Collaborative Applications, Proceedings of ACM Multimedia ‘95, 1995 [61] StarBurst Communication Corporation, StarBurst MFTP – An Efficient, Scalable Method for Distributing Information Using IP Multicast, http://www.starburstcom.com/while.htm [62] D. Mills, Network Time Protocol (v3), RFC 1305, April 1992, Obsoletes RFC 1119 [63] Ching-Gung Liu, A Scalable Reliable Multicast Protocol, PhD. Dissertation Proposal, September 1996 [64] Ching-Gung Liu, Deborah Estrin, Scott Shenker, and Lixia Zhang, Local Error Recovery in SRM: Comparison of Two Approaches, Submitted for Publication in IEEE/ACM Transactions on Networking, February 1997 [65] Puneet Sharma, Deborah Estrin, Sally Floyd, and Lixia Zhang, Scalable Session Messages in SRM, Submitted for Publication, http://netweb.usc.edu/vint/papers/ssession.ps, University of Southern California, August, 1997 [66] Kannan Varadhan, Deborah Estrin, and Sally Floyd, Impact of Network Dynamics on End- to-End Protocols: Case Studies in Reliable Multicast, Submitted for Review to the Third IEEE symposium on Computers and Communications, http://netweb.usc.edu/vint/papers/dynamics.ps, University of Southern California, August, 1997 [67] D. Thaler, D. Estrin, and D. Meyer, Border Gateway Multicast Protocol (BGMP): Protocol Specification, Internet Draft, IDMR Working Group, draft-ietf-idmr-gum-01.txt, October, 1997 [68] F. H. Desbrandes, S. Bertolotti, and L. Dunand, OPNET 2.4: An Environment for Communication Network Modeling and Simulation, In Proceedings of European Simulation Symposium, October 1993 [69] Armin R. Mikler, Johnny S. K. Wong and Vasant Honavar, An Object Oriented Approach to Simulating Large Communication Networks, Journal of System Software, Vol 40, 1998, pp. 151-164 [70] Peter B. Danzig and Sugih Jamin, tcplib: A Library of TCP Internetwork Traffic Characteristics, Technical Report USC-CS-91-495, Computer Science Department, University of Southern California, Los Angeles, CA 90089-0781, 1991 [71] L. Zhang, S. Deering, D. Estrin, S. Shenker and D. Zappala. RSVP: A New Resource ReSerVation Protocol, IEEE Network, September 1993 [72] L.R. Ford ad D.R. Fulkerson. Flows in Networks. Princeton University Press, Princeton, NJ, U.S.A. 1962 127 [73] C.L. Hedrick. Routing Information Protocol, RFC 1058 edition, 1988. (Updated by RFC 1388, RFC1723) (Status: HISTORIC) [74] J.M. McQuillan, G. Falk, and I. Richer. A Review of the Development and Performance of the ARPANET Routing Algorithm. IEEE Transactions on Communications, COM- 26(12):1802-1811, December 1978 [75] W.D. Tajibnapis. A Correctness Proof of a Topology Information Maintenance Protocol for a Distributed Computer Network. Communications of the ACM, 20(7), 1977 [76] Y. Rekhter. 
Inter-Domain Routing Protocol (IDRP). Internetworking: Research and Experience, 4:61-80, 1993 [77] Y. Rekhter and T. Li. A Border Gateway Protocol 4 (BGP-4), RFC 1771 edition, 1995 (Obsoletes RFC 1654) (Status: DRAFT STANDARD) [78] J. Moy. OSPF Version 2, RFC 1583 edition, 1994. (Obsoletes RFC 1247) ( Obsolete by RFC 2178) (Status: DRAFT STANDARD) [79] J. McQuillan, I. Richer, and E.C. Rosen. The New Routing Algorithm for the ARPANET. IEEE Transactions on Communications, COM-28(5):711-719, May 1980 [80] E. W. Dijkstra. A Note on Two Problems in Connection with Graphs. Numerical Mathematics, 1:269-271, 1959 [81] L.S. Brakmo and L.L. Peterson. Experiences with Network Simulation. In Proceedings of the ACM SIGMETRICS, May 1996 [82] L.S. Brakmo, S. O’Malley, and L.L. Peterson. TCP Vegas: New Techniques for Congestion Detection and Avoidance. In Proceedings of the ACM SIGCOMM, 1994 [83] J. Postel. Transmission Control Protocol, RFC 793 edition, 1981. (Status: STANDARD) [84] P. Mishra and H. Kanakia. A Hop-by-hop Rate-based Congestion Control Scheme. ACM SIGCOMM, 1992 [85] D.E. Comer and R.S. Yavatkar. A Rate-based Congestion Avoidance and Control Scheme for Packet Switched Network. Proc. of the ICDCS, IEEE, 1990 [86] D.D. Clark, M.L. Lambert, and L.Zhang. NETBLT: A High Throughput Transport Protocol, ACM SIGCOMM, August 1988 [87] S. Keshav. A Control-theoretic Approach to Congestion Control. ACM SIGCOMM, pages 3-16, September 1991 [88] S. Jacobs and A. Eleftheriadis. Real-time Dynamic Rate Shaping and Control for Internet Video Applications. Workshop on Multimedia Signal Processing, pages 23-25, June 1997 [89] S. Cen, C. Pu, and J. Walpole. Flow and Congestion Control for Internet Streaming Applications, Proceedings Multimedia Computing and Networking, January 1998 128 [90] Z. Chen, S-M Tan, R.H. Campbell, and Y. Li. Real Time Video and Audio in the World Wide Web. Fourth International World Wide Web Conference, December 1995 [91] P. Abry and D. Veitch. Wavelet Analysis of Long-range Dependent Traffic. IEEE Transactions on Information Theory, 44:2-15, 1998 [92] P. Barford and M. E. Crovella. Generating representative web workloads for network and server performance evaluation. In Proc. of Performance ‘98/ACM Sigmetrics ’98, pages 151-160, 1998 [93] M. E. Crovella and A. Bestavros. Self-similarity in World Wide Web Traffic – Evidence and Possible Causes. In Proc. of ACM Sigmetrics ’96, pages 160-169, 1996 [94] A. Feldmann, R. Caceres, F. Douglis, and M. Rabinovich. Performance of web proxy caching in heterogeneous bandwidth environments. In Proc. of IEEE INFOCOM, 1999 [95] J. C. Mogul, F. Douglis, A. Feldmann, and B. Krishnamurthy. Potential benefits of delta encoding and data compression for HTTP. In Proc. of ACM/SIGCOMM ’97, pages 181- 194, 1997 [96] W3C, 1998. Web Characterization Working Group. [97] M. Lottor. http://www.nw.com/zone/hosts.gif. January 1999 [98] A. Feldmann, A. C. Gilbert, and W. Willinger. Data networks as cascades: Investigating the multifractal nature of Internet WAN traffic. In Proc. of the ACM/SIGCOMM ’98, pages 25-38. 1998 [99] A. Feldmann, A. C. Gilbert, P. Huang, and W. Willinger. Dynamics of IP traffic: A study of the role of variability and the impact of control. To appear in the Proc. of ACM/SIGCOMM ’99. 1999 [100] V. Paxson and S. Floyd. Why we don’t know how to simulate the Internet. In the Proc. of the 1997 Winter Simulation Conference, Atlanta, GA 1997 [101] W. Willinger and V. Paxson. Where Mathematics meets the Internet. 
In Notes of the American Mathematical Society, 45(8), pp. 961-970, September 1998 [102] P. Huang, D. Estrin, and J. Heidemann. Enabling large-scale simulations: Selective abstraction approach to the study of multicast protocols. In Proceedings of the Sixth International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS ’98), Montreal, Canada, July 1998 [103] P. Huang. Enabling large-scale simulation. Thesis proposal, Computer Science Department, University of Southern California, April 1998, http://netweb.usc.edu/huang/publication/quals-paper.ps.gz [104] P. Huang, A. Feldmann, A. C. Gilbert, and W. Willinger. An Informal Validation Case Study: Using A Simulator to Isolate Correlation and Causality in Measured Network Traffic. DARPA/NIST Workshop on Validation of Large-scale Network Models and Simulation, April 1999 129 [105] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson. On the Self-similar Nature of Ethernet Traffic (extended version). IEEE/ACM Transactions on Networking, 2:1-15, 1994 [106] T.H. Cormen, C.E. Leiserson, and R.L. Rivest. Introduction to Algorithms. ISBN 0-262- 03141-8 (MIT Press). ISBN 0-07-013143-0 (McGraw-Hill). QA 76.6.C662. 1989 [107] J. Mogul. Observing TCP Dynamics in Real Networks. In Proc of the ACM SIGOMM, pages 305-317, 1992 [108] V. Paxson. End-to-end Internet Packet Dynamics. In proc ACM SIGCOMM, pages 139- 152. 1997 [109] W.R. Stevens. TCP/IP Illustrated Volume 1. Addison-Wesley, 1994 [110] D. Wetherall. Object-oriented Tcl. ftp://ftp.tns.lcs.mit.edu/pub/otcl/doc/tutorial.html [111] R. Rejaie, M. Handley, and D. Estrin. RAP: An End-to-end Rate-based Congestion Control Mechanism for Realtime Streams in the Internet. IEEE Infocom 99, New York, NY, March 1999 [112] S. Raman, S. Shenker, and S. McCanne. Asymptotic Scaling Behavior of Global recovery in SRM. In Proc of ACM SIGMETRIC 98/PERFORMANCE 98 Joint International Conference on Measurement & Modeling of Computer Systems. Madison, WI, USA, June 1998 [113] H. Yu, L. Breslau, and S. Shenker. A Scalable Web Cache Consistency Architecture. In Proc of ACM SIGCOMM, Boston, MA. September 1999 [114] L. Zhang, S. Michel, S. Floyd, and V. Jacobson. Adaptive Web Caching: Towards a New Global Caching Architecture. The Third International Caching Workshop, June 1998 [115] A. Chankhunthod, P.B. Danzig, C. Neerdaels, D Wessels, M.F. Schwartz, and E. Tsai. A Hierarchical Internet Object Cache. USENIX 1996 [116] RealPlayer. http://www.realplayer.com [117] J. Ahn, P.B. Danzig, Z. Liu, and E. Yan. TCP Vegas and WAN Emulator. In Proc. of ACM SIGCOMM, 1995 [118] J. Heidemann, K. Obraczka, and J. Touch. Modeling the Performance of HTTP Over Several Transport Protocols. To appear, IEEE/ACM Transactions on Networking, June 1997