Title of article :
Analysis of the parallel packet switch architecture
Author/Authors :
S. Iyer, N. W. McKeown
Issue Information :
Journal with serial number, year 2003
Pages :
11
From page :
314
To page :
324
Abstract :
Our work is motivated by the desire to design packet switches with large aggregate capacity and fast line rates. We consider building a packet switch from multiple lower speed packet switches operating independently and in parallel. In particular, we consider a (perhaps obvious) parallel packet switch (PPS) architecture in which arriving traffic is demultiplexed over k identical lower speed packet switches, switched to the correct output port, then recombined (multiplexed) before departing from the system. Essentially, the packet switch performs packet-by-packet load balancing, or inverse multiplexing, over multiple independent packet switches. Each lower speed packet switch operates at a fraction of the line rate R. For example, each packet switch can operate at rate R/k. It is a goal of our work that all memory buffers in the PPS run slower than the line rate. Ideally, a PPS would share the benefits of an output-queued switch, i.e., the delay of individual packets could be precisely controlled, allowing the provision of guaranteed qualities of service. In this paper, we ask the question: is it possible for a PPS to precisely emulate the behavior of an output-queued packet switch with the same capacity and with the same number of ports? We show that it is theoretically possible for a PPS to emulate a first-come first-served (FCFS) output-queued (OQ) packet switch if each lower speed packet switch operates at a rate of approximately 2R/k. We further show that it is theoretically possible for a PPS to emulate a wide variety of quality-of-service queueing disciplines if each lower speed packet switch operates at a rate of approximately 3R/k. It turns out that these results are impractical because of high communication complexity, but a practical high-performance PPS can be designed if we slightly relax our original goal and allow a small fixed-size coordination buffer running at the line rate in both the demultiplexer and the multiplexer.
We determine the size of this buffer and show that it can eliminate the need for a centralized scheduling algorithm, allowing a fully distributed implementation with low computational and communication complexity. Furthermore, we show that if the lower speed packet switch operates at a rate of R/k (i.e., without speedup), the resulting PPS can emulate an FCFS-OQ switch within a delay bound.
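The per-layer rates quoted in the abstract can be worked through with a little arithmetic. The following is a minimal sketch, not the authors' code: the function name `layer_rates`, the dictionary keys, and the example values (R = 10 Gb/s, k = 8) are assumptions made for illustration, and the 2R/k and 3R/k figures are the approximate rates stated in the abstract.

```python
from fractions import Fraction


def layer_rates(line_rate, k):
    """Approximate rate of each lower speed switch for the three regimes
    named in the abstract, given line rate R and k parallel switches."""
    if k < 1:
        raise ValueError("k must be a positive number of parallel switches")
    base = Fraction(line_rate) / k  # R/k: each switch carries 1/k of the line rate
    return {
        "no_speedup": base,    # R/k: FCFS-OQ emulation within a delay bound
        "fcfs_oq": 2 * base,   # ~2R/k: emulation of an FCFS output-queued switch
        "qos": 3 * base,       # ~3R/k: emulation of QoS queueing disciplines
    }


# Hypothetical example: R = 10 Gb/s spread over k = 8 lower speed switches.
rates = layer_rates(10, 8)
for regime, rate in rates.items():
    print(f"{regime}: {float(rate):.2f} Gb/s")
```

Under these approximate figures, a rate of 3R/k is below the line rate R only when k exceeds 3, so the goal of keeping every buffer slower than the line rate requires more than three parallel switches in the quality-of-service case.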
Keywords :
Parallel packet switch, inverse multiplexing, load balancing, output-queued switch, quality of service
Journal title :
IEEE/ACM Transactions on Networking
Serial Year :
2003
Record number :
92151