output queuing require O(log N) iterations to achieve good performance. If the hardware
cannot complete the required number of iterations within one cell duration,
the matching process can be pipelined to obtain a matching in every cell time slot. However,
existing approaches incur a substantial latency penalty due to the way the pipelining is
performed, which renders them unattractive in latency-sensitive applications such as parallel …