been designed and fabricated in 1.0 μm p-well CMOS technology. A new quasi NP
domino logic structure has been adopted to increase the throughput rate, and special
pipeline structures have been used in the accumulator to reduce the total latency. The chip
complexity is approximately 10,000 transistors and the die area is 2.5 mm × 3.7 mm. The
measured maximum clock rate is 200 MHz (i.e., 200 million multiply-accumulate operations …