Design of Asynchronous 16-Bit Divider Using NST Algorithm

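  • 이우석 (Department of Semiconductor Engineering, Chungbuk National University)
  • 박석재 (Department of Semiconductor Engineering, Chungbuk National University)
  • 최호용 (School of Electrical and Computer Engineering, Chungbuk National University)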
  • Published : 2003.03.01

Abstract

This paper describes an efficient design of an asynchronous 16-bit divider based on the NST (new Svoboda-Tung) algorithm. The asynchronous design scheme reduces power consumption because the division operation is performed only when it is requested. The divider consists of three blocks connected in an asynchronous pipeline structure: a pre-scale block, an iteration step block, and an on-the-fly converter block. The pre-scale block uses a new dedicated subtractor to achieve small area and high performance. The iteration step block is an asynchronous ring structure with four division steps, which reduces area. To reduce hardware overhead in the ring structure, only the part on the critical path is implemented as a dual-rail circuit, and the remaining part is implemented as a single-rail circuit. The on-the-fly converter block uses the on-the-fly algorithm, which allows it to operate in parallel with the iteration step block, for high performance. Design results in a 0.6 μm CMOS process show that the divider uses 12,956 transistors in an area of 1,480 × 1,200 μm², with an average-case delay of 41.7 ns.
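The on-the-fly conversion mentioned in the abstract can be illustrated with a small software sketch. It is not the paper's circuit: it assumes radix-2 quotient digits drawn from {-1, 0, 1}, delivered most significant digit first, which the abstract does not state. The function names `on_the_fly_convert` and `signed_digit_value` are hypothetical. The idea is to keep two running values, Q and QM = Q - 1, so that each new digit is absorbed by selecting and shifting one of them, with no carry or borrow propagation. This is what allows conversion to proceed in parallel with digit generation.

```python
def signed_digit_value(digits):
    """Reference value of a radix-2 signed-digit string, MSB first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


def on_the_fly_convert(digits):
    """Convert signed digits in {-1, 0, 1} (MSB first) to an ordinary integer.

    Two registers are maintained: q holds the value converted so far and
    qm holds q - 1. Each update only selects one register, doubles it,
    and sets the new low bit, so no carry chain is needed.
    Assumption: radix-2 digit set {-1, 0, 1}; not confirmed by the abstract.
    """
    q, qm = 0, -1  # invariant: qm == q - 1
    for d in digits:
        if d == 1:
            q, qm = 2 * q + 1, 2 * q
        elif d == 0:
            q, qm = 2 * q, 2 * qm + 1
        elif d == -1:
            q, qm = 2 * qm + 1, 2 * qm
        else:
            raise ValueError("digit must be -1, 0, or 1")
    return q


# Example: digits 1, -1, 0, 1 represent 8 - 4 + 0 + 1.
result = on_the_fly_convert([1, -1, 0, 1])
```

In hardware the two registers would be updated by multiplexers and a one-bit append rather than by multiplication; the arithmetic form is used here only to keep the invariant easy to check.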
