Hardware Design of Pipelined Special Function Arithmetic Unit for Mobile Graphics Application

  • Received : 2013.06.18
  • Reviewed : 2013.07.18
  • Published : 2013.08.31

Abstract

To execute 3D graphics APIs such as OpenGL and Direct3D efficiently, a floating-point special function arithmetic unit (SFU) is designed that supports sine, cosine, reciprocal, inverse square root, base-two exponential, and logarithmic operations. To achieve both high-speed operation and an error of less than 2 ulp (units in the last place), the SFU combines a second-order minimax approximation method with a table lookup method. Implemented with a 65 nm CMOS standard cell library, the circuit has a maximum delay of about 2.3 ns and consists of about 23,300 gates. With a peak performance of 400 MFLOPS and high accuracy, the unit is well suited to mobile 3D graphics applications.
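The combination of table lookup and second-order approximation described in the abstract can be illustrated with a minimal software sketch for the reciprocal on [1, 2). This is not the circuit from the paper. The abstract gives no table sizes, coefficient widths, or fitting procedure, so the index width (`index_bits`), all function names, and the use of double-precision arithmetic are assumptions made for illustration. The sketch also interpolates at Chebyshev nodes, which is only a near-minimax stand-in for a true minimax (Remez) fit.

```python
import math


def chebyshev_nodes(a, b, n=3):
    """Return n Chebyshev nodes on the interval [a, b]."""
    mid, half = (a + b) / 2.0, (b - a) / 2.0
    return [mid + half * math.cos((2 * k + 1) * math.pi / (2 * n))
            for k in range(n)]


def fit_quadratic(f, a, b):
    """Fit c0 + c1*t + c2*t^2 (t = x - a) through f at 3 Chebyshev nodes.

    Assumption: Chebyshev-node interpolation approximates, but does not
    equal, the minimax polynomial a hardware design would store.
    """
    x0, x1, x2 = chebyshev_nodes(a, b)
    y0, y1, y2 = f(x0), f(x1), f(x2)
    # Newton divided differences.
    d01 = (y1 - y0) / (x1 - x0)
    d12 = (y2 - y1) / (x2 - x1)
    d012 = (d12 - d01) / (x2 - x0)
    # Re-express the Newton form in powers of t = x - a.
    u0, u1 = x0 - a, x1 - a
    c2 = d012
    c1 = d01 - d012 * (u0 + u1)
    c0 = y0 - d01 * u0 + d012 * u0 * u1
    return c0, c1, c2


def build_table(f, index_bits):
    """Build one (c0, c1, c2) entry per segment of [1, 2)."""
    n = 1 << index_bits
    return [fit_quadratic(f, 1.0 + i / n, 1.0 + (i + 1) / n)
            for i in range(n)]


def evaluate(table, x, index_bits):
    """Approximate f(x) for x in [1, 2) by lookup plus a quadratic."""
    n = 1 << index_bits
    idx = int((x - 1.0) * n)        # upper bits of the fraction: table index
    t = x - (1.0 + idx / n)         # remaining lower bits: offset in segment
    c0, c1, c2 = table[idx]
    return c0 + t * (c1 + t * c2)   # Horner evaluation


def max_abs_error(f, table, index_bits, samples=1 << 14):
    """Largest absolute error over evenly spaced samples in [1, 2)."""
    return max(abs(evaluate(table, 1.0 + i / samples, index_bits)
                   - f(1.0 + i / samples))
               for i in range(samples))


if __name__ == "__main__":
    bits = 7  # hypothetical index width, not taken from the paper
    recip = build_table(lambda x: 1.0 / x, bits)
    print("entries:", len(recip))
    print("max abs error:", max_abs_error(lambda x: 1.0 / x, recip, bits))
```

With 128 segments, the interpolation error bound for 1/x on [1, 2) is about 2^-26, which is below the spacing of single-precision results in that range. A hardware unit would additionally have to account for fixed-point coefficient truncation and final rounding, which this sketch ignores.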
