Hardware Design of High Performance Arithmetic Unit with Processing of Complex Data for Multimedia Processor

Hardware Design of a High-Performance Arithmetic Circuit for Multimedia Processors Capable of Processing Complex Data

  • Received : 2015.11.23
  • Accepted : 2016.01.04
  • Published : 2016.01.31

Abstract

This paper presents the design of a high-performance arithmetic unit that efficiently accelerates a number of algorithms for multimedia applications. The 3-stage pipelined arithmetic unit executes 38 operations on complex and fixed-point data by using an efficient configuration of four 16-bit by 16-bit multipliers, a new sign-extension method for carry-save data, and a correction-constant scheme that eliminates sign extension when compressing multiple partial multiplication results. The arithmetic unit operates at about 300 MHz and occupies about 37,000 gates in 45 nm CMOS technology, and its estimated performance is 300 MCOPS (Million Complex Operations Per Second). Because the arithmetic unit has a high processing rate and supports a number of operations dedicated to various applications, it can be applied efficiently to multimedia processors.
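As a rough illustration of why four 16-bit by 16-bit multipliers suit complex data, the sketch below computes one complex product from four real products. It is a behavioral model only, not the paper's datapath: the function names `to_signed16`, `mul16`, and `complex_mul` are hypothetical, and rounding, saturation, and pipelining are ignored.

```python
def to_signed16(x: int) -> int:
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def mul16(a: int, b: int) -> int:
    """Model one signed 16x16 multiplier producing a 32-bit product."""
    return to_signed16(a) * to_signed16(b)


def complex_mul(ar: int, ai: int, br: int, bi: int) -> tuple[int, int]:
    """(ar + j*ai) * (br + j*bi) using four real multiplications.

    Each output is the sum or difference of two 32-bit products,
    so it can need up to 33 bits before any rounding or saturation.
    """
    p0 = mul16(ar, br)  # real * real
    p1 = mul16(ai, bi)  # imag * imag
    p2 = mul16(ar, bi)  # real * imag
    p3 = mul16(ai, br)  # imag * real
    return p0 - p1, p2 + p3


if __name__ == "__main__":
    print(complex_mul(3, 4, 5, -6))
```

With all four products available in the same cycle, one complex multiplication can be issued per clock, which is consistent with the stated 300 MCOPS at about 300 MHz.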

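In this paper, a high-performance arithmetic circuit was designed for high-speed processing of multimedia algorithms. The arithmetic circuit, which operates as a 3-stage pipeline, can process 38 operations on complex data and variable-length fixed-point data by using an efficient configuration of four 16-bit × 16-bit multipliers, a new sign-extension technique for carry-save data, and a correction-constant technique that removes sign extension from the merging of multiple partial multiplication results. The designed processor has a maximum operating frequency of 300 MHz in a 45 nm CMOS process, consists of about 37,000 gates, and delivers 300 MCOPS. With its high operation speed and its support for a variety of application-specific operations, the arithmetic processor can be applied efficiently to multimedia processors.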
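The correction-constant idea mentioned in the abstract can be sketched in its generic textbook form; the paper's exact scheme may differ. A signed n-bit value with its sign bit inverted, read as unsigned, equals the original value plus 2^(n-1). Several such operands can therefore be summed without sign extension, and a single precomputed constant removes the accumulated bias. The function name `sum_with_correction` and the widths `n` and `width` are assumptions chosen for illustration.

```python
def sum_with_correction(values: list[int], n: int = 32, width: int = 40) -> int:
    """Add signed n-bit operands in a width-bit adder without sign extension.

    Inverting the sign bit of each operand adds a bias of 2^(n-1) and makes
    the operand non-negative, so its upper bits are all zero. One constant,
    -(count * 2^(n-1)) mod 2^width, cancels the total bias.
    """
    mask_n = (1 << n) - 1
    mask_w = (1 << width) - 1
    sign = 1 << (n - 1)

    # Correction constant, known as soon as the operand count is fixed.
    acc = (-len(values) * sign) & mask_w

    for v in values:
        biased = (v & mask_n) ^ sign  # n-bit pattern with inverted sign bit
        acc = (acc + biased) & mask_w

    # Reinterpret the width-bit result as a signed number.
    return acc - (1 << width) if acc >> (width - 1) else acc


if __name__ == "__main__":
    operands = [-5, 7, -100000, 123456]
    print(sum_with_correction(operands), sum(operands))
```

In hardware, the benefit is that the compression tree does not have to replicate each operand's sign bit across the upper columns; only the fixed constant occupies them.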
