The Optimal Extraction Method of Adder Sharing Component for Inner Product and its Application to DCT Design

내적연산을 위한 가산기 공유항의 최적 추출기법 제안 및 이를 이용한 DCT 설계

  • Im, Guk-Chan (Dept. of Electronics Engineering, Kyunghee University) ;
  • Jang, Yeong-Jin (Dept. of Electronics Engineering, Kyunghee University) ;
  • Lee, Hyeon-Su (Dept. of Electronics Engineering, Kyunghee University)
  • 임국찬 (경희대학교 전자계산공학과) ;
  • 장영진 (경희대학교 전자계산공학과) ;
  • 이현수 (경희대학교 전자계산공학과)
  • Published : 2001.07.01

Abstract

General DSP algorithms, such as orthogonal transforms and filtering, require a hardware architecture that computes inner products efficiently. The conventional multiply-accumulate (MAC) architecture is costly in silicon area, so multiplierless distributed arithmetic is widely used to implement inner products. This paper proposes a method that minimizes the hardware required by distributed arithmetic by extracting as many shared adder terms as possible, using a Boltzmann machine, a type of neural network, as the optimizer. The extraction process grows more complex as the depth of the inner product increases; by optimizing that process, the proposed method can compose an optimal summation network with the minimum number of full adders (FA) and flip-flops (FF) in a short time. A DCT designed with the proposed method is more efficient than one built on ROM-based distributed arithmetic.
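As a rough illustration of what extracting shared adder terms means, the sketch below models a constant-coefficient inner product as one sum of inputs per coefficient bit position and counts two-input adders before and after sharing. It is a minimal sketch under stated assumptions, not the paper's method: the greedy most-frequent-pair heuristic stands in for the Boltzmann-machine optimization, coefficients are assumed to be non-negative integers, the shift-and-add stage that combines the bit positions is not counted, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def bit_planes(coeffs, width):
    """For each bit position b, the inputs whose coefficient has bit b set."""
    return [{f"x{k}" for k, c in enumerate(coeffs) if (c >> b) & 1}
            for b in range(width)]


def adders_without_sharing(planes):
    """Summing n terms takes n - 1 two-input adders."""
    return sum(max(len(p) - 1, 0) for p in planes)


def greedy_share(planes):
    """Greedy stand-in for the optimizer (NOT the Boltzmann machine):
    repeatedly pull out the pair of terms that occurs in the most bit planes
    and compute it once as a shared partial sum."""
    planes = [set(p) for p in planes]
    shared = []  # (new symbol, (term a, term b)) in creation order
    while True:
        counts = Counter(pair
                         for p in planes
                         for pair in combinations(sorted(p), 2))
        if not counts:
            break
        pair, uses = counts.most_common(1)[0]
        if uses < 2:  # sharing only pays off when a pair is reused
            break
        sym = f"s{len(shared)}"
        shared.append((sym, pair))
        for p in planes:
            if pair[0] in p and pair[1] in p:
                p -= set(pair)
                p.add(sym)
    adders = len(shared) + sum(max(len(p) - 1, 0) for p in planes)
    return planes, shared, adders


def evaluate(planes, shared, xs):
    """Recompute the inner product from the (possibly shared) network."""
    vals = {f"x{k}": x for k, x in enumerate(xs)}
    for sym, (a, b) in shared:
        vals[sym] = vals[a] + vals[b]
    return sum((1 << b) * sum(vals[t] for t in p)
               for b, p in enumerate(planes))


if __name__ == "__main__":
    coeffs = [5, 7, 3, 6]  # illustrative coefficients, not taken from the paper
    planes = bit_planes(coeffs, 3)
    reduced, shared, adders = greedy_share(planes)
    print(adders_without_sharing(planes), adders)
    print(evaluate(reduced, shared, [1, 2, 3, 4]))
```

For these illustrative coefficients the partial sum x0 + x1 appears in two bit positions, so computing it once removes one adder. A greedy pass like this can miss the best combination when candidate pairs overlap, which is the kind of search problem a global optimizer is meant to address.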
