Performance Evaluation and Analysis for Discrete Wavelet Transform on Many-Core Processors

  • 박용훈 (School of Electrical Engineering, University of Ulsan) ;
  • 김종면 (School of Electrical Engineering, University of Ulsan)
  • Received : 2012.06.18
  • Accepted : 2012.08.01
  • Published : 2012.10.31

Abstract

To support the use of the discrete wavelet transform (DWT) on portable devices, this paper implements a 2-level DWT on a reference many-core processor architecture and determines the optimal many-core configuration. To find that configuration, we evaluate how the data-per-processing-element ratio, defined as the amount of data mapped directly to each processing element (PE), affects system performance, energy efficiency, and area efficiency. We evaluated five PE configurations (PEs=16, 64, 256, 1,024, and 4,096), each implemented in 130 nm CMOS technology with a 720 MHz clock frequency. Experimental results indicate that maximum energy and area efficiencies are achieved at PEs=1,024. However, to implement a 2-level DWT on portable devices, the system area must be limited to 140 mm² and the power must not exceed 3 W. Under these constraints, the most reasonable energy and area efficiencies are achieved at PEs=256.
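
The data-per-PE ratio described above can be illustrated with a short calculation. The following Python sketch is not taken from the paper: the 512×512 image size, the even split of pixels across PEs, and the function names are all assumptions made for illustration, since the abstract does not state the input dimensions or the mapping scheme. Only the five PE counts come from the abstract.

```python
# Hypothetical input size; the abstract does not give the image dimensions.
IMAGE_WIDTH = 512
IMAGE_HEIGHT = 512

# The five PE configurations listed in the abstract.
PE_CONFIGS = [16, 64, 256, 1024, 4096]


def data_per_pe(width, height, num_pes):
    """Return the number of pixels mapped to each PE.

    Assumes the image is split evenly across all PEs (an assumption,
    not a detail confirmed by the abstract).
    """
    total_pixels = width * height
    if num_pes <= 0 or total_pixels % num_pes != 0:
        raise ValueError("image does not divide evenly across the PEs")
    return total_pixels // num_pes


def ratios(width, height, configs):
    """Map each PE count to its data-per-PE ratio."""
    return {n: data_per_pe(width, height, n) for n in configs}


if __name__ == "__main__":
    for num_pes, pixels in ratios(IMAGE_WIDTH, IMAGE_HEIGHT, PE_CONFIGS).items():
        print(f"PEs={num_pes:>5}: {pixels:>6} pixels per PE")
```

Under these assumptions, each fourfold increase in the number of PEs reduces the data held by each PE by a factor of four, which is the trade-off the evaluation explores against energy and area.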
