Performance Analysis of Cache and Internal Memory of a High Performance DSP for an Optimal Implementation of Motion Picture Encoder

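  • Lim Se-Hun (School of Information, Communication and Electronic Engineering, Soongsil University) ;
  • Chung Sun-Tae (School of Information, Communication and Electronic Engineering, Soongsil University)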
  • Published : 2008.05.31


High-performance DSPs usually provide both cache and internal memory. An optimal implementation of a multimedia stream application on such a DSP must use both efficiently. In this paper, we analyze cache performance and the internal memory configuration and placement needed for an optimal implementation of multimedia stream applications, such as a motion picture encoder, on the high-performance TMS320C6000 series DSPs, and we propose strategies for improving cache performance and internal memory placement. Analysis and experiments verify that configuring the L2 cache as 2-way and using the remaining memory as internal memory gives relatively good performance. They also show that the L1P cache hit rate improves when frequently called routines, together with the routines in caller-callee relationships with them, are placed contiguously in internal memory, and that the L1D cache hit rate improves with a simple change of the data size. These results are expected to contribute to the optimal implementation of multimedia stream applications on high-performance DSPs.
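
The effect of code placement on the instruction cache hit rate can be illustrated with a toy model. The sketch below is not the paper's experiment: it assumes a direct-mapped cache of 4 KB with 64-byte lines, two hypothetical 1 KB routines named `caller` and `callee`, and a fetch model in which each call touches every cache line of the routine once. It only shows why two routines whose addresses differ by a multiple of the cache size evict each other, while contiguous placement lets both stay resident.

```python
# Toy direct-mapped instruction cache model (all parameters are assumptions).
LINE_SIZE = 64            # bytes per cache line (assumed)
CACHE_SIZE = 4 * 1024     # total cache capacity in bytes (assumed)
NUM_SETS = CACHE_SIZE // LINE_SIZE


def simulate(layout, call_sequence):
    """Return (hits, misses) counted per cache line fetched.

    layout maps a routine name to (start_address, size_in_bytes).
    Each call is modeled as fetching every line of the routine once.
    """
    tags = [None] * NUM_SETS      # tag currently stored in each set
    hits = misses = 0
    for name in call_sequence:
        start, size = layout[name]
        first_line = start // LINE_SIZE
        last_line = (start + size - 1) // LINE_SIZE
        for line in range(first_line, last_line + 1):
            index = line % NUM_SETS   # set selected by the address
            tag = line // NUM_SETS    # identifies which line occupies the set
            if tags[index] == tag:
                hits += 1
            else:
                misses += 1
                tags[index] = tag     # evict whatever was there
    return hits, misses


def hit_rate(layout, call_sequence):
    hits, misses = simulate(layout, call_sequence)
    return hits / (hits + misses)


# Hypothetical workload: a caller and its callee invoked alternately.
calls = ["caller", "callee"] * 1000

# Scattered: start addresses 4 KB apart, so both routines map to the same sets.
scattered = {"caller": (0x0000, 1024), "callee": (0x1000, 1024)}
# Contiguous: the callee starts right after the caller, so no sets are shared.
contiguous = {"caller": (0x0000, 1024), "callee": (0x0400, 1024)}

scattered_rate = hit_rate(scattered, calls)
contiguous_rate = hit_rate(contiguous, calls)

if __name__ == "__main__":
    print(f"scattered placement hit rate:  {scattered_rate:.3f}")
    print(f"contiguous placement hit rate: {contiguous_rate:.3f}")
```

In this model the scattered layout misses on every line because each routine evicts the other, whereas the contiguous layout misses only on the first call of each routine. A real encoder has many routines of varying size and a more complex fetch pattern, so the numbers here indicate the direction of the effect, not its magnitude.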


High Performance DSP; Motion Picture Encoder; Cache; Internal Memory; SW Optimization

