A Kernel Module to Support High-Performance Intra-Node Communication for Multi-Core Systems


  • Hyun-Wook Jin (Dept. of Computer Science and Engineering, Konkuk University);
  • Hyun-Ku Kang (Dept. of Computer Science and Engineering, Konkuk University);
  • Jong-Soon Kim (Dept. of Computer Science and Engineering, Konkuk University)
  • Published : 2007.09.15

Abstract

In parallel cluster computing systems, the efficiency of communication between computing nodes is one of the key factors that determine overall system performance. Accordingly, many researchers have studied high-performance inter-node communication. The recently introduced multi-core processor, however, increases the importance of intra-node communication as well, because the more cores a node has, the more parallel processes run on the same node. Although intra-node communication has been studied, existing work gives limited consideration to state-of-the-art systems. In this paper, we propose a Linux kernel module that minimizes the number of data copies by exploiting the memory-mapping mechanism for high-performance intra-node communication. The proposed kernel module supports Linux kernel version 2.6. Performance measurements on a multi-core system show that the proposed kernel module achieves up to 62% lower latency and up to 144% higher throughput than an existing kernel-module approach. In addition, the measurements reveal that intra-node communication performance can vary significantly depending on whether the cores running the communicating processes belong to the same processor package (i.e., share the L2 cache).

In parallel cluster computing systems, efficient inter-node communication has long been recognized as a key determinant of overall system performance, and much prior research has therefore focused on improving inter-node communication. The recently introduced multi-core processor, however, greatly raises the importance of intra-node communication as well, and various schemes have been proposed to improve this increasingly important path. In this paper, we propose a scheme that minimizes the data copies incurred during intra-node communication with the help of the operating system kernel. The proposed scheme maps a process's communication buffer into the peer process's memory region so that only a single data copy occurs. In particular, the proposed scheme is designed for Linux kernel version 2.6. Performance was measured on a system equipped with a multi-core processor; compared with an existing implementation, the kernel module implemented in this paper improves latency and throughput by up to 62% and 144%, respectively, for small and medium data sizes. We also show that performance can differ depending on which cores the communicating processes run on.
