A dual-link CC-NUMA System Tolerant to the Multiprogramming Environment

Suh, Hyo-Joong;

doi:10.3745/KIPSTA.2004.11A.3.199

The KIPS Transactions: Part A (정보처리학회논문지A)

Vol. 11A, No. 3 / pp. 199-206 / 2004 / pISSN 1598-2831

Korea Information Processing Society (한국정보처리학회)

Original Korean title: 다중 프로그램 환경에 적합한 이중 연결 CC-NUMA 시스템

Suh, Hyo-Joong (School of Computer Science and Information Engineering, The Catholic University of Korea)

Published: 2004.06.01

https://doi.org/10.3745/KIPSTA.2004.11A.3.199

Abstract

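In a multiprocessor system where several programs run concurrently, program performance depends on which physical processors each process is assigned to. In general, assigning the processes of one program to processors that are close together in time and space reduces inter-process communication cost and therefore gives the most efficient result. However, the operating system that assigns the processes needs additional processing to take this affinity into account, and because each program actually runs independently, allocating the processes generated by many programs requires a great deal of computation. In a dual-ring CC-NUMA system in particular, the many shared-memory accesses generate a large number of transactions, and the uneven load on the interconnection network causes bottlenecks, so performance varies widely with the process allocation policy. This paper presents a CC-NUMA system that exhibits a uniform interconnection network load and does not require a process allocation policy. The proposed architecture has the same interconnection network cost as the dual-ring architecture and distributes load evenly by using skip links, so its performance does not depend on whether a process allocation policy is used. Verification by program-driven simulation showed that the system improves performance by a factor of 1.5 over the dual-ring CC-NUMA system.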

Under a multiprogrammed workload, the performance of a multiprocessor system is affected by the process allocation policy of the operating system. The lowest communication cost is achieved when related processes are placed on adjacent processors. However, effective allocation is quite difficult in practice, and evaluating the allocation policy itself consumes computation time. Dual-ring CC-NUMA systems exhibit a considerable performance difference depending on the process allocation policy because of the many unbalanced memory transactions on the interconnection network. In this paper, I propose a load-balanced dual-link CC-NUMA system that does not require a process allocation policy. In program-driven simulation, the proposed system shows no remarkable difference according to the allocation policy, while the dual-ring system shows a 10% performance improvement from process allocation. In addition, the proposed system outperforms the dual-ring system by a factor of about 1.5.

