• Title/Summary/Keyword: OpenMP

Search Result 178, Processing Time 0.035 seconds

Implementation and Translation of Major OpenMP Directives for Chip Multiprocessor without using OS (단일 칩 다중 프로세서상에서 운영체제를 사용하지 않은 OpenMP 구현 및 주요 디렉티브 변환)

  • Jeun, Woo-Chul;Ha, Soon-Hoi
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.4
    • /
    • pp.145-157
    • /
    • 2007
  • OpenMP is an attractive parallel programming model for a chip multiprocessor because there is no standard parallel programming method for a chip multiprocessor and it is easy to write a parallel program in OpenMP. Then, chip multiprocessor systems can have various architectures according to target application programs. So, we need to implement OpenMP in different way for each system. In this paper, we propose the implementation and the effective translation of major OpenMP directives for a chip multiprocessor without using OS to improve the performance without using special hardware and without extending the OpenMP directives. We present the experimental results on our target platform CT3400.

OpenMP Implementation using POSIX thread library on ARM11MPCore (ARM11MPCore에서 POSIX 쓰레드를 이용한 OpenMP 구현)

  • Lee, Jae-Won;Jeun, Woo-Chul;Ha, Soon-Hoi
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2007.10b
    • /
    • pp.414-418
    • /
    • 2007
  • 멀티프로세서 환경에서 OpenMP는 MPI 에 비해 병렬 프로그래밍을 쉽게 할 수 있다는 장점을 가지고 있고, OpenMP는 표준이 없는 병렬 프로그래밍 세계에서 실질적인 표준으로써 인정받고 있다. OPenMP는 대상 플랫폼에 따라 OpenMP 구현을 다르게 해야 하기 때문에 새로운 프로세서가 등장하면 그에 맞는 OpenMP구현을 만들어야 한다. 이 논문에선 다중 프로세서 시스템-온-칩 시스템인 ARM11MPCore 시스템 위에 POSIX 쓰레드에 기반하여 OpenMP 환경을 구축하고 그 성능을 측정한다.

  • PDF

Performance Analysis of a Parallel Mesh Smoothing Algorithm using Graph Coloring and OpenMP (그래프 컬러링과 OpenMP를 이용한 병렬 메쉬 스무딩 알고리즘의 성능 분석)

  • Shin, Myeonggyu;Kim, Jibum
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.53 no.6
    • /
    • pp.80-87
    • /
    • 2016
  • We propose a parallel mesh smoothing algorithm using graph coloring and OpenMP library for shared memory many core computer architectures. The proposed algorithm partitions a mesh into independent sets and performs a parallel mesh smoothing using OpenMP library. We study the effect of using various graph coloring and color reordering algorithms on the efficiency of performing the proposed parallel mesh smoothing algorithm. We also investigate the influence of using various OpenMP loop scheduling methods on the parallel mesh smoothing efficiency.

Performance and Scalability of OpenMP Programs on Chip-MultiThreading Server (칩 멀티쓰레딩 서버에서 OpenMP 프로그램의 성능과 확장성)

  • Lee Myung-Ho;Kim Yong-Kyu
    • The KIPS Transactions:PartA
    • /
    • v.13A no.2 s.99
    • /
    • pp.137-146
    • /
    • 2006
  • Shared Memory Multiprocessor (SMP) systems adopting Chip-level MultiThreading (CMT) technology are becoming mainstream servers in commercial applications and High Performance Computining (HPC) applications as well. OpenMP has become the standard paradigm to parallelize applications for SMP mostly because of its ease of use. As the demand for more computing power in HPC applications is growing rapidly, obtaining high performance and scalability for these applications parallelized using OpenMP API's will become more important. In this paper, we study the performance and scalability of HPC applications parallelized using OpenMP, SPEC OMPL (standard OpenMP benchmark suite), on the Sun Fire E25K server which adopts CMT technology. We also study the effect of CMT on SPEC OMPL.

Efficient Translation of OpenMP Directives for Cluster Systems (클러스터 시스템을 위한 효과적인 OpenMP 디렉티브 변환)

  • 기양석;하순회
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2003.04a
    • /
    • pp.10-12
    • /
    • 2003
  • SMP 클러스터가 고성능 계산을 위한 플랫폼으로 등장함에 따라, 이 시스템을 활용하기 위한 프로그래밍 환경에 대한 관심이 증가하고 있다. 이 논문에서 우리는 ParADE라고 부르는 쉽고, 이식성이 높으며. 고성능의 프로그래밍이 가능한 새로운 프로그래밍 환경을 소개한다. ParADE는 OpenMP 프로그래밍 환경으로 HLRC 변종 프로토콜을 구현한 다중 쓰레드 DSM 시스템을 기반으로 하고 있다. 특별히. 이 논문에서는 성능 개선을 위한 OpenMP 변환기의 역할에 중점을 둔다. OpenMP 변화기는 OpenMP 프로그램 모델과 실행 시스템의 수행 모델 사이에서 가교 역할을 한다. 특히, OpenMP 변환기는 동기화 디렉티브를 변환하고 임계 영역에 있는 작은 변수의 메모리 일관성을 유지하기 위해 집합 통신 함수를 활용한다. 동기화 디렉티브 성능 측정을 위한 마이크로벤치마크 프로그램을 통한 실험에서 ParADE 시스템은 기존의 DSM 시스템에 비해 우수한 성능을 보였다.

  • PDF

Survey and Analysis of OpenMP Specifications (OpenMP 명세에 대한 고찰 및 분석)

  • Lee, Jong-Woo;Park, Chan-Young
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2000.10a
    • /
    • pp.621-624
    • /
    • 2000
  • 메시지 전달 방식과 공유 메모리 방식은 병렬 컴퓨터 시스템을 위한 대표적인 아키텍쳐이다. 이 중 공유 메모리 방식은 프로그래밍의 용이함으로 인해 메시지 전달 방식에 비해 많이 채택되고 있는 실정이다. 하지만 하드웨어 벤더마다 각기 다른 공유 메모리 프로그래밍 인터페이스를 제공하기 때문에, 코드 호환성이 주 관심사인 경우에는 프로그래밍의 불편함을 감수하면서 MPI 나 PVM 등을 이용한 메시지 전달 구조를 채택하는 경우가 자주 발생한다. 본 논문에서는 공유 메모리 병렬 컴퓨터 시스템을 위한 프로그래밍 인터페이스 표준인 OpenMP 명세에 대해 고찰, 분석한 결과를 제시한다. OpenMP 명세의 등장 배경 및 발전 과정 등을 기술하고, OpenMP 명세의 분분별 규정 내용을 요약한다. 또한 OpenMP 명세에 따라 기존 C 프로그램을 수정한 예도 보인다. 본 논문의 목적은 OpenMP 라는 공유 메모리 프로그래밍 인터페이스 표준을 소개하고, 이에 대한 관심을 높임으로써 관련 연구를 활성화시키는데 있다.

  • PDF

ePRO-OMP: A Tool for Performance/Energy PRofiler and Analyzer for OpenMP Applications (ePRO-OMP: OpenMP 응용 프로그램의 성능 및 에너지 분석 도구)

  • Lee, Young-Ho;Kim, Jihong
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.6 no.5
    • /
    • pp.287-293
    • /
    • 2011
  • As chip multiprocessors have been widely adopted in embedded systems, achieving both high performance and low power consumptions of parallel applications becomes challenging. In order to meet these requirements, it is crucial for developers to analyze the performance and energy consumption of parallel applications. In this paper, we propose a tool for profiling and optimizing the performance and energy consumption of OpenMP applications (energy PROfiler and analyzer for OpenMP: ePRO-OMP). The main advantage of ePRO-OMP is that it can analyze both the performance and energy consumption of each parallel region of an OpenMP application, which can help developers find the bottleneck of parallel applications in detail.

A Detection Tool of First Races in OpenMP Programs with Directives (OpenMP 디렉티브 프로그램의 최초경합 탐지를 위한 도구)

  • Kang, Mun-Hye;Ha, Ok-Kyoon;Jun, Yong-Kee
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.37 no.1
    • /
    • pp.1-7
    • /
    • 2010
  • Detecting data races is important for debugging programs with OpenMP directives, because races result in unintended non-deterministic executions of the program. It is especially important to detect the first data races to occur for effective debugging, because the removal of such races may make other affected races disappear or appear. The previous tools for race detecting can not guarantee that detected races are the first races to occur. This paper suggests a tool what detects the first races to occur on the program with nested parallelism using the two-pass on-the-fly technique. To show functionality of this tool, we empirically compare with the previous tools using a set of the synthetic programs with OpenMP directives.

MPI-OpenMP Hybrid Parallelization for Multibody Peridynamic Simulations (다물체 페리다이나믹 해석을 위한 MPI-OpenMP 혼합 병렬화)

  • Lee, Seungwoo;Ha, Youn Doh
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.33 no.3
    • /
    • pp.171-178
    • /
    • 2020
  • In this study, we develop MPI-OpenMP hybrid parallelization for multibody peridynamic simulations. Peridynamics is suitable for analyzing complicated dynamic fractures and various discontinuities. However, compared with a conventional finite element method, nonlocal interactions in peridynamics cost more time and memory. In multibody peridynamic analysis, the costs increase due to the additional interactions that occur when computing the nonlocal contact and ghost interlayer models between adjacent bodies. The costs become excessive when further refinement and smaller time steps are required in cases of high-velocity impact fracturing or similar instances. Thus, high computational efficiency and performance can be achieved by parallelization and optimization of multibody peridynamic simulations. The analytical code is developed using an Intel Fortran MPI compiler and OpenMP in NURION of the KISTI HPC center and parallelized through MPI-OpenMP hybrid parallelization. Further parallelization is conducted by hybridizing with OpenMP threads in each MPI process. We also try to minimize communication operations by model-based decomposition of MPI processes. The numerical results for the impact fracturing of multiple bodies show that the computing performance improves significantly with MPI-OpenMP hybrid parallelization.

A Verification Tool of Data Races in Programs with OpenMP Directives (OpenMP 디렉티브 프로그램을 위한 자료경합 검증도구)

  • Kim, Young-Joo;Jun, Yong-Kee
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.9
    • /
    • pp.395-406
    • /
    • 2007
  • Races in programs with OpenMP directives must be detected for debugging, because they may cause unexpected result by non-deterministic executions. But, Thread Checker of Intel corporation, a well-known existing tool for detecting the races, is not practical because this tool does not verify the existence of races and is known that the cost for race detection is too big. This paper presents a web-based tool which verify the existence of races with an optimal functionality and performance using the results from the property analysis of OpenMP program as well as the user requirements. Our tool is proved to be practical in the aspect of functionality and performance by experiments using synthetic programs, because the suggested tool can verify the existence of race and shows O(n) as the ratio of time consumption while Thread Checker can not verify the existence of race and shows $O(n^2)$ as the ratio, where n is the number of total accesses.