• Title/Abstract/Keywords: Parallel Computing

Performances of Multidisciplinary Design Optimization Methodologies in Parallel Computing Environment

  • 안문열;이세정
    • Transactions of the Korean Society of Mechanical Engineers A / Vol. 31, No. 12 / pp. 1150-1156 / 2007
  • Multidisciplinary design optimization (MDO) methodologies play an essential role in modern engineering design, which involves many inter-related disciplines. These methodologies usually require very long computing times, so design tasks are hard to finish within a specified design cycle. Parallel processing can be used effectively to reduce the computing time. Research on the parallel computing performance of MDO methodologies has only just begun and is still developing. This study investigates the performance of four MDO methodologies (MDF, IDF, SAND, and CO) from the viewpoint of parallel computing. Finally, the best of the four methodologies for parallel processing is suggested.

A Performance Comparison of Parallel Programming Models on Edge Devices

  • 남덕윤
    • IEMEK Journal of Embedded Systems and Applications / Vol. 18, No. 4 / pp. 165-172 / 2023
  • Heterogeneous computing is a technology that uses different types of processors to perform parallel processing. It maximizes task throughput and energy efficiency by leveraging various computing resources such as CPUs, GPUs, and FPGAs. Edge computing, on the other hand, has developed alongside IoT and 5G technologies. It is a form of distributed computing that uses computing resources close to clients, thereby offloading the central server. It has evolved into intelligent edge computing combined with artificial intelligence. Intelligent edge computing enables end-to-end data processing, such as context awareness, prediction, control, and simple processing, for the data collected at the edge. If heterogeneous computing can be applied successfully at the edge, it is expected to maximize job processing efficiency while minimizing dependence on the central server. In this paper, experiments were conducted to verify the feasibility of various parallel programming models on high-end and low-end edge devices by using benchmark applications. We analyzed the performance of five parallel programming models on the Raspberry Pi 4 and Jetson Orin Nano as low-end and high-end devices, respectively. In the experiment, OpenACC showed the best performance on the low-end edge device and OpenSYCL on the high-end device due to the stability and optimization of system libraries.

Realtime Monitoring and Visualization for PDP System

  • 김수자;송은하;박복자;정영식
    • Journal of Korea Multimedia Society / Vol. 7, No. 5 / pp. 755-765 / 2004
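  • Recently, Internet-based distributed/parallel computing that uses the resources of many idle hosts has proven useful for large-scale job processing and for a number of important problems. While a large job is running, real-time monitoring is needed to cope with changes in the performance and state of the participating hosts. This paper introduces real-time monitoring and visualization on PDP (Parallel Distributed Processing), an Internet-based distributed/parallel processing framework built as a global computing infrastructure.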

Performance Comparison of Parallel Programming Frameworks in Digital Image Transformation

  • Shin, Woochang
    • International Journal of Internet, Broadcasting and Communication / Vol. 11, No. 3 / pp. 1-7 / 2019
  • Previously, parallel computing was mainly used in areas requiring high computing performance, but multicore CPUs and GPUs are now widespread, and the advantages of parallel programming can be obtained even in a PC environment. Various parallel programming frameworks for multicore CPUs, such as OpenMP and PPL, have been released. Nvidia and AMD have developed parallel programming platforms and APIs that let program developers take advantage of the multicore GPUs on their graphics cards. In this paper, we develop digital image transformation programs that run on each of the major parallel programming frameworks and measure their execution times. We analyze the characteristics of each framework by comparing the execution times. In addition, a constant K indicating the ratio of program execution time between different parallel computing environments is presented. Using this constant, it is possible to predict a rough execution time without implementing a parallel program.
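The idea of a ratio constant K can be illustrated with a minimal Python sketch. The abstract does not give the exact definition of K, so the sketch assumes K is simply the ratio of the execution times of the same workload in two environments; the function names `estimate_k` and `predict_time` are hypothetical.

```python
def estimate_k(time_env_a: float, time_env_b: float) -> float:
    """Assumed definition: K = (time in environment B) / (time in environment A),
    measured once for the same reference workload."""
    if time_env_a <= 0 or time_env_b <= 0:
        raise ValueError("execution times must be positive")
    return time_env_b / time_env_a


def predict_time(measured_time_env_a: float, k: float) -> float:
    """Rough prediction of the time in environment B from a time measured in A."""
    return k * measured_time_env_a


# Reference workload: 2.0 s in environment A, 0.5 s in environment B.
k = estimate_k(2.0, 0.5)
# A new workload that takes 8.0 s in A is predicted to take k * 8.0 s in B.
predicted = predict_time(8.0, k)
```

Such a prediction is only as good as the assumption that the ratio stays roughly constant across workloads of the same kind.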

A Study for Parallel Computing Efficiency Comparing Numerical Solutions of Battery Pack

  • 김광선;장경민
    • Journal of the Semiconductor & Display Technology / Vol. 15, No. 2 / pp. 20-25 / 2016
  • Parallel computer cluster systems are known as a powerful tool for numerically solving complex physical phenomena. The numerical analysis of a large Li-ion battery pack, which involves complex physical phenomena, requires a large amount of computing time. In this study, numerical analyses were conducted to compare the computing efficiency of a single workstation and a parallel cluster system, both with multicore CPUs. The results show that the parallel cluster system was 80 times faster than the single workstation for the same battery pack model. The performance of the cluster system increased linearly as the number of CPU cores increased.

Stale Synchronous Parallel Model in Edge Computing Environment

  • 김동현;이병준;김경태;윤희용
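    • Proceedings of the Korean Society of Computer Information Conference / Proceedings of the 57th Winter Conference of the Korean Society of Computer Information (2018), Vol. 26, No. 1 / pp. 89-92 / 2018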
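  • This paper proposes a method for efficiently managing the devices of a network composed of many nodes in an edge computing environment. Because the conventional client-server model handles all data and all requests for that data at a central server, it cannot guarantee fast response times when processing the large volumes of data generated by many nodes. Edge computing suits IoT networks because it reduces the network load by sharing the work: instead of consuming network bandwidth to send and receive data, interconnected nodes cooperate to process the data, and allowing data to be processed at the network edge also reduces the load on the data center. Among the various parallel machine learning models, this study applies the Stale Synchronous Parallel (SSP) model to distributed machine learning on edge nodes.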
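As a rough illustration of the SSP idea mentioned in the abstract, the following Python sketch simulates the staleness bound only. It is not the authors' implementation: the tick-based simulation, the function names `can_advance` and `simulate_ssp`, and the worker speeds are assumptions made for illustration. Each worker keeps a clock counting completed iterations, and a worker may run another iteration only while it is at most `staleness` clocks ahead of the slowest worker.

```python
from typing import List


def can_advance(clocks: List[int], worker: int, staleness: int) -> bool:
    """SSP rule (as assumed here): a worker may run its next iteration only if
    it is at most `staleness` completed iterations ahead of the slowest worker."""
    return clocks[worker] - min(clocks) <= staleness


def simulate_ssp(periods: List[int], staleness: int, ticks: int) -> List[int]:
    """Toy simulation: worker i is ready to complete an iteration every
    periods[i] ticks, but is blocked whenever the SSP rule forbids it.
    Returns the number of iterations each worker has completed."""
    clocks = [0] * len(periods)
    for t in range(1, ticks + 1):
        for w, period in enumerate(periods):
            if t % period == 0 and can_advance(clocks, w, staleness):
                clocks[w] += 1
    return clocks


# A fast worker (ready every tick) and a slow worker (ready every 4 ticks).
bounded = simulate_ssp([1, 4], staleness=2, ticks=20)   # fast worker is throttled
lockstep = simulate_ssp([1, 4], staleness=0, ticks=20)  # behaves like a barrier (BSP)
```

With `staleness=0` the two workers advance in lockstep, while a larger bound lets the fast worker run ahead by a limited amount, which is the trade-off SSP exposes between waiting time and stale parameters.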

Fuzzy Inference of Large Volumes in Parallel Computing Environment

  • 김진일;박찬량;이동철;이상구
    • Proceedings of the Korean Institute of Intelligent Systems Conference / Proceedings of the Korea Fuzzy Logic and Intelligent Systems Society 2000 Spring Conference / pp. 13-16 / 2000
  • In fuzzy expert systems or database systems that hold huge volumes of fuzzy data or large sets of fuzzy rules, the inference time increases considerably. Therefore, a high-performance parallel fuzzy computing environment is needed. In this paper, we propose a parallel fuzzy inference mechanism for a parallel computing environment. In this mechanism, fuzzy rules are distributed and executed simultaneously. The ONE_TO_ALL algorithm is used to broadcast the fuzzy input vector to all nodes. The results of the MIN/MAX operations are transferred to the output processor by the ALL_TO_ONE algorithm. By processing fuzzy rules or data in parallel, the parallel fuzzy inference algorithm extracts effective parallelism and achieves a good speedup factor.
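The broadcast, MIN/MAX, and gather pattern described in the abstract can be sketched in Python as follows. This is a minimal sketch under assumptions, not the paper's algorithm: each rule is assumed to be a pair of membership vectors (antecedent, consequent), max-min composition is assumed as the inference operator, and a thread pool stands in for the processing nodes. The names `fire_rules` and `parallel_inference` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

# Assumed rule shape: (antecedent membership vector, consequent membership vector).
Rule = Tuple[Sequence[float], Sequence[float]]


def fire_rules(x: Sequence[float], rules: Sequence[Rule]) -> List[float]:
    """Max-min inference over one node's share of the rules."""
    out = [0.0] * len(rules[0][1])
    for antecedent, consequent in rules:
        # Degree of match between the fuzzy input and the antecedent (MAX of MINs).
        strength = max(min(xi, ai) for xi, ai in zip(x, antecedent))
        # Clip the consequent by the firing strength (MIN), aggregate with MAX.
        for j, cj in enumerate(consequent):
            out[j] = max(out[j], min(strength, cj))
    return out


def parallel_inference(x: Sequence[float], rules: Sequence[Rule],
                       workers: int = 2) -> List[float]:
    """Distribute the rules, give every worker the same input vector
    (the ONE_TO_ALL step), then combine partial results with MAX
    (the ALL_TO_ONE step)."""
    if not rules:
        return []
    chunks = [rules[i::workers] for i in range(workers)]
    chunks = [c for c in chunks if c]  # drop empty shares
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        partials = list(pool.map(lambda c: fire_rules(x, c), chunks))
    return [max(column) for column in zip(*partials)]


x = [0.2, 0.9]
rules = [([1.0, 0.0], [0.5, 1.0]),
         ([0.0, 1.0], [1.0, 0.3]),
         ([0.5, 0.5], [0.4, 0.4])]
result = parallel_inference(x, rules, workers=2)
```

Because MAX is associative and commutative, the combined result does not depend on how the rules are split across workers. Threads are used here only to show the structure; in CPython they would not give a real speedup for this pure-Python loop.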

Performance Optimization of Parallel Algorithms

  • Hudik, Martin;Hodon, Michal
    • Journal of Communications and Networks / Vol. 16, No. 4 / pp. 436-446 / 2014
  • The high intensity of research and modeling in mathematics, physics, biology, and chemistry requires new computing resources. Because of the high computational complexity of such tasks, computing time is long and costly. The most efficient way to increase efficiency is to adopt parallel principles. The purpose of this paper is to present the issue of parallel computing with emphasis on the analysis of parallel systems and on the impact of communication delays on their efficiency and overall execution time. The paper focuses on finite algorithms for solving systems of linear equations, namely matrix manipulation by the Gauss elimination method (GEM). Algorithms are designed for architectures with shared memory (open multiprocessing, OpenMP), distributed memory (message passing interface, MPI), and their combination (MPI + OpenMP). The properties of the algorithms were determined analytically and verified experimentally. Conclusions are drawn for theory and practice.
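To show where the parallelism in Gauss elimination comes from, here is a minimal Python sketch. It is not the paper's OpenMP or MPI code: it assumes a dense system with partial pivoting and uses a thread pool as a stand-in for a shared-memory parallel loop. The function name `solve_gem` and the `workers` parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def solve_gem(a: List[List[float]], b: List[float], workers: int = 4) -> List[float]:
    """Solve a x = b by Gauss elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy [a | b] so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for k in range(n):
            # Partial pivoting: pick the row with the largest entry in column k.
            p = max(range(k, n), key=lambda r: abs(m[r][k]))
            if abs(m[p][k]) < 1e-12:
                raise ValueError("matrix is singular or nearly singular")
            m[k], m[p] = m[p], m[k]
            pivot_row = m[k]

            def eliminate(i: int) -> None:
                # Each row below the pivot is updated independently of the others.
                factor = m[i][k] / pivot_row[k]
                row = m[i]
                for j in range(k, n + 1):
                    row[j] -= factor * pivot_row[j]

            # Parallel region: one task per row; list() waits for all of them,
            # which acts as the synchronization point before the next pivot.
            list(pool.map(eliminate, range(k + 1, n)))
    # Back substitution (inherently sequential from the last row upward).
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


solution = solve_gem([[2.0, 1.0, -1.0],
                      [-3.0, -1.0, 2.0],
                      [-2.0, 1.0, 2.0]],
                     [8.0, -11.0, -3.0])
```

The wait after every pivot step is the kind of synchronization cost the abstract refers to: in a distributed-memory version the pivot row would also have to be communicated to every process at each step.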

Adaptive and optimized agent placement scheme for parallel agent-based simulation

  • Jin, Ki-Sung;Lee, Sang-Min;Kim, Young-Chul
    • ETRI Journal / Vol. 44, No. 2 / pp. 313-326 / 2022
  • This study presents a novel scheme for distributed and parallel simulations with optimized agent placement for simulation instances. Traditional parallel simulation is limited in that it does not provide sufficient performance even when using multiple resources. The main reason for this discrepancy is that supporting parallelism inevitably incurs additional costs on top of the base simulation cost. We present a comprehensive study of parallel simulation architectures, execution flows, and characteristics. Then, we identify critical challenges for optimizing large simulations for parallel instances. Based on our cost-benefit analysis, we propose a novel approach to overcome the performance constraints of agent-based parallel simulations. We also propose a solution for eliminating the synchronization cost among local instances. Our method ensures balanced performance through optimal deployment of agents to local instances and an adaptive agent placement scheme according to the simulation load. Additionally, our empirical evaluation reveals that the proposed model achieves better performance than conventional methods under several conditions.

Debugging of Parallel Programs using Distributed Cooperating Components

  • Mrayyan, Reema Mohammad;Al Rababah, Ahmad AbdulQadir
    • International Journal of Computer Science & Network Security / Vol. 21, No. 12spc / pp. 570-578 / 2021
  • Recently, in engineering, scientific and technical computing, mathematical modeling, and real-time applications, there has been a trend away from sequential solutions for single-processor computers. Almost all modern application packages in these areas target a parallel or distributed computing environment. This is primarily due to ever-increasing requirements for the reliability of results and the accuracy of calculations, and hence to rapidly growing volumes of processed data [2,17,41]. In addition, new methods and algorithms are appearing whose implementation on single-processor systems would simply be impossible because of the performance they demand from the computing system. The ubiquity of various types of parallel systems also plays a positive role in this process. Along with the growing demand for parallel programs and the proliferation of multiprocessor, multicore, and cluster technologies, the development of parallel programs is becoming increasingly urgent, since users want to make the most of the capabilities of their modern computing equipment [14,39]. It is generally accepted that the high complexity of developing parallel programs often prevents efficient use of the capabilities of high-performance computers [23,31].