• Title, Summary, Keyword: Parallel Computing (병렬컴퓨팅)


A Survey on Massive Data Processing Model in Cloud Computing (클라우드 컴퓨팅에서의 대용량 데이터 처리 모델에 관한 조사)

  • Jin, Ah-Yeon;Park, Young-Ho
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • /
    • pp.145-146
    • /
    • 2011
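  • Cloud computing has attracted so much attention that it ranked first for two consecutive years among the top 10 strategic technologies named by Gartner, a global market research firm. Cloud computing refers to providing virtualized computing resources as a service over Internet technology: users borrow as many IT resources as they need and pay for what they use. Much research is under way on how to efficiently process, in parallel, the explosively growing data on cloud computing platforms. The representative models for such massive data processing are MapReduce and Dryad. The two have much in common, but MapReduce is widely used because it enables easy parallel programming based on a general-purpose programming language, while Dryad has the advantages of easy reuse and flexible composition of data processing flows.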
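
The MapReduce model mentioned in this abstract can be illustrated with a minimal word-count sketch. This is a sequential, in-process simulation written for illustration only; it is not the API of Hadoop, Dryad, or any framework discussed in the paper, and the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: combine all values emitted for one key."""
    return sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Run map, group-by-key (shuffle), and reduce in a single process.

    A real framework would spread the map and reduce calls over many
    machines; here they run one after another to show the data flow only.
    """
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical input lines, chosen only for illustration.
lines = ["cloud data cloud", "data parallel data"]
word_counts = run_mapreduce(lines, map_words, reduce_counts)
print(word_counts)
```

The programmer supplies only the two small functions; the grouping and distribution logic stays inside the framework, which is the ease-of-programming point the abstract makes.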


A Reconfigurable Load and Performance Balancing Scheme for Parallel Loops in a Clustered Computing Environment (클러스터 컴퓨팅 환경에서 병렬루프 처리를 위한 재구성 가능한 부하 및 성능 균형 방법)

  • 김태형
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.10 no.1
    • /
    • pp.49-56
    • /
    • 2004
  • Load imbalance is a serious impediment to achieving good performance in parallel processing. Global load balancing schemes cannot adequately balance parallel tasks generated from a single application. Dynamic loop scheduling methods are known to be useful in balancing parallel loops on shared-memory multiprocessor machines. However, their centralized nature causes a bottleneck even for the relatively small number of processors in a network of workstations because of order-of-magnitude differences in communication overheads. Moreover, improvements to basic loop scheduling methods have not effectively dealt with irregularly distributed workloads in parallel loops, which commonly occur in applications for a network of workstations. In this paper, we present a new reconfigurable and decentralized balancing method for parallel loops on a network of workstations. Because our method supplements traditional load balancing methods with performance balancing, it minimizes the overall execution time.
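
As background for the dynamic loop scheduling this abstract refers to, the sketch below computes chunk sizes under guided self-scheduling, one well-known centralized scheme in which each request receives the remaining iterations divided by the number of workers, rounded up. It is an illustration of the conventional approach only, not the decentralized method the paper proposes; the function name `guided_chunks` is hypothetical.

```python
import math


def guided_chunks(total_iterations, num_workers):
    """Return successive chunk sizes handed out by guided self-scheduling.

    Each request takes ceil(remaining / num_workers) iterations, so chunks
    start large and shrink toward 1 as the loop nears completion.
    """
    chunks = []
    remaining = total_iterations
    while remaining > 0:
        size = math.ceil(remaining / num_workers)
        chunks.append(size)
        remaining -= size
    return chunks


# Hypothetical loop of 100 iterations shared by 4 workers.
chunk_sizes = guided_chunks(100, 4)
print(chunk_sizes)
```

Every chunk is a separate request to one shared queue, so the number of requests grows as chunks shrink; on a network of workstations each request is a network round trip, which is the bottleneck the abstract describes.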

Application Independent Network Protocol for Distributed and Parallel Visualization (대용량 데이터의 분산/병렬 가시화를 위한 응용 독립적 가시화 프로토콜)

  • Kim, Min-Ah
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • /
    • pp.126-129
    • /
    • 2011
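  • Distributed and parallel visualization of large-scale data requires a protocol between the visualization client and the server. Existing visualization tools use protocols specific to their development tools, which leaves the client and server very tightly coupled. In this paper, we design and implement a visualization protocol for application-independent distributed and parallel visualization. In addition, for efficient visualization of time-varying data, we design primitives with which animation can be implemented, and we use a state machine to synchronize data transmitted in parallel. By introducing such an application-independent visualization protocol, visualization can be extended to a grid service or a supercomputing service that performs parallel distributed visualization.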


Collective I/O with Process grouping (프로세스 그룹화를 이용한 집합 I/O)

  • 차광호;홍정우;이지수
    • Proceedings of the Korean Information Science Society Conference
    • /
    • /
    • pp.442-444
    • /
    • 2003
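  • Many problems in computational science that require parallel processing also require large-scale data processing. However, applying a conventional file system unchanged to a parallel processing environment raises many problems. To address them, file systems that support parallel processing have been under continuing research and development. This paper deals with one such line of research, collective I/O. Collective I/O is a method for efficiently handling file I/O requests from multiple processes, and it is included in MPI-IO, part of MPI-2. This paper presents an approach for using MPI-IO collective I/O effectively from the application program side and analyzes experimental results on a cluster system that uses the commonly deployed NFS.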


Parallelization of Genome Sequence Data Pre-Processing on Big Data and HPC Framework (빅데이터 및 고성능컴퓨팅 프레임워크를 활용한 유전체 데이터 전처리 과정의 병렬화)

  • Byun, Eun-Kyu;Kwak, Jae-Hyuck;Mun, Jihyeob
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.8 no.10
    • /
    • pp.231-238
    • /
    • 2019
  • Analyzing next-generation genome sequencing data in the conventional way on a single server may take several tens of hours, depending on the data size. However, to cope with emergency situations in which results are needed within a few hours, the performance of a single genome analysis must be improved. In this paper, we propose a parallelized method for pre-processing genome sequence data that reduces analysis time by using big data technology and a high-performance computing cluster that is connected by a high-speed network and shares a parallel file system. To preserve the reliability of the analytical data, we chose a strategy of parallelizing the existing analytical tools and algorithms for the new environment. Parallelized processing, data distribution, and parallel merging techniques were developed, and performance improvements were confirmed through experiments.

Efficient Task Distribution for Pig Monitoring Applications Using OpenCL (OpenCL을 이용한 돈사 감시 응용의 효율적인 태스크 분배)

  • Kim, Jinseong;Choi, Younchang;Kim, Jaehak;Chung, Yeonwoo;Chung, Yongwha;Park, Daihee;Kim, Hakjae
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.6 no.10
    • /
    • pp.407-414
    • /
    • 2017
  • Pig monitoring applications consisting of many tasks can take advantage of inherent data parallelism and enable parallel processing using performance accelerators. In this paper, we propose a method for distributing the tasks of a pig monitoring application across a heterogeneous computing platform consisting of a multicore CPU and a manycore GPU. Specifically, a parallel program is written in OpenCL, and the most suitable processor for each task is then determined from that task's measured execution time. The proposed method is simple but very effective, and it can be applied to parallelize other applications consisting of many tasks on a heterogeneous computing platform with a CPU and a GPU. Experimental results show that, on three different heterogeneous computing platforms, the proposed task distribution method improves on the typical GPU-only method, in which every task is executed on the GPU device, by factors of 1.5, 8.7, and 2.7, respectively.
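
The selection rule described in this abstract, choosing for each task the processor with the lowest measured execution time, can be sketched as follows. This is a simplified illustration, not the authors' implementation: it does not call OpenCL, and the task names and timings are made-up placeholder values, not measurements from the paper.

```python
def assign_tasks(measured_ms):
    """Map each task to the device with the smallest measured time.

    measured_ms: {task_name: {device_name: execution_time_in_ms}}
    """
    return {
        task: min(times, key=times.get)
        for task, times in measured_ms.items()
    }


# Placeholder timings (milliseconds); hypothetical, for illustration only.
measured_ms = {
    "background_subtraction": {"cpu": 12.0, "gpu": 3.5},
    "labeling": {"cpu": 4.0, "gpu": 9.0},
    "tracking": {"cpu": 6.0, "gpu": 5.0},
}

assignment = assign_tasks(measured_ms)
print(assignment)
```

A per-task minimum like this ignores data transfer between devices and the possibility of running tasks concurrently; whether the paper accounts for those factors is not stated in the abstract.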

A Task Scheduling Algorithm with Environment-specific Performance Enhancement Method (환경 특성에 맞는 성능 향상 기법을 사용하는 태스크 스케줄링 알고리즘)

  • Song, Inseong;Yoon, Dongsung;Park, Taeshin;Choi, Sangbang
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.54 no.5
    • /
    • pp.48-61
    • /
    • 2017
  • The IaaS service of a cloud computing environment is attractive for running large-scale parallel applications because a user can use a desired number of high-performance virtual machines without maintenance cost. The total execution time of a parallel application in a high-performance computing environment depends on the task scheduling algorithm. Most studies on task scheduling algorithms for cloud computing environments try to reduce user cost, and studies that try to reduce total execution time are rare. In this paper, we propose a task scheduling algorithm called HAGD and a performance enhancement method called group task duplication, which HAGD utilizes. The group task duplication method simplifies the previous task duplication method, and HAGD uses either the group task duplication method or a task insertion method according to the characteristics of the computing environment and the application. Performance evaluations show that the proposed algorithm provides superior performance in terms of normalized total execution time regardless of these characteristics.

An Application of MapReduce Technique over Peer-to-Peer Network (P2P 네트워크상에서 MapReduce 기법 활용)

  • Ren, Jian-Ji;Lee, Jae-Kee
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.8
    • /
    • pp.586-590
    • /
    • 2009
  • This paper describes the design of MapReduce over a peer-to-peer network for applications in dynamic environments. MapReduce is a software framework used in cloud computing that processes large data sets in a highly parallel way. Because node failures can happen at any time in a peer-to-peer network, we focus on using a DHT routing protocol named Pastry to handle node failures. Our results are very promising and indicate that the framework could have wide application in P2P network systems while maintaining good computational efficiency and scalability. We believe that P2P networks and parallel computing will remain very active research and development topics in industry and academia for many years to come.

Evaluating Computational Efficiency of Spatial Analysis in Cloud Computing Platforms (클라우드 컴퓨팅 기반 공간분석의 연산 효율성 분석)

  • CHOI, Changlock;KIM, Yelin;HONG, Seong-Yun
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.21 no.4
    • /
    • pp.119-131
    • /
    • 2018
  • The increase in high-resolution spatial data and methodological developments in recent years have enabled detailed analysis of individual experiences in space and over time. However, despite the increasing availability of data and technological advances, such individual-level analysis is not always possible in practice because of its computing requirements. To overcome this limitation, there has been a considerable amount of research on the use of high-performance, public cloud computing platforms for spatial analysis and simulation. The purpose of this paper is to empirically evaluate the efficiency and effectiveness of spatial analysis on cloud computing platforms. We compare the computing speed of calculating a measure of spatial autocorrelation and of performing geographically weighted regression analysis between a local machine and spot instances on clouds. The results indicate that computing time could improve significantly when the analysis is performed in parallel on clouds.