• Title, abstract, keywords: Parallel Computing

Efficient Parallel Block-layered Nonbinary Quasi-cyclic Low-density Parity-check Decoding on a GPU

  • Thi, Huyen Pham;Lee, Hanho
    • IEIE Transactions on Smart Processing and Computing / v.6 no.3 / pp.210-219 / 2017
  • This paper proposes a modified min-max algorithm (MMMA) for nonbinary quasi-cyclic low-density parity-check (NB-QC-LDPC) codes and an efficient parallel block-layered decoder architecture corresponding to the algorithm on a graphics processing unit (GPU) platform. The algorithm removes multiplications over the Galois field (GF) in the merger step to reduce decoding latency without any performance loss. The decoding implementation on a GPU for NB-QC-LDPC codes achieves improvements in both flexibility and scalability. To perform the decoding on the GPU, data and memory structures suitable for parallel computing are designed. The implementation results for NB-QC-LDPC codes over GF(32) and GF(64) demonstrate that the parallel block-layered decoding on a GPU accelerates the decoding process to provide a faster decoding runtime, and obtains a higher coding gain under a low $10^{-10}$ bit error rate and low $10^{-7}$ frame error rate, compared to existing methods.

Parallel and Sequential Implementation to Minimize the Time for Data Transmission Using Steiner Trees

  • Anand, V.;Sairam, N.
    • Journal of Information Processing Systems / v.13 no.1 / pp.104-113 / 2017
  • In this paper, we present an approach to transmit data from the source to the destination through a minimal path (least-cost path) in a computer network of n nodes. The motivation behind our approach is to address the problem of finding a minimal path between the source and destination. From the work we have studied, we found that a Steiner tree with bounded Steiner vertices offers a good solution. A novel algorithm to construct a Steiner tree with vertices and bounded Steiner vertices is proposed in this paper. The algorithm finds a path from each source to each destination at a minimum cost and minimum number of Steiner vertices. We propose both the sequential and parallel versions. We also conducted a comparative study of sequential and parallel versions based on time complexity, which proved that parallel implementation is more efficient than sequential.

Analysis of Implementing Mobile Heterogeneous Computing for Image Sequence Processing

  • BAEK, Aram;LEE, Kangwoon;KIM, Jae-Gon;CHOI, Haechul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.10 / pp.4948-4967 / 2017
  • On mobile devices, image sequences are widely used for multimedia applications such as computer vision, video enhancement, and augmented reality. However, real-time processing on mobile devices is still a challenge because of resource constraints and demands for higher-resolution images. Recently, heterogeneous computing methods that use both a central processing unit (CPU) and a graphics processing unit (GPU) have been researched to accelerate image sequence processing. This paper deals with various optimization techniques such as parallel processing by the CPU and GPU, distributed processing on the CPU, frame buffer objects, and double buffering for parallel and/or distributed tasks. Using the optimization techniques both individually and in combination, several heterogeneous computing structures were implemented and their effectiveness was analyzed. The experimental results show that heterogeneous computing runs up to 3.5 times faster than CPU-only processing.

Parallel LDPC Decoding on a Heterogeneous Platform using OpenCL

  • Hong, Jung-Hyun;Park, Joo-Yul;Chung, Ki-Seok
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.6 / pp.2648-2668 / 2016
  • Modern mobile devices are equipped with various accelerated processing units to handle computationally intensive applications; therefore, Open Computing Language (OpenCL) has been proposed to fully take advantage of the computational power in heterogeneous systems. This article introduces a parallel software decoder of Low Density Parity Check (LDPC) codes on an embedded heterogeneous platform using an OpenCL framework. The LDPC code is one of the most popular and strongest error correcting codes for mobile communication systems. Each step of LDPC decoding has different parallelization characteristics. In the proposed LDPC decoder, steps suitable for task-level parallelization are executed on the multi-core central processing unit (CPU), and steps suitable for data-level parallelization are processed by the graphics processing unit (GPU). To improve the performance of OpenCL kernels for LDPC decoding operations, explicit thread scheduling, vectorization, and effective data transfer techniques are applied. The proposed LDPC decoder achieves high performance and high power efficiency by using heterogeneous multi-core processors on a unified computing framework.

Fully Homomorphic Encryption Based On the Parallel Computing

  • Tan, Delin;Wang, Huajun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.497-522 / 2018
  • A fully homomorphic encryption (FHE) scheme may be the best method to solve the privacy leakage problem on untrusted servers because it allows computation on ciphertexts. However, existing FHE schemes have not yet been put into practical use because of their low efficiency. It is therefore imperative to find a more efficient FHE scheme or to optimize the existing ones so that they become practical. In this paper, we optimize the GSW scheme using parallel computing and obtain a high-performance FHE scheme, namely the PGSW scheme. Experimental results show that the time overhead of the homomorphic operations in the new FHE scheme is reduced manyfold as the number of processing units increases. Our scheme can therefore greatly reduce the running time of homomorphic operations and improve the performance of the FHE scheme at the cost of additional hardware resources, which may help advance the development of FHE.

An Optimized Iterative Semantic Compression Algorithm And Parallel Processing for Large Scale Data

  • Jin, Ran;Chen, Gang;Tung, Anthony K.H.;Shou, Lidan;Ooi, Beng Chin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.6 / pp.2761-2781 / 2018
  • With the continuous growth of data size and the use of compression technology, data reduction has great research value and practical significance. To address the shortcomings of the existing semantic compression algorithm, this paper analyzes the ItCompress algorithm and designs a method of bidirectional order selection based on interval partitioning, named An Optimized Iterative Semantic Compression Algorithm (Optimized ItCompress Algorithm). To further improve the speed of the algorithm, we propose a parallel optimization iterative semantic compression algorithm using GPU (POICAG) and an optimized iterative semantic compression algorithm using Spark (DOICAS). Extensive experiments on four kinds of datasets verify the efficiency of the proposed algorithm.

An Application of MapReduce Technique over Peer-to-Peer Network

  • 임건길;이재기
    • Journal of KIISE: Computing Practices and Letters / v.15 no.8 / pp.586-590 / 2009
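  • The purpose of this paper is to design MapReduce so that it supports dynamic-environment applications over a P2P network. MapReduce is a software framework developed for parallel processing of large-scale data in cloud computing. A characteristic of P2P-based networks is that node failures can occur at any time, so we focus on using Pastry, a DHT routing protocol, to handle such node failures. The results of this paper show that the framework can be applied to a variety of applications in P2P network systems while maintaining good computational efficiency and scalability. We are confident that, over the next few years, P2P networks and parallel computing will become very important research and development topics in industry and academia.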

Distributed Parallel Computing Environment for Java

  • 이상윤;김승호
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.6 / pp.23-37 / 2004
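  • A Java thread is an object element that is treated as an independent process within a single program space in a multiprocessing environment, so it can be used as an independent process for parallel processing. In addition, Java's synchronization mechanism and threads make it easy to write applications that perform parallel processing. Accordingly, there have been many studies on applying Java's parallel-processing support to distributed computing environments. In this paper, we propose a system environment that supports the parallel execution, in a distributed computing environment, of the threads contained in legacy Java programs. The system, named TORB (Transparent Object Request Broker), supports programming transparency, so an existing legacy Java program can be executed in parallel after a simple conversion process. TORB extends the functionality of a distributed programming tool previously published by our research team, which had only the typical distributed-processing capability of executing a specified function on a specified computer.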

Development of the Dynamic Host Management Scheme for Parallel/Distributed Processing on the Web

  • 송은하;정영식
    • Journal of KIISE: Computing Practices and Letters / v.8 no.3 / pp.251-260 / 2002
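  • Parallel/distributed processing that uses the many idle hosts on the Web offers a high price-performance ratio for large-scale application problems. Parallel/distributed processing in a Web environment must provide solutions for unpredictable conditions such as host heterogeneity and variability, autonomy, sustained performance guarantees, and changes in the number of participating hosts. This paper proposes an adaptive task reallocation strategy that assigns work to geographically distributed participating hosts based on their performance. It also provides a dynamic host management scheme for dynamic environments in which the number of hosts changes during the parallel processing of a large-scale application problem. We implement the PDSWeb (Parallel/Distributed Scheme on Web) system and evaluate it by applying it to the generation of rendered images, which requires a large amount of computation. The results show that adaptive task reallocation improves performance by up to 90% or more under host variability, and they also show the degree of performance improvement as hosts are added and removed.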