• Title/Summary/Keyword: Parallel efficiency


A Study on Effect of Domain-Decomposition Method on Parallel Efficiency in 2-D Flow Computations (2차원 유동장 해석에서 영역분할법에 따른 병렬효율성 검토)

  • Lee Sangyeul;Hur Nahmkeon
    • 한국전산유체공학회:학술대회논문집
    • /
    • 1998.11a
    • /
    • pp.147-152
    • /
    • 1998
  • 2-D flow fields are studied on a shared-memory parallel computer using a parallel flow analysis program that employs the domain decomposition method and the MPI library for data exchange at overlapped interfaces. In particular, the effects of directional domain decomposition on parallel efficiency are studied for 2-D lid-driven cavity flow and flow through a square cavity. The present study shows that, in 1-D partitioning, domain decomposition along the main flow direction gives better parallel efficiency than decomposition along the other direction. 2-D partitioning, however, is less sensitive to flow direction and gives good parallel efficiency for most of the cases considered.


Performance Optimization of Parallel Algorithms

  • Hudik, Martin;Hodon, Michal
    • Journal of Communications and Networks
    • /
    • v.16 no.4
    • /
    • pp.436-446
    • /
    • 2014
  • The high intensity of research and modeling in the fields of mathematics, physics, biology and chemistry requires new computing resources. Because of the high computational complexity of such tasks, computing time is long and costly. The most efficient way to increase efficiency is to adopt parallel principles. The purpose of this paper is to present the issue of parallel computing, with emphasis on the analysis of parallel systems and on the impact of communication delays on their efficiency and on overall execution time. The paper focuses on finite algorithms for solving systems of linear equations, namely matrix manipulation by the Gauss elimination method (GEM). Algorithms are designed for architectures with shared memory (open multiprocessing, OpenMP), with distributed memory (message passing interface, MPI) and for their combination (MPI + OpenMP). The properties of the algorithms were determined analytically and verified experimentally. Conclusions are drawn for theory and practice.

Iterative mesh partitioning strategy for improving the efficiency of parallel substructure finite element computations

  • Hsieh, Shang-Hsien;Yang, Yuan-Sen;Tsai, Po-Liang
    • Structural Engineering and Mechanics
    • /
    • v.14 no.1
    • /
    • pp.57-70
    • /
    • 2002
  • This work presents an iterative mesh partitioning approach to improve the efficiency of parallel substructure finite element computations. The proposed approach employs an iterative strategy with a set of empirical rules derived from the results of numerical experiments on a number of different finite element meshes. The proposed approach also utilizes state-of-the-art partitioning techniques in its iterative partitioning kernel, a cost function to estimate the computational cost of each submesh, and a mechanism that adjusts element weights to redistribute elements among submeshes during iterative partitioning to partition a mesh into submeshes (or substructures) with balanced computational workloads. In addition, actual parallel finite element structural analyses on several test examples are presented to demonstrate the effectiveness of the approach proposed herein. The results show that the proposed approach can effectively improve the efficiency of parallel substructure finite element computations.

Load Dispatching Control of Multiple-Parallel-Converters Rectifier to Maximize Conversion Efficiency

  • Orihara, Dai;Saitoh, Hiroumi;Higuchi, Yuji;Babasaki, Tadatoshi
    • Journal of Electrical Engineering and Technology
    • /
    • v.9 no.3
    • /
    • pp.1132-1136
    • /
    • 2014
  • In the context of increasing electric energy consumption in data centers, energy efficiency improvement is strongly emphasized. In a data center, electric energy is largely consumed by the DC power supply system, which is based on a rectifier composed of multiple parallel converters. Therefore, rectifier efficiency must be improved to minimize the loss of the DC power supply system. Rectifier efficiency can be modulated by load allocation to converters because converter efficiency depends on input AC power. In this paper, we propose a new control method to maximize rectifier efficiency. The method controls load allocation to converters by introducing an active power converter control scheme and start-and-stop of converters. To illustrate optimal load allocations in a rectifier, the maximization of rectifier efficiency is formulated as a nonlinear optimization problem. The problem is solved by the Lagrangian relaxation method, and the computation results demonstrate the validity of the proposed method.

Construction and Performance Test of a Supercomputing PC System using PC-clustering and Parallel Virtual Machine (PC-Clustering과 병렬가상장치에 의한 수치계산용 슈퍼컴퓨팅 PC 시스템 구축과 성능 테스트)

  • Hong, Woo-Pyo;Kim, Jong-Jae;Oh, Kwang-Sik
    • Journal of the Korean Data and Information Science Society
    • /
    • v.10 no.2
    • /
    • pp.473-483
    • /
    • 1999
  • We introduce a way to construct a supercomputing-capable system from networked PCs running the Linux operating system, with computing power comparable to that of expensive commercial workstations, using the Parallel Virtual Machine (PVM) software, which enables one to control all the CPUs and memory of the networked PCs. By benchmarking the system using a PVM parallel program, we find that the system's parallel efficiency is close to 90%.
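For readers unfamiliar with the metric used in these results, the following is a minimal sketch of the standard definitions of speedup and parallel efficiency. The function names and the timings are hypothetical and are not taken from any of the listed papers; the abstract above reports only the final figure of about 90%.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_1 / T_p: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, n_workers):
    """Parallel efficiency E = S / p = T_1 / (p * T_p)."""
    return speedup(t_serial, t_parallel) / n_workers


if __name__ == "__main__":
    # Hypothetical timings (assumption, not measured data):
    # 90 s on one processor, 12.5 s on 8 processors.
    s = speedup(90.0, 12.5)
    e = parallel_efficiency(90.0, 12.5, 8)
    print(f"speedup = {s:.2f}, efficiency = {e:.0%}")
```

An efficiency of 1.0 corresponds to ideal linear scaling; communication and load imbalance usually push it lower.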


Study on improvement of submicron particle collection performance in 2-stage parallel-plate electrostatic precipitators (2단 평행판 전기집진기의 서브마이크론입자 집진성능 개선 연구)

  • Yoo, K.H.;Oh, M.D.;Lee, J.S.
    • Korean Journal of Air-Conditioning and Refrigeration Engineering
    • /
    • v.9 no.3
    • /
    • pp.323-332
    • /
    • 1997
  • Some researchers have reported that two-stage parallel-plate ESPs, commonly called electronic air cleaners, show decreasing collection efficiency as particle size decreases below about $0.03\,\mu\mathrm{m}$. This phenomenon is attributed to partial particle charging, where some of the incoming particles are not charged in the charging cell of the 2-stage parallel-plate ESP. One way to improve the decreasing collection efficiency in that particle size range is to increase the particle charging quantity in the charging cell. In the present study, to do this, a 2-wire series-type charging cell modified from a 1-wire normal-type one was suggested and investigated theoretically and experimentally with regard to improvement of the collection efficiency. It was confirmed from the experimental and theoretical work that the collection efficiency was apparently improved.


A Study for Parallel Computing Efficiency Comparing Numerical Solutions of Battery Pack (배터리 팩 수치해석 해의 비교를 통한 병렬연산 효율성 연구)

  • Kim, Kwang Sun;Jang, Kyung Min
    • Journal of the Semiconductor & Display Technology
    • /
    • v.15 no.2
    • /
    • pp.20-25
    • /
    • 2016
  • The parallel computer cluster system is known as a powerful tool for solving complex physical phenomena numerically. The numerical analysis of a large Li-ion battery pack, which involves complex physical phenomena, requires a large amount of computing time. In this study, numerical analyses were conducted to compare the computing efficiency of a single workstation and a parallel cluster system, both with multicore CPUs. The results show that the parallel cluster system was 80 times faster than the single workstation for the same battery pack model. The performance of the cluster system increased linearly with the number of CPU cores.

A Parallel Finite Element Procedure for Contact-Impact Problems (충돌해석을 위한 병렬유한요소 알고리즘)

  • Har, Jason
    • Proceedings of the KSME Conference
    • /
    • 2003.11a
    • /
    • pp.1286-1290
    • /
    • 2003
  • This paper presents a newly implemented parallel finite element procedure for contact-impact problems. Three sub-algorithms are included in the proposed parallel contact-impact procedure: a parallel Belytschko-Lin-Tsay (BLT) shell element generation, a parallel explicit time integration scheme, and a parallel contact search algorithm based on the master-slave slide-line algorithm. The underlying focus of the algorithms is on their effectiveness and efficiency for inclusion in future finite element systems on parallel computers. Throughout this research, a prototype code, named GT-PARADYN, is developed on the IBM SP2, a distributed-memory computer. Some numerical examples are provided to demonstrate the timing results of the procedure and to discuss the accuracy and efficiency of the code.


Design of 32 bit Parallel Processor Core for High Energy Efficiency using Instruction-Levels Dynamic Voltage Scaling Technique

  • Yang, Yil-Suk;Roh, Tae-Moon;Yeo, Soon-Il;Kwon, Woo-H.;Kim, Jong-Dae
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.9 no.1
    • /
    • pp.1-7
    • /
    • 2009
  • This paper describes the design of a high-energy-efficiency 32-bit parallel processor core using instruction-level data gating and dynamic voltage scaling (DVS) techniques. We present an instruction-level data gating technique, which controls the activation and switching activity of the function units. We also present an instruction-level DVS technique that uses neither a DC-DC converter nor a voltage scheduler controlled by the operating system; it controls the power of the function units. The proposed instruction-level DVS technique has a simpler architecture than complicated DVS, which relies on a DC-DC converter and a voltage scheduler controlled by the operating system, and its hardware implementation is very easy. Nevertheless, the energy efficiency of the proposed instruction-level DVS technique with a dual power supply is similar to that of the complicated DVS. We ran circuit simulations of a test program using Spectra. The reduced power supply was set to 0.667 times the original power supply. The energy efficiency of the proposed 32-bit parallel processor core using instruction-level data gating and DVS techniques is improved by about 88.4% compared with a 32-bit parallel processor core without them. The designed high-energy-efficiency 32-bit parallel processor core can be used as a coprocessor for processing massive data at high speed.

All Phase Discrete Sine Biorthogonal Transform and Its Application in JPEG-like Image Coding Using GPU

  • Shan, Rongyang;Zhou, Xiao;Wang, Chengyou;Jiang, Baochen
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.9
    • /
    • pp.4467-4486
    • /
    • 2016
  • The discrete cosine transform (DCT) based JPEG standard significantly improves the coding efficiency of image compression, but it suffers from serious blocking artifacts at low bit rates and from low efficiency on high-definition images. In the light of all phase digital filtering theory, this paper proposes a novel transform based on the discrete sine transform (DST), called the all phase discrete sine biorthogonal transform (APDSBT). Applying APDSBT to the JPEG scheme significantly reduces the blocking artifacts. The reconstructed image of APDSBT-JPEG is better than that of DCT-JPEG in terms of objective quality and subjective effect. To improve the efficiency of JPEG coding, the structure of JPEG is analyzed. We analyze key factors in the design and evaluation of JPEG compression on massively parallel graphics processing units (GPUs) using the compute unified device architecture (CUDA) programming model. Experimental results show that the maximum speedup ratio of the parallel APDSBT-JPEG algorithm can exceed 100 times even on a low-end GPU. Some new parallel strategies are illustrated in this paper for improving the performance of the parallel algorithm. With the optimal strategy, the efficiency can be improved by over 10%.