Performance Optimization of Parallel Algorithms

  • Hudik, Martin (Department of Technical Cybernetics, University of Zilina)
  • Hodon, Michal (Department of Technical Cybernetics, University of Zilina)
  • Received : 2014.04.14
  • Published : 2014.08.30


The high intensity of research and modeling in mathematics, physics, biology, and chemistry requires new computing resources. Because such tasks have high computational complexity, computing time is long and costly. The most effective way to reduce it is to adopt parallel principles. The purpose of this paper is to present the issue of parallel computing, with emphasis on the analysis of parallel systems and on the impact of communication delays on their efficiency and overall execution time. The paper focuses on finite algorithms for solving systems of linear equations, namely Gaussian elimination (GEM). Algorithms are designed for shared-memory architectures (open multiprocessing, OpenMP), distributed-memory architectures (message passing interface, MPI), and their combination (MPI + OpenMP). The properties of the algorithms were determined analytically and verified experimentally. Conclusions are drawn for theory and practice.


Grant : Research Centre of University of Zilina

Supported by : ITMS

