• Title, Summary, Keyword: Multi-core processor

Search Result 119, Processing Time 0.034 seconds

A Performance Study of Multi-core Out-of-Order Superscalar Processor Architecture (멀티코어 비순차 수퍼스칼라 프로세서의 성능 연구)

  • Lee, Jong-Bok
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.61 no.10
    • /
    • pp.1502-1507
    • /
    • 2012
  • In order to overcome the hardware complexity and power consumption problems, recently the multi-core architecture has been prevalent. For hardware simplicity, usually RISC processor is adopted as the unit core processor. However, if the performance of unit core processor is enhanced, the overall performance of the multi-core processor architecture can be further increased. In this paper, out-of-order superscalar processor is utilized for the multi-core processor architecture. Using SPEC 2000 benchmarks as input, the trace-driven simulation has been performed for the out-of-order superscalar cores between 2 and 16 extensively. As a result, the 16-core out-of-order superscalar processor for the window size of 16 resulted in 17.4 times speed up over the single-core out-of-order superscalar processor, and 50 times speed up over the single core RISC processor. When compared for the same number of cores on the average, the multi-core out-of-order superscalar processor performance achieved 3.2 times speed up over the multi-core RISC processor and 1.6 times speed up over the multi-core in-order superscalar processor.

Performance Study of Multi-core In-Order Superscalar Processor Architecture (멀티코어 순차 수퍼스칼라 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.12 no.5
    • /
    • pp.123-128
    • /
    • 2012
  • In order to overcome the hardware complexity and performance limit problems, recently the multi-core architecture has been prevalent. For hardware simplicity, usually RISC processor is adopted as the unit core processor. However, if the performance of unit core processor is enhanced, the overall performance of the multi-core processor architecture can be further enhanced. In this paper, in-order superscalar processor is utilized as the core for the multi-core processor architecture. Using SPEC 2000 benchmarks as input, the trace-driven simulation has been performed for the number of superscalar cores between 2 and 16 and the window size of 4 to 16 extensively. As a result, the 16-core superscalar processor for the window size of 16 results in 8.4 times speed up over the single core superscalar processor. When compared with the same number of cores, the multi-core superscalar processor performance doubles that of the multi-core RISC processor.

Implementation and Verification of a Multi-Core Processor including Multimedia Specific Instructions (멀티미디어 전용 명령어를 내장한 멀티코어 프로세서 구현 및 검증)

  • Seo, Jun-Sang;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.8 no.1
    • /
    • pp.17-24
    • /
    • 2013
  • In this paper, we present a multi-core processor including multimedia specific instructions to process multimedia data efficiently in the mobile environment. Multimedia specific instructions exploit subword level parallelism (SLP), while the multi-core processor exploits data level parallelism (DLP). These combined parallelisms improve the performance of multimedia processing applications. The proposed multi-core processor including multimedia specific instructions is implemented and tested using a Xilinx ISE 10.1 tool and SoCMaster3 testbed system including Vertex 4 FPGA. Experimental results using a fire detection algorithm show that multimedia specific instructions outperform baseline instructions in the same multi-core architecture in terms of performance (1.2x better), energy efficiency (1.37x better), and area efficiency (1.23x better).

A Performance Study on Many-core Processor Architectures with SPEC Benchmark Programs (SPEC 벤치마크 프로그램에 대한 매니코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.62 no.2
    • /
    • pp.252-256
    • /
    • 2013
  • In order to overcome the complexity and performance limit problems of superscalar processors, the multi-core architecture has been prevalent recently. Usually, the number of cores mostly used for the multi-core processor architecture ranges from 2 to 16. However in the near future, more than 32-cores are likely to be utilized, which is called as many-core processor architecture. Using SPEC 2000 benchmarks as input, the trace-driven simulation has been performed for the 32 to 1024 many-core architectures extensively. For 1024-cores, the average performance scores 15.7 IPC, but the performance increase rate is saturated.

A Performance Study of Multi-Core Processors with Perceptrons (퍼셉트론을 이용하는 멀티코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.63 no.12
    • /
    • pp.1704-1709
    • /
    • 2014
  • In order to increase the performance of multi-core system processor architectures, the multi-thread branch predictor which speculatively fetches and allocates threads to each core should be highly accurate. In this paper, the perceptron based multi-thread branch predictor is proposed for the multi-core processor architectures. Using SPEC 2000 benchmarks as input, the trace-driven simulation has been performed for the 2 to 16-core architectures employing perceptron multi-thread branch predictor extensively. Its performance is compared with the architecture which utilizes the two-level adaptive multi-thread branch predictor.

A Performance Study of Asymmetric Multi-core Digital Signal Processor Architectures (비대칭적 멀티코어 디지털 신호처리 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.15 no.5
    • /
    • pp.219-224
    • /
    • 2015
  • Recently, the multi-core processor architecture is widely used in the digital signal processors for enhancing its performance. Multi-core processors are classified either as symmetric or asymmetric. Asymmetric multi-core processors are known to have higher performance and more efficient than symmetric multi-core processors. In order to study the performance enhancement of asymmetric multi-core digital signal processors over the symmetric ones, the trace-driven simulation has been executed for various asymmetric quad-core, octa-core and hexadeca-core digital signal processors and compared with the symmetric ones of similar hardware budget using UTDSP benchmarks as input.

Analysis on the Performance and Temperature of the 3D Quad-core Processor according to Cache Organization (캐쉬 구성에 따른 3차원 쿼드코어 프로세서의 성능 및 온도 분석)

  • Son, Dong-Oh;Ahn, Jin-Woo;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.6
    • /
    • pp.1-11
    • /
    • 2012
  • As the process technology scales down, multi-core processors cause serious problems such as increased interconnection delay, high power consumption and thermal problems. To solve the problems in 2D multi-core processors, researchers have focused on the 3D multi-core processor architecture. Compared to the 2D multi-core processor, the 3D multi-core processor decreases interconnection delay by reducing wire length significantly, since each core on different layers is connected using vertical through-silicon via(TSV). However, the power density in the 3D multi-core processor is increased dramatically compared to that in the 2D multi-core processor, because multiple cores are stacked vertically. Unfortunately, increased power density causes thermal problems, resulting in high cooling cost, negative impact on the reliability. Therefore, temperature should be considered together with performance in designing 3D multi-core processors. In this work, we analyze the temperature of the cache in quad-core processors varying cache organization. Then, we propose the low-temperature cache organization to overcome the thermal problems. Our evaluation shows that peak temperature of the instruction cache is lower than threshold. The peak temperature of the data cache is higher than threshold when the cache is composed of many ways. According to the results, our proposed cache organization not only efficiently reduces the peak temperature but also reduces the performance degradation for 3D quad-core processors.

New Hypervisor Improving Network Performance for Multi-core CE Devices

  • Hong, Cheol-Ho;Park, Miri;Yoo, Seehwan;Yoo, Chuck
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.6 no.4
    • /
    • pp.231-241
    • /
    • 2011
  • Recently, system virtualization has been applied to consumer electronics (CE) such as smart mobile phones. Although multi-core processors have become a viable solution for complex applications of consumer electronics, the issue of utilizing multi-core resources in the virtualization layer has not been researched sufficiently. In this paper, we present a new hypervisor design and implementation for multi-core CE devices. We concretely describe virtualization methods for a multi-core processor and multi-core-related subsystems. We also analyze bottlenecks of network performance in a virtualization environment that supports multimedia applications and propose an efficient virtual interrupt distributor. Our new multi-core hypervisor improves network performance by 5.5 times as compared to a hypervisor without the virtual interrupt distributor.

A Study of Trace-driven Simulation for Multi-core Processor Architectures (멀티코어 프로세서의 명령어 자취형 모의실험에 대한 연구)

  • Lee, Jong-Bok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.12 no.3
    • /
    • pp.9-13
    • /
    • 2012
  • In order to overcome the complexity and power problems of superscalar processors, the multi-core architecture has been prevalent recently. Although the execution-driven simulation is wide spread, the trace-driven simulation has speed advantages over the execution-driven simulation. We present a methodology to simulate multi-core architecture using trace-driven simulator. Using SPEC 2000 benchmarks as input, the trace-driven simulation has been performed for the cores ranging from 2 to 16 extensively. As a result, the 16-core processor resulted in 4.1 IPC and 13.3 times speed up over single-core processor on the average.

A Study On Statistical Simulation for Asymmetric Multi-Core Processor Architectures (비대칭적 멀티코어 프로세서의 통계적 모의실험에 관한 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.2
    • /
    • pp.157-163
    • /
    • 2016
  • If trace-driven or execution-driven simulation is used for the performance analysis of asymmetric multi-core processors, excessive time and much disk space are necessary. In this paper, statistical simulations are performed for asymmetric multi-core processors with various hardware configurations. For the experiment, SPEC 2000 benchmark programs are used for profiling and synthesis, which is supplied as input for the simulation of asymmetric multi-core processors. As a result, the performance of asymmetric multi-core processor obtained by statistical simulation is comparable to that of the trace-driven simulation with a tremendous reduction in the simulation time.