• Title, Summary, Keyword: Last Level Cache(LLC)

Large-Scale Last-Level Cache Design Based on Parallel TLC STT-MRAM (병렬 TLC STT-MRAM 기반 대용량 LLC 설계)

  • Park, Taejin;Jang, Wooyoung
    • The Journal of Korean Institute of Information Technology / v.15 no.12 / pp.77-89 / 2017
  • State-of-the-art high-performance multi-core processors demand a large-scale last-level cache (LLC). Triple-level cell (TLC) spin-transfer torque magnetic random access memory (STT-MRAM), one of the emerging memories, can provide high storage density, but it incurs long memory latency and high power consumption because of its multi-step read and write operations. In this paper, the architecture and operation of an LLC built from parallel TLC STT-MRAM are proposed. Our LLC minimizes the occurrence of three-step read and write operations via cell division mapping and conditional block swapping techniques. Experimental results show that, when two PARSEC benchmarks run concurrently in MARSSx86, the proposed LLC achieves on average 11.03% more instructions per cycle and 12.54% lower power consumption than LLCs that use a direct mapping technique.

Core-aware Cache Replacement Policy for Reconfigurable Last Level Cache (재구성 가능한 라스트 레벨 캐쉬 구조를 위한 코어 인지 캐쉬 교체 기법)

  • Son, Dong-Oh;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information / v.18 no.11 / pp.1-12 / 2013
  • In multi-core processors, the last-level cache (LLC) can reduce the speed gap between memory and the cores. For this reason, the LLC has a large impact on processor performance. The LLC is composed of shared cache and private cache. In the computer architecture community, most researchers have focused on management techniques for shared caches, while management techniques for private caches have not been widely researched. In a conventional private LLC, memory is statically assigned to each core, resulting in serious performance degradation when workloads are not evenly distributed. To overcome this problem, this paper proposes a replacement policy for efficiently managing the private cache of the LLC. Because the proposed core-aware cache replacement policy can reconfigure the LLC dynamically, the LLC hit rate increases drastically. Moreover, the proposed policy uses 2-bit saturating counters to improve performance. According to our simulation results, the proposed method improves hit rates by 9.23% and reduces access time by 12.85% compared to the conventional method.
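
    The abstract above mentions 2-bit saturating counters but does not say what they track or how they drive replacement. As a minimal sketch of the general mechanism only (the class and method names are hypothetical and this is not the authors' implementation), a saturating counter counts up and down but clamps at both ends of its range instead of wrapping around:

    ```python
    class SaturatingCounter:
        """n-bit saturating counter (illustrative; names are assumptions)."""

        def __init__(self, bits=2, value=0):
            # A 2-bit counter ranges over 0..3.
            self.max_value = (1 << bits) - 1
            self.value = max(0, min(value, self.max_value))

        def increment(self):
            # Clamp at the maximum instead of wrapping to 0.
            if self.value < self.max_value:
                self.value += 1

        def decrement(self):
            # Clamp at 0 instead of wrapping to the maximum.
            if self.value > 0:
                self.value -= 1

        def is_high(self):
            # True when the counter is in the upper half of its range
            # (2 or 3 for a 2-bit counter).
            return self.value > self.max_value // 2
    ```

    A policy could, for example, increment such a counter on one kind of event and decrement it on the opposite kind, then consult `is_high()` when making a decision; which events the paper actually uses is not stated in the abstract.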

Improving Energy Efficiency and Lifetime of Phase Change Memory using Delta Value Indicator

  • Choi, Ju Hee;Kwak, Jong Wook
    • JSTS:Journal of Semiconductor Technology and Science / v.16 no.3 / pp.330-338 / 2016
  • Phase change memory (PCM) has been studied as an emerging memory technology for the last-level cache (LLC) because of its extremely low leakage. However, it consumes a large amount of energy when updating cells, and its write endurance is limited. To relieve the write pressure on the LLC, we propose a delta value indicator (DVI) that employs a small cache to store the difference between the value currently stored and the value newly loaded. Since the write energy of the small cache is lower than that of the LLC, energy consumption is reduced by accessing the small cache instead of the LLC. In addition, the lifetime of the LLC is further extended because the number of write accesses to the LLC decreases. To this end, a delta value indicator and control circuits are inserted into the LLC. The simulation results show a 26.8% saving in dynamic energy consumption and a 31.7% lifetime extension compared to a state-of-the-art scheme for PCM.

Performance evaluation and analysis of TILE-Gx36 many-core processor with PARSEC benchmark (PARSEC을 이용한 TILE-Gx36 다중코어 프로세서의 성능 평가 및 분석)

  • Lee, Boseon;Kim, Han-Yee;Yu, Heonchang;Suh, Taeweon
    • The Journal of Korean Association of Computer Education / v.17 no.1 / pp.107-115 / 2014
  • This paper evaluates and analyzes the performance of TILE-Gx36 (Gx36), a many-core processor. The PARSEC parallel benchmark suite was used to measure performance, and Core i7 (i7) and Atom were used for comparison. When run with the maximum number of threads that can execute concurrently on each machine, Gx36 performed 2.73× worse than Core i7 and 1.93× better than Atom. Gx36 has the largest last-level cache (LLC) among the compared processors. Nevertheless, it reported the largest number of LLC misses, which we strongly believe is the major culprit for its lower-than-expected performance. Our study suggests that the DDC employed in Gx36 is not a favorable cache structure for general-purpose high-performance computing. Measurements on off-the-shelf machines provide unbiased data for refining future many-core architectures.

Virtual Machine Scheduling for Multicores Considering Effects of Shared On-chip Last Level Cache Interference (공유 말단 캐시에서의 간섭의 영향을 고려한 멀티코어 프로세서를 위한 가상 머신 스케줄링)

  • Kim, Shin-gyu;Choi, Chanho;Eom, Hyeonsang;Yeom, Heon Y.
    • Proceedings of the Korea Information Processing Society Conference / pp.134-136 / 2012
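  • As the cloud computing service market grows, service providers face several problems, such as reducing power consumption and guaranteeing service levels. One cause of these problems is the virtual-machine-based server consolidation policy introduced to improve resource efficiency. Because current virtual machine technologies do not yet provide perfect isolation, virtual machines placed on the same node share resources and interfere with one another. In this study, we propose a method that maximizes performance by minimizing interference in the processor's last-level cache (LLC), one of the resources shared among virtual machines.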

I/O Traffic based Task Classification for Shared Last Level Cache Utilization in NUMA Systems (NUMA 시스템의 공유 LLC 활용을 위한 I/O 트래픽에 따른 태스크 분류법)

  • An, Deukhyeon;Kim, Jihong;Eom, Young Ik
    • Proceedings of the Korea Information Processing Society Conference / pp.199-201 / 2012
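  • I/O traffic generated by I/O devices such as disks and Ethernet causes cache pollution in the shared LLC of NUMA systems with multiple nodes, preventing cache lines from being reused. Tasks that generate such traffic need to be handled separately from memory-intensive tasks that can use the cache efficiently. In this paper, we propose a technique that monitors and classifies, in real time, the tasks that cause this cache pollution by using each task's I/O traffic. We also examine the characteristics of tasks that generate large amounts of I/O traffic. This lays the groundwork for studying operating system scheduling techniques that use the shared LLC of each node more efficiently in a NUMA system environment.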
