• 제목/요약/키워드: Many-core

검색결과 1,559건 처리시간 0.027초

Deep Packet Inspection Time-Aware Load Balancer on Many-Core Processors for Fast Intrusion Detection

  • Choi, Yoon-Ho;Park, Woojin;Choi, Seok-Hwan;Seo, Seung-Woo
    • IEIE Transactions on Smart Processing and Computing
    • /
    • 제5권3호
    • /
    • pp.169-177
    • /
    • 2016
  • To realize high-speed intrusion detection by accommodating many regular expression (regex)-based signatures and growing network link capacities, we propose the Service TimE-Aware Load-balancing (STEAL) algorithm. This work is motivated from the observation that utilization of a many-core network intrusion detection system (NIDS) is influenced by unfair computational distribution among many-core NIDS nodes. To avoid such unfair computational distribution, STEAL is designed to dynamically distribute a large volume of traffic among many-core NIDS nodes based on packet service time, which is represented by the deep packet time in many-core NIDS nodes. From experiments, we show that compared to the commonly used load-balancing algorithm based on arrival rate, STEAL increases the number of received packets (i.e., decreases the number of dropped packets) in many-core NIDS. Specifically, by integrating an open source NIDS (i.e. Bro) with STEAL, we show that even under attack-dominant traffic and with many signatures, STEAL can rapidly improve the performance of many-core NIDS to realize high-speed intrusion detection.

매니코어 프로세서를 이용한 벡터 기반 래스터화 알고리즘 구현 및 성능평가 (Implementation and Performance Evaluation of Vector based Rasterization Algorithm using a Many-Core Processor)

  • 손동구;김종면
    • 대한임베디드공학회논문지
    • /
    • 제8권2호
    • /
    • pp.87-93
    • /
    • 2013
  • In this paper, we implemented and evaluated the performance of a vector-based rasterization algorithm of 3D graphics using a SIMD-based many-core processor that consists of 4,096 processing elements. In addition, we compared the performance and efficiency of the rasterization algorithm using the many-core processor and commercial GPU (Graphics Processing Unit) system which consists of 7 GPUs and each of which have 512 cores. Experimental results showed that the SIMD-based many-core processor outperforms the commercial GPU system in terms of execution time (3.13x speedup), energy efficiency (17.5x better), and area efficiency (13.3x better). These results demonstrate that the SIMD-based many-core processor has potential as an embedded mobile processor.

매니코어 프로세서를 이용한 SIFT 알고리즘 병렬구현 및 성능분석 (Parallel Implementation and Performance Evaluation of the SIFT Algorithm Using a Many-Core Processor)

  • 김재영;손동구;김종면;전희성
    • 한국컴퓨터정보학회논문지
    • /
    • 제18권9호
    • /
    • pp.1-10
    • /
    • 2013
  • 본 논문에서는 대표적인 특징점 추출 알고리즘인 SIFT(Scale-Invariant Feature Transform)를 매니코어 프로세서를 이용하여 병렬 구현하고, 이를 실행 시간, 시스템 이용률, 에너지 효율 및 시스템 면적 효율 측면에서 분석하였다. 또한 기존의 고성능 CPU와 GPU(Graphics Processing Unit)와의 성능 비교를 통해 제안하는 매니코어의 잠재가능성을 입증하였다. 모의실험 결과, 매니코어를 이용한 SIFT 알고리즘 구현 결과는 기존의 OpenCV 구현 결과와 정확도면에서 동일하였고, 매니코어 구현은 고성능 CPU 및 GPU 구현보다 실행시간 측면에서 우수하였다. 또한 본 논문에서는 SIFT알고리즘의 옥타브 크기에 따른 에너지 효율 및 시스템 면적 효율을 분석하여 최적의 모델을 제시하였다.

New Thermal-Aware Voltage Island Formation for 3D Many-Core Processors

  • Hong, Hyejeong;Lim, Jaeil;Lim, Hyunyul;Kang, Sungho
    • ETRI Journal
    • /
    • 제37권1호
    • /
    • pp.118-127
    • /
    • 2015
  • The power consumption of 3D many-core processors can be reduced, and the power delivery of such processors can be improved by introducing voltage island (VI) design using on-chip voltage regulators. With the dramatic growth in the number of cores that are integrated in a processor, however, it is infeasible to adopt per-core VI design. We propose a 3D many-core processor architecture that consists of multiple voltage clusters, where each has a set of cores that share an on-chip voltage regulator. Based on the architecture, the steady state temperature is analyzed so that the thermal characteristic of each voltage cluster is known. In the voltage scaling and task scheduling stages, the thermal characteristics and communication between cores is considered. The consideration of the thermal characteristics enables the proposed VI formation to reduce the total energy consumption, peak temperature, and temperature gradients in 3D many-core processors.

SPEC 벤치마크 프로그램에 대한 매니코어 프로세서의 성능 연구 (A Performance Study on Many-core Processor Architectures with SPEC Benchmark Programs)

  • 이종복
    • 전기학회논문지
    • /
    • 제62권2호
    • /
    • pp.252-256
    • /
    • 2013
  • In order to overcome the complexity and performance limit problems of superscalar processors, the multi-core architecture has been prevalent recently. Usually, the number of cores mostly used for the multi-core processor architecture ranges from 2 to 16. However in the near future, more than 32-cores are likely to be utilized, which is called as many-core processor architecture. Using SPEC 2000 benchmarks as input, the trace-driven simulation has been performed for the 32 to 1024 many-core architectures extensively. For 1024-cores, the average performance scores 15.7 IPC, but the performance increase rate is saturated.

매니코어 프로세서 상에서 이산 웨이블릿 변환을 위한 성능 평가 및 분석 (Performance Evaluation and Analysis for Discrete Wavelet Transform on Many-Core Processors)

  • 박용훈;김종면
    • 대한임베디드공학회논문지
    • /
    • 제7권5호
    • /
    • pp.277-284
    • /
    • 2012
  • To meet the usage of discrete wavelet transform (DWT) on potable devices, this paper implements 2-level DWT using a reference many-core processor architecture and determine the optimal many-core processor. To explore the optimal many-core processor, we evaluate the impacts of a data-per-processing element ratio that is defined as the amount of data mapped directly to each processing element (PE) on system performance, energy efficiency, and area efficiency, respectively. This paper utilized five PE configurations (PEs=16, 64, 256, 1,024, and 4,096) that were implemented in 130nm CMOS technology with a 720MHz clock frequency. Experimental results indicated that maximum energy and area efficiencies were achieved at PEs=1,024. However, the system area must be limited 140mm2 and the power should not exceed 3 watts in order to implement 2-level DWT on portable devices. When we consider these restrictions, the most reasonable energy and area efficiencies were achieved at PEs=256.

효과적인 Backbone Core Tree(BCT)생성 알고리즘 (Efficient Backbone Core Tree Generation Algorithm)

  • 서현곤;김기형
    • 한국정보과학회:학술대회논문집
    • /
    • 한국정보과학회 2002년도 봄 학술발표논문집 Vol.29 No.1 (A)
    • /
    • pp.214-216
    • /
    • 2002
  • 본 논문에서는 many-to-many IP 멀티캐스팅을 위한 효율적인 Backbone Core Tree(BCT)생성 알고리즘에 대하여 제안한다. 본 논문의 제안기법은 Core Based Tree(CBT)에 기반을 두고 있다. CBT는 공유 트리를 이용하여 멀티캐스트 자료를 전달하기 때문에 Source Based Tree에 비하여 각 라우터가 유지해야 하는 상태 정보의 양에 적고 적용하기 간단하지만, Core 라우터 선택의 어려움과 트래픽이 Core로 집중되는 문제점을 가지고 있다. 이에 대한 보완책으로 Backbone Core Tree기법이 제안되었는데, 본 논문에서는 주어진 네트워크 위상 그래프에서 최소신장 트리를 만들고, 센트로이드를 이용하여 효율적인 BCT를 생성하는 알고리즘을 제안한다.

  • PDF

멀티캐스팅을 위한 BCT생성 알고리즘의 시뮬레이션 (A Simulation of BCT(Backbone Core Tree) Generation Algorithm for Multicasting)

  • 서현곤;김기형
    • 한국시뮬레이션학회:학술대회논문집
    • /
    • 한국시뮬레이션학회 2002년도 춘계학술대회논문집
    • /
    • pp.67-71
    • /
    • 2002
  • 본 논문에서는 many-to-many IP 멀티캐스팅을 위한 효율적인 BCT(Backbone Core Tree)생성 알고리즘의 시뮬레이션 방법에 대하여 제안한다. BCT는 기법은 CBT(Core Based Tree)에 기반을 두고 있다. CBT는 공유 트리를 이용하여 멀티캐스트 자료를 전달하기 때문에 Source based Tree에 비하여 각 라우터가 유지해야 하는 상태 정보의 양에 적고, 적용하기 간단하지만, Core 라우터 선택의 어려움과 트래픽이 Core로 집중되는 문제점을 가지고 있다. 이에 대한 보완책으로 BCT기법이 제안되었는데, 본 논문에서는 주어진 네트워크 위상 그래프에서 최소신장 트리를 만들고, 센트로이드(Centroid)를 이용하여 효율적인 BCT를 생성하는 알고리즘을 제안하고 시뮬레이션 방법을 제시한다.

  • PDF

휴대 장치용 기타 음 합성을 위한 매니코어 아키텍처의 디자인 공간 탐색 (Design Space Exploration of Many-Core Architecture for Sound Synthesis of Guitar on Portable Device)

  • 강명수;김종면
    • 한국컴퓨터정보학회:학술대회논문집
    • /
    • 한국컴퓨터정보학회 2014년도 제49차 동계학술대회논문집 22권1호
    • /
    • pp.1-4
    • /
    • 2014
  • Although physical modeling synthesis is becoming more and more efficient in rich and natural high-quality sound synthesis, its high computational complexity limits its use in portable devices. This constraint motivated research of single-instruction multiple-data many-core architectures that support the tremendous amount of computations by exploiting massive parallelism inherent in physical modeling synthesis. Since no general consensus has been reached which grain sizes of many-core processors and memories provide the most efficient operation for sound synthesis, design space exploration is conducted for seven processing element (PE) configurations. To find an optimal PE configuration, each PE configuration is evaluated in terms of execution time, area and energy efficiencies. Experimental results show that all PE configurations are satisfied with the system requirements to be implemented in portable devices.

  • PDF

메타데이터의 표준화 동향 : DCMI를 중심으로 (Current changes in standardization activities of Dublin Core Metadata Initiative)

  • 김태수
    • 지식정보인프라
    • /
    • 통권9호
    • /
    • pp.8-19
    • /
    • 2002
  • The Dublin Core element set has been developed over the past years as an open, consensus-building metadata from many user communities. Wider adoption in many countries made this metadata a major resource discovery standard on the internet. The version 1.1 of the Dublin Core element set was adopted as CEN Workshop Agreement 13874 in Europe, and also ratified under the auspices of the National Information Standards Organization in US as ANSI Standard Z39.85. This report summarizes the standardization activities that have taken place in the Dublin Core Metadata Initiative for the past years.

  • PDF