• Title, Summary, Keyword: string pattern matching (스트링 패턴 매칭)


Development of the Pattern Matching Engine using Regular Expression (정규 표현식을 이용한 패턴 매칭 엔진 개발)

  • Ko, Kwang-Man;Park, Hong-Jin
    • The Journal of the Korea Contents Association
    • /
    • v.8 no.2
    • /
    • pp.33-40
    • /
    • 2008
  • String pattern matching algorithms have proven fast at searching for specific queries and keywords, but existing algorithms are limited in the variety of patterns they can handle. In this paper, regular expressions are used to improve the efficiency of pattern matching across varied query patterns, including specific keywords. This approach makes it possible to search for varied harmful string patterns more efficiently than simple keyword matching, and it also achieves good matching speed compared with existing algorithms. In our experiments, the proposed string search engine generated with LEX searched faster than the BM and AC algorithms when there were more than 1000 patterns, but gave similar results for plain keyword pattern matching.
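The idea of compiling many patterns into one matcher can be sketched as follows. This is only an illustration, not the paper's LEX-generated engine: it uses Python's `re` module (a backtracking engine, unlike the automaton LEX produces), and the pattern list and the helper names `build_matcher` and `find_all` are hypothetical.

```python
import re

def build_matcher(patterns):
    """Combine many regex patterns into one compiled alternation."""
    # Wrap each pattern in its own named group so a hit can be traced
    # back to the pattern that produced it.
    parts = [f"(?P<p{i}>{pat})" for i, pat in enumerate(patterns)]
    return re.compile("|".join(parts))

def find_all(matcher, text):
    """Return (pattern_index, matched_text, start_offset) for each hit."""
    hits = []
    for m in matcher.finditer(text):
        idx = int(m.lastgroup[1:])  # group name is "p<index>"
        hits.append((idx, m.group(), m.start()))
    return hits

# Hypothetical "harmful string" patterns, including spelling variants
# that a plain keyword list would need several entries to cover.
patterns = [r"v[i1]rus", r"w[o0]rm\d*", r"trojan"]
matcher = build_matcher(patterns)
text = "a v1rus and a w0rm42, no trojan here"
hits = find_all(matcher, text)
```

A single scan over `text` reports every pattern that occurs, which is the property the abstract contrasts with matching simple keywords one at a time.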

Retargetable Intermediate Code Optimization System Using Tree Pattern Matching Techniques (트리패턴매칭기법의 재목적 가능한 중간코드 최적화 시스템)

  • Kim, Jeong-Suk;O, Se-Man
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.8
    • /
    • pp.2253-2261
    • /
    • 1999
  • ACK generates optimized code using string pattern matching in its pattern table generator and peephole optimizer. However, string pattern matching is inefficient because pattern selection requires many comparisons. We designed and implemented an EM intermediate code optimizer based on a tree pattern matching algorithm, consisting of an EM tree generator, an optimization pattern table generator, and a tree pattern matcher. The tree pattern matching algorithm traverses the EM tree top-down and, at each node, matches patterns rooted at that node by consulting the pattern table. Compared with ACK's string pattern matching method, the approach improved pattern selection time by about 10.8%.
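A minimal sketch of root-centered, top-down tree pattern matching against a pattern table is shown below. The operator names (`add`, `mul`, `const0`, `const1`), the wildcard convention, and the table layout are assumptions made for illustration; they are not actual EM instructions or the paper's table format.

```python
from dataclasses import dataclass, field

WILDCARD = "_"  # hypothetical marker: matches any subtree

@dataclass
class Node:
    op: str
    kids: list = field(default_factory=list)

def matches(pattern, tree):
    """True if `pattern` matches the subtree rooted at `tree`."""
    if pattern.op == WILDCARD:
        return True
    if pattern.op != tree.op or len(pattern.kids) != len(tree.kids):
        return False
    return all(matches(p, t) for p, t in zip(pattern.kids, tree.kids))

# Pattern table indexed by root operator, so only patterns whose root
# agrees with the current node are ever compared.
PATTERN_TABLE = {
    "add": [("add_zero", Node("add", [Node(WILDCARD), Node("const0")]))],
    "mul": [("mul_one", Node("mul", [Node(WILDCARD), Node("const1")]))],
}

def find_matches(tree, path=()):
    """Traverse top-down; return (pattern_name, path_to_node) pairs."""
    found = []
    for name, pat in PATTERN_TABLE.get(tree.op, []):
        if matches(pat, tree):
            found.append((name, path))
    for i, kid in enumerate(tree.kids):
        found.extend(find_matches(kid, path + (i,)))
    return found

# (x + 0) * 1
tree = Node("mul", [Node("add", [Node("x"), Node("const0")]), Node("const1")])
result = find_matches(tree)
```

Indexing the table by root operator is one way to cut down the comparisons that the abstract identifies as the cost of string-based pattern selection.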


A Fast String Matching Scheme without using Buffer for Linux Netfilter based Internet Worm Detection (리눅스 넷필터 기반의 인터넷 웜 탐지에서 버퍼를 이용하지 않는 빠른 스트링 매칭 방법)

  • Kwak, Hu-Keun;Chung, Kyu-Sik
    • The KIPS Transactions:PartC
    • /
    • v.13C no.7
    • /
    • pp.821-830
    • /
    • 2006
  • As internet worms spread worldwide, detecting and filtering them has become one of the pressing issues in internet security. One way to implement worm detection is to use the Linux Netfilter kernel module. Its basic operation for worm detection is string matching, in which incoming packets on the network are compared with predefined worm signatures (patterns). A worm can appear within a single packet or across two (or more) successive packets, with part of the worm in the first packet and the remainder in the following packet(s). Assuming that the maximum length of a worm pattern is less than 1024 bytes, string matching must be performed over up to two successive packets, or 2048 bytes. To do so, Linux Netfilter keeps the previous packet in a buffer and performs matching on the combined 2048-byte string of the buffered packet and the current packet. As the number of concurrent connections handled by the worm detection system increases, the total buffer (memory) size grows and string matching slows down. In this paper, to reduce the memory buffer size and speed up string matching, we propose a string matching scheme that does not use a buffer. The proposed scheme keeps the partial matching result of the previous packet against the signatures instead of buffering the previous packet. The partial matching information is then used to detect a worm that spans two successive packets. We implemented the proposed scheme by modifying Linux Netfilter and compared the modified module with the original one. Experimental results show that the proposed scheme uses 25% less memory and is 54% faster than the original scheme.
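Keeping only a partial matching result between packets can be illustrated with a streaming matcher. The sketch below uses the Knuth-Morris-Pratt automaton as a stand-in, since the abstract does not say which matching algorithm the modified Netfilter module uses; the class name `StreamMatcher`, the signature bytes, and the single-signature setup are assumptions.

```python
def build_failure(pattern):
    """KMP failure table: longest proper prefix that is also a suffix."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail

class StreamMatcher:
    """Matches one non-empty signature across packet boundaries.

    Only `state` (how many signature bytes are currently matched) is
    carried from one packet to the next; no packet payload is stored.
    """

    def __init__(self, signature: bytes):
        self.sig = signature
        self.fail = build_failure(signature)
        self.state = 0

    def feed(self, packet: bytes) -> bool:
        """Consume one packet; return True if the signature completed."""
        found = False
        for b in packet:
            while self.state and b != self.sig[self.state]:
                self.state = self.fail[self.state - 1]
            if b == self.sig[self.state]:
                self.state += 1
            if self.state == len(self.sig):
                found = True
                self.state = self.fail[self.state - 1]
        return found

m = StreamMatcher(b"WORMSIG")   # hypothetical signature
first = m.feed(b"xxxxWOR")      # signature starts at the end of packet 1
second = m.feed(b"MSIGyyy")     # and finishes at the start of packet 2
```

Per connection, the carried state is a single integer per signature rather than a copy of the previous packet, which is the kind of memory saving the abstract describes.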

An Effective Algorithm for Checking Subsumption Relation on String Data Containing Wildcard Characters (와일드카드 문자를 포함하는 스트링 데이터 사이의 포함관계 확인을 위한 효율적인 알고리즘)

  • Kim, Do-Han;Park, Hee-Jin;Paek, Eun-Ok
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.32 no.9
    • /
    • pp.475-482
    • /
    • 2005
  • String data containing wildcard characters may represent certain patterns in texts. A subsumption relation between two patterns can be defined as a subset relation between the sets of strings that match those patterns. Checking the subsumption relation is therefore important for determining whether each pattern represents a set of strings that does not overlap with that of another pattern. In this paper, we propose an effective algorithm that determines the subsumption relation between strings with wildcard characters. First, we consider a simple extension of the suffix tree algorithm so that it may include wildcard characters, and then we propose another method that checks the subsumption relation by dividing the suffix tree structure at each location of the string data.
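The subset-based definition of subsumption can be made concrete with a deliberately simple check. This is not the suffix-tree algorithm the paper proposes: it is a position-by-position comparison that assumes a single wildcard symbol `?` matching exactly one character, an alphabet with at least two symbols, and no literal `?` in the data.

```python
WILDCARD = "?"  # assumed: matches exactly one arbitrary character

def subsumes(general: str, specific: str) -> bool:
    """True if every string matching `specific` also matches `general`."""
    if len(general) != len(specific):
        return False
    # A literal in `general` must face the same literal in `specific`;
    # a wildcard in `general` accepts anything at that position.
    return all(g == WILDCARD or g == s for g, s in zip(general, specific))

def overlaps(a: str, b: str) -> bool:
    """True if at least one string matches both patterns."""
    if len(a) != len(b):
        return False
    return all(x == WILDCARD or y == WILDCARD or x == y
               for x, y in zip(a, b))

r1 = subsumes("a?c?", "abc?")   # the first pattern is more general
r2 = subsumes("abc?", "a?c?")   # the reverse does not hold
r3 = overlaps("a?c", "ab?")     # both match "abc"
r4 = overlaps("a?c", "b?c")     # first characters conflict
```

Under these assumptions the check is linear in the pattern length; wildcards that match variable-length substrings would need a different method.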

Code Optimization Using DFA (DFA 를 이용한 코드 최적화)

  • Yun, Sung-Lim;Oh, Se-Man
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • /
    • pp.525-528
    • /
    • 2005
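  • In the optimization phase of compiling a source program, various optimization techniques are applied to improve the program's execution speed and reduce code size. In particular, peephole optimization identifies inefficient instruction sequences and replaces consecutive instructions with semantically equivalent but more efficient code. Among optimization pattern matching methods, string pattern matching, which searches for the best pattern corresponding to the intermediate code, is inefficient because of its excessive pattern search time, while tree pattern matching can incur redundant comparisons when determining a pattern and has a high cost for building the code tree. To overcome the drawbacks of these existing methods, this paper proposes a code optimization method using a DFA (Deterministic Finite Automata). Because the patterns are organized as an automaton, the method costs less than other pattern matching techniques, reduces pattern selection cost because the automaton determines the pattern deterministically, and shortens the optimization pattern search time.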
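A table-driven peephole pass of this kind can be sketched with a small automaton built from instruction-sequence patterns. The opcodes and rewrite rules below are hypothetical, and the automaton is a simple trie-shaped DFA; the paper's actual table construction is not described in the abstract.

```python
def build_dfa(patterns):
    """Build a trie-shaped DFA from {opcode_sequence: replacement}."""
    transitions = [{}]   # state -> {opcode: next_state}; state 0 is start
    accepting = {}       # state -> (pattern_length, replacement)
    for seq, repl in patterns.items():
        state = 0
        for op in seq:
            if op not in transitions[state]:
                transitions.append({})
                transitions[state][op] = len(transitions) - 1
            state = transitions[state][op]
        accepting[state] = (len(seq), repl)
    return transitions, accepting

def optimize(code, transitions, accepting):
    """Single left-to-right pass replacing the longest match at each point."""
    out, i = [], 0
    while i < len(code):
        state, best, j = 0, None, i
        # Each instruction causes exactly one table lookup per step.
        while j < len(code) and code[j] in transitions[state]:
            state = transitions[state][code[j]]
            j += 1
            if state in accepting:
                best = accepting[state]
        if best is not None:
            length, repl = best
            out.extend(repl)
            i += length
        else:
            out.append(code[i])
            i += 1
    return out

# Hypothetical peephole rules: each sequence is replaced by nothing.
rules = {("push", "pop"): (), ("neg", "neg"): ()}
transitions, accepting = build_dfa(rules)
code = ["load_x", "push", "pop", "neg", "neg", "store_y"]
optimized = optimize(code, transitions, accepting)
```

Because the next state is determined by one lookup per instruction, no pattern is compared against the code more than once at a given position, which matches the abstract's point about deterministic pattern selection.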


Flexible Pattern Alignment Problem (연성 패턴 정렬 문제)

  • 서진택;김삼묘
    • Proceedings of the Korean Information Science Society Conference
    • /
    • /
    • pp.655-657
    • /
    • 1999
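  • This paper defines the so-called 1D-2D flexible alignment problem, in which a one-dimensional string is flexibly aligned with a two-dimensional text, presents a dynamic algorithm for the problem, and shows an example application. The problem differs from the pattern matching problems studied so far in that the pattern has a given length but a flexible shape that can be deformed.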


Code Optimization Using Pattern Table (패턴 테이블을 이용한 코드 최적화)

  • Yun Sung-Lim;Oh Se-Man
    • Journal of Korea Multimedia Society
    • /
    • v.8 no.11
    • /
    • pp.1556-1564
    • /
    • 2005
  • Various optimization techniques are applied during the compilation of a source program to improve the program's execution speed and reduce code size. Among optimization pattern matching techniques, string pattern matching finds an optimal pattern corresponding to the intermediate code, but it is inefficient because of the excessive time required for the optimization pattern search. Tree pattern matching can result in many redundant comparisons during pattern determination and also incurs a high cost for constructing the code tree. This paper proposes a table-driven code optimizer that uses a DFA (Deterministic Finite Automata) optimization table to overcome the shortcomings of existing optimization techniques. Because the optimizer is built on a deterministic automaton that settles the final pattern, it reduces the pattern selection cost and speeds up the pattern search.


Design and Implementation of Intermediate Code Translator using String Pattern Matching Technique (스트링 패턴 매칭 기법을 이용한 중간 코드 변환기의 설계 및 구현)

  • 고광만
    • Journal of Internet Computing and Services
    • /
    • v.3 no.3
    • /
    • pp.1-9
    • /
    • 2002
  • Various studies have investigated translating bytecode into target machine code that can run on a specific processor, using classical compilation methods, to improve the execution speed of the Java language. Code generation techniques based on pattern matching can produce higher-quality code than code expansion techniques. In this research, we provide a standardized pattern description method and pattern matching techniques for generating a register-based intermediate language, which supports effective native code generation from bytecode. We also designed and implemented an intermediate code translator that generates high-quality register-based intermediate code using the standardized pattern descriptions.


A Traffic Pattern Matching Hardware for a Contents Security System (콘텐츠 보안 시스템용 트래픽 패턴 매칭 하드웨어)

  • Choi, Young;Hong, Eun-Kyung;Kim, Tae-Wan;Paek, Seung-Tae;Choi, Il-Hoon;Oh, Hyeong-Cheol
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.46 no.1
    • /
    • pp.88-95
    • /
    • 2009
  • This paper presents traffic pattern matching hardware that can be used in high-performance network applications. The hardware is designed for a contents security system that blocks various kinds of information leakage and intrusion activities. It consists of two parts: header lookup and string pattern matching. TCAMs (ternary CAMs) are widely used to implement header lookup in hardware. However, since the TCAM approach is inefficient in terms of hardware and memory costs and power consumption, we adopt and modify an alternative approach based on comparator arrays and the HiCuts tree. Our implementation results, using a Xilinx FPGA XC4VSX55, show that our design can reduce the usage of FPGA slices by about 26% and Block RAM by about 58%. For the string pattern matching part, we design and use a hashing module based on cellular automata, which is hardware efficient and consumes less power by adaptively changing its configuration to reduce collision rates.

Concept based Image Retrieval Using Similarity Measurement Between Concepts (개념간 유사성 측정을 이용한 개념 기반 이미지 검색)

  • 조미영;최춘호;신주현;김판구
    • Proceedings of the Korean Information Science Society Conference
    • /
    • /
    • pp.253-255
    • /
    • 2003
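  • Conventional concept-based image retrieval has generally used lexical or textual information to recognize the semantic content of images. Such text-based image retrieval is easy to implement because it reuses keyword search, a traditional retrieval technique, but because it performs string matching rather than conceptual matching of text, an image cannot be found unless the query exactly matches an annotated word. In this paper, we address this pattern matching problem by measuring the similarity between concepts using WordNet, a kind of ontology, taking into account depth, information content, link type, density, and so on. We also applied the similarity measure to practical concept-based image retrieval, using keyword-annotated images from Microsoft's Design Gallery Live.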
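The difference between exact string matching and concept similarity can be illustrated with a toy hypernym tree. The sketch below does not use WordNet and does not reproduce the paper's measure: it uses only depth (a Wu-Palmer style score), while the abstract also mentions information content, link type, and density. The taxonomy, file names, and function names are hypothetical.

```python
# Hypothetical toy taxonomy: child -> parent (None marks the root).
HYPERNYM = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "puppy": "dog",
    "car": "vehicle",
}

def path_to_root(concept):
    """Return the chain [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = HYPERNYM[concept]
    return path

def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))

def similarity(a, b):
    """Depth-based score: 2 * depth(common ancestor) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b))
    # First ancestor of `a` (starting from `a` itself) shared with `b`.
    common = next(c for c in path_to_root(a) if c in ancestors_b)
    return 2 * depth(common) / (depth(a) + depth(b))

def rank_images(query, annotations):
    """Order image names by similarity of their annotation to the query."""
    return sorted(annotations,
                  key=lambda name: similarity(query, annotations[name]),
                  reverse=True)

annotations = {"img1.jpg": "car", "img2.jpg": "dog", "img3.jpg": "cat"}
ranking = rank_images("puppy", annotations)
```

The query `puppy` matches none of the annotations as a string, yet the image annotated `dog` ranks first, which is the behavior exact keyword matching cannot provide.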
