• Title/Summary/Keyword: Structural SVM


Automatic Korean Word Spacing using Structural SVM (Structural SVM을 이용한 한국어 자동 띄어쓰기)

  • Lee, Chang-Ki;Kim, Hyun-Ki
    • Proceedings of the Korean Information Science Society Conference / 2012.06b / pp.270-272 / 2012
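  • This paper proposes a Korean word spacing method based on a structural SVM for Korean sentences in which spacing has been completely ignored. The structural SVM extends the conventional binary-classification SVM so that it can be applied to problems such as sequence labeling, and it performs comparably to or better than CRFs, which are known to perform very well in this area. We used about 26 million eojeols (space-delimited word units) of raw text from the Sejong corpus as training data and about 290,000 eojeols from the ETRI part-of-speech tagged corpus as evaluation data. In the evaluation, syllable-level accuracy was 99.01% and eojeol-level accuracy was 95.47%.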
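
The abstract does not spell out its label scheme, so the following is only a minimal sketch of one common way to cast word spacing as syllable-level sequence labeling: each syllable gets label 1 if a space follows it and 0 otherwise. The function names and the example sentence are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple


def spacing_to_labels(sentence: str) -> Tuple[str, List[int]]:
    """Convert a correctly spaced sentence into (unspaced text, labels).

    labels[i] is 1 if a space follows syllable i in the original, else 0.
    """
    chars: List[str] = []
    labels: List[int] = []
    for ch in sentence.strip():
        if ch == " ":
            if labels:
                labels[-1] = 1  # previous syllable is followed by a space
        else:
            chars.append(ch)
            labels.append(0)
    return "".join(chars), labels


def labels_to_spacing(syllables: str, labels: List[int]) -> str:
    """Rebuild a spaced sentence from unspaced text and predicted labels."""
    out: List[str] = []
    for ch, lab in zip(syllables, labels):
        out.append(ch)
        if lab == 1:
            out.append(" ")
    return "".join(out).rstrip()


def syllable_accuracy(gold: List[int], pred: List[int]) -> float:
    """Fraction of syllables whose spacing label matches the gold label."""
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return correct / len(gold)


text, gold = spacing_to_labels("나는 학교에 간다")
restored = labels_to_spacing(text, gold)
```

A sequence labeler would be trained to predict `gold` from `text`; syllable-level accuracy then compares predicted and gold labels position by position.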

Sentiment Analysis using Latent Structural SVM (잠재 구조적 SVM을 활용한 감성 분석기)

  • Yang, Seung-Won;Lee, Changki
    • KIISE Transactions on Computing Practices / v.22 no.5 / pp.240-245 / 2016
  • In this study, we analyzed the sentiment expressed in comments on restaurants, movies, and mobile devices, as well as in tweets that are not tied to a specific domain. We propose a system that extracts objects (or aspects) and opinion words from each sentence and then evaluates them. For the sentiment analysis, we compared the Structural SVM algorithm with the Latent Structural SVM. The latter performed better and was able to extract objects/aspects and opinion words using the VPs/NPs identified in the dependency parse tree. Finally, we developed and evaluated a sentiment detector model for use in practical services.

Modified Fixed-Threshold SMO for 1-Slack Structural SVMs

  • Lee, Chang-Ki;Jang, Myung-Gil
    • ETRI Journal / v.32 no.1 / pp.120-128 / 2010
  • In this paper, we describe a modified fixed-threshold sequential minimal optimization (FSMO) for 1-slack structural support vector machine (SVM) problems. Because the formulation of 1-slack structural SVMs has no bias term, the modified FSMO can break the quadratic programming (QP) problem of a 1-slack structural SVM down into a series of the smallest possible QP subproblems, each involving only one variable. On various test sets, the modified FSMO is as accurate as existing structural SVM implementations (n-slack and 1-slack SVM-struct) but is faster on large data sets.
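
For orientation, the 1-slack structural SVM with margin rescaling is commonly written as below. This is the standard formulation from the structural SVM literature, not a quotation from this paper; the symbols $\Psi$ (joint feature map), $\Delta$ (loss), and $C$ (regularization constant) follow the usual convention and are assumptions here.

```latex
\begin{aligned}
\min_{\mathbf{w},\; \xi \ge 0} \quad & \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2} + C\,\xi \\
\text{s.t.} \quad & \forall (\bar{y}_1, \dots, \bar{y}_n) \in \mathcal{Y}^{n}: \\
& \frac{1}{n}\, \mathbf{w}^{\top} \sum_{i=1}^{n}
  \bigl[ \Psi(x_i, y_i) - \Psi(x_i, \bar{y}_i) \bigr]
  \;\ge\; \frac{1}{n} \sum_{i=1}^{n} \Delta(y_i, \bar{y}_i) - \xi
\end{aligned}
```

The objective contains only $\mathbf{w}$ and a single slack variable $\xi$, with no bias term $b$; this is the property the abstract says the modified FSMO relies on.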

Jointly Learning Model using modified Latent Structural SVM (Latent Structural SVM을 확장한 결합 학습 모델)

  • Lee, Changki
    • Annual Conference on Human and Language Technology / 2013.10a / pp.70-73 / 2013
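  • In natural language processing, many modules are connected in a pipeline, but this approach has two drawbacks: errors from earlier stages accumulate in later stages, and earlier stages cannot use information from later stages. This paper extends the general joint learning method used to address these pipeline problems and proposes, by extending the Latent Structural SVM, a joint learning model that can be trained simultaneously on data tagged for both tasks and on data tagged for only one task. In experiments, the part-of-speech tagging performance of the existing joint model for Korean word spacing and part-of-speech tagging was 96.99%, whereas additionally training the proposed joint learning model on a large Korean word spacing data set improved part-of-speech tagging performance to 97.20%.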


Prediction of unmeasured mode shapes and structural damage detection using least squares support vector machine

  • Kourehli, Seyed Sina
    • Structural Monitoring and Maintenance / v.5 no.3 / pp.379-390 / 2018
  • In this paper, a novel and effective damage diagnosis algorithm is proposed to detect and estimate damage using a two-stage least squares support vector machine (LS-SVM) and a limited number of sensors attached to structures. In the first stage, LS-SVM1 is used to predict the unmeasured mode shape data from limited measured modal data, and in the second stage, LS-SVM2 is used to predict the damage location and severity from the complete modal data produced by the first-stage LS-SVM1. The presented methods are applied to a three-story irregular frame and a cantilever plate. To investigate the effects of noise and modeling errors, two uncertainty levels are considered. Moreover, the performance of the proposed methods is verified using experimental modal data of a mass-stiffness system. The damage identification results show that the proposed method performs suitably for structures despite the different uncertainty levels.

Structural health monitoring data reconstruction of a concrete cable-stayed bridge based on wavelet multi-resolution analysis and support vector machine

  • Ye, X.W.;Su, Y.H.;Xi, P.S.;Liu, H.
    • Computers and Concrete / v.20 no.5 / pp.555-562 / 2017
  • The accuracy and integrity of stress data acquired by a bridge health monitoring system are of significant importance for bridge safety assessment. However, missing and abnormal data inevitably exist in a realistic monitoring system. This paper presents a data reconstruction approach for bridge health monitoring based on wavelet multi-resolution analysis and the support vector machine (SVM). The proposed method has been applied to data imputation using data recorded by the structural health monitoring (SHM) system instrumented on a prestressed concrete cable-stayed bridge. The effectiveness and accuracy of the proposed wavelet-based SVM prediction method are examined by comparing its prediction errors with those of the traditional autoregressive moving average (ARMA) method and of an SVM prediction method without wavelet multi-resolution analysis. Data reconstruction based on 5-day and 1-day continuous stress history data with obvious preternatural signals is performed to examine the effect of sample size on reconstruction accuracy. The results indicate that the proposed data reconstruction approach based on wavelet multi-resolution analysis and SVM is an effective tool for missing data imputation or preternatural signal replacement, which can serve as a solid foundation for accurately evaluating the safety of bridge structures.

Korean Semantic Role Labeling Using Structured SVM (Structural SVM 기반의 한국어 의미역 결정)

  • Lee, Changki;Lim, Soojong;Kim, Hyunki
    • Journal of KIISE / v.42 no.2 / pp.220-226 / 2015
  • Semantic role labeling (SRL) systems determine the semantic role labels of the arguments of predicates in natural language text. An SRL system usually needs to perform four tasks in sequence: Predicate Identification (PI), Predicate Classification (PC), Argument Identification (AI), and Argument Classification (AC). In this paper, we use the Korean Propbank to develop our Korean semantic role labeling system. We describe our Korean semantic role labeling system that uses sequence labeling with structured Support Vector Machine (SVM). The results of our experiments on the Korean Propbank dataset reveal that our method obtains a 97.13% F1 score on Predicate Identification and Classification (PIC), and a 76.96% F1 score on Argument Identification and Classification (AIC).

Named Entity Recognition with Structural SVMs and Pegasos algorithm (Structural SVMs 및 Pegasos 알고리즘을 이용한 한국어 개체명 인식)

  • Lee, Changki;Jang, Myungil
    • Annual Conference on Human and Language Technology / 2010.10a / pp.100-104 / 2010
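  • Named entity recognition is a step in information extraction and is very useful not only in information retrieval but also in question answering and summarization. This paper describes a Korean named entity recognition system that uses structural Support Vector Machines (structural SVMs) and a modified Pegasos algorithm, and compares its performance with that of an existing system based on Conditional Random Fields (CRFs). In experiments, both structural SVMs and the modified Pegasos algorithm outperformed the existing CRFs (statistically significant at the 99% confidence level), while the difference in performance between structural SVMs and the modified Pegasos algorithm was small (not statistically significant). In particular, with the modified Pegasos algorithm proposed in this paper, training time was reduced to 4% while maintaining higher performance than the CRF-based system (TV domain F1=85.43, sports domain F1=86.79).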
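
The modified Pegasos algorithm itself is not described in this abstract. For orientation only, here is a minimal sketch of the standard Pegasos stochastic subgradient update for a binary linear SVM, which is simpler than the structural setting used in the paper. The function names, the toy data, and the use of shuffled epochs instead of uniform random sampling are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (feature vector, label in {-1, +1})


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wj * xj for wj, xj in zip(w, x))


def pegasos_train(data: List[Example], lam: float = 0.01,
                  epochs: int = 20, seed: int = 0) -> List[float]:
    """Standard Pegasos: step size 1/(lam * t), hinge-loss subgradient."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * dot(w, x)  # computed with the weights before the step
            # Subgradient of the regularizer: shrink the weights.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Subgradient of the hinge loss, active only inside the margin.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


def predict(w: Sequence[float], x: Sequence[float]) -> int:
    return 1 if dot(w, x) >= 0.0 else -1


# Toy, linearly separable data (hypothetical).
toy = [((2.0, 2.0), 1), ((3.0, 1.5), 1), ((2.5, 3.0), 1),
       ((-2.0, -1.0), -1), ((-1.5, -2.5), -1), ((-3.0, -2.0), -1)]
weights = pegasos_train(toy)
```

In a structural SVM the same kind of step is taken, but the hinge-loss subgradient is computed from the most violated output sequence rather than from a single binary label.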


A Multi-Objective TRIBES/OC-SVM Approach for the Extraction of Areas of Interest from Satellite Images

  • Benhabib, Wafaa;Fizazi, Hadria
    • Journal of Information Processing Systems / v.13 no.2 / pp.321-339 / 2017
  • In this work, we address the extraction of areas of interest from satellite images by introducing a MO-TRIBES/OC-SVM approach. The One-Class Support Vector Machine (OC-SVM) is based on the estimation of a support that includes the training data. It identifies areas of interest without including other classes from the scene. We propose generating optimal training data using the Multi-Objective TRIBES (MO-TRIBES) to improve the performance of the OC-SVM. MO-TRIBES is a parameter-free optimization technique that manages the search space in tribes composed of agents. It makes different behavioral and structural adaptations to minimize the false positive and false negative rates of the OC-SVM. We have applied our proposed approach to the extraction of earthquakes and urban areas. The experimental results and comparisons with different state-of-the-art classifiers confirm the efficiency and robustness of the proposed approach.

Restoring Omitted Sentence Constituents in Encyclopedia Documents Using Structural SVM (Structural SVM을 이용한 백과사전 문서 내 생략 문장성분 복원)

  • Hwang, Min-Kook;Kim, Youngtae;Ra, Dongyul;Lim, Soojong;Kim, Hyunki
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.131-150 / 2015
  • Omission of noun phrases for obligatory cases is a common phenomenon in Korean and Japanese sentences, but it is not observed in English. When an argument of a predicate can be filled with a noun phrase co-referential with the title, the argument is more easily omitted in encyclopedia texts. The omitted noun phrase is called a zero anaphor or zero pronoun. Encyclopedias like Wikipedia are a major source for information extraction by intelligent application systems such as information retrieval and question answering systems. However, omission of noun phrases degrades the quality of information extraction. This paper deals with the problem of developing a system that can restore omitted noun phrases in encyclopedia documents. The problem that our system deals with is very similar to zero anaphora resolution, which is one of the important problems in natural language processing. A noun phrase existing in the text that can be used for restoration is called an antecedent. An antecedent must be co-referential with the zero anaphor. While the candidates for the antecedent are only noun phrases in the same text in the case of zero anaphora resolution, the title is also a candidate in our problem. In our system, the first stage is in charge of detecting the zero anaphor. In the second stage, antecedent search is carried out by considering the candidates. If antecedent search fails, an attempt is made, in the third stage, to use the title as the antecedent. The main characteristic of our system is its use of a structural SVM for finding the antecedent. The noun phrases in the text that appear before the position of the zero anaphor comprise the search space. The main technique used in previous work is to perform binary classification for all the noun phrases in the search space. The noun phrase classified as an antecedent with the highest confidence is selected as the antecedent. However, we propose in this paper that antecedent search be viewed as the problem of assigning antecedent indicator labels to a sequence of noun phrases. In other words, sequence labeling is employed in antecedent search in the text. We are the first to suggest this idea. To perform sequence labeling, we suggest using a structural SVM that receives a sequence of noun phrases as input and returns the sequence of labels as output. An output label takes one of two values: one indicating that the corresponding noun phrase is the antecedent and the other indicating that it is not. The structural SVM we used is based on the modified Pegasos algorithm, which uses subgradient descent for optimization. To train and test our system, we selected a set of Wikipedia texts and constructed an annotated corpus that provides gold-standard answers such as zero anaphors and their possible antecedents. Training examples are prepared from the annotated corpus and used to train the SVMs and test the system. For zero anaphor detection, sentences are parsed by a syntactic analyzer and omitted subject or object cases are identified. Thus the performance of our system depends on that of the syntactic analyzer, which is a limitation of our system. When an antecedent is not found in the text, our system tries to use the title to restore the zero anaphor. This step is based on binary classification using a regular SVM. The experiment showed that our system's performance is F1 = 68.58%. This means that a state-of-the-art system can be developed with our technique. Future work that enables the system to utilize semantic information is expected to lead to a significant performance improvement.
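
To illustrate how labeling a sequence jointly differs from classifying each noun phrase independently, here is a minimal Viterbi decoding sketch over two labels (0 = not the antecedent, 1 = antecedent). The scores are invented for illustration and the function name is hypothetical; the abstract does not give the paper's features, scores, or decoder.

```python
from typing import List, Sequence


def viterbi(emission: Sequence[Sequence[float]],
            transition: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence.

    emission[t][k]   : score of label k at position t
    transition[j][k] : score of moving from label j to label k
    """
    if not emission:
        return []
    num_labels = len(emission[0])
    best = [list(emission[0])]          # best[t][k]: best score ending in k at t
    back: List[List[int]] = []          # back-pointers for positions 1..n-1
    for t in range(1, len(emission)):
        row, ptr = [], []
        for cur in range(num_labels):
            prev = max(range(num_labels),
                       key=lambda p: best[t - 1][p] + transition[p][cur])
            row.append(best[t - 1][prev] + transition[prev][cur] + emission[t][cur])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    last = max(range(num_labels), key=lambda k: best[-1][k])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for four candidate noun phrases.
emission = [[1.0, 0.2], [0.5, 0.9], [0.4, 1.0], [1.2, 0.1]]
# Penalize labeling two consecutive noun phrases as the antecedent.
transition = [[0.0, 0.0], [0.0, -2.0]]

independent = [max(range(2), key=lambda k: e[k]) for e in emission]
joint = viterbi(emission, transition)
```

With these scores, independent per-phrase decisions mark both the second and third noun phrases as antecedents, while joint decoding keeps only the third, because the transition score discourages adjacent antecedent labels.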