• Title/Summary/Keyword: Electronic Dictionary

Statistical Information of Korean Dictionary to Construct an Enormous Electronic Dictionary (대용량 전자사전 구축을 위한 국어 대사전의 통계 정보)

  • Kim, Cheol-Su;Kim, Yang-Beom
    • The Journal of the Korea Contents Association / v.7 no.6 / pp.60-68 / 2007
  • Language information processing has many application areas, such as information retrieval, morphological analysis, spell checking, voice recognition, and character recognition. In all of these areas, an electronic dictionary is essential. This paper studies basic statistical information on the Korean dictionary and on the construction of an electronic dictionary. The targets of analysis were the number of registered words in the Korean dictionary, the number of entries registered in the electronic dictionary, the number of syllables used, the number of different syllables, the average entry length, the distribution of parts of speech, and the number of nodes used to construct the electronic dictionary with a Trie, excluding words that contain archaic words or incomplete syllables. The electronic dictionary has 361,980 entries in total, uses 1,289,659 syllables, has an average entry length of 3.56, and contains 2,463 different syllables. This information should be useful in constructing electronic dictionaries and in processing Korean information.
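The statistics named in this abstract can be illustrated with a minimal sketch. It assumes that each entry is a string of precomposed Hangul syllables (one character per syllable) and that the Trie is a plain nested-dictionary structure; the function name `dictionary_stats` and the five sample words are hypothetical and do not come from the paper.

```python
def dictionary_stats(entries):
    """Return basic statistics for a collection of dictionary entries (strings)."""
    unique_entries = set(entries)
    if not unique_entries:
        raise ValueError("at least one entry is required")

    # Total syllables used and number of different syllables.
    total_syllables = sum(len(entry) for entry in unique_entries)
    distinct_syllables = len({ch for entry in unique_entries for ch in entry})

    # Build a trie as nested dicts; each newly created dict is one trie node.
    root = {}
    node_count = 0  # the root itself is not counted
    for entry in unique_entries:
        node = root
        for ch in entry:
            if ch not in node:
                node[ch] = {}
                node_count += 1
            node = node[ch]

    return {
        "entries": len(unique_entries),
        "syllables": total_syllables,
        "distinct_syllables": distinct_syllables,
        "average_length": total_syllables / len(unique_entries),
        "trie_nodes": node_count,
    }


# Hypothetical sample entries, for illustration only.
stats = dictionary_stats(["사전", "사과", "사과나무", "전자", "전자사전"])
print(stats)
```

The same relation holds for the reported figures: 1,289,659 syllables divided by 361,980 entries gives an average length of about 3.56. Shared prefixes are what keep the node count below the total syllable count.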

Component Implementation of Electronic Dictionary (전자사전 컴포넌트의 구현)

  • Choe, Seong-Un
    • The KIPS Transactions:PartD / v.8D no.5 / pp.587-592 / 2001
  • Many applications are being developed to automate office work, and the electronic dictionary (e-dictionary) is one of the main components of office suites. Several requirements are proposed for efficient e-dictionaries: 1) fast search time, 2) data compatibility with other e-dictionaries to deal with words and obsolete words, and 3) reusable components for developing new customized e-dictionaries with minimal development time and cost. We propose a data format with which any e-dictionary can exchange data with others. We also develop a System Dictionary component and a Customer Dictionary component to enable plug-and-play component reuse. Our e-dictionary achieves fast search time by efficiently managing Trie and B-tree index structures for the dictionary components.

Word Sense Disambiguation of Predicate using Sejong Electronic Dictionary and KorLex (세종 전자사전과 한국어 어휘의미망을 이용한 용언의 어의 중의성 해소)

  • Kang, Sangwook;Kim, Minho;Kwon, Hyuk-chul;Jeon, SungKyu;Oh, Juhyun
    • KIISE Transactions on Computing Practices / v.21 no.7 / pp.500-505 / 2015
  • The Sejong Electronic (machine-readable) Dictionary, developed under the 21st Century Sejong Plan, contains systematic intrinsic information on Korean words. It helps solve the problem of presenting a commonly used general-purpose text dictionary in electronic form. Word sense disambiguation problems can also be solved using the specific information available in the Sejong Electronic Dictionary. However, the Sejong Electronic Dictionary is limited in the sentence structures and selection-restricted nouns it suggests. In this paper, we discuss the limitations of word sense disambiguation that uses the subcategorization information suggested by the Sejong Electronic Dictionary, and we generalize the selection-restricted nouns of arguments using the Korean Lexico-semantic Network (KorLex).

Fast Super-Resolution Algorithm Based on Dictionary Size Reduction Using k-Means Clustering

  • Jeong, Shin-Cheol;Song, Byung-Cheol
    • ETRI Journal / v.32 no.4 / pp.596-602 / 2010
  • This paper proposes a computationally efficient learning-based super-resolution algorithm using k-means clustering. Conventional learning-based super-resolution requires a huge dictionary for reliable performance, which incurs a tremendous memory cost as well as a burdensome matching computation. To overcome this problem, the proposed algorithm significantly reduces the size of the trained dictionary by clustering similar patches in the learning phase. Experimental results show that the proposed algorithm provides better visual quality than conventional algorithms while requiring much less computation.
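The idea of shrinking a patch dictionary by clustering can be sketched as follows. This is not the paper's algorithm: it assumes the dictionary is a list of (low-resolution, high-resolution) patch pairs stored as flat lists of floats, clusters on the low-resolution part with plain Lloyd-style k-means, and uses a simple deterministic initialization. The name `reduce_dictionary` and the sample data are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reduce_dictionary(pairs, k, iterations=20):
    """Cluster (low_res, high_res) pairs on the low-res patch and return
    one representative pair (the cluster means) per non-empty cluster."""
    n = len(pairs)
    # Assumed initialization: k evenly spaced low-res patches as centroids.
    centroids = [list(pairs[i * n // k][0]) for i in range(k)]
    assignment = [0] * n
    for _ in range(iterations):
        # Assignment step: nearest centroid for every low-res patch.
        assignment = [
            min(range(k), key=lambda c: sq_dist(low, centroids[c]))
            for low, _ in pairs
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pairs[i][0] for i in range(n) if assignment[i] == c]
            if members:
                centroids[c] = mean_vector(members)

    reduced = []
    for c in range(k):
        highs = [pairs[i][1] for i in range(n) if assignment[i] == c]
        if highs:
            reduced.append((centroids[c], mean_vector(highs)))
    return reduced


# Hypothetical toy dictionary: six pairs forming two obvious groups.
pairs = [
    ([0.0, 0.0], [0.0]), ([0.1, 0.0], [0.2]), ([0.0, 0.1], [0.1]),
    ([5.0, 5.0], [1.0]), ([5.1, 5.0], [1.2]), ([5.0, 5.1], [1.1]),
]
reduced = reduce_dictionary(pairs, k=2)
print(len(pairs), "->", len(reduced), "entries")
```

Matching an input patch against the reduced dictionary then costs k distance computations instead of one per original training pair, which is the memory and matching saving the abstract refers to.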

A Study of Methodology for Automatic Construction of OWL Ontologies from Sejong Electronic Dictionary (대용량 OWL 온톨로지 자동구축을 위한 세종전자사전 활용 방법론 연구)

  • Song Do Gyu
    • Language and Information / v.9 no.1 / pp.19-34 / 2005
  • Ontology is an indispensable component in the intelligent and semantic processing of knowledge and information, such as in the semantic web. However, ontology construction requires collecting vast amounts of data and arduous effort in processing this unstructured data. This study proposes a methodology for automatically constructing and generating ontologies from the Sejong Electronic Dictionary. Because the Sejong Electronic Dictionary is structured in XML format, it can be processed automatically by software tools into OWL (Web Ontology Language)-based ontologies as specified by the W3C. This paper presents the process and a concrete application of this methodology.

Memory Performance of Electronic Dictionary-Based Commercial Workload

  • Lee, Changsik;Kim, Hiecheol;Lee, Yongdoo
    • Journal of Korea Society of Industrial Information Systems / v.7 no.5 / pp.39-48 / 2002
  • Along with the rapid spread of the Internet, a new class of commercial applications that process transactions against electronic dictionaries has become popular. Typical examples are Internet search engines. In this paper, we present a new approach to achieving high-performance electronic dictionaries. Unlike the conventional approach, which uses Trie data structures to implement electronic dictionaries, our approach uses multi-dimensional binary trees. We present the implementation of our electronic dictionary, ED-MBT (Electronic Dictionary based on Multidimensional Binary Tree). An exhaustive performance study is also presented to assess the performance impact of ED-MBT on real-world applications.

Word Sense Disambiguation of Predicate using Semi-supervised Learning and Sejong Electronic Dictionary (세종 전자사전과 준지도식 학습 방법을 이용한 용언의 어의 중의성 해소)

  • Kang, Sangwook;Kim, Minho;Kwon, Hyuk-chul;Oh, Jyhyun
    • KIISE Transactions on Computing Practices / v.22 no.2 / pp.107-112 / 2016
  • The Sejong Electronic (machine-readable) Dictionary, developed under the 21st Century Sejong Plan, contains systematically organized information on Korean words. It helps to solve problems encountered in the electronic formatting of the still commonly used hard-copy dictionary. The Sejong Electronic Dictionary, however, has limitations related to sentence structure and selection-restricted nouns. This paper discusses the limitations of word-sense disambiguation (WSD) that uses the subcategorization information suggested by the Sejong Electronic Dictionary and generalized selection-restricted nouns from the Korean Lexico-semantic Network. An alternative method that uses semi-supervised learning, the chi-square test, and some other means to make WSD decisions is presented herein.
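The abstract does not say how the chi-square test enters the WSD decision. As a hedged illustration only, the sketch below computes the standard chi-square statistic for a 2x2 contingency table, here imagined as counting how often a context word co-occurs with one sense of a predicate. The function name `chi_square_2x2` and the counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the 2x2 contingency table

                       sense S    other senses
        word present      a            b
        word absent       c            d
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a zero marginal carries no evidence of association
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts: the context word appears mostly with sense S.
score = chi_square_2x2(30, 10, 5, 55)
print(round(score, 3))

# With one degree of freedom, values above about 3.841 are significant
# at the 0.05 level, so this word would count as a useful cue for S.
print(score > 3.841)
```

A context word whose counts are spread evenly across senses scores near zero and would carry no weight in such a scheme.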

Assignment Semantic Category of a Word using Word Embedding and Synonyms (워드 임베딩과 유의어를 활용한 단어 의미 범주 할당)

  • Park, Da-Sol;Cha, Jeong-Won
    • Journal of KIISE / v.44 no.9 / pp.946-953 / 2017
  • Semantic Role Decision defines the semantic relationship between the predicate and its arguments in natural language processing (NLP) tasks. Semantic role information and semantic category information should be used to make Semantic Role Decisions. The Sejong Electronic Dictionary contains frame information that is used to determine the semantic roles. In this paper, we propose a method to extend the Sejong Electronic Dictionary using word embedding and synonyms. The same experiment is performed using existing word-embedding and retrofitting vectors. Using word embedding, the system performance of semantic category assignment is 32.19%, and that of extended semantic category assignment is 51.14%, for words that do not appear in the Sejong Electronic Dictionary. Using retrofitting vectors, the system performance of semantic category assignment is 33.33%, and that of extended semantic category assignment is 53.88%, for words that do not appear in the Sejong Electronic Dictionary. We also show that assigning semantic categories to new words that have none helps extend the semantic category vocabulary of the Sejong Electronic Dictionary.

A New Dictionary Mechanism for Efficient Fault Diagnosis (효율적인 고장진단을 위한 딕셔너리 구조 개발)

  • Kim Sang-Wook;Kim Yong-Joon;Chun Sung-Hoon;Kang Sung-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD / v.43 no.4 s.346 / pp.49-55 / 2006
  • In this paper, a fault dictionary for locating faults is considered. The foremost problem in fault diagnosis is the size of the data. As circuits grow larger, the data for fault diagnosis increase to the point where they can no longer be stored on storage media. To make dictionary generation feasible, some dictionaries, such as the pass-fail dictionary, store only a portion of the information. The deleted data make it difficult to diagnose fault models other than the single stuck-at fault. This paper proposes a new dictionary format that makes the dictionary small without deleting any information.
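The trade-off described here, where a pass-fail dictionary is smaller but loses diagnostic information, can be shown with a toy example. This sketch does not implement the paper's new format; the fault names, test count, output pins, and helper functions are all hypothetical.

```python
# Full-response dictionary: for each fault, the set of failing output
# pins under each of three test patterns (hypothetical data).
full_response = {
    "f1": [{"o1"}, set(), {"o2"}],
    "f2": [{"o2"}, set(), {"o2"}],
    "f3": [set(), {"o1"}, set()],
}


def full_signatures(full):
    """Hashable full-response signature per fault."""
    return {
        fault: tuple(frozenset(pins) for pins in rows)
        for fault, rows in full.items()
    }


def pass_fail_signatures(full):
    """Keep only one bit per test: did any output fail?"""
    return {
        fault: tuple(bool(pins) for pins in rows)
        for fault, rows in full.items()
    }


def candidates(signatures, observed):
    """Faults whose stored signature matches the observed behaviour."""
    return sorted(f for f, sig in signatures.items() if sig == observed)


# The device under test behaves like fault f1.
observed_full = (frozenset({"o1"}), frozenset(), frozenset({"o2"}))
observed_pf = (True, False, True)

full_match = candidates(full_signatures(full_response), observed_full)
pf_match = candidates(pass_fail_signatures(full_response), observed_pf)
print(full_match)  # full dictionary isolates one fault
print(pf_match)    # pass-fail dictionary cannot separate f1 from f2
```

The pass-fail version stores one bit per fault and test instead of a set of pins, but faults that fail the same tests on different outputs collapse into one signature.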

Hierarchical Regression for Single Image Super Resolution via Clustering and Sparse Representation

  • Qiu, Kang;Yi, Benshun;Li, Weizhong;Huang, Taiqi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.5 / pp.2539-2554 / 2017
  • Regression-based image super-resolution (SR) methods have shown a great advantage in running time while maintaining similar or improved quality compared to other learning-based methods. In this paper, we propose a novel single-image SR method based on hierarchical regression to further improve quality. As an improvement over other regression-based methods, we introduce a hierarchical scheme into the process of learning multiple regressors. First, training samples are grouped into clusters according to their geometric similarity, which generates the structure layer. Then, in each cluster, a compact dictionary is learned by the Sparse Coding (SC) method, and the training samples are further grouped by dictionary atoms to form the detail layer. Last, a series of projection matrices, which are anchored to dictionary atoms, is learned by linear regression. Experimental results show that the hierarchical scheme leads to more precise regression. Our method achieves superior-quality results compared with several state-of-the-art methods.