• Title/Summary/Keyword: Hangeul Information Processing


A Study on an Efficient Coding of Hangeul (효율적인 한글 코드화에 관한 연구)

  • 김경태;민용식
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.14 no.6
    • /
    • pp.633-641
    • /
    • 1989
  • In this paper, we propose an economical coding method for Hangeul characters, the Korean script, using a three-state transition graph. With this method, only about 3.5 bits are required to express a Hangeul character, which is more than 1 bit shorter than the conventional codes introduced so far and allows extensive code compression. Hence this method will improve the speed, accuracy, and economy of Hangeul character processing.


An Experimental Study on Automatic Indexing for Hangeul Text (한글문헌의 자동색인에 관한 실험적 연구)

  • Ahn, Heyon-Soo
    • Journal of the Korean Society for Information Management
    • /
    • v.3 no.2
    • /
    • pp.109-128
    • /
    • 1986
  • The explosive growth of information and the varied demands for it have led to the development of automatic indexing. In HANGEUL data processing in particular, the need for automatic indexing has steadily increased. It is hypothesized that in HANGEUL text only CHE-ONs (substantives) become keywords and that a CHE-ON is followed by a JOSA (particle). Through morphological analysis, keywords were selected from the titles and abstracts in the experimental data, which consisted of 20 papers in "Journal of the Korea Society for Information Science."


A Study on the Pre-Classification of Handwritten Hangeul Characters Using Partial Separation and Recognition of Initial Consonants (초성자소분리 인식에 의한 필기 한글문자의 대분류에 관한 연구)

  • 안석출;김명기
    • Journal of the Korean Graphic Arts Communication Society
    • /
    • v.6 no.1
    • /
    • pp.41-57
    • /
    • 1988
  • With the progress of information processing systems for Hangeul, the development of OCR (Optical Character Reader) technology has recently become necessary. For OCR to be applied, characters have to be recognized clearly. The structure analysis method and the lump method are used for character recognition, and with them OCR is now available for printed characters and for handwritten alphanumeric characters with simple structure. However, much more study is known to be needed on OCR for handwritten Hangeul. This paper proposes a new method for handwritten Hangeul character recognition. The initial consonant units of Hangeul are separated and then recognized by using the position information of Hangeul's units, obtained from the normalized patterns with regression line theory. The method extracts the block that lies in the virtual initial consonant region of the normalized input pattern and calculates the maximum likelihood value ($\beta$) after comparing the features of the separated subpattern with the initial consonant dictionary.


Development of EUC-KR based Locale and Application Program Supporting North Korean Collating Sequence (북한 한글 순서를 지원하는 EUC-KR 기반의 로캘과 응용 프로그램 개발)

  • Jung Il-dong;Lee Jung-hwa;Kim Yong-ho;Kim Kyongsok
    • The KIPS Transactions:PartB
    • /
    • v.11B no.7 s.96
    • /
    • pp.875-884
    • /
    • 2004
  • UCS (= ISO/IEC 10646, = Unicode) will be widely used as globalization proceeds. If UCS is used for official purposes in both Koreas, it solves the problem of differing hangeul codes between South and North Korea. However, UCS does not solve the problem that the same characters are ordered differently. ISO/IEC 14651:2000 (International String Ordering), an international standard for string ordering, defines a framework for sorting character strings consisting of multinational scripts. Because the Common Template Table in ISO/IEC 14651 defines the order of characters, we can change the order of characters without changing the character sequences in programs. Therefore, we can solve the ordering problem without unifying the hangeul order of South and North Korea. Functions related to ISO/IEC 14651 are included in the system libraries of Unix-based operating systems such as Linux, Solaris, and FreeBSD. We implement an EUC-KR-based North Korean locale, which includes the North Korean hangeul order, on Linux so that the North Korean locale can be used in South Korea. We also develop a program that orders strings by both the South and North Korean hangeul orders.
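
The idea of changing the order table rather than the stored characters can be illustrated with a minimal Python sketch. It is not the locale implementation described in the abstract: it sorts precomposed Unicode syllables with a custom sort key instead of an ISO/IEC 14651 template table, and the names `decompose`, `make_key`, and `NORTH_INITIALS` are hypothetical. The North Korean initial-consonant order used here is an assumption that should be checked against an authoritative table, and vowels and final consonants are left in Unicode order for brevity.

```python
# Precomposed Hangul syllables (U+AC00..U+D7A3) decompose arithmetically:
# code = 0xAC00 + (initial * 21 + medial) * 28 + final
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"  # Unicode (South Korean) order

# ASSUMPTION: illustrative North Korean order of initial consonants.
NORTH_INITIALS = "ㄱㄴㄷㄹㅁㅂㅅㅈㅊㅋㅌㅍㅎㄲㄸㅃㅆㅉㅇ"


def decompose(ch):
    """Return (initial, medial, final) indices, or None if ch is not a Hangul syllable."""
    code = ord(ch) - 0xAC00
    if not 0 <= code < 11172:
        return None
    return code // 588, (code % 588) // 28, code % 28


def make_key(initial_order):
    """Build a sort key that ranks initial consonants by the given order string."""
    rank = {INITIALS.index(c): r for r, c in enumerate(initial_order)}

    def key(text):
        parts = []
        for ch in text:
            jamo = decompose(ch)
            if jamo is None:
                # Non-Hangul characters sort before Hangul, by code point.
                parts.append((0, ord(ch), 0, 0))
            else:
                initial, medial, final = jamo
                parts.append((1, rank[initial], medial, final))
        return parts

    return key


words = ["아기", "까치", "하늘", "가방"]
south_sorted = sorted(words, key=make_key(INITIALS))
north_sorted = sorted(words, key=make_key(NORTH_INITIALS))
print(south_sorted)
print(north_sorted)
```

The strings themselves are never re-encoded; only the order string passed to `make_key` changes, which mirrors the abstract's point that ordering can be switched without altering character sequences in programs.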

Some Structural Analysis of HAN GEUL Information Source and its Application to the Improved Coding Method (한국어 정보원의 구조분석과 Code의 개선)

  • Lee, Ju-Geun;Park, Jong-Uk;Kim, Chang-Seon
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.15 no.2
    • /
    • pp.1-7
    • /
    • 1978
  • In this paper, a preliminary structural analysis of the HANGEUL information source is performed statistically. The results of the analysis are applied to a KS HANGEUL code, improving its transmission rate by 14% on the basis of probability rankings of the fundamental HANGEUL elements. Furthermore, some problems with the KS code, a coding method for HANGEUL information processing, as well as with data entry are identified.


Hangeul Stem Extraction Algorithm for Text Mining Based on Natural Language Processing (자연어 처리 기반 텍스트 마이닝을 위한 한글 어간 추출 알고리즘)

  • Choi, Ki-won;Choi, Seong-hun;Jo, Sang-hyeon;Kim, Hee-cheol
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2017.05a
    • /
    • pp.718-721
    • /
    • 2017
  • Natural language processing, which is the basis of text mining, differs depending on the type of language. Hangeul in particular, which allows relatively high freedom of expression compared to other languages, has words that take various forms depending on the ending used. The part that does not change across these various forms is called the stem. For effective text mining, it is essential to extract words and unify their various forms. Therefore, this paper proposes a stem extraction algorithm for Hangeul words for effective text mining of Hangeul documents.


A Study on an On-Line Handwritten Hangeul Character Recognition Using Fuzzy Inference (Fuzzy 推論을 이용한 온라인 筆記體 한글문자 認識에 관한 연구)

  • Choi, Yong-Yub;Choi, Kap-Seok
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.11
    • /
    • pp.103-110
    • /
    • 1990
  • This paper studies on-line recognition of handwritten Hangeul characters using fuzzy inference. To resolve the ambiguity caused by variations in writing style among writers, the handwritten characters are recognized by fuzzy inference on production rules generated from all the relative position information between strokes. To reduce the processing time, a subgroup of reference characters, classified in advance by number of strokes, is selected according to the number of strokes of the input character, and a tolerance limit on the distance between the input character and the reference characters of the subgroup is introduced to reduce the number of reference characters to which fuzzy inference is applied. Experimental results for 3,990 handwritten Hangeul characters by 10 writers show a recognition rate of 99.5% and an average processing time of 0.4 sec/character.


Computerization and Application of Hangeul Standard Pronunciation Rule (음성처리를 위한 표준 발음법의 전산화)

  • 이계영
    • Proceedings of the IEEK Conference
    • /
    • 2003.07d
    • /
    • pp.1363-1366
    • /
    • 2003
  • This paper introduces a computerized version of the Hangeul (Korean language) Standard Pronunciation Rule that can be used in Korean processing systems such as Korean voice synthesis and Korean voice recognition systems. For this purpose, we build Petri net models for each item of the Standard Pronunciation Rule and then integrate them into a vocal sound conversion table. The reverse of the Hangeul Standard Pronunciation Rule governs how vocal sounds are matched to grammatically correct written characters. This paper presents not only the vocal sound conversion table but also the character conversion table obtained by reversely converting it. Using these tables, we have implemented a Hangeul character-to-vocal-sound conversion system and a Korean vocal-sound-to-character conversion system, and tested them with various data sets reflecting all the items of the Standard Pronunciation Rule to verify the soundness and completeness of our tables. The test results show that the tables improve processing speed in addition to providing soundness and completeness.
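
As a rough illustration of what a character-to-sound conversion step can look like, the Python sketch below applies one simplified pronunciation rule, liaison: a single final consonant moves to the next syllable when that syllable begins with the silent ㅇ. This is an assumption-laden toy and not the Petri net models or conversion tables of the paper. The names `split`, `join`, `apply_liaison`, and `MOVABLE` are hypothetical, and the sketch ignores exceptions such as palatalization, consonant clusters, and word boundaries.

```python
BASE = 0xAC00  # first precomposed Hangul syllable
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ", "ㄻ", "ㄼ", "ㄽ", "ㄾ",
          "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ", "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]

# Finals that can be carried over unchanged as the next initial.
# ㅇ (velar nasal) and ㅎ behave differently and are excluded in this sketch.
MOVABLE = set(INITIALS) - {"ㅇ", "ㅎ"}
SILENT_INITIAL = INITIALS.index("ㅇ")


def split(ch):
    """Return (initial, medial, final) indices, or None for non-syllables."""
    code = ord(ch) - BASE
    if not 0 <= code < 11172:
        return None
    return code // 588, (code % 588) // 28, code % 28


def join(initial, medial, final):
    """Compose a precomposed syllable from its three indices."""
    return chr(BASE + (initial * 21 + medial) * 28 + final)


def apply_liaison(text):
    """Move a single final consonant onto a following vowel-initial syllable."""
    chars = list(text)
    for k in range(len(chars) - 1):
        cur, nxt = split(chars[k]), split(chars[k + 1])
        if cur is None or nxt is None:
            continue
        final = FINALS[cur[2]]
        if nxt[0] == SILENT_INITIAL and final in MOVABLE:
            chars[k] = join(cur[0], cur[1], 0)  # drop the final consonant
            chars[k + 1] = join(INITIALS.index(final), nxt[1], nxt[2])
    return "".join(chars)


print(apply_liaison("국어"))
```

A table-driven system along the lines of the abstract would hold many such rules and apply them in a defined order; this sketch hard-codes a single one to show the syllable arithmetic involved.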


A study on the Partial Separation for Subpatterns and Recognition of the Hangeul Patterns (한글 Pattern에서 Subpattern분리와 인식에 관한 연구)

  • Lee, Ju-Geun;Namkung, J.C.;Kim, Yeong-Geon
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.18 no.3
    • /
    • pp.1-8
    • /
    • 1981
  • In this paper, a recognition method for Hangeul patterns based on partial separation of subpatterns is proposed. First, Hangeul patterns are formalized into six formal patterns and their surface structures are discriminated. Second, two to four subpatterns are separated from one formal pattern by a new algorithm that combines an Index mark and a Window. Hangeul patterns are recognized using only the frontiers of the tree, by defining a regular U-tree grammar for the separated subpatterns. Compared with the previous tree grammar, this grammar reduces the production rules to 1/3, simplifies automaton processing, and is more flexible. In a simulation with 1,600 Hangeul character patterns, the separation rate of subpatterns (24 or 44) was 99.1% and the recognition rate was 100%.


A Study on the Classification of Hangeul Patterns Using Hierarchical Neural Network (계층적 신경회로망을 이용한 한글 패턴 분류에 관한 연구)

  • Kim, Do-Hyeon;Lee, Byeong-Mo;Cha, Eui-Young
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2002.04a
    • /
    • pp.569-572
    • /
    • 2002
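  • As a preprocessing step for recognizing Hangeul, characters are commonly classified into six types according to the kind of vowel and how it combines with the consonants. This paper studies such type classification as a preprocessing step for Hangeul character recognition and proposes a fast and reliable classification method that introduces a hierarchical neural network. The experiments used a total of 9,400 image files: the 2,350 characters of the KS X 1001 (KS C 5601) precomposed character set rendered in the Gulim, Batang, Dotum, and Gungsuh fonts. Some of these were used for training and the rest as test data for classification, yielding a type classification rate of about 94% and a fast classification speed of 5.67 ms per pattern.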
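
The six-type scheme mentioned above can be made concrete with a short Python sketch that derives a type label from a character's code point, for example to produce ground-truth labels for training images. This is not the hierarchical neural network of the paper, which classifies images. The function name `hangeul_type` and the numbering of types 1 to 6 are assumptions chosen for illustration.

```python
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"  # Unicode medial vowel order
VERTICAL = set("ㅏㅐㅑㅒㅓㅔㅕㅖㅣ")    # vowel written to the right of the initial
HORIZONTAL = set("ㅗㅛㅜㅠㅡ")          # vowel written below the initial
# All remaining medials (ㅘ ㅙ ㅚ ㅝ ㅞ ㅟ ㅢ) combine both placements.


def hangeul_type(ch):
    """Return an assumed type number 1-6, or None for non-syllables.

    Types 1-3: no final consonant, with a vertical, horizontal, or
    combined vowel. Types 4-6: the same three shapes with a final consonant.
    """
    code = ord(ch) - 0xAC00
    if not 0 <= code < 11172:
        return None
    medial = MEDIALS[(code % 588) // 28]
    has_final = code % 28 != 0
    if medial in VERTICAL:
        shape = 1
    elif medial in HORIZONTAL:
        shape = 2
    else:
        shape = 3
    return shape + (3 if has_final else 0)


for ch in "가고과각곡곽":
    print(ch, hangeul_type(ch))

# Data set size stated in the abstract: 2,350 characters in 4 fonts.
total_images = 2350 * 4
```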
