Search | Korea Science

Encoding Dictionary Feature for Deep Learning-based Named Entity Recognition

Ronran, Chirawan;Unankard, Sayan;Lee, Seungwoo
- International Journal of Contents / v.17 no.4 / pp.1-15 / 2021
Named entity recognition (NER) is a crucial task for NLP, which aims to extract information from texts. To build NER systems, deep learning (DL) models are learned with dictionary features by mapping each word in the dataset to dictionary features and generating a unique index. However, this technique might generate noisy labels, which pose significant challenges for the NER task. In this paper, we proposed DL-dictionary features, and evaluated them on two datasets, including the OntoNotes 5.0 dataset and our new infectious disease outbreak dataset named GFID. We used (1) a Bidirectional Long Short-Term Memory (BiLSTM) character and (2) pre-trained embedding to concatenate with (3) our proposed features, named the Convolutional Neural Network (CNN), BiLSTM, and self-attention dictionaries, respectively. The combined features (1-3) were fed through BiLSTM - Conditional Random Field (CRF) to predict named entity classes as outputs. We compared these outputs with other predictions of the BiLSTM character, pre-trained embedding, and dictionary features from previous research, which used the exact matching and partial matching dictionary technique. The findings showed that the model employing our dictionary features outperformed other models that used existing dictionary features. We also computed the F1 score with the GFID dataset to apply this technique to extract medical or healthcare information.
https://doi.org/10.5392/IJoC.2021.17.4.001
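The exact-matching dictionary technique that the abstract attributes to previous research can be sketched as follows. This is only an illustration of that baseline, not the CNN, BiLSTM, or self-attention dictionary encoders the paper proposes. The function names, the B-/I- feature scheme, and the toy dictionary are assumptions introduced here.

```python
from typing import Dict, List, Tuple


def exact_match_features(tokens: List[str],
                         dictionary: Dict[Tuple[str, ...], str],
                         max_len: int = 4) -> List[str]:
    """Tag tokens with the type of the longest dictionary entry that matches exactly."""
    features = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so longer entries win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i:i + n])
            if span in dictionary:
                label = dictionary[span]
                features[i] = "B-" + label
                for j in range(i + 1, i + n):
                    features[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return features


def to_indices(features: List[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct feature string to a unique integer index."""
    vocab: Dict[str, int] = {}
    ids = [vocab.setdefault(f, len(vocab)) for f in features]
    return ids, vocab


# Hypothetical toy dictionary and sentence, for illustration only.
toy_dictionary = {("ebola",): "DISEASE", ("sierra", "leone"): "LOC"}
tokens = ["Ebola", "cases", "rose", "in", "Sierra", "Leone"]
features = exact_match_features(tokens, toy_dictionary)
ids, vocab = to_indices(features)
```

A word that appears in the dictionary with the wrong sense still receives a feature under this scheme, which is one way the noisy labels mentioned in the abstract can arise.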

Semantic Process Retrieval with Similarity Algorithms (유사도 알고리즘을 활용한 시맨틱 프로세스 검색방안)

Lee, Hong-Joo;Klein, Mark
- Asia pacific journal of information systems / v.18 no.1 / pp.79-96 / 2008
One of the roles of Semantic Web services is to execute dynamic intra-organizational services, including the integration and interoperation of business processes. Since different organizations design their processes differently, retrieving similar semantic business processes is necessary to support inter-organizational collaboration. Most approaches for finding services that have certain features and support certain business processes have relied on some type of logical reasoning and exact matching. This paper presents our approach of using imprecise matching to expand the results from an exact matching engine when querying the OWL (Web Ontology Language) MIT Process Handbook. The MIT Process Handbook is an electronic repository of best-practice business processes. The Handbook is intended to help people (1) redesign organizational processes, (2) invent new processes, and (3) share ideas about organizational practices. In order to use the MIT Process Handbook for process retrieval experiments, we had to export it into an OWL-based format. We model the Process Handbook meta-model in OWL and export the processes in the Handbook as instances of the meta-model. Next, we need to find a sizable number of queries and their corresponding correct answers in the Process Handbook. Many previous studies devised artificial datasets composed of randomly generated numbers without real meaning and used subjective ratings for correct answers and similarity values between processes. To generate a semantics-preserving test data set, we create 20 variants of each target process that are syntactically different but semantically equivalent, using mutation operators. These variants represent the correct answers for the target process. We devise diverse similarity algorithms based on values of process attributes and structures of business processes.
We use simple similarity algorithms for text retrieval, such as TF-IDF and Levenshtein edit distance, to devise our approaches, and we utilize a tree edit distance measure because semantic processes appear to have a graph structure. We also design similarity algorithms that consider the similarity of process structure, such as part process, goal, and exception. Since we can identify relationships between a semantic process and its subcomponents, this information can be utilized for calculating similarities between processes. Dice's coefficient and Jaccard similarity measures are used to calculate the portion of overlap between processes in diverse ways. We perform retrieval experiments to compare the performance of the devised similarity algorithms. We measure retrieval performance in terms of precision, recall, and F-measure (the harmonic mean of precision and recall). The tree edit distance shows the poorest performance on all measures. TF-IDF and the method combining TF-IDF with Levenshtein edit distance perform better than the other devised methods. These two measures focus on similarity between the names and descriptions of processes. In addition, we calculate a rank correlation coefficient, Kendall's tau-b, between the number of process mutations and the ranking of similarity values among the mutation sets. In this experiment, similarity measures based on process structure, such as Dice's, Jaccard, and derivatives of these measures, show greater coefficients than measures based on values of process attributes. However, the Lev-TFIDF-JaccardAll measure, which considers process structure and attribute values together, shows reasonably good performance in both experiments. For retrieving semantic processes, it therefore appears better to consider diverse aspects of process similarity, such as process structure and values of process attributes.
We generate semantic process data and a dataset for retrieval experiments from the MIT Process Handbook repository. We suggest imprecise query algorithms that expand retrieval results from an exact matching engine such as SPARQL, and compare the retrieval performance of the similarity algorithms. As for limitations and future work, we need to perform experiments with datasets from other domains. Also, since many similarity values are available from diverse measures, we may find better ways to identify relevant processes by applying these values simultaneously.
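Three of the building blocks named in this abstract (Levenshtein edit distance, Jaccard similarity, and Dice's coefficient), together with the F-measure, can be sketched as below. This is a minimal illustration of the standard definitions, not the paper's combined measures such as Lev-TFIDF-JaccardAll. The example process parts are hypothetical.

```python
from typing import Set


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def jaccard(x: Set[str], y: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


def dice(x: Set[str], y: Set[str]) -> float:
    """Twice the intersection size divided by the sum of the set sizes."""
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical subcomponents ("parts") of two processes.
parts_a = {"receive order", "check credit", "ship product"}
parts_b = {"receive order", "ship product", "send invoice"}

name_distance = levenshtein("kitten", "sitting")
overlap_jaccard = jaccard(parts_a, parts_b)
overlap_dice = dice(parts_a, parts_b)
```

Applying the set measures to different subcomponent relations (parts, goals, exceptions) is one way to obtain the structure-based variants the abstract describes.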

Retrieval Model using Subject Classification Table, User Profile, and LSI (전공분류표, 사용자 프로파일, LSI를 이용한 검색 모델)

Woo Seon-Mi
- The KIPS Transactions:PartD / v.12D no.5 s.101 / pp.789-796 / 2005
Because existing information retrieval systems, in particular library retrieval systems, use 'exact keyword matching' against the user's query, they present the user with massive result sets that include irrelevant information. As a result, the user spends extra effort and time extracting the relevant information from the results. This paper therefore proposes SULRM, a retrieval model using a Subject Classification Table, a user profile, and LSI (Latent Semantic Indexing), to provide more relevant results. Within the results of keyword-based retrieval, SULRM applies a document filtering technique to classified data and a document ranking technique to non-classified data. The filtering technique uses the Subject Classification Table, and the ranking technique uses the user profile and LSI. We performed experiments on the filtering technique, the user profile updating method, and the document ranking technique, using retrieval results from our university's digital library system. When many documents are retrieved, the proposed techniques can provide the user with data filtered and ranked according to the user's subject and preferences.
https://doi.org/10.3745/KIPSTD.2005.12D.5.789

Strip Adjustment of Airborne Laser Scanner Data Using Area-based Surface Matching

Lee, Dae Geon;Yoo, Eun Jin;Yom, Jae-Hong;Lee, Dong-Cheon
- Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.32 no.6 / pp.625-635 / 2014
Multiple strips are required for large-area mapping with an ALS (Airborne Laser Scanner) system. LiDAR (Light Detection And Ranging) data collected by the ALS system show discrepancies between strips due to systematic errors of the on-board laser scanner and GPS/INS, inaccurate system calibration, and boresight misalignments. Such discrepancies degrade the overall geometric quality of end products such as DEMs (Digital Elevation Models), building models, and digital maps. Therefore, strip adjustment to minimize discrepancies between overlapping strips is one of the most essential tasks in creating seamless point cloud data. This study implemented area-based matching (ABM) to determine conjugate features for computing 3D transformation parameters. ABM is a well-known method and is easily implemented for this purpose. Exactly the same LiDAR points do not exist in overlapping strips, so the term "conjugate point" here refers to the location of maximum similarity within the overlapping strips. Coordinates of the conjugate locations were determined with sub-pixel accuracy. The major drawback of ABM is its sensitivity to scale change and rotation. However, between adjacent strips there is almost no scale change and the rotation angles are quite small, so ABM is applicable. Experimental results using both simulated and real datasets demonstrate the validity of the proposed scheme.
https://doi.org/10.7848/ksgpc.2014.32.6.625
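Area-based matching can be sketched as a template search that keeps the location of maximum similarity. The abstract does not name the similarity measure or the sub-pixel method, so the normalized cross-correlation and the parabola fit below are assumptions, as are all function names and the toy grids.

```python
import math
from typing import List, Tuple

Grid = List[List[float]]


def ncc(a: Grid, b: Grid) -> float:
    """Normalized cross-correlation of two equally sized patches, in [-1, 1]."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return num / den if den > 0 else 0.0


def best_match(template: Grid, search: Grid) -> Tuple[int, int, float]:
    """Slide the template over the search grid; return (row, col, score) of the best fit."""
    th, tw = len(template), len(template[0])
    best = (0, 0, -2.0)
    for r in range(len(search) - th + 1):
        for c in range(len(search[0]) - tw + 1):
            window = [row[c:c + tw] for row in search[r:r + th]]
            score = ncc(template, window)
            if score > best[2]:
                best = (r, c, score)
    return best


def parabolic_offset(left: float, center: float, right: float) -> float:
    """Sub-pixel shift of the peak from a parabola through three neighboring scores."""
    denom = left - 2 * center + right
    return 0.0 if denom == 0 else 0.5 * (left - right) / denom


# Toy rasterized patches standing in for two overlapping strips.
search = [
    [1, 2, 1, 0, 1],
    [0, 1, 5, 9, 2],
    [1, 0, 7, 3, 1],
    [2, 1, 0, 1, 0],
    [1, 0, 1, 2, 1],
]
template = [[5, 9], [7, 3]]
row, col, score = best_match(template, search)
shift = parabolic_offset(0.5, 1.0, 0.8)
```

Because the window is compared without resampling, the score drops quickly under rotation or scale change, which is the sensitivity the abstract notes.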

Decentralized Suboptimal $H_2$ Filtering

Jo, Nam-Hoon;Kong, Jae-Sop;Seo, Jin-Heon
- Proceedings of the KIEE Conference / 1993.11a / pp.323-325 / 1993
In this paper, the decentralized suboptimal $H_2$ filtering problem is considered. An additional term is added to the centralized optimal $H_2$ filter so that the whole filter is decentralized. We derive a sufficient condition for existence of such decentralized filters. By employing the solution procedure for the exact model matching problem, we obtain a set of decentralized $H_2$ filters, and choose a suboptimal filter from this set of decentralized $H_2$ filters. Naturally the resulting filter is guaranteed to be stable.

Studies of Interface Continuity in Isogeometric Structural Analysis for Multi-patch Shell Components (다중 패치 쉘 아이소 지오메트릭 해석의 계면 연속성 검토)

Ha, Youn Doh;Noh, Jungmin
- Journal of the Computational Structural Engineering Institute of Korea / v.31 no.2 / pp.71-78 / 2018
This paper presents the assembly of multiple patches based on the single-patch isogeometric formulation for the shear-deformable shell element given in the previous study. The geometrically exact shell formulation has been accomplished with the shell-theory-based formulation and the generalized curvilinear coordinate system directly derived from the given NURBS geometry. For matching knot elements across adjacent surfaces, the zeroth and first parametric continuity conditions are considered, and the corresponding coupling constraints are implemented by a master-slave formulation between adjacent patches. The constraints are then enforced by a substitution method that condenses out the slave variables, thereby reducing the model size. Through numerical investigations, the important features of the first parametric continuity condition are confirmed. The performance of the multi-patch shell models is also examined by comparing the rate of convergence of response coefficients for the zeroth- and first-order continuity conditions, and continuity at the coupling boundary between two patches is confirmed.
https://doi.org/10.7734/COSEIK.2018.31.2.71

User Location Prediction Within a Building Using Search Tree (탐색 트리를 이용한 건물 내 사용자의 위치 예측 방법)

Oh, Se-Chang
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.10a / pp.585-588 / 2010
Predicting a user's location within a building can be applied to many areas, such as visitor guidance. Existing methods for this problem consider only a limited number of locations the user visited in the past when predicting the current location. They cannot model complex movement patterns, they make the system inefficient by modeling simple patterns in too much detail, and they cause prediction errors. In this paper, no restriction is placed on the length of past movement patterns considered for current location prediction. For this purpose, a modified search tree is used. The search tree is constructed so that exact matching is performed as needed for location prediction. The search tree makes efficient and accurate prediction possible.
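One plausible reading of a search tree over movement histories of unrestricted length is a trie keyed by the history in reverse order, with next-location counts stored at every node. The abstract does not describe how the authors modify the tree, so the structure, class names, and example paths below are all assumptions.

```python
from collections import Counter
from typing import Dict, List, Optional


class Node:
    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.next_counts: Counter = Counter()


class MovementTree:
    """Trie over reversed movement histories (hypothetical structure)."""

    def __init__(self) -> None:
        self.root = Node()

    def train(self, path: List[str]) -> None:
        """Record, for every prefix of the path, which location followed it."""
        for t in range(1, len(path)):
            node = self.root
            nxt = path[t]
            # Walk the history backwards, starting at the most recent location.
            for loc in reversed(path[:t]):
                node = node.children.setdefault(loc, Node())
                node.next_counts[nxt] += 1

    def predict(self, history: List[str]) -> Optional[str]:
        """Follow the longest exactly matching history and return its most frequent successor."""
        node = self.root
        best = None
        for loc in reversed(history):
            child = node.children.get(loc)
            if child is None:
                break
            node = child
            best = node.next_counts.most_common(1)[0][0]
        return best


tree = MovementTree()
for _ in range(2):
    tree.train(["lobby", "hall", "lab"])
for _ in range(3):
    tree.train(["stairs", "hall", "office"])

short_context = tree.predict(["hall"])
long_context = tree.predict(["lobby", "hall"])
unseen = tree.predict(["roof"])
```

With only the last location known the sketch falls back to the overall majority, while a longer matching history changes the prediction, which illustrates why a fixed history length can lose information.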

Why Gabor Frames? Two Fundamental Measures of Coherence and Their Role in Model Selection

Bajwa, Waheed U.;Calderbank, Robert;Jafarpour, Sina
- Journal of Communications and Networks / v.12 no.4 / pp.289-307 / 2010
The problem of model selection arises in a number of contexts, such as subset selection in linear regression, estimation of structures in graphical models, and signal denoising. This paper studies non-asymptotic model selection for the general case of arbitrary (random or deterministic) design matrices and arbitrary nonzero entries of the signal. In this regard, it generalizes the notion of incoherence in the existing literature on model selection and introduces two fundamental measures of coherence among the columns of a design matrix, termed the worst-case coherence and the average coherence. It utilizes these two measures of coherence to provide an in-depth analysis of a simple, model-order agnostic one-step thresholding (OST) algorithm for model selection and proves that OST is feasible for exact as well as partial model selection as long as the design matrix obeys an easily verifiable property, termed the coherence property. One of the key insights offered by the ensuing analysis is that OST can successfully carry out model selection even when methods based on convex optimization, such as the lasso, fail due to the rank deficiency of the submatrices of the design matrix. In addition, the paper establishes that if the design matrix has reasonably small worst-case and average coherence, then OST performs near-optimally when either (i) the energy of any nonzero entry of the signal is close to the average signal energy per nonzero entry or (ii) the signal-to-noise ratio in the measurement system is not too high. Finally, two other key contributions of the paper are that (i) it provides bounds on the average coherence of Gaussian matrices and Gabor frames, and (ii) it extends the results on model selection using OST to low-complexity, model-order agnostic recovery of sparse signals with arbitrary nonzero entries.
In particular, this part of the analysis in the paper implies that an Alltop Gabor frame together with OST can successfully carry out model selection and recovery of sparse signals irrespective of the phases of the nonzero entries even if the number of nonzero entries scales almost linearly with the number of rows of the Alltop Gabor frame.
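The two coherence measures and the thresholding step can be sketched for a small real-valued design matrix. The definitions below assume unit-norm columns, with worst-case coherence taken as the largest absolute inner product between distinct columns and average coherence as the largest absolute sum of a column's inner products with all others, divided by the number of columns minus one. These definitions, the fixed threshold, and the toy matrix are assumptions made here. The paper's threshold rule and its complex-valued Gabor frames are not reproduced.

```python
import math
from typing import List, Set

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def normalize(v: Vector) -> Vector:
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def worst_case_coherence(cols: List[Vector]) -> float:
    """Largest absolute inner product between two distinct unit-norm columns."""
    return max(abs(dot(cols[i], cols[j]))
               for i in range(len(cols)) for j in range(i + 1, len(cols)))


def average_coherence(cols: List[Vector]) -> float:
    """Largest absolute sum of one column's inner products with all other columns, over p - 1."""
    p = len(cols)
    return max(abs(sum(dot(cols[i], cols[j]) for j in range(p) if j != i))
               for i in range(p)) / (p - 1)


def one_step_thresholding(cols: List[Vector], y: Vector, threshold: float) -> Set[int]:
    """Keep the indices whose correlation with the measurement exceeds the threshold."""
    return {i for i, c in enumerate(cols) if abs(dot(c, y)) > threshold}


# Toy design matrix: three axis vectors plus the normalized diagonal.
columns = [normalize(v) for v in ([1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1])]
mu = worst_case_coherence(columns)
nu = average_coherence(columns)

# Noise-free measurement built from columns 0 and 2 (hypothetical signal).
y = [2.0, 0.0, -1.5]
support = one_step_thresholding(columns, y, threshold=1.0)
```

The single pass over correlations is what makes the procedure low-complexity. No optimization problem is solved, and the model order does not need to be known in advance.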

Semi-automatic Building Area Extraction based on Improved Snake Model (개선된 스네이크 모델에 기반한 반자동 건물 영역 추출)

Park, Hyun-Ju;Gwun, Ou-Bong
- Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.1 / pp.1-7 / 2011
Terrain, building location and area, and building shape information are needed to implement a 3D map. This paper proposes a method for extracting a building area using an improved semi-automatic snake algorithm. The method consists of three stages: pre-processing, initializing control points, and applying the improved snake algorithm. In the first stage, after transforming a satellite image into a gray image and detecting the approximate edges of the gray image, the method combines the gray image and the edges. In the second stage, the user identifies the center point of a building and the system sets circular or rectangular initial control points by a procedural method. In the third stage, the improved snake algorithm extracts the building area. In particular, this paper defines one term of the snake in a new way in order to specialize the proposed method for building area extraction. Finally, this paper evaluated the performance of the proposed method using sky view satellite images, and the matching percentage with the exact building area was 75%.

The Comparative Analysis of 3D Software Virtual and Actual Wedding Dress

Yuan, Xin-Yi;Bae, Soo-Jeong
- Journal of Fashion Business / v.21 no.6 / pp.47-65 / 2017
This study compares a wedding dress produced entirely in 3D software with an actual dress worn by a real model, using a collection of tools for comparative analysis. The study was conducted through a literature review along with the production of the dresses. In the production, two wedding dresses for a small wedding ceremony were designed. Each design was made both as a 3D garment and as an actual garment. The results are as follows. First, the 3D whole-body scanner captures exact human body measurements; however, it was difficult to match what the customer wanted because of differences in skin color and hairstyle. Second, the pattern of the dress is much more easily altered in 3D than in real production. Third, the silhouettes of the virtual and the actual person wearing the dress were nearly the same. Fourth, the textile tool was much more convenient because of real-time rendering on the virtual dresses. Lastly, the lace and bead decorations appeared flat, and the luster was duller than in reality. In the future, consumers will be able to choose from a variety of designs using an avatar without wearing the actual dresses, and they will request designs that differ from those presented by making corrections themselves. Through this process, consumers will actively participate in the design, which will ultimately lead to two-way design rather than the one-way design of the present.
https://doi.org/10.12940/jfb.2017.21.6.47
