• Title/Summary/Keyword: Transformer

A medium-range streamflow forecasting approach over South Korea using Double-encoder-based transformer model (다중 인코더 기반의 트랜스포머 모델을 활용한 한반도 대규모 유역에 중장기 유출량 예측 전망 방법 제시)

  • Dong Gi Lee;Sung-Hyun Yoon;Kuk-Hyun Ahn
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2023.05a
    • /
    • pp.101-101
    • /
    • 2023
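  • Over the past few decades, a variety of deep learning methods have been developed, and in hydrology it has been suggested that such deep learning models could take over the role of conventional hydrological models. This study applies a transformer model with multiple encoders to examine the feasibility of streamflow forecasting for South Korea at medium-range lead times (1 to 10 days). The transformer model has an encoder-decoder structure and uses an attention mechanism to compensate for the information loss that is a weakness of earlier models. The multi-encoder transformer used in this study adds one more encoder to the transformer's encoder-decoder structure. For comparison, a forecasting model based on a stacking ensemble model that uses existing hydrological models was also built. The models were evaluated by dividing all of South Korea into 469 large grid cells and comparing the streamflow in each cell. The deep learning multi-encoder transformer model outperformed the hydrological model at longer lead times, which confirms that deep learning models can, to some extent, take over the role of hydrological models while achieving high performance.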
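
The attention mechanism mentioned in this abstract can be illustrated with a minimal sketch of scaled dot-product attention for a single query. This is not the authors' multi-encoder model; the function names and the toy vectors are assumptions chosen only for illustration.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector (toy sketch)."""
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Output is the attention-weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights

# Hypothetical toy inputs: two time steps with 2-dimensional features.
output, weights = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
```

The weights sum to one, and the time step whose key best matches the query contributes most to the output.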

An Automated Production System Design for Natural Language Processing Models Using Korean Pre-trained Model (한국어 사전학습 모델을 활용한 자연어 처리 모델 자동 산출 시스템 설계)

  • Jihyoung Jang;Hoyoon Choi;Gun-woo Lee;Myung-seok Choi;Charmgil Hong
    • Annual Conference on Human and Language Technology
    • /
    • 2022.10a
    • /
    • pp.613-618
    • /
    • 2022
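  • Since the Transformer architecture was proposed for effective natural language processing, large-scale pre-trained language models built on it, such as BERT, GPT, and OPT, have been released, followed by pre-trained models more specialized for Korean, such as KoBERT and KoGPT. Although the training resources available for obtaining natural language processing models are growing, applying a pre-trained model to an application task still requires complex steps such as data preparation, code writing, fine-tuning, and saving. This remains a challenging process for many application users, and obtaining correct results is not easy. To ease this difficulty and make it easier to apply various machine learning models to user data, techniques collectively called AutoML, such as automatic hyperparameter search and model architecture search, are being devised. This study formalizes and proceduralizes the process of producing a natural language processing model from a Korean pre-trained model and Korean text data, and introduces the design of a system that ultimately produces the target prediction model automatically.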

DART: Data Augmentation using Retrieval Technique (DART: 검색 모델 기술을 사용한 데이터 증강 방법론 연구)

  • Seungjun Lee;Jaehyung Seo;Jungseob Lee;Myunghoon Kang;Hyeonseok Moon;Chanjun Park;Dahyun Jung;Jaewook Lee;Kinam Park;Heuiseok Lim
    • Annual Conference on Human and Language Technology
    • /
    • 2022.10a
    • /
    • pp.313-319
    • /
    • 2022
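  • Transformer-based models such as BERT have recently shown strong performance on many natural language processing tasks, including natural language understanding (NLU). These models still require training on large amounts of data. In general, data augmentation helps in low-resource settings. Recent work has attempted to augment data by generating synthetic data with generative models. Such methods aim to increase lexical and structural diversity without damaging semantic similarity to the original sentence. This paper proposes a data augmentation method that takes task-oriented vocabulary and structure into account. To this end, it uses a retrieval model and a pre-trained generative model. The retrieval model retrieves sentence pairs similar to the input sentences of the training dataset. The retrieved pairs are then used to train the generative model, which generates the synthetic data. In low-resource settings the method improved baseline performance by 4% or more at best, and it yielded larger gains than existing data augmentation methods.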
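
The retrieval step described in this abstract can be sketched as follows. The abstract does not specify the retrieval model, so this sketch substitutes a plain bag-of-words cosine similarity; `cosine_similarity`, `retrieve_pairs`, and the example sentences are hypothetical names and data used only for illustration.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    # Bag-of-words cosine similarity; a stand-in for a learned retrieval model.
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def retrieve_pairs(train_sentences, pool, top_k=1):
    """Pair each training sentence with its top_k most similar pool sentences."""
    pairs = []
    for sentence in train_sentences:
        ranked = sorted(pool, key=lambda c: cosine_similarity(sentence, c), reverse=True)
        pairs.extend((sentence, candidate) for candidate in ranked[:top_k])
    return pairs

# Hypothetical training sentence and retrieval pool.
train = ["book a flight to seoul"]
pool = ["reserve a flight to busan", "what is the weather today", "play some jazz music"]
pairs = retrieve_pairs(train, pool)
```

In the setting the abstract describes, pairs like these would then serve as training data for the generative model; that step is omitted here.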

Charging and Persistent-Current Mode Operating Characteristics of BSCCO Magnet Using High-Tc Superconducting Power Supply (고온 초전도 전원장치를 이용한 BSCCO Magnet의 충전 및 영구전류 운전 특성)

  • Jo, Hyun-Chul;Yang, Seong-Eun;Kim, Young-Jae;Hwang, Young-Jin;Yoon, Yong-Soo;Chung, Yoon-Do;Ko, Tae-Kuk
    • Progress in Superconductivity and Cryogenics
    • /
    • v.11 no.1
    • /
    • pp.30-34
    • /
    • 2009
  • This paper examines the charging and persistent-current mode operating characteristics of a BSCCO magnet load driven by a high-temperature superconducting (HTS) power supply. The HTS power supply consists of two heater-triggered switches, an iron-core transformer with a primary copper winding and a secondary BSCCO solenoid, and a BSCCO magnet load. The magnet load was fabricated by double pancake winding and has an inductance of about 21 mH. A Hall sensor was installed at the middle of the magnet load to measure the load current. To investigate efficient pumping characteristics, the heater-triggered switches were tested with respect to dc heater current, and the electromagnet current was determined by considering the saturation characteristics of its iron core. The saturation characteristics of the charged current in the magnet load were observed for pumping periods of 12 s, 14 s, 24 s and 32 s. After the magnet load was charged, the persistent current was measured. The operating characteristics of the persistent-current mode were determined mainly by the joint resistance and the magnet load.

FakedBits- Detecting Fake Information on Social Platforms using Multi-Modal Features

  • Dilip Kumar, Sharma;Bhuvanesh, Singh;Saurabh, Agarwal;Hyunsung, Kim;Raj, Sharma
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.1
    • /
    • pp.51-73
    • /
    • 2023
  • Social media play a significant role in communicating information across the globe, connecting with loved ones, getting the news, communicating ideas, etc. However, some people use social media to spread fake information, which harms society. Minimizing fake news and detecting it are therefore the two primary challenges that need to be addressed. This paper presents a multi-modal deep learning technique to address these challenges. The proposed model can process both visual and textual features, and can therefore detect fake information in visual and textual data. We used EfficientNetB0 for detecting counterfeit images and a sentence transformer for textual learning. Feature embedding is performed in the individual channels, whilst fusion is done at the last classification layer. Late fusion is applied intentionally to mitigate the noisy data generated by multiple modalities. Extensive experiments are conducted, and performance is evaluated against state-of-the-art methods. Three real-world benchmark datasets, MediaEval (Twitter), Weibo, and Fakeddit, are used for experimentation. The results show that the proposed model outperformed the state-of-the-art methods and achieved accuracies of 86.48%, 82.50%, and 88.80% on the MediaEval (Twitter), Weibo, and Fakeddit datasets, respectively.

Generative Interactive Psychotherapy Expert (GIPE) Bot

  • Ayesheh Ahrari Khalaf;Aisha Hassan Abdalla Hashim;Akeem Olowolayemo;Rashidah Funke Olanrewaju
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.4
    • /
    • pp.15-24
    • /
    • 2023
  • One of the objectives and aspirations of scientists and engineers ever since the development of computers has been to interact naturally with machines. Hence features of artificial intelligence (AI) such as natural language processing and natural language generation were developed. Interactive conversational systems are thought to be the fastest-growing field of AI. Numerous businesses have created Virtual Personal Assistants (VPAs) using these technologies, including Apple's Siri, Amazon's Alexa, and Google Assistant, among others. Even though many chatbots have been introduced over the years to diagnose or treat psychological disorders, a user-friendly chatbot is not yet available. A smart generative cognitive behavioral therapy system with spoken dialogue support was therefore developed using a Persona Perception (P2) bot model with Generative Pre-trained Transformer-2 (GPT-2). The model was then implemented using modern VPA technologies such as voice recognition, Natural Language Understanding (NLU), and text-to-speech. The system is well suited to voice-based use because it can hold therapeutic discussions with users through an interactive text and voice experience.

A study on the effectiveness of intermediate features in deep learning on facial expression recognition

  • KyeongTeak Oh;Sun K. Yoo
    • International journal of advanced smart convergence
    • /
    • v.12 no.2
    • /
    • pp.25-33
    • /
    • 2023
  • The purpose of this study is to evaluate the impact of intermediate features on FER performance. To achieve this objective, intermediate features were extracted from the input images at specific layers (FM1~FM4) of the pre-trained network (Resnet-18). These extracted intermediate features and original images were used as inputs to the vision transformer (ViT), and the FER performance was compared. As a result, when using a single image as input, using intermediate features extracted from FM2 yielded the best performance (training accuracy: 94.35%, testing accuracy: 75.51%). When using the original image as input, the training accuracy was 91.32% and the testing accuracy was 74.68%. However, when combining the original image with intermediate features as input, the best FER performance was achieved by combining the original image with FM2, FM3, and FM4 (training accuracy: 97.88%, testing accuracy: 79.21%). These results imply that incorporating intermediate features alongside the original image can lead to superior performance. The findings can be referenced and utilized when designing the preprocessing stages of a deep learning model in FER. By considering the effectiveness of using intermediate features, practitioners can make informed decisions to enhance the performance of FER systems.

Scientific Paper Abstract Corpus and Automatic Abstract Structure Parsing using Pretrained Transformer (과학 논문 초록 말뭉치 구축 및 선학습 트랜스포머 기반 초록 자동구조화 방법)

  • Kim, Seokyung;Cho, Yunhui;Heo, Sehun;Jung, Sangkeun
    • Annual Conference on Human and Language Technology
    • /
    • 2020.10a
    • /
    • pp.280-283
    • /
    • 2020
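  • A paper abstract summarizes the content of the paper and thereby helps readers quickly find and understand research results. Because most abstracts follow a typical structure, automatically analyzing and indexing that structure can improve research efficiency, for example by making it possible to retrieve or generate abstracts with a similar structure. Heo Sehun et al. (2019) presented SPA2019, a corpus for automatic abstract structuring, together with a machine-learning-based automatic structuring method. This study corrects the structuring errors in the existing SPA2019 and newly introduces the SPA2020 corpus, which combines 1,346 abstracts extracted from SPA2019 with 2,385 additional abstracts. In addition, automatic abstract structuring was performed with various pre-trained transformers, and the results showed structuring performance of BERT-0.86%, RoBERTa-0.86%, ALBERT-0.84%, XLNet-0.86%, and DistilBERT-0.85%.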

Structural reliability analysis using temporal deep learning-based model and importance sampling

  • Nguyen, Truong-Thang;Dang, Viet-Hung
    • Structural Engineering and Mechanics
    • /
    • v.84 no.3
    • /
    • pp.323-335
    • /
    • 2022
  • The main idea of the framework is to seamlessly combine a reasonably accurate and fast surrogate model with the importance sampling strategy. Developing a surrogate model for predicting structures' dynamic responses is challenging because it involves high-dimensional inputs and outputs. For this purpose, a novel surrogate model is designed, based on cutting-edge deep learning architectures specialized for capturing temporal relationships within time-series data, namely the long short-term memory (LSTM) layer and the Transformer layer. Once properly trained, the surrogate model can be used in place of the finite element method to evaluate structures' responses without requiring any specialized software. Importance sampling, in turn, is adopted to reduce the number of calculations required when computing the failure probability by drawing more relevant samples near critical areas. Thanks to the portability of the trained surrogate model, it can be integrated with importance sampling in a straightforward fashion, forming an efficient framework called TTIS, which offers two advantages: fewer calculations are needed, and the computational time of each calculation is significantly reduced. The applicability and efficiency of the proposed approach are demonstrated through three examples of increasing complexity, involving a 1D beam, a 2D frame, and a 3D building structure. The results show that, compared to conventional Monte Carlo simulation, the proposed method can provide highly similar reliability results with a reduction of up to four orders of magnitude in time complexity.
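
The role of importance sampling in estimating a small failure probability can be sketched with a toy problem. This is not the TTIS framework: a closed-form limit-state function stands in for the trained surrogate model, the input is a single standard normal variable, and `limit_state`, `failure_probability_is`, and the threshold of 3.0 are assumptions made for illustration.

```python
import math
import random

def limit_state(x, threshold=3.0):
    # Toy stand-in for a surrogate model: failure occurs when the value is negative.
    return threshold - x

def failure_probability_is(n_samples, threshold=3.0, seed=0):
    """Estimate P(limit_state(X) < 0) for X ~ N(0, 1) by importance sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Draw from a proposal N(threshold, 1) centered on the critical region.
        x = rng.gauss(threshold, 1.0)
        if limit_state(x, threshold) < 0.0:
            # Weight = N(0, 1) density / N(threshold, 1) density at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples

# Exact tail probability of a standard normal beyond 3.0, for comparison.
exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))
estimate = failure_probability_is(20000)
```

With the proposal centered on the failure boundary, roughly half of the samples land in the failure region, whereas plain Monte Carlo sampling from N(0, 1) would see a failure only about once in 740 draws.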

Robust Sentiment Classification of Metaverse Services Using a Pre-trained Language Model with Soft Voting

  • Haein Lee;Hae Sun Jung;Seon Hong Lee;Jang Hyun Kim
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.9
    • /
    • pp.2334-2347
    • /
    • 2023
  • Metaverse services generate text data, a form of ubiquitous-computing data, in real time, and this data can be used to analyze user emotions. Analysis of user emotions is an important task in metaverse services. This study aims to classify user sentiments using deep learning and pre-trained language models based on the transformer structure. Previous studies collected data from a single platform, whereas the current study used review data retrieved with the keyword "Metaverse" from the YouTube and Google Play Store platforms for general utilization. The Bidirectional Encoder Representations from Transformers (BERT) and Robustly optimized BERT approach (RoBERTa) models combined through a soft voting mechanism achieved the highest accuracy, 88.57%. In addition, the area under the curve (AUC) score of the ensemble model comprising RoBERTa, BERT, and A Lite BERT (ALBERT) was 0.9458. The results demonstrate that the ensemble combined with the RoBERTa model exhibits good performance. Therefore, the RoBERTa model can be applied on platforms that provide metaverse services. The findings contribute to the advancement of natural language processing techniques in metaverse services, which are increasingly important in digital platforms and virtual environments. Overall, this study provides empirical evidence that sentiment analysis using deep learning and pre-trained language models is a promising approach to improving user experiences in metaverse services.
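
Soft voting, as used in this abstract, averages the class probabilities predicted by each model and picks the class with the highest average. The sketch below shows only that averaging step; `soft_vote` and the probability values are hypothetical and are not taken from the study.

```python
def soft_vote(prob_lists):
    """Average per-class probabilities across models; return (label, averaged)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    averaged = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    # The predicted label is the class with the highest averaged probability.
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged

# Hypothetical probabilities for one review, ordered as [negative, positive].
bert_probs = [0.40, 0.60]
roberta_probs = [0.70, 0.30]
albert_probs = [0.55, 0.45]
label, averaged = soft_vote([bert_probs, roberta_probs, albert_probs])
```

In this example two of the three models lean negative and the averaged probabilities agree, so the ensemble predicts class 0. Unlike hard voting, a single very confident model can outweigh two weakly confident ones.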