Research on data augmentation algorithm for time series based on deep learning

  • Shiyu Liu (School of Digital Economics and Management, Wuxi University);
  • Hongyan Qiao (Wuxi Yunyin Technology Group Co., Ltd.);
  • Lianhong Yuan (Hangzhou Polytechnic);
  • Yuan Yuan (Zhejiang Guozi Robot Technology Co., Ltd.);
  • Jun Liu (Zhejiang Tuofeng Intelligent Equipment Co., Ltd.)
  • Received: 2023.01.04
  • Accepted: 2023.05.30
  • Published: 2023.06.30

Abstract

Data monitoring is an important foundation of modern science. In most cases, the monitored data are time series, which have high application value. Deep learning algorithms have a strong nonlinear fitting capability, which enables the recognition of time series by capturing anomalous information in them. At present, research on time series recognition based on deep learning is especially important for data monitoring. Deep learning algorithms require a large amount of data for training. However, abnormal samples form only a small fraction of a time series database, and this class imbalance can seriously degrade the accuracy of recognition algorithms. In order to increase the number of abnormal samples, a data augmentation method called GANBATS (GAN-based Bi-LSTM and Attention for Time Series) is proposed. In GANBATS, a Bi-LSTM is introduced to extract timing features and transfer them to the generator network. GANBATS also modifies the discriminator network by adding an attention mechanism to achieve global attention over time series. At the end of the discriminator, GANBATS adds an average pooling layer, which merges temporal features to boost operational efficiency. In this paper, four time series datasets and five data augmentation algorithms are used for comparison experiments. The generated data are measured by PRD (Percent Root Mean Square Difference) and DTW (Dynamic Time Warping). The experimental results show that GANBATS reduces the PRD metric to as low as 26.22 and the DTW metric to 9.45. In addition, this paper uses different algorithms to reconstruct the datasets and compares them by classification accuracy, which is improved by 6.44%-12.96% on the four time series datasets.

1. Introduction

In recent years, the world has achieved rapid technological development in fields including electricity, transportation and healthcare [1-3]. The methods of data monitoring in various industries have changed significantly after the application of various advanced technologies. With the upgrading of monitoring equipment, the efficiency of data acquisition continues to improve, and increasing the frequency of monitoring and self-inspection can effectively prevent many hidden dangers. Data monitoring is now receiving increasing attention, which poses many challenges for monitoring technology. At present, various kinds of monitoring data can be obtained through various types of sensors [4], which significantly boosts the safety and reliability of monitoring.

Time series data is a very important component of monitoring data, including industrial monitoring data, geographic monitoring data and health monitoring data. This means that a significant portion of monitoring data is in the form of time series. It is worth noting that most negative issues worsen gradually over time, which means there is a high correlation between monitoring data and time.

At present, the interpretation of time series is highly subjective, since people rely mainly on experience in related fields. Fortunately, with the rapid development of computer-aided recognition techniques based on deep learning, recognition assistance systems for time series have received wide attention [5-7]. Through such systems, the relevant staff can read the deep information in a time series and determine the health condition of the equipment (individual).

However, the recognition accuracy of deep learning-based recognition assistance systems for time series is affected by the lack of abnormal time series. In detail, there is a class imbalance between normal and abnormal time series, which can seriously affect the recognition accuracy of these systems. Therefore, a large number of studies have been conducted by researchers to address the class imbalance. Among these methods, GAN (Generative Adversarial Networks) [8] can generate a large amount of data, and the generated data have local features with high similarity to real data. Based on GAN, this paper develops GANBATS (GAN-based Bi-LSTM and Attention for Time Series) to generate time series for improving the classification accuracy of recognition assistance systems.

The rest of this paper is organized as follows. Related works are reviewed in Section 2. Section 3 introduces GANBATS in detail before the experimental results are analyzed in Section 4. Section 5 concludes the paper and discusses future work.

2. Related Work

Recognition assistance systems for time series can help staff obtain a more accurate status of the equipment (individual). However, actual databases contain a large number of normal time series and a very small number of abnormal time series, which makes it more difficult to train deep learning models. The reason is that a highly imbalanced data distribution makes the classification model more inclined to fit the class of normal data. Currently, data augmentation methods have received wide attention as a way of solving the class imbalance. Data augmentation methods are usually divided into traditional methods and deep learning-based methods.

The traditional data augmentation methods mainly include random transformations and statistical models. First, time series can be considered as two-dimensional signals of small width, so some methods can be borrowed from image data augmentation. Scholars therefore invented methods such as segment interception, rotation, narrowing and expanding [9], and adding noise for time series data augmentation; a minimal sketch of several such transformations is given below. Time series data augmentation can generally be divided into three categories: amplitude domain, time domain and frequency domain. Amplitude domain transformation is mainly used for multi-dimensional time series. The time domain is the most common representation of time series. The frequency domain mainly appears in the field of industrial control. However, the common features of random transformation methods are randomness and instability.
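
As an illustration only, the following sketch shows three of these random transformations (jittering with noise, amplitude scaling, and segment interception); the function names and noise levels are illustrative assumptions, not the exact operations of [9].

```python
import numpy as np

def jitter(x: np.ndarray, sigma: float = 0.03) -> np.ndarray:
    """Time-domain augmentation: add Gaussian noise to a 1-D series."""
    return x + np.random.normal(loc=0.0, scale=sigma, size=x.shape)

def scale(x: np.ndarray, sigma: float = 0.1) -> np.ndarray:
    """Amplitude-domain augmentation: multiply the series by a random factor."""
    return x * np.random.normal(loc=1.0, scale=sigma)

def window_slice(x: np.ndarray, ratio: float = 0.9) -> np.ndarray:
    """Segment interception: cut a random contiguous window from the series."""
    n = int(len(x) * ratio)
    start = np.random.randint(0, len(x) - n + 1)
    return x[start:start + n]
```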

Through further development, researchers have realized that time series have characteristic distributions, and a generation model can be used to sample time series from these feature distributions. We divide generation models into two categories: statistical models and deep learning-based models. Generation algorithms based on statistical models focus on obtaining the distribution of the data. The most classic of these is SMOTE [10]. The object of the SMOTE algorithm is still the imbalanced dataset; its core is to reconstruct a new dataset based on the small number of minority samples (a sketch of this interpolation step follows below). Another category is clustering-based data augmentation, which refers to clustering normal and abnormal data and then using oversampling to expand the database. The advantage of clustering-based methods is that they can generate different numbers of new samples for different minority classes according to the specific data distribution. Such methods generate samples by giving each minority class sample a weight that is positively related to the number of surrounding majority class samples. However, the data distance affects the quality of data generated by clustering-based augmentation methods.
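
The core interpolation step of SMOTE can be sketched as follows; this is a simplified version assuming Euclidean nearest neighbors, not the full algorithm of [10].

```python
import numpy as np

def smote_sample(minority: np.ndarray, k: int = 5, n_new: int = 100) -> np.ndarray:
    """Generate n_new synthetic samples by interpolating between each minority
    sample and one of its k nearest minority neighbors."""
    n, d = minority.shape
    synthetic = np.empty((n_new, d))
    for i in range(n_new):
        j = np.random.randint(n)
        # distances from sample j to all other minority samples
        dist = np.linalg.norm(minority - minority[j], axis=1)
        neighbors = np.argsort(dist)[1:k + 1]        # skip the sample itself
        nb = minority[np.random.choice(neighbors)]
        gap = np.random.rand()                        # interpolation factor in [0, 1]
        synthetic[i] = minority[j] + gap * (nb - minority[j])
    return synthetic
```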

In recent years, deep learning-based methods have become popular. GAN has powerful data fitting and generation capability due to its innovative network structure, which is able to generate specific signals from input noise. Since its introduction in 2014, GAN has been widely used in various data augmentation tasks.

A GAN model is mainly divided into a discriminator network and a generator network. The discriminator network is a binary classifier that judges the output samples of the generative model: if the input samples come from the real training data, the discriminator output is large, otherwise it is small. The structure of the GAN model is shown in Fig. 1.

Fig. 1. Structure of GAN

However, traditional generator networks have limited modeling and fitting capabilities and often perform poorly when training complex signals (such as natural language and 3D scenes). In addition, the training process of GAN models is not stable enough. Therefore, scholars have improved these aspects based on the GAN model.

In order to solve these problems in the GAN model, scholars have proposed a number of different GAN-based models to reconstruct new datasets. The DCGAN (Deep Convolutional GAN) model [11] replaces all pooling layers with convolutional layers. WGAN improves the data generation capability of the generator network by fitting the Wasserstein distance with the discriminator [12]. WGAN-GP introduces a gradient penalty strategy into WGAN, which greatly enhances training stability [13].

3. GAN-based Bi-LSTM and Attention for Time Series

The research goal of this paper is to improve computer recognition of timing signals by solving the class imbalance in time series. The lack of large amounts of time series data can have a negative impact when designing deep learning models. Therefore, this paper attempts to generate time series using GAN-based models.

According to the composition of GAN in Fig. 1, the core idea of GAN is that G and D are constantly updated against each other, so that G eventually simulates data very close to the real data. The optimization objective of GAN is shown in Eq. 1:

\(\begin{aligned}\min _{G} \max _{D} V(D, G)\end{aligned}\)       (1)

First, the generated signal with the true label is passed into the discriminator network, and the generated signal is required to deceive the discriminator, which corresponds to minimizing the objective function V(D, G). Then, the GAN-based model feeds the real signal and the generated signal into the discriminator network. The discriminator network needs to distinguish the real signal from the generated signal as well as possible, which corresponds to maximizing the objective function V(D, G). The GAN-based model is trained to find the balance between maximizing and minimizing the objective function so as to generate a more realistic signal. The objective function to be optimized is shown in Eq. 2.

\(\begin{aligned}V(D, G)=\mathbb{E}_{x \sim P_{\text {data }}}[\log D(x)]+\mathbb{E}_{z \sim P_{G}}[\log (1-D(G(z)))]\end{aligned}\)       (2)

In Eq. 2, the discriminator D is trained to maximize the objective while the generator G is trained to minimize it. D(x) denotes the probability that the discriminator judges a true abnormal signal to be true. D(G(z)) denotes the probability that the discriminator judges the generated signal to be true. Pdata denotes the true sample distribution, and PG denotes the prior distribution of the random noise z.
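
A minimal sketch of this alternating optimization in PyTorch might look as follows; the networks G and D and the noise dimension are placeholders, and binary cross-entropy is used as a stand-in for the expectations in Eq. 2.

```python
import torch
import torch.nn as nn

def train_step(G, D, real, opt_G, opt_D, noise_dim=100):
    """One alternating update: D maximizes V(D, G), then G minimizes it."""
    bce = nn.BCELoss()
    batch = real.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # --- Discriminator step: push D(real) -> 1 and D(G(z)) -> 0 ---
    z = torch.randn(batch, noise_dim)
    fake = G(z).detach()                  # no generator gradient in this step
    loss_D = bce(D(real), ones) + bce(D(fake), zeros)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # --- Generator step: push D(G(z)) -> 1 to deceive the discriminator ---
    z = torch.randn(batch, noise_dim)
    loss_G = bce(D(G(z)), ones)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()
```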

The goal of this section is to propose an algorithm for reconstructing class-imbalanced datasets. A timing signal is usually a long, multi-feature sequence, which means the timing features in the signal need to be taken into account in a data augmentation algorithm. Therefore, we propose a new data generation model, GANBATS (GAN-based Bi-LSTM and Attention for Time Series), designed by introducing a Long Short-Term Memory (LSTM) network [14] into the model; the diagram of GANBATS is shown in Fig. 2.

Fig. 2. Diagram of GANBATS

The GANBATS operation process is as follows:

1) Data input: Input the data into the generator network of the GANBATS model. The data are first fed into the Bi-LSTM, whose main function is to extract temporal features from the time series and to alleviate the gradient problems that occur during training. The output of the Bi-LSTM is then transferred to the G-part of the generator network, and the generated pseudo samples are obtained through several steps such as convolution and batch normalization.

2) The time series generated by the generator network and the real time series are sent into the discriminator network of GANBATS, whose main function is to determine the authenticity of the generated time series. The attention mechanism [15] is introduced in the discriminator network to extract the global temporal features of the generated time series. The probabilities that the generated and true time series are real are obtained in the output part of the discriminator network. When generated time series are judged to be true, the samples are re-entered into the discriminator network for training.

3) After several training epochs, model training ends when the discriminator network is unable to distinguish the time series generated by the generator network from the real ones.

3.1 Generator Network of GANBATS

Considering the characteristics of time series, the generator network of the GANBATS model is improved in this section. GANBATS adds a Bi-LSTM to the beginning of the original GAN generator network. Adding a Bi-LSTM to the generator network brings two benefits: one is to fuse information from different positions in the time series, which alleviates the vanishing gradient problem when training on long time series; the second is to obtain multi-dimensional features through the Bi-LSTM. The improved generator network of GANBATS is shown in Fig. 3.

Fig. 3. Generator of GANBATS

The generator network of GANBATS has three parts: the data input layer, the Bi-LSTM network and the G-parts, which operate on the input data in turn. The Bi-LSTM network has two layers with an input feature dimension of 1 and a hidden size of 64. The Bi-LSTM is followed by a convolutional structure, where each of the three G-parts consists of convolution, a BN layer and a ReLU activation function, and the last layer is the output structure.

In this paper, the whole time signal is taken as the target domain signal, and the random noise in the input data follows the standard Gaussian distribution. The noise acquires temporal characteristics through the Bi-LSTM network and is then passed through the G-parts, which progressively compress the sequence so that it fits the target time series as closely as possible.
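
Under the stated dimensions (input feature dimension 1, hidden size 64, two Bi-LSTM layers followed by three Conv-BN-ReLU G-parts), a possible PyTorch sketch of the generator is given below; the kernel sizes, channel counts and output layer are illustrative assumptions, since the paper does not specify them.

```python
import torch
import torch.nn as nn

class GANBATSGenerator(nn.Module):
    """Sketch: Bi-LSTM front-end followed by three Conv-BN-ReLU G-parts."""
    def __init__(self, hidden=64):
        super().__init__()
        # two-layer bidirectional LSTM, input feature dimension 1, hidden size 64
        self.bilstm = nn.LSTM(input_size=1, hidden_size=hidden, num_layers=2,
                              batch_first=True, bidirectional=True)
        def g_part(c_in, c_out):
            return nn.Sequential(nn.Conv1d(c_in, c_out, kernel_size=3, padding=1),
                                 nn.BatchNorm1d(c_out), nn.ReLU())
        self.g_parts = nn.Sequential(g_part(2 * hidden, 64),
                                     g_part(64, 32),
                                     g_part(32, 16))
        self.out = nn.Conv1d(16, 1, kernel_size=1)   # assumed 1-channel output layer

    def forward(self, z):                 # z: (batch, seq_len, 1) Gaussian noise
        h, _ = self.bilstm(z)             # (batch, seq_len, 2*hidden)
        h = h.transpose(1, 2)             # to (batch, channels, seq_len) for Conv1d
        return self.out(self.g_parts(h))  # (batch, 1, seq_len) generated series
```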

The core improvement in the generator network of GANBATS proposed in this section is the introduction of the Bi-LSTM. An LSTM neural network consists of an input layer, multiple hidden layers and an output layer, with many neurons within each hidden layer. In LSTM, the gate is the most basic structure; the information flow is controlled by the sigmoid function in Eq. 3.

\(\begin{aligned}\sigma=\frac{1}{1+e^{-t}}\end{aligned}\)       (3)

The basic unit of LSTM is called LSTM cell. It has three kinds of gates, namely forget gate, input gate and output gate, which maintain the performance and state of LSTM. The structure of LSTM cell is shown in Fig. 4.

Fig. 4. Structure of LSTM cell

Fig. 4 shows the transformation of data in an LSTM cell. Each LSTM cell has three inputs and three outputs. According to Fig. 4, Xt is the newly added information at the current time, while at-1 and Ct-1 are accumulations of past information. Xt and at-1 enter the three gates of the LSTM cell: the weight Wf of the forget gate is activated by σ to obtain the forget weight, the weight Wi of the input gate is activated by σ to obtain the input weight, and the weight Wo of the output gate is activated by σ to obtain the output weight. Through the operation of the three gates, the LSTM adds new information and updates the state, which is Ct.
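
For completeness, the gate computations implied by Fig. 4 can be written out; this is the standard LSTM formulation, with notation matched to the figure (a for the hidden state, C for the cell state) and the usual bias terms b added:

\(\begin{aligned} f_{t} &=\sigma\left(W_{f} \cdot\left[a_{t-1}, X_{t}\right]+b_{f}\right) \\ i_{t} &=\sigma\left(W_{i} \cdot\left[a_{t-1}, X_{t}\right]+b_{i}\right) \\ o_{t} &=\sigma\left(W_{o} \cdot\left[a_{t-1}, X_{t}\right]+b_{o}\right) \\ \tilde{C}_{t} &=\tanh \left(W_{C} \cdot\left[a_{t-1}, X_{t}\right]+b_{C}\right) \\ C_{t} &=f_{t} \odot C_{t-1}+i_{t} \odot \tilde{C}_{t} \\ a_{t} &=o_{t} \odot \tanh \left(C_{t}\right) \end{aligned}\)

where ⊙ denotes element-wise multiplication.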

The LSTM network performs the above operations on the input vector at each moment until all elements in the time series are processed, and finally the states at all moments are output. By processing the sequence in both directions, information about the future of the sequence can also be obtained; in other words, bi-directional information can be obtained through the Bi-LSTM. The structure of Bi-LSTM is shown in Fig. 5.

Fig. 5. Structure of Bi-LSTM

The Bi-LSTM added to the generator network of GANBATS is a bi-directional combination of LSTMs, which can extract twice the number of features from a time series. In general, the noise signal enters the Bi-LSTM; at different moments, the memory cell in the LSTM records the relevant information, while the forget gate discards some information that existed in the memory cell at the previous moment. Through this selective memory and forgetting of information at different moments, long-term memory of the time signal is realized, and the timing features are thereby extracted.

3.2 Discriminator Network of GANBATS

After the generator network generates new time series from Gaussian noise, the generated time series are sent into the discriminator network for judgment. The feature extraction capability and discrimination speed of the discriminator need to be improved to handle time series with many features and long length. Therefore, the discriminator network of GANBATS is improved in this section. GANBATS introduces the attention mechanism into the discriminator network to obtain global information about the time series, which enhances the feature extraction capability. For discrimination speed, GANBATS introduces a global average pooling layer at the tail of the discriminator network, which greatly reduces the number of parameters of the model. The improved discriminator network of GANBATS is shown in Fig. 6.

Fig. 6. Discriminator of GANBATS

From Fig. 6, the discriminator network of GANBATS consists of three components: the D-parts, the attention mechanism and a global average pooling layer, where each D-part consists of convolution, a ReLU activation function and a BN layer. This structure makes it easier to balance the adversarial behavior of the model. The attention mechanism and global average pooling are introduced in the discriminator instead of an MLP to improve the classification accuracy and speed of the discriminator; a sketch of this structure follows below.
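
A possible PyTorch sketch of this discriminator is shown below; the number of D-parts, the channel sizes, and the use of single-head nn.MultiheadAttention as the attention block are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GANBATSDiscriminator(nn.Module):
    """Sketch: Conv-ReLU-BN D-parts, self-attention, global average pooling."""
    def __init__(self, channels=64):
        super().__init__()
        def d_part(c_in, c_out):
            return nn.Sequential(nn.Conv1d(c_in, c_out, kernel_size=3, padding=1),
                                 nn.ReLU(), nn.BatchNorm1d(c_out))
        self.d_parts = nn.Sequential(d_part(1, 32), d_part(32, channels))
        self.attn = nn.MultiheadAttention(embed_dim=channels, num_heads=1,
                                          batch_first=True)
        self.pool = nn.AdaptiveAvgPool1d(1)   # global average pooling replaces MLP
        self.head = nn.Sequential(nn.Linear(channels, 1), nn.Sigmoid())

    def forward(self, x):                 # x: (batch, 1, seq_len), real or generated
        h = self.d_parts(x)               # (batch, channels, seq_len)
        h = h.transpose(1, 2)             # (batch, seq_len, channels) for attention
        h, _ = self.attn(h, h, h)         # self-attention over all moments
        h = self.pool(h.transpose(1, 2)).squeeze(-1)  # (batch, channels)
        return self.head(h)               # probability that the input is real
```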

Colloquially, the attention mechanism selects a small amount of important information from a large volume of information and focuses on it. In other words, the attention mechanism pays different amounts of attention to a time series at different moments, which amounts to applying different weights to different moments in the series. The calculation process of the attention mechanism is shown in Fig. 7.

Fig. 7. Attention mechanism calculation process

The attention mechanism can be described as the mapping of a Q (query) matrix to a series of K (key) and V (value) matrices. There are three main steps in calculating attention.

Step 1: Perform a similarity calculation (dot product) between the query and each key to get the weights:

\(\begin{aligned}f\left(Q, K_{x}\right)\end{aligned}\)       (4)

Step 2: Use a softmax function to normalize the weights:

\(\begin{aligned}\alpha_{x}=\frac{e^{f\left(Q, K_{x}\right)}}{\sum_{y=1}^{L} e^{f\left(Q, K_{y}\right)}}, \quad x, y=1,2, \ldots, L\end{aligned}\)       (5)

Step 3: Weight the values by the normalized weights and sum them to get the final score:

\(\begin{aligned}\sum_{x=1}^{L} \alpha_{x} V_{x}\end{aligned}\)       (6)

where L is the length of the time series, x and y are moments in the time series, and Q, K, V are three independent matrices. In this study, K and V are the same matrix, i.e., K = V. Finally, average pooling reduces the parameters produced by the attention mechanism to prevent overfitting.
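
Steps 1-3 can be condensed into a few lines; the sketch below assumes dot-product similarity for f and K = V, as stated above.

```python
import numpy as np

def attention_score(Q: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Eqs. 4-6 with dot-product similarity and V = K.
    Q: (d,) query vector; K: (L, d) key matrix for a series of length L."""
    weights = K @ Q                              # Eq. 4: f(Q, K_x) for each moment x
    alpha = np.exp(weights - weights.max())      # Eq. 5: softmax (shifted for stability)
    alpha /= alpha.sum()
    return alpha @ K                             # Eq. 6: weighted sum with V = K
```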

4. Experiment

4.1 Experimental environment and evaluation criteria

The experimental environment used in this paper is shown in Table 1. In this section, in order to evaluate the quality of the generated time series, two metrics, PRD (Percent Root Mean Square Difference) and DTW (Dynamic Time Warping) [16], are used. These two metrics measure the pointwise difference and the alignment distance between the generated and real data, respectively. The smaller the values of PRD and DTW, the better the quality of the generated time series.
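
As a reference for how these metrics are computed, minimal implementations are sketched below; the quadratic-time DTW is the textbook dynamic program, not necessarily the exact variant of [16].

```python
import numpy as np

def prd(real: np.ndarray, gen: np.ndarray) -> float:
    """Percent Root Mean Square Difference between two equal-length series."""
    return 100.0 * np.sqrt(np.sum((real - gen) ** 2) / np.sum(real ** 2))

def dtw(a: np.ndarray, b: np.ndarray) -> float:
    """Dynamic Time Warping distance via the standard O(len(a)*len(b)) recursion."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```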

Table 1. Experimental environment

4.2 Databases and training hyperparameters

The experiments in this section verify the ability of the GANBATS algorithm to reconstruct time series datasets. In this paper, the NASA Battery Data Set (battery database), XJTU-SY (bearing database) and ECG-MIT (ECG database) were selected for the experiments. The NASA Battery Data Set is an experimental database of ternary lithium batteries classified by temperature and discharge rate. XJTU-SY contains measured motor currents and vibration signals of 26 damaged bearing states and 6 undamaged (healthy) states. ECG-MIT is a joint ECG database created by MIT and Beth Israel Hospital; there are 48 records in total and each record is over 30 minutes long. We use the Adam optimizer in training, with an initial learning rate of 1e-4 (NASA Battery Data Set), 1e-3 (XJTU-SY) and 1e-4 (ECG-MIT). We set the batch size to 32 (NASA Battery Data Set), 128 (XJTU-SY) and 128 (ECG-MIT), train for 100 epochs, and halve the learning rate every 50 epochs.
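
These hyperparameters map directly onto a standard PyTorch optimizer and step scheduler; the sketch below shows the ECG-MIT configuration, with a placeholder model standing in for the GANBATS networks.

```python
import torch

# ECG-MIT configuration from Section 4.2: Adam, lr 1e-4, batch size 128,
# 100 epochs, learning rate halved every 50 epochs.
model = torch.nn.Linear(1, 1)                       # placeholder for GANBATS networks
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)

for epoch in range(100):
    # ... one pass over mini-batches of size 128 goes here ...
    scheduler.step()
```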

4.3 Experimental results analysis

4.3.1 Generated time series visualization

In this paper, the CGAN, DCGAN, WGAN, WGAN-GP and GANBATS models are selected to generate data on the three time series datasets. CGAN, DCGAN, WGAN and WGAN-GP are all mainstream data generation models, while GANBATS is the model proposed in this paper. First, the five data augmentation models are trained using normal and abnormal time series to form class-balanced datasets. Then, the performance of each data augmentation model is evaluated in terms of training convergence speed and classification accuracy. Fig. 8 shows four types of time series generated by GANBATS.

Fig. 8. Example of generated data

Fig. 8 shows the visualization results of the generated time series. From Fig. 8, it can be seen that GANBATS is able to generate long time series and to reproduce the local features of time series.

4.3.2 The influence of different parts of GANBATS

In the experiments, real data and generated data are sent to the discriminator model for training. The structure of the generator and discriminator networks of GANBATS affects the model training time and the quality of the reconstructed datasets. In other words, well-structured generator and discriminator networks not only improve the efficiency of the model, but also enhance the quality of the generated data. This section designs ablation experiments to verify the effect of the different parts of GANBATS on the time series generation task. The results are shown in Table 2.

Table 2. The quality of generated data of different parts in GANBATS

From Table 2, it can be seen that the data generated by the GAN model without Bi-LSTM in the generator are poor, with PRD and DTW of 61.38 and 21.72. With the addition of Bi-LSTM, the PRD and DTW decrease to 40.32 and 14.37; compared to the original GAN, the metrics decrease by 33.3% and 33.8%. When the discriminator adds the attention mechanism, the DTW of the model is further reduced to 9.45. When the model adds both the Bi-LSTM to the generator and the attention mechanism to the discriminator, the PRD and DTW are reduced to 42.7% and 43.5% of their initial values. Therefore, all of the components included in the proposed GANBATS model have positive effects on data augmentation.

Table 3 shows an ablation experiment on accuracy. From Table 3, the classification accuracy broadly corroborates the quality of the generated data reported in Table 2.

Table 3. The classification results of generated data of different parts in GANBATS

Fig. 9 shows the change of loss values during the training of the GANBATS model, which reflects the stability of the GANBATS model during training.

Fig. 9. Loss curve

The number of training iterations of the model is 100. It can be seen from Fig. 9 that the loss value of the GANBATS model decreases as the number of iterations increases, and the oscillation of the loss value becomes smaller. This indicates that the GANBATS model converges well, and that the addition of Bi-LSTM makes the training of the generator network more stable.

4.3.3 Comparison experiments

The experiments in this section analyze the reconstructed datasets based on applicability (classification accuracy), and Table 4 shows the classification accuracy of the five compared algorithms on the three datasets.

Table 4. Accuracy of different algorithms

Table 4 shows the classification accuracy of the five data augmentation algorithms after reconstructing the datasets. The experiments in this section compare the classification performance of the five algorithms under the same criteria and with the same classifier. The DCGAN model introduces a CNN into the GAN framework; however, DCGAN does not achieve the best classification performance in the comparison, possibly because the CNN has a large gap in receptive field across different time series. WGAN improves the way the distance between samples is calculated, which makes the generated data overlap less; as a result, the classification accuracy of WGAN is improved by up to 4.09% compared to DCGAN. WGAN-GP uses the gradient penalty strategy to make the training process more stable and further improves the quality of the reconstructed dataset, obtaining the highest accuracy of 89.88% on the XJTU-SY database. The GANBATS proposed in this paper introduces the Bi-LSTM in the generator network and the attention mechanism in the discriminator network, giving the data augmentation model better training quality and stability. From Table 4, the NASA Battery and ECG-MIT datasets reconstructed by GANBATS obtain the best accuracies of 94.27% and 98.75%, respectively.

5. Conclusion

A deep learning-based computer signal recognition assistance system can effectively improve the efficiency of equipment (personnel) monitoring and the timely detection of abnormalities. Time series monitoring signals are of great importance to the industrial, energy and medical fields. However, time series monitoring signals usually exhibit varying degrees of class imbalance, which creates a need for data augmentation to build class-balanced time series databases. In this paper, the GANBATS model is proposed for the characteristics of time series, and the time series generated by this model are used to reconstruct class-balanced databases. The performance of the GANBATS model is verified by two experiments. First, the authenticity of the generated time series is evaluated by the PRD and DTW metrics. Then, classification experiments are performed on four time series databases. The results show that the GANBATS algorithm has a good data augmentation effect on time series. In addition, deep learning research on temporal data includes not only data augmentation but also prediction, segmentation, etc. In future work, we will carry out research in these directions.

References

  1. Muzammal M, Talat R, Sodhro A H, et al., "Multi-sensor data fusion enabled ensemble approach for medical data from body sensor networks," Information Fusion, vol. 53, pp. 155-164, 2020. https://doi.org/10.1016/j.inffus.2019.06.021
  2. Nellore K, and Hancke G P, "A survey on urban traffic management system using wireless sensor networks," Sensors, vol. 16, no. 2, pp. 157, 2016.
  3. Zeng Z, Zeng F, Han X, et al., "Real-time monitoring of environmental parameters in a commercial gestating sow house using a zigbee-based wireless sensor network," Applied Sciences, vol. 11, no. 3, pp. 972, 2021.
  4. Marti E, De Miguel M A, Garcia F, et al., "A review of sensor technologies for perception in automated driving," IEEE Intelligent Transportation Systems Magazine, vol. 11, no. 4, pp. 94-108, 2019. https://doi.org/10.1109/MITS.2019.2907630
  5. Li J, Liu X, Zhang W, et al., "Spatio-temporal attention networks for action recognition and detection," IEEE Transactions on Multimedia, vol. 22, no. 11, pp. 2990-3001, 2020. https://doi.org/10.1109/TMM.2020.2965434
  6. Elmadany N E, He Y, Guan L, et al., "Improving Action Recognition via Temporal and Complementary Learning," ACM Transactions on Intelligent Systems and Technology, vol. 12, no. 3, pp. 1-24, 2021. https://doi.org/10.1145/3447686
  7. Ismail Fawaz H, Forestier G, Weber J, et al., "Deep learning for time series classification: a review," Data mining and knowledge discovery, vol. 33, no. 4, pp. 917-963, 2019. https://doi.org/10.1007/s10618-019-00619-1
  8. Creswell A, White T, Dumoulin V, et al., "Generative adversarial networks: An overview," IEEE Signal Processing Magazine, vol. 35, no. 1, pp. 53-65, 2018. https://doi.org/10.1109/MSP.2017.2765202
  9. Tsai C F, Lin W C, Hu Y, et al., "Under-sampling class imbalanced datasets by combining clustering analysis and instance selection," Information Sciences, vol. 477, pp. 47-54, 2019.
  10. Chawla N V, Bowyer K W, Hall L O, et al., "SMOTE: synthetic minority over-sampling technique," Journal of artificial intelligence research, vol. 16, pp. 321-357, 2002. https://doi.org/10.1613/jair.953
  11. Dewi C, Chen R C, Liu Y T, et al., "Synthetic Data generation using DCGAN for improved traffic sign recognition," Neural Computing and Applications, vol. 34, no. 24, pp. 21465-21480, 2022. https://doi.org/10.1007/s00521-021-05982-z
  12. Wang Q, Zhou X, Wang C, et al., "WGAN-based synthetic minority over-sampling technique: Improving semantic fine-grained classification for lung nodules in CT images," IEEE Access, vol. 7, pp. 18450-18463, 2019. https://doi.org/10.1109/ACCESS.2019.2896409
  13. Lee J, and Lee H, "Improving SSH detection model using IPA time and WGAN-GP," Computers & Security, vol. 116, pp. 102672, 2022.
  14. Yu Y, Si X, Hu C, et al., "A review of recurrent neural networks: LSTM cells and network architectures," Neural computation, vol. 31, no. 7, pp. 1235-1270, 2019. https://doi.org/10.1162/neco_a_01199
  15. Niu Z, Zhong G, Yu H, et al., "A review on the attention mechanism of deep learning," Neurocomputing, vol. 452, pp. 48-62, 2021. https://doi.org/10.1016/j.neucom.2021.03.091
  16. Zhang G, Kinsner W, and Huang B, "Electrocardiogram data mining based on frame classification by dynamic time warping matching," Computer methods in biomechanics and biomedical engineering, vol. 6, no. 12, pp. 701-707, 2009. https://doi.org/10.1080/10255840902882158