• Title/Summary/Keyword: Data detection

Search Result 8,029, Processing Time 0.038 seconds

A data corruption detection scheme based on ciphertexts in cloud environment

  • Guo, Sixu;He, Shen;Su, Li;Zhang, Xinyue;Geng, Huizheng;Sun, Yang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.9
    • /
    • pp.3384-3400
    • /
    • 2021
  • With the advent of the data era, people pay much more attention to data corruption. Aiming at the problem that the majority of existing schemes do not support corruption detection of ciphertext data stored in cloud environment, this paper proposes a data corruption detection scheme based on ciphertexts in cloud environment (DCDC). The scheme is based on the anomaly detection method of Gaussian model. Combined with related statistics knowledge and cryptography knowledge, the encrypted detection index for data corruption and corruption detection threshold for each type of data are constructed in the scheme according to the data labels; moreover, the detection token for data corruption is generated for the data to be detected according to the data labels, and the corruption detection of ciphertext data in cloud storage is realized through corresponding tokens. Security analysis shows that the algorithms in the scheme are semantically secure. Efficiency analysis and simulation results reveal that the scheme shows low computational cost and good application prospect.

Identification of Incorrect Data Labels Using Conditional Outlier Detection

  • Hong, Charmgil
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.8
    • /
    • pp.915-926
    • /
    • 2020
  • Outlier detection methods help one to identify unusual instances in data that may correspond to erroneous, exceptional, or surprising events or behaviors. This work studies conditional outlier detection, a special instance of the outlier detection problem, in the context of incorrect data label identification. Unlike conventional (unconditional) outlier detection methods that seek abnormalities across all data attributes, conditional outlier detection assumes data are given in pairs of input (condition) and output (response or label). Accordingly, the goal of conditional outlier detection is to identify incorrect or unusual output assignments considering their input as condition. As a solution to conditional outlier detection, this paper proposes the ratio-based outlier scoring (ROS) approach and its variant. The propose solutions work by adopting conventional outlier scores and are able to apply them to identify conditional outliers in data. Experiments on synthetic and real-world image datasets are conducted to demonstrate the benefits and advantages of the proposed approaches.

An Empirical Comparison Study on Attack Detection Mechanisms Using Data Mining (데이터 마이닝을 이용한 공격 탐지 메커니즘의 실험적 비교 연구)

  • Kim, Mi-Hui;Oh, Ha-Young;Chae, Ki-Joon
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.31 no.2C
    • /
    • pp.208-218
    • /
    • 2006
  • In this paper, we introduce the creation methods of attack detection model using data mining technologies that can classify the latest attack types, and can detect the modification of existing attacks as well as the novel attacks. Also, we evaluate comparatively these attack detection models in the view of detection accuracy and detection time. As the important factors for creating detection models, there are data, attribute, and detection algorithm. Thus, we used NetFlow data gathered at the real network, and KDD Cup 1999 data for the experiment in large quantities. And for attribute selection, we used a heuristic method and a theoretical method using decision tree algorithm. We evaluate comparatively detection models using a single supervised/unsupervised data mining approach and a combined supervised data mining approach. As a result, although a combined supervised data mining approach required more modeling time, it had better detection rate. All models using data mining techniques could detect the attacks within 1 second, thus these approaches could prove the real-time detection. Also, our experimental results for anomaly detection showed that our approaches provided the detection possibility for novel attack, and especially SOM model provided the additional information about existing attack that is similar to novel attack.

Anomaly Detection Scheme Using Data Mining Methods (데이터마이닝 기법을 이용한 비정상행위 탐지 방법 연구)

  • 박광진;유황빈
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.13 no.2
    • /
    • pp.99-106
    • /
    • 2003
  • Intrusions pose a serious security risk in a network environment. For detecting the intrusion effectively, many researches have developed data mining framework for constructing intrusion detection modules. Traditional anomaly detection techniques focus on detecting anomalies in new data after training on normal data. To detect anomalous behavior, Precise normal Pattern is necessary. This training data is typically expensive to produce. For this, the understanding of the characteristics of data on network is inevitable. In this paper, we propose to use clustering and association rules as the basis for guiding anomaly detection. For applying entropy to filter noisy data, we present a technique for detecting anomalies without training on normal data. We present dynamic transaction for generating more effectively detection patterns.

A Distance-based Outlier Detection Method using Landmarks in High Dimensional Data (고차원 데이터에서 랜드마크를 이용한 거리 기반 이상치 탐지 방법)

  • Park, Cheong Hee
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.9
    • /
    • pp.1242-1250
    • /
    • 2021
  • Detection of outliers deviating normal data distribution in high dimensional data is an important technique in many application areas. In this paper, a distance-based outlier detection method using landmarks in high dimensional data is proposed. Given normal training data, the k-means clustering method is applied for the training data in order to extract the centers of the clusters as landmarks which represent normal data distribution. For a test data sample, the distance to the nearest landmark gives the outlier score. In the experiments using high dimensional data such as images and documents, it was shown that the proposed method based on the landmarks of one-tenth of training data can give the comparable outlier detection performance while reducing the time complexity greatly in the testing stage.

A System for Improving Data Leakage Detection based on Association Relationship between Data Leakage Patterns

  • Seo, Min-Ji;Kim, Myung-Ho
    • Journal of Information Processing Systems
    • /
    • v.15 no.3
    • /
    • pp.520-537
    • /
    • 2019
  • This paper proposes a system that can detect the data leakage pattern using a convolutional neural network based on defining the behaviors of leaking data. In this case, the leakage detection scenario of data leakage is composed of the patterns of occurrence of security logs by administration and related patterns between the security logs that are analyzed by association relationship analysis. This proposed system then detects whether the data is leaked through the convolutional neural network using an insider malicious behavior graph. Since each graph is drawn according to the leakage detection scenario of a data leakage, the system can identify the criminal insider along with the source of malicious behavior according to the results of the convolutional neural network. The results of the performance experiment using a virtual scenario show that even if a new malicious pattern that has not been previously defined is inputted into the data leakage detection system, it is possible to determine whether the data has been leaked. In addition, as compared with other data leakage detection systems, it can be seen that the proposed system is able to detect data leakage more flexibly.

Emerging Topic Detection Using Text Embedding and Anomaly Pattern Detection in Text Streaming Data (텍스트 스트리밍 데이터에서 텍스트 임베딩과 이상 패턴 탐지를 이용한 신규 주제 발생 탐지)

  • Choi, Semok;Park, Cheong Hee
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.9
    • /
    • pp.1181-1190
    • /
    • 2020
  • Detection of an anomaly pattern deviating normal data distribution in streaming data is an important technique in many application areas. In this paper, a method for detection of an newly emerging pattern in text streaming data which is an ordered sequence of texts is proposed based on text embedding and anomaly pattern detection. Using text embedding methods such as BOW(Bag Of Words), Word2Vec, and BERT, the detection performance of the proposed method is compared. Experimental results show that anomaly pattern detection using BERT embedding gave an average F1 value of 0.85 and the F1 value of 1 in three cases among five test cases.

Intelligent Intrusion Detection Systems Using the Asymmetric costs of Errors in Data Mining (데이터 마이닝의 비대칭 오류비용을 이용한 지능형 침입탐지시스템 개발)

  • Hong, Tae-Ho;Kim, Jin-Wan
    • The Journal of Information Systems
    • /
    • v.15 no.4
    • /
    • pp.211-224
    • /
    • 2006
  • This study investigates the application of data mining techniques such as artificial neural networks, rough sets, and induction teaming to the intrusion detection systems. To maximize the effectiveness of data mining for intrusion detection systems, we introduced the asymmetric costs with false positive errors and false negative errors. And we present a method for intrusion detection systems to utilize the asymmetric costs of errors in data mining. The results of our empirical experiment show our intrusion detection model provides high accuracy in intrusion detection. In addition the approach using the asymmetric costs of errors in rough sets and neural networks is effective according to the change of threshold value. We found the threshold has most important role of intrusion detection model for decreasing the costs, which result from false negative errors.

  • PDF

A Study on Security Event Detection in ESM Using Big Data and Deep Learning

  • Lee, Hye-Min;Lee, Sang-Joon
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.13 no.3
    • /
    • pp.42-49
    • /
    • 2021
  • As cyber attacks become more intelligent, there is difficulty in detecting advanced attacks in various fields such as industry, defense, and medical care. IPS (Intrusion Prevention System), etc., but the need for centralized integrated management of each security system is increasing. In this paper, we collect big data for intrusion detection and build an intrusion detection platform using deep learning and CNN (Convolutional Neural Networks). In this paper, we design an intelligent big data platform that collects data by observing and analyzing user visit logs and linking with big data. We want to collect big data for intrusion detection and build an intrusion detection platform based on CNN model. In this study, we evaluated the performance of the Intrusion Detection System (IDS) using the KDD99 dataset developed by DARPA in 1998, and the actual attack categories were tested with KDD99's DoS, U2R, and R2L using four probing methods.

Human Detection using Real-virtual Augmented Dataset

  • Jongmin, Lee;Yongwan, Kim;Jinsung, Choi;Ki-Hong, Kim;Daehwan, Kim
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.1
    • /
    • pp.98-102
    • /
    • 2023
  • This paper presents a study on how augmenting semi-synthetic image data improves the performance of human detection algorithms. In the field of object detection, securing a high-quality data set plays the most important role in training deep learning algorithms. Recently, the acquisition of real image data has become time consuming and expensive; therefore, research using synthesized data has been conducted. Synthetic data haves the advantage of being able to generate a vast amount of data and accurately label it. However, the utility of synthetic data in human detection has not yet been demonstrated. Therefore, we use You Only Look Once (YOLO), the object detection algorithm most commonly used, to experimentally analyze the effect of synthetic data augmentation on human detection performance. As a result of training YOLO using the Penn-Fudan dataset, it was shown that the YOLO network model trained on a dataset augmented with synthetic data provided high-performance results in terms of the Precision-Recall Curve and F1-Confidence Curve.