• Title/Summary/Keyword: WEKA


Wearable Sensor-Based Biometric Gait Classification Algorithm Using WEKA

  • Youn, Ik-Hyun;Won, Kwanghee;Youn, Jong-Hoon;Scheffler, Jeremy
    • Journal of information and communication convergence engineering
    • /
    • v.14 no.1
    • /
    • pp.45-50
    • /
    • 2016
  • Gait-based classification has gained much interest as a possible authentication method because it incorporates an intrinsic personal signature that is difficult to mimic. The study investigates machine learning techniques to mitigate the natural variations in gait among different subjects. We incorporated several machine learning algorithms into this study using the data mining package called Waikato Environment for Knowledge Analysis (WEKA). WEKA's convenient interface enabled us to apply various sets of machine learning algorithms to understand whether each algorithm can capture certain distinctive gait features. First, we defined 24 gait features by analyzing three-axis acceleration data, and then selectively used them to distinguish subjects 10 years of age or younger from those aged 20 to 40. We also applied a machine learning voting scheme to improve the accuracy of the classification. The classification accuracy of the proposed system was about 81% on average.
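
The abstract does not say which voting rule was used. As a minimal sketch, assuming simple majority voting over the labels predicted by several classifiers, the idea can be illustrated as follows. The labels `"child"` and `"adult"`, the three classifiers, and the function name `majority_vote` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties are broken in favor of the label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical outputs of three classifiers for four gait samples.
per_classifier = [
    ["child", "adult", "adult", "child"],  # classifier A
    ["child", "child", "adult", "adult"],  # classifier B
    ["adult", "adult", "adult", "child"],  # classifier C
]

# zip(*...) regroups the predictions by sample instead of by classifier.
voted = [majority_vote(list(sample)) for sample in zip(*per_classifier)]
print(voted)
```

With an odd number of classifiers and two classes, a tie cannot occur, which is one reason such ensembles often use three or five members.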

Development and application of a floor failure depth prediction system based on the WEKA platform

  • Lu, Yao;Bai, Liyang;Chen, Juntao;Tong, Weixin;Jiang, Zhe
    • Geomechanics and Engineering
    • /
    • v.23 no.1
    • /
    • pp.51-59
    • /
    • 2020
  • In this paper, the WEKA platform was used to mine and analyze measured data of floor failure depth, and a prediction system for floor failure depth was developed in Java. Based on the standardization and discretization of 35 sets of measured floor failure depth data from China, a grey correlation degree analysis was carried out on five factors affecting the floor failure depth. In descending order of correlation, the factors are: mining depth, working face length, floor failure resistance, mining thickness, and dip angle of the coal seams. A naive Bayes model, a neural network model, and a decision tree model were used for learning and training, and the accuracy of the confusion matrix, the detailed accuracy, and the node error rate were analyzed. The artificial neural network was concluded to be the optimal model. The prediction system is easy to operate, and it was used to make predictions from measured data and to perform error analyses for nine sets of data. The results show that the WEKA prediction formula has the smallest relative error and the best prediction effect. The applicability of the WEKA prediction formula was also analyzed. The results show that the WEKA prediction is more applicable for a coal seam mining depth of 110 m~550 m, a coal seam dip angle of 0°~15°, and a working face length of 30 m~135 m.

A Topographical Classifier Development Support System Cooperating with Data Mining Tool WEKA from Airborne LiDAR Data (항공 라이다 데이터로부터 데이터마이닝 도구 WEKA를 이용한 지형 분류기 제작 지원 시스템)

  • Lee, Sung-Gyu;Lee, Ho-Jun;Sung, Chul-Woong;Park, Chang-Hoo;Cho, Woo-Sug;Kim, Yoo-Sung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.28 no.1
    • /
    • pp.133-142
    • /
    • 2010
  • To monitor the composition and change of the national land, an intelligent topographical classifier that enables accurate classification of land-cover types from airborne LiDAR data is highly desirable. We developed a topographical classifier development support system that cooperates with the data mining tool WEKA to help users construct accurate topographical classification systems. The support system has the following functions: superposing LiDAR data on the corresponding aerial images, dividing LiDAR data into tiles for efficient processing, 3D visualization of partial LiDAR data, feature extraction from tiles, automatic WEKA input generation, and automatic C++ program generation from the classification rule set. In addition, with the data mining tool WEKA, we can choose highly distinguishable features using the attribute selection function and choose the best classification model as the resulting topographical classifier. Users can therefore easily develop an intelligent topographical classifier that is well suited to their development objectives by using the support system.

Design and implementation of data mining tool using PHP and WEKA (피에이치피와 웨카를 이용한 데이터마이닝 도구의 설계 및 구현)

  • You, Young-Jae;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.20 no.2
    • /
    • pp.425-433
    • /
    • 2009
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, relations, and new rules in massive data. We need a data mining tool to explore large amounts of information. There are many data mining tools and solutions, such as E-Miner, Clementine, WEKA, and R. Most of them focus on diversity and general-purpose use, and they are not easy for laypeople to use. In this paper we design and implement a web-based data mining tool using PHP and WEKA. The system presents results that are easy to interpret, so general users can handle it. We implement the Apriori algorithm for association rules, the K-means algorithm for cluster analysis, and the J48 algorithm for decision trees.


Estimation of Smart Election System data

  • Park, Hyun-Sook;Hong, You-Sik
    • International journal of advanced smart convergence
    • /
    • v.7 no.2
    • /
    • pp.67-72
    • /
    • 2018
  • Big data inference based on Internet search failed to predict the 2016 presidential election in the United States of America, because the prediction method relied on the search counts of each presidential candidate. Google opened its Flu Trends service in 2008, but was embarrassed by the failure of its flu prediction system in 2011 and of the presidential election prediction in 2016. In this paper, an election prediction algorithm is proposed and explained using a virtual vote algorithm for a virtual election together with a data mining method. The WEKA DB is also explained. In particular, the prediction of election results is explained using the K-means algorithm and XEDOS tools. Based on the analysis of the WEKA DB, a smart election prediction system is proposed.

Prediction of protein binding regions in RNA using random forest (Random forest를 이용한 RNA에서의 단백질 결합 영역 예측)

  • Choi, Daesik;Park, Byungkyu;Chae, Hanju;Lee, Wook;Han, Kyungsook
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2016.10a
    • /
    • pp.583-586
    • /
    • 2016
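  • As the volume of protein-RNA interaction data grows, many computational methods have been developed to predict protein-RNA binding sites. However, many of these methods are limited in that they predict the RNA-binding sites in proteins. This paper discusses a method for predicting the protein-binding sites in RNA using sequence information from both the RNA and the protein, along with its results. We developed the prediction model using WEKA random forest (http://www.cs.waikato.ac.nz/ml/weka/), and represented the sequence profile of the RNA sequence, the sequence composition, and the properties of the partner protein as features. In cross validation with the random forest method, the 1:1 model showed the highest performance, with 92.4% sensitivity, 92.0% specificity, and 92.2% accuracy; in an independent test, it showed 72.5% sensitivity, 90.0% specificity, and 2.1% accuracy.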
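
For readers unfamiliar with the three reported metrics, the sketch below shows their standard definitions in terms of confusion-matrix counts. The counts passed to `binary_metrics` are hypothetical and are not the paper's data; the function name is also an illustrative choice.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: binding sites predicted correctly / missed.
    tn/fp: non-binding sites predicted correctly / wrongly flagged as binding.
    """
    sensitivity = tp / (tp + fn)  # fraction of true binding sites recovered
    specificity = tn / (tn + fp)  # fraction of non-binding sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all sites correct
    return sensitivity, specificity, accuracy


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec, acc = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```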

Pattern Analysis and Performance Comparison of Lottery Winning Numbers

  • Jung, Yong Gyu;Han, Soo Ji;Kim, Jae Hee
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.6 no.1
    • /
    • pp.16-22
    • /
    • 2014
  • Clustering methods such as k-means and EM belong to the family of classification and pattern recognition techniques, and are widely used in management science and literature search. In this paper, the performance of the k-means and EM algorithms is compared using Weka. The experiments use 567 cases of winning lottery numbers. The k-means algorithm is faster than the EM algorithm by about 0.08 seconds. In terms of accuracy, precision, and recall, however, the EM algorithm performs better than the k-means algorithm. While k-means is known to be sensitive to the distribution of the data, the EM algorithm is sensitive to probability in clustering.
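
As a rough illustration of the k-means side of this comparison, here is a minimal sketch of Lloyd's algorithm on one-dimensional data. It is not the Weka implementation used in the paper, and the data and starting centers are hypothetical. EM for mixture models differs mainly in that each point receives a probability of belonging to every cluster instead of a single hard assignment.

```python
def kmeans_1d(values, centers, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical data with two well-separated groups.
data = [1, 2, 3, 41, 43, 45]
centers, clusters = kmeans_1d(data, centers=[0, 10])
print(centers, clusters)
```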

Fault Location Technique of 154 kV Substation using Neural Network (신경회로망을 이용한 154kV 변전소의 고장 위치 판별 기법)

  • Ahn, Jong-Bok;Kang, Tae-Won;Park, Chul-Won
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.67 no.9
    • /
    • pp.1146-1151
    • /
    • 2018
  • Recently, as computer platforms have improved, research on intelligent electric power facilities has sought to apply artificial intelligence techniques. In particular, when a fault occurs in a substation, it should be possible to identify the fault quickly and minimize the power recovery time. This paper presents a fault location technique for a 154kV substation using a neural network. We constructed a training matrix based on the operating conditions of the circuit breakers and IEDs to identify the fault location for each component of the target 154kV substation, such as the line, bus, and transformer. After training the neural network to identify the fault location using Weka software, we confirmed the fault location discrimination performance of the designed neural network.

Phishing Email Detection Using Machine Learning Techniques

  • Alammar, Meaad;Badawi, Maria Altaib
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.5
    • /
    • pp.277-283
    • /
    • 2022
  • Email phishing has become very prevalent, especially now that most of our dealings have become technical. The victim receives a message that looks as if it was sent from a known party, and the attack is carried out through a fake cookie that includes a phishing program or through links connected to fake websites. In both cases the goal is to install malicious software on the user's device or direct the user to a fake website. Today it is difficult to deploy robust cybersecurity solutions without relying heavily on machine learning algorithms. This research seeks to detect phishing emails using high-accuracy machine learning techniques. Using the WEKA tool with data preprocessing, we propose a methodology to detect phishing emails. The random forest algorithm outperformed the Naïve Bayes algorithm, with an accuracy of 99.03%.

A Study on Intrusion Detection in Network Intrusion Detection System using SVM (SVM을 이용한 네트워크 기반 침입탐지 시스템에서 새로운 침입탐지에 관한 연구)

  • YANG, Eun-mok;Seo, Chang-Ho
    • Journal of Digital Convergence
    • /
    • v.16 no.5
    • /
    • pp.399-406
    • /
    • 2018
  • Much research has been done using the KDDCup99 data set to study intrusion detection using artificial intelligence. Previous studies have shown that the performance of the SMO (SVM) algorithm is superior. However, studies on detecting new intrusion types not used in training are insufficient. In this paper, a model was created using WEKA's SMO and the instances of the KDDCup99 training data set, kddcup.data.gz. We tested the existing instances (292,300) of the corrected.gz file and the new intrusions (18,729). In general, intrusion labels not used in training are not tested, so the new intrusion labels were changed to normal. Of the 18,729 new intrusions, 1,827 were classified as intrusions. The 1,827 instances were classified as follows: buffer_overflow 3, neptune 392, portsweep 164, ipsweep 9, back 511, imap 1, satan 645, and nmap 102.
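
The per-label figures in the abstract can be checked with a short tally. The sketch below reads the counts as buffer_overflow 3, neptune 392, portsweep 164, ipsweep 9, back 511, imap 1, satan 645, and nmap 102, which is an interpretation of the abstract's wording; the variable names are illustrative. Under that reading the counts sum to the stated 1,827, and the share of new intrusions flagged as intrusions is a little under 10%.

```python
# Number of new-intrusion instances tested, as stated in the abstract.
new_intrusions_tested = 18729

# Labels assigned by the model to the new intrusions it flagged
# (counts as read from the abstract; see the assumption noted above).
detected_as = {
    "buffer_overflow": 3,
    "neptune": 392,
    "portsweep": 164,
    "ipsweep": 9,
    "back": 511,
    "imap": 1,
    "satan": 645,
    "nmap": 102,
}

total_detected = sum(detected_as.values())
detection_rate = total_detected / new_intrusions_tested

print(total_detected, f"{detection_rate:.2%}")
```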