AN OPTIMAL BOOSTING ALGORITHM BASED ON NONLINEAR CONJUGATE GRADIENT METHOD

  • CHOI, JOOYEON;JEONG, BORA;PARK, YESOM;SEO, JIWON;MIN, CHOHONG
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.22 no.1 / pp.1-13 / 2018
  • Boosting, one of the most successful algorithms for supervised learning, searches for the most accurate weighted sum of weak classifiers. The search corresponds to a convex programming problem with non-negativity and affine constraints. In this article, we propose a novel conjugate gradient algorithm with the modified Polak-Ribière-Polyak conjugate direction. The convergence of the algorithm is proved, and we report its successful application to boosting.

A gradient boosting regression based approach for energy consumption prediction in buildings

  • Bataineh, Ali S. Al
    • Advances in Energy Research / v.6 no.2 / pp.91-101 / 2019
  • This paper proposes an efficient data-driven approach to building models that predict energy consumption in buildings. The data used in this research were collected by installing humidity and temperature sensors at different locations in a building. In addition, weather data from a nearby weather station are included in the dataset to study the impact of weather conditions on energy consumption. One main emphasis of this research is to make feature selection independent of domain knowledge. Therefore, to extract useful features from the data, two approaches are tested: feature selection through principal component analysis, and relative importance-based feature selection in the original domain. The regression model used is gradient boosting regression, and its optimal parameters are chosen through a two-stage coarse-fine search. Model performance is evaluated with metrics such as the r2-score and root mean squared error. The results show that the best performance is achieved when relative importance-based feature selection is used with the gradient boosting regressor. The proposed technique also outperformed support vector machine and neural network-based approaches tested on the same dataset.
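
As a rough illustration of the regression setup this abstract describes, the following is a minimal sketch of gradient boosting for squared error, together with the two metrics it names (RMSE and r2-score). It assumes a single input feature and one-split regression stumps as weak learners; the function names, the number of rounds, and the learning rate are illustrative assumptions, not details taken from the paper.

```python
import math


def fit_stump(x, residuals):
    """Fit a one-split regression stump on one feature by least squares."""
    values = sorted(set(x))
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values")
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def fit_gradient_boosting(x, y, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the current residuals (the negative
    gradient of squared error) and adds a shrunken copy to the model."""
    base = sum(y) / len(y)
    current = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - ci for yi, ci in zip(y, current)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        current = [ci + learning_rate * predict_stump(stump, xi)
                   for xi, ci in zip(x, current)]
    return base, learning_rate, stumps


def predict(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot
```

In practice a library implementation with multi-feature trees would be used; the sketch only shows the residual-fitting loop and how the two metrics are computed.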

Prediction of the Movement Directions of Index and Stock Prices Using Extreme Gradient Boosting (익스트림 그라디언트 부스팅을 이용한 지수/주가 이동 방향 예측)

  • Kim, HyoungDo
    • The Journal of the Korea Contents Association / v.18 no.9 / pp.623-632 / 2018
  • Both investors and researchers pay close attention to predicting the direction of stock price movements, since accurate prediction plays an important role in strategic decision making for stock trading. Taken together, previous studies show that different factors are considered depending on the stock market and the prediction period. This paper aims to analyze which data mining techniques perform better on some representative index and stock price datasets from the Korean stock market. In particular, the extreme gradient boosting technique, which has proven itself a front-runner in recent open competitions, is applied to the prediction problem. Its performance is compared with that of other data mining techniques reported to perform well in predicting stock price movement directions, such as random forests, support vector machines, and artificial neural networks. Experiments with 12 years of index/price data show that the gradient boosting technique is the best at predicting movement directions 1 to 4 days ahead, with partial equivalence to the other techniques in a few cases.
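
The prediction target in this abstract, the movement direction 1 to 4 days ahead, can be sketched as a labelling step. The function below is a hypothetical illustration; in particular, treating an unchanged price as "not up" is an assumption, since the abstract does not say how ties are handled.

```python
def direction_labels(prices, horizon):
    """Label each day 1 if the price is higher `horizon` days later, else 0.

    Days without a known future price (the last `horizon` days) get no label.
    Assumption: an unchanged price counts as 0 ("not up").
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    return [1 if prices[t + horizon] > prices[t] else 0
            for t in range(len(prices) - horizon)]


# Toy closing prices (made-up numbers, for illustration only).
closing = [100, 101, 99, 102, 102]
labels_by_horizon = {h: direction_labels(closing, h) for h in range(1, 5)}
```

A classifier would then be trained once per horizon on features available at day t.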

The study of foreign exchange trading revenue model using decision tree and gradient boosting (외환거래에서 의사결정나무와 그래디언트 부스팅을 이용한 수익 모형 연구)

  • Jung, Ji Hyeon;Min, Dae Kee
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.161-170 / 2013
  • FX (foreign exchange) is the global decentralized market for trading international currencies. In its simplest sense, forex is the simultaneous purchase and sale of currencies, or the exchange of one country's currency for another's. We can find consistent trading rules by comparing the gradient boosting method and the decision tree method. Methods such as time series analysis, used to predict financial markets, have the advantage of long-term forecasting; on the other hand, they have difficulty reflecting rapidly changing short-term price fluctuations. Therefore, in this study, the gradient boosting and decision tree methods are applied to short-term data to derive rules for the revenue structure of the FX market, and the stability and predictive performance of the models are evaluated.

Study on Fault Detection of a Gas Pressure Regulator Based on Machine Learning Algorithms

  • Seo, Chan-Yang;Suh, Young-Joo;Kim, Dong-Ju
    • Journal of the Korea Society of Computer and Information / v.25 no.4 / pp.19-27 / 2020
  • In this paper, we propose a machine learning method for diagnosing the failure of a gas pressure regulator. When implementing a machine learning model to detect abnormal operation of a facility, it is common to install sensors to collect data. However, failure of a gas pressure regulator can lead to fatal safety problems, so installing an additional sensor on a gas pressure regulator is not simple. We therefore propose several machine learning approaches for diagnosing abnormal operation using only the flow rate and gas pressure data collected from the gas pressure regulator itself. Because fault data for gas pressure regulators are scarce, the model is trained on all classes after applying an over-sampling method. The classification models were implemented using gradient boosting, 1D convolutional neural networks, and LSTM, and the gradient boosting model showed the best performance, with 99.975% accuracy.
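
The abstract does not say which over-sampling method was applied. As one common possibility, the sketch below shows plain random over-sampling, which duplicates minority-class samples until every class matches the largest one; the function name and the toy data are hypothetical.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes
    have as many samples as the largest class. Original samples are kept."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy example: four "normal" readings and one "fault" reading (made-up values).
readings = [[0.1], [0.2], [0.3], [0.4], [9.9]]
classes = ["normal", "normal", "normal", "normal", "fault"]
balanced_x, balanced_y = random_oversample(readings, classes)
```

Over-sampling should be applied to the training split only, so that duplicated samples do not leak into the evaluation data.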

Comparison of machine learning algorithms for Chl-a prediction in the middle of Nakdong River (focusing on water quality and quantity factors) (머신러닝 기법을 활용한 낙동강 중류 지역의 Chl-a 예측 알고리즘 비교 연구(수질인자 및 수량 중심으로))

  • Lee, Sang-Min;Park, Kyeong-Deok;Kim, Il-Kyu
    • Journal of Korean Society of Water and Wastewater / v.34 no.4 / pp.277-288 / 2020
  • In this study, we applied machine learning algorithms to predict algae in terms of chlorophyll-a (Chl-a). Water quality and quantity data from the middle Nakdong River area were used. First, the correlation between Chl-a and the water quality and quantity data was analyzed. We extracted the ten most important water quality and quantity factors for the two weirs. The algorithms predicted how these ten factors affect Chl-a occurrence. We implemented decision tree, random forest, elastic net, and gradient boosting algorithms in Python. The root mean square error (RMSE) was used to evaluate the algorithms. Gradient boosting gave an RMSE of 10.55 for the Gangjeonggoryeong (GG) site and 11.43 for the Dalsung (DS) site, showing excellent results at both sites. The predictions of the four algorithms were also evaluated with the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The AUC was 0.877 at the GG site and 0.951 at the DS site, so the algorithm's ability to interpret the data appeared to be excellent.
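
For readers unfamiliar with the AUC reported here, it can be computed directly as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below assumes binary labels (1 for positive, 0 for negative); how the study turned Chl-a values into two classes is not stated in the abstract.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; sorting-based implementations are used for large datasets.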

Android Malware Detection Using Permission-Based Machine Learning Approach (머신러닝을 이용한 권한 기반 안드로이드 악성코드 탐지)

  • Kang, Seongeun;Long, Nguyen Vu;Jung, Souhwan
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.3 / pp.617-623 / 2018
  • This study focuses on detecting malicious code using AndroidManifest permission features extracted through Android static analysis. Features are built from the permissions in AndroidManifest, which saves analysis resources and time. The malicious app detection model consists of SVM (support vector machine), NB (Naive Bayes), Gradient Boosting Classifier (GBC), and Logistic Regression models, which were trained on 1,500 normal apps and 500 malicious apps and achieved a 98% detection rate. In addition, malicious app family identification is implemented with a multi-classifier model using SVM, GPC (Gaussian Process Classifier), and GBC. The trained family identification model identified 92% of malicious app families.
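
A permission-based feature vector of the kind this abstract describes might be built as follows. This is a sketch under assumptions: the manifest is taken to be already decoded to plain-text XML (manifests inside an APK are stored in a binary format), and the vocabulary of permissions is chosen by the caller; the function names are illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace used for android:* attributes in AndroidManifest.xml.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def extract_permissions(manifest_xml):
    """Return the set of permission names requested via <uses-permission>."""
    root = ET.fromstring(manifest_xml)
    names = set()
    for element in root.iter("uses-permission"):
        name = element.get(ANDROID_NS + "name")
        if name:
            names.add(name)
    return names


def permission_vector(permissions, vocabulary):
    """Encode permissions as a 0/1 vector in the order given by `vocabulary`."""
    return [1 if name in permissions else 0 for name in vocabulary]
```

Each app then becomes one row of 0/1 values that can be passed to any of the classifiers listed in the abstract.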

Predicting Gross Box Office Revenue for Domestic Films

  • Song, Jongwoo;Han, Suji
    • Communications for Statistical Applications and Methods / v.20 no.4 / pp.301-309 / 2013
  • This paper predicts gross box office revenue for domestic films using Korean film data from 2008 to 2011. We use three regression methods (Linear Regression, Random Forest, and Gradient Boosting) to predict the gross box office revenue. We only consider domestic films with revenue of at least KRW 500 million; relevant explanatory variables are chosen by data visualization and variable selection techniques. The key idea in analyzing this data is to construct meaningful explanatory variables from publicly available data sources. Some variables must be categorized for more effective analysis, and clustering methods are applied to achieve this. We choose the best model based on performance on the test set and discuss the important explanatory variables.

Vehicle Detection Scheme Based on a Boosting Classifier with Histogram of Oriented Gradient (HOG) Features and Image Segmentation (HOG 특징 및 영상분할을 이용한 부스팅분류 기반 자동차 검출 기법)

  • Choi, Mi-Soon;Lee, Jeong-Hwan;Roh, Tae-Moon;Shim, Jae-Chang
    • Journal of KIISE:Computing Practices and Letters / v.16 no.10 / pp.955-961 / 2010
  • In this paper, we describe a vehicle detection method based on a boosting classifier that uses Histogram of Oriented Gradient (HOG) features and image segmentation techniques. An input image is segmented by means of a split-and-merge algorithm. The two largest segmented regions are then removed in order to reduce the search region and speed up processing. The HOG features are then calculated for each pixel in the search region. To detect the vehicle region we used the AdaBoost (adaptive boosting) method, which is well known for two-class classification. To evaluate the performance of the proposed method, 537 training images were used to train the classifier, and 500 non-training images were used to measure the recognition rate. In these experiments, detection was correct for 98.34% of the 500 non-training images. In conclusion, the proposed method can be used for detecting the location of a vehicle in an intelligent vehicle control system.
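
The two-class AdaBoost step mentioned in this abstract can be sketched as below. The weak learner is assumed to be a decision stump (a threshold on one feature), which the abstract does not specify, and the feature vectors stand in for HOG features; labels are +1 and -1.

```python
import math


def stump_predict(stump, x):
    """A stump votes `polarity` when x[feature] > threshold, else -polarity."""
    feature, threshold, polarity = stump
    return polarity if x[feature] > threshold else -polarity


def best_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(w for x, yi, w in zip(X, y, weights)
                          if stump_predict(stump, x) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def adaboost_fit(X, y, n_rounds=10):
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump, err = best_stump(X, y, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        weights = [w * math.exp(-alpha * yi * stump_predict(stump, x))
                   for x, yi, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

The final classifier is a weighted vote of the stumps, so it can separate classes that no single threshold can.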

Korean Web Content Extraction using Tag Rank Position and Gradient Boosting (태그 서열 위치와 경사 부스팅을 활용한 한국어 웹 본문 추출)

  • Mo, Jonghoon;Yu, Jae-Myung
    • Journal of KIISE / v.44 no.6 / pp.581-586 / 2017
  • For automatic web scraping, unnecessary components such as menus and advertisements need to be removed from web pages, and the main content should be extracted automatically. A content block tends to be located in the middle of a web page. In particular, Korean web documents rarely include metadata and have a complex design; a suitable method of content extraction is therefore needed. Existing content extraction algorithms use the textual and structural features of content blocks because processing visual features requires heavy computation for rendering and image processing. In this paper, we propose a new content extraction method that uses tag positions in HTML as a quasi-visual feature. In addition, we develop the tag rank position, a type of tag position not affected by text length, and show that gradient boosting with the tag rank position is a very accurate content extraction method. The results show that the content extraction method can be used to collect high-quality text data automatically from various web pages.
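
The abstract does not define the tag rank position. One possible reading, offered here purely as an assumption and not as the paper's definition, is the ordinal rank of a tag among all opening tags, divided by the total number of tags, so that the value does not change when the text between tags gets longer.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record opening tags in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_rank_positions(html):
    """Return (tag, rank / total) pairs; a hypothetical 'tag rank position'."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    total = len(collector.tags)
    return [(tag, rank / total) for rank, tag in enumerate(collector.tags)]
```

Under this reading, a paragraph containing ten or ten thousand characters gets the same position, whereas a character-offset position would shift every tag that follows it.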