Search | Korea Science

Scalable Prediction Models for Airbnb Listing in Spark Big Data Cluster using GPU-accelerated RAPIDS

Muralidharan, Samyuktha; Yadav, Savita; Huh, Jungwoo; Lee, Sanghoon; Woo, Jongwook
- Journal of information and communication convergence engineering, v.20 no.2, pp.96-102, 2022
We aim to build predictive models for Airbnb prices using GPU-accelerated RAPIDS in a big data cluster. The Airbnb Listings datasets are used for the predictive analysis, and several machine-learning algorithms are adopted to build models that predict the price of Airbnb listings. We compare the results of traditional and big data approaches to machine learning for price prediction and discuss the performance of the models. We built the big data models on a Databricks Spark cluster, a distributed parallel computing system. Furthermore, we implemented models on multiple GPUs using RAPIDS in the Spark cluster. The GPU-based model was developed using the XGBoost algorithm, whereas the other models were developed using traditional central processing unit (CPU)-based algorithms. This study compared all models in terms of accuracy metrics and computing time. We observed that the XGBoost model with RAPIDS using GPUs performed best in both accuracy and computing time.
https://doi.org/10.6109/jicce.2022.20.2.96

CNDO/2 MO Calculations for Catalytic Acidity of V-silicalite (실리카에 담지된 바나듐 촉매의 산성도에 대한 CNDO/2 분자궤도론적 계산)

Kim, Myung-Chul
- Applied Chemistry for Engineering, v.5 no.2, pp.357-360, 1994
CNDO/2 calculations were applied to cluster models of the representative active sites in V-silicalite to obtain Wiberg bond orders, LUMO energies, and total energies. The Brønsted (B) acidities of the suggested models were investigated in terms of O-H bond orders, and the calculated LUMO energies indicated the Lewis (L) acidities of the active sites. The structural stabilities of the cluster models were also explained in terms of total energies.
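For reference, the Wiberg bond order used in this abstract is conventionally defined from the density (bond-order) matrix $P$ in an orthogonal atomic-orbital basis such as that of CNDO/2. The symbols below are standard notation and are not taken from the abstract itself:

$$W_{AB} = \sum_{\mu \in A} \sum_{\nu \in B} P_{\mu\nu}^{2}$$

Here $\mu$ and $\nu$ run over the atomic orbitals centered on atoms $A$ and $B$. A smaller O-H bond order indicates a weaker O-H bond, which is presumably how the bond orders are related to Brønsted acidity.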

Analysis of Cluster-based Truck-Drone Delivery Routing Models (군집 기반 트럭-드론 배송경로 모형의 효과분석)

Chang, Yong Sik
- Journal of Information Technology Applications and Management, v.26 no.1, pp.53-64, 2019
The purpose of this study is to find a fast delivery route in which a truck travels through clusters of delivery locations while, at each cluster, several drones depart from the truck, serve the delivery locations, and return to it. The main issue is to reduce the total delivery time, which consists of the travel time of the relatively slow truck between clusters and the sum of the maximum delivery times of the relatively fast drones in each cluster. To solve this problem, we use a three-step heuristic approach. First, we group nearby delivery locations into the minimal number of clusters that satisfies the drone flight-distance constraint, and set the delivery paths for the drones in each cluster. Second, we set an optimal truck route through the cluster centers using the TSP model. Finally, to reduce the total delivery time, we move the cluster centers within the two-dimensional region while maintaining the delivery paths for the truck and drones and satisfying the drone flight-distance constraint. To analyze the effect of the proposed model as the number of delivery locations changes, we developed an R-based simulation prototype, compared the relative efficiency, and performed paired t-tests between the TSP model and the cluster-based models. The experiments demonstrated the effectiveness of the proposed approach.
https://doi.org/10.21219/jitam.2019.26.1.053
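The objective described in this abstract (truck time between clusters plus the sum of the slowest drone delivery in each cluster) can be sketched as follows. This is a minimal illustration, not the paper's R prototype: the closed truck tour from a depot, the out-and-back drone flights, the constant speeds, and all names are assumptions made for the example.

```python
import math


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def total_delivery_time(depot, clusters, truck_speed, drone_speed):
    """Total time for a truck tour with parallel drone deliveries.

    clusters: list of (center, [delivery locations]) in the truck's
    visiting order. Every name and modelling choice here is hypothetical.
    """
    # Truck leg (assumed closed tour): depot -> center_1 -> ... -> depot.
    stops = [depot] + [center for center, _ in clusters] + [depot]
    truck_time = sum(dist(p, q) for p, q in zip(stops, stops[1:])) / truck_speed

    # Drone leg (assumed one drone per location, flying out and back in
    # parallel): each cluster costs the time of its slowest round trip.
    drone_time = sum(
        max(2 * dist(center, loc) for loc in locations) / drone_speed
        for center, locations in clusters
    )
    return truck_time + drone_time
```

With such a function, the third step of the heuristic amounts to shifting each cluster center and keeping the shift whenever the returned total decreases and the flight-distance constraint still holds.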

Design and Performance Measurement of a Genetic Algorithm-based Group Classification Method : The Case of Bond Rating (유전 알고리듬 기반 집단분류기법의 개발과 성과평가 : 채권등급 평가를 중심으로)

Min, Jae-H.; Jeong, Chul-Woo
- Journal of the Korean Operations Research and Management Science Society, v.32 no.1, pp.61-75, 2007
The purpose of this paper is to develop a new group classification method based on a genetic algorithm and to compare its prediction performance with that of existing methods in the area of bond rating. To this end, we conduct various experiments with pilot and general models. Specifically, we first conduct experiments with two pilot models: one searches for the cluster center of each group, and the other searches for both the cluster centers and the attribute weights in order to maximize classification accuracy. The pilot experiments show that the latter achieves a higher classification accuracy ratio than the former, which provides the rationale for searching for both the cluster center of each group and the attribute weights. With this lesson in mind, we design two generalized models employing a genetic algorithm: one maximizes the classification accuracy and the other minimizes the total misclassification cost. We compare the performance of these two models with that of existing statistical and artificial intelligence models such as MDA, ANN, and Decision Tree, and conclude that the proposed genetic algorithm-based group classification method significantly outperforms the other methods with respect to both classification accuracy ratio and misclassification cost.
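The classification rule implied by the second pilot model (assign a sample to the group whose cluster center is nearest once attributes are weighted) can be sketched as below. The weighted squared Euclidean distance, the function names, and the data layout are assumptions for illustration; the abstract does not state the distance measure, and the genetic search itself is omitted.

```python
def classify(x, centers, weights):
    """Return the label of the nearest center under a weighted distance.

    centers: dict mapping a group label to its center (a tuple of attributes).
    weights: one non-negative weight per attribute (hypothetical encoding).
    """
    def weighted_sq_dist(center):
        return sum(w * (xi - ci) ** 2 for w, xi, ci in zip(weights, x, center))

    return min(centers, key=lambda label: weighted_sq_dist(centers[label]))


def accuracy(samples, centers, weights):
    """Fraction of (x, label) samples classified correctly.

    A genetic algorithm could use this value as the fitness of a candidate
    (centers, weights) solution.
    """
    hits = sum(classify(x, centers, weights) == label for x, label in samples)
    return hits / len(samples)
```

In this sketch, setting a weight to zero removes an uninformative attribute from the distance, which illustrates why searching over weights as well as centers can raise accuracy.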

Development of a New Cluster Index for Semiconductor Wafer Defects and Simulation-Based Yield Prediction Models (변동계수를 이용한 반도체 결점 클러스터 지표 개발 및 수율 예측)

Park, Hang-Yeob; Jun, Chi-Hyuck; Hong, Yu-Shin; Kim, Soo-Young
- Journal of Korean Institute of Industrial Engineers, v.21 no.3, pp.371-385, 1995
The yield of semiconductor chips depends not only on the average defect density but also on the distribution of defects over a wafer, which motivates the use of a cluster index. This paper briefly reviews the existing yield prediction models and proposes a new cluster index that uses information about defect locations on a wafer in terms of the coefficient of variation. An extensive simulation is performed under a variety of defect distributions, and a yield prediction model is derived through regression analysis to relate the yield to the proposed cluster index and the average number of defects per chip. The performance of the proposed simulation-based yield prediction model is compared with that of the well-known negative binomial model.
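The negative binomial model used as the baseline in this abstract has a standard closed form, sketched below together with its Poisson limit. The function and parameter names are illustrative assumptions; the paper's own cluster index and regression model are not reproduced here because the abstract does not give their formulas.

```python
import math


def negative_binomial_yield(defects_per_chip, alpha):
    """Yield Y = (1 + lambda / alpha) ** (-alpha).

    defects_per_chip: average number of defects per chip (lambda).
    alpha: clustering parameter; smaller values mean stronger clustering.
    """
    return (1.0 + defects_per_chip / alpha) ** (-alpha)


def poisson_yield(defects_per_chip):
    """Yield for randomly scattered (unclustered) defects: Y = exp(-lambda)."""
    return math.exp(-defects_per_chip)
```

For the same average number of defects per chip, stronger clustering (smaller `alpha`) gives a higher predicted yield, because defects pile up on fewer chips.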

Quantum Chemical Calculation of NO Decomposition over Cu-Y Zeolite (Cu-Y 제올라이트상의 NO분해반응에 대한 양자화학적 해석)

Kim, Myung-Chul
- Applied Chemistry for Engineering, v.7 no.2, pp.321-325, 1996
Quantum chemical calculations are used to characterize the decomposition of nitrogen monoxide over $Cu^{n+}$-Y zeolite. Theoretical methods such as CNDO/2 were applied to cluster models representing cation sites in the zeolite to obtain total energies, LUMO energies, and Wiberg bond orders. The calculated total energies and bond orders of the cluster models indicated the reaction mechanism of NO decomposition over the $Cu^{n+}$ site in the zeolite framework. The suggested cluster models, with varying Si/Al ratios, were studied with exchange cations in the $Cu^+$ and $Cu^{2+}$ states, and the calculated LUMO energies can predict the Lewis (L) acidities of the cluster models. The results suggested a possible mechanism of NO decomposition that proceeds through adsorption of NO, conversion to $N_2$ and $O_2$, and desorption of $N_2$ and $O_2$, in sequence. The L acidity of the $Cu^{2+}$ ion in the cation site is stronger than that of $Cu^+$.

Cluster Analysis of Daily Electricity Demand with t-SNE

Min, Yunhong
- Journal of the Korea Society of Computer and Information, v.23 no.5, pp.9-14, 2018
For efficient management of the electricity market and power systems, accurate forecasts of electricity demand are essential. Since many factors, both known and unknown, determine the realized loads, it is difficult to forecast demand from past time series alone. In this paper we perform a cluster analysis on electricity demand data collected from Jan. 2000 to Dec. 2017. The purpose of clustering the electricity demand data is that each cluster is expected to consist of data whose latent variables have the same or similar values. If the data are properly clustered, it is then possible to develop an accurate forecasting model for each cluster separately. To validate the feasibility of this approach for building better forecasting models, we clustered the data with t-SNE. To apply t-SNE to time series data effectively, we adopt dynamic time warping as a similarity measure. The experimental results show that several clusters are clearly observed, and that each cluster can be interpreted as a mix of well-known factors such as trends, seasonality, and holiday effects, together with other unknown factors. These findings can motivate approaches that build forecasting models for each cluster independently.
https://doi.org/10.9708/jksci.2018.23.05.009
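The dynamic time warping measure mentioned in this abstract can be sketched with the classic dynamic-programming recurrence. This is a textbook version with an absolute-difference local cost and no warping window; the abstract does not say which variant the author used, so both choices are assumptions.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advance a only, advance b only,
            # or advance both sequences together.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Computing this distance for every pair of daily demand profiles yields a distance matrix, which could then be given to a t-SNE implementation that accepts precomputed distances.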

Analytic Study of Acquiring KANSEI Information Regarding the Recognition of Shape Models

Wang, Shao-Chi; Hiroshi Kubo; Hiromitsu Kikita; Takashi Uozumi; Tohru Ifukube
- Proceedings of the Korean Society for Emotion and Sensibility Conference, 2002.05a, pp.266-269, 2002
This paper presents a fundamental study of acquiring users' KANSEI information regarding the recognition of shape models. Because users differ in many respects, such as background and knowledge, they produce different evaluations based on their KANSEI even when an identical shape model is presented. Cluster analysis proved useful for capturing group tendencies and for constructing a mapping between a description of the shape model and the KANSEI database. To investigate analogical relations and mutual influences in our consciousness, we first made a questionnaire that asked subjects to represent images having different colors and shape cones by using 4 pairs of adjectives (KANSEI words). Next, based on a cluster analysis of the questionnaire using fuzzy set theory, we proposed a hypothesis showing how the analogical relation and the mutual influence work in our mind while viewing the shape models. Furthermore, how the properties of KANSEI depend on their descriptions was also investigated by means of the cluster analysis. This work will be valuable for constructing a personal KANSEI database for the Shape Model Processing System.

Some models for rainfall focused on the inner correlation structure

Kim, Sangdan
- Proceedings of the Korea Water Resources Association Conference, 2004.05b, pp.1290-1294, 2004
In this study, new stochastic point rainfall models that can consider the correlation structure between rainfall intensity and duration are developed. To consider negative and positive correlation simultaneously, Gumbel's type-II bivariate distribution is applied, and for the cluster structure of rainfall events, the Neyman-Scott cluster point process is selected. From a theoretical point of view, it is shown that the models considering the dependence structure between rainfall intensity and duration have slightly heavier-tailed autocorrelation functions than the corresponding independent models. Results from generating long-term rainfall events show that the dependent models reproduce historical rainfall time series better than the corresponding independent models in terms of autocorrelation structures, zero-rainfall probabilities, and extreme rainfall events.
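The Neyman-Scott cluster structure mentioned in this abstract can be sketched as a small simulator. This sketch corresponds to the independent baseline only: cell intensity and duration are drawn independently from exponential distributions rather than from Gumbel's type-II bivariate distribution. The geometric number of cells per storm, the exponential cell delays, and all parameter names are assumptions for illustration.

```python
import random


def simulate_neyman_scott(horizon, storm_rate, mean_cells, cell_delay_rate,
                          duration_rate, intensity_rate, rng):
    """Return rain cells as (start, duration, intensity) tuples.

    Storm origins follow a Poisson process with rate storm_rate on
    [0, horizon). Each storm spawns a geometric number of cells
    (mean mean_cells >= 1), each delayed from the storm origin by an
    exponential time. All distributional choices here are assumed.
    """
    cells = []
    t = rng.expovariate(storm_rate)          # first storm origin
    while t < horizon:
        n_cells = 1
        while rng.random() > 1.0 / mean_cells:   # geometric cell count
            n_cells += 1
        for _ in range(n_cells):
            start = t + rng.expovariate(cell_delay_rate)
            duration = rng.expovariate(duration_rate)
            intensity = rng.expovariate(intensity_rate)   # independent of duration
            cells.append((start, duration, intensity))
        t += rng.expovariate(storm_rate)     # next storm origin
    return cells
```

A dependent variant would replace the two independent draws for duration and intensity with a joint draw from a bivariate distribution, which is the change the abstract evaluates.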

The Evaluation of Regional Innovation and Cluster Policies : Theory and Methods (지역혁신과 클러스터 정책의 평가: 이론과 방법)

Diez, Maria Angeles
- Journal of the Korean Academic Society of Industrial Cluster, v.1 no.1, pp.1-15, 2007
Regional innovation and cluster policies are the new agenda of regional policy, an agenda that has spread in recent years across different countries and regions. In this context, our main question arises: how are we going to evaluate regional innovation and cluster policies? What models and methods are we going to use? Since 1990, regional and national governments have put more emphasis on evaluation as a tool for producing knowledge to design better policies. The objective of this article is to summarise the main challenges arising from the evaluation of regional innovation and cluster policies and to make some methodological proposals that can contribute to better evaluations. The first section gives a brief presentation of regional innovation and cluster policies, followed in the second section by a more detailed analysis of their principal characteristics and of the main challenges posed by their evaluation. The third section presents some evaluation proposals that can help to improve current evaluation practice. The paper concludes with a small number of general recommendations to bear in mind when designing an evaluation of regional innovation and cluster policies.
