• Title/Summary/Keyword: Data mining p

Enhanced Hybrid Privacy Preserving Data Mining Technique

  • Kundeti Naga Prasanthi;M V P Chandra Sekhara Rao;Ch Sudha Sree;P Seshu Babu
    • International Journal of Computer Science & Network Security / v.23 no.6 / pp.99-106 / 2023
  • Nowadays, large volumes of data are accumulating in every field due to the increased capacity of storage devices. Data mining can be applied to these large volumes of data to find useful patterns that can be used for business growth, improving services, improving health conditions, etc. Data from different sources can be combined before applying data mining. The data thus gathered can be misused for identity theft, fake credit/debit card transactions, etc. To overcome this, data mining techniques that preserve privacy are required. Several privacy preserving data mining techniques are available in the literature, such as randomization, perturbation, and anonymization. This paper proposes an Enhanced Hybrid Privacy Preserving Data Mining (EHPPDM) technique. The proposed technique provides more data privacy than existing techniques while providing better classification accuracy. The experimental results show that classification accuracies increase when the EHPPDM technique is used.

A Six Sigma Methodology Using Data Mining : A Case Study of "P" Steel Manufacturing Company (데이터 마이닝 기반의 6 시그마 방법론 : 철강산업 적용사례)

  • Jang, Gil-Sang
    • The Journal of Information Systems / v.20 no.3 / pp.1-24 / 2011
  • Recently, six sigma has been widely adopted in a variety of industries as a disciplined, data-driven problem-solving methodology, supported by a handful of powerful statistical tools, that reduces variation through continuous process improvement. Data mining has also been widely used to discover unknown knowledge from large volumes of data using various modeling techniques such as neural networks, decision trees, and regression analysis. This paper proposes a six sigma methodology based on data mining for effectively and efficiently processing massive data in six sigma projects. The proposed methodology is applied to the hot stove system, a major energy-consuming process at "P" steel company, to improve heat efficiency by reducing energy consumption. The results show the optimal operating conditions and a 15% reduction in hot stove energy cost.

Development of Data Mining System for Ship Design using Combined Genetic Programming with Self Organizing Map (유전적 프로그래밍과 SOM을 결합한 개선된 선박 설계용 데이터 마이닝 시스템 개발)

  • Lee, Kyung-Ho;Park, Jong-Hoon;Han, Young-Soo;Choi, Si-Young
    • Korean Journal of Computational Design and Engineering / v.14 no.6 / pp.382-389 / 2009
  • Recently, companies have come to require knowledge management as a tool for competitiveness. Companies have built Enterprise Resource Planning (ERP) systems in order to manage huge amounts of knowledge, but it is not easy to formalize knowledge in an organization. We focus on a data mining system based on genetic programming (GP). Such a system can be a useful tool for deriving and extracting the necessary information and knowledge from huge amounts of accumulated data. However, when there is not enough data to perform the learning process of genetic programming, we have to either reduce the number of input parameters or increase the amount of learning or training data. In this study, an enhanced data mining method is suggested that combines genetic programming with a self-organizing map (SOM) to reduce the number of input parameters. Experimental results from a prototype implementation are also discussed.

Comparison and Analysis of P2P Botnet Detection Schemes

  • Cho, Kyungsan;Ye, Wujian
    • Journal of the Korea Society of Computer and Information / v.22 no.3 / pp.69-79 / 2017
  • In this paper, we propose a four-phase life cycle of P2P botnets with corresponding detection methods, and a future direction for more effective P2P botnet detection. Our proposals are based on an intensive analysis that compares existing P2P botnet detection schemes from different points of view, such as the life cycle of the P2P botnet, machine learning methods for data mining based detection, the composition of data sets, and performance metrics. Our proposed life cycle model, composed of a linear sequence of stages, suggests utilizing features in the vulnerable phase rather than in the entire life cycle. In addition, we suggest a hybrid detection scheme that combines a data mining based method with our proposed life cycle, and present an improved composition of experimental data sets by analysing the limitations of previous works.

Occupational Health and Safety Management and Turnover Intention in the Ghanaian Mining Sector

  • Amponsah-Tawiah, Kwesi;Ntow, Michael Akomeah Ofori;Mensah, Justice
    • Safety and Health at Work / v.7 no.1 / pp.12-17 / 2016
  • Background: The mining industry is considered one of the most dangerous and hazardous industries, and effective and efficient occupational health and safety management is critical to safeguard workers and the industry. Despite the dangers and hazards present in the mining industry, only a few studies have focused on the relationship between occupational health and safety management and turnover intentions in the mines. Method: Using a cross-sectional survey design, the study collected quantitative data from 255 mine workers who were conveniently sampled from the Ghanaian mining industry. The data collection tools were standardized questionnaires that measured occupational health and safety management and turnover intentions. These scales were pretested before their use in the actual data collection. Results: The correlation coefficients showed a negative relationship between the dimensions of occupational health and safety management and turnover intention: safety leadership (r = -0.33, p < 0.01); supervision (r = -0.26, p < 0.01); safety facilities and equipment (r = -0.32, p < 0.01); safety procedure (r = -0.27, p < 0.01). Among these dimensions, safety leadership and safety facilities were significant predictors of turnover intention ($\beta = -0.28$, p < 0.01 and $\beta = -0.24$, p < 0.01, respectively). The study also found that the turnover intention of employees is heavily influenced by the commitment of safety leadership to ensuring the effective formulation of policies and supervision of occupational health and safety at the workplace. Conclusion: The present study demonstrates that safety leadership is crucial in the administration of occupational health and safety and in reducing turnover intention in organizations.

Method for Preference Score Based on User Behavior (웹 사이트 이용 고객의 행동 정보를 기반으로 한 고객 선호지수 산출 방법)

  • Seo, Dong-Yal;Kim, Doo-Jin;Yun, Jeong-Ki;Kim, Jae-Hoon;Moon, Kang-Sik;Oh, Jae-Hoon
    • CRM연구 / v.4 no.1 / pp.55-68 / 2011
  • Recently, with the development of Web services that utilize a variety of web content, studies on user experience and personalization based on web usage have attracted much attention. The majority of personalization analyses have been carried out on existing data, primarily using databases and statistical models. These approaches are difficult to apply in a timely manner, and are limited in reflecting true behavioral characteristics because the data itself is just a result of customers' behaviors. However, recent studies and commercial products in web analytics try to track and analyze all actions from landing to exit in order to provide personalized service. In this study, by analyzing customers' click-stream behaviors, we define the U-Score (Usage Score), P-Score (Preference Score), and M-Score (Mania Score) to indicate a variety of customer preferences. With these three indicators, we can identify customers' preferences more precisely, provide in-depth customer reports and customer relationship management, and make use of personalized recommender services.

Research on Security Threats Emerging from Blockchain-based Services

  • Yoo, Soonduck
    • International Journal of Internet, Broadcasting and Communication / v.13 no.4 / pp.1-10 / 2021
  • The purpose of this study is to contribute to the positive development of blockchain technology by providing data for examining security vulnerabilities and threats to blockchain-based services and for reviewing countermeasures. The findings of this study are as follows. Threats to the security of blockchain-based services can be classified into application security threats, smart contract security threats, and network (P2P) security threats. First, application security threats include wallet theft (e-wallet stealing), double spending (double payment attack), and cryptojacking (mining malware infection). Second, smart contract security threats are divided into reentrancy attacks, replay attacks, and balance increasing attacks. Third, network (P2P) security threats are divided into the 51% control attack, Sybil attack, balance attack, eclipse attack (spread false information attack), selfish mining (selfish mining monopoly), block withholding attack, DDoS attack (distributed denial-of-service attack), and DNS/BGP hijacks. This study makes it possible to discuss future plans for the blockchain technology-based ecosystem through an understanding of the functional characteristics of transparency, or partial privacy, that can be obtained within the blockchain. It also supports effective responses to various security threats.

Data Mining and FNN-Driven Knowledge Acquisition and Inference Mechanism for Developing A Self-Evolving Expert Systems

  • Kim, Jin-Sung
    • Proceedings of the KAIS Fall Conference / 2003.11a / pp.99-104 / 2003
  • In this research, we propose a mechanism for developing self-evolving expert systems (SEES) based on data mining (DM), fuzzy neural networks (FNN), and a relational database (RDB)-driven forward/backward inference engine. Most former researchers tried to develop a text-oriented knowledge base (KB) and inference engine (IE). However, these have limitations in 1) automatic rule extraction, 2) manipulation of ambiguousness in knowledge, 3) expandability of the knowledge base, and 4) speed of inference. To overcome these limitations, many researchers have tried to develop automatic knowledge extraction and refining mechanisms. As a result, the adaptability of expert systems was improved. Nonetheless, they did not suggest a hybrid and generalized solution for developing self-evolving expert systems. To this end, in this study, we propose an automatic knowledge acquisition and composite inference mechanism based on DM, FNN, and RDB-driven inference. Empirically, our proposed mechanism has five advantages. First, it can extract and reduce specific domain knowledge from an incomplete database by using a data mining algorithm. Second, it can manipulate the ambiguousness in knowledge by using fuzzy membership functions. Third, it can construct a relational knowledge base and expand the knowledge base without limit using an RDBMS (relational database management system). Fourth, our proposed hybrid data mining mechanism can reflect both association rule-based logical inference and complicated fuzzy logic. Fifth, RDB-driven forward and backward inference is faster than traditional text-oriented inference.

Improving Process Mining with Trace Clustering (자취 군집화를 통한 프로세스 마이닝의 성능 개선)

  • Song, Min-Seok;Gunther, C.W.;van der Aalst, W.M.P.;Jung, Jae-Yoon
    • Journal of Korean Institute of Industrial Engineers / v.34 no.4 / pp.460-469 / 2008
  • Process mining aims at mining valuable information from process execution results (called "event logs"). Even though process mining techniques have proven to be a valuable tool, the mining results from real process logs are usually too complex to interpret. The main cause of complex models is the diversity of process logs. To address this issue, this paper proposes a trace clustering approach that splits a process log into homogeneous subsets and applies existing process mining techniques to each subset. Based on log profiles derived from a process log, the approach uses existing clustering techniques to derive clusters. Our approach is implemented in the ProM framework. To illustrate it, a real-life case study is also presented.

A Study on the Turbidity Estimation Model Using Data Mining Techniques in the Water Supply System (데이터마이닝 기법을 이용한 상수도 시스템 내의 탁도 예측모형 개발에 관한 연구)

  • Park, No-Suk;Kim, Soonho;Lee, Young Joo;Yoon, Sukmin
    • Journal of Korean Society of Environmental Engineers / v.38 no.2 / pp.87-95 / 2016
  • Turbidity is a key indicator to users of the 'Discolored Water' phenomenon, which is known to be caused by corrosion of pipelines in the water supply system. 'Discolored Water' is defined as a state in which turbidity is high enough for the user to recognize it visually. Therefore, this study used data mining techniques to estimate turbidity changes in the water supply system. Among data mining techniques, decision tree analysis was applied to develop estimation models for turbidity changes in the water supply system. The pH and residual chlorine datasets were used as variables of the turbidity estimation model. As a result, the model using both variables (pH and residual chlorine) showed more reasonable estimation results than models using only one of the variables. However, the estimation model developed in this study underestimated the peak observed values. To overcome this disadvantage, a high-pass filter method was introduced as a preprocessing step for the estimation model. The modified model using the high-pass filter method predicted the peak observed values more accurately and showed improved prediction performance compared with the conventional model.