• Title/Summary/Keyword: measures of association

Exploration of relationship between confirmation measures and association thresholds (기준 확인 측도와 연관성 평가기준과의 관계 탐색)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.24 no.4 / pp.835-845 / 2013
  • Association rule mining is a data mining technique that quantifies the relevance between sets of items in a large database, and it has been applied in various fields such as manufacturing, shopping malls, healthcare, insurance, and education. Philosophers of science have proposed interestingness measures for various kinds of patterns, analyzed their theoretical properties, evaluated them empirically, and suggested strategies to select appropriate measures for particular domains and requirements. Such interestingness measures are divided into objective, subjective, and semantic measures. Objective measures are based on data used in the discovery process and are typically motivated by statistical considerations. Subjective measures take into account not only the data but also the knowledge and interests of users who examine the pattern, while semantic measures additionally take into account utility and actionability. In a very different context, researchers have devoted a lot of attention to measures of confirmation or evidential support. The focus of this paper was on asymmetric confirmation measures, and we compared confirmation measures with basic association thresholds using simulated data. As a result, we could distinguish the direction of an association rule by means of confirmation measures and interpret the degree of association operationally. Furthermore, the results showed that the measure by Rips and that by Kemeny and Oppenheim were better than the other confirmation measures.
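The abstract does not give formulas for the two confirmation measures it singles out. The following is a minimal sketch that assumes the forms commonly cited in the confirmation-theory literature: Rips as 1 − P(¬H|E)/P(¬H), and Kemeny and Oppenheim as [P(E|H) − P(E|¬H)] / [P(E|H) + P(E|¬H)]. For a rule X → Y, the antecedent X is treated as the evidence E and the consequent Y as the hypothesis H. The function name and the example counts are hypothetical and do not come from the paper.

```python
def confirmation_measures(n_xy, n_x, n_y, n):
    """Confirmation of Y by X for a rule X -> Y, from transaction counts.

    n_xy: transactions containing both X and Y
    n_x, n_y: transactions containing X, Y
    n: total number of transactions
    Assumes 0 < n_x and 0 < n_y < n so that every ratio is defined.
    """
    p_not_y = 1 - n_y / n                      # P(not H)
    p_not_y_given_x = 1 - n_xy / n_x           # P(not H | E)
    p_x_given_y = n_xy / n_y                   # P(E | H)
    p_x_given_not_y = (n_x - n_xy) / (n - n_y)  # P(E | not H)

    # Assumed form of the Rips measure.
    rips = 1 - p_not_y_given_x / p_not_y
    # Assumed form of the Kemeny-Oppenheim measure.
    kemeny_oppenheim = (p_x_given_y - p_x_given_not_y) / (
        p_x_given_y + p_x_given_not_y
    )
    return {"rips": rips, "kemeny_oppenheim": kemeny_oppenheim}


# Hypothetical counts: 100 transactions, X in 40, Y in 50.
positive = confirmation_measures(n_xy=30, n_x=40, n_y=50, n=100)
independent = confirmation_measures(n_xy=20, n_x=40, n_y=50, n=100)
negative = confirmation_measures(n_xy=10, n_x=40, n_y=50, n=100)
```

Under these assumed definitions both measures are positive when X raises the probability of Y, zero under independence, and negative when X lowers it, which is the sense in which a signed measure can indicate the direction of a rule.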

The Development of Relative Interestingness Measure for Comparing with Degrees of Association

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1269-1279 / 2008
  • Data mining is a technique for finding useful information in huge databases. One of the well-studied problems in data mining is the exploration of association rules. An association rule technique finds relations among items in massive databases using several interestingness measures. An important and useful classification scheme for interestingness measures is based on user involvement, which yields two categories: objective and subjective measures. This paper presents some relative interestingness measures for comparing degrees of association between two groups. A comparative study of these relative interestingness measures is illustrated with a numerical example. The results show that the relative net confidence is the best relative interestingness measure.

Effect of Market Basket Size on the Accuracy of Association Rule Measures (장바구니 크기가 연관규칙 척도의 정확성에 미치는 영향)

  • Kim, Nam-Gyu
    • Asia Pacific Journal of Information Systems / v.18 no.2 / pp.95-114 / 2008
  • Recent interest in data mining results from the growing volume of business data and the growing business need to extract valuable knowledge from the data and use it in decision making. In particular, recent advances in association rule mining techniques enable us to acquire knowledge about sales patterns among individual items from voluminous transactional data. One of the major purposes of association rule mining is to use the acquired knowledge to support marketing strategies such as cross-selling, sales promotion, and shelf-space allocation. Despite the potential applicability of association rule mining, the marketing mix derived from data mining often fails to produce realized profit. The main difficulty of mining-based profit realization is that association rule mining discovers a tremendous number of patterns. Because of the many patterns, data mining experts must perform additional mining on the results of the initial mining in order to extract only actionable and profitable knowledge, which consumes considerable time and cost. In the literature, a number of interestingness measures have been devised for evaluating discovered patterns. Most of the measures can be calculated directly from a contingency table, which summarizes the sales frequencies of exclusive items or itemsets. A contingency table can provide brief insights into the relationship between two or more itemsets of concern. However, some useful information about sales transactions may be lost when a contingency table is constructed. For instance, the size of each market basket (i.e., the number of items in each transaction) cannot be described in a contingency table. It is natural that a larger basket tends to contain more sales patterns. Therefore, if two itemsets are sold together in a very large basket, it can be expected that the basket contains two or more patterns and that the two itemsets belong to different patterns. We should therefore classify frequent itemsets into two categories, inter-pattern co-occurrence and intra-pattern co-occurrence, and investigate the effect of market basket size on the two categories. This notion implies that any interestingness measure for association rules should consider not only the total frequency of the target itemsets but also the size of each basket. There have been many attempts to analyze various interestingness measures in the literature. Most of them conducted qualitative comparisons among measures: the studies proposed desirable properties of interestingness measures and then surveyed how many properties each measure satisfies. However, relatively little attention has been paid to evaluating how valuable the patterns discovered by each measure are in the real world. In this paper, two notions regarding association rule measures are proposed. First, a quantitative criterion for estimating the accuracy of association rule measures is presented. According to this criterion, a measure is considered accurate if it assigns high scores to meaningful patterns that actually exist and low scores to arbitrary patterns that co-occur by coincidence. Next, complementary measures are presented to improve the accuracy of traditional association rule measures. By incorporating market basket size, the devised measures attempt to discriminate the co-occurrence of itemsets in a small basket from a co-occurrence in a large basket. Intensive computer simulations under various workloads were performed to analyze the accuracy of various interestingness measures, including traditional measures and the proposed measures.
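The point that a contingency table discards basket size can be illustrated with a small sketch. The transactions, item names, and helper functions below are hypothetical and are not taken from the paper; the paper's proposed size-aware measures are not specified in the abstract and are not reproduced here.

```python
def contingency_table(transactions, x, y):
    """Return 2x2 counts (both, x_only, y_only, neither) for items x and y."""
    both = x_only = y_only = neither = 0
    for basket in transactions:
        has_x, has_y = x in basket, y in basket
        if has_x and has_y:
            both += 1
        elif has_x:
            x_only += 1
        elif has_y:
            y_only += 1
        else:
            neither += 1
    return both, x_only, y_only, neither


def mean_basket_size(transactions):
    """Average number of items per transaction."""
    return sum(len(basket) for basket in transactions) / len(transactions)


# Two hypothetical datasets: same co-occurrence structure, different basket sizes.
small_baskets = [{"bread", "butter"}, {"bread"}, {"butter"}, {"milk"}]
large_baskets = [
    {"bread", "butter", "milk", "eggs", "tea", "rice"},
    {"bread", "jam", "soap"},
    {"butter", "salt"},
    {"milk", "tea", "rice", "soap"},
]

table_small = contingency_table(small_baskets, "bread", "butter")
table_large = contingency_table(large_baskets, "bread", "butter")
```

Both datasets yield the same table, so any measure computed from the table alone scores the bread and butter pair identically, even though the average basket size differs (1.25 versus 3.75 items).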

Association of Mutual Fund Risk Measures and Return Parameters: A Juxtapose of Ranking for Performance in Pakistan

  • KHURRAM, Muhammad Usman;HAMID, Kashif;JAVEED, Sohail Ahmad
    • The Journal of Asian Finance, Economics and Business / v.8 no.2 / pp.25-39 / 2021
  • The purpose of this study is to investigate the association among mutual fund (MF) risk measures and return parameters, evaluate mutual fund performance, and identify the most appropriate mutual fund performance measure for investment in Pakistan. Thirty-five mutual funds were selected for the period 2007-2015. The Sharpe, Treynor, Jensen alpha, information ratio (IR), and Fama's net selectivity measures were used to analyze MF performance. Our findings show that a significant positive relation exists between the Sharpe ratio and both Jensen alpha and IR, that the Treynor ratio is negatively correlated with Jensen alpha, and that Jensen alpha is positively associated with IR. Moreover, regarding the association among performance measures, Fama's net selectivity is a major driver leading the other measures, while the Sharpe ratio and IR also lead the Treynor ratio. Furthermore, when the performance measures are ranked by standard deviation, Fama's net selectivity comes first, Jensen alpha second, the Sharpe ratio third, IR fourth, and the Treynor ratio fifth according to risk parameters in Pakistan. Overall, Jensen alpha appears to be the most suitable mutual fund performance measure in Pakistan due to its practical nature. Finally, the Pakistani stock market index KSE100 (as benchmark) performs better than the MF industry of Pakistan.
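The five performance measures named in the abstract have widely used textbook definitions. The sketch below assumes those definitions, a constant per-period risk-free rate, and the market index as the benchmark for the information ratio. The function names and the return series are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def beta(fund, market):
    """Sample covariance of fund and market returns over sample market variance."""
    mf, mm = mean(fund), mean(market)
    cov = sum((f - mf) * (m - mm) for f, m in zip(fund, market)) / (len(fund) - 1)
    var = sum((m - mm) ** 2 for m in market) / (len(market) - 1)
    return cov / var


def performance_measures(fund, market, rf):
    """Textbook forms of five fund performance measures (assumed definitions)."""
    b = beta(fund, market)
    excess = mean(fund) - rf            # average fund return above risk-free rate
    market_excess = mean(market) - rf   # average market return above risk-free rate
    active = [f - m for f, m in zip(fund, market)]  # return relative to benchmark
    return {
        "sharpe": excess / stdev(fund),
        "treynor": excess / b,
        "jensen_alpha": excess - b * market_excess,
        "information_ratio": mean(active) / stdev(active),
        # Fama's net selectivity: excess return beyond what total risk would earn.
        "net_selectivity": excess - (stdev(fund) / stdev(market)) * market_excess,
    }


# Hypothetical per-period returns; the fund is exactly twice the market.
market_returns = [0.01, 0.03, -0.02, 0.04]
fund_returns = [0.02, 0.06, -0.04, 0.08]
results = performance_measures(fund_returns, market_returns, rf=0.0)
```

Because the hypothetical fund is a pure leveraged copy of the market, its beta is 2 and both Jensen alpha and net selectivity come out at zero, which is a quick sanity check on the formulas.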

A Study on the Frequency Level Preference Tendency of Association Measures (연관성 척도의 빈도수준 선호경향에 대한 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Information Management / v.21 no.4 s.54 / pp.281-294 / 2004
  • Association measures are used in various applications, including information retrieval and data mining. Each association measure's tendency to prefer high- or low-frequency items warrants close examination because it has a significant impact on application performance. This paper examines the frequency level preference (FLP) tendency of some popular association measures using artificially generated co-occurrence data and evaluates the results. A method for adjusting the FLP tendency of major association measures, such as the cosine coefficient, is then proposed. The method is tested on co-occurrence-based query expansion in information retrieval, and the results suggest that it is useful. Based on the results of the analysis and experiment, implications for related disciplines are identified.
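A small sketch can show what a difference in frequency level preference looks like. It compares the cosine coefficient with pointwise mutual information (PMI), which is chosen here only as a familiar contrast and is not named in the abstract. The counts are hypothetical, and the paper's adjustment method is not described in the abstract, so it is not reproduced.

```python
from math import log2, sqrt


def cosine(n_xy, n_x, n_y):
    """Cosine coefficient from co-occurrence and marginal counts."""
    return n_xy / sqrt(n_x * n_y)


def pmi(n_xy, n_x, n_y, n):
    """Pointwise mutual information (base 2) from counts over n contexts."""
    return log2((n_xy * n) / (n_x * n_y))


N = 10_000  # hypothetical number of contexts (e.g., documents)

# Two pairs with the same overlap ratio but different frequency levels.
rare_pair = {"n_xy": 5, "n_x": 10, "n_y": 10}
frequent_pair = {"n_xy": 500, "n_x": 1000, "n_y": 1000}

cos_rare = cosine(**rare_pair)
cos_frequent = cosine(**frequent_pair)
pmi_rare = pmi(**rare_pair, n=N)
pmi_frequent = pmi(**frequent_pair, n=N)
```

With these counts the cosine coefficient scores both pairs equally, while PMI ranks the low-frequency pair well above the high-frequency one. This is only one way such a preference can show up, and the sketch does not characterize the cosine coefficient's own FLP tendency.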

Exploration of PIM based similarity measures as association rule thresholds (확률적 흥미도를 이용한 유사성 측도의 연관성 평가 기준)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.23 no.6 / pp.1127-1135 / 2012
  • Association rule mining is a method for quantifying the relationship between sets of items in a large database. One of the well-studied problems in data mining is the exploration of association rules. There are three primary quality measures for association rules: support, confidence, and lift. Association rules are usually generated using confidence. Confidence is the most important of these measures, but it is asymmetric and takes only positive values, which makes rule generation difficult. In this paper we apply similarity measures based on the probabilistic interestingness measure to address this problem. Comparative studies of support, the two confidences, lift, and several similarity measures based on the probabilistic interestingness measure are illustrated with a numerical example. As a result, we found that these similarity measures indicate the degree of association in the same way as confidence. We could also confirm the direction of association because their values are signed.
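The three basic measures can be computed directly from transaction counts. The sketch below uses the standard definitions and hypothetical counts. It does not implement the similarity measures based on the probabilistic interestingness measure, which the abstract does not define.

```python
def rule_measures(n_xy, n_x, n_y, n):
    """Support, both confidences, and lift for items X and Y from counts."""
    return {
        "support": n_xy / n,                 # P(X and Y)
        "confidence_x_to_y": n_xy / n_x,     # P(Y | X)
        "confidence_y_to_x": n_xy / n_y,     # P(X | Y)
        "lift": (n_xy * n) / (n_x * n_y),    # P(X and Y) / (P(X) * P(Y))
    }


# Hypothetical counts: 100 transactions, X in 40, Y in 50, both in 10.
m = rule_measures(n_xy=10, n_x=40, n_y=50, n=100)
```

In this example the two confidences differ (0.25 and 0.2), which shows the asymmetry, and both are positive even though the lift of 0.5 indicates that X and Y occur together less often than independence would predict. Confidence alone therefore does not reveal the direction of the association.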

Content Analysis as a Method for Measuring Exploitation and Exploration: Discussion with Example Application to the Worldwide Optical Library Industry (활용과 탐색 측정을 위한 방법론으로써 콘텐츠 분석 :세계 광디스크 라이브러리장치 산업)

  • Yu, Gun Jea
    • The Journal of the Korea Contents Association / v.14 no.7 / pp.495-510 / 2014
  • Measures of exploration and exploitation fall into three categories: patent-based measures, survey-based measures, and press-based measures. Such variety stems from the lack of consensus on the definitions of exploration and exploitation. Given the dynamic nature of exploration and exploitation, I suggest how to improve existing press-based measures by proposing strategies and procedures for constructing valid and reliable press-based measures in a single-industry context. I illustrate my arguments through a study of the worldwide optical disk industry.

Proposals for the Coexisting of Legal Units and Living Measures (법정계량단위와 생활계량단위의 공존방안)

  • Sohn, Jin-Hyeon
    • The Journal of the Korea Contents Association / v.8 no.9 / pp.185-193 / 2008
  • Since July 1, 2007, the Korean government has regulated the use of traditional measures such as ‘pyeong’ and ‘don’ in commercial transactions, not only as standard units but also as supplementary ones. However, contrary to our expectation, the measures ‘pyeong’ and ‘don’ continue to be used in other forms because these living measures are convenient in daily life and carry useful meanings. In this article, we propose an idea that allows the convenient living measures and the legal units to coexist.

ASEAN Protection Trade Measures: Focusing on Non-Tariff Measures and Specific Trade Concerns (아세안의 보호무역조치 연구: 비관세조치 및 특정무역현안을 중심으로)

  • Ra, Hee-Ryang
    • Korea Trade Review / v.44 no.3 / pp.43-72 / 2019
  • This study examines the trends, current situation, and implications of non-tariff measures (NTM) and specific trade concerns (STC) as protection trade measures of ASEAN. ASEAN's non-tariff measures and its share of specific trade concerns are very significant, ranking second and third largest, respectively, among the major countries. This means that protection through non-tariff measures is a strong feature of ASEAN's trade policy. In the future, ASEAN should also try to prevent unnecessary disputes arising from exporting countries' specific trade concerns about the implementation of non-tariff measures. Activating trade policy cooperation is likely to reduce the conflicts and costs caused by these trade disputes.

A Comparison Study on the Weighted Network Centrality Measures of tnet and WNET (tnet과 WNET의 가중 네트워크 중심성 지수 비교 연구)

  • Lee, Jae Yun
    • Journal of the Korean Society for Information Management / v.30 no.4 / pp.241-264 / 2013
  • This study compared and analyzed the weighted network centrality measures supported by Opsahl's tnet and Lee's WNET, which are free software packages for weighted network analysis. tnet supports three node centrality measures: weighted degree, weighted closeness, and weighted betweenness. WNET supports four: nearest neighbor centrality, mean association, mean profile association, and triangle betweenness centrality. An experimental analysis on artificial network data showed that tnet's measures are highly sensitive to linear transformations of link weights, whereas WNET's centrality measures are insensitive to them. Seven centrality measures from both tools were calculated on six real network datasets. The results showed the characteristics of the weighted network centrality measures of tnet and WNET, and the relationships between them were also discussed.