• Title/Summary/Keyword: Precision-recall


A Study on the Effectiveness of Information Retrieval (정보검색효율에 관한 연구)

  • Yoon Koo-ho
    • Journal of the Korean Society for Library and Information Science / v.8 / pp.73-101 / 1981
  • Retrieval effectiveness is the principal criterion for measuring the performance of an information retrieval system. The effectiveness of a retrieval system depends primarily on the extent to which it can retrieve wanted documents without retrieving unwanted ones. So, ultimately, effectiveness is a function of the relevant and nonrelevant documents retrieved. Consequently, the 'relevance' of information to the user's request has become one of the most fundamental concepts encountered in the theory of information retrieval. Although there is at present no consensus as to how this notion should be defined, relevance has been widely used as a meaningful quantity and an adequate criterion for measures of the evaluation of retrieval effectiveness. Among the various parameters based on the 'two-by-two' table (or contingency table), recall and precision were the major considerations in this paper, because it is assumed that they are sufficient for the measurement of effectiveness. Accordingly, the different concepts of 'relevance' and 'pertinence' of documents to user requests and their proper usages were investigated, even though the two terms have unfortunately been used rather loosely in the literature. In addition, a number of variables affecting the recall and precision values were discussed. Some conclusions derived from this study are as follows: Any notion of retrieval effectiveness is based on 'relevance', which itself is extremely difficult to define. Recall and precision are valuable concepts in the study of any information retrieval system. They are, however, not the only criteria by which a system may be judged. The recall-precision curve represents the average performance of any given system, and this may vary quite considerably in particular situations. Therefore, it is possible to some extent to vary the indexing policy, the indexing language, or the search methodology to improve the performance of the system in terms of recall and precision. The 'inverse relationship' between average recall and precision could be accepted as the 'fundamental law of retrieval', and it should certainly be used as an aid to evaluation. Finally, there is a limit to the performance (in terms of effectiveness) achievable by an information retrieval system. That is: "Perfect retrieval is impossible."
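As a minimal sketch of how recall and precision follow from the two-by-two contingency table mentioned in this abstract, the Python below computes both ratios for a single query. The class name, field names, and counts are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContingencyTable:
    """Two-by-two table of retrieval outcomes for one query (illustrative names)."""
    relevant_retrieved: int      # relevant documents the system returned
    nonrelevant_retrieved: int   # unwanted documents the system returned
    relevant_missed: int         # relevant documents the system did not return
    nonrelevant_rejected: int    # unwanted documents correctly left out

    def recall(self) -> float:
        # Share of all relevant documents that were retrieved.
        relevant = self.relevant_retrieved + self.relevant_missed
        return self.relevant_retrieved / relevant if relevant else 0.0

    def precision(self) -> float:
        # Share of all retrieved documents that are relevant.
        retrieved = self.relevant_retrieved + self.nonrelevant_retrieved
        return self.relevant_retrieved / retrieved if retrieved else 0.0


# Made-up counts: 50 relevant documents in a collection of 1000, 40 retrieved.
table = ContingencyTable(
    relevant_retrieved=30,
    nonrelevant_retrieved=10,
    relevant_missed=20,
    nonrelevant_rejected=940,
)
print(f"recall={table.recall():.2f} precision={table.precision():.2f}")
```

Note that the fourth cell (nonrelevant documents correctly rejected) enters neither ratio, which is one reason recall and precision alone cannot be the only criteria for judging a system.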


Image Clustering Using Machine Learning : Study of InceptionV3 with K-means Methods. (머신 러닝을 사용한 이미지 클러스터링: K-means 방법을 사용한 InceptionV3 연구)

  • Nindam, Somsauwt;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.681-684 / 2021
  • In this paper, we study image clustering without labeling using machine learning techniques. We propose an unsupervised machine learning technique to design an image clustering model that automatically categorizes images into groups. Our experiment focused on the Inception convolutional neural network (Inception V3) combined with the k-means method to cluster images. For this, we collected the public datasets Food-K5, Flowers, Handwritten Digit, and Cats-dogs, together with our own dataset Rice Germination and the owner-provided dataset Palm print. Our experiment consists of three parts. First, remove the labels from all images and move them into a single dataset. Second, load the dataset into Inception V3 to extract image features, which are then passed to k-means clustering over six classes. Lastly, evaluate the model's accuracy using the confusion matrix, based on precision, recall, and F1. With our method, we obtained the following results: 1) Handwritten Digit (precision = 1.000, recall = 1.000, F1 = 1.00), 2) Food-K5 (precision = 0.975, recall = 0.945, F1 = 0.96), 3) Palm print (precision = 1.000, recall = 0.999, F1 = 1.00), 4) Cats-dogs (precision = 0.997, recall = 0.475, F1 = 0.64), 5) Flowers (precision = 0.610, recall = 0.982, F1 = 0.75), and our dataset 6) Rice Germination (precision = 0.997, recall = 0.943, F1 = 0.97). Our experiment showed that the model could reach an accuracy rate of 0.8908; the outcomes indicate that the proposed model is strong enough to differentiate the different images and classify them into clusters.
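The F1 values listed in this abstract can be related to its precision and recall figures through the standard harmonic-mean definition of F1. The sketch below assumes that definition (the abstract does not state which formula the authors used) and recomputes F1 from each reported precision and recall pair; the function name `f1_score` is our own.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as listed in the abstract.
reported = {
    "Handwritten Digit": (1.000, 1.000, 1.00),
    "Food-K5": (0.975, 0.945, 0.96),
    "Palm print": (1.000, 0.999, 1.00),
    "Cats-dogs": (0.997, 0.475, 0.64),
    "Flowers": (0.610, 0.982, 0.75),
    "Rice Germination": (0.997, 0.943, 0.97),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: computed F1 = {f1_score(p, r):.2f}, reported F1 = {f1:.2f}")
```

Under this assumption the recomputed values agree with the reported ones to two decimals. The Cats-dogs and Flowers rows show how a low value of either precision or recall pulls F1 down even when the other is close to 1.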

A Film-Defect Inspection System Using Image Segmentation and Template Matching Techniques (영상 세그멘테이션 및 템플리트 매칭 기술을 응용한 필름 결함 검출 시스템)

  • Yoon, Young-Geun;Lee, Seok-Lyong;Park, Ho-Hyun;Chung, Chin-Wan;Kim, Sang-Hee
    • Journal of KIISE:Databases / v.34 no.2 / pp.99-108 / 2007
  • In this paper, we design and implement the Film Defect Inspection System (FDIS), which detects film defects and determines their types and can be used in producing polarized films for TFT-LCDs. The proposed system is designed to detect film defects in polarized film images using image segmentation techniques and to determine defect types through image analysis of the detected defects. To determine defect types, we extract features such as the shape and texture of defects and compare those features with the corresponding features of reference images stored in a template database. Experimental results using FDIS show that the proposed system detects all defects in the test images effectively (Precision 1.0, Recall 1.0) and efficiently (within 0.64 seconds on average), and achieves considerably high correctness in determining defect types (Precision 0.96 and Recall 0.95 on average). In addition, our system shows high robustness to image rotation, achieving Precision 0.95 and Recall 0.89 on average.

Sentiment Analysis From Images - Comparative Study of SAI-G and SAI-C Models' Performances Using AutoML Vision Service from Google Cloud and Clarifai Platform

  • Marcu, Daniela;Danubianu, Mirela
    • International Journal of Computer Science & Network Security / v.21 no.9 / pp.179-184 / 2021
  • In our study we performed sentiment analysis on images. For this purpose, we used 153 images containing people, animals, buildings, landscapes, cakes, and objects, which we divided into two categories: images suggesting a positive emotion and images suggesting a negative emotion. In order to classify the images into the two categories, we created two models. The SAI-G model was created with Google's AutoML Vision service. The SAI-C model was created on the Clarifai platform. The data were labeled in a preprocessing stage, and for the SAI-C model we created the concepts POSITIVE (POZITIV) and NEGATIVE (NEGATIV). In order to evaluate the performance of the two models, we used a series of evaluation metrics: Precision, Recall, the ROC (Receiver Operating Characteristic) curve, the Precision-Recall curve, the Confusion Matrix, Accuracy Score, and Average Precision. Precision and Recall for the SAI-G model are both 0.875 at a confidence threshold of 0.5, while for the SAI-C model we obtained much lower scores, Precision = 0.727 and Recall = 0.571, at the same confidence threshold. The results indicate a lower classification performance of the SAI-C model compared to the SAI-G model. The exception is the value of Precision for the POSITIVE concept, which is 1.000.
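The abstract reports precision and recall "at a confidence threshold of 0.5". As a rough sketch of what that means, the code below turns classifier confidence scores into positive or negative predictions at a chosen threshold and then computes both metrics. The scores and labels are invented for illustration and are not data from the study, and the function `precision_recall_at` is a hypothetical helper, not part of AutoML Vision or Clarifai.

```python
def precision_recall_at(scores, labels, threshold=0.5):
    """Precision and recall when scores >= threshold are predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))        # true positives
    fp = sum(p and not y for p, y in zip(predicted, labels))    # false positives
    fn = sum((not p) and y for p, y in zip(predicted, labels))  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented confidence scores for the POSITIVE class and the true labels.
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [True, True, False, True, True, False, False, False]

for threshold in (0.25, 0.5, 0.7):
    p, r = precision_recall_at(scores, labels, threshold)
    print(f"threshold={threshold:.2f} precision={p:.3f} recall={r:.3f}")
```

Sweeping the threshold over many values and plotting the resulting pairs gives the Precision-Recall curve that the abstract lists among its evaluation metrics; in this toy data, raising the threshold raises precision and lowers recall.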

Tree-Pattern-Based Clone Detection with High Precision and Recall

  • Lee, Hyo-Sub;Choi, Myung-Ryul;Doh, Kyung-Goo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.5 / pp.1932-1950 / 2018
  • The paper proposes a code-clone detection method that gives the highest possible precision and recall, without giving much attention to efficiency and scalability. The goal is to automatically create a reliable reference corpus that can be used as a basis for evaluating the precision and recall of clone detection tools. The algorithm takes an abstract-syntax-tree representation of source code and thoroughly examines every possible pair of all duplicate tree patterns in the tree, while avoiding unnecessary and duplicated comparisons wherever possible. The largest possible duplicate patterns are then collected in the set of pattern clusters that are used to identify code clones. The method is implemented and evaluated for a standard set of open-source Java applications. The experimental result shows very high precision and recall. False-negative clones missed by our method are all non-contiguous clones. Finally, the concept of neighbor patterns, which can be used to improve recall by detecting non-contiguous clones and intertwined clones, is proposed.

Evaluation of the Newspaper Library -With Emphasis on the Document Delivery Capability and Retrieval Effectiveness- (신문사 자료실에 대한 평가 -문헌전달능력과 검색효율을 중심으로-)

  • 노동조
    • Journal of the Korean BIBLIA Society for library and Information Science / v.7 no.1 / pp.319-351 / 1994
  • This research is a case study of the newspaper libraries in Seoul, and its primary purpose is to investigate their document delivery capability. To achieve this purpose, representative users visited seven newspaper libraries and recorded their searching time. Document delivery capability was measured in hours, minutes, and seconds (searching time). Retrieval effectiveness was tested through the recall ratio and the precision ratio. The major findings of the study are summarized as follows: 1) Most of the newspaper libraries were excellent in document delivery capability; six newspaper libraries delivered data related to the subject. 2) The newspaper libraries showed a mean recall ratio of 50.1% and a mean precision ratio of 84.8% for all materials. 3) For their own articles, the newspaper libraries showed a recall ratio of 71.4% and a precision ratio of 90.0%. This means that retrieval of their own articles was more effective than retrieval of others. 4) The Kookmin Ilbo library had the most excellent system, and the precision ratio of The Dong-A Ilbo library was higher than its recall ratio. The Han Kyoreh Shinmun library had an excellent arrangement of its own articles, but The Segye Times library had problems in every respect.


A Study on a Filtering Method of Recommendation Service System Using User's Context (사용자 상황을 이용한 추천 서비스 시스템의 필터링 기법에 관한 연구)

  • Han, Dong-Jo;Park, Dae-Young;Choi, Ki-Ho
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.8 no.1 / pp.119-126 / 2009
  • In recent years, many recommendation service systems that automatically search for or recommend information based on the user's tastes or characteristics have been developed. However, they share a weak point: accurate recommendation is difficult without considering the user's preferences in a given context. This paper proposes a filtering method that gives accurate recommendations by taking the preferences of the user's context into account. To support this method, we derive UCOP (User-Context Object Preference) using the preference of the user's context and the Pearson correlation coefficient. The results of the experiment show improvements of 11% and 2% in precision and of 8% and 4% in recall compared with the existing service systems. Overall, our recommendation service system shows 77% precision and 53% recall.
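The abstract names the Pearson correlation coefficient as one ingredient of UCOP but does not say how it is combined with context preferences. The sketch below therefore shows only the coefficient itself, applied to two hypothetical preference vectors; the function name, the ratings, and the choice to return 0.0 for a constant vector are our assumptions, not the authors' method.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: treat a constant preference vector as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical preference ratings of two users for the same four objects.
user_a = [5, 3, 4, 1]
user_b = [4, 2, 5, 1]
print(f"similarity = {pearson(user_a, user_b):.3f}")
```

In a collaborative-filtering setting, a value near 1 would suggest that the two users rate objects similarly, so one user's preferences could inform recommendations for the other.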


CCR : Tree-pattern based Code-clone Detector (CCR : 트리패턴 기반의 코드클론 탐지기)

  • Lee, Hyo-Sub;Do, Kyung-Goo
    • Journal of Software Assessment and Valuation / v.8 no.2 / pp.13-27 / 2012
  • This paper presents CCR (Code Clone Ransacker), a tree-pattern based code-clone detector that finds all clustered duplicate patterns by comparing all pairs of subtrees in the programs. A pattern included in its entirety in another pattern is ignored, since only the largest duplicate patterns are of interest. In evaluation, CCR shows high precision and recall. Previous tree-pattern based code-clone detectors are known to have good precision and recall because they compare program structure. CCR maintains high precision while achieving up to 5 times higher recall than Asta and about 1.9 times higher recall than CloneDigger. The tool also covers the majority of Bellon's reference corpus.

An Experimental Study on Semantic Searches for Image Data Using Structured Social Metadata (구조화된 소셜 메타데이터를 활용한 이미지 자료의 시맨틱 검색에 관한 실험적 연구)

  • Kim, Hyun-Hee;Kim, Yong-Ho
    • Journal of the Korean Society for Library and Information Science / v.44 no.1 / pp.117-135 / 2010
  • We designed a structured folksonomy system in which queries can be expanded through tag control, binding equivalent, synonymous, or related tags together, in order to improve the retrieval efficiency (recall and precision) of image data. Then, we evaluated the proposed system by comparing it to a tag-based system without tag control in terms of recall, precision, and user satisfaction. Furthermore, we also investigated which query expansion method is the most efficient in terms of retrieval performance. The experimental results showed that the recall, precision, and user satisfaction rates of the proposed system are each statistically higher than those of the tag-based system. On the other hand, there are significant differences among the precision rates of the query expansion methods, but no significant differences among their recall rates. The proposed system can be utilized as a guide on how to effectively index and retrieve the digital content of digital library systems in the Library 2.0 era.

Diagnostic performance of artificial intelligence using cone-beam computed tomography imaging of the oral and maxillofacial region: A scoping review and meta-analysis

  • Farida Abesi ;Mahla Maleki ;Mohammad Zamani
    • Imaging Science in Dentistry / v.53 no.2 / pp.101-108 / 2023
  • Purpose: The aim of this study was to conduct a scoping review and meta-analysis to provide overall estimates of the recall and precision of artificial intelligence for detection and segmentation using oral and maxillofacial cone-beam computed tomography (CBCT) scans. Materials and Methods: A literature search was done in Embase, PubMed, and Scopus through October 31, 2022 to identify studies that reported the recall and precision values of artificial intelligence systems using oral and maxillofacial CBCT images for the automatic detection or segmentation of anatomical landmarks or pathological lesions. Recall (sensitivity) indicates the percentage of certain structures that are correctly detected. Precision (positive predictive value) indicates the percentage of accurately identified structures out of all detected structures. The performance values were extracted and pooled, and the estimates were presented with 95% confidence intervals (CIs). Results: In total, 12 eligible studies were included. The overall pooled recall for artificial intelligence was 0.91 (95% CI: 0.87-0.94). In a subgroup analysis, the pooled recall was 0.88 (95% CI: 0.77-0.94) for detection and 0.92 (95% CI: 0.87-0.96) for segmentation. The overall pooled precision for artificial intelligence was 0.93 (95% CI: 0.88-0.95). A subgroup analysis showed that the pooled precision value was 0.90 (95% CI: 0.77-0.96) for detection and 0.94 (95% CI: 0.89-0.97) for segmentation. Conclusion: Excellent performance was found for artificial intelligence using oral and maxillofacial CBCT images.