• Title/Summary/Keyword: hyperlinks

Web Document Clustering based on Graph using Hyperlinks (하이퍼링크를 이용한 그래프 기반의 웹 문서 클러스터링)

  • Lee, Joon;Kang, Jin-Beom;Choi, Joong-Min
    • 한국HCI학회:학술대회논문집 / 2009.02a / pp.590-595 / 2009
  • As the number of web documents on the internet grows exponentially, improving the performance of web document clustering becomes important. Web document clustering techniques can provide accurate information and fast retrieval by grouping web documents according to their semantic relationships. The mesh-graph-based clustering method achieves high recall by calculating the similarity between documents, but at a high computational cost. This paper proposes a clustering method that uses hyperlinks, a structural feature of web documents, to maintain effectiveness while reducing computational cost.

Web Structure Mining by Extracting Hyperlinks from Web Documents and Access Logs (웹 문서와 접근로그의 하이퍼링크 추출을 통한 웹 구조 마이닝)

  • Lee, Seong-Dae;Park, Hyu-Chan
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.11 / pp.2059-2071 / 2007
  • If the correct structure of a Web site is known, the information provider can discover users' behavior patterns and characteristics to offer better services, and users can find useful information easily and accurately. Extracting the exact structure of a Web site can be difficult, however, because documents on the Web tend to change frequently. This paper proposes a new method for extracting such Web structure automatically. The method consists of two phases. The first phase extracts the hyperlinks among Web documents and then constructs a directed graph to represent the structure of the Web site. This phase is limited, however, in that it cannot discover hyperlinks embedded in Flash and Java Applets. The second phase finds such hidden hyperlinks by using the Web access log. It first extracts the click streams from the access log and then extracts the hidden hyperlinks by comparing the click streams with the directed graph. Several experiments were conducted to evaluate the proposed method.
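
The two phases described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names and the input shapes (a dict of already-parsed links per page, and sessions given as ordered page lists) are assumptions. It also treats every consecutive pair of pages in a session as a followed link, which a real access log would not guarantee (bookmarks and the back button produce the same pattern).

```python
from collections import defaultdict


def build_link_graph(pages):
    """Phase 1: build a directed graph from hyperlinks found in each page.

    pages: dict mapping a page URL to the list of link targets parsed from it
    (hypothetical input; the parsing itself is out of scope here).
    """
    graph = defaultdict(set)
    for src, targets in pages.items():
        for dst in targets:
            graph[src].add(dst)
    return graph


def click_stream_edges(sessions):
    """Collect consecutive (from, to) page pairs from each user session."""
    edges = set()
    for seq in sessions:
        for a, b in zip(seq, seq[1:]):
            if a != b:  # ignore reloads of the same page
                edges.add((a, b))
    return edges


def hidden_hyperlinks(graph, sessions):
    """Phase 2: transitions seen in the log but absent from the link graph."""
    return {
        (a, b)
        for (a, b) in click_stream_edges(sessions)
        if b not in graph.get(a, set())
    }


pages = {"/": ["/a", "/b"], "/a": ["/b"], "/b": []}
sessions = [["/", "/a", "/b"], ["/", "/c"]]
hidden = hidden_hyperlinks(build_link_graph(pages), sessions)
```

Here the transition from `/` to `/c` appears only in the click streams, so it is reported as a candidate hidden hyperlink.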

The Structure of a Web site and Navigability (웹 사이트의 구조와 항해가능성)

  • Min, Kyung-Sil;Chun, Sung-Kyu;Jang, Gi-Ho;Jung, Hyo-Sook;Park, Seong-Bin
    • The Journal of Korean Association of Computer Education / v.14 no.3 / pp.51-62 / 2011
  • Navigability refers to how easily a user can find desired information in a web site and is influenced by the structure of the web site. In this paper, we created three types of Web sites: one whose structure forms a small world, one whose structure forms a semi-matroid, and one based on an ontology. We measured the navigability of each Web site using two criteria: the number of hyperlinks clicked by users to find the desired information and the time elapsed in finding it. We selected these three structures because, in each of them, hyperlinks can be created in a way that helps a user find desired information. From the experiments, we found that the average number of hyperlinks a user clicked to find the desired information was ordered as follows: a Web site that had the semi-matroid property (100.37 hyperlinks) < a Web site that was created based on an ontology (117.63 hyperlinks) < a Web site that had the small-world property (236.17 hyperlinks). In addition, the average time a user took to find the desired information was ordered as follows: a Web site that was created based on an ontology (20 min 26 sec) < a Web site that had the semi-matroid property (23 min 6 sec) < a Web site that had the small-world property (30 min 47 sec). Therefore, a Web site created based on a semi-matroid or an ontology can be considered more navigable than a Web site that has the small-world property. We also propose a way in which our experimental results can be reflected in the design of an educational Web site.

Development of an Automatic Hypertext Indexer for Dynamic Information Storage (동적 정보 저장을 위한 자동 하이퍼텍스트 색인 기법의 개발)

  • Yi, Dong-Ae;Jang, Duk-Sung
    • The Transactions of the Korea Information Processing Society / v.4 no.9 / pp.2333-2341 / 1997
  • The hyperlinks to related nodes should be updated whenever information is inserted into or modified in a hypertext database. More information can be found by means of hyperlinks that are based on hypertext indexes. Therefore, the management of hypertext indexes is an important component of dynamic information storage. In this paper, we suggest a method to manage the hypertext indexes and to determine hyperlinks automatically by using a dynamic indexer. We also construct index, stopword, and postposition dictionaries, an inverted index file, and a thesaurus to support the dynamic indexer.

An Analysis on the Web Usage Pattern Graph Using Web Users' Access Information (웹 이용자의 접속 정보 분석을 통한 웹 활용 그래프의 구성 및 분석)

  • Kim, Hu-Gon;Kim, Jae-Gyo
    • Korean Management Science Review / v.23 no.3 / pp.63-75 / 2006
  • There is a large body of research on the web graph, most of it focused on the hyperlink structure of the web graph. Well-known results on the web graph include the rich-get-richer phenomenon, the small-world phenomenon, and scale-free networks. In this paper, we define a new directed web graph, called the Web Usage Pattern Graph (WUPG), in which nodes represent web sites and arcs between nodes represent movements between two sites in users' browsing behavior. The data used to construct the WUPG, approximately 56,000 records, were gathered from a number of users' PCs. The results of analysing the data are summarized as follows: (i) an extreme rich-get-richer phenomenon; (ii) an average path length between sites that is significantly shorter than previously reported; (iii) fewer external hyperlinks and more internal hyperlinks.

An analysis on the web usage pattern graph using web users' access information (웹 이용자의 접속 정보 분석을 통한 웹 활용 그래프의 구성 및 분석)

  • Kim, Hu-Gon
    • Proceedings of the Korean Operations and Management Science Society Conference / 2005.10a / pp.422-440 / 2005
  • There is a large body of research on the web graph, most of it focused on the hyperlink structure of the web graph. Well-known results on the web graph include the rich-get-richer phenomenon, the small-world phenomenon, and scale-free networks. In this paper, we define a new directed web graph, called the Web Usage Pattern Graph (WUPG), in which nodes represent web sites and arcs between nodes represent movements between two sites in users' browsing behavior. The data used to construct the WUPG, approximately 56,000 records, were gathered at Kyungsung University. The results of analysing the data are summarized as follows: (i) an extreme rich-get-richer phenomenon; (ii) an average path length between sites that is significantly shorter than previously reported; (iii) fewer external hyperlinks and more internal hyperlinks.

Intelligent Spam-mail Filtering Based on Textual Information and Hyperlinks (텍스트정보와 하이퍼링크에 기반한 지능형 스팸 메일 필터링)

  • Kang, Sin-Jae;Kim, Jong-Wan
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.7 / pp.895-901 / 2004
  • This paper describes a two-phase intelligent method for filtering spam mail based on textual information and hyperlinks. Since the body of a spam mail contains little text, it provides insufficient hints for distinguishing spam from legitimate mail. To resolve this problem, we follow the hyperlinks contained in the email body, fetch the contents of the remote webpages, and extract hints (i.e., features) from both the original email body and the fetched webpages. We divide the hints into two kinds of information: definite information (the sender's information and definite spam keyword lists) and less definite textual information (words or phrases, and particular features of the email). In filtering spam mail, the definite information is used first, and then the less definite textual information is applied. In our experiment, the method of fetching web pages improved the F-measure by 9.4% over the method that uses only the original email header and body.
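
The ordering of the two phases can be sketched as below. This is only an illustration under stated assumptions: the abstract does not say how the less definite textual information is scored, so the keyword weights, the threshold, and the `classify` function itself are hypothetical stand-ins, and the fetched webpage text is passed in as plain strings instead of being downloaded.

```python
def classify(email, fetched_pages, blocked_senders, definite_keywords,
             weights, threshold=1.0):
    """Two-phase check: definite information first, then weaker textual hints.

    email: dict with "sender" and "body" keys (hypothetical shape).
    fetched_pages: text of pages reached via hyperlinks in the body.
    weights: hypothetical per-word scores standing in for a trained model.
    """
    # Combine the email body with the linked page text, as the abstract describes.
    text = " ".join([email["body"], *fetched_pages]).lower()

    # Phase 1: definite information (sender lists, definite spam keywords).
    if email["sender"] in blocked_senders:
        return "spam"
    if any(keyword in text for keyword in definite_keywords):
        return "spam"

    # Phase 2: less definite textual information, scored and thresholded.
    score = sum(w for word, w in weights.items() if word in text)
    return "spam" if score >= threshold else "legitimate"


email = {"sender": "a@example.com", "body": "click here"}
weights = {"free": 0.6, "prize": 0.5}
with_pages = classify(email, ["Win a FREE prize now"], set(), {"casino"}, weights)
without_pages = classify(email, [], set(), {"casino"}, weights)
```

In this toy case the short body alone gives no evidence, while the fetched page text pushes the score over the threshold, which mirrors the motivation given in the abstract.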

Website Classification based on Occurrence Frequency of Medical Terms and Hyperlinks in Webpage (웹페이지의 의학용어 출현 빈도와 하이퍼링크에 기반한 웹사이트 분류)

  • Lee, In Keun;Kim, Hwa Sun;Cho, Hune
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.2 / pp.126-132 / 2013
  • This study proposed a method to classify internet websites based on the occurrence frequency of medical terms in their webpages and on the website structure formed by webpages and hyperlinks. The classification used a suitability measure defined by three factors: (1) the occurrence frequency of medical terms among all terms in a webpage, (2) the occurrence frequency of medical terms among the de-duplicated terms in the webpage, and (3) the number of hyperlinks needed to reach a specific webpage from the homepage. We conducted an experiment to verify the proposed method with 80 websites registered in directories related to the medical field and 127 websites in non-medical directories, and the result showed a classification accuracy of 82.5%.
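
The three factors named in this abstract could be computed along the following lines. This is a sketch under assumptions: the abstract does not give the formula that combines the factors into the suitability measure, so no combination is attempted, and the function names, the token lists, and the link dictionary are hypothetical inputs.

```python
from collections import deque


def term_ratios(tokens, medical_terms):
    """Factors (1) and (2): medical-term share of all tokens and of distinct tokens.

    medical_terms must be a set of known medical terms.
    """
    if not tokens:
        return 0.0, 0.0
    total_ratio = sum(t in medical_terms for t in tokens) / len(tokens)
    unique = set(tokens)
    unique_ratio = len(unique & medical_terms) / len(unique)
    return total_ratio, unique_ratio


def link_depths(links, home):
    """Factor (3): fewest hyperlinks needed to reach each page from the homepage.

    links: dict mapping a page to the pages it links to; uses breadth-first search.
    """
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt not in depth:
                depth[nxt] = depth[page] + 1
                queue.append(nxt)
    return depth


total_ratio, unique_ratio = term_ratios(
    ["fever", "fever", "the", "dose"], {"fever", "dose"}
)
depths = link_depths({"/": ["/a"], "/a": ["/b", "/"]}, "/")
```

Pages that cannot be reached from the homepage simply do not appear in the returned depth dictionary.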

Analysis on the Visitors' Pattern of the University Webpages (대학 웹페이지 방문자 패턴분석)

  • Jeon, Mihyeon;Kwon, Hyejung;Hwang, Jahee;Kim, Gyu-Tae;Cho, HyungJun
    • KIPS Transactions on Software and Data Engineering / v.7 no.4 / pp.153-158 / 2018
  • Visitor patterns for a university's web pages were classified and analyzed using network analysis based on hyperlinks. The number of visits to English web pages was proportional to the number of visits to Korean web pages, but with much lower counts. College pages were confirmed to receive more visits than department pages, and a plot against betweenness centrality normalized by degree showed an upper bound on the number of visits. Based on the quantitative analysis of visitor counts, well-designed hyperlinks together with proper public relations are suggested to improve visibility.

International Scientific and Scholarly Communication Networks on World Wide Web (월드와이드웹에 나타난 국제 학술 커뮤니케이션 네트워크에 대한 탐사적 연구)

  • Park, Han-Woo
    • Journal of the Korean Society for Library and Information Science / v.37 no.2 / pp.153-168 / 2003
  • Hyperlinks on the academic World Wide Web have come to be recognized as a form of collaborative communication network that connects individual researchers and research groups and expands their collaborative relations by enabling easy and direct online contact among people or groups anywhere in the world. This paper describes the structure of academic hyperlinks embedded in universities' Web sites hosted in 10 Asian countries and examines the association between the structure of the hyperlink network and the collaborative communication pattern among those countries, based on how frequently they co-author articles. The research found that the number of inter-hyperlinks among universities' Web sites was significantly correlated with the frequency of co-authored articles across the 10 countries.