Search | Korea Science

Numerical Formula and Verification of Web Robot for Collection Speedup of Web Documents

Kim Weon;Kim Young-Ki;Chin Yong-Ok
- Journal of Internet Computing and Services, v.5 no.6, pp.1-10, 2004
A web robot is software that tracks and collects web documents on the Internet. The performance scalability of recent web robots has reached its limit as the number of web documents has increased sharply with the rapid growth of the Internet. Accordingly, research on performance scalability in searching and collecting web documents is strongly needed. This thesis presents the design of a Multi-Agent based web robot that speeds up document collection, as an alternative to the existing sequentially executing web robot based on the Fork-Join method, together with an analysis of its performance scalability. For collection speedup, the Multi-Agent based web robot divides URLs among agents and processes inactive URLs ('dead-link' URLs), which are caused by overloaded web documents or by temporary network or web-server disturbances, independently. Each agent consists of four components: Loader, Extractor, Active URL Scanner, and Inactive URL Scanner. The thesis models the Multi-Agent based web robot on Amdahl's Law, introduces a numerical formula for collection speedup, and verifies the performance improvement by comparing values from the formula with experimental data. Moreover, a 'Dynamic URL Partition algorithm' is introduced and implemented to minimize the workload on web servers by maximizing the access interval for each web server that is a collection target.
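The abstract does not reproduce the thesis's numerical formula, so the following is only a minimal sketch of textbook Amdahl's Law, on which the model is said to be based. The function name `amdahl_speedup` and its parameters are illustrative assumptions, with `workers` loosely standing in for the number of agents.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Textbook Amdahl's Law: speedup = 1 / ((1 - p) + p / n).

    parallel_fraction (p): share of the work that can run in parallel.
    workers (n): number of parallel workers (assumed here to be agents).
    This is NOT the formula derived in the thesis, only the general law.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Speedup saturates as workers grow because the serial part remains.
    for n in (1, 2, 4, 8, 16):
        print(n, round(amdahl_speedup(0.9, n), 3))
```

Under this law the serial share bounds the achievable speedup at 1 / (1 - p), which is why moving inactive-URL handling out of the sequential path could matter for collection speed.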

Fast Selection of Composite Web Services Based on Workflow Partition (워크플로우 분할에 기반한 복합 웹 서비스의 빠른 선택)

Jang, Jae-Ho;Shin, Dong-Hoon;Lee, Kyong-Ho
- Journal of KIISE:Software and Applications, v.34 no.5, pp.431-446, 2007
Executable composite Web services are selected by binding a given abstract workflow to specific Web services that satisfy given QoS requirements. Considering the rapidly increasing number of Web services and their highly dynamic QoS environment, the fast selection of composite services is important. This paper presents a method for quality-driven composite Web service selection based on a workflow partition strategy. The proposed method partitions an abstract workflow into two sub-workflows to decrease the number of candidate services that must be considered. The QoS requirement is also decomposed for each partitioned workflow. Since the decomposition of a QoS requirement is based on heuristics, the selection might fail to find composite Web services. To avoid such a failure, the tightness of a QoS requirement is defined, and whether a workflow is partitioned is decided according to that tightness. Mixed integer linear programming is used for efficient service selection. Experimental results show that the success rate of partitioning is above 99%. In particular, the proposed method runs faster and selects composite services whose quality is not significantly different (less than 5%) from that of the optimal one.

Optimized Adoption of NVM Storage by Considering Workload Characteristics

Kim, Jisun;Bahn, Hyokyung
- JSTS:Journal of Semiconductor Technology and Science, v.17 no.1, pp.1-6, 2017
This paper presents an optimized adoption of NVM for the storage system of heterogeneous applications. Our analysis shows that the bulk of I/O does not fall on a single storage partition but varies significantly across application categories. In particular, journaling I/O accounts for a dominant portion of total I/O in DB applications such as OLTP, whereas swap I/O accounts for a large portion of I/O in graph visualization applications, and file I/O accounts for a large portion in web browsers and multimedia players. Based on these observations, we argue that the performance gain from NVM is not maximized by fixing it to one specific storage partition; the best placement varies widely across applications. Specifically, for graph visualization, DB, and multimedia player applications, using NVM as a swap, a journal, and a file system partition, respectively, performs well. Our optimized adoption of NVM improves storage performance by 10-61%.
https://doi.org/10.5573/JSTS.2017.17.1.001

Effective Web Crawling Orderings from Graph Search Techniques (그래프 탐색 기법을 이용한 효율적인 웹 크롤링 방법들)

Kim, Jin-Il;Kwon, Yoo-Jin;Kim, Jin-Wook;Kim, Sung-Ryul;Park, Kun-Soo
- Journal of KIISE:Computer Systems and Theory, v.37 no.1, pp.27-34, 2010
Web crawlers are fundamental programs that iteratively download web pages by following links, starting from a small set of initial URLs. Several web crawling orderings have previously been proposed to crawl popular web pages in preference to other pages, but some graph search techniques whose characteristics and efficient implementations have been studied in the graph theory community have not yet been applied to web crawling orderings. In this paper we consider various graph search techniques, including lexicographic breadth-first search, lexicographic depth-first search, and maximum cardinality search, as well as the well-known breadth-first search and depth-first search, and then choose effective web crawling orderings that have linear time complexity and crawl popular pages early. In particular, for maximum cardinality search and lexicographic breadth-first search, whose implementations are non-trivial, we propose linear-time web crawling orderings by applying the partition refinement method. Experimental results show that maximum cardinality search has desirable properties in both time complexity and the quality of crawled pages.
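To make the idea of a maximum cardinality search ordering concrete, here is a minimal sketch under stated assumptions: links are treated as an undirected adjacency dictionary, every neighbor also appears as a key, and ties are broken by dictionary order. The function name `mcs_order` is hypothetical. This version runs in quadratic time and does not use the partition refinement method by which the paper obtains linear time.

```python
from typing import Dict, Hashable, List


def mcs_order(graph: Dict[Hashable, List[Hashable]], start: Hashable) -> List[Hashable]:
    """Visit vertices in maximum cardinality search order from `start`.

    At each step the unvisited vertex with the most already-visited
    neighbors is chosen. Only vertices reachable from `start` are visited.
    Simple O(V^2 + E) sketch, not the linear-time method of the paper.
    """
    weight = {v: 0 for v in graph}  # number of visited neighbors per vertex
    seen = set()
    order = []
    current = start
    while current is not None:
        seen.add(current)
        order.append(current)
        for neighbor in graph[current]:
            if neighbor not in seen:
                weight[neighbor] += 1
        # Candidates: unvisited vertices adjacent to at least one visited vertex.
        candidates = [v for v in graph if v not in seen and weight[v] > 0]
        # max() keeps the first maximal element, so ties follow dict order.
        current = max(candidates, key=lambda v: weight[v], default=None)
    return order


if __name__ == "__main__":
    links = {"a": ["b", "d", "c"], "b": ["a", "c"], "d": ["a"], "c": ["a", "b"]}
    print(mcs_order(links, "a"))
```

In this toy graph, page `c` is picked before `d` because it is linked to two already-visited pages, which illustrates why such an ordering may tend to reach well-connected pages early.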

A Partition Mechanism of Server Nodes for SLA in Web Server Cluster (웹 서버 클러스터에서 차별화된 서비스 제공을 위한 서버 노드의 분할 기법)

장인재;최창열;박기진;김성수
- Proceedings of the Korean Information Science Society Conference, 2003.04d, pp.554-556, 2003
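As web services have recently shifted toward business-related services such as diverse content and e-commerce, research is under way on providing not only high-capacity service but also high-quality service (QoS). Web server clusters likewise need differentiated services that provide QoS along with improved server performance. This paper proposes a technique that dynamically partitions server nodes in order to provide differentiated service for each user class.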

Study on the form of expression for Web Comics : Focused on Scroll Comics (웹 만화의 표현 양식에 관한 연구 : 스크롤 만화를 중심으로)

Kim, byong soo
- Proceedings of the Korea Contents Association Conference, 2007.11a, pp.657-660, 2007
The growth of web comics has been very noticeable in the Korean comic market since the start of the 21st century. Amid this trend, scroll comics have established themselves as one of the mainstream forms of expression outside traditional published comics, providing readers with a new visual experience. The scroll method uses a large vertical space that is incomparable to the column compartments of printed comics, and its use of animation-like techniques, innovative partitioning, flob styles, the positioning of narration partitions, the limitless canvas, and the scroll bar of the web page is leading the digital comic age. However, it is still very uncertain whether scroll comics will remain valid in the age of Web 2.0. It is a concern that, although the digital and web realms offer limitless potential, web comics seem to be bound to one particular medium, the scroll. This report analyzes the form of expression in scroll-centered web comics and, on that basis, investigates the future evolution of digital comics.

A Design of Filtering Technique on LBSNS using Spatial Join (LBSNS에서의 공간조인을 이용한 필터링 기법의 설계)

Lee, Eun-Sik;Cho, Dae-Soo
- Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2011.05a, pp.230-232, 2011
With the advent of GPS-equipped digital devices such as smartphones and tablet PCs, a number of LBSNS applications have been released, and even SNS applications now offer various location-based services. In Twitter's case, news about an area of interest is not delivered to users through automatic subscription; users must search for it on the web site. This paper describes a system designed for users who want to subscribe to local news without procedures such as searching with operators. The system uses PBSM (Partition Based Spatial-Merge Join), which requires no index, for batch processing of massive numbers of queries. The results of the spatial join are stored in a materialized view and then provided to users.

A study on Partition Allocation Techniques of iATA-based Virtual Storage (iATA 기반의 가상 스토리지 파티션 할당 기법에 관한 연구)

Park, Sungjin;Chun, Jooyoung;Lim, Hyotaek
- Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2009.05a, pp.68-71, 2009
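iATA (Internet Advanced Technology Attachment) is a technology that lets a device use large remote virtual storage over a TCP/IP network as if it were a local disk. Applied to devices that lack storage space, such as mobile devices (PDAs, mobile phones, and the like), it can fundamentally solve the storage shortage problem. Unlike iSCSI, which requires the server to be built on a system equipped with SCSI hard disks, iATA can also be served from the ATA hard disks commonly used in homes and offices; this scalability is one of iATA's greatest advantages. In addition, many people can use the iATA service through a web site to overcome the storage limits of their mobile devices. To prevent the harm of personal information leakage, which has recently emerged as a major social problem, security and encryption techniques based on OpenSSL and MD5 are used for personal authentication, which can prevent the disadvantages caused by leaks and crimes that abuse personal information. However, in the existing iATA, disk management is possible only on the server; that is, users cannot have private space of their own. To solve the problem that personal information such as photos and diaries cannot be managed, this work aims to develop a technique that allocates a partition so that each client has its own disk space and makes that space exclusive to the client.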

Design and Algorithm Implementation of a Distributed Information Retrieval System using Sequential Transferring Method(STM) (순차적 전달방식(STM)을 이용한 분산정보검색시스템의 설계 및 알고리즘 구현)

Yoon, Hee-Byung;Kim, Yong-Han;Kim, Hwa-Soo
- The KIPS Transactions:PartB, v.11B no.5, pp.603-610, 2004
A distributed information retrieval system that is centrally controlled by a mediator or meta search engine suffers from heavy traffic congestion and from increased cost, owing to the complicated algorithms needed for central control and the hardware that must be installed. To solve this problem, an approach is needed in which retrieval systems have independent retrieval functionality and can cooperate without depending on one another. In this paper, we review related work on distributed information retrieval systems, then design a framework and implement an algorithm for a distributed information retrieval system using the sequential transferring method (STM), comprising multiple information retrieval systems free of central control. To this end, we first present a web partition policy that divides and manages the web logically, and illustrate sequential query processing across numbered information retrieval systems. We then present a three-layered framework structure and the functions and modules of each layer suited to an information retrieval system. Finally, for an effective implementation of the STM algorithm, we analyze the module structure, describe it in pseudocode, and show that the proposed STM algorithm works smoothly by demonstrating the sequential query transfer process between servers.
https://doi.org/10.3745/KIPSTB.2004.11B.5.603

Recommendation System using Associative Web Document Classification by Word Frequency and α-Cut (단어 빈도와 α-cut에 의한 연관 웹문서 분류를 이용한 추천 시스템)

Jung, Kyung-Yong;Ha, Won-Shik
- The Journal of the Korea Contents Association, v.8 no.1, pp.282-289, 2008
Although there have been some technological developments in improving collaborative filtering, they have yet to fully reflect the actual relations among items. In this paper, we propose a recommendation system using associative web document classification by word frequency and α-cut to address the shortcomings of collaborative filtering. The proposed method extracts words from web documents through morpheme analysis and accumulates term frequency weights. It generates association rules using the Apriori algorithm and applies the term frequency weights to their confidence. It then calculates the similarity among the words using hypergraph partitioning. Lastly, it classifies related web documents using α-cut and calculates similarity using adjusted cosine similarity. The results show that the proposed method significantly outperforms the existing methods.
https://doi.org/10.5392/JKCA.2008.8.1.282
