Efficient web searching using temporal factors

Artur Czumaj, Ian Finch, Leszek Gąsieniec, Alan Gibbons, Paul Leng, Wojciech Rytter, Michele Zito

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Web traversal robots are used to gather information periodically from large numbers of documents distributed throughout the Web. In this paper we study the issues involved in the design of algorithms for performing information gathering of this kind more efficiently, by taking advantage of anticipated variations in access times in different regions at different times of the day or week. We report and comment on a number of experiments showing a complex pattern in the access times as a function of the time of day. We look at the problem theoretically, as a generalisation of single-processor sequencing with release times and deadlines, in which the performance times (lengths) of the tasks can change over time. The new problem is called the Variable Length Sequencing Problem (VLSP). We show that although the decision version of VLSP seems to be intractable in the general case, it can be solved optimally for lengths 1 and 2. This result opens the possibility of practicable algorithms to schedule searches efficiently when expected access times can be categorised as either slow or fast. Some algorithms for more general cases are examined and complexity results derived.
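To make the idea of time-varying task lengths concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes discrete time steps, tasks run back to back from time 0 with no idling, and a hypothetical table `length[task][t]` giving the duration of a task started at time `t` (1 for "fast", 2 for "slow"). Release times and deadlines are omitted, and the search is a brute force over all orders, so it is only practical for very small instances.

```python
from itertools import permutations


def completion_time(order, length):
    """Finish time when tasks run back to back in `order`, starting at time 0.

    length[task][t] is the assumed duration of `task` when started at time t.
    """
    t = 0
    for task in order:
        t += length[task][t]
    return t


def best_order(length):
    """Brute-force search for an order with the smallest completion time."""
    tasks = range(len(length))
    return min(permutations(tasks), key=lambda order: completion_time(order, length))


# Hypothetical instance: three sites, each "fast" (1) only in certain time slots.
length = [
    [1, 2, 2, 2, 2, 2],  # site 0 is fast only at time 0
    [2, 1, 1, 2, 2, 2],  # site 1 is fast at times 1 and 2
    [2, 2, 1, 1, 1, 1],  # site 2 is fast from time 2 onwards
]

order = best_order(length)
print(order, completion_time(order, length))
```

In this instance, visiting the sites in an order that matches their fast slots finishes in 3 time units, while the reverse order takes 5, which illustrates why the visiting order matters once access times depend on the time of day.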

Original language: English (US)
Pages (from-to): 569-582
Number of pages: 14
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 262
Issue number: 1-2
State: Published - Aug 2 2001
Externally published: Yes
Event: 6th International Workshop on Algorithms and Data Structures, WADS 1999 - Vancouver, Canada
Duration: Aug 11 1999 to Aug 14 1999

Keywords

  • Approximation
  • Matching
  • Random
  • Sequencing
  • Web searching

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
