Focused Crawling Algorithm

Context:
- It can be applied by a Focused Crawling System.
See: Web Crawling Algorithm, Web Crawling System, Reinforcement Learning, DOM Tree, Breadth-First.

References

(Wikipedia - "Focused Crawler", 2011-June-10) ⇒ http://en.wikipedia.org/wiki/Focused_crawler
- QUOTE: A focused crawler or topical crawler is a web crawler that attempts to download only web pages that are relevant to a pre-defined topic or set of topics. Topical crawling generally assumes that only the topic is given, while focused crawling also assumes that some labeled examples of relevant and not relevant pages are available. Topical crawling was first introduced by Menczer [1][2].

[1] Menczer, F. (1997). ARACHNID: Adaptive Retrieval Agents Choosing Heuristic Neighborhoods for Information Discovery. In D. Fisher, ed., Proceedings of the 14th International Conference on Machine Learning (ICML97). Morgan Kaufmann.
[2] Menczer, F. and Belew, R.K. (1998). Adaptive Information Agents in Distributed Textual Environments. In K. Sycara and M. Wooldridge (eds.) Proceedings of the 2nd International Conference on Autonomous Agents (Agents '98). ACM Press.
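
The behavior described in the quote above (download only pages relevant to a given topic) can be illustrated with a best-first crawl over a priority queue. The following is a minimal sketch, not the method of any cited paper. It covers the topical case, where only the topic is given, and it rests on several assumptions: the in-memory `TOY_WEB` graph, the `fetch` callback, the keyword-overlap `relevance` score, and the `max_pages` budget are all hypothetical names and choices introduced for illustration. A focused crawler in the quote's sense would likely replace `relevance` with a classifier trained on labeled relevant and not relevant pages.

```python
import heapq

# Hypothetical in-memory "web" (an assumption, no network access):
# URL -> (page text, outgoing links).
TOY_WEB = {
    "a": ("web crawler index search", ["b", "c"]),
    "b": ("focused crawler topic relevance", ["d"]),
    "c": ("cooking recipes pasta", ["e"]),
    "d": ("topical crawler relevance ranking", []),
    "e": ("pasta sauce tomato", []),
}


def relevance(text, topic_terms):
    """Fraction of topic terms that occur in the page text (a stand-in score)."""
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def focused_crawl(seeds, topic_terms, fetch, max_pages=10):
    """Best-first crawl: always fetch the link whose parent page scored highest.

    Returns the fetched pages as (url, score) pairs, in fetch order.
    """
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    crawled = []
    while frontier and len(crawled) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        score = relevance(text, topic_terms)
        crawled.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the relevance of the page it was found on.
                heapq.heappush(frontier, (-score, link))
    return crawled


topic = {"crawler", "relevance", "topic"}
crawled = focused_crawl(["a"], topic, TOY_WEB.__getitem__, max_pages=3)
print([url for url, _ in crawled])  # ['a', 'b', 'd']
```

With a budget of three pages, the crawl spends all of its fetches on the topical branch (a, b, d) and never downloads the off-topic pages c and e. A breadth-first crawl from the same seed would have fetched c before d.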

(Diligenti et al., 2000) ⇒ Michelangelo Diligenti, Frans Coetzee, Steve Lawrence, C. Lee Giles, and Marco Gori. (2000). “Focused Crawling Using Context Graphs.” In: Proceedings of the 26th International Conference on Very Large Data Bases (VLDB 2000).
- CITED BY ~484 http://scholar.google.com/scholar?q=%22Focused+crawling+using+context+graphs%22+2000