Apache Nutch System
An Apache Nutch System is a Lucene-based web search software system that is an Apache project.
- Context:
- It is written in the Java Programming Language.
- See: Web Crawler System, Apache Project, Apache Lucene.
References
2012
- http://en.wikipedia.org/wiki/Nutch
- Nutch is an effort to build an open source web search engine based on Lucene and Java for the search and index component.
- http://en.wikipedia.org/wiki/Nutch#Features
- QUOTE: Nutch is coded entirely in the Java programming language, but data is written in language-independent formats. It has a highly modular architecture, allowing developers to create plug-ins for media-type parsing, data retrieval, querying and clustering.
The fetcher ("robot" or "web crawler") has been written from scratch specifically for this project.
2011
- http://nutch.apache.org/
- Apache Nutch is an open source web-search software project. Nutch is a project of the Apache Software Foundation and is part of the larger Apache community of developers and users.