HarvestMan - The HarvestMan Web Crawler

Projects using HarvestMan


PathCrawler

HarvestMan is used by the Group of Measurements at the Federal University of Espirito Santo, Brazil, to implement a web crawler named PathCrawler. PathCrawler performs network analysis and mines path capacity and minimum delays from a point on the network to a set of web servers. The group has published a paper that has been accepted at NOMS 2008, to be held in Brazil in April 2008.

EIAO - European Internet Accessibility Observatory

The European Internet Accessibility Observatory is a consortium of projects that will assess the accessibility of European web sites and participate in a cluster developing a European Accessibility Methodology. The assessment will be based on the Web Content Accessibility Guidelines (WCAG) developed by the W3C. The project is carried out in co-operation among 10 partners in a consortium co-ordinated by Agder University College, Norway.

HarvestMan is used in EIAO in ROBACC (ROBot for ACCessibility assessment), the crawler component of the project, to download files from European websites. The downloaded files are saved to a data warehouse and assessed against accessibility measurements.

HarvestMan at San Diego State University

HarvestMan is installed in the Computational Linguistics Lab, part of the Linguistics Department at San Diego State University. HarvestMan has its own wiki in this lab.

Ingeniweb ExternalSiteCatalog

Ingeniweb has a Plone extension named ExternalSiteCatalog, which extends HarvestMan to provide a crawler and indexer for cataloging external websites in Plone.