| Class Summary | |
|---|---|
| Crawl | |
| CrawlDatum | |
| CrawlDatum.Comparator | A Comparator optimized for CrawlDatum. |
| CrawlDb | This class takes the output of the fetcher and updates the crawldb accordingly. |
| CrawlDbFilter | This class provides a way to separate the URL normalization and filtering steps from the rest of CrawlDb manipulation code. |
| CrawlDbMerger | This tool merges several CrawlDb-s into one, optionally filtering URLs through the current URLFilters, to skip prohibited pages. |
| CrawlDbMerger.Merger | |
| CrawlDbReader | Read utility for the CrawlDb. |
| CrawlDbReader.CrawlDbDumpReducer | |
| CrawlDbReader.CrawlDbStatCombiner | |
| CrawlDbReader.CrawlDbStatMapper | |
| CrawlDbReader.CrawlDbStatReducer | |
| CrawlDbReader.CrawlDbTopNMapper | |
| CrawlDbReader.CrawlDbTopNReducer | |
| CrawlDbReducer | Merge new page entries with existing entries. |
| Generator | Generates a subset of a crawl db to fetch. |
| Generator.CrawlDbUpdater | Update the CrawlDb so that the next generate won't include the same URLs. |
| Generator.DecreasingFloatComparator | |
| Generator.HashComparator | Sort fetch lists by hash of URL. |
| Generator.Selector | Selects entries due for fetch. |
| Generator.SelectorEntry | |
| Generator.SelectorInverseMapper | |
| Injector | This class takes a flat file of URLs and adds them to the list of pages to be crawled. |
| Injector.InjectMapper | Normalize and filter injected urls. |
| Injector.InjectReducer | Combine multiple new entries for a url. |
| Inlink | |
| Inlinks | A list of Inlinks. |
| LinkDb | Maintains an inverted link map, listing incoming links for each url. |
| LinkDb.Merger | |
| LinkDbFilter | This class provides a way to separate the URL normalization and filtering steps from the rest of LinkDb manipulation code. |
| LinkDbMerger | This tool merges several LinkDb-s into one, optionally filtering URLs through the current URLFilters, to skip prohibited URLs and links. |
| LinkDbReader | |
| MapWritable | A writable map with behavior similar to java.util.HashMap. |
| MD5Signature | Default implementation of a page signature. |
| PartitionUrlByHost | Partition urls by hostname. |
| Signature | |
| SignatureComparator | |
| SignatureFactory | Factory class that instantiates a Signature implementation according to the current Configuration. |
| TextProfileSignature | An implementation of a page signature. |
Crawl control code.
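
Two entries in the summary above name a concrete technique: PartitionUrlByHost ("Partition urls by hostname") and MD5Signature ("Default implementation of a page signature"). The following is a minimal standalone sketch of both ideas using only the JDK. It is not the code of the classes listed here; the class name `CrawlSketch`, the method names `partitionForHost` and `md5Signature`, and the choice to hash the lower-cased hostname are all assumptions made for illustration.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical illustration class; not part of the package documented above.
final class CrawlSketch {

    // Assumed approach: an MD5 digest over the raw page bytes serves as the
    // page signature, so identical content yields an identical signature.
    public static byte[] md5Signature(byte[] content) {
        try {
            return MessageDigest.getInstance("MD5").digest(content);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Assumed approach: hash the lower-cased hostname so that every URL from
    // one host lands in the same partition. Falls back to the whole URL
    // string when no host can be parsed.
    public static int partitionForHost(String url, int numPartitions) {
        String host = URI.create(url).getHost();
        String key = (host != null) ? host.toLowerCase(Locale.ROOT) : url;
        // Mask the sign bit so the modulo result is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int a = partitionForHost("http://example.com/a", 8);
        int b = partitionForHost("http://EXAMPLE.com/b?x=1", 8);
        System.out.println("same partition for same host: " + (a == b));

        byte[] s1 = md5Signature("hello".getBytes(StandardCharsets.UTF_8));
        byte[] s2 = md5Signature("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("same signature for same content: " + Arrays.equals(s1, s2));
    }
}
```

Partitioning by host keeps all URLs of one site together, which is one way a fetcher can apply per-host politeness limits within a single task. A content digest lets a crawler detect that a page is unchanged or duplicated without comparing full bodies.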