fcs.server.crawling_depth_policy
This module contains the policies used to compute crawling depth.
- class BaseCrawlingDepthPolicy
This is a base class for crawling depth policy implementations.
- static calculate_depth()
Returns crawling depth.
Returns: Crawling depth. Return type: int
- class IgnoreDepthPolicy
Implementation that ignores depth. Calculated depth is always 0.
- static calculate_depth()
Always returns 0.
Returns: Crawling depth (0). Return type: int
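To illustrate how a base policy and the depth-ignoring policy relate, here is a minimal sketch. The class names `DepthPolicySketch` and `IgnoreDepthSketch` are hypothetical stand-ins, not the module's own code. The shared keyword signature and the `NotImplementedError` in the base class are assumptions.

```python
class DepthPolicySketch:
    """Hypothetical stand-in for a base depth policy."""

    @staticmethod
    def calculate_depth(link=None, source_url=None, depth=None):
        # Assumption: the base class leaves the computation to subclasses.
        raise NotImplementedError("subclasses compute the depth")


class IgnoreDepthSketch(DepthPolicySketch):
    """Hypothetical stand-in for a policy that ignores depth."""

    @staticmethod
    def calculate_depth(link=None, source_url=None, depth=None):
        # Depth is ignored, so every link gets depth 0.
        return 0
```

Giving every policy the same keyword signature would let a crawler swap one policy for another without changing the call site.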
- class SimpleCrawlingDepthPolicy
Depth is computed according to the following rules, where * marks a new domain. Any change of domain resets the depth to 0, including a return to a domain visited earlier.
- A.com -> *B.com => depth_2 = 0
- A.com -> A.com/aaa/ => depth_2 = depth_1 + 1
- A.com -> *B.com -> A.com/aaa/ => depth_1 = x, depth_2 = 0, depth_3 = 0
- static calculate_depth(link=None, source_url=None, depth=None)
Returns: Crawling depth. Return type: int
Raises: ValueError if any URL is invalid.
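The rules for the simple policy can be sketched as follows. This is not the module's implementation. `simple_depth` and `_host` are hypothetical helpers, and comparing URLs by the hostname that `urllib.parse.urlparse` reports is an assumption (the real policy may define "domain" differently, for example by treating `www.a.com` and `a.com` as the same).

```python
from urllib.parse import urlparse


def _host(url):
    """Return the hostname of url, or raise ValueError if it has none."""
    host = urlparse(url).hostname
    if not host:
        raise ValueError("invalid URL: %r" % (url,))
    return host


def simple_depth(link, source_url, depth):
    """Hypothetical sketch: depth of `link` when found on `source_url`."""
    if _host(link) != _host(source_url):
        # Any change of domain resets the depth.
        return 0
    # Same domain: one level deeper than the source page.
    return depth + 1


# Chain from the rules: A.com -> B.com -> A.com/aaa/ with depth_1 = x
x = 5
depth_2 = simple_depth("http://b.com/", "http://a.com/", x)
depth_3 = simple_depth("http://a.com/aaa/", "http://b.com/", depth_2)
```

Because the sketch only looks at the immediate source page, it has no memory of earlier domains, which is why `depth_3` ends up at 0 rather than `x + 1`.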
- class RealDepthCrawlingDepthPolicy
Depth is computed according to the following rules, where * marks a new domain. A new domain starts at depth 0, but returning to a domain visited earlier continues from that domain's previous depth.
- A.com -> *B.com => depth_2 = 0
- A.com -> A.com/aaa/ => depth_2 = depth_1 + 1
- A.com -> *B.com -> A.com/aaa/ => depth_1 = x, depth_2 = 0, depth_3 = x + 1
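One way to reproduce the real-depth rules is to remember the last depth reached on each domain along a crawl path. The sketch below does that under stated assumptions: `real_depths` is a hypothetical helper that takes the whole chain of URLs, whereas the signature and bookkeeping of the actual policy are not shown in this reference. Domains are again compared by the hostname from `urllib.parse.urlparse`.

```python
from urllib.parse import urlparse


def real_depths(path, start_depth=0):
    """Hypothetical sketch: depths for each URL in a crawl chain."""
    last_depth = {}  # hostname -> last depth reached on that host
    depths = []
    for index, url in enumerate(path):
        host = urlparse(url).hostname
        if not host:
            raise ValueError("invalid URL: %r" % (url,))
        if index == 0:
            # The first page keeps the depth it was given (depth_1 = x).
            depth = start_depth
        elif host in last_depth:
            # Known domain: continue one level below its last depth.
            depth = last_depth[host] + 1
        else:
            # New domain: start again from 0.
            depth = 0
        last_depth[host] = depth
        depths.append(depth)
    return depths


# Chain from the rules: A.com -> B.com -> A.com/aaa/ with depth_1 = x = 3
chain = ["http://a.com/", "http://b.com/", "http://a.com/aaa/"]
result = real_depths(chain, start_depth=3)
```

With `start_depth=3`, the chain yields depths 3, 0 and 4, matching `depth_1 = x`, `depth_2 = 0`, `depth_3 = x + 1`.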