fcs.server.url_processor
This module contains a class for processing and unifying URLs.
- class URLProcessor
Processes and corrects URLs retrieved from a web site and provides other methods that operate on web addresses (these methods are used, for example, by the crawl depth policy classes).
- static validate(link, domain=None)
Validates and unifies link.
Parameters: Returns: Validated link.
Return type: string
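The exact rules applied by validate are not listed here, so the following is only a minimal sketch of what "validating and unifying" a link could look like. The function name validate_sketch, the choice to resolve relative links against domain, the ValueError on unsupported links, and the removal of fragments are all assumptions made for illustration, not the behaviour of URLProcessor.validate.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def validate_sketch(link, domain=None):
    """Hypothetical stand-in for URLProcessor.validate (assumed behaviour)."""
    link = link.strip()
    # Assumption: `domain` is an absolute URL used to resolve relative links.
    if domain is not None:
        link = urljoin(domain, link)
    parts = urlsplit(link)  # urlsplit already lower-cases the scheme
    # Assumption: only absolute http(s) links count as valid.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("not a valid http(s) link: %r" % link)
    # Unify: lower-case host, non-empty path, fragment dropped.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", parts.query, "")
    )
```

With these assumptions, a relative link such as /a/b?x=1#top combined with the domain http://Example.com is unified to http://example.com/a/b?x=1.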
- static identical_hosts(link_a, link_b)
Compares link_a’s and link_b’s hosts.
Parameters: Returns: Whether the hosts of the two links are identical.
Return type: bool
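A host comparison of this kind can be sketched with the standard library. The helper name identical_hosts_sketch is hypothetical, and the comparison rules (case-insensitive, no special handling of a www. prefix, links without a host never match) are assumptions rather than the documented behaviour of identical_hosts.

```python
from urllib.parse import urlsplit


def identical_hosts_sketch(link_a, link_b):
    """Hypothetical stand-in for URLProcessor.identical_hosts."""
    # .hostname is lower-cased and excludes port and credentials.
    host_a = urlsplit(link_a).hostname
    host_b = urlsplit(link_b).hostname
    # Assumption: a link without a host never matches anything.
    return host_a is not None and host_a == host_b
```

For example, http://example.com/a and http://EXAMPLE.com/b compare as identical under this sketch, while http://example.com and http://example.org do not.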
- static generate_url_hierarchy(link)
Returns a list of all URLs that are component parts of the given link. Such URLs can be generated by trimming the link. For example, if the value of link is http://www.allegro.pl/country_pages/1/0/z9.php, the method returns the following list: ['http://allegro.pl', 'http://allegro.pl/country_pages', 'http://allegro.pl/country_pages/1', 'http://allegro.pl/country_pages/1/0'].
Parameters: link (string) – Link from which the resulting list is generated. Returns: All URLs generated by trimming the link.
Return type: list
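The trimming described for generate_url_hierarchy can be reproduced with a short standalone function. This is a sketch inferred from the single documented example only: the name url_hierarchy_sketch is hypothetical, and dropping a leading www. and the final path segment are assumptions read off that example, not confirmed rules.

```python
from urllib.parse import urlsplit


def url_hierarchy_sketch(link):
    """Hypothetical stand-in for URLProcessor.generate_url_hierarchy."""
    parts = urlsplit(link)
    host = parts.netloc
    # Assumption: the documented example drops a leading "www.".
    if host.startswith("www."):
        host = host[len("www."):]
    base = "%s://%s" % (parts.scheme, host)
    segments = [s for s in parts.path.split("/") if s]
    hierarchy = [base]
    current = base
    # Assumption: the last segment (the page itself) is not included.
    for segment in segments[:-1]:
        current = "%s/%s" % (current, segment)
        hierarchy.append(current)
    return hierarchy


print(url_hierarchy_sketch("http://www.allegro.pl/country_pages/1/0/z9.php"))
```

Under these assumptions the call prints the same four URLs as the documented example, ordered from the site root down to the deepest parent directory.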