fcs.server.content_db¶
This module contains the API for connecting to the database that stores crawled content.
- class BerkeleyContentDB(base_name)¶
API for Berkeley DB (http://www.oracle.com/database/berkeley-db/). It uses an interface to the Berkeley DB library provided by the bsddb module.
Parameters: base_name (string) – Name of the database
- content_db¶
Object to access Berkeley DB.
- id_iter¶
Number of database records.
- get_data_iter¶
Number of records retrieved from database.
- parts_iter¶
Number of content data packages (files with crawled data) requested by user.
- add_content(url, links, content)¶
Adds crawled content to the database.
Parameters:
- get_file_with_data_package(size)¶
Returns the path to a file containing crawled data of the given size.
Parameters: size (int) – Size of the requested data in MB.
Returns: Path to the file with crawled data.
Return type: string
- size()¶
Returns the number of elements (i.e. crawled content entries) currently in the database. A record that has been retrieved via the web application or the API is no longer available and is not counted.
Returns: Number of elements in the database.
Return type: int
- added_records_num()¶
Returns the number of entries for sites crawled since crawling began, including entries whose data is no longer available.
Returns: Number of entries added for crawled sites.
Return type: int
- clear()¶
Clears the content of the database and closes it.
- show()¶
Prints entries in database.
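The difference between size() and added_records_num() is easiest to see in a small run. The sketch below is not the real BerkeleyContentDB: it is a hypothetical in-memory stand-in, named InMemoryContentDB here, that copies the documented method and attribute names and replaces Berkeley DB with a plain dict. The JSON-lines file format, the temporary file naming, and the rule that size() equals id_iter minus get_data_iter are all assumptions made for illustration only.

```python
import json
import os
import tempfile


class InMemoryContentDB:
    """Hypothetical stand-in mimicking the documented interface (not the real class)."""

    def __init__(self, base_name):
        self.base_name = base_name
        self.content_db = {}    # plain dict in place of the Berkeley DB handle
        self.id_iter = 0        # records added so far
        self.get_data_iter = 0  # records already handed out
        self.parts_iter = 0     # data packages requested so far

    def add_content(self, url, links, content):
        # Store one crawled page under the next sequential key.
        self.content_db[self.id_iter] = {"url": url, "links": links, "content": content}
        self.id_iter += 1

    def get_file_with_data_package(self, size):
        # Assumption: move pending records into a file until about `size` MB is written.
        limit = size * 1024 * 1024
        written = 0
        prefix = "%s_%d_" % (self.base_name, self.parts_iter)
        fd, path = tempfile.mkstemp(prefix=prefix, suffix=".json")
        with os.fdopen(fd, "w") as out:
            while self.get_data_iter < self.id_iter and written < limit:
                record = self.content_db.pop(self.get_data_iter)
                line = json.dumps(record) + "\n"
                out.write(line)
                written += len(line)
                self.get_data_iter += 1
        self.parts_iter += 1
        return path

    def size(self):
        # Retrieved records are gone, so only the remainder is counted.
        return self.id_iter - self.get_data_iter

    def added_records_num(self):
        # Counts every record ever added, including retrieved ones.
        return self.id_iter


db = InMemoryContentDB("demo")
db.add_content("http://example.com/", ["http://example.com/a"], "<html>home</html>")
db.add_content("http://example.com/a", [], "<html>a</html>")
path = db.get_file_with_data_package(1)  # request up to 1 MB of crawled data
print(db.size(), db.added_records_num())
os.remove(path)
```

In this sketch both records fit into the 1 MB package, so after the call size() drops to zero while added_records_num() still reports two.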