Kingfisher Process API
- class kingfisher_scrapy.extensions.kingfisher_process_api2.Client(**kwargs)
- class kingfisher_scrapy.extensions.kingfisher_process_api2.KingfisherProcessAPI2(url, stats, rabbit_url, rabbit_exchange_name, rabbit_routing_key)
If the KINGFISHER_API2_URL, RABBIT_URL, RABBIT_EXCHANGE_NAME and RABBIT_ROUTING_KEY environment variables or configuration settings are set, then OCDS data is stored in Kingfisher Process, incrementally.

When the spider is opened, a collection is created in Kingfisher Process via its web API. The API also receives the note and steps spider arguments (if set) and the spider’s ocds_version class attribute.

When an item is scraped, a message is published to the exchange for Kingfisher Process in RabbitMQ, with the path to the file written by the FilesStore extension.

When the spider is closed, the collection is closed in Kingfisher Process via its web API, unless the keep_collection_open spider argument was set to 'true'. The API also receives the crawl statistics and the reason why the spider was closed.

Note
If the DATABASE_URL environment variable or configuration setting is set, this extension is disabled and the DatabaseStore extension is enabled.

Note
This extension ignores items generated by the pluck command.
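
The activation rule described above (all four settings present, and DATABASE_URL absent) can be sketched as a small predicate. This is an illustration only, not the extension's actual code: the names is_enabled and REQUIRED_SETTINGS are hypothetical, and treating an empty value as "not set" is an assumption.

```python
# Hypothetical sketch of the activation rule; not the extension's real code.
REQUIRED_SETTINGS = (
    "KINGFISHER_API2_URL",
    "RABBIT_URL",
    "RABBIT_EXCHANGE_NAME",
    "RABBIT_ROUTING_KEY",
)


def is_enabled(settings):
    """Return True if this extension would be active for the given settings mapping."""
    # DATABASE_URL takes precedence: the DatabaseStore extension is used instead.
    if settings.get("DATABASE_URL"):
        return False
    # Assumption: a setting counts as "set" only if it has a non-empty value.
    return all(settings.get(name) for name in REQUIRED_SETTINGS)
```

For example, a mapping with all four settings returns True, and adding a DATABASE_URL entry to the same mapping returns False.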
- spider_closed(spider, reason)
Sends an API request to close the collection in Kingfisher Process.
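
The open, scrape and close sequence described for this extension can be sketched with a stand-in backend that only records calls, so that no web API or RabbitMQ server is needed. Everything here is an assumption made for illustration: the class names RecordingBackend and LifecycleSketch, the method names on the backend, and the payload keys (collection_id, path, reason, stats) are hypothetical and do not reflect the real request or message formats.

```python
class RecordingBackend:
    """Hypothetical stand-in for the web API and RabbitMQ; it only records calls."""

    def __init__(self):
        self.calls = []

    def create_collection(self, payload):
        self.calls.append(("create_collection", payload))
        return 1  # made-up collection ID

    def publish(self, message):
        self.calls.append(("publish", message))

    def close_collection(self, payload):
        self.calls.append(("close_collection", payload))


class LifecycleSketch:
    """Hypothetical outline of the three steps; payload keys are assumptions."""

    def __init__(self, backend):
        self.backend = backend
        self.collection_id = None

    def spider_opened(self, spider_args, ocds_version):
        # Create the collection, passing note and steps only if they were set.
        payload = {"ocds_version": ocds_version}
        for name in ("note", "steps"):
            if name in spider_args:
                payload[name] = spider_args[name]
        self.collection_id = self.backend.create_collection(payload)

    def item_scraped(self, path):
        # Publish one message per item, carrying the path of the written file.
        self.backend.publish({"collection_id": self.collection_id, "path": path})

    def spider_closed(self, spider_args, reason, stats):
        # Leave the collection open if keep_collection_open was set to 'true'.
        if spider_args.get("keep_collection_open") == "true":
            return
        self.backend.close_collection(
            {"collection_id": self.collection_id, "reason": reason, "stats": stats}
        )
```

Calling spider_opened, item_scraped and spider_closed in turn records one create_collection call, one publish call and one close_collection call; with keep_collection_open set to 'true', the last call is skipped.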