Kingfisher Process API
- class kingfisher_scrapy.extensions.kingfisher_process_api2.KingfisherProcessAPI2(url, stats, rabbit_url, rabbit_exchange_name, rabbit_routing_key)
If the KINGFISHER_API2_URL, RABBIT_URL, RABBIT_EXCHANGE_NAME and RABBIT_ROUTING_KEY environment variables or configuration settings are set, then OCDS data is stored in Kingfisher Process, incrementally.
When the spider is opened, a collection is created in Kingfisher Process via its web API. The API also receives the note and steps spider arguments (if set) and the spider’s ocds_version class attribute.
When an item is scraped, a message is published to the exchange for Kingfisher Process in RabbitMQ, with the path to the file written by the FilesStore extension.
When the spider is closed, the collection is closed in Kingfisher Process via its web API, unless the keep_collection_open spider argument was set to 'true'. The API also receives the crawl statistics and the reason why the spider was closed.
Note
If the DATABASE_URL environment variable or configuration setting is set, this extension is disabled and the DatabaseStore extension is enabled.
Note
This extension ignores items generated by the pluck command.
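The enablement rules above can be summarized as a small check. This is a minimal sketch, not the extension's own code: the helper name kingfisher_process_enabled and the plain-dict settings are hypothetical, and it assumes that all four settings must be non-empty and that DATABASE_URL takes precedence, as the notes describe.

```python
# Hypothetical helper, not part of kingfisher_scrapy: it only mirrors the
# enablement rules described in the text above.
REQUIRED_SETTINGS = (
    "KINGFISHER_API2_URL",
    "RABBIT_URL",
    "RABBIT_EXCHANGE_NAME",
    "RABBIT_ROUTING_KEY",
)


def kingfisher_process_enabled(settings):
    """Return True if the documented conditions for this extension hold."""
    # If DATABASE_URL is set, this extension is disabled in favor of DatabaseStore.
    if settings.get("DATABASE_URL"):
        return False
    # Assumption: every one of the four settings must be set (non-empty).
    return all(settings.get(name) for name in REQUIRED_SETTINGS)


# Placeholder values, for illustration only.
example_settings = {
    "KINGFISHER_API2_URL": "http://localhost:8000",
    "RABBIT_URL": "amqp://localhost:5672",
    "RABBIT_EXCHANGE_NAME": "example_exchange",
    "RABBIT_ROUTING_KEY": "example_key",
}
enabled = kingfisher_process_enabled(example_settings)
```

With the placeholder settings above, the check passes. Adding a DATABASE_URL entry to the same dict makes it fail.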
- spider_closed(spider, reason)
Send an API request to close the collection in Kingfisher Process.
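The lifecycle described for this extension (create a collection on open, publish a message per item, close the collection on close) can be sketched without any network access. Everything here is illustrative: the LifecycleSketch class, the payload keys and the item's "path" key are assumptions, and the recorded tuples stand in for the real web API requests and RabbitMQ messages.

```python
from types import SimpleNamespace


class LifecycleSketch:
    """Hypothetical stand-in that records what would be sent, and when."""

    def __init__(self, stats):
        self.stats = stats  # crawl statistics, here a plain dict
        self.calls = []     # (action, payload) tuples instead of real requests

    def spider_opened(self, spider):
        # The spider's ocds_version class attribute is always sent.
        payload = {"ocds_version": spider.ocds_version}
        # The note and steps spider arguments are sent only if set.
        for name in ("note", "steps"):
            value = getattr(spider, name, None)
            if value:
                payload[name] = value
        self.calls.append(("create_collection", payload))

    def item_scraped(self, item, spider):
        # Assumption: the item exposes the path of the file written by FilesStore.
        self.calls.append(("publish_message", {"path": item["path"]}))

    def spider_closed(self, spider, reason):
        # The collection stays open if keep_collection_open was set to 'true'.
        if getattr(spider, "keep_collection_open", None) == "true":
            return
        payload = {"reason": reason, "stats": dict(self.stats)}
        self.calls.append(("close_collection", payload))


# Illustrative run with a fake spider and a single fake item.
spider = SimpleNamespace(ocds_version="1.1", note="example run")
sketch = LifecycleSketch(stats={"item_scraped_count": 1})
sketch.spider_opened(spider)
sketch.item_scraped({"path": "example/file.json"}, spider)
sketch.spider_closed(spider, "finished")
```

After this run, sketch.calls holds one entry per lifecycle step, in order. Setting keep_collection_open='true' on the fake spider would leave out the final entry.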