Integrate with Kingfisher Process

Kingfisher Collect integrates optionally with Kingfisher Process, via the KingfisherProcessAPI2 extension.

After deploying and starting an instance of Kingfisher Process, set the following either as environment variables or as Scrapy settings in kingfisher_scrapy/settings.py:

KINGFISHER_API2_URL

The URL of Kingfisher Process’ web API, for example: http://user:pass@localhost:8000

RABBIT_URL

The URL of the RabbitMQ message broker, for example: amqp://user:pass@localhost:5672

RABBIT_EXCHANGE_NAME

The name of the exchange in RabbitMQ, for example: kingfisher_process_development

RABBIT_ROUTING_KEY

The routing key for messages sent to RabbitMQ, equal to the exchange name with an _api suffix, for example: kingfisher_process_development_api
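The four settings above can be set as environment variables in the shell before running a crawl. This is a minimal sketch: the hostnames, credentials, and spider name are placeholders, and the routing key follows the exchange-name-plus-_api rule described above.

```shell
# Placeholder values: substitute your own host, credentials and exchange name.
export KINGFISHER_API2_URL='http://user:pass@localhost:8000'
export RABBIT_URL='amqp://user:pass@localhost:5672'
export RABBIT_EXCHANGE_NAME='kingfisher_process_development'
# The routing key is the exchange name with an _api suffix.
export RABBIT_ROUTING_KEY="${RABBIT_EXCHANGE_NAME}_api"

scrapy crawl spider_name
```

Setting these in the environment avoids committing credentials to kingfisher_scrapy/settings.py.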

Add a note to the collection

Add a note to the collection_note table in Kingfisher Process. For example, to track provenance:

scrapy crawl spider_name -a note='Started by NAME.'

Select which processing steps to run

Kingfisher Process stores OCDS data, and upgrades it if the spider sets a class attribute of ocds_version = '1.0'. It can also perform the optional steps below, selected with the steps spider argument.

Run structural checks and create compiled releases
scrapy crawl spider_name -a steps=check,compile
Run structural checks only
scrapy crawl spider_name -a steps=check
Create compiled releases only
scrapy crawl spider_name -a steps=compile
Do neither
scrapy crawl spider_name -a steps=
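The spider arguments above can be combined in a single crawl. A sketch, with spider_name and the note text as placeholders:

```shell
# Run structural checks and create compiled releases, and record who started the crawl.
scrapy crawl spider_name -a note='Started by NAME.' -a steps=check,compile
```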