Spider Middlewares

class kingfisher_scrapy.spidermiddlewares.ConcatenatedJSONMiddleware

If the spider’s concatenated_json class attribute is True, yields each object of the File as a FileItem. Otherwise, yields the original item.

process_spider_output(response, result, spider)
Returns:

a generator of FileItem objects, in which the data field is parsed JSON
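As an illustration of what "concatenated JSON" means here, the following is a minimal sketch using only the standard library, not the middleware's actual implementation. The function name iter_concatenated_json is hypothetical, and the sketch assumes the whole input fits in memory as a string.

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string of back-to-back JSON documents.

    Hypothetical helper for illustration; not part of kingfisher_scrapy.
    """
    decoder = json.JSONDecoder()
    index = 0
    length = len(text)
    while index < length:
        # raw_decode does not skip leading whitespace, so skip it here.
        while index < length and text[index].isspace():
            index += 1
        if index >= length:
            break
        # raw_decode returns the parsed value and the index where it ended.
        value, index = decoder.raw_decode(text, index)
        yield value
```

For example, the input '{"a": 1}{"b": 2}' would produce two parsed objects, one per concatenated document.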

class kingfisher_scrapy.spidermiddlewares.LineDelimitedMiddleware

If the spider’s line_delimited class attribute is True, yields each line of the File as a FileItem. Otherwise, yields the original item.

process_spider_output(response, result, spider)
Returns:

a generator of FileItem objects, in which the data field is bytes
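A rough sketch of line-by-line splitting, assuming the file's contents are already available as bytes. The helper name iter_lines and the 1-based line number it yields are assumptions made for illustration; they are not taken from the middleware.

```python
def iter_lines(data):
    """Yield (line number, line) pairs from line-delimited bytes.

    Hypothetical helper for illustration; each line stays as bytes and
    keeps its trailing newline.
    """
    for number, line in enumerate(data.splitlines(keepends=True), 1):
        yield number, line
```

Each yielded line remains unparsed bytes, which matches the description above: parsing is left to a later step.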

class kingfisher_scrapy.spidermiddlewares.RootPathMiddleware

If the spider’s root_path class attribute is non-empty, replaces the item’s data with the objects at that prefix. If there are multiple releases, records or packages at that prefix, combines them into a single package, and updates the item’s data_type if needed. If root_path is empty, yields the original item.

process_spider_output(response, result, spider)
Returns:

a generator of File or FileItem objects, in which the data field is parsed JSON
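To make "the objects at that prefix" concrete, here is a simplified sketch over already-parsed JSON. It assumes a dot-separated path in which the segment "item" means "each element of an array"; that convention and the function name objects_at are assumptions for illustration, and the sketch does not cover combining results into a package.

```python
def objects_at(data, root_path):
    """Yield the objects found at a dot-separated path in parsed JSON.

    Hypothetical helper for illustration. The "item" segment is assumed
    to mean "iterate over the elements of the array at this position".
    """
    if not root_path:
        # An empty path means the data is used as-is.
        yield data
        return
    key, _, rest = root_path.partition(".")
    if key == "item":
        for element in data:
            yield from objects_at(element, rest)
    else:
        yield from objects_at(data[key], rest)
```

With data such as {"result": {"releases": [...]}}, the path "result.releases.item" would yield each release in turn, while an empty path yields the original data unchanged.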

class kingfisher_scrapy.spidermiddlewares.AddPackageMiddleware

If the spider’s data_type class attribute is “release” or “record”, wraps the item’s data in an appropriate package, and updates the item’s data_type. Otherwise, yields the original item.

process_spider_output(response, result, spider)
Returns:

a generator of File or FileItem objects, in which the data field is parsed JSON
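The wrapping step might look roughly like the sketch below. The function name add_package is hypothetical, and the resulting data_type values "release_package" and "record_package" are assumptions for illustration. A real package would typically carry more metadata than the single key shown.

```python
def add_package(data, data_type):
    """Wrap a single release or record in a minimal package.

    Hypothetical helper for illustration. Returns the (possibly wrapped)
    data and the (possibly updated) data type.
    """
    if data_type == "release":
        return {"releases": [data]}, "release_package"
    if data_type == "record":
        return {"records": [data]}, "record_package"
    # Any other data type is passed through unchanged.
    return data, data_type
```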

class kingfisher_scrapy.spidermiddlewares.ResizePackageMiddleware

If the spider’s resize_package class attribute is True, splits the package into packages of 100 releases or records each. Otherwise, yields the original item.

process_spider_output(response, result, spider)

The spider must yield items whose data field has package and data keys.

Returns:

a generator of FileItem objects, in which the data field is a string
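Splitting a package into chunks of 100 can be sketched as follows, assuming the package metadata and the releases or records arrive separately (as the package and data keys suggest). The function name resize_package and its key and size parameters are hypothetical.

```python
import json
from itertools import islice


def resize_package(package, items, key="releases", size=100):
    """Yield JSON strings, each a copy of the package with up to `size` items.

    Hypothetical helper for illustration; `package` is the package
    metadata and `items` is an iterable of releases or records.
    """
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            break
        resized = dict(package)  # shallow copy, so the metadata is reused
        resized[key] = chunk
        yield json.dumps(resized)
```

Under these assumptions, 250 releases would produce three strings holding 100, 100 and 50 releases respectively.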

class kingfisher_scrapy.spidermiddlewares.ReadDataMiddleware

If the item’s data is a file descriptor, replaces the item’s data with the file’s contents and closes the file descriptor. Otherwise, yields the original item.

process_spider_output(response, result, spider)
Returns:

a generator of File objects, in which the data field is bytes
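A minimal sketch of the read-then-close behavior, assuming that "file descriptor" here means a file-like object with read and close methods. The function name read_data and the hasattr check are assumptions for illustration.

```python
import io


def read_data(data):
    """Return the contents of a file-like object, closing it afterwards.

    Hypothetical helper for illustration. Anything without a read()
    method is returned unchanged.
    """
    if hasattr(data, "read"):
        try:
            return data.read()
        finally:
            # Close the file even if reading fails.
            data.close()
    return data
```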

class kingfisher_scrapy.spidermiddlewares.RetryDataErrorMiddleware

If the spider raises a BadZipFile exception, assumes that the response was truncated, and retries the request for the ZIP file, up to 3 times.

process_spider_exception(response, exception, spider)
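The retry idea can be sketched outside Scrapy with the standard library alone. This is not the middleware's interface: the function open_zip_with_retries and its fetch callable (which returns the response body as bytes) are hypothetical, and the sketch retries in a loop rather than by rescheduling a request.

```python
import io
import zipfile

MAX_RETRIES = 3  # assumed to match the "up to 3 times" described above


def open_zip_with_retries(fetch, max_retries=MAX_RETRIES):
    """Open the bytes returned by fetch() as a ZIP file, retrying on BadZipFile.

    Hypothetical helper for illustration. After `max_retries` failed
    retries, the last BadZipFile exception is re-raised.
    """
    retries = 0
    while True:
        try:
            return zipfile.ZipFile(io.BytesIO(fetch()))
        except zipfile.BadZipFile:
            if retries >= max_retries:
                raise
            retries += 1
```

In this sketch, one initial attempt plus three retries gives at most four calls to fetch before the error is allowed to propagate.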