Item Pipelines

class kingfisher_scrapy.pipelines.Validate

Drops duplicate files, identified by file_name, and duplicate file items, identified by file_name and number.

Raises:

jsonschema.ValidationError – if the item is invalid
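
The duplicate check described above can be sketched with two sets of already-seen keys. This is a minimal illustration, not the project's implementation: `DedupSketch` is a hypothetical name, items are assumed to be plain dicts, and the local `DropItem` class stands in for `scrapy.exceptions.DropItem` so the block runs without Scrapy installed. Schema validation is left out.

```python
class DropItem(Exception):
    """Stand-in for scrapy.exceptions.DropItem, so the sketch runs without Scrapy."""


class DedupSketch:
    """Hypothetical pipeline that drops items whose key was already seen."""

    def __init__(self):
        self.seen_files = set()       # file_name values
        self.seen_file_items = set()  # (file_name, number) pairs

    def process_item(self, item, spider):
        if "number" in item:
            # File item: unique on the (file_name, number) pair.
            key = (item["file_name"], item["number"])
            seen = self.seen_file_items
        else:
            # Whole file: unique on file_name alone.
            key = item["file_name"]
            seen = self.seen_files
        if key in seen:
            raise DropItem(f"Duplicate item: {key!r}")
        seen.add(key)
        return item
```

Keeping separate sets means a file and a file item that share a file_name do not collide with each other.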

process_item(item, spider)

class kingfisher_scrapy.pipelines.Sample

Drops items and closes the spider once the sample size is reached.
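
A rough sketch of this behaviour follows, under stated assumptions: the sample size is read from a `sample` attribute on the spider, and the spider is stopped through a `close` method. Both names, along with `StubSpider` and `SampleSketch`, are hypothetical and chosen only to keep the block self-contained. A real Scrapy pipeline would ask the crawler engine to close the spider instead.

```python
class DropItem(Exception):
    """Stand-in for scrapy.exceptions.DropItem, so the sketch runs without Scrapy."""


class StubSpider:
    """Hypothetical spider stand-in: `sample` and `close` are assumed names."""

    def __init__(self, sample=None):
        self.sample = sample
        self.close_reasons = []

    def close(self, reason):
        self.close_reasons.append(reason)


class SampleSketch:
    def open_spider(self, spider):
        # Read the limit and reset the counter when the spider starts.
        self.limit = spider.sample
        self.count = 0

    def process_item(self, item, spider):
        if self.limit is None:
            return item  # no sample requested: pass everything through
        if self.count >= self.limit:
            if not spider.close_reasons:
                spider.close("sample")  # request shutdown only once
            raise DropItem("Sample size reached")
        self.count += 1
        return item
```

Items can still arrive after the close request, since requests already in flight may complete, which is why the sketch keeps dropping rather than assuming the stream stops immediately.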

process_item(item, spider)
open_spider(spider)

class kingfisher_scrapy.pipelines.Pluck

Extracts a value from the item and returns it as a plucked item.
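
One way to picture the extraction is a lookup along a slash-separated path into the item's JSON data. The sketch below assumes the item is a dict with a `data` key and that the plucked item is a dict with a single `value` key. The names `resolve_path` and `PluckSketch`, and the path syntax, are illustrative assumptions rather than the project's API.

```python
import json


def resolve_path(document, path):
    """Follow a slash-separated path such as '/publisher/name' into parsed JSON."""
    current = document
    for token in path.strip("/").split("/"):
        if isinstance(current, list):
            current = current[int(token)]  # list steps are numeric indexes
        else:
            current = current[token]
    return current


class PluckSketch:
    """Hypothetical pipeline that replaces each item with one extracted value."""

    def __init__(self, path):
        self.path = path

    def process_item(self, item, spider):
        data = item["data"]
        if isinstance(data, (str, bytes)):
            data = json.loads(data)
        try:
            value = resolve_path(data, self.path)
        except (KeyError, IndexError, ValueError, TypeError):
            value = None  # the path does not exist in this item
        return {"value": value}
```

For example, `PluckSketch("/publisher/name")` applied to an item whose data is `{"publisher": {"name": "ACME"}}` returns `{"value": "ACME"}`.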

process_item(item, spider)

class kingfisher_scrapy.pipelines.Unflatten

Converts an item’s data from CSV/XLSX to JSON, using the unflatten command from Flatten Tool.
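
To illustrate what unflattening means, the sketch below rebuilds nested JSON from CSV columns whose headers are slash-separated paths. It uses only the standard library and does not call Flatten Tool. The function names and the sample columns are assumptions, and Flatten Tool itself handles far more (arrays, types, XLSX sheets).

```python
import csv
import io
import json


def unflatten_row(row):
    """Turn {'tender/id': 't-1'} into {'tender': {'id': 't-1'}}."""
    nested = {}
    for header, value in row.items():
        if header is None or not value:
            continue  # skip overflow columns and empty cells
        parts = header.split("/")
        target = nested
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target[parts[-1]] = value
    return nested


def unflatten_csv(text):
    """Parse CSV text and return one nested dict per data row."""
    return [unflatten_row(row) for row in csv.DictReader(io.StringIO(text))]


csv_text = "ocid,tender/id,tender/title\nocds-1,t-1,Road works\n"
print(json.dumps(unflatten_csv(csv_text), indent=2))
```

All values stay strings here. Converting them to numbers or booleans would need schema information, which this sketch does not model.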

process_item(item, spider)