Item Pipelines

class kingfisher_scrapy.pipelines.BasePipeline(crawler)

Base class for pipelines that need access to the spider instance.
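A minimal sketch of the from_crawler pattern, assuming the base class does little more than keep the crawler it is given. It follows Scrapy's convention that a component may expose a from_crawler classmethod. The SketchPipeline name and the SimpleNamespace stubs are hypothetical stand-ins, not the project's code.

```python
from types import SimpleNamespace


class SketchPipeline:
    """Hypothetical stand-in for a pipeline base class (not the real BasePipeline)."""

    def __init__(self, crawler):
        # Assumption: keep a reference to the crawler so the spider stays reachable.
        self.crawler = crawler

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this classmethod to build the component from a crawler.
        return cls(crawler)


# Stub crawler with a stub spider, standing in for Scrapy's own objects.
crawler = SimpleNamespace(spider=SimpleNamespace(name="example"))
pipeline = SketchPipeline.from_crawler(crawler)
print(pipeline.crawler.spider.name)
```

Subclasses built this way can read spider attributes through the stored crawler without changing their constructor signature.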

classmethod from_crawler(crawler)

class kingfisher_scrapy.pipelines.Validate

Drops duplicate files, identified by file_name, and duplicate file items, identified by file_name and number.
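The duplicate check can be sketched with two sets of seen keys. This is an illustration only: it assumes items are plain dicts and that the presence of a number key marks a file item. DedupSketch and DuplicateItem are hypothetical names, with DuplicateItem standing in for whatever exception the real pipeline raises to drop an item.

```python
class DuplicateItem(Exception):
    """Hypothetical stand-in for the exception a pipeline raises to drop an item."""


class DedupSketch:
    def __init__(self):
        self.files = set()       # file_name values seen so far
        self.file_items = set()  # (file_name, number) pairs seen so far

    def process_item(self, item):
        # Assumption: items are dicts, and a "number" key marks a file item.
        if "number" in item:
            key, seen = (item["file_name"], item["number"]), self.file_items
        else:
            key, seen = item["file_name"], self.files
        if key in seen:
            raise DuplicateItem(f"duplicate: {key!r}")
        seen.add(key)
        return item


pipeline = DedupSketch()
pipeline.process_item({"file_name": "a.json"})
pipeline.process_item({"file_name": "a.json", "number": 1})
try:
    pipeline.process_item({"file_name": "a.json"})
    dropped = False
except DuplicateItem:
    dropped = True
print(dropped)
```

Files and file items are tracked separately, so a file and a file item that share a file_name do not collide.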

process_item(item)

class kingfisher_scrapy.pipelines.Sample(crawler)

Drops items and closes the spider once the sample size is reached.
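A rough sketch of the sampling behavior, assuming a simple item counter. The sample_size parameter, the close_requested flag (standing in for asking the engine to close the spider) and the local DropItem class are all hypothetical; the real pipeline works with Scrapy's own objects.

```python
class DropItem(Exception):
    """Hypothetical stand-in for Scrapy's DropItem exception."""


class SampleSketch:
    def __init__(self, sample_size):
        self.sample_size = sample_size  # hypothetical parameter name
        self.item_count = 0
        self.close_requested = False    # stands in for closing the spider

    def process_item(self, item):
        if self.item_count >= self.sample_size:
            # Sample is full: request shutdown and drop everything that follows.
            self.close_requested = True
            raise DropItem("sample size reached")
        self.item_count += 1
        return item


pipeline = SampleSketch(sample_size=2)
kept = []
for item in ["a", "b", "c", "d"]:
    try:
        kept.append(pipeline.process_item(item))
    except DropItem:
        pass
print(kept)  # ['a', 'b']
```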

process_item(item)

class kingfisher_scrapy.pipelines.Pluck(crawler)

Extracts a value from the item and returns it as a plucked item.
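One way to picture plucking, under the assumption that the value is addressed by a "/"-separated path into the item's parsed data. The reference above does not say how the value is located, so the path syntax, the data key and the shape of the returned dict are hypothetical.

```python
def resolve_pointer(data, pointer):
    """Follow a "/"-separated path (simplified JSON Pointer, no escape handling)."""
    value = data
    for token in pointer.strip("/").split("/"):
        # List indexes arrive as digit strings; dict keys are used as-is.
        value = value[int(token)] if isinstance(value, list) else value[token]
    return value


def pluck(item, pointer):
    # Hypothetical output shape: the extracted value wrapped in a small dict.
    return {"value": resolve_pointer(item["data"], pointer)}


item = {"file_name": "a.json", "data": {"releases": [{"date": "2020-01-01"}]}}
plucked = pluck(item, "/releases/0/date")
print(plucked)  # {'value': '2020-01-01'}
```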

process_item(item)

class kingfisher_scrapy.pipelines.Unflatten(crawler)

Converts an item’s data from CSV/XLSX to JSON, using the unflatten command from Flatten Tool.
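To illustrate what unflattening means, the sketch below rebuilds nested JSON objects from CSV columns whose headers are "/"-separated paths. It is a simplified, standard-library illustration and not Flatten Tool's unflatten command: it handles nested objects only (no arrays, no type conversion), and the sample column names are made up.

```python
import csv
import io
import json


def unflatten_row(row):
    """Turn {"tender/id": "1"} into {"tender": {"id": "1"}} (objects only, no arrays)."""
    result = {}
    for path, value in row.items():
        if value == "":
            continue  # skip empty cells
        *parents, leaf = path.split("/")
        node = result
        for key in parents:
            # Walk down, creating intermediate objects as needed.
            node = node.setdefault(key, {})
        node[leaf] = value
    return result


csv_text = "ocid,tender/id,tender/title\nocds-1,t-1,Road works\n"
rows = [unflatten_row(row) for row in csv.DictReader(io.StringIO(csv_text))]
print(json.dumps(rows))
```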

process_item(item)