Download Handlers
- class kingfisher_scrapy.downloadhandlers.CurlImpersonateDownloadHandler(settings)
A download handler that uses curl_cffi to impersonate a browser’s TLS/JA3 fingerprint.

Some sites use an anti-bot service (like Cloudflare) that rejects Scrapy’s default Twisted client by its TLS/JA3 fingerprint. curl_cffi reproduces a real browser’s fingerprint.

To use it for a spider, override the https (and/or http) handler in the spider’s custom_settings:

    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "https": "kingfisher_scrapy.downloadhandlers.CurlImpersonateDownloadHandler",
        },
    }

And optionally:

  - Set the CURL_IMPERSONATE setting to a browser profile. Choose a specific version (like "chrome146") for a consistent fingerprint across curl_cffi upgrades.
  - Set the CURL_IP_VERSION setting to "4" or "6" for a consistent version across requests. If unset, curl_cffi chooses.

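
The handler override and both optional settings can sit together in one custom_settings dict. The following is a minimal sketch, not taken from the project: ExampleSpider and its name are hypothetical, and the class is a plain Python class so the snippet runs without Scrapy installed (in a real project it would subclass a Scrapy spider class).

```python
class ExampleSpider:
    # Hypothetical spider; in a real project this would subclass scrapy.Spider.
    name = "example"

    custom_settings = {
        # Route HTTPS requests through the curl_cffi-based handler.
        "DOWNLOAD_HANDLERS": {
            "https": "kingfisher_scrapy.downloadhandlers.CurlImpersonateDownloadHandler",
        },
        # Optional: pin a specific browser profile so the fingerprint
        # stays the same across curl_cffi upgrades.
        "CURL_IMPERSONATE": "chrome146",
        # Optional: force IPv4 ("4") or IPv6 ("6") for every request.
        "CURL_IP_VERSION": "4",
    }
```

Leaving out CURL_IP_VERSION keeps the documented default, in which curl_cffi chooses the IP version.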
- lazy = True
- IP_RESOLVE = {'4': CurlIpResolve.V4, '6': CurlIpResolve.V6}
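
The IP_RESOLVE mapping suggests how a CURL_IP_VERSION string could be turned into a curl option. The sketch below only illustrates that lookup under stated assumptions and is not the handler's actual code: the CurlIpResolve enum is a local stand-in with illustrative values, and ip_resolve_option is a hypothetical helper.

```python
from enum import Enum


class CurlIpResolve(Enum):
    # Local stand-in for the enum referenced by IP_RESOLVE above;
    # the member values here are illustrative only.
    V4 = 1
    V6 = 2


IP_RESOLVE = {"4": CurlIpResolve.V4, "6": CurlIpResolve.V6}


def ip_resolve_option(settings):
    """Hypothetical helper: map CURL_IP_VERSION to an enum member.

    Returns None when the setting is unset, so the caller can leave
    the choice of IP version to the HTTP client.
    """
    version = settings.get("CURL_IP_VERSION")
    if version is None:
        return None
    try:
        return IP_RESOLVE[str(version)]
    except KeyError:
        raise ValueError(
            f"CURL_IP_VERSION must be '4' or '6', got {version!r}"
        ) from None
```

Keeping the valid values in a dict means an unsupported value fails with a clear error instead of being passed through silently.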