Browser breaks with new tab on empty content response

Hello world!

I am building a generic spider that crawls sites and records the requests they make.
I use scrapy-playwright to load each website first and capture the requests that are sent.

I noticed that when I parse URLs whose response body is empty, execution freezes and Playwright's browser shows an empty tab.
To be clear, the problem reproduces when parsing a URL for which the following condition is true:

    response_body_text = await response.text()
    response_body_text == ''

For URLs where this condition is false, the spider works perfectly!
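The condition above can be wrapped in a small helper to flag the affected URLs. This is only a sketch: it assumes `response` is an object with an awaitable `text()` method, as Playwright's async `Response` has, and `FakeResponse` is a hypothetical stand-in used only to try the helper without a browser.

```python
import asyncio


async def has_empty_body(response) -> bool:
    """Return True when the response body decodes to an empty string.

    `response` is assumed to expose an awaitable `text()` method, as
    Playwright's async `Response` object does.
    """
    body_text = await response.text()
    return body_text == ""


class FakeResponse:
    """Hypothetical stand-in for a Playwright response, for local checks only."""

    def __init__(self, body: str) -> None:
        self._body = body

    async def text(self) -> str:
        return self._body
```

For example, `asyncio.run(has_empty_body(FakeResponse("")))` evaluates the condition for an empty body without starting a crawl.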

To reproduce, I use a fairly common configuration:

    CrawlerProcess({
        ...
        # Playwright settings
        'DOWNLOAD_HANDLERS': {
            "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
            "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
        },
        'TWISTED_REACTOR': "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
        'PLAYWRIGHT_BROWSER_TYPE': 'chromium',
        'PLAYWRIGHT_MAX_PAGES_PER_CONTEXT': 10,
        'PLAYWRIGHT_LAUNCH_OPTIONS': {
            'headless': True,
        }
    })

and on each scrapy.Request() I pass the following meta:

{ "playwright": True }
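For anyone who wants to try this, here is a rough sketch of a standalone script that puts the settings and the meta together. It is not my actual spider: the spider name, the `run` helper, and the log message are made up for illustration, and the elided `...` settings are simply left out.

```python
# Settings copied from the report; the elided "..." entries are omitted.
PLAYWRIGHT_SETTINGS = {
    "DOWNLOAD_HANDLERS": {
        "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
        "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    },
    "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
    "PLAYWRIGHT_BROWSER_TYPE": "chromium",
    "PLAYWRIGHT_MAX_PAGES_PER_CONTEXT": 10,
    "PLAYWRIGHT_LAUNCH_OPTIONS": {"headless": True},
}

# Meta passed on every request so that Playwright handles the download.
REQUEST_META = {"playwright": True}


def run(urls):
    """Crawl the given URLs through Playwright and log each one that completes."""
    # Imported here so the constants above can be inspected without Scrapy.
    import scrapy
    from scrapy.crawler import CrawlerProcess

    class EmptyBodySpider(scrapy.Spider):
        name = "empty_body_repro"  # hypothetical spider name

        def start_requests(self):
            for url in urls:
                yield scrapy.Request(url, meta=dict(REQUEST_META))

        def parse(self, response):
            # If the crawl freezes as described, this line is never logged
            # for the affected URL.
            self.logger.info("finished %s (%d bytes)", response.url, len(response.body))

    process = CrawlerProcess(PLAYWRIGHT_SETTINGS)
    process.crawl(EmptyBodySpider)
    process.start()  # blocks until the crawl ends
```

Calling `run([...])` with a URL that returns an empty body should show whether the callback is ever reached for it.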

Has anybody else come across this issue?

Thank you all!

Browser breaks with new tab on empty content response #288