Possible Networking Issue when scraping data

I am using FlareBypasser to bypass challenges when scraping data. I have deployed the service on Fly.io using the following fly.toml:

app = 'flare-bypasser'
primary_region = 'otp'

[build]

[http_service]
  internal_port = 8080
  auto_stop_machines = 'stop'
  auto_start_machines = true
  min_machines_running = 0
  processes = ['app']

[[vm]]
  memory = '2gb'
  cpu_kind = 'shared'
  cpus = 1

For the first few minutes, even under a large number of requests, FlareBypasser works fine. After some time, however, I start getting the following error:

Error solving the challenge. On platform linux(docker = true). At step 'browser init':  
---------------------  
Failed to connect to browser  
---------------------  
One of the causes could be when you are running as root.  
In that case you need to pass no_sandbox=True  

Browser error output:  
[5069:5083:0306/223446.440235:ERROR:bus.cc(407)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")  
(repeats multiple times...)

What I’ve Tried

  • Running FlareBypasser locally → Everything works fine.
  • Checking Fly.io metrics → No obvious memory issues.

Question

Could this be a networking issue on Fly.io? Or something related to how the browser instance is managed over time?

Any insights would be greatly appreciated!

What browser are you using? Some browser remote-control tools (especially in test frameworks) let you grab the browser’s console logs.

Also, what is the “bus” in this case? Is that a thing in this library or the browser?

The other thing to try, and this is the nature of scraping generally, is to slow down your scraping. If you’re grabbing content too aggressively you can expect to be blocked.
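
As a rough illustration of slowing down, here is a minimal Python sketch that enforces a minimum gap between the start of consecutive requests. The name `throttled_map` and the `clock`/`sleep` parameters are hypothetical helpers for this example, not part of FlareBypasser or zendriver, and a suitable interval is something you would have to find by experiment.

```python
import time


def throttled_map(func, items, min_interval, clock=time.monotonic, sleep=time.sleep):
    """Yield func(item) for each item, starting calls at least
    min_interval seconds apart.

    clock and sleep are injectable so the pacing can be tested
    without actually waiting.
    """
    last_start = None
    for item in items:
        if last_start is not None:
            # Time still owed before the next call may start.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        yield func(item)
```

You would wrap whatever function sends one request to the solver, for example `for result in throttled_map(solve_one, urls, min_interval=2.0): ...`, where `solve_one` and `urls` stand in for your own code.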

FlareBypasser uses zendriver to launch an instance of the Chrome browser. The “bus” comes from Chrome itself rather than from the library: the message is logged by Chrome’s bus.cc, its D-Bus client code.
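
Since a Chrome instance is launched for the solver, one thing that might be worth checking on the Fly machine is whether Chrome processes accumulate over time. The following is a minimal sketch, assuming a Linux `/proc` filesystem; `count_processes` is a hypothetical helper name, and whether leftover processes are actually the cause here is unconfirmed.

```python
import os


def count_processes(name_fragment, proc_root="/proc"):
    """Count running processes whose command name contains name_fragment.

    Reads /proc/<pid>/comm, which holds the command name
    (truncated by the kernel to 15 characters).
    """
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                name = fh.read().strip()
        except OSError:
            continue  # process exited between listing and reading
        if name_fragment in name:
            count += 1
    return count
```

Logging `count_processes("chrome")` periodically while requests are running would show whether the number keeps growing before the "Failed to connect to browser" error appears.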
