All of a sudden, "Could not proxy HTTP request" in my app

I have a simple NodeJS app listening for incoming requests. It was working well for a week or so and today these errors started:

2023-03-06T18:31:44.155 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 60)

2023-03-06T18:31:44.707 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 20)

2023-03-06T18:31:54.158 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 70)

2023-03-06T18:31:54.629 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 30)

2023-03-06T18:32:04.137 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 80)

2023-03-06T18:32:04.514 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 40)

2023-03-06T18:32:14.065 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 90)

2023-03-06T18:32:14.671 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 50)

2023-03-06T18:32:24.749 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 60)

2023-03-06T18:32:34.807 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 70)

2023-03-06T18:32:44.716 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 80)

2023-03-06T18:32:54.839 proxy[b77cb7ec] fra [warn] Could not proxy HTTP request. Retrying in 1000 ms (attempt 90)

Code snippet:

const express = require('express');
const app = express();
const PORT = process.env.PORT || 8080;
const axios = require('axios'); // required but not used in this snippet

app.listen(PORT, function () {
  console.log(`App listening on port ${PORT}!`);
});

app.post('/music', function (req, res) {
  // do something
  res.sendStatus(200); // reply so the request doesn't hang without a response
});

It looks like your app is scaled to 0. Is that on purpose? The proxy can’t find any instance to route to.
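
(For reference, checking and restoring the instance count with flyctl looks roughly like this; the app name is a placeholder and the output details vary by flyctl version:)

fly scale show -a my-app     # show the current instance count and VM size
fly scale count 1 -a my-app  # bring at least one instance back so the proxy has something to route to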

Edited my OP with a code snippet in case it’s relevant. I scaled it down to 0 on purpose after noticing the warnings.

Edit: scaled back to 1 if that helps with debugging.

I’m not sure why the “actual” error doesn’t show here. I tried fixing that the other day. Looks like I failed.

It seems like connecting to your app started timing out.

You didn’t see any logs like this?

could not send HTTP request to instance: connection error: timed out

Yes. I am now back to the previous state, receiving this warning after every incoming request:

could not send HTTP request to instance: connection error: timed out

Which is weird, because I can actually handle the request; but I don’t mind as long as the app works (which it does now).

What changed? I only scaled it back to 1.

I’m also seeing this error and I’ve emailed support. My app is currently scaled to 2. I’ve tried restarting and scaling the count up and down, but it’s not fixing the problem. This is really serious.

Also, another user reported the error in Could not proxy HTTP request. Retrying in 1000 ms - #14 by jakub1

We suddenly started experiencing that today and it’s driving me crazy. Everything looks good on our end. Did you end up finding a solution?

Is it the same connection timeout for both of you?

Connection timeouts are usually a problem with the app blocking and becoming unable to accept connections anymore.
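
As a rough illustration (hypothetical code, not taken from your app): a handler that does heavy synchronous work blocks Node’s event loop, so the instance can’t accept new connections while it runs, and the proxy eventually reports connection timeouts:

const express = require('express');
const app = express();

app.post('/music', function (req, res) {
  // Long synchronous work (CPU loops, sync I/O) blocks the event loop,
  // so no new connections are accepted until it finishes.
  let total = 0;
  for (let i = 0; i < 2e9; i++) total += i; // CPU-bound busy work for several seconds
  res.json({ total });
});

app.listen(process.env.PORT || 8080);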

Yes, it was the same timeout. The app was working well and suddenly it started failing with that error. After about 3 hours of downtime we were finally able to fix it by scaling down our VM, which is really weird.

To be clear, we scaled down the VM resources, not the number of instances.

Scaling down the resources would restart the instances. A restart often helps for a while if your app gets into that state progressively.

I don’t like blaming our users’ apps, but this does seem to be the main cause of connection timeouts.

@foocux Fly support responded via email earlier saying, “We had some state cleanup to perform on our end which caused the proxy to have intermittent issues in a few regions.” It’s fixed for me now. This was the first time I’ve seen this specific issue.

In my case there were no issues in my app.

When the errors started, I deployed the app many times and restarted it at least twice, and that didn’t fix it; but when I scaled down the app, it started working again. Very odd.

Ok that could be something else. Since this is working now, I’ll let it sit and take a look tomorrow morning (I have set myself a reminder).

I’ve had time to look at your app, here are some observations:

  • For a while our proxy started seeing “connection closed before message completed” errors from your instances. That usually indicates the app is erroring out and not responding with HTTP headers (see the sketch after this list).
  • Your concurrency limits are set to 99999,99999, so you’re not getting any limit enforcement or load balancing from us.
  • Your app is slow to respond with HTTP headers; that’s usually an indication something is wrong with the app. I see response headers taking up to 3-4s under normal load, and during the problematic periods it sometimes took over a minute.
  • I think the scaling down fixing the issue was only a coincidence: your app was receiving less traffic at that time.
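
A minimal sketch of what “not responding with HTTP headers” can look like in Express, and one way to guard against it (illustrative only, not taken from your app): make sure every route sends a response, and add error middleware so thrown errors become a proper status code instead of a closed connection:

const express = require('express');
const app = express();

app.post('/music', async function (req, res, next) {
  try {
    // ... do the actual work ...
    res.sendStatus(200); // always send headers, even on the happy path
  } catch (err) {
    next(err); // hand errors to the error middleware below
  }
});

// Error middleware: respond with a 500 instead of closing the connection
// before any headers have been written.
app.use(function (err, req, res, next) {
  console.error(err);
  res.status(500).json({ error: 'internal error' });
});

app.listen(process.env.PORT || 8080);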

These metrics are all available to you: https://fly-metrics.net (the Fly Instance dashboard should tell you more about your app’s performance)

Some things you can try:

  • Requests concurrency (explained here). Without it, our proxy establishes a new connection for every request; this is to avoid race conditions with connection pooling in general. Opening many connections per second might not suit your app: it seems to stop working right at ~20-30 new connections per second (this isn’t the same as concurrent connections). See the fly.toml sketch after this list.
  • Find the right concurrency level your app can handle, either by benchmarking or by trial and error. I expect not all routes have the same resource cost; my guess would be that each instance can handle 10-20 concurrent requests.
  • Troubleshoot your app’s performance: what’s taking so much time in these requests? It could be slow DB queries, slow network calls, or almost anything.
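
A rough sketch of what switching to requests-based concurrency might look like in fly.toml; the limit numbers here are made-up starting points, not recommendations tuned for your app:

[[services]]
  internal_port = 8080
  protocol = "tcp"

  [services.concurrency]
    type = "requests"  # count in-flight requests instead of TCP connections
    soft_limit = 20    # above this, the proxy prefers other instances
    hard_limit = 25    # above this, the proxy stops sending new requests here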

I’m happy to help some more with figuring out our metrics and logs.

Hmm, what you just said makes a lot of sense. I actually didn’t know we had those concurrency limits, which explains a lot about why our iad load was always high while our other instance was the exact opposite.

Thank you for your analysis. I’m going to take a closer look at the metrics you sent me; they’re going to be really helpful right now.
