Error: Exceeded maxRedirects (Docker Container and fly.dev domain + IP)

Hi, I’m a newbie to the platform and I want to migrate over from Heroku.

I followed the Docker quickstart information and am having trouble accessing my service.
I have a Java application running inside a Docker container that connects to a Postgres database. The database is fine: my container can access it internally and I can access it externally from my tooling. But I’ve tried everything I can see in the docs to make the service itself available externally and am running into issues.

Here is my fly.toml:

app = "valuable-api"

kill_signal = "SIGINT"
kill_timeout = 5
processes = []

[build]
  image = "registry.fly.io/valuable-api:latest"

[env]

[experimental]
  allowed_public_ports = []
  auto_rollback = true

[[services]]
  internal_port = 8080
  processes = ["app"]
  protocol = "tcp"
  script_checks = []

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.http_checks]]
    interval = 10000
    grace_period = "60s"
    method = "get"
    path = "/actuator/health"
    protocol = "http"
    timeout = 5000
    tls_skip_verify = false

The health check on /actuator/health is working, and I can see traces from it arriving in Honeycomb. Deploys take a long time, but the container is eventually marked as healthy.

When I make a request to https://valuable-api.fly.dev/actuator/health the request yields Error: Exceeded maxRedirects. Probably stuck in a redirect loop https://valuable-api.fly.dev/actuator/health. I get the same thing when I make a request to the IP address of the service (I’m using HTTPS for the fly.dev URL and HTTP for the IP address).

The output of fly info is:

App
  Name     = valuable-api          
  Owner    = personal              
  Version  = 19                    
  Status   = running               
  Hostname = valuable-api.fly.dev  

Services
PROTOCOL PORTS                   
TCP      80 => 8080 [HTTP]       
         443 => 8080 [TLS, HTTP] 

IP Adresses
TYPE ADDRESS             REGION CREATED AT 
v4   109.105.216.45             12h50m ago 
v6   2a09:8280:1::1:1b41        12h50m ago 

Everything looks like it is configured correctly, so I don’t know why I would be getting this issue (I have a PORT environment variable that defines the port my application inside the container listens to, that is set to 8080). I know that my container is indeed listening on 8080 because fly logs gives:
2022-01-22T10:45:37.545 app[87cb4e8f] syd [info]2022-01-22 10:45:37.539 INFO 516 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8080 (http) with context path ''

So I have a running container on a URL and IP, where the listeners appear to be taking traffic and pointing it towards my container, but somewhere the request gives Error: Exceeded maxRedirects. Probably stuck in a redirect loop https://valuable-api.fly.dev/actuator/health regardless of the listeners.

So how come the Fly platform can access my health checks and mark them as healthy, while my application writes that traffic out to Honeycomb, and yet I can’t access it?

Would love some help as it would be great to move entirely away from Heroku. Heroku isn’t well suited for me as it lacks availability zones in my area, but without solving this problem I won’t be able to migrate.

Thanks so much for any help you might have!

I’m not sure whether this is related, but my deployments also seem to be a bit unstable.
I’ve been fiddling with fly.toml settings to try to solve the issue and noticed some weird behaviour.

My deployments appear to be taking so long that health checks begin testing the health of the deployment before the service has restarted!

Here’s an example:

❯ fly deploy
==> Verifying app config
--> Verified app config
==> Building image
Searching for image 'registry.fly.io/valuable-api:latest' locally...
image found: sha256:cea23b229d06e913da5abaa1960e47b89d45c34403c971f02ae7671a3a1118a5
==> Pushing image to fly
The push refers to repository [registry.fly.io/valuable-api]
e527cae18b52: Layer already exists 
0d729d927d7d: Layer already exists 
5172562c3f00: Layer already exists 
9a9c94e6b14b: Layer already exists 
3ebf62b428e4: Layer already exists 
0eba131dffd0: Layer already exists 
deployment-1642885751: digest: sha256:9c6444441007d14d1103c590167a69ff9326047cfa82d6c6efe7f9a8ddf88fbf size: 1584
--> Pushing image done
==> Creating release
--> release v22 created

--> You can detach the terminal anytime without stopping the deployment
==> Monitoring deployment

 1 desired, 1 placed, 0 healthy, 1 unhealthy

 1 desired, 1 placed, 0 healthy, 0 unhealthy [health checks: 1 total, 1 critical]

This is where the deployment gets to before failing due to unhealthy allocations.
On my local machine the Docker container takes 10 seconds to start up; on Heroku it takes about 15 seconds. I’m using a dedicated-cpu-1x with 2 GB of RAM and have set my [[services.http_checks]] grace_period to 120s. I have fly logs open in one terminal and the deploy in another, and it appears that the 120s grace period is not being respected.

I see the instance starting at 21:14:23.434:
2022-01-22T21:14:23.434 runner[3bd6b315] sin [info]Starting instance

I see the VM preparing to run my container:

2022-01-22T21:14:39.359 app[3bd6b315] sin [info]Preparing to run: `sh -c  java -Xms512m -Xmx1024m -Xss512k  ${JAVA_OPTS} -Xverify:none -javaagent:honeycomb-opentelemetry-javaagent-0.8.1.jar -jar /valuable-api.jar` as root
2022-01-22T21:14:39.371 runner[3bd6b315] sin [info]Virtual machine started successfully
2022-01-22T21:14:39.406 app[3bd6b315] sin [info]OpenJDK 64-Bit Server VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.

and the health check going critical at 21:15:26.335:
2022-01-22T21:15:26.335 proxy[3bd6b315] sin [error]Health check status changed 'warning' => 'critical'
2022-01-22T21:15:26.335 proxy[3bd6b315] sin [error]Health check status changed 'warning' => 'critical'

My container starts at 21:14:39 (which I take as the time of the first line of log output) and the health check’s first failure occurs at 21:15:26, which is only 47 seconds later.
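To double-check the arithmetic, here is a small sketch that computes the gaps from the timestamps in the log lines above. The GraceWindow class name is made up for illustration. Measured from the first app log line the gap is about 47 seconds, and measured from "Starting instance" it is about 63 seconds, so either way the first critical status lands well inside a 120s window.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper, for illustration only.
final class GraceWindow {
    // Elapsed time between two same-day log timestamps in HH:mm:ss.SSS form.
    static Duration between(String start, String end) {
        return Duration.between(LocalTime.parse(start), LocalTime.parse(end));
    }

    public static void main(String[] args) {
        // Timestamps copied from the fly logs output quoted in this post.
        Duration sinceInstanceStart = between("21:14:23.434", "21:15:26.335");
        Duration sinceFirstAppLog = between("21:14:39.359", "21:15:26.335");

        System.out.println(sinceInstanceStart.toMillis() + " ms since 'Starting instance'");
        System.out.println(sinceFirstAppLog.toMillis() + " ms since the first app log line");
    }
}
```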

So the side problem is actually two-fold. Why is the configured 120s grace period not being respected, and why, with a dedicated CPU and 2 GB of RAM, is this service taking over three times as long to start as it did on Heroku, despite having access to far more resources? (I was using a standard 2x on Heroku: 1 GB of RAM and a shared CPU.)

Hi @4lexNZ

Are you seeing any entries in your logs for the http requests you make?

If not, then your app might only be listening on the local loopback IP address (127.0.0.1) instead of all IP addresses (0.0.0.0).
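To illustrate the difference, here is a minimal sketch with plain Java sockets. The BindCheck class and listenOn method are made up for this example; in a Spring Boot app the bind address is normally controlled by the server.address property rather than by code like this.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, for illustration only.
final class BindCheck {
    // Opens a listening socket on the given host and an ephemeral port (0 = any free port).
    static ServerSocket listenOn(String host) throws IOException {
        ServerSocket socket = new ServerSocket();
        socket.bind(new InetSocketAddress(InetAddress.getByName(host), 0));
        return socket;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket all = listenOn("0.0.0.0");
             ServerSocket local = listenOn("127.0.0.1")) {
            // The wildcard address accepts connections arriving on any interface,
            // which is what a proxy sitting outside the container needs.
            System.out.println("0.0.0.0 is wildcard: " + all.getInetAddress().isAnyLocalAddress());
            // The loopback address only accepts connections from inside the same machine.
            System.out.println("127.0.0.1 is loopback: " + local.getInetAddress().isLoopbackAddress());
        }
    }
}
```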

Hey! Thanks for the suggestion. My service is already listening on loopback, I think I might have edited my posts right as you were typing. My issue has actually changed. I’m now getting an error when I try to access the application externally, and I’m not sure of the cause: Error: Exceeded maxRedirects. Probably stuck in a redirect loop https://valuable-api.fly.dev/actuator/health
I have some java code:

/*
 * This configuration tells Spring to redirect all plain HTTP requests back to the same URL using HTTPS if the
 * X-Forwarded-Proto header is present. Heroku sets the X-Forwarded-Proto header for you, which means the request
 * will be redirected back through the Heroku router where SSL is terminated. In your localhost environment,
 * you can continue to use plain HTTP.
 */
http.requiresChannel()
        .requestMatchers(r -> r.getHeader("X-Forwarded-Proto") != null)
        .requiresSecure();

That handles the X-Forwarded-Proto header. From the Fly docs it seems that Heroku and Fly handle this the same way, so it shouldn’t be an issue. I’ve also tested with the code removed and saw no change. That’s the only thing I can think of.

I can see that the healthchecks being performed by Fly are making it into an external tool, I know they work. I’ve ssh’d into that container and confirmed they work locally also. When I make the request from outside (i.e. Postman), I get the maxRedirects - but do not see any logging or indication that the service has received the request. I’ll turn up logging and see if that reveals anything and update this message once I’ve done that.
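One way to see what is actually answering is to send a single request without following redirects and look at the status code and Location header. A rough sketch using the JDK's built-in java.net.http client (Java 11+); the RedirectProbe class name is made up for illustration, and the URL is passed as an argument.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, for illustration only.
final class RedirectProbe {
    // Sends one GET without following redirects and reports the status plus any Location header.
    static String probe(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
        String location = response.headers().firstValue("Location").orElse("(none)");
        return response.statusCode() + " Location: " + location;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java RedirectProbe <url>");
            return;
        }
        System.out.println(probe(args[0]));
    }
}
```

If the Location header points back at the same https:// URL that was requested, the loop is coming from whatever produced that response rather than from the client.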

This is most likely because your health check is responding with a redirect to https://

Try just removing the health check entirely, this will make it default to a TCP connection check.

If that works, you’ll need to modify your app to accept a health check URL that doesn’t perform a redirect. Health checks have to return a 200 response to count.
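As a rough illustration of the scoping problem, here is a sketch written as plain Java predicates so it runs without Spring. It is not the code from this app: the HttpsRedirectRule class and its method names are made up, and it assumes the app sees every proxied request as plain HTTP, which is what would make a "header is present" check redirect requests that already arrived over HTTPS.

```java
// Hypothetical model of the redirect decision, for illustration only.
final class HttpsRedirectRule {
    // Health check path taken from the fly.toml in this thread.
    static final String HEALTH_PATH = "/actuator/health";

    // Broad scoping: any request carrying X-Forwarded-Proto is sent to HTTPS,
    // including requests whose header already says "https".
    static boolean broadRule(String forwardedProto) {
        return forwardedProto != null;
    }

    // Narrower scoping: redirect only when the proxy reports plain HTTP,
    // and never redirect the health check path.
    static boolean narrowedRule(String forwardedProto, String path) {
        if (HEALTH_PATH.equals(path)) {
            return false;
        }
        return "http".equalsIgnoreCase(forwardedProto);
    }

    public static void main(String[] args) {
        System.out.println("broad, proto=https: " + broadRule("https"));
        System.out.println("narrowed, proto=https, /api: " + narrowedRule("https", "/api"));
        System.out.println("narrowed, proto=http, /api: " + narrowedRule("http", "/api"));
        System.out.println("narrowed, proto=http, health: " + narrowedRule("http", HEALTH_PATH));
    }
}
```

Under that assumption, the broad rule keeps redirecting a request that is already HTTPS, while the narrowed rule lets it through and leaves the health check path alone.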

Thanks Kurt, you were right. I was able to prove that the culprit was a poorly scoped if statement in my security settings. That being said, I don’t want the API to be accessible over unsecured connections, so I decided to remove the offending code and also remove the HTTP listener on 80. Now my system is running and just listens on 443, which is fantastic!

And performance is way better coming from Sydney compared to Heroku coming from Virginia!
Thanks for all the help!