Hello,
When I try to deploy using the blue-green strategy, I get the following error:
-------
✔ release_command 1857434c1406e8 completed successfully
-------
-------
Updating existing machines in 'prod' with bluegreen strategy
Verifying if app can be safely deployed
Creating green machines
Created machine 7811e4eb593e98 [app]
Created machine d8d9242a2e6328 [app]
Waiting for all green machines to start
Machine 7811e4eb593e98 [app] - started
Machine d8d9242a2e6328 [app] - started
Waiting for all green machines to be healthy
Machine 7811e4eb593e98 [app] - 0/1 passing
Machine d8d9242a2e6328 [app] - 0/1 passing
Deployment failed after error: could not get all green machines to be healthy: wait timeout
Rolling back failed deployment
Checking DNS configuration for prod.fly.dev
Error: could not get all green machines to be healthy: wait timeout
Your machine never reached the state "%s".
You can try increasing the timeout with the --wait-timeout flag
Here is my fly.toml configuration:
app = "prod"
primary_region = "fra"
kill_signal = "SIGTERM"

[build]

[deploy]
  release_command = "/app/bin/migrate"
  strategy = "bluegreen"

[env]
  DNS_CLUSTER_QUERY = "prod.internal"
  PHX_HOST = "website.url"
  PORT = "8080"
  PRIMARY_REGION = "fra"
  RELEASE_COOKIE = "cookie"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = false
  auto_start_machines = true
  min_machines_running = 2
  processes = ["app"]

  [http_service.concurrency]
    type = "connections"
    hard_limit = 1000
    soft_limit = 1000

  [[http_service.checks]]
    interval = "5s"
    grace_period = "20s"
    method = "GET"
    path = "/check"
    protocol = "http"
    port = 8080
    timeout = "5s"
Any idea why it is failing?
When I run fly checks list, it shows:
Health Checks for prod
NAME | STATUS | MACHINE | LAST UPDATED | OUTPUT
----------------------------*----------*----------------*--------------*------------------------------
bg_deployments_http | critical | 7811e4eb593e98 | 4m40s ago | connect: connection refused
----------------------------*----------*----------------*--------------*------------------------------
servicecheck-00-http-8080 | warning | 7811e4eb593e98 | 4m44s ago | waiting for status update
----------------------------*----------*----------------*--------------*------------------------------
bg_deployments_http | critical | d8d9242a2e6328 | 4m43s ago | connect: connection refused
----------------------------*----------*----------------*--------------*------------------------------
servicecheck-00-http-8080 | warning | d8d9242a2e6328 | 3m58s ago | waiting for status update
----------------------------*----------*----------------*--------------*------------------------------
Also, if I check the logs of the started machines, they seem to be up and running:
fly logs -a prod -i e286013f95d5d8
Waiting for logs...
2024-05-02T12:29:32.100 runner[e286013f95d5d8] fra [info] Pulling container image registry.fly.io/prod:deployment-01HWWMFMGSQT7G2VRQZYDNXWDY
2024-05-02T12:29:33.199 runner[e286013f95d5d8] fra [info] Successfully prepared image registry.fly.io/prod:deployment-01HWWMFMGSQT7G2VRQZYDNXWDY (1.098938948s)
2024-05-02T12:29:34.022 runner[e286013f95d5d8] fra [info] Configuring firecracker
2024-05-02T12:29:34.552 app[e286013f95d5d8] fra [info] [ 0.156665] PCI: Fatal: No config space access function found
2024-05-02T12:29:34.800 app[e286013f95d5d8] fra [info] INFO Starting init (commit: c1e2693b)...
2024-05-02T12:29:34.868 app[e286013f95d5d8] fra [info] INFO Preparing to run: `/app/bin/server` as nobody
2024-05-02T12:29:34.881 app[e286013f95d5d8] fra [info] INFO [fly api proxy] listening at /.fly/api
2024-05-02T12:29:34.901 app[e286013f95d5d8] fra [info] 2024/05/02 12:29:34 INFO SSH listening listen_address=[fdaa:2:be18:a7b:caca:5ed2:149a:2]:22 dns_server=[fdaa::3]:53
2024-05-02T12:29:34.935 runner[e286013f95d5d8] fra [info] Machine created and started in 2.998s
2024-05-02T12:29:39.266 app[e286013f95d5d8] fra [info] 12:29:39.265 [info] no parent found, :ignore
2024-05-02T12:29:39.368 app[e286013f95d5d8] fra [info] 12:29:39.368 [info] Oban running in primary region. Activated.
2024-05-02T12:29:39.370 app[e286013f95d5d8] fra [info] 12:29:39.369 [info] Detected running on primary. No local replication to track.
2024-05-02T12:29:39.376 app[e286013f95d5d8] fra [info] 12:29:39.376 [info] Running NexusWeb.Endpoint with cowboy 2.10.0 at :::8080 (http)
2024-05-02T12:29:39.389 app[e286013f95d5d8] fra [info] 12:29:39.384 [info] Access NexusWeb.Endpoint at https://website.url
2024-05-02T12:29:39.390 app[e286013f95d5d8] fra [info] 12:29:39.389 [info] Discovered node :"prod-01HWWMFMGSQT7G2VRQZYDNXWDY@fdaa:2:be18:a7b:caca:14b:e986:2" in region fra
2024-05-02T12:29:39.886 app[e286013f95d5d8] fra [info] WARN Reaped child process with pid: 374 and signal: SIGUSR1, core dumped? false
2024-05-02T12:29:42.913 app[e286013f95d5d8] fra [info] 12:29:42.912 [info] tzdata release in place is from a file last modified Fri, 22 Oct 2021 02:20:47 GMT. Release file on server was last modified Thu, 01 Feb 2024 18:40:48 GMT.
2024-05-02T12:29:44.325 app[e286013f95d5d8] fra [info] 12:29:44.325 [info] Tzdata has updated the release from 2021e to 2024a
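To narrow down the "connection refused" result, here is a minimal sketch of a probe I could run from inside one of the machines (for example over fly ssh console). It sends the same GET /check request to port 8080 that the health check in my fly.toml uses, once over IPv4 loopback and once over IPv6 loopback, and reports whether the connection was refused or which HTTP status came back. The function name probe and the choice of loopback addresses are my own illustrative assumptions, not part of Fly's tooling, and it assumes Python 3 is available in the image, which an Elixir release image normally does not include.

```python
import http.client


def probe(host: str, port: int, path: str = "/check", timeout: float = 5.0) -> dict:
    """Send one HTTP GET to host:port and describe the outcome.

    `probe` is a hypothetical helper for illustration only. The default
    path and the 5 second timeout mirror the check settings in fly.toml.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        ok = 200 <= resp.status < 300
        return {"host": host, "ok": ok, "detail": f"HTTP {resp.status}"}
    except ConnectionRefusedError:
        # Nothing is accepting connections on this address and port.
        return {"host": host, "ok": False, "detail": "connection refused"}
    except (OSError, http.client.HTTPException) as exc:
        # Timeouts, unreachable address families, malformed responses, etc.
        return {"host": host, "ok": False, "detail": f"{type(exc).__name__}: {exc}"}
    finally:
        conn.close()


if __name__ == "__main__":
    # Assumption: the app is expected on port 8080, as set by PORT in fly.toml.
    for host in ("127.0.0.1", "::1"):
        print(probe(host, 8080))
```

If one address family answers and the other is refused, that would suggest the problem is with the address the server binds to rather than with the /check route itself.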
Any advice would be much appreciated. Thanks.