I’m a few days into fly.io. I’ve got my app working, but when I try to deploy now, it gets killed just as it’s starting up, with no hint as to why.
2022-11-01T19:23:29.537 runner[fe5df107] lhr [info] Starting instance
2022-11-01T19:23:32.598 runner[fe5df107] lhr [info] Configuring virtual machine
2022-11-01T19:23:32.603 runner[fe5df107] lhr [info] Pulling container image
2022-11-01T19:33:20.352 runner[fe5df107] lhr [info] Unpacking image
2022-11-01T19:33:30.630 runner[fe5df107] lhr [info] Preparing kernel init
2022-11-01T19:33:31.075 runner[fe5df107] lhr [info] Configuring firecracker
2022-11-01T19:33:31.184 runner[fe5df107] lhr [info] Starting virtual machine
2022-11-01T19:33:31.577 app[fe5df107] lhr [info] Starting init (commit: ce4cf1b)...
2022-11-01T19:33:31.617 app[fe5df107] lhr [info] Preparing to run: `/usr/local/bin/mvn-entrypoint.sh bash ./fly-run.sh` as root
2022-11-01T19:33:31.651 app[fe5df107] lhr [info] 2022/11/01 19:33:31 listening on [fdaa:0:1989:a7b:bada:fe5d:f107:2]:22 (DNS: [fdaa::3]:53)
...
2022-11-01T19:33:35.192 app[fe5df107] lhr [info] :: Spring Boot :: (v2.5.5)
2022-11-01T19:33:35.438 app[fe5df107] lhr [info] 2022-11-01 19:33:35.435 INFO 541 --- [ main] o.c.m.CrossrefManifoldApplicationKt : Starting CrossrefManifoldApplicationKt v0.0.1-SNAPSHOT using Java 11.0.16 on fe5df107 with PID 541 (/target/manifold.jar started by root in /)
...
2022-11-01T19:33:42.082 app[fe5df107] lhr [info] 2022-11-01 19:33:42.081 INFO 541 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port(s): 8080 (http)
...
2022-11-01T19:33:46.636 app[fe5df107] lhr [info] 2022-11-01 19:33:46.632 INFO 541 --- [ main] (normal application log stuff)
2022-11-01T19:33:46.683 runner[fe5df107] lhr [info] Shutting down virtual machine
2022-11-01T19:33:46.903 app[fe5df107] lhr [info] Sending signal SIGINT to main child process w/ PID 521
2022-11-01T19:33:47.220 app[fe5df107] lhr [info] 2022-11-01 19:33:47.212 INFO 541 --- [ main] (normal application log stuff)
2022-11-01T19:33:47.537 app[fe5df107] lhr [info] 2022-11-01 19:33:47.533 INFO 541 --- [ main] (normal application log stuff)
So my app takes under a minute to boot, there is no report of health check failures, and it is sent a SIGINT by the runner.
I’ve set my health checks to be very liberal for debugging, so I don’t even expect a health check in these first few seconds. I’ve also configured the port correctly.
[[services.tcp_checks]]
grace_period = "120s"
interval = "15s"
restart_limit = 10
timeout = "2s"
I noticed there was a ten-minute gap between “Pulling container image” and “Unpacking image”, though after that the app appeared to boot happily until it was killed.
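For what it's worth, the timings can be read straight off the runner log with a few lines of Python. This is only a rough sketch: it assumes the log format shown above (timestamp first, message after `[info] `), and `parse` and `gaps` are made-up helper names, not part of any Fly tooling.

```python
from datetime import datetime

# Runner lines copied from the log above (trimmed to the relevant events).
LOG = """\
2022-11-01T19:23:32.603 runner[fe5df107] lhr [info] Pulling container image
2022-11-01T19:33:20.352 runner[fe5df107] lhr [info] Unpacking image
2022-11-01T19:33:31.184 runner[fe5df107] lhr [info] Starting virtual machine
2022-11-01T19:33:46.683 runner[fe5df107] lhr [info] Shutting down virtual machine
"""


def parse(log):
    """Return (timestamp, message) pairs; message is the text after '[info] '."""
    events = []
    for line in log.splitlines():
        stamp, _, rest = line.partition(" ")
        message = rest.split("[info] ", 1)[-1]
        events.append((datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f"), message))
    return events


def gaps(events):
    """Seconds elapsed between each pair of consecutive events."""
    return [
        (prev_msg, next_msg, (next_ts - prev_ts).total_seconds())
        for (prev_ts, prev_msg), (next_ts, next_msg) in zip(events, events[1:])
    ]


for before, after, seconds in gaps(parse(LOG)):
    print(f"{seconds:8.3f}s  {before} -> {after}")
```

On the lines above, the image pull accounts for about 588 seconds (just under ten minutes), and the VM runs for about 15.5 seconds between "Starting virtual machine" and "Shutting down virtual machine".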
Ideas?
Here are some other questions about SIGINT:
Virtual machine repeatedly shutting down → I’ve configured the grace period
Is SIGINT an issue with my app or an issue with fly.io? → I’ve verified the internal port.
I’ve tried to reboot it with various tweaks a few times now (up to version 30!).
I’ve noticed that the shutdown happens in a specific place each time. Always less than a minute in (just after it succeeds in running database migrations), and just before it’s able to start the HTTP server. And, because the server takes a second or two to respond to the SIGINT, it continues as normal. So it’s not the app crashing.
Hard to tell what’s going on without code, but is it possible that allocated VM resources aren’t enough to run that Java app (also: 1, 2)? Try scaling up, if you haven’t already, to see if things then run as expected?
# 1G RAM
fly scale vm memory 1024 -a <app-name>
Additionally, you may also want to give existing JVM flags a cursory look.
Thanks! It’s got 4 GB allocated and uses about 300 MB of that. I’ve checked the resource usage graph, and there are no bumps.
The actual code is a relatively simple Spring Boot app. It does have some features, but I don’t think it gets as far as actually doing any work, as it’s killed a few seconds in.
Thanks for the tip, I’ll try with -Xmx (maximum heap) but I don’t believe the runtime will have cause to touch 4GB.
What does fly status --all -a <app-name> tell you about health checks / statuses of deployed instances (VMs)?
For example, here’s the status of my Node.js app:
➜ fly status --all -a ____
App
Name = ____
Owner = ____
Version = 435
Status = running
Hostname = ____.fly.dev
Platform = nomad
Instances
ID PROCESS VERSION REGION DESIRED STATUS HEALTH CHECKS RESTARTS CREATED
d3adbeef app 435 ⇡ aws run running 2 total, 2 passing 0 2022-10-23T07:38:18Z
deadb33f app 425 aws evict failed 2 total, 2 critical 0 2022-08-18T13:25:20Z
Btw, folks at codecentric.de wrote quite a nice post about getting Spring up and running on Fly that may have a pointer or two, in case you’ve not read it already.
BTW, I set -Xmx to a 3 GB heap (out of 4 GB). It happened again.
App
Name = manifold
Owner = crossref
Version = 33
Status = running
Hostname = manifold.fly.dev
Platform = nomad
Deployment Status
ID = 52df1e8d-246d-3799-4654-7f2d3f3dca5c
Version = v33
Status = successful
Description = Deployment completed successfully
Instances = 1 desired, 1 placed, 1 healthy, 0 unhealthy
Instances
ID PROCESS VERSION REGION DESIRED STATUS HEALTH CHECKS RESTARTS CREATED
d6d7c16c app 33 ⇡ lhr run running 1 total, 1 passing 0 10m15s ago
0430276e app 32 lhr stop complete 1 total, 1 passing 0 21m19s ago
3e495ee7 app 31 lhr stop complete 1 total, 1 passing 0 34m15s ago
701986eb app 30 lhr stop complete 1 total, 1 passing 0 18h54m ago
5b1ad98f app 12 lhr run failed 1 total, 1 critical 2 2022-10-31T20:10:06Z
95cc646f app 11 lhr run failed 1 total, 1 critical 2 2022-10-31T18:00:26Z
3ddd8c06 app 10 lhr stop failed 1 total 2 2022-10-31T17:39:41Z
93d5a92d app 9 lhr run failed 1 total 2 2022-10-30T16:17:14Z
a2810e03 app 8 lhr stop failed 1 total, 1 critical 2 2022-10-30T15:25:37Z
7117ade5 app 7 lhr stop failed 1 total, 1 critical 2 2022-10-30T14:48:42Z
b34962a7 app 6 lhr stop failed 2 2022-10-30T14:35:42Z
66ef98a9 app 5 lhr stop failed 1 total 2 2022-10-30T14:19:13Z
af0b4a0a app 4 lhr stop failed 1 total, 1 critical 2 2022-10-30T14:00:08Z
be176fd3 app 3 lhr stop failed 1 total 2 2022-10-30T13:52:20Z
5761a68b app 2 lhr run failed 0 2022-10-30T09:27:52Z
6ece49ce app 1 lhr run failed 1 total, 1 critical 2 2022-10-29T21:24:10Z
d6b550bf app 0 lhr stop failed 1 total, 1 critical 2 2022-10-29T21:15:59Z
So I see some critical health checks.