Machine restart loop instead of Wake-on-Request

Greetings!

I’ve created a super simple node server which automatically shuts itself down after a time of inactivity. The machine then restarts on requests. This worked fine for a while, with the app being paused/off most of the time.

Today when I looked it seems to have been running for a couple of days with zero downtime, and the monitor shows that the machine restarts the second it shuts down, being stuck in an infinite restart loop.

Is this a bug, or is there some setting I have missed in the command-line interface?

Closing server due to inactivity
Starting clean up.
Umounting /dev/vdb from /opt/app/data
reboot: Restarting system

I could also imagine it being caused by something like bot traffic, which would be unfortunate.

Hey there!

Can you let us know how you created this app? (Machines API, via flyctl’s fly launch or whatever you used. Do you have a fly.toml?)

Your grafana graphs should show HTTP responses from your application that the proxy received after forwarding inbound requests (scroll down and expand it to see a count of HTTP responses). That should help correlate requests turning on the application.

Assuming the application is a web app responding to HTTP requests, does it output any logs per request? The logs for the app should show some hints on what’s going on, unless the app is “silent” (not programed to log requests it receives).

The app was created using flyctl and flyctl machine run:

flyctl machine run . -p 443:8080/tcp:tls:http -p 80:8080/tcp:http
-v data:/opt/app/data --region “$region” --cpus 1 --memory 256 -a “$app_name”

According to the grafana graphs there are few http responses. Ie, the machine restarts do not correlate to http responses:

As far as I know Express.js and Node doesn’t log anything extra by default. The only thing I log is when the server has an error or shuts down.

This is what my log looks like everywhere:

2023-03-18T14:28:07.197 app[x] cdg [info] Preparing to run: `npm start` as node
2023-03-18T14:28:07.218 app[x] cdg [info] 2023/03/18 14:28:07 listening on [y]:22 (DNS: [z]:53)
2023-03-18T14:28:08.283 app[x] cdg [info] > Tileset Explorer@3.0 start
2023-03-18T14:28:08.283 app[x] cdg [info] > node server.js
2023-03-18T14:28:08.704 app[x] cdg [info] Server listening on port 8080
2023-03-18T14:28:08.731 app[x] cdg [info] Opened SQLite3 database
2023-03-18T14:33:08.749 app[x] cdg [info] Closing server due to inactivity
2023-03-18T14:33:09.574 app[x] cdg [info] Starting clean up.
2023-03-18T14:33:09.575 app[x] cdg [info] Umounting /dev/vdb from /opt/app/data
2023-03-18T14:33:10.578 app[x] cdg [info] [ 303.501677] reboot: Restarting system
2023-03-18T14:33:11.778 app[x] cdg [info] Starting init (commit: 08b4c2b)...
2023-03-18T14:33:11.800 app[x] cdg [info] Mounting /dev/vdb at /opt/app/data w/ uid: 1000, gid: 1000 and chmod 0755
2023-03-18T14:33:11.808 app[x] cdg [info] Preparing to run: `npm start` as node
2023-03-18T14:33:11.828 app[x] cdg [info] 2023/03/18 14:33:11 listening on [y]:22 (DNS: [z]:53)
2023-03-18T14:33:12.885 app[x] cdg [info] > Tileset Explorer@3.0 start
2023-03-18T14:33:12.885 app[x] cdg [info] > node server.js
2023-03-18T14:33:13.185 app[x] cdg [info] Server listening on port 8080
2023-03-18T14:33:13.211 app[x] cdg [info] Opened SQLite3 database
2023-03-18T14:38:13.219 app[x] cdg [info] Closing server due to inactivity
2023-03-18T14:38:14.202 app[x] cdg [info] Starting clean up.
2023-03-18T14:38:14.202 app[x] cdg [info] Umounting /dev/vdb from /opt/app/data
2023-03-18T14:38:15.205 app[x] cdg [info] [ 303.517513] reboot: Restarting system
2023-03-18T14:38:16.416 app[x] cdg [info] Starting init (commit: 08b4c2b)...
2023-03-18T14:38:16.436 app[x] cdg [info] Mounting /dev/vdb at /opt/app/data w/ uid: 1000, gid: 1000 and chmod 0755
2023-03-18T14:38:16.445 app[x] cdg [info] Preparing to run: `npm start` as node
2023-03-18T14:38:16.467 app[x] cdg [info] 2023/03/18 14:38:16 listening on [y]:22 (DNS: [z]:53)
2023-03-18T14:38:17.542 app[x] cdg [info] > Tileset Explorer@3.0 start
2023-03-18T14:38:17.542 app[x] cdg [info] > node server.js
2023-03-18T14:38:17.835 app[x] cdg [info] Server listening on port 8080
2023-03-18T14:38:17.860 app[x] cdg [info] Opened SQLite3 database
2023-03-18T14:43:17.866 app[x] cdg [info] Closing server due to inactivity
2023-03-18T14:43:18.843 app[x] cdg [info] Starting clean up.
2023-03-18T14:43:18.843 app[x] cdg [info] Umounting /dev/vdb from /opt/app/data
2023-03-18T14:43:19.845 app[x] cdg [info] [ 303.507952] reboot: Restarting system
2023-03-18T14:43:21.035 app[x] cdg [info] Starting init (commit: 08b4c2b)...
2023-03-18T14:43:21.055 app[x] cdg [info] Mounting /dev/vdb at /opt/app/data w/ uid: 1000, gid: 1000 and chmod 0755
2023-03-18T14:43:21.063 app[x] cdg [info] Preparing to run: `npm start` as node
2023-03-18T14:43:21.084 app[x] cdg [info] 2023/03/18 14:43:21 listening on [y]:22 (DNS: [z]:53)
2023-03-18T14:43:22.122 app[x] cdg [info] > Tileset Explorer@3.0 start
2023-03-18T14:43:22.122 app[x] cdg [info] > node server.js
2023-03-18T14:43:22.403 app[x] cdg [info] Server listening on port 8080
2023-03-18T14:43:22.428 app[x] cdg [info] Opened SQLite3 database

What do the events show from the output of fly machine status?

I have re-created the app a few times, but the problem still persists. Here’s the output:

Machine ID: R
Instance ID: T
State: started

VM
  ID            = X                                          
  Instance ID   = Y                              
  State         = started                                                 
  Image         = tileset-explorer:deployment-Z  
  Name          = white-sky-7396                                          
  Private IP    = W                        
  Region        = cdg                                                     
  Process Group =                                                         
  CPU Kind      = shared                                                  
  vCPUs         = 1                                                       
  Memory        = 256                                                     
  Created       = 2023-03-19T11:11:33Z                                    
  Updated       = 2023-03-20T17:30:18Z                                    
  Command       =                                                         
  Volume        = V                                    

Event Logs
STATE   	EVENT	SOURCE	TIMESTAMP                    	INFO                                              
started 	start	flyd  	2023-03-20T19:30:18.807+02:00	
starting	start	flyd  	2023-03-20T19:30:18.44+02:00 	
stopped 	exit 	flyd  	2023-03-20T19:30:18.339+02:00	exit_code=0,oom_killed=false,requested_stop=false	
started 	start	flyd  	2023-03-20T19:25:14.217+02:00	
starting	start	flyd  	2023-03-20T19:25:13.779+02:00
1 Like

I think the restart policy on your machine may be set to always. In that case, flyd will attempt to restart your machine even when your app exits successfully (which is what’s happening, per the exit_code in the status output you posted).

Could you try setting your machine’s restart policy to on-fail or no (depending on the behavior you want)? E.g. flyctl machine update <machine id> --restart no. If it shows a pending change from always, that might be it.

(There have been some changes in flyctl having to do with machine restart policies in the last two or three weeks. In particular, we fixed a bug in which flyctl machine update would always set the restart policy to always, overwriting any previous setting. I wonder if you were bitten by this.)

Thank you, that did it.

So it was the default settings which changed. I didn’t even know you could configure the restart policy.

Glad it’s working now!

Out of curiosity, are there any wake-on-request docs (or other machine docs) that you’ve been referencing while working on this? If anything is out of date or misleading, I’d like to look into it.

I have been following the tutorial on how to setup a wake-on-request machine using the tired proxy. Since the only thing that proxy does is shutting down the server when idle, I just implemented something similar in node. Those docs do not mention having to configure the machine to use wake-on-request.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.