Hi! I’ve built www.trackmypilot.com and host it on Fly. I have 2 worker machines, 4 web machines across 2 regions (dfw, sjc), and an unmanaged Postgres cluster with 3 in dfw and 1 backup in sjc. Here are the results of the status commands:
fly status
App
Name = track-my-pilot
Owner = personal
Hostname = track-my-pilot.fly.dev
Image = track-my-pilot:deployment-01KJN7DHSN3KFS4VV6TW7035WZ
Machines
PROCESS  ID              VERSION  REGION  STATE    ROLE  CHECKS  LAST UPDATED
web      287d674a003d08  237      sjc     started                2026-03-01T17:42:57Z
web      48e6393b7e0508  237      sjc     stopped                2026-03-01T17:35:06Z
web      9185507db91183  237      dfw     stopped                2026-03-01T17:35:04Z
web      e2861366ad0686  237      dfw     started                2026-03-01T17:35:08Z
worker   17810766b3de89  237      dfw     started                2026-03-01T17:35:07Z
worker†  6e823745cd0987  237      dfw     stopped                2026-03-01T17:35:04Z
Notes:
† Standby machine (it will take over only in case of host hardware failure)
For the last 3 days, performance has been intermittently very slow. Sometimes pages load quickly; other times a single page takes 10 seconds to load. The logs for the app and the database show nothing unusual. Local testing and a look at the database queries show nothing abnormal either (2-3 queries per page load). Everything is built with Python / Django.
Where should I even start to look for the issue, or how can I attempt to monitor the site better?
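One thing I could try as a first step is logging how long each request spends inside Django, tagged with the machine that served it, so slow page loads can be matched to a region or machine. Below is a minimal sketch, not something I'm running yet. The class name, logger name, and the 1-second threshold are placeholders I made up, and it assumes the FLY_REGION and FLY_MACHINE_ID environment variables are set on the machines (it falls back to "local" otherwise).

```python
import logging
import os
import time

logger = logging.getLogger("request_timing")  # hypothetical logger name

# Hypothetical threshold: requests at or above this are logged as warnings.
SLOW_REQUEST_SECONDS = 1.0


class RequestTimingMiddleware:
    """Django-style middleware that logs the time spent handling each request."""

    def __init__(self, get_response):
        self.get_response = get_response
        # Assumed to be provided by the Fly runtime; "local" when not set.
        self.region = os.environ.get("FLY_REGION", "local")
        self.machine_id = os.environ.get("FLY_MACHINE_ID", "local")

    def __call__(self, request):
        start = time.perf_counter()
        response = self.get_response(request)
        elapsed = time.perf_counter() - start

        # Slow requests stand out at WARNING level; the rest go to INFO.
        level = logging.WARNING if elapsed >= SLOW_REQUEST_SECONDS else logging.INFO
        logger.log(
            level,
            "%s %s -> %s in %.3fs (region=%s machine=%s)",
            request.method,
            request.path,
            response.status_code,
            elapsed,
            self.region,
            self.machine_id,
        )
        return response
```

To use it, the class would go at the top of MIDDLEWARE in settings.py so it wraps everything else, for example as "myproject.middleware.RequestTimingMiddleware" (a hypothetical module path). It only measures time spent inside Django, so if these logs show fast responses while the browser still waits 10 seconds, the delay is presumably happening before the request reaches the app.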