In both apps, the secondary region instance starts semi-regularly (roughly every 30 to 90 minutes), stays running for about 5-10 minutes, and is then downscaled and shuts down. I previously posted here about one of the apps, but I’ve just noticed that another app is showing the same behavior.
On the Grafana “Memory Utilization” graph for each app, I can see the primary instance as a steady line, while the secondary instances show their repeated starts and stops. The “App Concurrency” graph doesn’t match these restarts at all: for both apps there may be 10 or 20 restarts of the secondary instance in a 24-hour period, but concurrency shows 0, or occasionally 1.
Any way I can determine what’s causing these restarts of the secondary instances?
As I understand it, scaling isn’t about sending all traffic to one instance until the soft limit is hit; it also sends traffic to closer nodes, in favor of giving users an app with reduced latency.
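For reference, the soft limit I mean is the concurrency setting in fly.toml. A rough sketch (the numbers are just placeholders, not a recommendation):

```toml
# fly.toml (sketch) – the concurrency limits the proxy uses when load balancing
[http_service.concurrency]
  type       = "requests"   # count in-flight requests (could also be "connections")
  soft_limit = 25           # above this, the proxy prefers to send new traffic elsewhere
  hard_limit = 50           # above this, the proxy stops sending traffic to the Machine
```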
What port are you on?
In my experience, anything on a common port gets scanned by the internet at large. Within days of putting an app on Fly.io, I see traffic from someone looking for WP Admin or other commonly exploited routes.
I had been looking into this (but couldn’t post here since it’s locked).
I’m still checking into it further, but I think @Zane_Milakovic is correct about the closest region. So if your stopped Machine is in the closest region to a user, then a request could get routed to it even though it’s stopped and the other Machine is below the soft_limit. This behaviour is where load balancing intersects with the auto start and stop feature. (Hoping to update docs for this soon!)
But if this Machine is restarting on a very regular schedule that doesn’t match with incoming requests, then something else might be going on.
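For anyone following along, these are roughly the settings where that intersection happens. A sketch of a typical fly.toml (exact keys and values will vary with your config and flyctl version):

```toml
# fly.toml (sketch) – auto start/stop settings the proxy applies alongside load balancing
[http_service]
  internal_port        = 8080   # placeholder; whatever your app listens on
  auto_stop_machines   = true   # proxy stops Machines when there's excess capacity
  auto_start_machines  = true   # proxy starts a stopped Machine when it routes a request to it
  min_machines_running = 1      # keep at least one Machine running in the primary region
```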
Thanks, that was my misunderstanding. I had intended for one instance to just work as a backup in case the primary region goes down, or for the unlikely chance of too much traffic. Is there a (relatively simple) way to set things up like that?
The typesense app is on 443, so it also makes sense that it would occasionally be triggered by bots, etc.
The other app is Umami (website analytics), and it’s on 3000. As an analytics app, it receives a hit (with an authentication header) any time the website it’s tracking is hit. The website Umami tracks is hosted as a serverless function on Vercel’s “hobby” plan, and its primary location (San Francisco) is close to the primary locations of all of my Fly apps (Los Angeles). I couldn’t find anything about Vercel using a fallback region at my plan level, so it does seem a bit strange that an Umami instance that’s farther away (Toronto) would be triggered by hits on the website.
Edit: actually, Umami has a web-based management UI served over HTTPS, so that’s likely what’s getting hit by bots from various locations.
I’ll look into logging requests if I can find an easy-to-implement way, but at this point it’s more a matter of being curious (and a bit over-obsessive) than a cost issue.
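The simplest thing I can think of so far is just tailing the Fly logs for the secondary region the next time it wakes up, something like this (app name made up, and I’m not sure how much request detail Umami itself logs):

```shell
# watch what the Toronto Machine is doing around the time it starts
fly logs -a my-umami-app -r yyz
```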
Thanks. It would be interesting to learn the ins and outs of that intersection!
Aside from this particular issue, I pretty much have everything working with my cheapo typesense setup: a primary typesense instance and a secondary backup, each running as a single-node typesense instance, both updated by another Fly app that is triggered by (1) the typesense Machine starting and (2) a GitHub Action on website deploy and on a cron schedule. I don’t know if that would be at all useful to anyone but me, but I’ll try doing a write-up!