I have an app that hasn’t shown metrics in ams since 2022-04-19 0000 UTC. I checked another app I have that also has a region in ams, and it isn’t currently showing metrics either (but I just added them to the app config). Other regions are fine in both apps. Both my apps’ metrics endpoints respond, and both apps are otherwise running fine.
Hmm, that’s interesting. It doesn’t look like this is ams-wide, at least, judging both from the status page and from a quick test deployment, where I was able to view the built-in metrics with `fly dashboard metrics --app $app-name`.
Have you been able to see metrics displayed by your second app, yet?
A couple of things you might want to check, off the bat:
- you can view all your app’s instances with `fly status --app $app-name`, and view more detailed per-instance logs with `fly logs`
- you can view your app’s logs in a particular region with `flyctl logs --region ams --app $app-name`
- since you mentioned adding them to your app’s configs, are you using any custom metrics?
Hi @eli . Yes metrics come up for the non-ams instances of the second app. Both custom and Fly metrics are not present for ams. Is there some way to check the Prometheus instance for my organization for ams? Is there some way I can PM you my organization name?
In this case, we were able to dig into this a little further already! Generally speaking, you’re absolutely correct that the app or org name can be useful information to provide when describing an issue you’re facing. Pretty much any additional information you feel comfortable sharing here can help the community help you more effectively.
We haven’t been able to observe the issue in ams while it was happening. Currently, as far as I can tell, it looks like overall ams metrics are in working order.
To investigate this further, you might want to check if you’re able to connect to the app’s metrics endpoints.
For example, you could `fly ssh console` to them, and see if you’re able to get real metrics from the endpoint via a local connection with `curl "https://0.0.0.0:/`.
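As a rough way to judge whether a response contains "real metrics", here is a minimal Python sketch that counts the lines parsing as samples in the Prometheus text exposition format. It assumes your endpoint serves that format; the name `count_samples` is hypothetical, and the label matching is simplified (it does not handle a `}` inside a quoted label value).

```python
import re

# One sample line: metric name, optional {labels}, a value, optional timestamp.
# Simplification: label values containing "}" are not handled.
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)(\s+-?\d+)?$'
)


def count_samples(body: str) -> int:
    """Count lines that parse as Prometheus text-format samples."""
    count = 0
    for raw_line in body.splitlines():
        line = raw_line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        try:
            # Values are floats; Python also accepts NaN, +Inf and -Inf.
            float(match.group(3))
        except ValueError:
            continue
        count += 1
    return count
```

You could save the `curl` output to a file and pass its contents to `count_samples`; a result of 0 suggests the endpoint returned an error page or an empty body rather than metrics.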
If you see that the problem app instances can successfully respond to local connections, let us know! We’re always happy to do whatever we can to make sure things are running smoothly.
I already checked those things, as I mentioned above. The evidence so far suggests it’s likely something to do with the organization in use. Can you please provide a way for me to give you the account and organization affected?
I have this issue in my personal organization. App instances there in the ams region are not showing metrics.
@eli it’s not clear if this is being looked into. Metrics for my app in ams are still not available. Thanks
Hey @anacrolix, thanks for bringing this up. I got my wires crossed with whose app was affected and mistakenly thought that this was reported resolved!
Assuming I have the correct app now (and it’d be great if you could confirm this by sharing the app name and/or the output from the troubleshooting steps suggested in this thread), I can link up with the team to see what’s up.
While we do that, you might try restarting the faulty instance, and/or scaling up your instance count in ams to see if that resolves the problem. Thanks again for your patience!
@eli Yes the app name was provided by email. It definitely has issues with metrics not showing up for ams.
Awesome, thanks for confirming. Once we got the right app name down (thanks for bearing with me), we took a look at the host and were able to ID and resolve the problem.
We can now see logs coming from that app instance in our monitoring, and I hope you can too.
We also investigated our monitoring and fixed an issue there, which will allow us to respond to this kind of thing before it causes problems. Thank you for bringing it up!
Thank you, it’s fixed!