I tried to follow the guide (Autoscale based on metrics · Fly Docs) for scaling based on metrics, but when I deploy the image I just get the following error:
{"level":"ERROR","msg":"metrics collection failed","app":"snoculars","err":"collect metric (\"qdepth\"): empty prometheus result"}
Here is my Fly config:
app = "snoculars-autoscaler"
[build]
image = "flyio/fly-autoscaler:0.3.1"
[env]
FAS_PROMETHEUS_ADDRESS = "https://api.fly.io/prometheus/snoculars-inc"
FAS_PROMETHEUS_METRIC_NAME = "qdepth"
FAS_PROMETHEUS_QUERY = "sum(queue_depth{app='$APP_NAME'})"
FAS_ORG = "snoculars-inc"
FAS_APP_NAME = "snoculars"
FAS_CREATED_MACHINE_COUNT = "min(50, qdepth / 2)"
[metrics]
port = 9090
path = "/metrics"
Here are the logs for the machine:
2024-10-17T01:33:10.534 runner[48e2255a197098] ord [info] Machine created and started in 7.487s
2024-10-17T01:33:10.579 app[48e2255a197098] ord [info] 2024/10/17 01:33:10 INFO fly-autoscaler version=v0.3.1 commit=c4cd2cf5680f1a6ae1eb33325a44698336104754
2024-10-17T01:33:10.579 app[48e2255a197098] ord [info] {"level":"INFO","msg":"connected to fly"}
2024-10-17T01:33:10.579 app[48e2255a197098] ord [info] {"level":"INFO","msg":"metrics collectors initialized","n":1}
2024-10-17T01:33:10.579 app[48e2255a197098] ord [info] {"level":"INFO","msg":"reconciler pool initialized, beginning loop","interval":"15s","timeout":"30s","appListRefreshInterval":"1m0s","collectors":1,"created":"min(50, qdepth / 2)","started":""}
2024-10-17T01:33:10.579 app[48e2255a197098] ord [info] {"level":"INFO","msg":"serving metrics","addr":":9090"}
2024-10-17T01:33:10.638 app[48e2255a197098] ord [info] 2024/10/17 01:33:10 INFO SSH listening listen_address=[fdaa:0:2968:a7b:316:919e:d688:2]:22 dns_server=[fdaa::3]:53
2024-10-17T01:33:25.964 app[48e2255a197098] ord [info] {"level":"ERROR","msg":"metrics collection failed","app":"snoculars","err":"collect metric (\"qdepth\"): empty prometheus result"}
2024-10-17T01:33:41.707 app[48e2255a197098] ord [info] {"level":"ERROR","msg":"metrics collection failed","app":"snoculars","err":"collect metric (\"qdepth\"): empty prometheus result"}
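One way to narrow this down might be to run the same query by hand and see whether Prometheus really returns nothing. Below is a minimal sketch, not the autoscaler's actual code. It assumes that `$APP_NAME` in `FAS_PROMETHEUS_QUERY` is replaced with the target app name, that the address in `FAS_PROMETHEUS_ADDRESS` serves the standard Prometheus HTTP API under `/api/v1/query`, and that a valid `Authorization` header value is available. The helper names and the `FLY_PROM_AUTH` environment variable are hypothetical.

```python
import json
import os
import urllib.parse
import urllib.request
from string import Template

# Values copied from the [env] section of the config above.
PROM_BASE = "https://api.fly.io/prometheus/snoculars-inc"
QUERY_TEMPLATE = "sum(queue_depth{app='$APP_NAME'})"
APP_NAME = "snoculars"


def expand_query(template: str, app_name: str) -> str:
    """Substitute $APP_NAME the way the autoscaler is assumed to."""
    return Template(template).substitute(APP_NAME=app_name)


def is_empty_result(body: dict) -> bool:
    """True when a successful Prometheus response contains no samples."""
    return body.get("status") == "success" and not body.get("data", {}).get("result")


def run_query(base: str, query: str, auth_header: str) -> dict:
    """Send an instant query to the Prometheus HTTP API and decode the JSON body."""
    url = base.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode({"query": query})
    req = urllib.request.Request(url, headers={"Authorization": auth_header})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # FLY_PROM_AUTH is a hypothetical variable holding the full header value.
    auth = os.environ.get("FLY_PROM_AUTH")
    query = expand_query(QUERY_TEMPLATE, APP_NAME)
    print("query:", query)
    if auth:
        body = run_query(PROM_BASE, query, auth)
        print("empty result" if is_empty_result(body) else body["data"]["result"])
```

In PromQL, `sum()` over a selector that matches no series returns an empty vector. If that is what happens here, the error would mean that no `queue_depth` series with `app='snoculars'` exists, rather than that the connection failed. The same script could be pointed at any other PromQL expression to check whether it returns data before putting it into `FAS_PROMETHEUS_QUERY`.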
Is the guide up to date? Has anything changed? I want to scale based on CPU/memory usage, so I'm not sure whether this is the correct way to do it.