FYI, noticed an odd error when viewing metrics for 2 days:
This is a weird bug we’ve noticed when VMs restart in a burst. It resets counters and the way we push metrics seems to handle that badly. @steve would know what’s up, I think.
This is odd! I’ll have a look into it.
Seems that, regardless of app (I have several), the metrics displays for 1 and 2 days have a single, huge spike.
Once this is fixed, would it be possible to get a 7-day view also?
I think @jerome and @steve figured this out, and then fixed it. When we deploy our load balancers, we bring up a new instance and keep the old one running for a very long time while connections drain. During deploys, two or more instances were sending metrics with the same name + labels, so the values would flap between 0 and “very high” for a couple of minutes.
They fixed it by adding a proxy_id label to dedupe them.
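For anyone curious why a label fixes this, here is a rough Python sketch of the collision. It is not the actual metrics pipeline. It only assumes that a series is identified by its metric name plus its label set, which is how Prometheus-style stores behave. The metric name requests_total, the app label, and the proxy-old / proxy-new values are made up for illustration; only the proxy_id label name comes from the fix described above.

```python
from collections import defaultdict

def series_key(name, labels):
    """Identity of a time series: metric name plus its sorted label pairs."""
    return (name, tuple(sorted(labels.items())))

def ingest(samples):
    """Group (name, labels, value) samples into series, keeping arrival order."""
    store = defaultdict(list)
    for name, labels, value in samples:
        store[series_key(name, labels)].append(value)
    return dict(store)

def is_monotonic(values):
    """A healthy counter series never decreases."""
    return all(a <= b for a, b in zip(values, values[1:]))

def with_proxy_id(samples, proxy_id):
    """Return copies of the samples with a proxy_id label added."""
    return [(n, {**labels, "proxy_id": proxy_id}, v) for n, labels, v in samples]

def interleave(a, b):
    """Alternate samples from two instances, as if both push during a deploy."""
    return [s for pair in zip(a, b) for s in pair]

# Hypothetical deploy: the old instance is draining (high counter value),
# the new instance has just started (counter near zero).
old = [("requests_total", {"app": "web"}, v) for v in (90000, 90010, 90020)]
new = [("requests_total", {"app": "web"}, v) for v in (0, 5, 12)]

# Same name + labels: both instances write into ONE series, which flaps.
colliding = ingest(interleave(old, new))

# Distinct proxy_id per instance: TWO series, each well behaved.
deduped = ingest(interleave(with_proxy_id(old, "proxy-old"),
                            with_proxy_id(new, "proxy-new")))

for key, values in colliding.items():
    print("colliding", values, "monotonic:", is_monotonic(values))
for key, values in deduped.items():
    print("deduped  ", values, "monotonic:", is_monotonic(values))
```

In the colliding case the single series jumps between the two instances' values. A rate-style query that treats each drop as a counter reset would then count the jump back up as a huge increase, which is probably where the spike on the graph comes from.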