Early access: build Grafana dashboard from Fly metrics

Sorry for the basic question, but for node_network_transmit_bytes, firecracker_vm_memory_active, and probably the other byte-based metrics, which unit should I choose in Grafana?

You probably want Data > bytes (IEC) for memory, and Data rate > bytes/s (IEC) for network transfer.

In short, IEC counts 1024 bytes per kibibyte (KiB) and SI counts 1000 bytes per kilobyte (kB). Here’s a pretty decent explanation of IEC vs SI: https://www.drupal.org/project/drupal/issues/1114538
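To make the difference concrete, here is a small Python sketch that renders the same byte count under both conventions. The helper name format_bytes and the sample value are made up for illustration and are not part of Grafana:

```python
def format_bytes(n: float, base: int = 1024) -> str:
    """Format a byte count with IEC (base 1024) or SI (base 1000) prefixes."""
    if base == 1024:
        units = ["B", "KiB", "MiB", "GiB", "TiB"]
    elif base == 1000:
        units = ["B", "kB", "MB", "GB", "TB"]
    else:
        raise ValueError("base must be 1024 (IEC) or 1000 (SI)")
    value = float(n)
    # Divide by the base until the value fits under the next prefix.
    for unit in units[:-1]:
        if abs(value) < base:
            return f"{value:.2f} {unit}"
        value /= base
    return f"{value:.2f} {units[-1]}"


if __name__ == "__main__":
    sample = 268_435_456  # 256 * 1024 * 1024 bytes
    print(format_bytes(sample, base=1024))  # IEC rendering
    print(format_bytes(sample, base=1000))  # SI rendering
```

The same raw number reads as a round 256 MiB in IEC but as roughly 268 MB in SI, which is why the IEC unit tends to match how VM memory sizes are quoted.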

How do I get the real memory used by my application? I tried firecracker_vm_memory_active, but its value is too low to be the right metric.

Is there some way to get a list of all available metrics? I noticed that the list in this point isn’t the full list.

Whoops, missed the real memory question. We expose the /proc/meminfo metrics; here’s a description: https://superuser.com/questions/521551/cat-proc-meminfo-what-do-all-those-numbers-mean
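As a rough guide, memory in use is commonly estimated from /proc/meminfo as MemTotal minus MemAvailable rather than read from Active. A minimal sketch, assuming firecracker_vm_memory_mem_total and firecracker_vm_memory_mem_available report those two fields in the same unit (the function names and sample values are hypothetical):

```python
def used_memory(mem_total: float, mem_available: float) -> float:
    """Estimate memory in use as MemTotal - MemAvailable (same unit in and out)."""
    if mem_total <= 0:
        raise ValueError("mem_total must be positive")
    if mem_available > mem_total:
        raise ValueError("mem_available cannot exceed mem_total")
    return mem_total - mem_available


def used_fraction(mem_total: float, mem_available: float) -> float:
    """Return the share of total memory that is in use, between 0.0 and 1.0."""
    return used_memory(mem_total, mem_available) / mem_total


if __name__ == "__main__":
    # Made-up sample: a 256 MiB VM with 96 MiB still available.
    total = 256 * 1024**2
    available = 96 * 1024**2
    print(used_memory(total, available))
    print(used_fraction(total, available))
```

In a Grafana panel the same arithmetic would likely be written directly as a query, something like firecracker_vm_memory_mem_total - firecracker_vm_memory_mem_available, provided both series carry matching labels.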

We haven’t implemented the Prometheus endpoints that let you list all your metrics yet, but here’s the current list:

// edge http metrics
edge_http_responses_count
edge_http_response_time_seconds_bucket

// edge tcp metrics
edge_tcp_connects_count
edge_tcp_disconnects_count

// app concurrency/load metrics
app_service_concurrency

// app http metrics
app_http_responses_count
app_http_response_time_seconds_bucket

// app tcp metrics
app_local_connect_time_seconds_bucket
app_local_connects_count
app_local_disconnects_count

// Anycast Bandwidth
anycast_data_out
anycast_data_in

// Anycast TCP Connections
anycast_tcp_connects_count
anycast_tcp_disconnects_count
anycast_tcp_queue_time_seconds_bucket

// Memory metrics
firecracker_vm_memory_buffers
firecracker_vm_memory_cached
firecracker_vm_memory_mem_free
firecracker_vm_memory_mem_available
firecracker_vm_memory_mem_total
firecracker_vm_memory_swap_cached
firecracker_vm_memory_vmalloc_used
firecracker_vm_memory_active
firecracker_vm_memory_inactive

// CPU / load metrics
firecracker_vm_cpu
firecracker_vm_load_average
firecracker_vm_net_sent_bytes

// Disk
firecracker_vm_disk_time_io

// alloc network stats
node_network_transmit_bytes
node_network_receive_bytes

I don’t seem to be able to get any metrics. I was able to set up the data source, and Grafana confirms that it can connect, but no metrics show up. Do I need to do anything on my end to enable this?

Thanks!

Aaaand then they started showing up (or I finally got a query right?) just after I posted this. So all good!

That’s our fault. We haven’t fully implemented the queries Grafana needs to power its UI, so it acts strangely at times.

Have you considered scraping custom metrics from apps that expose a metrics endpoint that would be defined in “fly.toml” @kurt? Something like this:

[[services]]

  [[services.metrics]]
    path="/metrics"

That’s the plan! We have some more plumbing to build first; letting people create and query arbitrary metrics is a good way to destroy a time series DB.

Awesome. It’s still fascinating to me that almost every feature mentioned on this forum is already in the works, upcoming, or can be implemented by the user without a toll on the UX. How do you make sure the plumbing on your side doesn’t become a complexity nightmare?


It sometimes does! We are fortunate because we’ve run most of this kind of infrastructure before. We also cheat. Like, flyctl does all our builds, there’s nothing special in our stack to handle app bundling/building. And we run pretty standard open source projects wherever we can.


Any news on this? @kurt

We’re close to announcing beta access to this!

Excellent. Will it allow for custom metrics?

Yes. You’ll also be able to query Fly-related metrics from the same Prometheus source.

Can’t wait to test this. I feel like with every feature you add you remove one more reason for me to be on any other platform. flydotio is exactly the platform I would build if I had the time, resources and experience.

This is very useful!

Are there plans to add aggregate or organization-level metrics?

I like to set alerts across my cloud providers to notify me when I’m consuming resources so that I can whack those I no longer need.

It would be useful to have metrics for e.g. total apps and total VMs/instances.

@rugwiro @DazWilkin you might be interested in this: Feature preview: Custom metrics

We don’t currently have metrics for total apps or instances, but you should be able to get them with Prometheus count queries.
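A sketch of what such count queries could look like, written as a small Python helper that only builds the query strings. It assumes firecracker_vm_memory_mem_total (from the list above) exists once per VM and that series carry an app label; both are assumptions to check against your own data, and the helper names are hypothetical:

```python
from typing import Optional


def count_series(metric: str, by: Optional[str] = None) -> str:
    """Build a PromQL expression counting the series of `metric`, optionally per label."""
    if by is None:
        return f"count({metric})"
    return f"count by ({by}) ({metric})"


def count_distinct(metric: str, label: str) -> str:
    """Build a PromQL expression counting distinct values of `label` for `metric`."""
    # The inner count yields one series per label value; the outer count tallies them.
    return f"count({count_series(metric, by=label)})"


if __name__ == "__main__":
    metric = "firecracker_vm_memory_mem_total"  # assumed: one series per VM
    print(count_series(metric))             # total instances
    print(count_series(metric, by="app"))   # instances per app (assumed label)
    print(count_distinct(metric, "app"))    # total apps
```

Any of the printed expressions can be pasted into a Grafana query field and used as the basis for an alert threshold.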
