Early access: build Grafana dashboard from Fly metrics

emiliendevos · January 19, 2021, 9:58pm

Apologies for the basic question, but for node_network_transmit_bytes, firecracker_vm_memory_active, and probably the other metrics measured in bytes, which unit should I choose in Grafana?

kurt · January 19, 2021, 10:20pm

You probably want Data > bytes (IEC) for memory, and Data rate > bytes/s (IEC) for network transfer.

Here’s a pretty decent explanation of IEC vs SI: https://www.drupal.org/project/drupal/issues/1114538. In short, IEC uses 1024 bytes per kilobyte (strictly, per kibibyte) and SI uses 1000 bytes per kilobyte.
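To make the IEC vs SI difference concrete, here is a minimal Python sketch that formats the same byte count both ways. The function name `format_bytes` is made up for illustration and is not part of Grafana or Fly.

```python
def format_bytes(n: float, iec: bool = True) -> str:
    """Format a byte count with IEC (base 1024) or SI (base 1000) prefixes."""
    base = 1024 if iec else 1000
    units = ["B", "KiB", "MiB", "GiB", "TiB"] if iec else ["B", "kB", "MB", "GB", "TB"]
    value = float(n)
    # Divide by the base until the value fits under the next prefix.
    for unit in units[:-1]:
        if abs(value) < base:
            return f"{value:.2f} {unit}"
        value /= base
    return f"{value:.2f} {units[-1]}"


if __name__ == "__main__":
    sample = 536_870_912  # 512 * 1024 * 1024 bytes
    print(format_bytes(sample, iec=True))   # IEC reading
    print(format_bytes(sample, iec=False))  # SI reading
```

The same raw value reads as 512.00 MiB under IEC and 536.87 MB under SI, which is why picking the matching Grafana unit matters for memory graphs.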

emiliendevos · January 20, 2021, 6:53pm

How do I get the real memory used by my application? I tried firecracker_vm_memory_active, but the value is too low to be the right metric.

charsleysa · January 31, 2021, 10:59pm

Is there some way to get a list of all available metrics? I noticed that the list in this post isn’t the full list.

kurt · January 31, 2021, 11:05pm

Whoops, missed the real memory question. We expose the /proc/meminfo metrics, here’s a description: https://superuser.com/questions/521551/cat-proc-meminfo-what-do-all-those-numbers-mean
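One common approximation for "real" memory use (not something confirmed in this thread) is MemTotal minus MemAvailable. The sketch below assumes firecracker_vm_memory_mem_total and firecracker_vm_memory_mem_available mirror those two /proc/meminfo fields and share a unit; the helper names are hypothetical.

```python
from typing import Mapping


def used_memory(samples: Mapping[str, float]) -> float:
    """Approximate used memory as MemTotal - MemAvailable.

    Assumes both samples use the same unit (the thread does not state it).
    """
    total = samples["firecracker_vm_memory_mem_total"]
    available = samples["firecracker_vm_memory_mem_available"]
    return total - available


def used_memory_percent(samples: Mapping[str, float]) -> float:
    """Used memory as a percentage of total memory."""
    total = samples["firecracker_vm_memory_mem_total"]
    if total <= 0:
        raise ValueError("mem_total must be positive")
    return 100.0 * used_memory(samples) / total


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    example = {
        "firecracker_vm_memory_mem_total": 256.0,
        "firecracker_vm_memory_mem_available": 64.0,
    }
    print(used_memory(example), used_memory_percent(example))
```

In a Grafana panel the equivalent would likely be a PromQL subtraction of the two series, again assuming their labels line up.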

kurt · January 31, 2021, 11:08pm

We haven’t implemented the Prometheus endpoints to let you list all your metrics yet, but here’s the current list:

// edge http metrics
edge_http_responses_count
edge_http_response_time_seconds_bucket

// edge tcp metrics
edge_tcp_connects_count
edge_tcp_disconnects_count

// app concurrency/load metrics
app_service_concurrency

// app http metrics
app_http_responses_count
app_http_response_time_seconds_bucket

// app tcp metrics
app_local_connect_time_seconds_bucket
app_local_connects_count
app_local_disconnects_count

// Anycast Bandwidth
anycast_data_out
anycast_data_in

// Anycast TCP Connections
anycast_tcp_connects_count
anycast_tcp_disconnects_count
anycast_tcp_queue_time_seconds_bucket

// Memory metrics
firecracker_vm_memory_buffers
firecracker_vm_memory_cached
firecracker_vm_memory_mem_free
firecracker_vm_memory_mem_available
firecracker_vm_memory_mem_total
firecracker_vm_memory_swap_cached
firecracker_vm_memory_vmalloc_used
firecracker_vm_memory_active
firecracker_vm_memory_inactive

// CPU / load metrics
firecracker_vm_cpu
firecracker_vm_load_average
firecracker_vm_net_sent_bytes

// Disk
firecracker_vm_disk_time_io

// alloc network stats
node_network_transmit_bytes
node_network_receive_bytes

dan · February 19, 2021, 5:03am

I don’t seem to be able to get any metrics. I’m able to set up the data source alright, and Grafana confirms that it can connect, but no metrics show up. Do I need to do anything on my end to enable this?

Thanks!

dan · February 19, 2021, 5:20am

aaaannd then they started showing up (or i finally got a query right?) just after I posted this. So all good!

kurt · February 19, 2021, 3:34pm

That’s our fault. We haven’t fully implemented the queries Grafana needs to power its UI, so it acts strangely at times.

rugwiro · March 20, 2021, 10:17pm

Have you considered scraping custom metrics from apps that expose a metrics endpoint that would be defined in “fly.toml” @kurt? Something like this:

[[services]]

  [[services.metrics]]
    path="/metrics"

kurt · March 21, 2021, 3:11pm

That’s the plan! We have some more plumbing to build first; letting people create and query arbitrary metrics is a good way to destroy a time series DB.

rugwiro · March 21, 2021, 3:57pm

Awesome. It’s still fascinating to me that almost every feature mentioned on this forum is either already in the works, upcoming, or can be implemented by the user without a toll on the UX. How do you make sure the plumbing on your side doesn’t become a complexity nightmare?

kurt · March 21, 2021, 4:03pm

It sometimes does! We are fortunate because we’ve run most of this kind of infrastructure before. We also cheat. Like, flyctl does all our builds, there’s nothing special in our stack to handle app bundling/building. And we run pretty standard open source projects wherever we can.

rugwiro · April 10, 2021, 1:07pm

Any news on this? @kurt

jerome · April 10, 2021, 1:36pm

We’re close to announcing beta access to this!

rugwiro · April 10, 2021, 2:01pm

Excellent. Will it allow for custom metrics?

jerome · April 10, 2021, 2:11pm

Yes. You’ll also be able to query Fly-related metrics from the same Prometheus source.

rugwiro · April 10, 2021, 2:54pm

Can’t wait to test this. I feel like with every feature you add you remove one more reason for me to be on any other platform. flydotio is exactly the platform I would build if I had the time, resources and experience.

DazWilkin · April 13, 2021, 7:43pm

This is very useful!

Are there plans to add aggregate|Organizational metrics?

I like to set Alerts across my cloud providers to notify me when I’m consuming resources so that I can whack those I no longer need.

It would be useful to have metrics for e.g. total apps, VMs|instances.

jerome · April 21, 2021, 5:15pm

@rugwiro @DazWilkin you might be interested in this: Feature preview: Custom metrics

We don’t currently have metrics for total apps or instances, but you should be able to get them with Prometheus count queries.
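As a rough illustration of the counting idea, the Python sketch below counts distinct label values in the `result` array of a Prometheus instant-query response. The label names `app` and `instance` are assumptions, since this thread does not list the labels Fly attaches to its series.

```python
from typing import Any, Dict, List


def count_distinct(result: List[Dict[str, Any]], label: str) -> int:
    """Count distinct values of `label` across a Prometheus vector result.

    `result` is the data.result array of an instant query, where each
    entry looks like {"metric": {...labels...}, "value": [ts, "val"]}.
    Series without the label are ignored.
    """
    values = {
        series["metric"][label]
        for series in result
        if label in series.get("metric", {})
    }
    return len(values)


if __name__ == "__main__":
    # Made-up response data; the "app" and "instance" labels are hypothetical.
    sample_result = [
        {"metric": {"app": "web", "instance": "a1"}, "value": [1618000000, "0.2"]},
        {"metric": {"app": "web", "instance": "a2"}, "value": [1618000000, "0.4"]},
        {"metric": {"app": "worker", "instance": "b1"}, "value": [1618000000, "0.1"]},
    ]
    print(count_distinct(sample_result, "app"))       # distinct apps
    print(count_distinct(sample_result, "instance"))  # distinct instances
```

The same count can probably be done server-side with something like count(count by (app) (firecracker_vm_cpu)), again assuming an `app` label exists on that series.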
