We are building Synthetic Monitoring. Would you help us?

Hi Fly.io community :wave:

The Infra team is working hard on several fronts to get a better understanding of what you and your customers are experiencing in real-time.

One of the ideas we are exploring to get that picture is to provide synthetic monitoring for Fly apps.

While you might already be using a synthetic monitoring solution, we’re building this system integrated with our Managed Grafana, which will make it a breeze to correlate application availability and performance with existing metrics like machine load and network usage. At the same time, we will ingest these metrics into our monitoring stack, which will allow us to quickly identify when an elevated number of apps are unavailable or performing poorly for clients outside our network.

We plan to provide this functionality through a globally distributed fleet of monitoring agents running on thousands of devices. How are we going to build this fleet, you ask? With your help! Help us to help you by opting in to run synthetic checks through your flyctl agent!

Running synthetic checks through flyctl agents will provide an accurate representation of how your customers experience your apps. Instead of testing from a server in a data center with a cross connect to our providers, these checks will be performed from users’ laptops over their ISP connections.

This is an ambitious plan and it will only work if you help us.

We are starting by releasing a new version of flyctl with the monitoring agent disabled by default. This is an opt-in only functionality. In this first iteration, we will only run probes for Fly-owned applications, such as fly.io and debug.fly.dev. We will use the collected metrics to refine the information we get from the probes and improve the system overall.

Once we are satisfied with the information we are collecting and how the system performs at scale, we will allow organizations to register their Fly apps’ endpoints for monitoring. Our approach is not fully defined yet, but we intend to permit organizations to register endpoints for monitoring only if they opt in to run probes for other organizations on their flyctl agents.

Do you want to help us build this awesome synthetic monitoring system? Great! You’ll need to upgrade flyctl to version 0.2.95 or greater, run flyctl settings synthetics enable, and then restart your agent with flyctl agent restart so it starts listening for probes. We promise we will not DoS your agent :smiley:
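If you script the upgrade check, compare the version numerically, not as a string: "0.2.100" sorts before "0.2.95" as text but is the newer release. Here is a minimal sketch, assuming the version is a plain MAJOR.MINOR.PATCH string (optionally prefixed with "v"). The helper names are hypothetical and not part of flyctl, and pre-release suffixes are not handled.

```python
# Hypothetical helpers for illustration; not part of flyctl.
MIN_SYNTHETICS_VERSION = (0, 2, 95)  # minimum version named in the post above


def parse_version(text: str) -> tuple[int, ...]:
    """Turn 'v0.2.95' or '0.2.95' into (0, 2, 95) for numeric comparison."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def supports_synthetics(version: str) -> bool:
    """Return True if the given flyctl version is 0.2.95 or greater."""
    return parse_version(version) >= MIN_SYNTHETICS_VERSION
```

Tuples compare element by element, so (0, 2, 100) correctly ranks above (0, 2, 95).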

Let us know your feedback!

Before I shoot myself in the foot, can you tell me what caliber the bullet is? I.e., what kind of metrics are you collecting for this initial release?

Hi @khuezy, great question! The agent is being built leveraging blackbox_exporter. For this initial release, we are running the HTTP prober for target endpoints owned by Fly.

The agent establishes a WebSocket connection with the synthetics backend application, which forwards the scrape requests to the agent. The flyctl agent executes the blackbox exporter job and returns the metrics to the backend app, which inserts them into our VictoriaMetrics cluster.
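To make that flow concrete, here is a rough sketch of the agent-side step: take a scrape request, run an HTTP probe, and produce metrics to send back. It is not the flyctl implementation (which is written in Go and uses blackbox_exporter itself). The WebSocket transport is left out, and the ScrapeRequest shape, the function names, and the target label are assumptions made for illustration. The metric names probe_success, probe_duration_seconds, and probe_http_status_code are the ones blackbox_exporter's HTTP prober reports.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class ScrapeRequest:
    """Hypothetical shape of a scrape request forwarded by the backend."""
    target: str
    timeout_seconds: float = 5.0


def http_get_status(url: str, timeout: float) -> int:
    """Perform a real HTTP GET and return the response status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code


def run_http_probe(
    req: ScrapeRequest,
    fetch: Callable[[str, float], int] = http_get_status,
) -> dict[str, float]:
    """Run one probe and return blackbox_exporter-style metrics."""
    start = time.monotonic()
    try:
        status = fetch(req.target, req.timeout_seconds)
    except Exception:
        status = 0  # DNS failure, refused connection, timeout, etc.
    duration = time.monotonic() - start
    return {
        # Treat any 2xx as success, like the HTTP prober's default check.
        "probe_success": 1.0 if 200 <= status < 300 else 0.0,
        "probe_http_status_code": float(status),
        "probe_duration_seconds": duration,
    }


def to_exposition(metrics: dict[str, float], target: str) -> str:
    """Render metrics as Prometheus text lines (the target label is an assumption)."""
    lines = [
        f'{name}{{target="{target}"}} {value}'
        for name, value in sorted(metrics.items())
    ]
    return "\n".join(lines) + "\n"
```

Calling run_http_probe(ScrapeRequest("https://fly.io")) performs a real request. Passing a different fetch function lets you exercise the logic without touching the network.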

I hope this makes sense and answers your question. Please let me know if you have any other questions.

Thanks! Can you share the configuration file? I’m curious what kinds of headers fly will be collecting. Any privacy concerns here?

My first thought is that this sounds like a privacy nightmare. There’s plenty of potential for abuse.

Will this always be the case? Including after full launch? No agent running if the org chooses not to use your synthetic monitoring?

How are you ensuring privacy and security of the flyctl users?

Yeah… does this mean my team’s dev machines and CD will be making random HTTP requests if they get opted in? Can we limit that to our own applications? I would feel much more comfortable if this was a separate application.

This is excellent feedback. We’re going to change how we ship this, I think. Here’s what we’re planning:

First, we can get a lot out of probes to our own infrastructure. flyctl already talks to a bunch of fly.io and flyio.net endpoints, so we’ll just monitor those. We’ll end up enabling this by default since it doesn’t expand the scope of what the flyctl agent talks to.

This actually gets us much better visibility into the whole network, so it’ll benefit all your apps.

At some point, we’ll allow people to opt in to hitting other Fly-hosted apps. This will be purely opt-in, and I’d expect most of the folks who do this to be individual/hobby-level folks who aren’t part of a corporate org. If we can get enough people to opt in (perhaps by offering hosting credits), we’ll end up shipping synthetic monitoring as a feature y’all can take advantage of. THE MARKET WILL DECIDE :smiling_imp:

Organizations should get more control, though, so when we end up shipping the opt-in, but slightly scary, behavior, we’ll also allow you to disable this for everyone who’s part of your org.
