Email Support Response Metrics

The Support team at Fly.io is trying something new: we’re making our email support metrics public! We’re doing this to better set expectations around response times, and we’ll be updating these numbers on a monthly cadence going forward.

Customers on Launch, Scale, and Enterprise plans have access to email support. These plans come with an organization-specific address for emailing support questions, typically <org-name>@support.fly.io.

Here’s the data from the month of April 2024:

Time to First Response (Hours)

| Plan | Median / P50 Response Time | P75 Response Time | P90 Response Time | P99 Response Time |
|---|---|---|---|---|
| All Plans | 3.2 | 14.2 | 42.4 | 107.8 |
| Launch | 3.7 | 12.6 | 42.6 | 91.2 |
| Scale | 1.8 | 7.3 | 32.4 | 162.1 |
| Enterprise | 0.4 | 0.7 | 1.3 | 4.3 |

Average Response Times, all replies (Hours)

| Plan | Median / P50 Response Time | P75 Response Time | P90 Response Time | P99 Response Time |
|---|---|---|---|---|
| All Plans | 5.1 | 19.1 | 46.6 | 153.6 |
| Launch | 5.1 | 14.7 | 45.5 | 103.2 |
| Scale | 2.6 | 14.5 | 52.5 | 119.5 |
| Enterprise | 7.8 | 20.0 | 36.0 | 40.9 |


Tickets Resolved on First Response

| Plan | Resolved on first response |
|---|---|
| All Plans | 44% |
| Launch | 42% |
| Scale | 48% |
| Enterprise | 8% |

What these measure:

The “Time to First Response” metrics measure the time it takes for a new ticket to get an initial response from our team. The “All Replies” values measure response times across every reply, including follow-ups. These can be longer than the initial response time, as some tickets require further investigation.

The “Resolved on First Response” metric is exactly what it says on the tin. It measures the percentage of tickets that were resolved with a single reply from our team and did not require further investigation.

All of the response times listed above are percentile measurements, which show the distribution of response times across plans. We use percentiles rather than a simple average to better account for outliers and give a more accurate picture of a typical support ticket’s response timeline.

Half of tickets receive a response within the P50/median value, 75% within the P75 value, 90% by the P90 time, and 99% (all but extreme outliers) by the P99 time.
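
If you want to reproduce this kind of table from ticket data of your own, here’s a minimal sketch in Python. The timestamps are invented for illustration, and the percentile interpolation (the default in `statistics.quantiles`) is just one reasonable choice, not necessarily what our tooling does:

```python
from datetime import datetime
from statistics import quantiles

# Hypothetical (ticket created, first team reply) timestamp pairs.
# The real dataset isn't public; these values are made up.
tickets = [
    (datetime(2024, 4, 1, 9, 0), datetime(2024, 4, 1, 12, 30)),
    (datetime(2024, 4, 2, 14, 0), datetime(2024, 4, 2, 15, 10)),
    (datetime(2024, 4, 3, 8, 0), datetime(2024, 4, 5, 2, 0)),
    (datetime(2024, 4, 4, 16, 0), datetime(2024, 4, 4, 16, 20)),
]

# Time to first response, in hours.
ttfr_hours = [(reply - created).total_seconds() / 3600 for created, reply in tickets]

# quantiles(n=100) returns the 1st..99th percentile cut points.
pct = quantiles(ttfr_hours, n=100)
p50, p75, p90, p99 = pct[49], pct[74], pct[89], pct[98]
print(f"P50={p50:.1f}h  P75={p75:.1f}h  P90={p90:.1f}h  P99={p99:.1f}h")
```

Note that the P50 here wouldn’t budge even if the slowest ticket had taken 420 hours instead of 42, which is exactly the outlier-resistance that makes percentiles a fairer summary than a mean.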

As mentioned, we’ll be updating these on a monthly basis going forward. Are there other support metrics that you’d find valuable for us to include here? Let us know!

14 Likes

Interesting metrics!

For the Enterprise plan, do those metrics include both emergency and non-emergency tickets? Since they are treated differently, it would be interesting to see the metrics broken down into those two categories.

1 Like

The Enterprise metrics do include both emergency and non-emergency tickets. The metrics for the emergency tickets are a little trickier to calculate, but we can break it down into separate categories for future posts!

Transparency is always a thing to applaud in a service provider.

However, it’s quite confusing what a metric that’s supposed to be both an average and a percentile actually means. A lot of people have the mental model of “we report multiple percentiles to see what the average hides”.

1 Like

Good question! I manually pulled the data for April. Emergency tickets are a very small sample size (we had five last month), so I’m going to use buckets here:

| <5 min | 5–15 min | 15+ min |
|---|---|---|
| 2 | 1 | 2 |

Of the two 15+ min tickets, one was responded to in 17 minutes, and the second in slightly over 5 hours. Our target response time for emergency tickets is 15 minutes, so that’s one small miss and one large miss.
The large miss was due to a misconfiguration in our paging system that prevented it from alerting as it should have. We’ve corrected this to prevent it from happening again.

5 Likes

Hi, I think it simply means the header has an extra word: “Average Response Times, all replies (Hours)” should be “Response Times, all replies (Hours)”, since what we’re showing below is already a central tendency measure (the median in this case) and some percentiles for reference. It would not make sense to show the median of an average unless we were calculating the median of a group of averages, which is not the case :wink:

  • Daniel
4 Likes

A new month means new metrics! Here’s the updated data for the month of May 2024:

Time to First Response (Hours)

| Plan | Median / P50 Response Time | P75 Response Time | P90 Response Time | P99 Response Time |
|---|---|---|---|---|
| All Plans | 4.1 | 15.1 | 47.7 | 191.1 |
| Launch | 5.4 | 17.2 | 49.9 | 219.5 |
| Scale | 3.2 | 9.9 | 37.9 | 66.3 |
| Enterprise | 1.9 | 3.0 | 6.8 | 11.2 |

Response Times, All Replies (Hours)

| Plan | Median / P50 Response Time | P75 Response Time | P90 Response Time | P99 Response Time |
|---|---|---|---|---|
| All Plans | 6.9 | 23.4 | 51.6 | 154.7 |
| Launch | 7.4 | 26.2 | 55.0 | 167.0 |
| Scale | 4.5 | 18.4 | 47.8 | 134.6 |
| Enterprise | 3.6 | 11.5 | 15.5 | 43.2 |


Tickets Resolved on First Response

| Plan | Resolved on first response |
|---|---|
| All Plans | 45% |
| Launch | 46% |
| Scale | 46% |
| Enterprise | 30% |

Response times increased moderately across the board compared with April’s stats. We’re taking a look at what contributed to this increase.

5 Likes

Hey! This has been a fun exercise—we really appreciate all the feedback and comments so far. The goal with this was ultimately to ship a dashboard with metrics updated daily.

This is now a thing: https://fly.io/support

A couple of notes:

  • The Enterprise metrics exclude emergency tickets
  • The Time to First Response (TTFR) metrics exclude tickets created on weekends (a sketch of that filter follows below)
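
To make the weekend exclusion concrete, here’s a minimal sketch of what such a filter could look like. The function name and the assumption that “weekend” means Saturday/Sunday in a single timezone are ours for illustration, not a description of the actual dashboard code:

```python
from datetime import datetime

def include_in_ttfr(created_at: datetime) -> bool:
    # weekday(): Monday=0 .. Sunday=6; drop tickets opened Saturday/Sunday.
    return created_at.weekday() < 5

# A ticket opened on Saturday, 2024-06-01 would be excluded from TTFR.
print(include_in_ttfr(datetime(2024, 6, 1, 10, 0)))  # False
```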

Any questions or suggestions? Post 'em below :point_down:

9 Likes