We shaved off 500ms from machine creation

When placing a machine, we reach out to every available host in the desired region and ask how much capacity it has. We then score the hosts and choose one for the new machine. However, some decommissioned hosts were still listed in the table we query for this information, which added an extra 500ms delay while we waited for the request to a non-existent host to time out. See the long red trace below:

Now that we’ve cleaned all this up, machine launches in a bunch of regions should be ~500ms faster.
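For anyone curious how a single stale host can hold up a whole placement, here is a rough sketch of the flow described above. It is not our actual scheduler: `query_capacity`, `probe`, `place_machine`, the host dictionaries, and the "most free memory wins" score are all hypothetical stand-ins, and the 0.5s timeout is simply chosen to match the delay in the trace.

```python
import asyncio
import time

QUERY_TIMEOUT = 0.5  # seconds; assumed value, picked to match the ~500ms delay


async def query_capacity(host):
    """Hypothetical capacity probe. A decommissioned host never answers."""
    if host["decommissioned"]:
        await asyncio.sleep(3600)  # stands in for a request that hangs
    await asyncio.sleep(host["latency"])  # simulated network round trip
    return host["free_mb"]


async def probe(host, timeout):
    """Ask one host for its capacity; report None if it does not answer in time."""
    try:
        free = await asyncio.wait_for(query_capacity(host), timeout)
        return host["name"], free
    except asyncio.TimeoutError:
        return host["name"], None


async def place_machine(hosts, timeout=QUERY_TIMEOUT):
    """Probe all hosts concurrently, then pick the best-scoring responder."""
    start = time.monotonic()
    # gather() only returns once every probe has finished, so one silent
    # host makes the whole placement wait for the full timeout.
    results = await asyncio.gather(*(probe(h, timeout) for h in hosts))
    elapsed = time.monotonic() - start

    candidates = [(name, free) for name, free in results if free is not None]
    # Toy scoring rule (an assumption): the host with the most free memory wins.
    best = max(candidates, key=lambda c: c[1])[0]
    return best, elapsed


if __name__ == "__main__":
    hosts = [
        {"name": "host-a", "latency": 0.01, "free_mb": 100, "decommissioned": False},
        {"name": "host-b", "latency": 0.02, "free_mb": 400, "decommissioned": False},
        {"name": "host-gone", "latency": 0.0, "free_mb": 0, "decommissioned": True},
    ]
    print(asyncio.run(place_machine(hosts)))
    live = [h for h in hosts if not h["decommissioned"]]
    print(asyncio.run(place_machine(live)))
```

In this sketch the same host is chosen in both runs, but the first run takes roughly the full timeout because of `host-gone`, while the second finishes as soon as the slowest live host replies. Removing stale rows from the host table has the same effect as filtering them out here.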


Off topic, but I’m pretty interested: what service / technology did you use to monitor the request times shown?


That’s https://www.honeycomb.io/
