Slow communication between fly machine and Tigris in FRA

I am working on a service that processes large amounts of data, and we are using Tigris as storage. Because performance is lower than expected, I created a test setup in a single region (FRA) with a few app machines (only one of them interacting with Tigris) and a Tigris bucket that is set up as “single region FRA”.

We are experiencing underwhelming throughput:

Found 6537 objects.
 1.35 s    9.10 kB  0.01 MB/s, 2320a755-a0c9-48b1-bf64-ddb54fe1f170.dcm.head.enc
 1.81 s  304.02 kB  0.17 MB/s, a13188a3-65f3-4e84-88cd-dbf8b7c5f3d2.dcm.enc
 1.35 s    9.09 kB  0.01 MB/s, 8b709f2e-2db8-45fc-b6b8-87ec5553b32f.dcm.head.enc
 1.43 s    9.09 kB  0.01 MB/s, 8be96f9b-762a-4070-9172-4b7f5cf5d538.dcm.head.enc
 1.37 s    9.10 kB  0.01 MB/s, 65a5df4b-aa23-4610-a873-8579a1a34642.dcm.head.enc
 0.68 s  304.02 kB  0.45 MB/s, 445c02dc-eeda-464d-8675-97bca8185c9a.dcm.enc
 0.87 s    9.09 kB  0.01 MB/s, 275985c0-f4ec-484e-8b0a-1faff6e60e1e.dcm.head.enc
 1.31 s  304.02 kB  0.23 MB/s, 66382cc5-a781-476f-aab4-94802d0e02b5.dcm.enc
 0.94 s  304.02 kB  0.32 MB/s, fd904891-e710-4d04-83aa-8c5c10de9770.dcm.enc
 0.53 s    9.09 kB  0.02 MB/s, f75b50c7-769d-4490-86b0-e4dc9bf643b2.dcm.head.enc
 1.56 s  304.02 kB  0.19 MB/s, c4554cfc-b329-4183-8ec9-d116493d80d9.dcm.enc
 0.84 s  304.02 kB  0.36 MB/s, dc5c94ab-7a40-443f-9a85-55455e116d38.dcm.enc
 0.99 s    9.09 kB  0.01 MB/s, a60e52c8-9f8d-486d-b5a8-2e5c438879cc.dcm.head.enc
 0.46 s    9.10 kB  0.02 MB/s, bb5374fc-2fa6-4eb7-a501-5140fd7277da.dcm.head.enc
 0.77 s  304.02 kB  0.39 MB/s, d249b5a2-fa1a-46b3-bf73-da0cb8a6b1eb.dcm.enc
 0.94 s    9.10 kB  0.01 MB/s, 4c935cec-4b09-4cde-9d4a-154da84ebfa0.dcm.head.enc
 1.26 s  304.02 kB  0.24 MB/s, 7a3e7747-6229-493f-8bda-9ed32dc552a5.dcm.enc
 1.49 s    9.10 kB  0.01 MB/s, 96e29e7d-7fbe-42bc-a040-0e16c5870c36.dcm.head.enc
 0.61 s  304.02 kB  0.49 MB/s, 1b72256b-4be3-4f68-b916-2ec271fab48b.dcm.enc
 0.56 s  304.02 kB  0.54 MB/s, 1ad0fe75-f06e-4d5e-837f-9efc4e1157c6.dcm.enc
 0.98 s    9.10 kB  0.01 MB/s, f69b887a-8f92-4bed-811d-593ee587214d.dcm.head.enc
 0.59 s  304.02 kB  0.51 MB/s, c1e60b10-f8e8-4f21-8d77-19250d76b2ff.dcm.enc
 0.81 s    9.09 kB  0.01 MB/s, 274b1348-86e5-4eb3-98a5-def257f8cb5d.dcm.head.enc
 1.28 s  304.02 kB  0.24 MB/s, 3223ed52-bedd-4d7e-bdba-9aca80430fee.dcm.enc
 1.15 s  304.02 kB  0.26 MB/s, d04ba2d0-0e6b-4066-884d-ee4c271c3c0c.dcm.enc
 0.73 s  304.02 kB  0.41 MB/s, 1609c6d5-bc08-41ba-a9f8-7408b1aff2d9.dcm.enc
 0.53 s  304.02 kB  0.58 MB/s, bebe84b1-017a-43bb-ad53-935a1ff3defb.dcm.enc
 0.87 s    9.10 kB  0.01 MB/s, a2c3b4bc-6cb5-4d57-b938-101b82334796.dcm.head.enc
 0.67 s  304.02 kB  0.46 MB/s, 9d104050-d2e1-41c3-97b1-9d9a12f7ea2d.dcm.enc
 0.52 s    9.10 kB  0.02 MB/s, 54a9eea3-fa5d-4162-aad1-de330b12759a.dcm.head.enc
 0.48 s  304.02 kB  0.63 MB/s, 9d120d0a-173f-42f0-85d8-7e2a31afd4d4.dcm.enc
 1.31 s  304.02 kB  0.23 MB/s, fb6db95c-42a3-4abe-978c-1b695c9e0de6.dcm.enc
 0.64 s    9.10 kB  0.01 MB/s, 0799b63e-f803-4829-a511-e0921df809ec.dcm.head.enc

You can see that performance is dominated by a per-request overhead that can exceed 1 second, even on the tiny payloads.
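To make the overhead claim concrete, here is a minimal sketch that parses rows in the format of the listing above and compares the median duration of the small (about 9 kB) and large (about 304 kB) objects. The function names and the 50 kB split are assumptions made for illustration; the sample rows are copied from the listing.

```python
import re
from statistics import median

# A few rows copied from the listing above: duration, size, throughput, key.
SAMPLE = """\
 1.35 s    9.10 kB  0.01 MB/s, 2320a755-a0c9-48b1-bf64-ddb54fe1f170.dcm.head.enc
 1.81 s  304.02 kB  0.17 MB/s, a13188a3-65f3-4e84-88cd-dbf8b7c5f3d2.dcm.enc
 0.68 s  304.02 kB  0.45 MB/s, 445c02dc-eeda-464d-8675-97bca8185c9a.dcm.enc
 0.53 s    9.09 kB  0.02 MB/s, f75b50c7-769d-4490-86b0-e4dc9bf643b2.dcm.head.enc
"""

# Duration (the "s" unit is optional), then the size in kB.
ROW = re.compile(r"^\s*([\d.]+)\s*s?\s+([\d.]+)\s*kB")


def parse_rows(text):
    """Return (seconds, kilobytes) tuples for every line that looks like a row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((float(match.group(1)), float(match.group(2))))
    return rows


def summarize(rows, small_kb=50.0):
    """Median duration for objects below and at/above small_kb (hypothetical split)."""
    small = [seconds for seconds, kb in rows if kb < small_kb]
    large = [seconds for seconds, kb in rows if kb >= small_kb]
    return {"small_median_s": median(small), "large_median_s": median(large)}


stats = summarize(parse_rows(SAMPLE))
print(stats)
```

If the two medians are of the same order although the payloads differ by a factor of about 33, the time is mostly fixed per-request cost rather than transfer time.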

And we are also seeing unexpectedly long routes between the communicating app machine and Tigris:

                              My traceroute  [v0.95]
6839edef1e1028 (172.19.10.98) -> <redacted>.t3.storage.dev (137.174.147.59)   2026-05-06T11:57:02+0000
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                            Packets               Pings
 Host                                     Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 172.19.10.97                           0.0%    25    0.3   0.3   0.2   0.5   0.1
 2. unn-89-222-119-61.datapacket.com       0.0%    25    0.3   0.3   0.3   0.4   0.0
 3. vl251.fra-itx7-core-2.cdn77.com        0.0%    25    0.8   0.8   0.6   3.3   0.5
 4. ffm-b5-link.ip.twelve99.net            0.0%    25    1.1   1.1   1.0   1.4   0.1
 5. ffm-bb1-link.ip.twelve99.net           0.0%    25    1.3   1.3   1.2   1.5   0.1
 6. prs-bb1-link.ip.twelve99.net           0.0%    25    9.6   9.7   9.6   9.9   0.1
 7. ldn-bb1-link.ip.twelve99.net           0.0%    25   16.1  16.2  16.1  16.5   0.1
 8. nyk-bb5-link.ip.twelve99.net           0.0%    25   84.7  84.7  84.5  85.0   0.1
 9. chi-bb1-link.ip.twelve99.net           0.0%    25  100.8 100.9 100.6 102.1   0.3
10. chi-b3-link.ip.twelve99.net            0.0%    25  101.5 101.6 101.3 103.8   0.5
11. (waiting for reply)
12. 51.162.226.242                         0.0%    25  101.7 101.7 101.6 101.9   0.1
13. 155.204.157.162                        0.0%    25  101.2 101.2 101.1 101.4   0.1
14. (waiting for reply)

An unexpectedly long route to what seems to be the UK?

What can we do to get consistent performance between Fly and Tigris?

Can you try using fly.storage.tigris.dev instead of t3.storage.dev?

Thank you, Lilian, this does make a difference. I was under the impression that this Fly-specific endpoint had been deprecated, because it is no longer shown when creating a new bucket.

The performance is a lot better but still varies substantially, and the minimum request duration is still around 440 ms:

Found 6537 objects.
 0.52    9.10 kB  0.02 MB/s, 3c00fb06-0f0f-4697-b304-4ac14833c8a9.dcm.head.enc
 0.70  304.02 kB  0.43 MB/s, a964471d-dbd1-4aac-961c-a457b1e0446c.dcm.enc
 0.99    9.10 kB  0.01 MB/s, 6bcaa056-9d4e-4b44-a471-000c88e75449.dcm.head.enc
 1.60    9.10 kB  0.01 MB/s, c0f2c919-5ab7-4110-9b93-70f420891be6.dcm.head.enc
 0.64  304.02 kB  0.48 MB/s, bd1359fb-25a8-4dc6-8f7c-8b862fac973a.dcm.enc
 0.61  304.02 kB  0.50 MB/s, ae6b3c40-c845-43b7-b57e-82c0b543306f.dcm.enc
 1.10  304.02 kB  0.28 MB/s, 853ef409-bedd-4230-9255-bc00c6c720d0.dcm.enc
 0.66  304.02 kB  0.46 MB/s, 7f6c2dce-eb5d-4c3b-930c-d9a0b0e0df0e.dcm.enc
 0.50  304.02 kB  0.61 MB/s, 605bed82-217c-41ff-aac8-953c1e2db08e.dcm.enc
 0.50    9.10 kB  0.02 MB/s, 8f9df941-3e9e-47dd-b445-bcfff4f3720d.dcm.head.enc
 0.79  304.02 kB  0.38 MB/s, b50e7207-3bf1-4751-bdca-b10ed3ce2584.dcm.enc
 1.95  304.02 kB  0.16 MB/s, 8cc0d888-2fca-4644-bf9e-8672eaa8012d.dcm.enc
 1.12    9.10 kB  0.01 MB/s, 3da1a3b5-efc2-4627-b974-19488f63ae2c.dcm.head.enc
 0.44    9.10 kB  0.02 MB/s, 409fee47-63ab-4816-b9a0-8cbff8b9c031.dcm.head.enc
 0.56  304.02 kB  0.54 MB/s, 07131bf9-8cf8-4ac8-999d-c660f7370c7e.dcm.enc
 0.66  304.02 kB  0.46 MB/s, 70a014c2-6148-446f-970e-f4be863c7380.dcm.enc
 0.81  304.02 kB  0.38 MB/s, aa8b20fe-dbd7-4d25-a2d9-5ab25ae01a1a.dcm.enc
 0.81  304.02 kB  0.38 MB/s, 8303cecc-88e7-46d8-b7d1-5e5123543c23.dcm.enc
 0.71    9.09 kB  0.01 MB/s, 01747213-f8b1-4b86-aad0-bc0ff0519902.dcm.head.enc
 0.57    9.10 kB  0.02 MB/s, 607703e3-7554-4f81-a5e7-d84dfe098634.dcm.head.enc
 1.84    9.10 kB  0.00 MB/s, 624511f4-40ee-42d6-8cd2-6aa5cbcc9329.dcm.head.enc
 1.73  304.02 kB  0.18 MB/s, 403a3fbb-a789-453c-8c60-02d5eaf36475.dcm.enc
 0.48  304.02 kB  0.63 MB/s, 1bb0702d-5ff7-40cb-8693-34a3361889ea.dcm.enc
 1.10    9.10 kB  0.01 MB/s, e94c81c1-ff2e-4d02-90a2-1af3feba1243.dcm.head.enc
 1.21  304.02 kB  0.25 MB/s, 8a19fa63-0e76-4906-ab03-13f101b55696.dcm.enc
 1.25  304.02 kB  0.24 MB/s, def04554-c5df-456b-83ae-78fbe1164f8d.dcm.enc
 1.35  304.02 kB  0.23 MB/s, 933cb48d-66a0-4ffa-830f-58bfebc5b6bf.dcm.enc
 0.60    9.10 kB  0.02 MB/s, 75d9f0f2-e9b8-48e6-87d6-f04e8a23aa97.dcm.head.enc
 1.18  304.02 kB  0.26 MB/s, c931bb0f-c132-4806-88ac-ef7a6f648f1c.dcm.enc
 1.53    9.10 kB  0.01 MB/s, 7dca390b-b019-41a3-ba94-7409f8e5aec1.dcm.head.enc
 1.19  304.02 kB  0.26 MB/s, 1ccef9ce-0cfb-41f0-95bf-1933cc02d6f8.dcm.enc
 0.58  304.02 kB  0.52 MB/s, 6690c713-4c5c-4e43-95c6-4f94daa552cb.dcm.enc
 0.98    9.10 kB  0.01 MB/s, 31828da8-17f0-42be-a210-76e1fe847a7b.dcm.head.enc
 0.63  304.02 kB  0.48 MB/s, b9b7cdf4-437e-4325-9e57-bf761288b1da.dcm.enc

The problem with the t3 endpoint is that its DNS seems to have resolved you to an IP in ord, so effectively your requests were going fra -> ord -> fra. The fly endpoint is guaranteed to always route to the closest location.
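One way to check which addresses each endpoint resolves to from inside the Machine is a small lookup like the sketch below. It only uses Python's standard `socket.getaddrinfo`; the helper name `resolve_addresses` is an assumption for illustration, and the answer depends on which DNS recursor the Machine is using.

```python
import socket


def resolve_addresses(hostname, port=443):
    """Return the sorted, de-duplicated addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Example usage (requires network access from the Machine):
#   for host in ("t3.storage.dev", "fly.storage.tigris.dev"):
#       print(host, resolve_addresses(host))
```

Comparing the returned addresses with the destination shown by mtr helps tell a DNS-level misrouting apart from a problem further along the path.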

I’ve also reported this DNS issue to Tigris.

Thank you! Any idea why the transfers still have so much overhead? The numbers come from a script that uses the Tigris CLI, but they roughly match what I see in the client code, too.
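For reference, a timing harness of this kind can be sketched as follows. This is not the actual script: `fetch` is a hypothetical callable that downloads one object and returns its bytes (for example a thin wrapper around whatever S3 client the service uses), and the row format simply mirrors the listings above, using decimal units (1 kB = 1000 bytes).

```python
import time


def timed_fetch(fetch, key):
    """Call fetch(key) and return (elapsed_seconds, size_in_bytes)."""
    start = time.perf_counter()
    data = fetch(key)
    elapsed = time.perf_counter() - start
    return elapsed, len(data)


def format_row(elapsed_s, size_bytes, key):
    """Format one result row: duration, size in kB, throughput in MB/s, key."""
    size_kb = size_bytes / 1000
    rate_mb_s = (size_bytes / 1e6) / elapsed_s if elapsed_s > 0 else float("inf")
    return f"{elapsed_s:5.2f} s {size_kb:7.2f} kB {rate_mb_s:5.2f} MB/s, {key}"


# Example with a stand-in fetch that returns 9100 bytes without touching the network.
elapsed, size = timed_fetch(lambda key: b"x" * 9100, "example.dcm.head.enc")
print(format_row(elapsed, size, "example.dcm.head.enc"))
```

Timing an in-process call like this leaves out the process startup cost of a CLI invocation, which makes it easier to see how much of the delay belongs to the request itself.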

We are looking into it on our side as well; it’s most likely a geo routing issue.

I’m tempted to say these look like normal file operation delays, but just to rule out anything remaining on the networking side, can you add a flyio-debug: doit header to some of the requests, if you can? That will give us access to more information about how each of your requests was handled. Of course, this only works when you use the fly.storage.tigris.dev domain.

Well, this comparison surely would make you hope otherwise:

About the Fly-specific endpoint: looking at the mtr output, it seems to be reachable directly on the 6PN. I would guess that this host is a proxy as well. Can I get information on how the traffic is routed past this endpoint?

Yes, every server that runs Machines also runs the proxy.
If you make an HTTP request with the flyio-debug: doit header, a Fly.io staff member can look at the logs to see the path the request takes to a Tigris backend instance, but unfortunately this information isn’t accessible otherwise. (Similarly, mtr only gets you to the edge server, not to the backend instance.) Edit: I was wrong, this doesn’t work with Tigris, unfortunately.

The t3 endpoint uses geo DNS so that requests are routed to the nearest PoP. In this case it seems that the geo DNS entries are incorrectly tagged, so requests from FRA are being misrouted, which is why you were seeing slow performance and high latency. BTW, we don’t own the GeoIP database, so this is not something we can change on our own. I believe @PeterCxy is looking into getting the entry in the GeoIP database fixed.

The t3 endpoint is our high-performance deployment, which is also what the blog post you mentioned refers to.

To update on this, the fra nodes should already be fixed in MaxMind. We’re also looking into ways to make the GeoIP behavior consistent, at least between Fly Machines and Tigris, since we also run our own DNS recursor.

That’s great.