I have a workflow that produces up to hundreds of gigabytes of data on a Fly machine running in iad and pushes it into Tigris. In trying to tune things, I’ve found that the upload is my current bottleneck. I’ve tried a number of combinations of the `default.s3.max_*` AWS CLI configuration settings, but the best I can get is ~100 MiB/s upload (and download).
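For reference, the knobs in question live under the `s3` section of `~/.aws/config`. A sketch with assumed starting values — these are guesses to tune from, not known-good numbers (the CLI defaults are 10 concurrent requests and 8MB chunks):

```ini
[default]
s3 =
  max_concurrent_requests = 32
  multipart_threshold = 64MB
  multipart_chunksize = 64MB
```

Raising concurrency and chunk size helps only up to whatever the link itself can carry, so it's worth re-measuring after each change.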
Is this the limit I should expect to see out of a single machine-to-Tigris connection?
I think I used to see much faster downloads in the past.
For what it’s worth, running the speedtest CLI on that machine yields 777.15 MB/s download and 110.26 MB/s upload over the external internet. Since Tigris is hosted on Fly, I would have expected the throughput to Tigris to be higher than that.
I confirmed that disk performance isn’t at fault: according to hdparm and dd, I’m getting reads and writes from/to the persistent attached volume at around 900 MB/s.
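For anyone wanting to reproduce the disk check, a minimal dd sketch — the path is a placeholder; point it at a file on the attached volume (e.g. under `/data`) rather than `/tmp`, and note that `conv=fdatasync` forces a flush so the write number reflects the device, not the page cache:

```shell
# Write 256 MiB, flushing to disk before dd reports throughput.
dd if=/dev/zero of=/tmp/ddtest bs=1M count=256 conv=fdatasync
# Read it back (a cached read; drop caches first for a cold-read number).
dd if=/tmp/ddtest of=/dev/null bs=1M
```

hdparm (`hdparm -t /dev/<device>`) gives a comparable raw sequential-read figure but needs root and the block device name.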
I believe the maximum bandwidth you can get between one Fly machine and another is 1Gbps. Because Tigris is hosted on Fly, the same cap applies between a single Fly machine and Tigris.
And BTW, the limit being 1Gbps (≈119 MiB/s, in line with the ~100MiB/s I’ve been seeing) is a totally acceptable answer for me. I just want to know whether there is something I can do to make things faster, or whether I should move on and start scaling horizontally.
You can scale horizontally by adding more machines. Once the workload is distributed across them, you will see higher aggregate throughput.
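One simple way to distribute the upload is to give each of N machines a disjoint slice of the file list, round-robin by line number, so their transfers run in parallel. A hypothetical sketch — `N`, `ME`, the directory, and the bucket name are all assumptions to adapt:

```shell
# Stand-in for the workflow's real output directory, with demo files.
mkdir -p /tmp/out
touch /tmp/out/part-{0..5}.bin

N=3    # total number of machines
ME=0   # this machine's index (0..N-1)

# Keep only the lines of the file list assigned to this machine.
ls /tmp/out | awk -v n="$N" -v me="$ME" 'NR % n == me' > /tmp/my-shard.txt

# Each machine then uploads only its own shard, e.g.:
#   xargs -a /tmp/my-shard.txt -I{} aws s3 cp /tmp/out/{} s3://my-bucket/{}
```

Because the shards are disjoint, each machine's ~1Gbps link is used independently and aggregate throughput scales roughly with N.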