Best practice for large machine ↔ Tigris transfer?

I’ve got a project where I’d like to process some large files that are stored on Tigris. (Large = 1-10 GB).

My process is:

  • on HTTP trigger, copy the file from Tigris to the Fly machine (currently using the Node AWS S3 SDK v3; see the sketch below)
  • do some processing to make derivative files (primarily at the command line with tiffsplit and vips)
  • send all derivative files back to Tigris
  • delete everything locally.

Or, at least, that’s the plan.
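For reference, here’s a rough sketch of what steps 1 and 3 look like with the v3 SDK. Bucket, key, and path names are placeholders; the endpoint and region are the standard Tigris values rather than anything specific to my setup, and credentials come from the usual environment variables:

```ts
import { createReadStream, createWriteStream } from "node:fs";
import { pipeline } from "node:stream/promises";
import { Readable } from "node:stream";
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";

// Tigris speaks the S3 protocol; credentials come from the usual AWS_* env vars.
const s3 = new S3Client({
  region: "auto",
  endpoint: "https://fly.storage.tigris.dev",
});

// Step 1: stream the source object down to local disk.
async function downloadObject(bucket: string, key: string, localPath: string) {
  const { Body } = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
  await pipeline(Body as Readable, createWriteStream(localPath));
}

// Step 3: send a derivative file back, letting lib-storage handle multipart chunking.
async function uploadObject(bucket: string, key: string, localPath: string) {
  const multipartUpload = new Upload({
    client: s3,
    params: { Bucket: bucket, Key: key, Body: createReadStream(localPath) },
  });
  await multipartUpload.done();
}
```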

However, download performance from Tigris feels painfully slow, and that’s before I’ve even started working on the files. I’ve tried various combinations of CPU, RAM, and shared/performance CPUs, and… it just feels like a network constraint.

Is there something I’m missing?

The best performance I’ve had so far is from mounting Tigris via geesefs and just using cp. But I’m trying to work out if there’s something else I should be doing here.
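One thing that might be worth trying (a sketch under assumptions, not something I’ve confirmed helps here): fetching the object as parallel byte ranges rather than a single GetObject stream, so the transfer isn’t limited by one connection. The chunk size and concurrency below are arbitrary:

```ts
import { open } from "node:fs/promises";
import { S3Client, HeadObjectCommand, GetObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "auto", endpoint: "https://fly.storage.tigris.dev" });

// Fetch one object as byte ranges and write each range at its offset in the
// destination file, a few ranges at a time, so the whole file is never held in RAM.
async function rangedDownload(
  bucket: string,
  key: string,
  localPath: string,
  chunkSize = 64 * 1024 * 1024,
  concurrency = 8
) {
  const { ContentLength: size = 0 } = await s3.send(
    new HeadObjectCommand({ Bucket: bucket, Key: key })
  );

  // Work queue of range start offsets.
  const offsets: number[] = [];
  for (let start = 0; start < size; start += chunkSize) offsets.push(start);

  const file = await open(localPath, "w");
  try {
    // Simple worker pool: each worker pulls the next offset off the queue.
    const workers = Array.from({ length: concurrency }, async () => {
      while (offsets.length > 0) {
        const start = offsets.shift()!;
        const end = Math.min(start + chunkSize, size) - 1;
        const { Body } = await s3.send(
          new GetObjectCommand({ Bucket: bucket, Key: key, Range: `bytes=${start}-${end}` })
        );
        const bytes = await Body!.transformToByteArray();
        await file.write(bytes, 0, bytes.length, start);
      }
    });
    await Promise.all(workers);
  } finally {
    await file.close();
  }
}
```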


Hi, Infovore,

Your workflow looks solid, and we’re happy to help investigate the slowness you’re experiencing.

Could you share your bucket name with us at help@tigrisdata.com so we can take a closer look?

What speed do you get downloading files? Is the bucket in the same region as the machine?

It’s Tigris, so the default bucket location is global; the machines are in lhr. I could restrict it, I suppose?

Speed is… about 15 minutes to download 3.8 GB, so a bit over 4 MB/s, which is roughly what I get on my domestic connection.

I would try restricting the bucket’s region to match the machines; that way the machines would be in the same data center as the Tigris bucket.

It looks like the issue was related to Tigris configuration following problems in the lhr region: I was inadvertently moving data further than planned. A quick prod suggests data is coming down a lot faster now, to the extent that the task I’m working on feels viable again.

Just to add to what you said: we had temporarily deactivated the LHR region on our side, as Fly.io had some capacity issues there. Once we verified that the capacity issues were resolved, we reactivated the LHR region on our end.
