Downloading a file from Tigris after it has been uploaded is very slow

I run a distributed app where files can be uploaded from one region and then downloaded from the other side of the planet.

I’ve been seeing weird behavior where the first download of a file is EXTREMELY SLOW, but subsequent downloads are fast enough. It’s very hard for me to reproduce the issue on demand, but it happens.
I’m also using Data Migration, which mirrors files from Supabase Storage in US East 2.

I’m not entirely sure how this works, but when a file gets uploaded to Supabase Storage, it also gets uploaded to my Tigris bucket. Then, that file can be downloaded from a server in Europe and sometimes the download speed is very slow.

Is this because the file doesn’t get automatically globally distributed when first uploaded? If this is the case, would it be solved by Caching on PUT (Eager Caching) described here? If so, would this change apply to the whole bucket? Even for writes from Data Migration?

I have several users complaining about slow download speeds. And as mentioned, I’ve also seen slow download speeds when downloading assets from a GPU cloud provider that has servers in Europe and Asia.

Please advise.

@empz Thank you for explaining the use case. If I understand correctly, the file gets written in us-east2 and is then read in Europe? If that’s the case, the first read can be slow because we lazily move the file to Europe on access. Future reads will be fast (as you’re seeing) because the blocks are already in Europe.
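
One way to check whether this explanation matches what you are seeing is to time several downloads of a freshly uploaded file from the remote region. Below is a minimal Python sketch using only the standard library. The helper names (`fetch_url`, `time_downloads`, `first_read_penalty`) are made up for illustration, and it assumes the object is reachable over plain HTTPS (for example through a public or presigned URL).

```python
import time
import urllib.request
from typing import Callable, List


def fetch_url(url: str) -> int:
    """Download the whole object and return the number of bytes read."""
    total = 0
    with urllib.request.urlopen(url) as resp:
        while True:
            chunk = resp.read(1024 * 1024)  # read in 1 MiB chunks
            if not chunk:
                break
            total += len(chunk)
    return total


def time_downloads(fetch: Callable[[], int], attempts: int = 3) -> List[float]:
    """Call fetch() several times in a row and return the seconds each call took."""
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
    return timings


def first_read_penalty(timings: List[float]) -> float:
    """Ratio of the first download time to the fastest later download."""
    if len(timings) < 2:
        raise ValueError("need at least two timings")
    fastest_later = min(timings[1:])
    if fastest_later <= 0:
        return float("inf")
    return timings[0] / fastest_later
```

Running `time_downloads(lambda: fetch_url(url))` from a server in Europe right after an upload, then passing the result to `first_read_penalty`, should give a ratio well above 1 if the lazy move on first access is the cause. A ratio close to 1 would point to something else, such as general network conditions.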

If you want to optimize the first read as well, another option is the multiple-regions-upload feature. It can be set at the bucket level or per object using headers during PUT. You can select one of the Europe regions along with your usual region, and that way the first read will be fast as well. One caveat: in this case you’ll be charged for the number of blocks that get stored.
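
For the per-object variant, the extra header could be built roughly as in the sketch below. The header name `X-Tigris-Regions` and the region codes `iad` and `fra` are not stated in this thread; they are assumptions based on the Tigris documentation and should be verified there before use. The helper `regions_header` is hypothetical.

```python
from typing import Dict, Iterable

# Assumption: header name as recalled from the Tigris docs, not confirmed in this thread.
REGIONS_HEADER = "X-Tigris-Regions"


def regions_header(regions: Iterable[str]) -> Dict[str, str]:
    """Build the extra PUT header asking for copies in the listed regions."""
    cleaned = []
    for region in regions:
        code = region.strip().lower()
        if not code:
            raise ValueError("empty region code")
        if code not in cleaned:  # drop duplicates, keep the original order
            cleaned.append(code)
    if not cleaned:
        raise ValueError("at least one region is required")
    return {REGIONS_HEADER: ",".join(cleaned)}


# Example region codes are placeholders; use the ones listed in the Tigris docs.
headers = regions_header(["iad", "fra"])
```

How the resulting header gets attached to the PUT depends on the S3 client in use, so that part is left out here.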

What about Caching on PUT? Wouldn’t that help? See “Object Caching” in the Tigris Object Storage Documentation.

The thing is, I don’t know where the first read will happen from. It could be from a user anywhere in the world, or from a GPU cloud provider service that uses whatever server it has available, which could also be anywhere in the world. It’s not a simple case where the file is written in us-east2 and read in Europe.

You mentioned you lazily move it to other regions on access. I’m looking for a way to distribute it across all regions on write.

Yes, this will also work; it is done asynchronously during PUT. But I just realized that you’re using a shadow bucket in write_through mode. Unfortunately, we haven’t added this feature for the shadow bucket (data migration) flow yet. We can add it for shadow buckets as well, but it may take some time to release. Alternatively, if you also have non-shadow buckets where you upload assets directly to Tigris, you can enable the feature there.

The shadow-bucket will be disabled soon so the files will be uploaded directly to Tigris.

Could you please elaborate on the extra costs incurred when enabling Eager Caching on PUT?

Thanks

There is no extra cost for eager caching on PUT.

Are you 100% sure? Because the docs say that Cache-on-Read is the most cost-effective, implying that any other kind of caching incurs extra costs.

Cache-on-Read uses resources more efficiently because it only caches objects in the regions where they are accessed, while Cache-on-Write may end up caching objects in regions where they are never accessed.
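
To make the trade-off concrete, here is a toy Python model of which regions end up holding a copy of one object under each policy. It is only an illustration of the reasoning above, not how Tigris is implemented. The region codes are placeholders, and the function `regions_holding_copy` is hypothetical. It assumes every read comes from one of the listed regions.

```python
from typing import Iterable, Set


def regions_holding_copy(
    all_regions: Iterable[str],
    write_region: str,
    read_regions: Iterable[str],
    eager: bool,
) -> Set[str]:
    """Return the regions that hold a copy of one object after the given reads."""
    holding = {write_region}  # the object always lives where it was written
    if eager:
        # Cache-on-Write: copies are pushed everywhere at write time.
        holding.update(all_regions)
    else:
        # Cache-on-Read: copies appear only where someone actually read the object.
        holding.update(read_regions)
    return holding


all_regions = ["iad", "fra", "sin", "syd"]  # placeholder region codes
reads = ["fra", "fra"]  # the object is only ever read from one remote region

lazy = regions_holding_copy(all_regions, "iad", reads, eager=False)
eager = regions_holding_copy(all_regions, "iad", reads, eager=True)
never_read = eager - lazy  # copies that Cache-on-Write made but nobody used
```

In this example the lazy policy keeps two copies, while the eager one keeps four, two of which are never read. In exchange, the eager policy avoids the slow first read no matter where that read comes from.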

At the moment, there is no cost difference between Cache-on-Read and Cache-on-Write. But it is possible that in the future there will be a cost attached to Cache-on-Write. We just don’t have enough users utilizing the Cache-on-Write feature right now, so we don’t have enough data to come up with a pricing model for it.

