Tigris global storage released a migration tool: shadow buckets

What if we let you define the Cache-Control header at the bucket level? That should address what you are looking to do.

Yeah, writes wouldn't propagate. Depending on what one expects from the shadow bucket mechanism, this might or might not matter.

In any case, being able to define arbitrary default headers (or at least Cache-Control) that are used as a fallback when an object is written without them would be a welcome addition.
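In the meantime, the header can be set per object at upload time through the standard S3 API. A minimal boto3 sketch, where the endpoint URL, bucket, and key are placeholders and credentials are assumed to come from the environment:

```python
import boto3

# Placeholder endpoint and names; credentials are read from the environment.
s3 = boto3.client("s3", endpoint_url="https://fly.storage.tigris.dev")

with open("track-001.mp3", "rb") as f:
    s3.put_object(
        Bucket="my-bucket",
        Key="audio/track-001.mp3",
        Body=f,
        ContentType="audio/mpeg",
        # The value you would otherwise want applied as a bucket-level default.
        CacheControl="public, max-age=86400",
    )
```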

On the Tigris website they say egress is free. Is this correct?

How long does the replication to a new region take?

Is the object originally stored in the region where it was uploaded?

We should very soon have a feature available that will let you set a default Cache-Control header at the bucket level. I will let you know once it is available.

Yes, it is correct that egress is free.

Regarding object storage and replication, the object is indeed stored in the region where it was uploaded. That means you can have a bucket where some objects are stored in Frankfurt, some in San Jose, and some in Sydney. Replication happens in a few seconds and is access-based, meaning that if we learn an object is being accessed in a different region, we replicate it there.

Thanks for the explanation, @ovaistariq.

So how long does an object stay in other regions?

What happens if an object is overwritten? Is this replicated to other regions?

That depends on the access pattern. If it continues being accessed, it will continue to be stored in other regions. And yes, if the object is overwritten, it gets re-replicated.

But how long will it stay cached on the edge if no more requests are accessing a file in a region?

I'm asking because this is a problem I'm having with Cloudflare's CDN. Their edge cache time is quite erratic. Of course when something is accessed frequently it will stay cached on the edge, but it's very rare that this happens in a single region.

My current project is an audio hosting service and I'm very interested in improving the performance of our hot files currently stored in B2 US West. We have users all over the world and some have complained about delays when the file is not cached close to them. I'm wondering if using Tigris would be an improvement over Cloudflare's CDN.

It depends on the file size: small files (<= 1MB) get purged from the cache sooner, as in under an hour, because it is fast to recache them. Large files remain cached for a couple of days. How large are the files in your case? Since these are audio files, I assume they will be in the couple-of-MB range. Our docs have some more information about caching.

Tigris should work well for your use case. You could store the data directly in Tigris instead of B2 and get the data distribution automatically.
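If it helps to picture the switch, here is a rough sketch of copying a single object from B2 into Tigris using two S3-compatible clients. The endpoint URLs, bucket names, key, and credentials are all placeholders, and the shadow-bucket migration from the thread title would avoid doing this by hand:

```python
import boto3

# Both services speak the S3 API; endpoints and credentials below are placeholders.
b2 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-west-004.backblazeb2.com",
    aws_access_key_id="B2_KEY_ID",
    aws_secret_access_key="B2_APP_KEY",
)
tigris = boto3.client(
    "s3",
    endpoint_url="https://fly.storage.tigris.dev",
    aws_access_key_id="TIGRIS_KEY_ID",
    aws_secret_access_key="TIGRIS_SECRET_KEY",
)

# Read the object from B2 and write it to Tigris, preserving the content type.
obj = b2.get_object(Bucket="b2-audio-bucket", Key="hot/track-001.mp3")
tigris.put_object(
    Bucket="tigris-audio-bucket",
    Key="hot/track-001.mp3",
    Body=obj["Body"].read(),
    ContentType=obj.get("ContentType", "audio/mpeg"),
)
```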

Let me know if there is something unclear in my response.

When does replication for redundancy occur (not just the cache)? For example, if the Sydney region had an outage, can files originally uploaded to that region be served by other regions?

You should be able to set a default Cache-Control header for the bucket via the dashboard, under bucket settings. Could you try it out and let me know how it goes?
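One way to verify it, as a sketch with placeholder names and under the assumption that the bucket default is reflected in the object's HEAD response:

```python
import boto3

s3 = boto3.client("s3", endpoint_url="https://fly.storage.tigris.dev")  # placeholder endpoint

# Upload without an explicit Cache-Control, then check what comes back.
s3.put_object(Bucket="my-bucket", Key="check/default-header.txt", Body=b"hello")
head = s3.head_object(Bucket="my-bucket", Key="check/default-header.txt")
print(head.get("CacheControl"))  # expected: the default configured in bucket settings
```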

Replication for redundancy is separate and is always on. That one is not dependent on access.

Yes, 99%+ of the MP3s and FLACs we're storing are above 1MB.

Thanks for all the info, @ovaistariq.

Tigris seems very well suited to our use case, and we will give it a try in a couple of weeks.

That's great to hear. I am looking forward to you using Tigris.

Thank you, it works nicely.

That's great.

I don't suppose there's a possibility that I will be able to restrict which regions may cache data from my buckets in the future? I have hundreds of thousands of sub-megabyte files, so until I also have a great many users, the chances of finding a file in the cache are pretty low with such aggressive eviction times. The chances are further reduced with every additional region that Tigris runs in.

Or what if I could pay extra for storage in order to designate a couple of regions to serve as higher-level caches/replicas that won't ever evict objects as a means of combating latency for intercontinental requests? Say I create a bucket in Frankfurt and upload a million files. Then I choose that I want to replicate the contents in San Jose and Singapore (either on demand or all of the objects immediately). A request from Seattle would first be routed to the ephemeral CDN in Seattle, which would contact the San Jose replica and then Frankfurt as the last resort. At the end of the day, in terms of storage, I would be billed for the million files in Frankfurt and however many object copies exist in the replicas.

That sounds like a reasonable feature request to me.

What we could do is, at the time of object upload, allow you to specify which additional regions to store the data in. We would then store these copies of data for as long as the object exists (in the same sense as replicated copies).

Let me talk to the team about this.
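Purely as an illustration of what the per-object form of this might look like, here is a sketch that passes the desired regions as user metadata at upload time. The metadata key, region codes, and endpoint are assumptions made up for this example, not a documented Tigris API:

```python
import boto3

s3 = boto3.client("s3", endpoint_url="https://fly.storage.tigris.dev")  # placeholder endpoint

with open("track-001.flac", "rb") as f:
    s3.put_object(
        Bucket="my-bucket",
        Key="audio/track-001.flac",
        Body=f,
        ContentType="audio/flac",
        # Hypothetical: request durable copies in Frankfurt, San Jose, and
        # Singapore in addition to wherever the upload lands. The key name
        # and values here are illustrative only.
        Metadata={"requested-regions": "fra,sjc,sin"},
    )
```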

BTW @ovaistariq, do you have any tiered cache strategy?

What I mean is…

Say the origin file is in NY and a user from Tokyo is requesting a file. Maybe they can get the file from a closer location like Singapore instead of having to go across the world.

We are implementing the functionality to tag objects with multiple regions where they will be stored. Do you want to be able to specify this on a per-object basis or at the bucket level? Secondly, could you also share the average object size? If you are uncomfortable sharing this, we can move the conversation to help@tigrisdata.com.