I"m sending streamed data over http via server sent events (mime type text/event-stream). I get inconsistent results when accessing it via fly’s network. It seems like not all data is returned until a certain amount is in the buffer, but that’s just my impression.
When ssh into my instance and connect directly to it with curl I get the results immediately as expected.
So does fly do response buffering for http? And if so, is there a way to turn it off? Or is my problem likely something else entirely?
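For reference, comparing the two paths looks roughly like this (hostname and port are placeholders; -N disables curl's own client-side buffering, and this assumes curl is installed in the instance's image):

# Through Fly's proxy, from my machine:
curl -N https://my-app.fly.dev/sse

# Directly against the app, from inside the instance:
fly ssh console -C "curl -N http://localhost:3000/sse"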
I'm trying to use a text/event-stream response in my Fly application, and I'm seeing response buffering and eventual cancellation of the stream. Specifically, in my production app, events show up every 5 or so seconds instead of every second. In a simple reproduction that sends the time every second, it sends about 11 events, one per second, before freezing and failing to send any more. My production use case is OpenAI chat completion streaming (like ChatGPT), so the buffering makes the streamed typing effect feel far clunkier than it should.
No significant updates to share at this point, but we are looking into this issue and will keep you updated. Thanks for the reports and especially the detailed reproduction steps!
@cbarlow I've just tried to reproduce this issue and couldn't. To be clear, your app definitely shows the problem, but when I launched my own app, it worked correctly.
I’m using curl to test it (I also used it for your app) and I got events every second:
❯ curl -i 'https://sse-test.fly.dev/sse' -N -s | ts
Aug 02 15:04:55 HTTP/2 200
Aug 02 15:04:55 content-type: text/event-stream
Aug 02 15:04:55 cache-control: no-cache
Aug 02 15:04:55 date: Wed, 02 Aug 2023 19:04:55 GMT
Aug 02 15:04:55 server: Fly/a0b91024 (2023-06-13)
Aug 02 15:04:55 via: 2 fly.io
Aug 02 15:04:55 fly-request-id: 01H6VT72E1WNKP16AJ3HZF4M0Q-yul
Aug 02 15:04:55
Aug 02 15:04:55 event:message
Aug 02 15:04:55 data:2023-08-02 19:04:55.492167575 +00:00:00
Aug 02 15:04:55
Aug 02 15:04:56 event:message
Aug 02 15:04:56 data:2023-08-02 19:04:56.493302526 +00:00:00
Aug 02 15:04:56
Aug 02 15:04:57 event:message
Aug 02 15:04:57 data:2023-08-02 19:04:57.494517838 +00:00:00
Aug 02 15:04:57
Aug 02 15:04:58 event:message
Aug 02 15:04:58 data:2023-08-02 19:04:58.495684459 +00:00:00
Aug 02 15:04:58
Aug 02 15:04:59 event:message
Aug 02 15:04:59 data:2023-08-02 19:04:59.495839673 +00:00:00
Aug 02 15:04:59
Aug 02 15:05:00 event:message
Aug 02 15:05:00 data:2023-08-02 19:05:00.496994892 +00:00:00
Aug 02 15:05:00
Aug 02 15:05:01 event:message
Aug 02 15:05:01 data:2023-08-02 19:05:01.498175149 +00:00:00
Aug 02 15:05:01
Aug 02 15:05:02 event:message
Aug 02 15:05:02 data:2023-08-02 19:05:02.49918752 +00:00:00
Aug 02 15:05:02
Aug 02 15:05:03 event:message
Aug 02 15:05:03 data:2023-08-02 19:05:03.500206885 +00:00:00
Aug 02 15:05:03
Aug 02 15:05:04 event:message
Aug 02 15:05:04 data:2023-08-02 19:05:04.501766409 +00:00:00
Aug 02 15:05:04
Aug 02 15:05:05 event:message
Aug 02 15:05:05 data:2023-08-02 19:05:05.50297083 +00:00:00
Aug 02 15:05:05
Aug 02 15:05:06 event:message
Aug 02 15:05:06 data:2023-08-02 19:05:06.504171856 +00:00:00
Aug 02 15:05:06
This is the Rust app’s code:
use std::{convert::Infallible, time::Duration};

use axum::{
    response::sse::{Event, KeepAlive, Sse},
    routing::get,
    Router,
};
use futures::Stream;
use tokio_stream::StreamExt;

#[tokio::main]
async fn main() {
    let app = Router::new().route("/sse", get(sse_handler));

    // Run it with hyper on 0.0.0.0:3000.
    axum::Server::bind(&"0.0.0.0:3000".parse().unwrap())
        .serve(app.into_make_service())
        .await
        .unwrap();
}

async fn sse_handler() -> Sse<impl Stream<Item = Result<Event, Infallible>>> {
    // A `Stream` that emits the current time as an event every second.
    // `map` and `throttle` come from `tokio_stream::StreamExt`.
    let stream = futures::stream::repeat_with(|| {
        Event::default()
            .event("message")
            .data(time::OffsetDateTime::now_utc().to_string())
    })
    .map(Ok)
    .throttle(Duration::from_secs(1));

    Sse::new(stream).keep_alive(KeepAlive::default())
}
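For completeness, a Cargo.toml this compiles against would look roughly like the following (the versions are my assumptions; the code uses the pre-0.7 axum::Server API, so it needs the axum 0.6 line):

[dependencies]
axum = "0.6"
futures = "0.3"
time = "0.3"
tokio = { version = "1", features = ["full"] }
tokio-stream = "0.1"  # default features include "time", needed for throttle()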
@cbarlow @alex-appload are you sure there's no buffering happening in your apps? It could behave differently in production compared to local development.
One thing that bothers me is that it takes ~11 seconds for the Remix repro app to send back the headers. Sending back headers shouldn't be blocked by anything, since this is a streaming response, unless there's processing going on in there (but there isn't in the Remix app).
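If you want to put a number on it, curl's --write-out can report the time to first byte (URL is a placeholder; --max-time just cuts off the otherwise endless stream so curl still prints the timing):

curl -sS -o /dev/null --max-time 15 \
  -w 'time_starttransfer: %{time_starttransfer}s\n' \
  https://my-app.fly.dev/sse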
Interesting that you're not able to reproduce it with Rust. I don't believe there is any buffering on the Remix side. Wouldn't the fact that the code sandbox deployment works rule out Remix adding any buffering? I suppose I could try deploying it somewhere else and see how it behaves, if that would help.
To make it work, I changed the Docker command from "litefs mount" to "npm run start" and the issue went away. I assume it could be related to Consul. So this might be reproducible by using LiteFS with any server. I don't think Remix is the issue here.
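In Dockerfile terms, the change was effectively this (the two commands are the ones from my setup; the comments are my reading of what each one does):

# Through LiteFS: it supervises the app and proxies HTTP to it
CMD ["litefs", "mount"]

# Workaround: run the app directly, bypassing LiteFS entirely
CMD ["npm", "run", "start"]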
That's a good point. The proxy's response copy doesn't flush automatically. I filed an issue on the LiteFS repo and should be able to get a fix out pretty quickly.
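For context, this is the classic proxy failure mode: the body copy loop only writes, it never flushes, so small SSE frames sit in the write buffer. A rough sketch of the flush-per-chunk technique, in Rust to match the repro app above (LiteFS itself is written in Go, so this is the general idea, not the actual patch):

use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};

async fn copy_flushing<R, W>(mut upstream: R, mut client: W) -> std::io::Result<()>
where
    R: AsyncRead + Unpin,
    W: AsyncWrite + Unpin,
{
    let mut buf = [0u8; 8192];
    loop {
        let n = upstream.read(&mut buf).await?;
        if n == 0 {
            return Ok(()); // upstream closed
        }
        client.write_all(&buf[..n]).await?;
        // Without this flush, small SSE frames can sit in the write buffer
        // until enough bytes accumulate -- which looks exactly like the
        // delayed/batched events described in this thread.
        client.flush().await?;
    }
}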
@eugen1993 I implemented a fix (#416) and cut a new LiteFS release (v0.5.8). Streaming responses should work fine now. Let me know if you have any issues.
You’ll just need to update your LiteFS version in your Dockerfile:
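Assuming LiteFS is installed from the official Docker image, that's the usual line to bump (the tag here follows the pattern from the LiteFS docs; adjust to match your Dockerfile):

COPY --from=flyio/litefs:0.5 /usr/local/bin/litefs /usr/local/bin/litefs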