I have a chat integration with OpenAI on my app that is super snappy in my local environment. But in production, the streamed responses are very, very slow.
I don’t have many users on the app, so I can’t imagine that this is a resource allocation issue. I’m experiencing the lag even when I can view the logs and see that I’m the only active user.
What are some steps I could take to diagnose where the bottleneck is and why the app performs so much worse on Fly than on my MacBook Air?
Thanks!