In my case I have three processes, plus LiteFS:
- Nginx
- Bun application
- Rust Executable
What is the best way to set the Dockerfile and the fly.toml?
Perhaps this page will help?
“best way” would depend on what you want. If you want to scale these separately, go with separate VMs (process groups or apps). If there is a strict one-to-one mapping, go with bash/supervisord/procfile.
You could even have your bun or rust executable launch the other processes.
My idea is to use Process groups, but I’m not sure it is the best approach for this case. Do you have any example of something similar?
tl;dr: you probably want to follow the link from rubys, above. perhaps start with a bash script that backgrounds things and then iterate
it’s not clear what the use-case is, but listing some tradeoffs might help you determine what makes sense (i spent a decent amount of time this past weekend on running multiple processes and settled on the first option, below, with hivemind; i’ll add some details for my setup, which i don’t claim is optimal).
from my read on the lay of the land the options are:

use a process manager (bash/Procfile manager/init-like-supervisor)
this is the approach i took because it gave me an http service, background worker, and cron scheduler which could all interact with the primary litefs db (and i didn’t have to figure out `HALT` locks, the litefs proxy, the fly replay header, or a different option for the scheduler).
i ended up using `litefs mount` to mount the database, run any migrations, and then run `hivemind`. in turn, `hivemind` is responsible for running the http server, background worker, and `supercronic`, which are all defined in my `Procfile`.
this was a simple-enough approach to getting all the processes to wait for the sqlite db to be ready before starting up. hivemind is pretty simple so i think it’s possible the cron and background worker could stop running and they wouldn’t get restarted (but it doesn’t pull in python or tmux so that seemed fine for now)
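a sketch of that wiring, assuming a `litefs.yml` whose `exec` section runs migrations and then hands off to hivemind (the `bin/migrate` and `bin/server` commands are placeholders):

```yaml
# litefs.yml (sketch; paths and commands are hypothetical)
fuse:
  dir: "/litefs"
data:
  dir: "/var/lib/litefs"
exec:
  # run migrations only on the candidate (primary) node
  - cmd: "bin/migrate"
    if-candidate: true
  # then hand off to the process manager for the long-running processes
  - cmd: "hivemind /app/Procfile"
```

with a `Procfile` along the lines of `web: bin/server`, `worker: bin/worker`, and `cron: supercronic /app/crontab`.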
use machine processes
this works essentially the same way as above, but has the downside (or upside) that if any given process blows up, the machine gets shut down.
that said, this is already pretty cool. it feels more correct than stitching together a user-space process manager and the different processes can have different images (each http service could be running in a docker image with only the application code and then have a telemetry agent running in an isolated process on the same VM)
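for reference, machine processes are configured through the Machines API machine config, which accepts a `processes` array; a sketch (image, names, and commands are hypothetical):

```json
{
  "config": {
    "image": "registry.fly.io/my-app:latest",
    "processes": [
      { "name": "web",   "cmd": ["bin/server"] },
      { "name": "agent", "cmd": ["bin/telemetry-agent"] }
    ]
  }
}
```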
that said, i’m a big fan of being lazy, and adding a curl command to a Dockerfile and using my `Procfile` seemed really, really easy.
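the “curl command in a Dockerfile” approach might look like this sketch (the base image, hivemind version, and release URL are assumptions; check the project’s releases page before pinning):

```dockerfile
# sketch: pull a process manager into an app image
FROM debian:bookworm-slim

RUN apt-get update && apt-get install -y curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# hypothetical pinned version; verify against hivemind's releases
RUN curl -fsSL \
      https://github.com/DarthSim/hivemind/releases/download/v1.1.0/hivemind-v1.1.0-linux-amd64.gz \
    | gunzip > /usr/local/bin/hivemind \
    && chmod +x /usr/local/bin/hivemind

COPY Procfile /app/Procfile
CMD ["hivemind", "/app/Procfile"]
```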
use apps v2 process groups in `fly.toml`
this is a step along the way to separate apps. with process groups, one ends up with one Firecracker VM per process group replica. per the docs:

> `fly deploy` creates at least one Machine for each process group, and destroys all the Machines that belong to any process group that isn’t defined in your app’s `fly.toml` file.
so this would be nice in the case where one wants n http service replicas and, for each region, maybe a `memcached` process (though that could be done as a separate `memcached` Fly App with its own `fly.toml`, as well).
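a sketch of that shape in `fly.toml` (app name, commands, and ports are hypothetical):

```toml
app = "volcanooo"  # hypothetical app name

[processes]
  web = "bin/server"
  cache = "memcached -m 64"

# services attach to a process group by name
[[services]]
  processes = ["web"]
  internal_port = 8080
  protocol = "tcp"

  [[services.ports]]
    port = 443
    handlers = ["tls", "http"]
```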
what follows, i believe to be the case but would love to be aggressively corrected if it’s incorrect (`dig` would be able to answer this).

use different fly apps
this would be how one would stand up postgres or some other piece of shared something. as far as i can tell (i didn’t read things exhaustively, or do a bunch of sleuthing), process groups are some syntactic sugar around Apps for the extremely common use-case of having side-car like things users want to run logically associated with their application.
i could see a very common pattern being:
- `memcached` as part of their volcano eruption detection app “volcaNooo”
- app + `memcached` as process groups

@woodlandsquid Thanks for the extended reply.
I am working on an example using Query with another service that shares the same LiteFS cluster. Nginx will redirect requests to the necessary service as required. I’m trying the process groups option to make it easier for the user. However, I am facing some challenges because I have to use the same Dockerfile for every service inside the Fly app, which seems unnecessary and causes some problems. Please correct me if I’m wrong about it.

For my use case, the ideal scenario would be to define multiple Dockerfiles within an app, since I want to create a Fly app that includes various independent services. By “independent”, I mean the ability to deploy each service when it’s needed.
~~I’m going to try a multi-app approach~~
It is not possible to use the multi-app approach, since I will have only one primary, and both Query and the App require access to the primary to write.
This is the final approach: running both processes in the same VM. But having a single Dockerfile for everything makes it messy. I have to give it some thought to provide a low-friction solution. Do you have any ideas? Your suggestions and opinions are welcome.
i’d love to see other thoughts, too! (mostly, selfishly because that’d help me internalize the limitations of litefs)
for my use-case larding up the container image made sense (it was a couple of binaries and i approached it like a particularly-constrained interface to generating a VPS image).
adding the additional services/processes to the container also seems like the way to do this without using separate fly apps; process groups, a user-space process manager, and machine processes each currently rely on using a single container.
that said, after looking at Query (which looks cool, i need to spend more time looking at it) i think as you’ve built it, it slots well into being a separate fly app–especially depending on the usage patterns you’d expect between App and Query.
i’m thinking nginx could be a separate fly app and proxy requests to `<app-name>.internal` and `<query-name>.internal` (as appropriate based on path or whatever). in this case each of NGINX, Query, and App would be separate apps. App needs to write to the primary, but Query exposes an HTTP API which would work for that, so App would need to be modified to use the Query HTTP API for modifying data, and Query would be the primary. alternatively, if Query is rarely going to be used for altering data, it could use the `HALT` lock to perform the occasional write on the primary; in this case the primary would be the App.
so the new end-state deployment would be fly proxy to nginx (nginx deployed as an app), nginx does its reverse proxy thing as appropriate to either the Query app or the App app.
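a sketch of the nginx side of that, assuming hypothetical app names and ports, and Fly’s internal DNS resolver at `[fdaa::3]`:

```nginx
# sketch: nginx app proxying to two other fly apps over the private network
# app names and ports are placeholders
resolver [fdaa::3] valid=5s;   # fly's internal DNS

server {
    listen 8080;

    # using variables forces re-resolution of .internal names at request time
    location /query/ {
        set $query http://my-query.internal:3000;
        proxy_pass $query;
    }

    location / {
        set $app http://my-app.internal:8081;
        proxy_pass $app;
    }
}
```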
and then either:
- App writes through the Query HTTP API (Query is the primary), or
- Query uses the `HALT` lock for occasional writes (App is the primary).

that could be tested out pretty quickly by dropping nginx and using ports to expose Query and App as separate fly apps.
I was trying not to interfere with what the Apps were doing so they could decide to use an ORM or query directly to the database, and Query will act as an extra layer to use Query Studio or as an API for other services. That would be the ideal scenario, but the complexity of the architecture to reach it doesn’t seem to be entirely worth it. As you have pointed out, it can be solved using Query as the only service connected to LiteFS and as the source of truth to connect to the databases.
I still have mixed feelings and I would like to try a few cases to see if it’s feasible.
i wonder if using Query with the `HALT` lock for in-region cross-machine writes to the primary would be fast enough for most management use-cases and still be a great experience. if that’s the case, Query and App could run as different apps (and then leave the reverse proxying as an exercise to the reader?). i can probably set up some workers to run with the `HALT` lock without too much work, to test out what that looks like and get some metrics for the different worker processes.
I have decided to have both processes on the same virtual machine to benefit from reading and writing directly from memory. I have added a simple proxy to Query, making it straightforward to send requests to the App process from Query. Additionally, this approach allows the LiteFS proxy to handle the reads and writes from the primary and the replicas efficiently.
There is always the option to use Query as an API service and have different VMs for the App and Query.
This way, we have covered both scenarios.