Hi everyone! I have a process that needs to fetch data from websites and store it in a Fly.io Postgres database that other Fly.io apps will use.
I could run the process locally, but that would require storing the database credentials locally, which isn't ideal.
So the question is: can I run a script like this on Fly.io? And if so, what is the best way to do it?
I guess it depends on what you need (e.g. just text/HTML for a search index vs. rendered pages for screenshots).
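For the plain text/HTML case, here is a minimal sketch in Python using only the standard library. The function name `fetch_html`, the User-Agent string and the 30-second timeout are placeholders I made up, not anything Fly-specific, so adjust to taste:

```python
import urllib.request


def fetch_html(url, timeout=30):
    """Fetch a URL and return its body decoded as text."""
    request = urllib.request.Request(
        url,
        # Hypothetical User-Agent; some sites reject requests without one.
        headers={"User-Agent": "example-fetcher/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Use the charset declared in the response headers, else assume UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Something like `html = fetch_html("https://example.com/")` would then give you the raw markup to parse or store. This does not execute JavaScript, so it won't help if you need rendered pages.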
You could run something like that on Fly. Any language/library will have some way to make a web request and so fetch a site as HTML. If you want rendered output, there are tools like Puppeteer (https://pptr.dev/) for Node. I can’t imagine there is any ToS issue with it, as Fly have an official guide for doing just that:
You may want to invoke that manually, but if you wanted to automate it (e.g. fetch a site every hour to look for changes) you would need some kind of cron. There isn’t an official Fly cron service (that I know of), but that can be done too, e.g. with supercronic:
What I have is a script that takes a few hours to run… It should only run once (it’s not a server).
As Greg pointed out, supercronic works nicely enough… as do scheduled Machines: New feature: Scheduled machines
The docs around scheduled machines are scant, though one could take hints from other Machine docs: Guides and Examples · Fly Docs
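Whichever way you schedule it, the script itself can stay a plain run-to-completion program that reads the database credentials from the environment rather than from a local file. A rough sketch in Python, assuming the connection string is exposed to the app as a `DATABASE_URL` environment variable (for example via a Fly secret; the variable name is my assumption, so check what your app actually has set). `run_job` is a hypothetical stand-in for the real fetch-and-store work:

```python
import os
import sys
from urllib.parse import urlsplit


def database_settings(environ=os.environ):
    """Read Postgres connection settings from DATABASE_URL (assumed name)."""
    url = environ.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL is not set")
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,  # default Postgres port
        "user": parts.username,
        "password": parts.password,
        "dbname": parts.path.lstrip("/"),
    }


def run_job(settings):
    """Hypothetical placeholder for the long-running fetch-and-store work."""
    print(f"would connect to {settings['host']}:{settings['port']}/{settings['dbname']}")


def main(environ=os.environ):
    """Run the job once; return 0 on success, 1 if configuration is missing."""
    try:
        settings = database_settings(environ)
    except RuntimeError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    run_job(settings)
    return 0
```

Ending the script with `sys.exit(main())` makes the process exit when the work is done, with a non-zero status if the configuration is missing, which suits a job that should run once rather than stay up like a server.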