Launch and Manage Cron Jobs Effortlessly

Over the past year, we’ve noticed that many users have encountered challenges while trying to integrate Cron into their projects. In response, we’ve developed a new open-source project designed to help simplify the process.

The project can be found here: Cron Manager

Notable Benefits and Features

Isolated execution

Each job runs in its own isolated machine, preventing issues such as configuration drift, accumulation of temporary files, or other residual effects that could impact subsequent job executions. This isolation ensures that the outcome of one job does not negatively influence another, maintaining the integrity and reliability of each job.

Centralized Scheduling

Manage all your Cron jobs centrally with a simple JSON configuration. This approach removes the need to embed cron dependencies within each production environment, streamlining setup and modifications. The use of a version-controlled configuration file enhances maintainability and auditability of scheduling changes.

Simplified updates

Machines dedicated to specific Cron jobs are ephemeral and do not require updates. Any modifications to the schedules.json file will automatically be applied the next time the machine is provisioned for a scheduled job. This eliminates the need for ongoing maintenance of job environments, resulting in a more efficient update process.

Enhanced Logs and Monitoring

Operating separate machines for each job greatly simplifies monitoring and auditing. This setup allows for straightforward tracking of the outcomes and logs of individual jobs, facilitating easier debugging and performance analysis.

Why the Standalone App?

Setting up a separate application just to handle a single cron job might seem excessive, but there are several compelling reasons why running this within its own App can still be beneficial:

Efficiency and Flexibility

The solution is designed to be lightweight and flexible, ensuring that the resource footprint remains minimal. This approach allows for quick adaptations and modifications without significant overhead.

Environment Agnosticism

By rolling this into its own project, we avoid making assumptions about your specific deployment environment. This separation ensures that the cron job manager can operate independently of the various systems it might interact with, enhancing compatibility and ease of integration.

Isolation of Dependencies and Scheduling

Dependencies and scheduling are completely isolated from your production environments. This means there is no need to modify or overload your Dockerfiles or other configuration files, keeping your production environments clean and focused solely on delivering their intended services.

Paving the Way for Future Features

This project also serves as a testing ground for a potential native cron feature. By isolating it in this way, we can experiment and refine the functionality without impacting existing setups and gather insights/feedback that can inform the development of a more integrated solution in the future!

Feedback

We’re eager to see how this can streamline your Cron job management. Give it a try, share your feedback, and help shape the future of Cron management on Fly!

@pmbanugo

thanks for sharing Jay

Hello, I deployed the cron-manager, but the job I set up is not being executed. I get this error:

2024-04-27T11:21:01.753 app[6e82992c129ed8] ord [info] api | INFO[0492] Preparing job... app-name=finohra job-id=1483 schedule=update-invoice-status

2024-04-27T11:21:01.844 app[6e82992c129ed8] ord [info] api | ERRO[0492] job processing failed app-name=finohra error="failed to launch machine: failed to launch VM: You must be authenticated to view this." job-id=1483 schedule=update-invoice-status

2024-04-27T11:21:01.844 app[6e82992c129ed8] ord [info] api | ERRO[0492] failed to process job error="failed to provision machine: failed to launch machine: failed to launch VM: You must be authenticated to view this."

Here is what my schedules.json looks like:

[
  {
    "name": "update-invoice-status",
    "app_name": "finohra",
    "schedule": "0 * * * *",
    "region": "fra",
    "command": "bundle exec rails runner 'UpdateInvoiceAndPaymentStatusJob.perform_now'",
    "command_timeout": 60,
    "enabled": true,
    "config": {
      "metadata": {
        "fly_process_group": "cron"
      },
      "auto_destroy": true,
      "disable_machine_autostart": true,
      "guest": {
        "cpu_kind": "shared",
        "cpus": 1,
        "memory_mb": 512
      },
      "image": "ruby:3.3.0-slim",
      "restart": {
        "max_retries": 1,
        "policy": "no"
      }
    }
  }
]

I tried resetting my token:

fly secrets set FLY_API_TOKEN=$(fly auth token)

and logout/login:

flyctl auth logout
flyctl auth login

I also tried creating a Fly deploy token:

fly tokens deploy -a finohra-cron-manager

and then I set the fly secret with the generated token:

fly secrets set FLY_API_TOKEN="FlyV1 fm2_lJPECAAAAAA...MBD9Tw=" -a finohra-cron-manager

and redeployed again, but it did not help. Can you please advise how to fix this? Thank you.

fly tokens deploy -a finohra-cron-manager

This will generate an app-specific token, which won’t work if you’re trying to create a machine in a different app.

I would suggest generating an org-wide token:

fly tokens org <org slug>

Let me know if that helps!

Thank you for the suggestions. I was able to resolve the “authentication” error by creating and setting a deploy token, but now I’m getting different errors. Someone reopened the original topic, so to avoid duplication, please respond there: cron-manager failed to provision machine error - #5 by janfly

Rudimentary question: is this Fly-specific? Where does this state (jobs and schedules) reside, and how does fly know when to launch a new machine for a job? Basically, does this take up any resources when no jobs are running?

where does this state (jobs and schedules) reside

Within a sqlite db under /data

how does fly know when to launch a new machine for a job.

This isn’t a platform-level feature; the logic is contained within the App.

Basically, does this take up any resources when no jobs are running?

Yes, however, you should be able to maintain this with minimal resources.

@shaun Does this still work? I have a basic curl set up in my cron schedule, but it’s stuck in “running” based on the output of cm jobs list 1. I’ve confirmed the intended target has not received a GET and that I can curl it from my machine.

schedules.json

[
  {
    "name": "trigger-job-enrichment",
    "app_name": "startcast-api",
    "schedule": "10 * * * *",
    "region": "bom",
    "command": "curl https://REDACTED/cron",
    "command_timeout": 30,
    "enabled": true,
    "config": {
      "metadata": {
        "fly_process_group": "cron"
      },
      "auto_destroy": true,
      "disable_machine_autostart": true,
      "guest": {
        "cpu_kind": "shared",
        "cpus": 1,
        "memory_mb": 512
      },
      "image": "ghcr.io/livebook-dev/livebook:0.11.4",
      "restart": {
        "max_retries": 1,
        "policy": "no"
      }
    }
  }
]

fly.toml

# fly.toml app configuration file generated for cron-manager on 2024-04-11T14:24:13-05:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#

app = 'starcast-cron-manager'
primary_region = 'bom'

[[mounts]]
  source = 'data'
  destination = '/data'

[[vm]]
  memory = '512'
  cpu_kind = 'shared'
  cpus = 1

cm jobs list 1

|----|----------------|---------|-----------|-------------------------|-------------------------|-------------|
| ID | MACHINE ID     | STATUS  | EXIT CODE | CREATED AT              | UPDATED AT              | FINISHED AT |
|----|----------------|---------|-----------|-------------------------|-------------------------|-------------|
| 2  | 48ed67ea3d4018 | running | 0         | 2024-08-22 21:10:01 UTC | 2024-08-22 21:10:04 UTC |             |
| 1  | 080e47df109368 | running | 0         | 2024-08-22 21:00:01 UTC | 2024-08-22 21:00:04 UTC |             |
|----|----------------|---------|-----------|-------------------------|-------------------------|-------------|

It looks like you’re using the example image, which may not be intended.

You should be able to see what’s going on by looking at the individual machine logs:

fly logs -i 48ed67ea3d4018 --app startcast-api

Oh I assumed it was some random flavor of Unix, should’ve been the first place I looked :slight_smile: Still getting used to Fly logs, assumed everything in the UI was all the output from the machine

Will retry with a proper distro, assume this is working unless I message here again. Thanks!

EDIT: Maybe the docs/README could emphasize that the image should be updated since I assumed it was a working template and blew through setting it up!

@shaun Okay I might be going crazy here, likely because I don’t exactly know how the cron-manager works under the hood.

I tried this with ubuntu, it clearly says curl doesn’t exist; fair that isn’t included. I tried an apt-get install curl; curl my_url and it hit me with this:

2024-08-22T22:50:07Z app[784e66dc214478] bom [info]E: Unable to locate package curl
2024-08-22T22:50:07Z app[784e66dc214478] bom [info]E: Unable to locate package &&
2024-08-22T22:50:07Z app[784e66dc214478] bom [info]E: Unable to locate package curl
2024-08-22T22:50:07Z app[784e66dc214478] bom [info]E: Unable to locate package https://my_api/enrich-jobs?source

schedules.json (I’ve tried multiple variations of the command, including with a semicolon)

[
  {
    "name": "trigger-job-enrichment",
    "app_name": "startcast-api",
    "schedule": "*/10 * * * *",
    "region": "bom",
    "command": "(apt-get install curl) && curl \"https://my_api/cron/enrich-jobs?source=cron\"",
    "command_timeout": 30,
    "enabled": true,
    "config": {
      "metadata": {
        "fly_process_group": "cron"
      },
      "auto_destroy": true,
      "disable_machine_autostart": true,
      "guest": {
        "cpu_kind": "shared",
        "cpus": 1,
        "memory_mb": 512
      },
      "image": "ubuntu:22.04",
      "restart": {
        "max_retries": 1,
        "policy": "no"
      }
    }
  }
]

How can I run an install command (ideally without a custom image)?

I haven’t tested this, but you may need to update first.

Could try something like this:

apt-get update && apt-get install -y curl && curl "https://my_api/cron/enrich-jobs?source=cron"

Looks like it assumes everything after apt-get update is a flag/command to that command

 Preparing to run: `apt-get update && apt-get install curl && curl https://my_api/cron/enrich-jobs?source=cron)` as root
2024-08-22T23:40:07Z app[d891245f423798] bom [info] INFO [fly api proxy] listening at /.fly/api
2024-08-22T23:40:07Z app[d891245f423798] bom [info]2024/08/22 23:40:07 INFO SSH listening listen_address=[fdaa:5:b702:a7b:177:c9f8:f8fc:2]:22 dns_server=[fdaa::3]:53
2024-08-22T23:40:07Z runner[d891245f423798] bom [info]Machine created and started in 4.378s
2024-08-22T23:40:07Z app[d891245f423798] bom [info]E: The update command takes no arguments

Make sure to use apt-get install -y curl.

The update command takes no arguments

That’s interesting. :thinking:

@faizanali

Aight, I confirmed that this works:

 "command": "sh -c 'apt-get update && apt-get install -y curl && curl ...'",
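For anyone hitting the same error: the log lines above (`Unable to locate package &&`, `The update command takes no arguments`) suggest the command string is split into words and executed directly rather than through a shell, so `&&` reaches `apt-get` as a plain argument. That is an inference from the logs, not something I’ve confirmed in the cron-manager source. Here is a minimal local sketch of the difference, with `echo` standing in for `apt-get` and an unquoted variable expansion standing in for the (assumed) word splitting:

```shell
# Illustration only: mimics a runner that splits a command string on
# whitespace and executes it directly, without a shell. This is an
# assumption about cron-manager, inferred from the apt-get errors above.

cmd='echo one && echo two'

# 1. Direct execution: the words become plain arguments, so "&&" is
#    handed to echo instead of being treated as a shell operator.
set -f                  # disable globbing while the string is word-split
direct=$($cmd)
set +f

# 2. Wrapped in sh -c: a shell parses the string, so "&&" chains commands.
wrapped=$(sh -c "$cmd")

printf 'direct:  %s\n' "$direct"
printf 'wrapped: %s\n' "$wrapped"
```

In the direct case `echo` receives `&&`, `echo`, and `two` as arguments and prints them back, which matches `apt-get` complaining about a package named `&&`. Wrapping the whole thing in `sh -c '...'` hands the string to a shell first, which is why the `"command"` above works.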

Thanks Shaun. I was in a hurry so just used a third party cron service but will come back to this in a few weeks and test it out :slight_smile:

@shaun perhaps I’m speeding past the critical documentation, but what does everything in schedules.json do? What should be changed and to what? “image” is mentioned above, but it’s still not clear what this image is used for and how one should determine an appropriate image. Does cron-manager depend on existing fly apps or is the idea that all scheduled commands reside in the cron-manager app? I’m at a loss for how to translate an existing environment where five different scripts are executed on five different schedules. Should that be 1 cron-manager app and 5 independent fly apps or should that be a single cron-manager app with all the scripts? If all scripts are in the single app, are those to be created via the Dockerfile or some other means?

Related, how do I manage env vars for the 5 scripts? Do they all get relayed from secrets defined on the cron-manager app?

I think this is the critical documentation I was missing. Seems like cron-manager is not designed to manage the actual commands being run on a schedule, which seems to indicate a dependency on at least one other Fly app. So the idea is that the Fly app that cron-manager depends on has no machines running, and cron-manager will start a machine just for this scheduled job.

But still, what’s the image for? I found this, which seems to be the most relevant documentation, but it’s not clear why an image would be specified to create a machine for an existing app. Doesn’t my app Dockerfile handle this? Is “image” optional? Is the entire “config” optional? Shouldn’t all of this be specified in an app’s fly.toml?