VM gets shut down immediately ... what am I doing wrong?


Might well be something obvious, but I can’t see what’s wrong.

I have just made an app which basically consists of a vector.toml file.

Input: syslog over UDP.
Output: New Relic (to test whether it works).

So my Dockerfile installs vector and then runs it, telling it the path to that custom config file. Seems to be all fine since I can see in the logs it starts running and seems happy, config valid etc. I’ve tried breaking the config file and that gets spotted, so it must be ok.

But … the VM gets shut down. Hmm. Why would that be?

I wondered whether it is because I have no healthchecks defined in the toml. But that setting is called tcp_healthchecks, and I’m using UDP, so I didn’t think those would work.

Fly does see the VM as healthy. The deploy succeeds. But then when I try to send any messages to it, nothing happens. Which would be explained if it is shut down.

What’s strange is that flyctl status shows the app as running, but flyctl logs suggests otherwise. I see this:

2021-04-14T20:37:42.260Z 8ad3740e lhr [info] Pulling container image
2021-04-14T20:37:45.529Z 8ad3740e lhr [info] Unpacking image
2021-04-14T20:37:45.721Z 8ad3740e lhr [info] Preparing kernel init
2021-04-14T20:37:46.036Z 8ad3740e lhr [info] Configuring firecracker
2021-04-14T20:37:46.127Z 8ad3740e lhr [info] Starting virtual machine
2021-04-14T20:37:46.263Z 8ad3740e lhr [info] Starting init (commit: 0512da4)...
2021-04-14T20:37:46.284Z 8ad3740e lhr [info] Running: `/vector/bin/vector --config /app/vector.toml` as root
2021-04-14T20:37:46.288Z 8ad3740e lhr [info] 2021/04/14 20:37:46 listening on [IP]:22 (DNS: [IP]:53)
2021-04-14T20:37:46.342Z 8ad3740e lhr [info] Apr 14 20:37:46.340  INFO vector::app: Log level is enabled. level="vector=info,codec=info,vrl=info,file_source=info,tower_limit=trace,rdkafka=info"
2021-04-14T20:37:46.344Z 8ad3740e lhr [info] Apr 14 20:37:46.342  INFO vector::sources::host_metrics: PROCFS_ROOT is unset. Using default '/proc' for procfs root.
2021-04-14T20:37:46.346Z 8ad3740e lhr [info] Apr 14 20:37:46.344  INFO vector::sources::host_metrics: SYSFS_ROOT is unset. Using default '/sys' for sysfs root.
2021-04-14T20:37:46.349Z 8ad3740e lhr [info] Apr 14 20:37:46.348  INFO vector::app: Loading configs. path=[("/app/vector.toml", None)]
2021-04-14T20:37:46.403Z 8ad3740e lhr [info] Apr 14 20:37:46.402  INFO vector::topology: Running healthchecks.
2021-04-14T20:37:46.405Z 8ad3740e lhr [info] Apr 14 20:37:46.404  INFO vector::topology::builder: Healthcheck: Passed.
2021-04-14T20:37:46.407Z 8ad3740e lhr [info] Apr 14 20:37:46.406  INFO vector::topology: Starting source. name="syslog"
2021-04-14T20:37:46.408Z 8ad3740e lhr [info] Apr 14 20:37:46.407  INFO vector::topology: Starting sink. name="new_relic"
2021-04-14T20:37:46.409Z 8ad3740e lhr [info] Apr 14 20:37:46.408  INFO source{component_kind="source" component_name=syslog component_type=syslog}: vector::sources::syslog: Listening. addr= type="udp"
2021-04-14T20:37:46.411Z 8ad3740e lhr [info] Apr 14 20:37:46.409  INFO vector: Vector has started. version="0.12.2" git_version="v0.12.2" released="Tue, 30 Mar 2021 19:11:37 +0000" arch="x86_64"
2021-04-14T20:37:46.413Z 8ad3740e lhr [info] Apr 14 20:37:46.411  INFO vector::internal_events::api: API server running. address= playground=
2021-04-14T20:38:12.123Z 3bdaab5d lhr [info] Shutting down virtual machine
2021-04-14T20:38:12.212Z 3bdaab5d lhr [info] Sending signal SIGINT to main child process w/ PID 503
2021-04-14T20:38:12.214Z 3bdaab5d lhr [info] Apr 14 20:38:12.212  INFO vector: Vector has stopped.
2021-04-14T20:38:12.218Z 3bdaab5d lhr [info] Apr 14 20:38:12.216  INFO source{component_kind="source" component_name=syslog component_type=syslog}: vector::sources::syslog: Finished sending.
2021-04-14T20:38:13.220Z 3bdaab5d lhr [info] Starting clean up.
2021-04-14T20:38:13.220Z 3bdaab5d lhr [info] Main child exited normally with code: 0

As you can see, vector starts up, and then the VM shuts down :) I tried enabling and disabling the vector API, but the result is the same. How do I keep the VM (and vector) running 24/7?

Do I need to run vector as a service? But even if I do, if the VM is shut down, that’s not going to help.


Oh, wait, the 8ad3740e is the VM ID … I assume.

In which case, it is the prior VM that is being shut down. And so the new one therefore remains running. Ah … well that’s good then. That means it is running.

Maybe there is some other issue, then, since I’m not able to SSH into it either, though I can into other apps (TCP ones).

No healthchecks is fine!

Looks like 3bdaab5d was shut down, but 8ad3740e has been running for the last 20 minutes. The shutdown seems to have been caused by a deploy you made 20 minutes ago.

Are you sure the messages aren’t getting through? Would it log something somewhere? With UDP, it can be hard to tell.

We’ll still look into it on our end :)
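
For anyone else reading logs like the ones above: the second column is the instance ID, so grouping lines by that column shows which VM actually shut down. Here is a minimal sketch in Python, assuming each line has the shape `timestamp instance region [level] message` as in the pasted output. The function names are made up for illustration and are not part of flyctl.

```python
from collections import defaultdict


def group_by_instance(log_text):
    """Group log lines by the instance ID found in the second column."""
    groups = defaultdict(list)
    for line in log_text.splitlines():
        # Assumed shape: timestamp, instance ID, region, [level], message
        parts = line.split(maxsplit=4)
        if len(parts) < 5:
            continue  # skip blank or unexpected lines
        timestamp, instance, _region, _level, message = parts
        groups[instance].append((timestamp, message))
    return dict(groups)


def instances_shut_down(groups):
    """Return the IDs of instances whose log contains a shutdown message."""
    return sorted(
        instance
        for instance, entries in groups.items()
        if any(msg.startswith("Shutting down virtual machine") for _, msg in entries)
    )


# Three lines taken from the log in the original post.
sample = """\
2021-04-14T20:37:46.127Z 8ad3740e lhr [info] Starting virtual machine
2021-04-14T20:38:12.123Z 3bdaab5d lhr [info] Shutting down virtual machine
2021-04-14T20:38:13.220Z 3bdaab5d lhr [info] Main child exited normally with code: 0
"""

groups = group_by_instance(sample)
print(instances_shut_down(groups))  # ['3bdaab5d']
```

Only 3bdaab5d shows up as shut down, which matches the explanation that the old VM was replaced by the deploy while 8ad3740e kept running.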

Ok, great, so it’s not the lack of tcp_healthchecks then. I didn’t think so, but good to know.

And given the different VM IDs (which I only just spotted), the latest VM does stay running. So that’s good too.

So yes, the weird thing is the messages aren’t being processed. Trying to figure out why. So far I haven’t set up any iptables or anything that would get in the way.

As regards ports and UDP … am I right in thinking you don’t support the syslog port 514 externally? What I did was pick a random port, 10000, as the external port, and then set the internal port to 514. I’m not sure that is correct, but I can’t see how else to do it if I can’t expose 514 to the outside world.

For Anycast UDP, we actually don’t have a port filter; you should be able to use 514/udp.

Ah … I didn’t know that. Maybe that’s the issue.

Ok, I think I have it working.

I think the issue is with the CDN, since test messages I send with nc to port 514 over UDP do appear in New Relic :) So the app and vector part is working. Thanks for the help with this.
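
In case it helps anyone repeating the nc test from a script, here is a rough Python sketch of the same idea: build a BSD-style syslog line and send it as a single UDP datagram. The helper names, the `test-host` hostname, and the `test` tag are placeholders I made up. The demo at the bottom only sends to a throwaway listener on loopback so it can be run safely; to test a real deployment you would pass your app's address and port 514 instead.

```python
import socket
from datetime import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def build_syslog_message(text, facility=1, severity=6,
                         hostname="test-host", tag="test", now=None):
    """Build a BSD-style syslog line: <PRI>TIMESTAMP HOSTNAME TAG: MSG."""
    now = now or datetime.now()
    pri = facility * 8 + severity  # facility 1 (user), severity 6 (info) -> 14
    # Day of month is space-padded to two characters, e.g. "Apr  4".
    timestamp = f"{MONTHS[now.month - 1]} {now.day:2d} {now:%H:%M:%S}"
    return f"<{pri}>{timestamp} {hostname} {tag}: {text}".encode("utf-8")


def send_syslog_udp(host, port, payload):
    """Send one datagram; UDP gives no confirmation that it arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# Local demo: a throwaway loopback listener stands in for the real app.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
receiver.settimeout(2.0)
demo_port = receiver.getsockname()[1]

send_syslog_udp("127.0.0.1", demo_port,
                build_syslog_message("hello from the test script"))
data, _addr = receiver.recvfrom(4096)
receiver.close()
print(data.decode("utf-8"))
```

Because UDP is fire-and-forget, a successful send says nothing about delivery; the only real confirmation is the message showing up at the sink, as it did in New Relic here.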