Docker Container Management and Persistence After Internal Errors

We’ve run into an interesting issue with our Docker-based proxy apps: an internal Fly error causes the container to restart. Everything comes back up without issue, except for the last script run in our Dockerfile. Because we have to create Elastic indices at build time (based on hostname) and wait for the VPN connection to become active, Filebeat is started under our supervisor only after indexing, via a setup script run from the Dockerfile. Unfortunately, Filebeat isn’t starting back up as expected after these restarts, and I’m not entirely sure why yet.

For clarity, this is where the script is run:

# setup filebeat: start supervisord (daemonized), then run the setup script
CMD /usr/bin/supervisord -c /etc/supervisor/supervisord.conf \
    && /root/filebeat_setup.sh

My questions:

  1. Is there any information that would help us fully understand how these restarts differ from a new build? (e.g. it seems the environment is reset to the last build and the hostname persists, but the index/Filebeat script from the Dockerfile isn’t executed)

  2. Is there a way for us to persist the hostname, or to have a container-only hostname (specifically to avoid having to create new indices each time)?

  3. Is there a way to add persistence to the current environment, or avoid the container restarts?

  4. Is there a good way to monitor these restarts, so we can verify our Filebeat connection?

We of course just want to make sure everything persists in any situation, and while we can work around the current issue as-is, it’s becoming a bit too messy for my liking. Thanks in advance.

Hi @dbrown

It looks like you’re using supervisord, which can restart processes when they crash or exit. Have you confirmed that it was the container restarting and not supervisord restarting the process?
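One way to tell the difference: supervisord restarts are governed by each program stanza’s autorestart setting, so a stanza along these lines (the program name and paths here are just an example, not your actual config) would silently bring the process back after a crash:

    ; example stanza: autorestart=true means supervisord itself restarts
    ; the process on exit, which can look a lot like a container restart
    [program:filebeat]
    command=/usr/bin/filebeat -e -c /etc/filebeat/filebeat.yml
    autostart=false
    autorestart=true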

As for persistence, you can use Fly volumes (Volumes · Fly Docs). These are volumes that survive restarts, VM deletion/recreation, etc.
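For example, something roughly like this creates a volume and mounts it into the VM (the volume name, size, and mount path are placeholders, and you may also need a region flag):

    # create a small volume for the app (name/size are examples)
    fly volumes create filebeat_data --size 1

    # fly.toml: mount the volume into the VM
    [mounts]
      source = "filebeat_data"
      destination = "/data"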

If you want to see logs for your app you can use flyctl logs (flyctl logs · Fly Docs), or if you want to see the history of a specific VM you can use flyctl vm status (flyctl vm status · Fly Docs).
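Roughly like this (substitute your own app name and VM ID):

    # tail live logs for the app
    fly logs -a my-proxy-app

    # show status and event history for a specific VM
    fly vm status <vm-id> -a my-proxy-app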

Thanks for the reply @charsleysa

Yes, we currently use supervisord to auto-restart all processes, but the problem is that the Filebeat process never starts in the first place, since it has to be launched from that script. It’s currently difficult to see the logs because we overrun the buffer with nginx logs written to stdout (we can remove those once Filebeat is reliable), but we can definitely see it’s a container restart, since the uptime has reset. Unfortunately, we can’t just have supervisord start Filebeat on startup, because the index script has to run first, and we of course don’t want that script run on every process auto-restart.
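To make the constraint concrete, here’s a minimal sketch of the kind of wrapper we’d need (the marker path and filebeat command are assumptions): supervisord starts the wrapper, the wrapper runs the index script only once per container lifetime and then execs Filebeat, so process auto-restarts don’t re-run the indexing:

    #!/bin/sh
    # hypothetical wrapper, started by supervisord instead of filebeat directly
    MARKER=/var/run/filebeat_setup.done  # assumption: cleared on container restart, kept across process restarts
    if [ ! -f "$MARKER" ]; then
        # one-time index setup; let supervisord retry the wrapper if it fails
        /root/filebeat_setup.sh || exit 1
        touch "$MARKER"
    fi
    # exec so supervisord ends up supervising filebeat itself
    exec /usr/bin/filebeat -e -c /etc/filebeat/filebeat.yml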

Volumes may make things easier, but I’m not entirely sure. :thinking: We don’t really need to persist the data so much as avoid having to recreate the indices. Does the random hostname persist with these volumes?

I do think the biggest win would be removing the need for the initial indexing, but I’m not entirely sure how to accomplish that, since we have to update the index with each hostname change, and re-initializing on every restart will break Filebeat in certain situations. Even so, I’m open to any suggestions.
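One direction we’re considering, assuming a volume mounted at /data as suggested above (the file name and index naming scheme are made up for illustration): persist the chosen index name on the volume, so a restarted container with a new hostname reuses the existing index instead of creating a new one:

    #!/bin/sh
    # hypothetical: reuse a stable index name stored on a persistent volume
    INDEX_FILE=/data/filebeat_index_name
    if [ -f "$INDEX_FILE" ]; then
        # a previous boot already picked an index; reuse it
        INDEX_NAME=$(cat "$INDEX_FILE")
    else
        # first boot: derive a name from the current hostname and remember it
        INDEX_NAME="proxy-$(hostname)"
        echo "$INDEX_NAME" > "$INDEX_FILE"
    fi
    # ...create the Elastic index (if missing) and point Filebeat at $INDEX_NAME...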

I’ve decided to look into ways to consolidate our Filebeat setup altogether, so we can avoid this issue. I’m still not entirely sure why the script is failing on restart, but I’ll add some verbose logging to the script this week so we can at least figure out why it only fails during restarts. Cheers!
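For anyone following along, the plan is roughly a preamble like this at the top of filebeat_setup.sh (the log path is arbitrary):

    #!/bin/sh
    # trace every command and capture all output to a file,
    # so restart failures are visible even when stdout is flooded
    set -eux
    exec >>/var/log/filebeat_setup.log 2>&1
    echo "setup started at $(date)"
    # ...existing setup steps...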