Deployment fails here and there

Looking into it… we’re working on improving reliability here, will post an update once I have more info.

This is exactly the same problem I’ve been having all day. I had to deploy a version for a presentation and couldn’t :grimacing:

Yikes. That sucks, sorry to hear that. Inside Fly these problems are just some package or other acting up under load, but to our customers they’re real-life problems that are often personal and irritating. I don’t have a quick answer, but the good news is that this should get better the more normal and edge cases we fix.

All I can say is that I look forward to your business support plans :slight_smile: (and to the platform becoming more stable; I’d prefer to use you guys instead of the de facto choice, which always ends up being AWS).

Has there been any movement with this? Went to deploy this morning and still having it complain about there being a DB missing etc.

Command: /app/bin/app eval App.Release.migrate_all
	Starting instance
	Configuring virtual machine
	Pulling container image
	Unpacking image
	Preparing kernel init
	Configuring firecracker
	Starting virtual machine
	Starting init (commit: 50ffe20)...
	Preparing to run: `/app/bin/app eval App.Release.migrate_all` as nobody
	2021/10/26 19:59:01 listening on [fdaa:0:35d4:a7b:2984:821:7a9b:2]:22 (DNS: [fdaa::3]:53)
	Reaped child process with pid: 561 and signal: SIGUSR1, core dumped? false
	19:59:05.678 [error] Could not create schema migrations table. This error usually happens due to the following:
	  * The database does not exist
	  * The "schema_migrations" table, which Ecto uses for managing
	    migrations, was defined by another library
   ....

We handled the original issue that caused this error a few weeks ago, so it’s odd that it’s still showing up on your app. Could you confirm that the connection limits are not being hit on your DB instance, and whether there are any other errors in the DB logs?

Also, is this the same application where you noticed this error? (See the thread “deployment issues in SYD region”.)

Correct.

As for connection limits, nope: there is only one machine connected to the DB. I’ve logged into the DB to check for locks, etc., and it’s all fine. So I’m confused.

Have any changes or overrides been made to the DNS resolvers in the app?

Nope, it’s a standard Elixir app using the Dockerfile posted in your guides.

I’m replying in the other thread, “deployment issues in SYD region”, to limit the spread of answers, if that is okay?

Yeah, let’s move this there, I’ll link the DB errors.

Will you try this again and also make sure your database hasn’t reached a connection limit and the DB logs aren’t showing any errors? That particular Elixir error is not super helpful, but it’s probably not an issue on our end (this time).

Hey there! This started happening to me as well today, around 3 hours ago.
The issue seems to be the same: migrations cannot be run due to a connection error, while the deployed app is working fine :thinking:

Hey @flyio3! Sorry to see you’ve been struggling here. I’d love to help get this resolved if I can so we can improve the docs or whatever else is needed to make it better for you and others.

So let me restate what I understand the setup to be… please correct where I’m wrong.

  • You are hosting a single application instance
  • You are hosting it in syd
  • You have a single postgres database in syd (not using read-replicas)
  • Are you using Phoenix 1.6 with esbuild? I ask because newer versions generate the application differently.
  • You aren’t using the fly_postgres hex package (just making sure)
  • The DATABASE_URL is set (fly secrets list)

Sometimes the logs can provide more information. If a deploy fails, just run fly logs to see if there’s any more info there.

When I’ve seen this problem, it’s generally because of one of the following:

  • A missing inet6 config for IPv6 support (app can’t see the database)
  • It’s multi-region and the app doesn’t know what the primary region is supposed to be
  • It’s multi-region and the app is being started in a backup region which doesn’t have the database
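
For the first item, here is a minimal sketch of what the inet6 setting usually looks like in a Phoenix app’s config/runtime.exs, assuming an Ecto repo backed by Postgres. The names :my_app and MyApp.Repo are placeholders rather than anything taken from this thread, and the POOL_SIZE default is only illustrative:

```elixir
# config/runtime.exs (sketch; :my_app and MyApp.Repo are hypothetical names)
import Config

if config_env() == :prod do
  # DATABASE_URL is expected to be set as a secret (check with `fly secrets list`)
  database_url =
    System.get_env("DATABASE_URL") ||
      raise "environment variable DATABASE_URL is missing."

  config :my_app, MyApp.Repo,
    url: database_url,
    # Open the Postgres socket over IPv6; the private addresses in the
    # deploy log above (fdaa:...) are IPv6 addresses.
    socket_options: [:inet6],
    pool_size: String.to_integer(System.get_env("POOL_SIZE") || "10")
end
```

Without the :inet6 socket option the connection defaults to IPv4, which would be consistent with the app not being able to see the database.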

Does any of that apply?

Sorry, this has all been sorted. There was downtime on one of your services while this was happening, which compounded some issues I was having from using podman to build the images. I’ve switched to debian:slim as a base image and no longer have any DNS issues, so I assume the alpine image had DNS problems when built with podman.

@paolo.marino does any of this apply to the problem you’re having? (See “Deployment fails here and there - #21 by brainlid”.)

Hey! Some of those apply. The app has been deployed there for a week or so with no issues and I haven’t made any big changes to the config.

I tried to deploy it again today and the deploy step worked, so the migrations ran, but now the deployed app can no longer connect to the database, which was working before the deploy.

Just want to say that your fly_postgres hex package is amazing, along with your ElixirConf talk. I learned about them both here and have subsequently updated to be multi-region with PostgreSQL. Thank you!

Thanks @mmark!

There is a new version of the fly_postgres library that I need to release. Just haven’t had time to finalize some things. Would love early testers on that when it’s ready!

Awesome! Totally, any way I can help just let me know.