Thanks for fixing it, the cluster does seem more stable now. I felt bad posting this support ticket: I knew you were swamped with the recent DB fixes and all, and I have been posting a lot of them over the week (this was launch week for my company). Fly was the smoothest release experience I have ever had with any platform, which is why I'd love to stay even though you explicitly said Postgres is in beta. You have done a great job with the platform.
This issue has surfaced again. My postgres database is indiepaper-production-db.
Command: /app/bin/indie_paper eval IndiePaper.Release.migrate
Starting instance
Configuring virtual machine
Pulling container image
Unpacking image
Preparing kernel init
Configuring firecracker
Starting virtual machine
Starting init (commit: 50ffe20)...
Preparing to run: `/app/bin/indie_paper eval IndiePaper.Release.migrate` as nobody
2021/10/26 08:41:14 listening on [fdaa:0:3565:a7b:21a1:7179:7b9d:2]:22 (DNS: [fdaa::3]:53)
Reaped child process with pid: 561 and signal: SIGUSR1, core dumped? false
Error: :18.295 [error] Could not create schema migrations table. This error usually happens due to the following:
* The database does not exist
* The "schema_migrations" table, which Ecto uses for managing
migrations, was defined by another library
* There is a deadlock while migrating (such as using concurrent
indexes with a migration_lock)
To fix the first issue, run "mix ecto.create".
To address the second, you can run "mix ecto.drop" followed by
"mix ecto.create". Alternatively you may configure Ecto to use
another table and/or repository for managing migrations:
config :indie_paper, IndiePaper.Repo,
migration_source: "some_other_table_for_schema_migrations",
migration_repo: AnotherRepoForSchemaMigrations
The full error report is shown below.
** (DBConnection.ConnectionError) connection not available and request was dropped from queue after 2977ms. This means requests are coming in and your connection pool cannot serve them fast enough. You can address this by:
1. Ensuring your database is available and that you can connect to it
2. Tracking down slow queries and making sure they are running fast enough
3. Increasing the pool_size (albeit it increases resource consumption)
4. Allowing requests to wait longer by increasing :queue_target and :queue_interval
See DBConnection.start_link/2 for more information
(ecto_sql 3.7.0) lib/ecto/adapters/sql.ex:756: Ecto.Adapters.SQL.raise_sql_call_error/1
(elixir 1.12.1) lib/enum.ex:1553: Enum."-map/2-lists^map/1-0-"/2
(ecto_sql 3.7.0) lib/ecto/adapters/sql.ex:844: Ecto.Adapters.SQL.execute_ddl/4
(ecto_sql 3.7.0) lib/ecto/migrator.ex:645: Ecto.Migrator.verbose_schema_migration/3
(ecto_sql 3.7.0) lib/ecto/migrator.ex:473: Ecto.Migrator.lock_for_migrations/4
(ecto_sql 3.7.0) lib/ecto/migrator.ex:388: Ecto.Migrator.run/4
(ecto_sql 3.7.0) lib/ecto/migrator.ex:146: Ecto.Migrator.with_repo/3
(indie_paper 0.1.0) lib/indie_paper/release.ex:12: anonymous fn/2 in IndiePaper.Release.migrate/0
Main child exited normally with code: 1
Reaped child process with pid: 563 and signal: SIGUSR1, core dumped? false
Starting clean up.
Error Release command failed, deployment aborted
I have separated my development and production environments into two different accounts so I don’t mess up the production DB by running some commands locally. All my deployments are auto-triggered by a push to the master branch. Failing deployments erode that trust and the flow of pushing via GitHub: I have to manually check and verify whether each deployment went through correctly.
The development version, indiepaper-development, actually went through without errors, and that is the same code that fails on indiepaper-production-db. Please fix it fast.
Looking into it… we’re working on improving reliability here, will post an update once I have more info.
exactly the same problem I’ve been having all day, had to deploy a version for a presentation and couldn’t
Yikes. That sucks, sorry to hear that. Inside Fly these problems are just some package or the other acting up under load, but to our customers they’re real-life problems that are often personal and irritating. I don’t have a quick answer, but the good news is that this should get better the more normal and edge cases we fix.
All I can say is that I look forward to your business support plans (and also to the platform becoming more stable; I’d prefer to use you guys instead of the de facto choice, which always ends up being AWS).
Has there been any movement with this? Went to deploy this morning and still having it complain about there being a DB missing etc.
Command: /app/bin/app eval App.Release.migrate_all
Starting instance
Configuring virtual machine
Pulling container image
Unpacking image
Preparing kernel init
Configuring firecracker
Starting virtual machine
Starting init (commit: 50ffe20)...
Preparing to run: `/app/bin/app eval App.Release.migrate_all` as nobody
2021/10/26 19:59:01 listening on [fdaa:0:35d4:a7b:2984:821:7a9b:2]:22 (DNS: [fdaa::3]:53)
Reaped child process with pid: 561 and signal: SIGUSR1, core dumped? false
19:59:05.678 [error] Could not create schema migrations table. This error usually happens due to the following:
* The database does not exist
* The "schema_migrations" table, which Ecto uses for managing
migrations, was defined by another library
....
We handled the original issue that caused this error a few weeks ago, so it’s odd that it’s still showing up on your app. Could you confirm that the connection limits are not being hit on your DB instance, and whether there are any other errors in the DB logs?
correct.
As for connection limits nope, there is only one machine connected to the DB, I’ve logged into the DB to check for locks etc etc and it’s all fine. So I’m confused.
Have any changes or overrides been made to the DNS resolvers in the app?
nope, it’s a standard Elixir app using the Dockerfile posted in your guides.
I’m replying in the other thread deployment issues in SYD region to limit the spread of answers if that is okay?
Yeah, let’s move this there, I’ll link the DB errors.
Will you try this again and also make sure your database hasn’t reached a connection limit and the DB logs aren’t showing any errors? That particular Elixir error is not super helpful, but it’s probably not an issue on our end (this time).
Hey there! This started to happen to me as well today, around 3 hours ago.
The issue seems to be the same where migrations cannot be run due to connection error, while the deployed app is working fine
Hey @flyio3! Sorry to see you’ve been struggling here. I’d love to help get this resolved if I can so we can improve the docs or whatever else is needed to make it better for you and others.
So let me restate what I understand the setup to be… please correct where I’m wrong.
- You are hosting a single application instance
- You are hosting it in syd
- You have a single postgres database in syd (not using read-replicas)
- Are you using Phoenix 1.6 with esbuild? I ask because the application is generated differently more recently.
- You aren’t using the fly_postgres hex package (just making sure)
- The DATABASE_URL is set (fly secrets list)
Sometimes the logs can provide more information. When that happens, just run fly logs to see if there’s any more info there.
When I’ve seen this problem, it’s generally because of one of the following:
- a missing inet6 config for IPv6 support (app can’t see the database)
- It’s multi-region and the app doesn’t know what the primary region is supposed to be
- It’s multi-region and the app is being started in a backup region which doesn’t have the database
Does any of that apply?
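For the first item, here is a minimal sketch of what the inet6 setting usually looks like in config/runtime.exs. The names :my_app and MyApp.Repo are placeholders for your own app and repo, and the POOL_SIZE fallback of 10 is only an assumed default, not something taken from this thread:

```elixir
# config/runtime.exs (sketch; :my_app and MyApp.Repo are placeholder names)
import Config

if config_env() == :prod do
  # DATABASE_URL is expected to be set as a secret on the app
  database_url =
    System.get_env("DATABASE_URL") ||
      raise "environment variable DATABASE_URL is missing"

  config :my_app, MyApp.Repo,
    url: database_url,
    # open the Postgres socket over IPv6 so the app can reach the database
    socket_options: [:inet6],
    # assumed default; keep it below your database's connection limit
    pool_size: String.to_integer(System.get_env("POOL_SIZE") || "10")
end
```

If the socket_options line is missing, the app may fail to connect even though DATABASE_URL is set correctly.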
sorry, this has all been sorted, there was downtime on one of your services while this was happening which compounded with some issues I was having by using podman to build the images, I’ve switched to using debian:slim as a base image and no longer have any issues with regards to DNS - so I assume it was something to do with the alpine image having DNS issues when built with podman.
@paolo.marino does any of this apply to the problem you’re having? Deployment fails here and there - #21 by brainlid
Hey! some of those. The app has been deployed there for a week or so with no issues and I haven’t made any big changes to the config.
I tried to deploy it again today and the deploy step worked, so the migrations ran, but now the deployed app can no longer connect to the database, which was working before the deploy.