Launching a new Phoenix 1.7 app with DB results in "Error release command failed, deployment aborted"

> fly launch --name flxwebsites-app
Detected a Phoenix app
Selected App Name: flxwebsites-app
? Select Organization: FLX Websites (flx-websites)
? Choose a region for deployment: Secaucus, NJ (US) (ewr)
Created app flxwebsites-app in organization flx-websites
Set secrets on flxwebsites-app: SECRET_KEY_BASE
Preparing system for Elixir builds
Installing application dependencies
Running Docker release generator
Wrote config file fly.toml
? Would you like to set up a Postgresql database now? Yes
For pricing information visit: https://fly.io/docs/about/pricing/#postgresql-cl
? Select configuration: Development - Single node, 1x shared CPU, 256MB RAM, 1GB disk
Creating postgres cluster in organization flx-websites
Creating app...
Setting secrets on app flxwebsites-app-db...
Provisioning 1 of 1 machines with image flyio/postgres:14.4
Waiting for machine to start...
Machine 06e82647c60873 is created
==> Monitoring health checks
  Waiting for 06e82647c60873 to become healthy (started, 3/3)

Postgres cluster flxwebsites-app-db created
  Username:    postgres
  Password:    <password>
  Hostname:    flxwebsites-app-db.internal
  Proxy port:  5432
  Postgres port:  5433
  Connection string: postgres://postgres:<password>@flxwebsites-app-db.internal:5432

Save your credentials in a secure place -- you won't be able to see them again!

Connect to postgres
Any app within the FLX Websites organization can connect to this Postgres using the following connection string:

Now that you've set up postgres, here's what you need to understand: https://fly.io/docs/reference/postgres-whats-next/

Postgres cluster flxwebsites-app-db is now attached to flxwebsites-app
The following secret was added to flxwebsites-app:
  DATABASE_URL=postgres://flxwebsites_app:<password>@top2.nearest.of.flxwebsites-app-db.internal:5432/flxwebsites_app?sslmode=disable
Postgres cluster flxwebsites-app-db is now attached to flxwebsites-app
? Would you like to set up an Upstash Redis database now? No
? Would you like to deploy now? Yes
==> Building image
Remote builder fly-builder-bitter-feather-1279 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
Sending build context to Docker daemon   49.5kB
[+] Building 0.4s (28/28) FINISHED
 => [internal] load remote build context                                  0.0s
 => copy /context /                                                       0.0s
 => [internal] load metadata for docker.io/library/debian:bullseye-20220  0.3s
 => [internal] load metadata for docker.io/hexpm/elixir:1.14.1-erlang-25  0.3s
 => [builder  1/17] FROM docker.io/hexpm/elixir:1.14.1-erlang-25.1.1-deb  0.0s
 => [stage-1 1/6] FROM docker.io/library/debian:bullseye-20220801-slim@s  0.0s
 => CACHED [stage-1 2/6] RUN apt-get update -y && apt-get install -y lib  0.0s
 => CACHED [stage-1 3/6] RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.  0.0s
 => CACHED [stage-1 4/6] WORKDIR /app                                     0.0s
 => CACHED [stage-1 5/6] RUN chown nobody /app                            0.0s
 => CACHED [builder  2/17] RUN apt-get update -y && apt-get install -y b  0.0s
 => CACHED [builder  3/17] WORKDIR /app                                   0.0s
 => CACHED [builder  4/17] RUN mix local.hex --force &&     mix local.re  0.0s
 => CACHED [builder  5/17] COPY mix.exs mix.lock ./                       0.0s
 => CACHED [builder  6/17] RUN mix deps.get --only prod                   0.0s
 => CACHED [builder  7/17] RUN mkdir config                               0.0s
 => CACHED [builder  8/17] COPY config/config.exs config/prod.exs config  0.0s
 => CACHED [builder  9/17] RUN mix deps.compile                           0.0s
 => CACHED [builder 10/17] COPY priv priv                                 0.0s
 => CACHED [builder 11/17] COPY lib lib                                   0.0s
 => CACHED [builder 12/17] COPY assets assets                             0.0s
 => CACHED [builder 13/17] RUN mix assets.deploy                          0.0s
 => CACHED [builder 14/17] RUN mix compile                                0.0s
 => CACHED [builder 15/17] COPY config/runtime.exs config/                0.0s
 => CACHED [builder 16/17] COPY rel rel                                   0.0s
 => CACHED [builder 17/17] RUN mix release                                0.0s
 => CACHED [stage-1 6/6] COPY --from=builder --chown=nobody:root /app/_b  0.0s
 => exporting to image                                                    0.0s
 => => exporting layers                                                   0.0s
 => => writing image sha256:b041a98fa945f44cb40fad93f84616bdf945f9b1e1f0  0.0s
 => => naming to registry.fly.io/flxwebsites-app:deployment-01GJV8VDSE7N  0.0s
--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/flxwebsites-app]
9cdd453cc992: Pushed
f8b831d35989: Pushed
c0d25defd6db: Pushed
d35b5ec3cba8: Pushed
5faa26dcfabd: Pushed
92a4e8a3140f: Pushed
deployment-01GJV8VDSE7ND07A4TK80QDS2D: digest: sha256:bf3b57b22530eacf52d3c25803ee2d58765a7369b0a081d9ab44eafc94f68f0d size: 1576
--> Pushing image done
image: registry.fly.io/flxwebsites-app:deployment-01GJV8VDSE7ND07A4TK80QDS2D
image size: 124 MB
==> Creating release
--> release v2 created

--> You can detach the terminal anytime without stopping the deployment
==> Release command detected: /app/bin/migrate

--> This release will not be available until the release command succeeds.
Error release command failed, deployment aborted

It seems that the generated Dockerfile for Phoenix apps does not automatically run mix ecto.create:

2022-11-27T00:47:55Z app[c909d246] ewr [info]Starting init (commit: 81d5330)...
2022-11-27T00:47:55Z app[c909d246] ewr [info]Setting up swapspace version 1, size = 512 MiB (536866816 bytes)
2022-11-27T00:47:55Z app[c909d246] ewr [info]no label, UUID=51f0caaa-b0b7-4d88-8838-cf3a201b6853
2022-11-27T00:47:55Z app[c909d246] ewr [info]Preparing to run: `/app/bin/migrate` as nobody
2022-11-27T00:47:55Z app[c909d246] ewr [info]2022/11/27 00:47:55 listening on [fdaa:0:db85:a7b:94:c909:d246:2]:22 (DNS: [fdaa::3]:53)
2022-11-27T00:48:00Z app[c909d246] ewr [info]00:48:00.709 [error] Could not create schema migrations table. This error usually happens due to the following:
2022-11-27T00:48:00Z app[c909d246] ewr [info]  * The database does not exist
2022-11-27T00:48:00Z app[c909d246] ewr [info]  * The "schema_migrations" table, which Ecto uses for managing
2022-11-27T00:48:00Z app[c909d246] ewr [info]    migrations, was defined by another library
2022-11-27T00:48:00Z app[c909d246] ewr [info]  * There is a deadlock while migrating (such as using concurrent
2022-11-27T00:48:00Z app[c909d246] ewr [info]    indexes with a migration_lock)
2022-11-27T00:48:00Z app[c909d246] ewr [info]To fix the first issue, run "mix ecto.create".

OK, I’m not crazy: following the exact steps in the Getting Started guide in the Fly Docs results in the same error:

--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/quiet-paper-6195]
712409f81d4e: Pushed
fd2f81f04489: Pushed
a8113cad50b5: Pushed
4dd480dc24a0: Pushed
9e02e1dc22e3: Pushed
92a4e8a3140f: Pushed
deployment-01GJV9WQG8C8WNCR595Y4AHAP6: digest: sha256:35297d293f1d9b085664024c22b8c8636ad75a2ff308a0899a2ec9cbd6970f07 size: 1576
--> Pushing image done
image: registry.fly.io/quiet-paper-6195:deployment-01GJV9WQG8C8WNCR595Y4AHAP6
image size: 124 MB
==> Creating release
--> release v2 created

--> You can detach the terminal anytime without stopping the deployment
==> Release command detected: /app/bin/migrate

--> This release will not be available until the release command succeeds.
         Starting instance
         Configuring virtual machine
         Pulling container image
         Unpacking image
         Preparing kernel init
         Configuring firecracker
         Starting virtual machine
         Starting init (commit: 81d5330)...
         Setting up swapspace version 1, size = 512 MiB (536866816 bytes)
         no label, UUID=5d980ff7-7059-4d33-a90e-b9b50ae478c8
         Preparing to run: `/app/bin/migrate` as nobody
         2022/11/27 01:07:33 listening on [fdaa:0:db85:a7b:94:63ac:4e70:2]:22 (DNS: [fdaa::3]:53)
           * The "schema_migrations" table, which Ecto uses for managing
         "mix ecto.create". Alternatively you may configure Ecto to use
         The full error report is shown below.
         ** (DBConnection.ConnectionError) connection not available and request was dropped from queue after 2973ms. This means requests are coming in and your connection pool cannot serve them fast enough. You can address this by:
           4. Allowing requests to wait longer by increasing :queue_target and :queue_interval
             (elixir 1.14.1) lib/enum.ex:1658: Enum."-map/2-lists^map/1-0-"/2
             (ecto_sql 3.9.1) lib/ecto/adapters/sql.ex:1005: Ecto.Adapters.SQL.execute_ddl/4
             (ecto_sql 3.9.1) lib/ecto/migrator.ex:677: Ecto.Migrator.verbose_schema_migration/3
             (ecto_sql 3.9.1) lib/ecto/migrator.ex:403: Ecto.Migrator.run/4
             (ecto_sql 3.9.1) lib/ecto/migrator.ex:146: Ecto.Migrator.with_repo/3
         Starting clean up.
Error release command failed, deployment aborted

I’m completely bewildered about how to get a Phoenix 1.7 app running on Fly. If I create the Postgres DB on Fly, run fly postgres attach, and then deploy the Phoenix app for the first time, the deploy fails. From what I can tell, that is because there is no schema_migrations table:

         Preparing to run: `/app/bin/migrate` as nobody
         2022/11/27 17:08:55 listening on [fdaa:0:db85:a7b:95:39d9:1092:2]:22 (DNS: [fdaa::3]:53)
           * The database does not exist
           * The "schema_migrations" table, which Ecto uses for managing
             migrations, was defined by another library
           * There is a deadlock while migrating (such as using concurrent
             indexes with a migration_lock)
         To fix the first issue, run "mix ecto.create".
         To address the second, you can run "mix ecto.drop" followed by
         another table and/or repository for managing migrations:
             config :flx, Flx.Repo,
               migration_source: "some_other_table_for_schema_migrations",
           1. Ensuring your database is available and that you can connect to it
           2. Tracking down slow queries and making sure they are running fast enough
           3. Increasing the pool_size (although this increases resource consumption)
           4. Allowing requests to wait longer by increasing :queue_target and :queue_interval
         See DBConnection.start_link/2 for more information
             (ecto_sql 3.9.1) lib/ecto/adapters/sql.ex:913: Ecto.Adapters.SQL.raise_sql_call_error/1
             (ecto_sql 3.9.1) lib/ecto/migrator.ex:491: Ecto.Migrator.lock_for_migrations/4
             (ecto_sql 3.9.1) lib/ecto/migrator.ex:403: Ecto.Migrator.run/4
             (ecto_sql 3.9.1) lib/ecto/migrator.ex:146: Ecto.Migrator.with_repo/3
             nofile:1: (file)

What’s worse, if this deploy fails the postgres attachment is completely wiped out (along with other secrets that were attached to the Phoenix application).

But, if I use the base postgres user / DB connection string, everything seems to work fine :thinking:

I’m about ready to give up on Fly. I’ve burned hours trying to get brand new blank applications to connect to a Postgres cluster using the fly postgres attach functionality.

Sometimes it works, and sometimes it doesn’t. I can’t seem to figure out any rhyme or reason, and I’m about ready to rip all of my hair out :sweat_smile:.

I’ve tried with both a new Phoenix 1.7 application and a Directus Node application using the following:

REGION_AND_ORG=--region ewr --org flx-websites 

launch-app:
	@fly launch \
		--name flxwebsites-app \
		--copy-config \
		--no-deploy \
		$(REGION_AND_ORG)
	@fly postgres attach \
		--app flxwebsites-app \
		--database-name flxwebsites \
		flxwebsites-db
	@fly scale vm shared-cpu-1x --memory 2048
	@fly deploy

launch-cms:
	@cd cms; fly launch \
		--name flxwebsites-cms \
		--copy-config \
		--no-deploy \
		$(REGION_AND_ORG)
	@cd cms; fly postgres attach \
		--app flxwebsites-cms \
		--database-name flxwebsites \
		flxwebsites-db
	@cd cms; cat .env.prod | fly secrets import
	@cd cms; fly scale vm shared-cpu-1x --memory 2048
	@cd cms; fly deploy

Both of the DB users created with the attach command are superusers.

The Phoenix app fails with the following:

2022/11/27 19:00:15 listening on [fdaa:0:db85:a7b:95:6eed:96ca:2]:22 (DNS: [fdaa::3]:53)
         19:00:20.399 [error] Could not create schema migrations table. This error usually happens due to the following:
           * The database does not exist
           * The "schema_migrations" table, which Ecto uses for managing
             migrations, was defined by another library
           * There is a deadlock while migrating (such as using concurrent
             indexes with a migration_lock)
         To fix the first issue, run "mix ecto.create".

And the Directus Node app fails with the following:

2022-11-27T19:01:56Z app[0da3a2d5] ewr [info]+ DB_CONNECTION_STRING=postgres://flxwebsites_cms:<password>@top2.nearest.of.flxwebsites-db.internal:5432/flxwebsites?sslmode=disable npx directus bootstrap
2022-11-27T19:02:00Z app[0da3a2d5] ewr [info]{"level":30,"time":1669575720859,"pid":538,"hostname":"0da3a2d5","msg":"Loaded extensions: directus-extension-wpslug-interface"}
2022-11-27T19:02:00Z app[0da3a2d5] ewr [info]{"level":30,"time":1669575720863,"pid":538,"hostname":"0da3a2d5","msg":"Initializing bootstrap..."}
2022-11-27T19:05:26Z app[0da3a2d5] ewr [info]{"level":50,"time":1669575926013,"pid":538,"hostname":"0da3a2d5","err":{"type":"KnexTimeoutError","message":"Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?","stack":"KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?\n    at Client_PG.acquireConnection (/app/node_modules/knex/lib/client.js:307:26)\n    at async Runner.ensureConnection (/app/node_modules/knex/lib/execution/runner.js:287:28)\n    at async Runner.run (/app/node_modules/knex/lib/execution/runner.js:30:19)\n    at async validateDatabaseConnection (/app/node_modules/directus/dist/database/index.js:151:13)\n    at async waitForDatabase (/app/node_modules/directus/dist/cli/commands/bootstrap/index.js:78:5)\n    at async Command.bootstrap (/app/node_modules/directus/dist/cli/commands/bootstrap/index.js:41:5)\n    at async Command.parseAsync (/app/node_modules/commander/lib/command.js:916:5)","name":"KnexTimeoutError","sql":"SELECT 1"},"msg":"Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?"}
2022-11-27T19:05:26Z app[0da3a2d5] ewr [info]{"level":50,"time":1669575926013,"pid":538,"hostname":"0da3a2d5","msg":"Can't connect to the database."}

I’ve confirmed in both cases that the attachment DB connection strings are correct:

DATABASE_URL=postgres://flxwebsites_app:<password>@top2.nearest.of.flxwebsites-db.internal:5432/flxwebsites?sslmode=disable

and I’ve confirmed these are the strings being used to attempt connection in the apps.
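
A quick way to double-check a connection string like the one above is to split it into its parts and see whether the host is reachable at all from inside the app's network. The following is a minimal Python sketch, not anything Fly provides: `parse_database_url` and `can_reach` are hypothetical helper names, and a successful TCP connect only shows that something is listening on that host and port, not that the credentials or the database name are valid.

```python
import socket
from urllib.parse import urlsplit


def parse_database_url(url):
    """Split a postgres:// URL into the parts worth checking by hand."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "host": parts.hostname,
        # Fall back to the default Postgres port if the URL omits one.
        "port": parts.port or 5432,
        "database": parts.path.lstrip("/") or None,
    }


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout.

    DNS failures, refused connections and timeouts all raise OSError
    subclasses, so they are all reported as "not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from somewhere inside the same private network (for example over fly ssh console, assuming Python is available in the image), `can_reach(info["host"], info["port"])` returning False for the parsed `DATABASE_URL` would point at networking rather than at a missing database or a bad user.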

Also, as mentioned in a previous message, if I use the standard postgres user and manually create a database before deploying, the first app I try to deploy can connect fine, but then my second application cannot connect to the same database via the same postgres user. That is why it makes more sense to use postgres attach, so that each app gets its own user, but I have no idea how to get that to work reliably. I got both apps working with postgres attach at least once, but I have not been able to reproduce it.

Edit: I take all of :point_up: back. I just tried to deploy a blank 1.7 app without changing any defaults and it still fails on:

2022-11-27T19:52:26Z app[5a0dc251] ewr [info]Preparing to run: `/app/bin/migrate` as nobody
2022-11-27T19:52:26Z app[5a0dc251] ewr [info]2022/11/27 19:52:26 listening on [fdaa:0:db85:a7b:95:5a0d:c251:2]:22 (DNS: [fdaa::3]:53)
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]19:52:30.984 [error] Could not create schema migrations table. This error usually happens due to the following:
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]  * The database does not exist
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]  * The "schema_migrations" table, which Ecto uses for managing
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]    migrations, was defined by another library
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]  * There is a deadlock while migrating (such as using concurrent
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]    indexes with a migration_lock)
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]To fix the first issue, run "mix ecto.create".
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]To address the second, you can run "mix ecto.drop" followed by
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]"mix ecto.create". Alternatively you may configure Ecto to use
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]another table and/or repository for managing migrations:
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]    config :testapp, Testapp.Repo,
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]      migration_source: "some_other_table_for_schema_migrations",
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]      migration_repo: AnotherRepoForSchemaMigrations
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]The full error report is shown below.
2022-11-27T19:52:30Z app[5a0dc251] ewr [info]** (DBConnection.ConnectionError) connection not available and request was dropped from queue after 2971ms. This means requests are coming in and your connection pool cannot serve them fast enough. You can address this by:
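
The hint list in that output is generic: Ecto prints the same three possible causes whenever it cannot create the schema_migrations table, so "The database does not exist" on its own proves nothing. The exception named in the full error report is the more useful signal. Below is a rough Python sketch of that triage. `underlying_cause` is a hypothetical helper, and the `invalid_catalog_name` marker assumes that Postgres's 3D000 error reaches the log when a database really is missing.

```python
# Exception names seen in the logs above that indicate the app never got
# a database connection at all (Ecto/DBConnection and Knex respectively).
CONNECTION_ERRORS = ("DBConnection.ConnectionError", "KnexTimeoutError")

# Assumption: a genuinely missing database shows up as Postgres error
# 3D000, whose condition name is invalid_catalog_name.
MISSING_DB_MARKER = "invalid_catalog_name"


def underlying_cause(log_text):
    """Classify a failed release-command log by its underlying exception."""
    if any(name in log_text for name in CONNECTION_ERRORS):
        return "connection"
    if MISSING_DB_MARKER in log_text:
        return "missing-database"
    # Only the generic hint list (or nothing relevant) was found.
    return "inconclusive"
```

Applied to the log above, this returns "connection", which suggests that mix ecto.create would not have helped: the migration never reached the database in the first place.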

Alright, I just switched all of my instances from the ewr region to iad and everything works perfectly fine on the first and all subsequent tries…

There’s obviously been something jacked up with the ewr region yesterday and today; see post #17 in the thread "Error failed to launch VM: nats: no responders available for request".

Hi Nick. Glad to hear your app runs in the iad region, and thanks for not giving up. Following your report, we traced the problem in the ewr region to a misconfigured host and fixed it. Thanks again.

@dangra :100: thank you! I am also glad I did not give up, I don’t want to go back to manually building out VPSes :laughing:
