Cannot attach postgres to application

I am having trouble attaching my postgres database to the application. When I run fly postgres attach, I get the following error:

Error error connecting to SSH server: err err handling connect: connect: can’t resolve address ‘fly-db.internal:22’: connect tcp [fdaa:0:3f2f::3]:53: operation timed out

Does fly ssh console -a <pg-name> work for you? If that’s failing, will you try this special build of flyctl to see if you have more success?

The suggested option of using fly-dev finally worked for me. Thank you.

Ok, that’s really helpful. Thank you.

When you ran fly ssh before using fly-dev, did it give you an error similar to the one the attach process gave the first time?

No, it did not. That particular command ran just fine.

Just to make sure I understand:

  1. fly pg attach failed with a timeout
  2. fly ssh console worked?
  3. fly-dev pg attach worked?

How long after you created the postgres database did you wait to run the pg attach command the first time?

fly ssh console took a very long time waiting to connect to the tunnel.
fly-dev postgres attach worked very well.

At first I did it immediately, but almost an hour later, it still could not attach.

I have this fly.toml file:

# fly.toml file generated for frank on 2021-12-21T01:07:35+03:00

app = "frank"

kill_signal = "SIGTERM"
kill_timeout = 5
processes = []

[deploy]
  release_command = "/app/bin/fly eval Fly.Release.migrate"

[env]

[experimental]
  allowed_public_ports = []
  auto_rollback = true

[[services]]
  http_checks = []
  internal_port = 4000
  processes = ["app"]
  protocol = "tcp"
  script_checks = []

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "30s"
    interval = "15s"
    restart_limit = 6
    timeout = "2s"

Everything runs okay; however, the database migrations do not run. I would appreciate help figuring out what I might be doing wrong.

The Dockerfile:


###
### First Stage - Building the Release
###
FROM hexpm/elixir:1.13.1-erlang-24.0.1-alpine-3.13.3 AS build

# install build dependencies
RUN apk add --no-cache build-base npm

# prepare build dir
WORKDIR /app

# extend hex timeout
ENV HEX_HTTP_TIMEOUT=20

# install hex + rebar
RUN mix local.hex --force && \
    mix local.rebar --force

# set build ENV as prod
ENV MIX_ENV=prod
ENV SECRET_KEY_BASE=nokey

# Copy over the mix.exs and mix.lock files to load the dependencies. If those
# files don't change, then we don't keep re-fetching and rebuilding the deps.
COPY mix.exs mix.lock ./
COPY config config

RUN mix deps.get --only prod && \
    mix deps.compile

# install npm dependencies
# COPY assets/package.json assets/package-lock.json ./assets/
# RUN npm --prefix ./assets ci --progress=false --no-audit --loglevel=error

# COPY priv priv
# COPY assets assets

# NOTE: If using TailwindCSS, it uses a special "purge" step and that requires
# the code in `lib` to see what is being used. Uncomment that here before
# running the npm deploy script if that's the case.
# COPY lib lib

# build assets
# RUN npm run --prefix ./assets deploy
# RUN mix phx.digest

# copy source here if not using TailwindCSS
COPY lib lib

# compile and build release
COPY rel rel
RUN mix do compile, release

###
### Second Stage - Setup the Runtime Environment
###

# prepare release docker image
FROM alpine:3.13.3 AS app
RUN apk add --no-cache libstdc++ openssl ncurses-libs

WORKDIR /app

RUN chown nobody:nobody /app

USER nobody:nobody

COPY --from=build --chown=nobody:nobody /app/_build/prod/rel/fly ./

ENV HOME=/app
ENV MIX_ENV=prod
# set app environment variables
ENV SECRET_KEY_BASE=nokey
ENV PORT=4000

CMD ["bin/fly", "start"]

When you run fly deploy, it should show logs from the release command. Can you copy and paste what you’re seeing in the logs from fly deploy?

Also, sometimes the easiest way to debug a release command is to deploy without it, then run fly ssh console, then run /app/bin/fly eval Fly.Release.migrate manually and see what happens.

fly deploy output:

Deploying frank
==> Validating app configuration
--> Validating app configuration done
Services
TCP 80/443 ⇢ 4000
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
Sending build context to Docker daemon  150.9MB
Step 1/24 : FROM hexpm/elixir:1.13.1-erlang-24.0.1-alpine-3.13.3 AS build
 ---> 80aeff08218f
Step 2/24 : RUN apk add --no-cache build-base npm
 ---> Using cache
 ---> ef6be8c5269b
Step 3/24 : WORKDIR /app
 ---> Using cache
 ---> a8560277d102
Step 4/24 : ENV HEX_HTTP_TIMEOUT=20
 ---> Using cache
 ---> 6de5118bb52e
Step 5/24 : RUN mix local.hex --force &&     mix local.rebar --force
 ---> Using cache
 ---> 13115c15e667
Step 6/24 : ENV MIX_ENV=prod
 ---> Using cache
 ---> c207e2794ee1
Step 7/24 : ENV SECRET_KEY_BASE=nokey
 ---> Using cache
 ---> 7d5122a1b614
Step 8/24 : COPY mix.exs mix.lock ./
 ---> Using cache
 ---> 1595870978d7
Step 9/24 : COPY config config
 ---> Using cache
 ---> 0354cf102223
Step 10/24 : RUN mix deps.get --only prod &&     mix deps.compile
 ---> Using cache
 ---> 2d9fa0d3edc8
Step 11/24 : COPY lib lib
 ---> Using cache
 ---> 2adc12f59086
Step 12/24 : COPY rel rel
 ---> Using cache
 ---> 1b7e1d084df8
Step 13/24 : RUN mix do compile, release
 ---> Using cache
 ---> f310e0d31a19
Step 14/24 : FROM alpine:3.13.3 AS app
 ---> 302aba9ce190
Step 15/24 : RUN apk add --no-cache libstdc++ openssl ncurses-libs
 ---> Using cache
 ---> 0d894fc55d66
Step 16/24 : WORKDIR /app
 ---> Using cache
 ---> 246d036e7b13
Step 17/24 : RUN chown nobody:nobody /app
 ---> Using cache
 ---> 679e400e9885
Step 18/24 : USER nobody:nobody
 ---> Using cache
 ---> 0ff8f35c4749
Step 19/24 : COPY --from=build --chown=nobody:nobody /app/_build/prod/rel/fly ./
 ---> Using cache
 ---> 6916f1e23305
Step 20/24 : ENV HOME=/app
 ---> Using cache
 ---> 1bf1277fa3fd
Step 21/24 : ENV MIX_ENV=prod
 ---> Using cache
 ---> f06696bbeb27
Step 22/24 : ENV SECRET_KEY_BASE=nokey
 ---> Using cache
 ---> f54d7e2b8f3d
Step 23/24 : ENV PORT=4000
 ---> Using cache
 ---> 6511bb69860a
Step 24/24 : CMD ["bin/fly", "start"]
 ---> Using cache
 ---> b48803b5bcb0
Successfully built b48803b5bcb0
Successfully tagged registry.fly.io/frank:deployment-1640038516
--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/frank]
95aa123be4bc: Layer already exists 
09d6d00eff39: Layer already exists 
215baa09a242: Layer already exists 
6650302c27fc: Layer already exists 
0f7b3ff8b310: Layer already exists 
deployment-1640038516: digest: sha256:6b4fda9f4bdd39191dc92dda091b5ec92bb1bb872e223fdce42a8f65e620cc73 size: 1363
--> Pushing image done
Image: registry.fly.io/frank:deployment-1640038516
Image size: 23 MB
==> Creating release
Release v2 created
Release command detected: this new release will not be available until the command succeeds.

You can detach the terminal anytime without stopping the deployment
==> Release command
Command: /app/bin/fly eval Fly.Release.migrate
	 Starting instance
	 Configuring virtual machine
	 Pulling container image
	 Unpacking image
	 Preparing kernel init
	 Configuring firecracker
	 Starting virtual machine
	 Starting init (commit: 7943db6)...
	 Preparing to run: `/app/bin/fly eval Fly.Release.migrate` as nobody
	 2021/12/20 22:15:27 listening on [fdaa:0:3f2f:a7b:23c4:8d5e:f449:2]:22 (DNS: [fdaa::3]:53)
	 warning: variable "maybe_ipv6" is unused (if the variable is not meant to be used, prefix it with an underscore)
	   releases/0.1.0/runtime.exs:23
	 Reaped child process with pid: 561 and signal: SIGUSR1, core dumped? false
	 22:15:30.564 [info] Migrations already up
	 Main child exited normally with code: 0
	 Reaped child process with pid: 563 and signal: SIGUSR1, core dumped? false
	 Starting clean up.
Monitoring Deployment

1 desired, 1 placed, 1 healthy, 0 unhealthy [health checks: 1 total, 1 passing]
--> v0 deployed successfully

Let me try this and see.

This makes it look like they actually ran successfully, but the database was already up to date.

Just to make sure, I have gone ahead and created a new application together with a new instance of postgres. I have also left the migration release command in place. Running it manually results in "Migrations already up".

My assumption is that since this is a new instance, the migrations should not have run yet.

okari@okari-OMEN:~/projects/redare/trials/fly$ fly ssh console
Connecting to top1.nearest.of.antana.internal... complete
/ # /app/bin/fly eval Fly.Release.migrate
warning: variable "maybe_ipv6" is unused (if the variable is not meant to be used, prefix it with an underscore)
  app/releases/0.1.0/runtime.exs:23

22:45:43.375 [info] Migrations already up
/ # quit
/bin/sh: quit: not found
/ # exit

fly logs returns:

2021-12-20T22:46:41.038 app[2e90493f] fra [info] Request: GET /api/users
2021-12-20T22:46:41.038 app[2e90493f] fra [info] ** (exit) an exception was raised:
2021-12-20T22:46:41.038 app[2e90493f] fra [info]     ** (Postgrex.Error) ERROR 42P01 (undefined_table) relation "users" does not exist
2021-12-20T22:46:41.038 app[2e90493f] fra [info]     query: SELECT u0."id", u0."email", u0."first_name", u0."last_name", u0."inserted_at", u0."updated_at" FROM "users" AS u0
2021-12-20T22:46:41.038 app[2e90493f] fra [info]         (ecto_sql 3.7.1) lib/ecto/adapters/sql.ex:760: Ecto.Adapters.SQL.raise_sql_call_error/1
2021-12-20T22:46:41.038 app[2e90493f] fra [info]         (ecto_sql 3.7.1) lib/ecto/adapters/sql.ex:693: Ecto.Adapters.SQL.execute/5
2021-12-20T22:46:41.038 app[2e90493f] fra [info]         (ecto 3.7.1) lib/ecto/repo/queryable.ex:219: Ecto.Repo.Queryable.execute/4
2021-12-20T22:46:41.038 app[2e90493f] fra [info]         (ecto 3.7.1) lib/ecto/repo/queryable.ex:19: Ecto.Repo.Queryable.all/3
2021-12-20T22:46:41.038 app[2e90493f] fra [info]         (fly 0.1.0) lib/fly_web/controllers/user_controller.ex:10: FlyWeb.UserController.index/2
2021-12-20T22:46:41.038 app[2e90493f] fra [info]         (fly 0.1.0) lib/fly_web/controllers/user_controller.ex:1: FlyWeb.UserController.action/2
2021-12-20T22:46:41.038 app[2e90493f] fra [info]         (fly 0.1.0) lib/fly_web/controllers/user_controller.ex:1: FlyWeb.UserController.phoenix_controller_pipeline/2
2021-12-20T22:46:41.038 app[2e90493f] fra [info]         (phoenix 1.6.5) lib/phoenix/router.ex:355: Phoenix.Router.__call__/2

Hmmm. So the migration is running successfully, but thinks there’s nothing to do. Then the database doesn’t have the tables you expect?

I think maybe the Dockerfile isn’t copying the migrations into the image properly. You probably need to uncomment this line:

# COPY priv priv

Migrations are stored in priv and I don’t see that getting copied anywhere else.
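
For reference, here is a minimal sketch of how the end of the build stage might look with that line uncommented. It reuses only the instructions already in the Dockerfile above and assumes the migrations live in the default priv/repo/migrations directory. mix release bundles the application's priv directory into the release, so the COPY has to come before the release step:

```dockerfile
# copy priv (including priv/repo/migrations) so the release contains the migrations
COPY priv priv

# copy source here if not using TailwindCSS
COPY lib lib

# compile and build release
COPY rel rel
RUN mix do compile, release
```

After redeploying with this change, the release command should find the migration files and apply them rather than reporting "Migrations already up".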

I wouldn’t have caught that.

Thanks for all your help.
