Help migrating postgres from heroku rails app

I am eval’ing fly and keep running into walls trying to figure out how to migrate our postgres db from heroku.

The beta guide says to run this pg_dump command:

pg_dump --no-owner -C -d $HEROKU_DATABASE_URL | psql -d $DATABASE_URL

… which doesn’t make sense to run locally (the environment variables are secrets set on the fly app, not locally available variables) but also doesn’t work as is running via the fly ssh console; I get a pg_dump version mismatch error:

pg_dump: server version: 13.8 (Ubuntu 13.8-1.pgdg20.04+1); pg_dump version: 11.17 (Debian 11.17-0+deb10u1)
pg_dump: aborting because of server version mismatch

I tried referencing this guide: Migrating from Heroku Postgres to Fly Postgres: A Complete Guide

… which essentially has the same step; the “migate” script is the same call to pg_dump on the fly console.

So I’m kind of stuck at this point. Any pointers?

Oops, I forgot to include to run that from the Fly machine. I updated the docs to read:

fly ssh console -C 'pg_dump --no-owner -C -d $HEROKU_DATABASE_URL | psql -d $DATABASE_URL'

Which runs the migration from the machine.

What do you get when you run: fly ssh console -C 'pg_dump -V'?

Here’s my output: pg_dump (PostgreSQL) 13.8 (Debian 13.8-0+deb11u1), which matches the version of the server I’m running.

Are you using the Dockerfile generated by fly launch?

Yes, I am using the fly launch generated Dockerfile for my Raisl app. For the app instance

pg_dump (PostgreSQL) 11.17 (Debian 11.17-0+deb10u1)

For the db instance itself:

pg_dump (PostgreSQL) 14.4 (Debian 14.4-1.pgdg110+1)

I’m not sure how I get the app instance to use the same pg version (I am a Docker noob).

And I did try setting ARG PG_VERSION=14.1 in the Dockerfile for the app and then deployed, but pg_dump is still reporting the same 11.17 version.

Copying and pasting from the updated help doc gives an error:

pg_dump: too many command-line arguments (first is "|")

When I log in via fly ssh console and paste the command into the console directly, it works but I get the same mismatch error

# pg_dump --no-owner -C -d $HEROKU_DATABASE_URL | psql -d $DATABASE_URL
pg_dump: server version: 13.8 (Ubuntu 13.8-1.pgdg20.04+1); pg_dump version: 11.17 (Debian 11.17-0+deb10u1)
pg_dump: aborting because of server version mismatch
#

Can you share your Dockerfile and fly.toml files? Make sure there’s nothing sensitive in it.

Dockerfile as generated by fly launch with my added PG_VERSION line in an attempt to get a matching pg_dump version:

# syntax = docker/dockerfile:experimental

# Dockerfile used to build a deployable image for a Rails application.
# Adjust as required.
#
# Common adjustments you may need to make over time:
#  * Modify version numbers for Ruby, Bundler, and other products.
#  * Add library packages needed at build time for your gems, node modules.
#  * Add deployment packages needed by your application
#  * Add (often fake) secrets needed to compile your assets

#######################################################################

# Learn more about the chosen Ruby stack, Fullstaq Ruby, here:
#   https://github.com/evilmartians/fullstaq-ruby-docker.
#
# We recommend using the highest patch level for better security and
# performance.
ARG PG_VERSION=14.4
ARG RUBY_VERSION=2.7.6
ARG VARIANT=jemalloc-slim
FROM quay.io/evl.ms/fullstaq-ruby:${RUBY_VERSION}-${VARIANT} as base

LABEL fly_launch_runtime="rails"

ARG NODE_VERSION=16.5.0
ARG BUNDLER_VERSION=2.1.4

ARG RAILS_ENV=production
ENV RAILS_ENV=${RAILS_ENV}

ENV RAILS_SERVE_STATIC_FILES true
ENV RAILS_LOG_TO_STDOUT true

ARG BUNDLE_WITHOUT=development:test
ARG BUNDLE_PATH=vendor/bundle
ENV BUNDLE_PATH ${BUNDLE_PATH}
ENV BUNDLE_WITHOUT ${BUNDLE_WITHOUT}

RUN mkdir /app
WORKDIR /app
RUN mkdir -p tmp/pids

RUN curl https://get.volta.sh | bash
ENV VOLTA_HOME /root/.volta
ENV PATH $VOLTA_HOME/bin:/usr/local/bin:$PATH
RUN volta install node@${NODE_VERSION} yarn

#######################################################################

# install packages only needed at build time

FROM base as build_deps

ARG BUILD_PACKAGES="git build-essential libpq-dev wget vim curl gzip xz-utils libsqlite3-dev"
ENV BUILD_PACKAGES ${BUILD_PACKAGES}

RUN --mount=type=cache,id=dev-apt-cache,sharing=locked,target=/var/cache/apt \
    --mount=type=cache,id=dev-apt-lib,sharing=locked,target=/var/lib/apt \
    apt-get update -qq && \
    apt-get install --no-install-recommends -y ${BUILD_PACKAGES} \
    && rm -rf /var/lib/apt/lists /var/cache/apt/archives

#######################################################################

# install gems

FROM build_deps as gems

RUN gem update --system --no-document && \
    gem install -N bundler -v ${BUNDLER_VERSION}

COPY Gemfile* ./
RUN bundle install &&  rm -rf vendor/bundle/ruby/*/cache

#######################################################################

# install node modules

FROM build_deps as node_modules

COPY package*json ./
COPY yarn.* ./
RUN yarn install

#######################################################################

# install deployment packages

FROM base

ARG DEPLOY_PACKAGES="postgresql-client file vim curl gzip libsqlite3-0"
ENV DEPLOY_PACKAGES=${DEPLOY_PACKAGES}

RUN --mount=type=cache,id=prod-apt-cache,sharing=locked,target=/var/cache/apt \
    --mount=type=cache,id=prod-apt-lib,sharing=locked,target=/var/lib/apt \
    apt-get update -qq && \
    apt-get install --no-install-recommends -y \
    ${DEPLOY_PACKAGES} \
    && rm -rf /var/lib/apt/lists /var/cache/apt/archives

# copy installed gems
COPY --from=gems /app /app
COPY --from=gems /usr/lib/fullstaq-ruby/versions /usr/lib/fullstaq-ruby/versions
COPY --from=gems /usr/local/bundle /usr/local/bundle

# copy installed node modules
COPY --from=node_modules /app/node_modules /app/node_modules

#######################################################################

# Deploy your application
COPY . .

# Adjust binstubs to run on Linux and set current working directory
RUN chmod +x /app/bin/* && \
    sed -i 's/ruby.exe/ruby/' /app/bin/* && \
    sed -i '/^#!/aDir.chdir File.expand_path("..", __dir__)' /app/bin/*

# The following enable assets to precompile on the build server.  Adjust
# as necessary.  If no combination works for you, see:
# https://fly.io/docs/rails/getting-started/existing/#access-to-environment-variables-at-build-time
ENV SECRET_KEY_BASE 1
# ENV AWS_ACCESS_KEY_ID=1
# ENV AWS_SECRET_ACCESS_KEY=1

# Run build task defined in lib/tasks/fly.rake
ARG BUILD_COMMAND="bin/rails fly:build"
RUN ${BUILD_COMMAND}

# Default server start instructions.  Generally Overridden by fly.toml.
ENV PORT 8080
ARG SERVER_COMMAND="bin/rails fly:server"
ENV SERVER_COMMAND ${SERVER_COMMAND}
CMD ${SERVER_COMMAND}

fly.toml as generated by fly with the release command no-op’d so I could get a successful deploy w/o running db:migrate:

# fly.toml file generated for lp-test on 2022-10-13T13:57:09-04:00
# bin/rails fly:release
app = "lp-test"
kill_signal = "SIGINT"
kill_timeout = 5
processes = []

[build]
  [build.args]
    BUILD_COMMAND = "bin/rails fly:build"
    SERVER_COMMAND = "bin/rails fly:server"

[deploy]
  release_command = "echo 'lfg'"

[env]
  PORT = "8080"

[experimental]
  allowed_public_ports = []
  auto_rollback = true

[[services]]
  http_checks = []
  internal_port = 8080
  processes = ["app"]
  protocol = "tcp"
  script_checks = []
  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    force_https = true
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

[[statics]]
  guest_path = "/app/public"
  url_prefix = "/"

I am also having an issue with this migration process – though I made it to the docs (which are lovely, btw) after @Brad added the fly ssh console fix. :high-five: I am using the Dockerfile created by fly launch. When I run fly ssh console -C 'pg_dump -V' I get pg_dump (PostgreSQL) 13.8 (Debian 13.8-0+deb11u1).

I’ve made it a little further than @sstrudeau, but am seeing several errors such as:

ERROR:  database "daeh15gmpe6foofoo" already exists
You are now connected to database "daeh15gmpe6foofoo" as user "my-new-fly-app-name".

ERROR:  schema "heroku_ext" already exists
ERROR:  duplicate key value violates unique constraint "ar_internal_metadata_pkey"
DETAIL:  Key (key)=(environment) already exists.
ERROR:  multiple primary keys for table "ar_internal_metadata" are not allowed
ERROR:  role "bzymrsrkobbooj" does not exist
REVOKE

The database name daeh15gmpe6foofoo and role bzymrsrkobbarbar look like auto-generated names. Perhaps by heroku? So I’m not quite sure how to overcome these mismatches.

I’ve been messing around with this all weekend and a friend beat me to the punch with this elegant solution that he submitted to the Docs here:

pg_dump --no-owner --data-only -d $HEROKU_DATABASE_URL | psql -d $DATABASE_URL

By replacing the -C (which is short for --create) with --data-only, we sidestep the schema conflict errors (etc) because we’re saving only the data in the dump file. If you were following the Migrate from Heroku · Fly Docs like I was, you already created the database. So in this step, we just need to fill it with data.

This approach completely solved the problem for me. :sparkles:

2 Likes

Brilliant! Thanks @lortza!

I’m going to migrate a few of my Heroku apps to Fly today and see if I can run into some of these issues.

I just ran through the whole process again from scratch. I am creating an instance inside an organization (not personal) and chose the shared-cpu-1x tier for postgres (need more storage). The only additional step I took is updating the release task to avoid running the db:migrate task, which fails and prevents a successful deploy.

When I follow the updated instructions I still end up getting the pg version mismatch error:

# pg_dump --no-owner -C -d $HEROKU_DATABASE_URL | psql -d $DATABASE_URL
pg_dump: server version: 13.8 (Ubuntu 13.8-1.pgdg20.04+1); pg_dump version: 11.17 (Debian 11.17-0+deb10u1)
pg_dump: aborting because of server version mismatch
#

I ran through this today and am at a point where there’s a few things I can share in this thread that might be useful. That said, I think I still have some more work to do in terms of documenting how to match the versions of Fly’s PG server to Heroku’s (both can change).

Note that I’m using the docs from Migrate from Heroku · Fly Docs (not Migrating from Heroku Postgres to Fly Postgres: A Complete Guide)

Migration command

I changed the migration to command to better match Heroku’s recommendations. The new command you should try running from the fly ssh console is:

pg_dump -Fc --no-acl --no-owner -d $HEROKU_DATABASE_URL | pg_restore --verbose --clean --no-acl --no-owner -d $DATABASE_URL

I don’t know if that will solve the version mismatch problem, but maybe? I found this command to be more reliable in terms of moving the data from Heroku into my Fly DB.

Matching versions

If your Heroku instance is running a really old version of PG, you might need to upgrade it to match Fly’s version. Docs at Upgrading the Version of a Heroku Postgres Database | Heroku Dev Center.

Fly launches v14 of Postgres via Fly Postgres · Fly Docs. In theory an older version of pg can be run as an app via a Dockerfile, but it would be more manual.

I still need to do more research on this topic since I’m not very familiar with it myself. Please chime in if you know more about dealing with pg version compatibility for dumps & restores!

More on version mismatches… it looks like the app server instance created by fly launch gets pg_dump version 11. My current Heroku database is on version 13 and the postgres instance created by fly launch is on 14.

When I run pg_dump -V on the app server instance (in this case lp-test), I get version 11:

fly ssh console -C "pg_dump -V" -a lp-test
Connecting to top1.nearest.of.lp-test.internal... complete
pg_dump (PostgreSQL) 11.17 (Debian 11.17-0+deb10u1)

but if I run it on the postgres instance app (lp-test-db) I get 14:

fly ssh console -C "pg_dump -V" -a lp-test-db
Connecting to top1.nearest.of.lp-test-db.internal... complete
pg_dump (PostgreSQL) 14.4 (Debian 14.4-1.pgdg110+1)

Here is what I get when I run the updated suggested pg_dump command:

# pg_dump -Fc --no-acl --no-owner -d $HEROKU_DATABASE_URL | pg_restore --verbose --clean --no-acl --no-owner -d $DATABASE_URL
pg_dump: server version: 13.8 (Ubuntu 13.8-1.pgdg20.04+1); pg_dump version: 11.17 (Debian 11.17-0+deb10u1)
pg_dump: aborting because of server version mismatch
pg_restore: [archiver] input file is too short (read 0, expected 5)

I might try setting these secrets on the database instance instead and see if pg_dump 14 will play nice with a v13 heroku database. It is odd that the app instance gets an older postgres client.

Running pg_dump on the database instance rather than the app instance “worked” – though I am currently trying to figure out if the transfer was complete because it ended on this error message:

pg_restore: error: error returned by PQputCopyData: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

Finally had time to do some data validation and it looks like the transfer failed to complete; the last two tables were not copied over. I’m going to try running this through again, soon, but I’d like to be able to reliably reproduce a migration and not have to worry about random connection breaks if we decide to migrate for real…

I finally made it through the whole migration.

fly launch created an app server with pg_dump version 11. Setting the database URL secretes and running the migration command on the database app server (which gets version 14 from fly launch) instead of the main app server worked fine with my version 13 database on Heroku (though it did time out and fail once in my testing).

I’m going to call this solved. Hope my experience was helpful in clarifying the docs.

What worked for me was changing VARIANT in the generated Dockerfile from jemalloc-slim to jemalloc-bullseye-slim. The default base is still Debian 10.

Compare:

$ docker run --rm --interactive --tty --entrypoint 'bash' 'quay.io/evl.ms/fullstaq-ruby:2.7.6-jemalloc-slim'
# apt-get update -qq && apt-get install --no-install-recommends -y postgresql-client
# pg_dump --version
pg_dump (PostgreSQL) 11.18 (Debian 11.18-0+deb10u1)
$ docker run --rm --interactive --tty --entrypoint 'bash' 'quay.io/evl.ms/fullstaq-ruby:2.7.6-jemalloc-bullseye-slim'
# apt-get update -qq && apt-get install --no-install-recommends -y postgresql-client
# pg_dump --version
pg_dump (PostgreSQL) 13.8 (Debian 13.8-0+deb11u1)
2 Likes

Thanks for that! I’m going to update our base Gemfile to include the newer image. PR at Update base Rails image to point to Debian 11 (from Debian 10) by bradgessler · Pull Request #1462 · superfly/flyctl · GitHub

1 Like