Diagnosing postgres - insert fails into a specific (ordinary) table

Hey there, I’m moving a heroku/docker app over to fly and hitting a weird roadblock.

I’m using lucky and I was able to create my tables and seed my data just fine – this last step writes to several tables. I’m also able to write to some of my tables by interacting with the app (ruling out any kind of connection read-only weirdness, I think?).

What I’m unable to do is run an insert on a specific table. The table is ordinary, and so is the data. No postgres plugins or anything needed, just basic relational data. The insert query my app generates works when I run it on my local copy of postgres, and it runs in heroku postgres, but not on fly postgres (via psql or the app).

Nothing shows up in the logs other than what I believe to be repeated background noise:

2022-05-30T02:34:42Z app[40103901] sea [info]keeper   | 2022-05-30T02:34:42.730Z	INFO	cmd/keeper.go:1504	our db requested role is master
2022-05-30T02:34:42Z app[40103901] sea [info]keeper   | 2022-05-30T02:34:42.731Z	INFO	cmd/keeper.go:1542	already master
2022-05-30T02:34:42Z app[40103901] sea [info]keeper   | 2022-05-30T02:34:42.750Z	INFO	cmd/keeper.go:1675	postgres parameters not changed
2022-05-30T02:34:42Z app[40103901] sea [info]keeper   | 2022-05-30T02:34:42.751Z	INFO	cmd/keeper.go:1702	postgres hba entries not changed
2022-05-30T02:34:43Z app[40103901] sea [info]sentinel | 2022-05-30T02:34:43.913Z	WARN	cmd/sentinel.go:276	no keeper info available	{"db": "c15d757b", "keeper": "2dbb0fd342"}

Just in case, followed instructions to downsize from a production postgres instance to a free instance. That changed nothing.

I know this is a weird and vague question, but I’ve been poking at this for a few days now and I’m at a loss. I don’t know what else to try.

Did you figure this out?

What I was thinking is that those logs are for the database (yep, kind of noise) however if the error on-INSERT is going to be reported somewhere, it would be in your app’s logs. The ones got via fly logs in the CLI. Does any INSERT-related error show up there?

I would assume the ORM (Avram, it appears) would output some kind of error on the failed INSERT for that table. All of the ORMs I’ve used have some form of log/error output, even if you also catch it yourself. Maybe something in Lucky - Logging

Like … if it was an attempt to write to a read-replica, you would get an error to say that, and could debug accordingly. Or permissions. Or something else. So, the question is what response is returned back from the database in response to that INSERT …

I actually have not figured it out. I watched the app logs as my first go-to but saw nothing, so I started digging for postgres logs. I confess I’m a little disappointed at the lack of any interaction with the fly staff here – elsewhere they’ve been incredibly helpful.

The app logs for the queries are actually normal – the query “finishes” in normal time, some number of ms. This might represent a bug in the framework/logging facility, I’m not yet sure.

The problem is not rooted in the application or framework though, because I can reproduce it with a psql command line pointed at the tunnel to fly.io.

Which port are you connecting to when you try to perform a write? Also, does the driver return any errors?

The insert query my app generates works when I run it on my local copy of postgres, and it runs in heroku postgres, but not on fly postgres

and as well as those questions from @kurt (just for an apples-to-apples comparison) is this the same version of Postgres on all three (local, heroku and fly)? Just trying to figure out what’s different … like I don’t know, some foreign key weirdness between versions or something. Weird. :thinking:

Honestly I don’t see any errors anywhere. I’m able to write to some tables but not others, and it’s not apparent to me any difference between the two. From a folder with a toml that represents the database, fly proxy 5433:5432 and then in a new terminal tabpsql -d 'postgres://user:pass@localhost:5433'.

I have a folder called fly-pg with a toml in it for the postgres database because I got tired of the verbose commands all needing -a my-app-db.

Tonight I’m able to run manual inserts on tables where I previously wasn’t, but my app still hangs on these inserts. When I fly ssh connect -C printenv I see the database url has port 5432 configured as I believe it should be.

I appreciate your input, I hadn’t considered the postgres versions. I’m running 14.2 locally, and select version() connected to the cluster shows the same. The heroku deployment is 13.6.

Tonight I’m able to run manual inserts on tables where I previously wasn’t

Well that’s baffling. Suggests either:

  1. Something changed at Fly’s end with a host/volume/routing/? etc to make it work
  2. Your database changed, for example a lock released on a table allowing writes to work

It’s interesting that the Heroku version is different however given the versions are so close that’s highly unlikely to be a cause. And anyway your local version is the same version as the one on Fly. So we know that eliminates any version changes causing any issues.

Are the inserts that aren’t happening simple, like INSERT INTO table_name (a, b) VALUES (1, 2) or do they have any dependency on another table, like INSERT INTO table_name (a, b) SELECT a,b FROM another_table; ?

Only what I’m thinking (well, guessing, by this point) is the first type wouldn’t depend on what other data is in the database and also whether any other table can’t be read from (for whatever reason). But if it’s the second type and the insert is fetching data from another table, and e.g that data is present locally but is not on Fly, that insert would run but not do anything. As there would be no data to insert (fetched from another-table). And so not insert data but also not return any error.

And from your port reply, that looks correct to me. Initially I saw 5433 but the proxy is using 5432. So doesn’t seem it’s that.

Does this database have any read replicas, in other regions? I don’t believe you mentioned. If it doesn’t then that would eliminate any write-going-to-a-read-replica consideration regardless of port.

Alright so I did a full scrape and reinstall this morning, and everything seems to be working.

It’s possible something was amiss on my end somewhere, but I have no idea where that could have been – periodic reads/writes working seems to indicate everything was setup correctly.

I appreciate your help.

1 Like

No problem. Glad it’s working now.

1 Like