Install timescaledb

I don’t think we have one ready, but I imagine adapting fly-apps/postgres-ha (Postgres + Stolon for HA clusters as Fly apps) to replace plain Postgres with Timescale should work fine.

We’ll see if we can put together an example. In the meantime, if you’re trying it out, we can help here.

I cloned the repo and ran:

flyctl launch
flyctl volumes create pg_data --region sea --size 10
flyctl secrets set SU_PASSWORD=[redacted] REPL_PASSWORD=[redacted]
flyctl deploy

The deploy is giving me the following error:

2021-11-08T04:54:29.000 [info] keeper   | 2021-11-08T04:54:29.334Z      INFO    cmd/keeper.go:1676      postgres parameters not changed
2021-11-08T04:54:29.000 [info] keeper   | 2021-11-08T04:54:29.334Z      INFO    cmd/keeper.go:1703      postgres hba entries not changed
2021-11-08T04:54:34.000 [info] keeper   | 2021-11-08T04:54:34.444Z      INFO    cmd/keeper.go:1505      our db requested role is master
2021-11-08T04:54:34.000 [info] keeper   | 2021-11-08T04:54:34.445Z      INFO    cmd/keeper.go:1543      already master
2021-11-08T04:54:34.000 [info] keeper   | 2021-11-08T04:54:34.458Z      INFO    cmd/keeper.go:1676      postgres parameters not changed
2021-11-08T04:54:34.000 [info] keeper   | 2021-11-08T04:54:34.459Z      INFO    cmd/keeper.go:1703      postgres hba entries not changed
***v1 failed - Failed due to unhealthy allocations and deploying as v2 

Let me try replicating this. In the meantime, does the default Timescale image on Docker Hub work as is? Or are you specifically looking for a high-availability system?

Yeah, I’d rather not stray too far from what fly-apps/postgres-ha provides.

Since TimescaleDB can be installed just fine as an extension, I forked the postgres-ha repo and modified the Dockerfile to add what is needed for TimescaleDB without having to change the base Postgres image.

I can build the image locally just fine, but it also fails when deploying.

Did you use the fly.toml in the directory as well? It has a few directives that need to be enabled.

That said, the error might not be an error. Is the app actually working? Most of those are info lines, and the message indicates a retry. Did it eventually succeed?

Could you post the full deploy logs if possible? I’d like to see what happens before and after the errors.

Yes, I used the same fly.toml. Running flyctl launch just renamed the project. The app wasn’t deployed. I’ll try to spin up another one and come back with the results.

BTW, when running init I get:

➜  postgres-ha git:(main) flyctl init

Error: unknown command "init" for "flyctl"

Did you mean this?
        info

Run 'flyctl --help' for usage.

Error unknown command "init" for "flyctl"

Did you mean this?
        info


➜  postgres-ha git:(main) 

➜  postgres-ha git:(main) flyctl version
flyctl v0.0.250 darwin/amd64 Commit: 7a90db9 BuildDate: 2021-10-28T20:49:47Z

I think that’s deprecated; you probably want flyctl launch instead.

Here is the whole failing deploy.

Launching

➜  postgres-ha git:(main) flyctl launch
An existing fly.toml file was found for app postgres-ha-example
? Would you like to copy its configuration to the new app? Yes
Creating app in /Users/ericktamayo/Code/metronome/postgres-ha
Scanning source code
Detected a Dockerfile app
? App Name (leave blank to use an auto-generated name): postgres-ha-example
? Select organization: Metronome (metronome)
? Select region: sea (Seattle, Washington (US))
Created app postgres-ha-example in organization metronome
Wrote config file fly.toml
? Would you like to deploy now? No
Your app is ready. Deploy with `flyctl deploy`

Here the app is created but pending deployment

➜  postgres-ha git:(main) ✗ flyctl volumes create pg_data --region sea --size 10
        ID: vol_k0o6d42gyn7v87gy
      Name: pg_data
    Region: sea
   Size GB: 10
 Encrypted: true
Created at: 08 Nov 21 18:08 UTC

Adding the secrets as per the README

➜  postgres-ha git:(main) ✗ flyctl secrets set SU_PASSWORD=[redacted]  REPL_PASSWORD=[redacted]
Secrets are staged for the first deployment

Deploying

➜  postgres-ha git:(main) ✗ flyctl deploy
Deploying postgres-ha-example
==> Validating app configuration
--> Validating app configuration done
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.8 linux x86_64
Sending build context to Docker daemon  172.5kB
[+] Building 4.2s (23/23) FINISHED                                                                                                                                                                                                             
 => [internal] load remote build context                                                                                                                                                                                                  0.0s
 => copy /context /                                                                                                                                                                                                                       0.1s
 => [internal] load metadata for docker.io/flyio/stolon:b6b9aaf                                                                                                                                                                           4.0s
 => [internal] load metadata for docker.io/library/postgres:13.4                                                                                                                                                                          4.0s
 => [internal] load metadata for docker.io/wrouesnel/postgres_exporter:latest                                                                                                                                                             4.0s
 => [internal] load metadata for docker.io/library/golang:1.16                                                                                                                                                                            4.0s
 => [stage-3 1/9] FROM docker.io/library/postgres:13.4@sha256:1adb50e5c24f550a9e68457a2ce60e9e4103dfc43c3b36e98310168165b443a1                                                                                                            0.0s
 => [postgres_exporter 1/1] FROM docker.io/wrouesnel/postgres_exporter:latest@sha256:54bd3ba6bc39a9da2bf382667db4dc249c96e4cfc837dafe91d6cc7d362829e0                                                                                     0.0s
 => [flyutil 1/5] FROM docker.io/library/golang:1.16@sha256:e04b1665f7caf60b88c732fa3ce41e2bcf5b4320ad77f42a15d5bcda76fc4b81                                                                                                              0.0s
 => [stolon 1/1] FROM docker.io/flyio/stolon:b6b9aaf@sha256:ed7dfa80c26e8cdfcc3c7316c1577c1cd60d4360d8790bb22635c619a1bf8cfe                                                                                                              0.0s
 => CACHED [stage-3 2/9] RUN apt-get update && apt-get install --no-install-recommends -y     ca-certificates curl bash dnsutils vim-tiny procps jq haproxy     postgresql-13-postgis-3     postgresql-13-postgis-3-scripts     && apt a  0.0s
 => CACHED [stage-3 3/9] COPY --from=stolon /go/src/app/bin/* /usr/local/bin/                                                                                                                                                             0.0s
 => CACHED [stage-3 4/9] COPY --from=postgres_exporter /postgres_exporter /usr/local/bin/                                                                                                                                                 0.0s
 => CACHED [stage-3 5/9] ADD /scripts/* /fly/                                                                                                                                                                                             0.0s
 => CACHED [stage-3 6/9] ADD /config/* /fly/                                                                                                                                                                                              0.0s
 => CACHED [stage-3 7/9] RUN useradd -ms /bin/bash stolon                                                                                                                                                                                 0.0s
 => CACHED [stage-3 8/9] RUN mkdir -p /run/haproxy/                                                                                                                                                                                       0.0s
 => CACHED [flyutil 2/5] WORKDIR /go/src/github.com/fly-examples/postgres-ha                                                                                                                                                              0.0s
 => CACHED [flyutil 3/5] COPY . .                                                                                                                                                                                                         0.0s
 => CACHED [flyutil 4/5] RUN CGO_ENABLED=0 GOOS=linux go build -v -o /fly/bin/flyadmin ./cmd/flyadmin                                                                                                                                     0.0s
 => CACHED [flyutil 5/5] RUN CGO_ENABLED=0 GOOS=linux go build -v -o /fly/bin/start ./cmd/start                                                                                                                                           0.0s
 => CACHED [stage-3 9/9] COPY --from=flyutil /fly/bin/* /usr/local/bin/                                                                                                                                                                   0.0s
 => exporting to image                                                                                                                                                                                                                    0.0s
 => => exporting layers                                                                                                                                                                                                                   0.0s
 => => writing image sha256:e5ef4f2170624687ae0b46b16bcf01496404f876ae19e5a693c2436606bac116                                                                                                                                              0.0s
 => => naming to registry.fly.io/postgres-ha-example:deployment-1636395062                                                                                                                                                                0.0s
--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/postgres-ha-example]
245c90cee8b9: Pushed 
509bbeb2db24: Pushed 
16ef445c4836: Pushed 
87e4fd03453f: Pushed 
afeee662292a: Pushed 
b5a4612ba664: Pushed 
2b62d89bcaed: Pushed 
c6c0c5fd9172: Pushed 
9180e7a3f39f: Pushed 
5ed186b83b18: Pushed 
82b6874b44d0: Pushed 
0739eec8bae5: Pushed 
9a1a7d8bf685: Pushed 
535b38c199d6: Pushed 
bb5416d92e3c: Pushed 
1d69f9da9d06: Pushed 
651af98b41e3: Pushed 
8f6d195cb042: Pushed 
756e6b21e18e: Pushed 
0f912f02afd0: Pushed 
e8b689711f21: Mounted from metronome-sh 
deployment-1636395062: digest: sha256:e6758bf6404f8397659b85ac39109e3d10e6aedd995fab8e11c6a487adf92ce9 size: 4714
--> Pushing image done
Image: registry.fly.io/postgres-ha-example:deployment-1636395062
Image size: 781 MB
==> Creating release
Release v2 created

You can detach the terminal anytime without stopping the deployment
Monitoring Deployment

1 desired, 1 placed, 0 healthy, 1 unhealthy [health checks: 3 total, 2 passing, 1 critical]
v0 failed - Failed due to unhealthy allocations
Failed Instances

==> Failure #1

Instance
  ID            = b01e8595                        
  Process       =                                 
  Version       = 0                               
  Region        = sea                             
  Desired       = run                             
  Status        = running (leader)                
  Health Checks = 3 total, 2 passing, 1 critical  
  Restarts      = 0                               
  Created       = 4m52s ago                       

Recent Events
TIMESTAMP            TYPE       MESSAGE                 
2021-11-08T18:12:01Z Received   Task received by client 
2021-11-08T18:12:01Z Task Setup Building Task Directory 
2021-11-08T18:12:16Z Started    Task started by client  

Recent Logs
-- the same lines repeat here several times --
2021-11-08T18:17:02.000 [info] keeper   | 2021-11-08T18:17:02.269Z      INFO    cmd/keeper.go:1505      our db requested role is master
2021-11-08T18:17:02.000 [info] keeper   | 2021-11-08T18:17:02.270Z      INFO    cmd/keeper.go:1543      already master
2021-11-08T18:17:02.000 [info] keeper   | 2021-11-08T18:17:02.284Z      INFO    cmd/keeper.go:1676      postgres parameters not changed
2021-11-08T18:17:02.000 [info] keeper   | 2021-11-08T18:17:02.284Z      INFO    cmd/keeper.go:1703      postgres hba entries not changed
***v0 failed - Failed due to unhealthy allocations and deploying as v1 

Troubleshooting guide at https://fly.io/docs/getting-started/troubleshooting/

The app status on the fly dashboard says running:

But if I go to Activity:

I try to get into the db with Postico and I get a timeout:

This seems like a pretty good start. There are a couple of things to note:

  • one of the health checks is failing; you can see which one with fly checks list
  • you can see the logs that the VMs themselves are printing with fly logs
  • the Postgres app does not expose a listener, so this app will not be accessible to the outside world on app.fly.dev. Instead, you’ll want to join the WireGuard network using the notes at Private Networking and access app.internal, or add a public listener using the notes in Multi-region PostgreSQL

I think looking at fly checks list and fly logs would be the first steps, though. If the logs show no errors, we could remove the checks if they turn out to be unnecessary.
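
As a rough sketch of that first step, the output of fly checks list can be narrowed down to just the failing checks. The critical_checks name is made up for illustration and is not part of flyctl. The filter assumes the table layout flyctl prints in this thread, with NAME, STATUS and ALLOCATION as the first three columns.

```shell
# Hypothetical helper (not part of flyctl): read `fly checks list` output on
# stdin and print the check name and allocation ID for every row whose
# STATUS column is "critical". Assumes the column order NAME STATUS ALLOCATION.
critical_checks() {
  awk '$2 == "critical" { print $1, $3 }'
}

# Usage (needs flyctl and an app context):
#   fly checks list | critical_checks
```

Wrapped continuation lines in the OUTPUT column are skipped because their second field is not a status word.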

The checks might also be failing because there’s only a single instance. I think the HA package assumes at least two instances, so you’ll want to add another volume and run fly scale count 2 to run both.
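
Put together, that suggestion might look like the sketch below. The add_second_instance wrapper is hypothetical; the two flyctl commands inside it are the ones quoted in this thread, with the region and size passed in as arguments instead of hard-coded.

```shell
# Sketch only: create one more pg_data volume, then scale the app to two
# instances. `add_second_instance` is a made-up name for illustration.
#   $1 = region code (e.g. sea), $2 = volume size in GB
add_second_instance() {
  region="$1"
  size_gb="$2"
  flyctl volumes create pg_data --region "$region" --size "$size_gb" || return 1
  flyctl scale count 2
}

# Usage: add_second_instance sea 10
```

The volume is created first so that the second instance has something to mount when it is placed.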

OK, I’m scaling to 2 instances, but in the meantime fly checks list is giving me:

➜  postgres-ha git:(main) ✗ fly checks list
Health Checks for postgres-ha-example
NAME STATUS   ALLOCATION REGION TYPE LAST UPDATED OUTPUT                               
vm   passing  b01e8595   sea    HTTP 1m41s ago    HTTP GET                             
                                                  http://172.19.2.50:5500/flycheck/vm: 
                                                  200 OK Output: "[✓] checkDisk:       
                                                  9.19 GB (94.0%) free space on        
                                                  /data/ (42.41µs)\n[✓] checkLoad:     
                                                  load averages: 0.00 0.01 0.02        
                                                  (67.33µs)\n[✓] memory: system spent  
                                                  0s of the last 60s waiting on memory 
                                                  (39.25µs)\n[✓] cpu: system spent     
                                                  270ms of the last 60s waiting on     
                                                  cpu (30.18µs)\n[✓] io: system spent  
                                                  0s of the last 60s waiting on io     
                                                  (26.46µs)"                           
role passing  b01e8595   sea    HTTP 35m5s ago    leader                               
pg   critical b01e8595   sea    HTTP 35m16s ago   HTTP GET                             
                                                  http://172.19.2.50:5500/flycheck/pg: 
                                                  500 Internal Server Error Output:    
                                                  "failed to connect to proxy: context 
                                                  deadline exceeded"   

OK, I added a second volume and scaled to 2. I’m still having issues, so I’ll check whether I can connect using WireGuard.

2 desired, 2 placed, 0 healthy, 2 unhealthy
v2 failed - Failed due to unhealthy allocations
***v2 failed - Failed due to unhealthy allocations and deploying as v3 

fly checks list shows:

➜  postgres-ha git:(main) ✗ fly checks list
Health Checks for postgres-ha-example
NAME STATUS   ALLOCATION REGION TYPE LAST UPDATED OUTPUT                               
vm   passing  b01e8595   sea    HTTP 1m21s ago    HTTP GET                             
                                                  http://172.19.2.50:5500/flycheck/vm: 
                                                  200 OK Output: "[✓] checkDisk:       
                                                  9.12 GB (93.3%) free space on        
                                                  /data/ (41.98µs)\n[✓] checkLoad:     
                                                  load averages: 0.00 0.01 0.02        
                                                  (66.73µs)\n[✓] memory: system spent  
                                                  0s of the last 60s waiting on memory 
                                                  (40.26µs)\n[✓] cpu: system spent     
                                                  318ms of the last 60s waiting on     
                                                  cpu (30.37µs)\n[✓] io: system spent  
                                                  0s of the last 60s waiting on io     
                                                  (27.44µs)"                           
role passing  b01e8595   sea    HTTP 44m29s ago   leader                               
pg   critical b01e8595   sea    HTTP 44m40s ago   HTTP GET                             
                                                  http://172.19.2.50:5500/flycheck/pg: 
                                                  500 Internal Server Error Output:    
                                                  "failed to connect to proxy: context 
                                                  deadline exceeded"                   
vm   passing  2f2ebf1a   sea    HTTP 1m7s ago     HTTP GET                             
                                                  http://172.19.1.42:5500/flycheck/vm: 
                                                  200 OK Output: "[✓] checkDisk:       
                                                  9.17 GB (93.8%) free space on        
                                                  /data/ (46.4µs)\n[✓] checkLoad:      
                                                  load averages: 0.00 0.03 0.05        
                                                  (68.07µs)\n[✓] memory: system spent  
                                                  0s of the last 60s waiting on memory 
                                                  (40.69µs)\n[✓] cpu: system spent     
                                                  318ms of the last 60s waiting on     
                                                  cpu (30.01µs)\n[✓] io: system spent  
                                                  12ms of the last 60s waiting on io   
                                                  (47.58µs)"                           
role passing  2f2ebf1a   sea    HTTP 5m28s ago    replica                              
pg   critical 2f2ebf1a   sea    HTTP 5m45s ago    HTTP GET                             
                                                  http://172.19.1.42:5500/flycheck/pg: 
                                                  500 Internal Server Error Output:    
                                                  "failed to connect to proxy: context 
                                                  deadline exceeded"                   

I’m working on replicating this (getting timeouts on the Timescale keys). In the meantime, do you have anything in fly logs? We might be able to see what’s happening there.

Nothing that explains what is going on (at least, I don’t think it does).

Same thing over and over:

2021-11-08T19:58:06.170 app[2f2ebf1a] sea [info] keeper   | 2021-11-08T19:58:06.170Z    INFO    cmd/keeper.go:1557       our db requested role is standby        {"followedDB": "b89233f0"}
2021-11-08T19:58:06.170 app[2f2ebf1a] sea [info] keeper   | 2021-11-08T19:58:06.170Z    INFO    cmd/keeper.go:1576       already standby
2021-11-08T19:58:06.189 app[2f2ebf1a] sea [info] keeper   | 2021-11-08T19:58:06.189Z    INFO    cmd/keeper.go:1676       postgres parameters not changed
2021-11-08T19:58:06.190 app[2f2ebf1a] sea [info] keeper   | 2021-11-08T19:58:06.189Z    INFO    cmd/keeper.go:1703       postgres hba entries not changed
2021-11-08T19:58:07.174 app[b01e8595] sea [info] keeper   | 2021-11-08T19:58:07.174Z    INFO    cmd/keeper.go:1505       our db requested role is master
2021-11-08T19:58:07.175 app[b01e8595] sea [info] keeper   | 2021-11-08T19:58:07.175Z    INFO    cmd/keeper.go:1543       already master
2021-11-08T19:58:07.188 app[b01e8595] sea [info] keeper   | 2021-11-08T19:58:07.188Z    INFO    cmd/keeper.go:1676       postgres parameters not changed
2021-11-08T19:58:07.189 app[b01e8595] sea [info] keeper   | 2021-11-08T19:58:07.188Z    INFO    cmd/keeper.go:1703       postgres hba entries not changed
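
Since the keeper keeps repeating the same INFO lines, one way to look for anything else is to filter them out. This is a minimal sketch: hide_keeper_info is a hypothetical name, and it assumes the keeper’s own level field shows up as an upper-case INFO surrounded by whitespace, as in the lines above.

```shell
# Hypothetical helper: drop every line whose keeper log level is INFO, so
# that warnings and errors (if there are any) stand out.
hide_keeper_info() {
  grep -v '[[:space:]]INFO[[:space:]]'
}

# Usage (needs flyctl and an app context):
#   fly logs | hide_keeper_info
```

The lower-case [info] tag that Fly adds to each line is left alone, so a keeper error that Fly relays at info level would still pass through.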

Do you have any issues installing the main repo, fly-apps/postgres-ha (Postgres + Stolon for HA clusters as Fly apps)?

The main repo is what is giving me errors. I haven’t added any TimescaleDB changes yet.

I’m trying to work through this whole process once. There are a couple of extra calls needed to get the HA system running; I’ll report back as soon as I figure them out.

Thanks @sudhir.j, you’re helping a lot!

I’d actually suggest tweaking the approach a bit: if you create a normal Fly PG HA setup using fly pg create, you can then modify the postgres-ha repo to add Timescale and just redeploy it (Fly PG apps are just Fly apps with extra initialisation). That way all the setup is already done for you.
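
Roughly, that flow might look like the following. This is a sketch under assumptions, not an official recipe: the function name and the app name are placeholders, fly pg create may still prompt for anything not given as a flag, and the deploy is assumed to run from a checkout of postgres-ha whose Dockerfile has already been changed to install TimescaleDB.

```shell
# Hypothetical wrapper around the suggested flow.
#   $1 = name for the Postgres app (placeholder), $2 = region code
create_then_redeploy() {
  app="$1"
  region="$2"
  # Step 1: let flyctl provision a standard Postgres cluster, so the
  # volumes, secrets and initial setup are handled for you.
  fly pg create --name "$app" --region "$region" || return 1
  # Step 2: from the modified postgres-ha checkout, deploy over that app.
  fly deploy --app "$app"
}

# Usage: create_then_redeploy my-timescale-db sea
```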

There’s a walkthrough on adding Timescale to an existing DB here: “How to Enable TimescaleDB on an Existing PostgreSQL Database” (Severalnines).

I see. I’ll try that and let you know.
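
For reference, enabling the extension on an existing database usually ends with creating it in the target database. A minimal sketch, assuming psql is installed, you supply the connection string, and the server already has the TimescaleDB libraries installed and listed in shared_preload_libraries (how to set that under Stolon is not covered in this thread). enable_timescaledb is a hypothetical name.

```shell
# Hypothetical helper: create the timescaledb extension in the database
# identified by the connection string in $1. ON_ERROR_STOP makes psql exit
# with a non-zero status if the statement fails.
enable_timescaledb() {
  psql "$1" -v ON_ERROR_STOP=1 -c 'CREATE EXTENSION IF NOT EXISTS timescaledb;'
}

# Usage (connection string is a placeholder):
#   enable_timescaledb "postgres://<user>:<password>@<host>:<port>/<database>"
```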