PostgreSql Performance

nicktorba · July 1, 2022, 1:51pm

Hello Fly Team,

I’m trying to get a postgres app up and running, but I am struggling to get reasonable performance. I am creating a table with the following python command:

df_.to_sql("tweets1", 
           engine, 
           if_exists="replace", 
           dtype={
                'id': BigInteger,
                'created_at': DateTime, 
                'author.username': String(15), 
                'author.id': BigInteger, 
                'referenced_tweets.replied_to.id': BigInteger, 
                'referenced_tweets.retweeted.id': BigInteger,
                'referenced_tweets.quoted.id': BigInteger, 
                'in_reply_to_user_id': BigInteger,
                "entities.mentions": JSON,
                "public_metrics.reply_count": BigInteger, 
                "author.public_metrics.followers_count": BigInteger, 
           }
)

which runs this sql command to create the table:

CREATE TABLE tweets1 (
	index BIGINT, 
	id BIGINT, 
	created_at TIMESTAMP WITHOUT TIME ZONE, 
	"author.username" VARCHAR(15), 
	"author.id" BIGINT, 
	"referenced_tweets.replied_to.id" BIGINT, 
	"referenced_tweets.retweeted.id" BIGINT, 
	"referenced_tweets.quoted.id" BIGINT, 
	in_reply_to_user_id BIGINT, 
	"entities.mentions" JSON, 
	"public_metrics.reply_count" BIGINT, 
	"author.public_metrics.followers_count" BIGINT
)

My table only has ~4500 rows in it, but it takes 30 seconds for select * from tweets1 to complete and return the data. This seems way slower than it should be. I’m not sure where I am going wrong. I wouldn’t think any optimizations are needed to get better performance for this use would be necesssary, which makes me think I am missing a very simple step.

My db is running with 2 dedicated cpu’s and 4096MB.

Could someone give me some basic db pointers to help me figure this out?

tj1 · July 4, 2022, 2:02am

Well, unless you have megabytes of data in each row, this should take much less than a second.

I would debug in the following order:

ssh into the db node to run psql with the query “select count(*) from tweets1”
ssh into the db node to run psql with the query “select * from tweets1”
ssh into the app node to run psql with the query “select count(*) from tweets1”
ssh into the app node to run psql with the query “select * from tweets1”

You’ll have to connect with “psql -U youruser -h somehost” etc.

kurt · July 4, 2022, 2:22am

@tj1 made good debugging suggestions. If the commands they suggested are fast, make sure your app and database are running in the same regions.

nicktorba · July 5, 2022, 5:09pm

Does the command fly postgres connect tweetscape-follows-views achieve the same thing as ssh’ing into the db node?

When i do this, the count(*) query is fast, but the select * query is still very slow

tj1 · July 5, 2022, 6:50pm

No, I believe that is connecting from your machine which means you would be counting that network latency.

kurt · July 5, 2022, 6:55pm

It’s a little of both. That command runs on the DB node itself, but then it has to send results over SSH to your local terminal.

Is select * from tweets1 limit 10 fast?

nicktorba · July 5, 2022, 7:30pm

0.581 ms when running it with the timer on. (turning on timer with the \timing command at psql console).

That seems slower than it should be correct?

kurt · July 5, 2022, 8:37pm

It’s hard to say. I can imagine queries that take 581ms. What are you getting for that on other systems? There’s nothing special about the Postgres you’re using. It should behave the same with the same RAM / CPU / disk as any other server has.

tj1 · July 6, 2022, 1:08am

Without logging into the db server and running the query it’s tough to separate the network portion vs the db portion of the performance.

nicktorba · July 6, 2022, 2:29pm

alright is is officially entirely a network issue on my end.

Just ran the query from my python app running in the same region as my db, and it returned in 0.04 seconds…

Is there a way for me to develop my app directly on a fly machine to avoid this network issue in the future?

Also, thank you both for your help!!! @tj1 @kurt

EDIT: just found Jupyter Notebook · Launch on Fly.io, I think this will take care of that for me!!

Topic		Replies	Views
Postgres queries slow Questions / Help postgres	4	664	December 12, 2022
Significant PostgreSQL performance difference between apps v1 and v2 Questions / Help postgres	1	540	July 21, 2023
Fly Postgres Connections Extremely Slow from Django Questions / Help postgres , django	7	829	August 1, 2023
Slow Postgres performance? Build debugging postgres , rails , ruby	8	811	April 19, 2023
Migrating Postgres from Render to Fly Questions / Help postgres	1	395	February 6, 2023

PostgreSql Performance

Related topics