Hello Fly Team,
I’m trying to get a postgres app up and running, but I am struggling to get reasonable performance. I am creating a table with the following python command:
df_.to_sql("tweets1",
engine,
if_exists="replace",
dtype={
'id': BigInteger,
'created_at': DateTime,
'author.username': String(15),
'author.id': BigInteger,
'referenced_tweets.replied_to.id': BigInteger,
'referenced_tweets.retweeted.id': BigInteger,
'referenced_tweets.quoted.id': BigInteger,
'in_reply_to_user_id': BigInteger,
"entities.mentions": JSON,
"public_metrics.reply_count": BigInteger,
"author.public_metrics.followers_count": BigInteger,
}
)
which runs this sql command to create the table:
CREATE TABLE tweets1 (
index BIGINT,
id BIGINT,
created_at TIMESTAMP WITHOUT TIME ZONE,
"author.username" VARCHAR(15),
"author.id" BIGINT,
"referenced_tweets.replied_to.id" BIGINT,
"referenced_tweets.retweeted.id" BIGINT,
"referenced_tweets.quoted.id" BIGINT,
in_reply_to_user_id BIGINT,
"entities.mentions" JSON,
"public_metrics.reply_count" BIGINT,
"author.public_metrics.followers_count" BIGINT
)
My table only has ~4500 rows in it, but it takes 30 seconds for select * from tweets1
to complete and return the data. This seems way slower than it should be. I’m not sure where I am going wrong. I wouldn’t think any optimizations are needed to get better performance for this use would be necesssary, which makes me think I am missing a very simple step.
My db is running with 2 dedicated cpu’s and 4096MB.
Could someone give me some basic db pointers to help me figure this out?