  • The tables are tiny, non-sharded... low tens of millions of rows per table. Absolutely nothing special being done with them.

    I do know sharding and partitioning techniques... but I've not yet seen any reason to use them here. The database is still less than 100GB in size.

  • And do you run the cluster yourself etc., or a managed solution like RDS? What kind of setup are we talking about?
    Also interested in what approach you're taking to fulltext search.

    We are running the RDS Postgres flavour ... I have no idea what's considered a large vs a small DB.
    I really like Postgres, especially the jsonb stuff, and have learnt a lot about it in the last two years. If there's a complicated reporting query they want, it's always me writing it.

    For fulltext, right now I am running a trigger that processes the relevant fields into a tsvector. Not sure if that's a good or shit approach.
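
    A minimal sketch of that pattern, assuming a hypothetical posts(title, body) table; Postgres ships a built-in trigger helper for exactly this:

    ```sql
    -- Hypothetical table; tsv is maintained by the trigger below.
    ALTER TABLE posts ADD COLUMN tsv tsvector;

    CREATE TRIGGER posts_tsv_update
      BEFORE INSERT OR UPDATE ON posts
      FOR EACH ROW
      EXECUTE FUNCTION tsvector_update_trigger(tsv, 'pg_catalog.english', title, body);

    CREATE INDEX posts_tsv_idx ON posts USING GIN (tsv);
    ```

    (EXECUTE FUNCTION is PG 11+ syntax; older versions use EXECUTE PROCEDURE. On PG 12+ a stored generated column over to_tsvector() does the same job without a trigger.)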

  • Large = the db files physically cannot fit on one box, or (read or write) operations exceed the capability of one box.

    We're a long way short of that.

    I operate a single node with failover. I presently put all reads and writes through the master, using a single DNS record for the address; on failover, the slave is promoted and becomes the new master, and the DNS record follows it.

    I do not have read slaves, and don't have a traditional cluster... just a single Linode with a 12GB RAM allocation and 6 CPU cores. It's a balance between computation and enough RAM to hold the indexes. This is Linode's most powerful (computation-wise) high-RAM instance below $100 per month... and we have 2 of them, so it's a huge part of our hosting cost.
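
    For anyone copying that shape, a rough sketch of a two-node streaming-replication setup (PG 12+; the role name and password are placeholders):

    ```sql
    -- On the primary: allow streaming replication and create a replication role.
    ALTER SYSTEM SET wal_level = 'replica';     -- already the default on modern PG
    ALTER SYSTEM SET max_wal_senders = 5;
    CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'changeme';

    -- The standby is typically seeded with `pg_basebackup -R`, which writes
    -- primary_conninfo and the standby.signal file for you.

    -- On failover: promote the standby, then repoint the DNS record at it.
    SELECT pg_promote();
    ```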

    For fulltext we process the relevant fields into a tsvector with a GIN index, but in a separate table I named search_index. That table has some extra fields: the ones that matter when doing a fulltext search. E.g. if people search by text and author, then I put the author id on there as well as the text index columns. I only want fulltext search to hit that table, not to do lots of JOINs.
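
    A sketch of roughly what that looks like (table and column names here are illustrative):

    ```sql
    CREATE TABLE search_index (
      post_id   bigint PRIMARY KEY REFERENCES posts (id),
      author_id bigint NOT NULL,   -- filter columns denormalised in, so no JOINs
      tsv       tsvector NOT NULL
    );

    CREATE INDEX search_index_tsv_idx ON search_index USING GIN (tsv);

    -- A search then only ever touches this one table:
    SELECT post_id
    FROM search_index
    WHERE author_id = 42
      AND tsv @@ websearch_to_tsquery('english', 'jsonb indexing')
    ORDER BY ts_rank(tsv, websearch_to_tsquery('english', 'jsonb indexing')) DESC
    LIMIT 20;
    ```

    (websearch_to_tsquery is PG 11+; plainto_tsquery works on older versions.)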

    I've found the PostgreSQL fulltext to be good enough for 99% of scenarios, which makes it good enough full stop.

    PostgreSQL is easy to sysadmin, easy to run, simple to configure for failover, easy to partition, easy to shard, easy enough to cluster, easy to set up read slaves.

    It's easy.
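
    For instance, "easy to partition" really is a couple of DDL statements with declarative partitioning (PG 10+, hypothetical table):

    ```sql
    CREATE TABLE events (
      id         bigint      NOT NULL,
      created_at timestamptz NOT NULL,
      payload    jsonb
    ) PARTITION BY RANGE (created_at);

    CREATE TABLE events_2024 PARTITION OF events
      FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
    ```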

    It only really gets messy if you go overboard on either triggers or stored procs. DBA rules still apply: I treat the database as the owner of data integrity, and nothing more. That means I permit some triggers and functions, but only where they help enforce data integrity. I leave computation to the application calling the database.
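
    The kind of trigger that rule permits is one that guards an invariant rather than computing anything, e.g. (hypothetical rule):

    ```sql
    -- Guard an invariant: once written, a post's author must not change.
    CREATE OR REPLACE FUNCTION forbid_author_change() RETURNS trigger AS $$
    BEGIN
      IF NEW.author_id IS DISTINCT FROM OLD.author_id THEN
        RAISE EXCEPTION 'author_id is immutable';
      END IF;
      RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER posts_author_immutable
      BEFORE UPDATE ON posts
      FOR EACH ROW EXECUTE FUNCTION forbid_author_change();
    ```

    Anything that transforms or aggregates data lives in the application instead.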
