-
• #6427
That's great info, thanks. I really didn't feel like managing databases and DBA work are my strengths at all, which is why I pressed for RDS. So far it's been good. We run reporting, searching, etc. off a read replica. Putting fulltext into a separate table is a clever move. I'll keep that in mind when it comes to refactoring. Actually, I won't, as I just handed in my notice :p
-
• #6428
I put the fulltext stuff in a separate table to give us options:
- If we used an external fulltext thing like elasticsearch, how easy would it be to strip fulltext searching out of the database?
- If we required time as a parameter to fulltext searches, could I have a path to increasing the performance of time-bounded queries (partition by time)? See the sketch after this list.
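For the second option, a hypothetical sketch of what declarative partitioning could look like; the names here are illustrative, not our real schema:
```sql
-- Hypothetical: if every search carried a time bound, the index table
-- could be range-partitioned on a timestamp. Names are illustrative.
CREATE TABLE search_index_by_time (
    item_type_id BIGINT      NOT NULL,
    item_id      BIGINT      NOT NULL,
    created      TIMESTAMPTZ NOT NULL,
    document     TSVECTOR    NOT NULL
) PARTITION BY RANGE (created);

CREATE TABLE search_index_2016 PARTITION OF search_index_by_time
    FOR VALUES FROM ('2016-01-01') TO ('2017-01-01');

-- On PostgreSQL 11+ an index on the parent cascades to partitions;
-- time-bounded queries then only scan the matching partitions.
CREATE INDEX ON search_index_by_time USING GIN (document);
```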
But I think it's been super useful anyway. The text on this site is largely static, so I'm able to keep the vacuuming separate, and the backups are easier to push across the wire because that stuff hasn't changed, etc.
The search_index table we have holds all fulltext... from sites, forums, threads, comments, PMs... everything. Which is how searching across multiple types of things is so easy. Each type of thing has its own trigger to push a copy into the fulltext index table.
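Roughly, each source table gets a trigger function along these lines (a minimal sketch with assumed table/column names and type id; the real triggers are in the migration linked a few posts down):
```sql
-- Sketch only: names and the type id are assumptions, and ON CONFLICT
-- assumes (item_type_id, item_id) is unique in search_index (PG 9.5+).
CREATE FUNCTION comments_to_search_index() RETURNS trigger AS $$
BEGIN
    INSERT INTO search_index (item_type_id, item_id, document)
    VALUES (4, NEW.comment_id, to_tsvector('english', NEW.text))
    ON CONFLICT (item_type_id, item_id)
        DO UPDATE SET document = EXCLUDED.document;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER comments_to_search_index
    AFTER INSERT OR UPDATE ON comments
    FOR EACH ROW EXECUTE PROCEDURE comments_to_search_index();
```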
-
• #6429
Makes sense. Thank you so much for the awesome insight. I'm taking notes here.
So do you generate the ts_vector entries off events in the API layer, in code, or via triggers etc.?
-
• #6431
Triggers... because this is data integrity :)
Though that file is for adding forum hierarchies, the commit you want for most of the full text is this one:
https://github.com/microcosm-cc/microcosm/blob/bbdaa6eacbedd830a52b34e97844fd2e88a06e8c/db/migrations/20160224223626_fulltext_search.sql
Note that the weighting varies according to the source table, which is how the profiles and forums get listed higher for exact matches when threads may also match, and also how threads beat comments for exact matches.
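The mechanism is PostgreSQL's setweight; a sketch with assumed weights (the real values are in the migration above):
```sql
-- Assumed weights (the real values are in the linked migration).
-- setweight labels lexemes A-D; ts_rank's default weighting scores
-- A (1.0) above B (0.4) above C (0.2) above D (0.1).
SELECT setweight(to_tsvector('english', 'south london'), 'A'); -- profiles, forums
SELECT setweight(to_tsvector('english', 'south london'), 'B'); -- thread titles
SELECT setweight(to_tsvector('english', 'south london'), 'C'); -- comment bodies

-- The weights feed straight into ranking at query time:
SELECT item_type_id, item_id,
       ts_rank(document, plainto_tsquery('english', 'south london')) AS rank
  FROM search_index
 WHERE document @@ plainto_tsquery('english', 'south london')
 ORDER BY rank DESC;
```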
-
• #6432
Actually, thanks for asking this stuff, I just realised I could probably save a lot of space in the search index table if I got rid of the revisions that weren't active.
Every edit of a comment is a revision, all revisions are indexed... but there we are, that's not needed... only those that could be viewed (revisions.is_current = TRUE) need to be in the search_index table.
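A sketch of that cleanup (the only names confirmed here are search_index and revisions.is_current; the revision_id join column is an assumption). The insert trigger could also be narrowed with a WHEN (NEW.is_current) condition so non-current revisions never land in the index in the first place.
```sql
-- Sketch only: revision_id as the join column is an assumption; the
-- names confirmed above are search_index and revisions.is_current.
DELETE FROM search_index si
 USING revisions r
 WHERE si.revision_id = r.revision_id
   AND r.is_current = FALSE;
```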
-
• #6433
No probs. Glad to be the rubber duck.
-
• #6434
Possible small bug - or might be something to do with limiting inputs?
When I use the embed image button, then select the text and use the link button, I get
[!\[\](image.png)](link.com)
when what I want is
[![](image.png)](link.com)
-
• #6435
Nice and subtle.
The escaping is correct in general: if the link text weren't escaped and a piece of content contained square brackets, the link would break. But it's wrong for the exact scenario of an image.
Good find.
-
• #6436
The thread below causes my iPhone browser to have an eppy*;
https://www.lfgss.com/conversations/307085/
Is it all the images in it?
*technical term
-
• #6437
Does it eventually display?
If so, could you download an image using the iPhone browser and tell me what file extension the downloaded image has.
-
• #6438
Nope - it gives up.
-
• #6439
https://lfgss.microcosm.app/api/v1/files/2e24e2e1ae326a49b06d9da9794da1c5414205a1.JPG
And that URL in the browser? Does it load? If it loads and you download it, what file extension does it have?
-
• #6440
Not sure what you wanted me to do there; clicking that jpg link above works fine, it loads as a jpg.
-
• #6441
That's what I wanted.
To rule out the webp optimisation that Chrome on other devices has.
It appears that if a DOM element in the HTML has to wait for 10 x 1MB images to load, that is the trigger for Safari to bail.
I wonder if just adding dimensions would help... Give me a moment to turn on machines and try something.
-
• #6443
All fixed, works a treat.
-
• #6444
Interesting.
Thanks for confirming. Safari is odd.
-
• #6445
What happened to the increased file upload size? I tried to upload a bike tag this week that was 3.7MB and it borked on me.
-
• #6446
Uploading a 3.5MB pic via (an Android) mobile doesn't seem to be working
-
• #6447
Same problem here (Chrome on Windows). I get a server error with larger files again; even though you get an error, the text posts, just the attachments don't appear.
-
• #6448
Pic uploading is still an issue from a laptop - seems files must be under 3MB again to upload
-
• #6449
This is just a minor stupid thing, but when you tag yourself (for example, when you quote someone else who tagged you), you get an e-mail notification. Maybe tagging yourself could be exempt from notifications? I know, seriously niche and irrelevant, but just a thought.
-
• #6450
Anyone know how I can search a thread for all entries by one user?
-
Large = the DB files physically cannot fit on one box, or (write or read) operations exceed the capability of one box.
We're a long way short of that.
I operate a single node with failover. I presently put all reads and writes through the master, and just use a DNS record for the address; the failover mechanism switches that to the slave, which becomes the new master.
I do not have read slaves, and don't have a traditional cluster... just a single Linode with a 12GB RAM allocation and 6 CPU cores. It's a balance between computation and enough RAM to hold the indexes. It's Linode's most powerful (computation) and highest-RAM instance for under $100 per month... and we have 2 of them, so it's a huge part of our hosting cost.
For fulltext we do process the relevant fields into a ts_vector with a GIN index, but in a different table I named search_index. That table has some extra fields, the ones that are relevant when fulltext searching, i.e. if people search by text and author, then I put the author id on there too, as well as the text index columns. I only want fulltext search to hit that table, not to do lots of JOINs. I've found the PostgreSQL fulltext to be good enough for 99% of scenarios, which makes it good enough full stop.
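A minimal sketch of that shape (column names beyond search_index itself are assumptions):
```sql
-- Minimal sketch of the described table; column names are assumptions.
CREATE TABLE search_index (
    item_type_id BIGINT   NOT NULL,
    item_id      BIGINT   NOT NULL,
    profile_id   BIGINT   NOT NULL, -- denormalised author id, so no JOIN
    document     TSVECTOR NOT NULL,
    PRIMARY KEY (item_type_id, item_id)
);
CREATE INDEX search_index_document_idx ON search_index USING GIN (document);

-- A text+author search then hits this one table only:
SELECT item_type_id, item_id
  FROM search_index
 WHERE document @@ plainto_tsquery('english', 'fixed gear')
   AND profile_id = 1234;
```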
PostgreSQL is easy to sysadmin, easy to run, simple to configure for failover, easy to partition, easy to shard, easy enough to cluster, easy to set up read slaves.
It's easy.
It only really gets messy if you go overboard on either triggers or stored procs. DBA rules still apply: I treat the database as the owner of data integrity, nothing more. Meaning I permit some triggers and funcs, but only where they help enforce data integrity. I leave computation to the application calling the database.