Subtle changes, bugs and feedback

Posted on Fri 28th, March 2008

Page of 312

First Prev Last Next

• #6426

Velocio in reply to @pascalo

Large = db files physically cannot fit on one box or (write or read) operations exceed capability of one box.

We're a long way short of that.

I operate a single node with failover. I presently put all reads and writes through the master, and am just using a DNS record for the address and the failover mechanism will go to the slave, and that becomes the new master.

I do not have read slaves, and don't have a traditional cluster... just a single Linode with a 12GB RAM allocation offering 6 CPU cores. It's a balance between computation and enough RAM to hold indexes. This is Linodes most powerful (computation) and high RAM instance for below $100 per month... and we have 2 of them, so it's a huge part of our hosting cost.

Fulltext we do process relevant fields into a ts_vector with GIN index, but in a different table I named search_index. That table has some extra fields, the fields that are relevant when fulltext search, i.e. if people search text and author, then I put the author id on there too as well as as the text index columns. I only want full text search to hit that table, not to do lots of JOINs.

I've found the PostgreSQL fulltext to be good enough for 99% of scenarios, which makes it good enough full stop.

PostgreSQL is easy to sysadmin, easy to run, simple to configure for failover, easy to partition, easy to shard, easy enough to cluster, easy to set up read slaves.

It's easy.

It only really gets messy if you go overboard on either triggers or stored procs. DBA rules still apply, I only treat the database as the owner of data integrity. Meaning I permit some triggers and funcs, but only where they help enforce data integrity. I leave computation to the application calling the database.
• #6427

pascalo

That's great info, thanks. I really didn't feel like managing and DBA are my strengths at all, that's why I pressed for RDS. So far it's been good. We run reporting stuff and searching etc off a read replica. Putting fulltext into a separate table is a clever move. Will keep that in mind when it comes to refactor. Actually, I won't, as I just handed in notice :p
• #6428

Velocio
I put the fulltext stuff in a separate table to give us options:
- If we used an external fulltext thing like elasticsearch, how easy would it be to strip fulltext searching out of the database?
- If we required time as a parameter to fulltext searches, could I have a path to increasing performance of time-bounded queries (partition by time)
But I think it's been super useful anyway. The text on this site is largely static, so I'm able to keep the vacuuming separate, and the backups are easier to push across the wire because that stuff hasn't changed, etc.

The search_index table we have holds all fulltext... from sites, forums, threads, comments, PMs... everything. Which is how searching across multiple types of things is so easy. Each type of thing has it's own trigger to push a copy into the full text index table.
• #6429

pascalo

Makes sense. Thank you so much for the awesome insight. I'm taking notes here.
So do you generate the ts_vector entries off events in the api layer per code, or via triggers etc?
• #6430

pascalo

Guess i know the answer: https://github.com/microcosm-cc/microcosm/blob/bbdaa6eacbedd830a52b34e97844fd2e88a06e8c/db/migrations/20151004160923_parent_microcosms.sql#L158
• #6431

Velocio in reply to @pascalo

Triggers... because this is data integrity :)

Though that file is for adding forum hierarchies, the commit you want for most of the full text is this one:
https://github.com/microcosm-cc/microcosm/blob/bbdaa6eacbedd830a52b34e97844fd2e88a06e8c/db/migrations/20160224223626_fulltext_search.sql

Note that the weighting varies according to the source table, which is how the profiles and forums get listed higher for exact matches when threads may also match, and also how threads beat comments for exact matches.
• #6432

Velocio

Actually, thanks for asking this stuff, I just realised I could probably save a lot of space in the search index table if I got rid of the revisions that weren't active.

Every edit of a comment is a revision, all revisions are indexed... but there we are, that's not needed... only those that could be viewed (revisions.is_current = TRUE) need to be in the search_index table.
• #6433

pascalo

No probs. Glad to be the rubber duck.
• #6434

branwen
Possible small bug - or might be something to do with limiting inputs?

When I use the embed image button, then select the text and use the link button, I get
```
[!\[\](image.png)](link.com)   
```
when what I want is
```
[![](image.png)](link.com)   
```
• #6435

Velocio in reply to @branwen

Nice and subtle.

It's correct in that if it were not escaped and a piece of content contained square brackets then the link would break. But it's wrong for the exact scenario of an image.

Good find.
• #6436

andyp

The thread below causes my iPhone browser to have an eppy*;

https://www.lfgss.com/conversations/307085/

Is it all the images in it?

*technical term
• #6437

Velocio in reply to @andyp

Does it eventually display?

If so, could you download an image using the iPhone browser and tell me what file extension the downloaded image has.
• #6438

andyp in reply to @Velocio

Nope - it gives up.
• #6439

Velocio

https://lfgss.microcosm.app/api/v1/files/2e24e2e1ae326a49b06d9da9794da1c5414205a1.JPG

And that it'll in the browser? Does it load? If it loads and you download it, what file extension?
• #6440

andyp in reply to @Velocio

Not sure what you wanted me to do there, clicking that jpg link above works fine, it loads a a jpg.
• #6441

Velocio in reply to @andyp

That's what I wanted.
To rule out the webp optimisation that Chrome on other devices has.

It appears that if a DOM element in the HTML has to wait for 10 X 1MB images to load, that is the trigger for Safari to bail.

I wonder if just adding dimensions would help... Give me a moment to turn on machines and try something.
• #6442

Velocio

@andyp try that link now.
• #6443

andyp in reply to @Velocio

All fixed, works a treat.
• #6444

Velocio in reply to @andyp

Interesting.

Thanks for confirming. Safari is odd.
• #6445

mashton

What happened to he increased file upload size? I tried to upload a bike tag this week that was 3.7Mb and it borked on me?
• #6446

moocher

Uploading a 3. 5mb pic via (an android) mobile doesn't seem to be working
• #6447

aggi

Same problem here (Chrome on Windows). Get a server error with larger files again, although even though you get an error the text posts, just the attachments don't appear.
• #6448

moocher

Pic loading is still an issue from a laptop - seems files must be under 3mb again to load
• #6449

Tijs

This is just a minor stupid thing, but when you tag yourself (when you for example quote someone else who tagged you), you get an e-mail notification. Maybe tagging yourself could be exempt from notifying you? I know, seriously niche and irrelevant thing, but just a thought.
• #6450

Backstop

Anyone know how I can search a thread for all entries by one user?

Page of 312

First Prev Last Next

Post a reply
- Bold
- Italics
- Link
- Image
- List
- Quote
- code
- Preview
Formatting Help

Don't worry about formatting, just type in the text and we'll take care of making sense of it. We will auto-convert links, and if you put asterisks around words we will make them bold.

Tips:

Create headers by underlining text with ==== or ----

To *italicise* text put one asterisk each side of the word

To **bold** text put two asterisks each side of the word

Embed images by entering:
![](https://www.google.co.uk/images/srpr/logo4w.png)
That's the hard one: exclamation, square brackets and then the URL to the image in brackets.

* Create lists by starting lines with asterisks

1. Create numbered lists by starting lines with a number and a dot

> Quote text by starting lines with >

Mention another user by @username

For syntax highlighting, surround the code block with three backticks:

```
Your code goes here
```
Just like Github, a blank line must precede a code block.

If you upload more than 5 files we will display all attachments as thumbnails.

For a full reference visit the Markdown syntax.

Subtle changes, bugs and feedback

About

Subtle changes, bugs and feedback

Actions

Subtle changes, bugs and feedback

Formatting Help

About

Subtle changes, bugs and feedback

Actions

LFGSS