Subtle changes, bugs and feedback

  • @Velocio just tested and I broke it even worse. Input value updates ok but doesn't seem to submit on form send :(

    I swear, I am trying to help ... I'm making a mess of all things I touch atm, job or life otherwise :(

  • Hah... don't worry about it... and thank you for the donation.

    Seriously don't worry about it. On this... the platform is run on a shoestring and yup I code in prod.

    I'll restore what was there originally... and I'll take a look after my vacation in a couple of weeks.

  • Seeing some weird behaviour on search.

    On the weekly photography thread (e.g. https://www.lfgss.com/conversations/282005/?offset=10275) click one of the #summertime hashtags (or this one) and you'll only get two or three results (when there should be at least 13).

    Then skip to the last page and click the "in_the_park" hashtag, and there are no results.

  • That is weird... and I've changed nothing (of course... as very little has changed in a long while).

    But it appears that the #in_the_park one was being treated like a search for #in and PostgreSQL seems to treat that like a pure English search... meaning `in` hits the stopword list. So I adjusted the hash tag to be #inthepark and the search now runs, but I don't see the results.

    This and the #summertime results should show... and it can't be a lack of indexing as it's all internal to PostgreSQL and it's not a failure to update some external search index (which would typically be what happened on most websites).

    I'll look into it whilst promising nothing at all.

  • This is what the system thinks....

    Here's your comment: https://www.lfgss.com/comments/16031751/

    And the search index says:

    Cheers. Thinking of keeping it local and as shamelessly indolent as possible so though it's a repeat, how are we with [#inthepark](https://www.lfgss.com/search/?q=%23inthepark)
    
    'cheer':1 'indol':10 'inthepark':23 'keep':4 'local':6 'possibl':12 'repeat':18 'shameless':9 'think':2 'though':14
    

    Here's my comment: https://www.lfgss.com/comments/16031937/

    And the search index says:

    But it appears that the [#in_the_park](https://www.lfgss.com/search/?q=%23in_the_park) one was being treated like a search for [#in](https://www.lfgss.com/search/?q=%23in) and PostgreSQL seems to treat that like a pure English search... meaning `in` hits the stopword list. So I adjusted the hash tag to be [#inthepark](https://www.lfgss.com/search/?q=%23inthepark) and the search now runs, but I don't see the results.
    
    'adjust':56 'appear':22 'chang':7,15 'cours':10 'english':46 'extern':107 'failur':103 'happen':115 'hash':58 'hit':50 'index':90,109 'intern':95 'inthepark':62 'lack':88 'like':32,43 'list':53 'littl':13 'll':120 'long':18 'look':121 'mean':48 'noth':8,126 'one':28 'park':27 'postgresql':38,97 'promis':125 'pure':45 'result':74,79 'run':67 'search':34,47,65,108 'see':72 'seem':39 'show':81 'stopword':52 'summertim':78 'tag':59 'treat':31,41 'typic':112 'updat':105 've':6 'websit':118 'weird':3 'whilst':124 'would':111
    

    You can see that inthepark is present in both. So it's not the English search... and the plain text version is also in that... perhaps it's that though... I need to remember how the hell I implemented hashtag searching.

  • Man, you used to be cool.

    ;)

  • Hashtag searching is here: https://github.com/microcosm-cc/microcosm/blob/master/models/search_fulltext.go#L158-L168

    	var filterHashTag string
    	for _, hashtag := range m.Query.Hashtags {
    		filterHashTag += `
                  AND si.` + fullTextScope + `_text ~* '\W` + hashtag + `\W'`
    	}
    

    So that search is happening on the plain text bit just above the text index.

    What is \W in regular expressions? That matches any non-word character.

    Ah... so hypothesis time... what we're seeing is that if the hashtag sits at the very beginning or very end of the text, and so lacks one of these non-word characters on that side, then this search doesn't match.

    Let's test that.
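
A minimal sketch of such a test in Go, not the code actually run on the site. It assumes the generated condition has the shape `\W<hashtag>\W`, that the hashtag string includes its leading `#`, and that Go's `regexp` package with `(?i)` is a fair stand-in for PostgreSQL's case-insensitive `~*`. The helper name `oldPattern` and the sample strings are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// oldPattern mimics the shape of the original filter: the hashtag must have
// a non-word character on BOTH sides. The name is hypothetical, and
// QuoteMeta is added here only to keep the sketch safe for arbitrary input.
func oldPattern(hashtag string) *regexp.Regexp {
	return regexp.MustCompile(`(?i)\W` + regexp.QuoteMeta(hashtag) + `\W`)
}

func main() {
	re := oldPattern("#summertime")
	for _, text := range []string{
		"lovely #summertime ride", // surrounded by spaces: true
		"#summertime ride",        // at the very start: false
		"lovely #summertime",      // at the very end: false
	} {
		fmt.Printf("%q -> %v\n", text, re.MatchString(text))
	}
}
```

Only the first string matches, which is consistent with the hypothesis: a hashtag with nothing before it or nothing after it has no character for `\W` to consume.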

  • Test confirms it... the regex has been wrong for the last 8 years.

    It really should be a word or line boundary at either side, not a non-word character.

  • Workaround... put a space before and after a hashtag.

  • Man, you used to be cool.

    Probably high back then, or drunk.

  • I know how to fix it... but I'm going to have to do it later as I'm going into a meeting.

    In essence... I found the hashtags using a regex, and then I take the input and attempt to be clever when searching for it... but so long as I sanitise the input against the original regex that found the hashtag I can actually just do a plain text search and not a regex search... and that will match.
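
The idea above could be sketched roughly like this in Go. Everything named here is an assumption: `hashtagShape` is a guess at what a hashtag-finding regex might look like (the real one is not shown in this thread), and `containsHashtag` is a hypothetical helper that works on an in-memory string rather than on a database column.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// hashtagShape is an ASSUMED stand-in for the regex that originally finds
// hashtags in posts; the real expression is not shown in this thread.
var hashtagShape = regexp.MustCompile(`^#[A-Za-z0-9_]+$`)

// containsHashtag is a hypothetical helper: it rejects input that does not
// look like a hashtag, then does a case-insensitive plain-text search
// instead of a regex search.
func containsHashtag(text, hashtag string) (bool, error) {
	if !hashtagShape.MatchString(hashtag) {
		return false, fmt.Errorf("not a valid hashtag: %q", hashtag)
	}
	return strings.Contains(strings.ToLower(text), strings.ToLower(hashtag)), nil
}

func main() {
	found, err := containsHashtag("#summertime in the park", "#summertime")
	fmt.Println(found, err)
}
```

One trade-off of a pure substring search is that `#summer` would also be found inside `#summertime`, so some form of boundary check may still be wanted.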

  • I’ve somehow sleep walked into the forum and marked M&M as ignore.

    How do I change that on an iPhone?

    Also recurring ‘why do I hate the Mark Everything as Read’ button comment.

  • I’ve somehow sleep walked into the forum and marked M&M as ignore.

    How do I change that on an iPhone?

    https://www.lfgss.com/ignored/

    Then visit the thing you wish not to ignore, and in the right-hand side unignore it.

  • Ta, and with the Fastly outage I thought I’d managed to put most of the internet on ignore too.

    #earlyboomer

  • Actually... I'm going to do this: document_text ~* '(^|\W)#summertime(\W|$)', which handles it fine.
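
A small Go sketch of how that expression behaves, assuming Go's `regexp` with `(?i)` approximates PostgreSQL's `~*` closely enough for this pattern. The helper name `fixedPattern` and the sample strings are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
)

// fixedPattern builds the anchored form quoted above: the hashtag may sit at
// the start or end of the text, or next to any non-word character.
// The name is hypothetical; QuoteMeta is a precaution added for the sketch.
func fixedPattern(hashtag string) *regexp.Regexp {
	return regexp.MustCompile(`(?i)(^|\W)` + regexp.QuoteMeta(hashtag) + `(\W|$)`)
}

func main() {
	re := fixedPattern("#summertime")
	for _, text := range []string{
		"#summertime",              // whole text: true
		"lovely #summertime",       // at the end: true
		"#Summertime ride",         // at the start, other case: true
		"first line\n#summertime",  // after a newline: true
		"all the #summertimes",     // longer tag: false
		"x#summertime",             // glued to a word: false
	} {
		fmt.Printf("%q -> %v\n", text, re.MatchString(text))
	}
}
```

A newline counts as a non-word character, so hashtags at the start of a line are covered by `\W` without needing multi-line anchors.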

  • Resolved.

  • So would #in_the_park work now?

    Edit: Doesn't get recognised as a hashtag, it seems.

    Edit 2: Added hash ...

  • Hash tags need hashes.

    #in_the_park

    The answer is yes... but performance is way worse than:

    #inthepark

    And :shrug, good enough for me.

    Hash tags without underscores are faster!

  • Oh dear, and I didn't even notice I hadn't put a hash in ... d'oh.

  • Sweet!

    What about just \b for word boundaries?

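  • ...no not quite. A # is not a word character, so \b in front of it only matches when the character before the # is a word character. That means x#y would match /\b#y\b/, while a hashtag after a space or at the start of the text would not.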
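
To see how `\b` behaves around a `#`, here is a small check in Go, whose `regexp` package treats `\b` as an ASCII word boundary. The variable name `boundaryRe` is made up for the example. PostgreSQL's own regex flavour spells the word-boundary escape `\y` rather than `\b`, but the same reasoning should apply.

```go
package main

import (
	"fmt"
	"regexp"
)

// boundaryRe is the pattern from the post above. \b only matches between a
// word character and a non-word character (or the text edge), and '#' is a
// non-word character.
var boundaryRe = regexp.MustCompile(`\b#y\b`)

func main() {
	for _, text := range []string{
		"x#y", // word char before '#': true (the unwanted match)
		" #y", // space before '#': false
		"#y",  // '#' at start of text: false
	} {
		fmt.Printf("%q -> %v\n", text, boundaryRe.MatchString(text))
	}
}
```

So `\b` on the left matches exactly the case a hashtag search wants to exclude and misses the cases it wants to include.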

  • Also... I think there's a bug.

    It looked to me like a hashtag in a code block would be turned into a link, when it should be left alone.

    [#isthisalink](https://www.lfgss.com/search/?q=%23isthisalink)
    

    Yup... subtle bug.

  • I think we have success... I've applied the latest version and it works!

    I won't mention the thing that doesn't work... as the main thing works.

  • I won't mention the thing that doesn't work... as the main thing works.

    You can't do that! Now you have to tell me.


Posted by @Velocio
