The things you can do with search

Posted on
Page
of 9
  • Tick "comments", before unticking "title" perhaps?

  • Now I don't have the "comments" option at all. I have

    Types to display

    • Conversations (unticked)
    • Events (unticked)

    Filters

    • forumId = 549 (ticked)
    • Following (unticked)
    • Title matched search terms (ticked)

    Order by

    • Relevancy (radio button unselected)
    • Recent updates (radio button selected)
  • I don't have the "comments" option at all.

    Sorry, I didn't realise you only had "bronze" membership.

  • I have a poo-brown badge on my avatar when I log in. That lets me look at 50 posts a week and I get sent an email on my birthday saying "YOUR A CUNT" in the subject line.

  • "YOUR A CUNT"

    Sorry!

    The proof-reading is subbed out to Ukraine these days and we is having comms trouble.

  • Thank you, that's really helpful.

    The bug happens when the forumId is added, that slows it down. Part of the way I've implemented that must no longer be good, probably since I added the sub-forums stuff. I'll look into it.

    It is blindingly fast without the forumId: https://www.lfgss.com/search/?q=%22mks%20gr9%22+sort:date

  • Thanks - to cut through the complexity, it's basically a search within a Classifieds sub forum to find a certain item. Often the items are not called out specifically in titles, as people title their thread "January Clearout" and so on.

  • I've got to the bottom of this... it can be summed up as "counting is hard".

    The actual search takes less than 2 seconds, but counting how many search results there were takes about a minute. There are only 52 results, but the search has to be fully generated and stored as a temporary table for the count to happen... and that is apparently costly.

    I'll need to split up search so that it does the main search as one query, and does the counting as another query. I'll see if I can do it this evening.

  • I did some tuning of the search whilst I was there... I've probably saved 2ms off a search that takes 600ms. Oh well.

  • Magnificent! I do appreciate your dedication.

  • I've just confirmed that running it as one query (do the search and figure out how many results there are whilst only returning the first 25) takes 1 minute and 38 seconds.

    Running it as two queries (do the search, return 25 rows and then figure out how many results there are) takes 2.5 seconds in total.

    Definitely will change this.

    Though... there is a risk that the search will be slower on subsequent pages. i.e. it may take towards 15-20 seconds (and risk timing out on some queries that return very large result sets) to fetch page 2 or more.

    In essence I had chosen to implement a way of searching in which the cost for all pages was equal, but it's probably best to tune for making page 1 very cheap, and subsequent pages expensive.

  • Would that mean that while a user is reading page 1, the subsequent pages would be being fetched and...cached for loading?...

    Sorry I'm not massively familiar with this level of coding.

  • Ooh, that's a good idea.

    It didn't mean that.

    It meant that "page 1 is fast", "pages 2 and onwards are a little slower".

    But... hey, I could prefetch the next search page and then when you click the link the browser either has it, or the server has already cached the query... either way that would be fast.

    Cool.

    So my changes this evening (when I get home) will be:

    1. Split the search query into 2 queries, one that does the search and one that counts the results.
    2. Add a prefetch meta header to the search results page.
    3. Entirely unrelated, add a meta header to prevent sending referrer headers off-site, which improves privacy.
  • Made the major change... I'll see how it goes.

    Your original query: https://www.lfgss.com/search/?q=mks%20gr9+forumId:549+sort:date

  • Cheers ^

    It worked on mobile last night but is throwing a server error for me now on both mobile and desktop...

  • OK, I have another approach... I'll try this out this weekend...

    1. WITH m AS (
    2. SELECT m.microcosm_id
    3. FROM microcosms m
    4. LEFT JOIN permissions_cache p ON p.site_id = m.site_id
    5. AND p.item_type_id = 2
    6. AND p.item_id = m.microcosm_id
    7. AND p.profile_id = 47686
    8. LEFT JOIN ignores_expanded i ON i.profile_id = 47686
    9. AND i.item_type_id = 2
    10. AND i.item_id = m.microcosm_id
    11. WHERE m.site_id = 234
    12. AND m.is_deleted IS NOT TRUE
    13. AND m.is_moderated IS NOT TRUE
    14. AND i.profile_id IS NULL
    15. AND m.microcosm_id = 549
    16. AND (
    17. (p.can_read IS NOT NULL AND p.can_read IS TRUE)
    18. OR (get_effective_permissions(234, m.microcosm_id, 2, m.microcosm_id, 47686)).can_read IS TRUE
    19. )
    20. ), sr AS (
    21. SELECT si.item_type_id
    22. ,si.item_id
    23. FROM search_index si
    24. ,plainto_tsquery('mks gr9') AS query
    25. WHERE si.site_id = 234
    26. AND si.document_vector @@ query
    27. AND si.microcosm_id IN (SELECT microcosm_id FROM m)
    28. )
    29. SELECT total
    30. ,item_type_id
    31. ,item_id
    32. ,parent_item_type_id
    33. ,parent_item_id
    34. ,last_modified
    35. ,rank
    36. ,ts_headline(document_text, query) AS highlight
    37. ,has_unread(item_type_id, item_id, 47686)
    38. FROM (
    39. SELECT COUNT(*) OVER() AS total
    40. ,f.item_type_id
    41. ,f.item_id
    42. ,f.parent_item_type_id
    43. ,f.parent_item_id
    44. ,f.last_modified
    45. ,ts_rank_cd(si.document_vector, query, 8) AS rank
    46. ,si.document_text
    47. ,query.query
    48. FROM sr
    49. JOIN search_index si ON sr.item_type_id = si.item_type_id
    50. AND sr.item_id = si.item_id
    51. JOIN flags f ON f.item_type_id = sr.item_type_id
    52. AND f.item_id = sr.item_id
    53. LEFT JOIN ignores i ON i.profile_id = 47686
    54. AND i.item_type_id = f.item_type_id
    55. AND i.item_id = f.item_id
    56. ,plainto_tsquery('mks gr9') AS query
    57. WHERE f.site_id = 234
    58. AND i.profile_id IS NULL
    59. AND f.microcosm_is_deleted IS NOT TRUE
    60. AND f.microcosm_is_moderated IS NOT TRUE
    61. AND f.parent_is_deleted IS NOT TRUE
    62. AND f.parent_is_moderated IS NOT TRUE
    63. AND f.item_is_deleted IS NOT TRUE
    64. AND f.item_is_moderated IS NOT TRUE
    65. AND (
    66. -- Things that are public by default
    67. COALESCE(f.parent_item_type_id, f.item_type_id) = 3
    68. OR -- Things in microcosms
    69. COALESCE(f.microcosm_id, f.item_id) IN (SELECT microcosm_id FROM m)
    70. )
    71. ORDER BY f.last_modified DESC
    72. LIMIT 25
    73. OFFSET 0
    74. ) r

    That correctly returns 52 results, in the right forum, bearing in mind the scope of the search and any ignores that you may have in place.

    Oh, and it does it in 111 milliseconds.

  • Notes to myself beyond the above...

    1. UPDATE search_index si
    2. SET microcosm_id = f.microcosm_id
    3. FROM flags AS f
    4. WHERE si.item_type_id = 4
    5. AND si.microcosm_id IS NULL
    6. AND si.parent_item_type_id != 5
    7. AND si.parent_item_type_id != 3
    8. AND f.item_type_id = si.parent_item_type_id
    9. AND f.item_id = si.parent_item_id;
    10. CREATE INDEX ON search_index (microcosm_id ASC NULLS LAST);
    11. -- In the search query, does this even make sense?
    12. COALESCE(f.microcosm_id, f.item_id) IN (SELECT microcosm_id FROM m)

    Also update update_revisions_search_index() to set the microcosm_id when a comment revision is posted.

  • What language is that? I can only recognise python and javascript!

  • Alright... I think I have it nailed.

    I hope I do anyway.

    Let's see.

  • By jove I think it's worked!!!!

    rep rep rep rep

  • I know.

    I'm all kinds of pleased with it.

    But I'm also looking at a search query that makes my head spin... not least because what you see above is one permutation of it, and it can actually build itself in hundreds of different ways and I needed to make sure I hadn't slowed down some of those other ways.

    Searching things that have attachments is faster now too.

    Am pleased with myself. Will go to bed smug as no-one has yet reported something has broken. If I can make it through my birthday without reports of me having broken stuff, it will be a good day.

  • @Velocio, the search function I find most useful - searching for a thread within a sub-forum - seems to always give me this these days. Wassup with that?

    Server Error
    We're sorry, there was a temporary error when we tried to process your request.

    The reason was: Too many results to process, please make your search more specific.

    Click here to go back to the homepage.

  • I wanted to see if a youtube video had already been posted on here, so I tried searching by just the video ID (because the rest of the URL can take a number of forms), but it doesn't seem to work.

    Am I missing something?

  • That basically does not work, as the search is designed for language and not links.

    But in theory I could make it work, as I already store every link in a different table. Could probably add it by adding a link:8mQxum-D­EH4 which would find all places this video were embedded: https://www.lfgss.com/comments/13061166/

  • Yeah, that was the video. I assume it's been spammed all over LFGSS by now. I did try a google search on LFGSS though, and found nothing, but Google's caches are never fully up to date...

    Have you considered Elastic search? People tell me it's the mutt's nuts. I've been meaning to take a look at it.

  • Post a reply
    • Bold
    • Italics
    • Link
    • Image
    • List
    • Quote
    • code
    • Preview
About

The things you can do with search

Posted by Avatar for Velocio @Velocio

Actions