I mean... it might be as simple as removing the update_id and update_type_id from the UNION so that the GROUP BY merges the duplicates.
Hmm... no, wouldn't work... because then I could never determine how to render the items as "someone mentioned you" etc... nor which email notifications to send.
Maybe I could do all of the queries individually in advance and implement UNION in code once I've read them, or even in a SQL query which chooses rows WHERE NOT EXISTS.
Hmm... no, wouldn't work. On the first page this would be fine, but once you start paginating updates each query would track a different offset depending on the number of items for that type of update.
Hmm.
I can't think of any obvious way to avoid it within the query.
But maybe it can just be done in code when rendered... it would lead to uneven pages where if duplicates were discovered one was removed. So sometimes on these edge cases there would be 23 updates on a page rather than 25 updates... but this is probably OK, you might never even notice the unequal page lengths. It wouldn't solve duplicates that spanned pages, but again you probably wouldn't notice.
So it seems like this could be done at render time... keep updates only if the item hasn't already been shown.
Which leads to another question... what to keep? Probably the most specific... i.e. keep mentions above "someone commented", keep "you posted" above "new thread in forum".
To do that probably needs a double-parsing within the API that returns the updates. This is doable, at that point the slice is only 25 items and all in memory so it wouldn't hurt, and this is only on the updates page as the search page doesn't have this problem.
So that's an idea of how this could be done... remove duplicates within the API post-query but pre-render.
I mean... it might be as simple as removing the
update_id
andupdate_type_id
from theUNION
so that theGROUP BY
merges the duplicates.Hmm... no, wouldn't work... because then I could never determine how to render the items as "someone mentioned you" etc... nor which email notifications to send.
Maybe I could do all of the queries individually in advance and implement
UNION
in code once I've read them, or even in a SQL query which chooses rowsWHERE NOT EXISTS
.Hmm... no, wouldn't work. On the first page this would be fine, but once you start paginating updates each query would track a different offset depending on the number of items for that type of update.
Hmm.
I can't think of any obvious way to avoid it within the query.
But maybe it can just be done in code when rendered... it would lead to uneven pages where if duplicates were discovered one was removed. So sometimes on these edge cases there would be 23 updates on a page rather than 25 updates... but this is probably OK, you might never even notice the unequal page lengths. It wouldn't solve duplicates that spanned pages, but again you probably wouldn't notice.
So it seems like this could be done at render time... keep updates only if the item hasn't already been shown.
Which leads to another question... what to keep? Probably the most specific... i.e. keep mentions above "someone commented", keep "you posted" above "new thread in forum".
To do that probably needs a double-parsing within the API that returns the updates. This is doable, at that point the slice is only 25 items and all in memory so it wouldn't hurt, and this is only on the updates page as the search page doesn't have this problem.
So that's an idea of how this could be done... remove duplicates within the API post-query but pre-render.