  • Yeah, we're aware of this and are monitoring it.

    But... we haven't changed anything that should affect this, which makes it extremely hard to debug: there's no obvious "roll back that change and it will be fixed".

    We're also hazy on precisely when it started. The candidate dates are the beginning of the month, around the 15th of September, and around 11pm on the 20th of September. Those are the points on our graphs where various changes in performance start to show, though we don't know which of those changes relate to the 504 errors.

    The most likely start of the issue is the change observed at 11pm on the 20th of September. However, no code was deployed or modified around that time, which is why we're struggling to determine the root cause.

  • The start of the month, the 15th, and the 20th are all around the times that the sleb photo leaks happened. Someone's hosting naughty pictures on Cloudflare servers and that's sucking up all the bandwidth, I'd guess.

  • Nah, this is definitely related to our servers.

    That last event, on the 20th... since then the Django instance (which runs the frontend) has been maxing out its RAM.
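
    A quick way to sanity-check that is to log the frontend process's resident memory over time and see whether it climbs steadily or jumps at a specific point. This is only a minimal sketch: it assumes a Python process, the third-party `psutil` package, and the 60-second interval is a placeholder, not part of our setup.

    ```python
    # Minimal memory watcher: prints this process's resident set size (RSS)
    # at a fixed interval so a leak shows up as a steady climb in the logs.
    import os
    import time

    import psutil  # third-party; assumed available


    def log_rss(interval_seconds: float = 60.0) -> None:
        """Print this process's RSS every `interval_seconds` seconds."""
        proc = psutil.Process(os.getpid())
        while True:
            rss_mb = proc.memory_info().rss / (1024 * 1024)
            print(f"pid={proc.pid} rss={rss_mb:.1f} MiB", flush=True)
            time.sleep(interval_seconds)


    if __name__ == "__main__":
        log_rss()
    ```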
