How can bugs be appearing a few weeks in when it worked fine to begin with? Is it just the extra traffic?
We had issues with timeouts, which most of you noticed at some point in the last week. To try to resolve them we had to run experiments at various levels of our stack, and unfortunately a few of those experiments introduced side effects of their own.
Specifically, we use a Python package called requests to make HTTP requests from the front-end to the back-end, and we use a package called grequests to speed this up by making multiple HTTP requests simultaneously.
We updated grequests at 13:17 on Friday, thinking it was a likely contender for where the timeouts were occurring. As a side effect, the version of grequests we updated to failed to send the Content-Type header when attachments were added to multipart forms.
The attachments bug came from that: without the Content-Type we couldn't tell what type a file was, and so couldn't do some essential things (like ensuring we stored the correct mimetype).
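To illustrate why the missing header matters, here is a minimal sketch of how a back-end might work out an attachment's mimetype. It is not our actual code: the function name `resolve_mimetype`, the plain-dict shape of the part headers, and the fallback to guessing from the file extension are all assumptions made for the example.

```python
import mimetypes


def resolve_mimetype(part_headers, filename):
    """Return (mimetype, source) for one uploaded multipart part.

    Hypothetical helper: `part_headers` is assumed to be a plain dict of
    the headers sent with that part of the multipart form.
    """
    # Header names are case-insensitive, so normalise before the lookup.
    headers = {name.lower(): value for name, value in part_headers.items()}
    declared = headers.get("content-type")
    if declared:
        # Drop parameters such as "; charset=utf-8" and keep the bare type.
        return declared.split(";")[0].strip().lower(), "declared"

    # No Content-Type was sent: the best we can do is guess from the
    # file extension, which is less trustworthy than a declared type.
    guessed, _encoding = mimetypes.guess_type(filename)
    if guessed:
        return guessed, "guessed"

    # Nothing to go on at all, so fall back to a generic binary type.
    return "application/octet-stream", "unknown"
```

With the header present the declared type is used directly. Without it, the code can only guess, and a file with no recognisable extension ends up as a generic binary blob, which is the kind of situation the attachments bug put us in.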
Ergo... the timeouts issue led to other changes, which had unintended side effects, which then caused new bugs to appear.
We've now rolled those changes back.