The site started experiencing issues at 09:36. We rebooted the affected server and service was restored by 09:42, but the server continued to misbehave. We replaced it with a larger server and updated the configuration. We are continuing to monitor the situation.
Site issues occurred at the following times:
09:36 to 09:42 (first event)
09:55 to 10:00 (second event)
10:14 to 10:23 (replacement of box 1)
11:00 to 11:05 (replacement of box 2)
11:16 to 11:18 (configuration change)
12:04 to 12:14 (further event)
A code change to reduce memory usage will be deployed this afternoon, and an upgrade to an element of our infrastructure is scheduled for 06:00 tomorrow.