
canonical-ci-engineering team mailing list archive

Log incidents as you firefight

 

Just a gentle reminder to log devices going down, or any other kind of
service interruption, in
https://wiki.canonical.com/UbuntuEngineering/CI/IncidentLog as a new
line in the table. I realise the need to resolve issues quickly drives
us all away from paperwork, but knowing where we're spending our time
with the fire extinguisher will help us iterate on the infrastructure
to spare us the headaches in the future.

If you do feel it would be useful to write some recommendations and
provide a more in-depth timeline of an issue, Vincent has done a
stellar job covering the downtime we experienced yesterday. I
recommend it as a guide for future reports:

https://wiki.canonical.com/UbuntuEngineering/CI/IncidentLog/2013-10-16

I'll be combing through these and coming up with some actions. If you
want to have more of a discussion around any of these incidents or
what we should do differently in the future, do let me know.

Thanks!

