canonical-ci-engineering team mailing list archive
Message #00089
We now have statsd working from m-o
It's now possible to send metrics on m-o via statsd such that they
show up on http://graphite.engineering.canonical.com.
Have a think of the ways you'd want to gauge how well the CI services
are performing after moving to 1SS, and then try to get those written
as metrics. Generally you want to write alarms on services being down
as nagios checks, but I think a before-and-after picture will help us
understand how well the move went, which services are still wobbly,
etc.
Think of how you would explain to another team the improvements you've
made to the services you're responsible for in terms of numbers, for
example: Foobar is now 30% faster at reticulating splines, or Baz lands
10 branches per hour, on average. Think of how we'd be able to identify
growing pressure on the pain points of our services. When a service is
performing poorly, what graphs are going to help you find out what's
really going on? Get all of these written in the wiki, and start
adding calls out to statsd as you find the time:
https://wiki.canonical.com/UbuntuEngineering/CI/Metrics
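A claim like "30% faster" needs a timing recorded around the operation both before and after the change. Here is a minimal sketch of one way to do that, assuming the statsd timing line format (key:value|ms). The timed() helper, its emit hook, and the 'splines.reticulate' key are hypothetical names for illustration; this is not the txstatsd API.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(key, emit=print):
    """Time the enclosed block and emit a statsd timing line for it.

    `emit` is a hypothetical hook: any callable that accepts the
    formatted line (for example, something that sends it to statsd).
    It defaults to print so the sketch runs on its own.
    """
    start = time.monotonic()
    try:
        yield
    finally:
        # statsd timings are expressed in whole milliseconds.
        elapsed_ms = int((time.monotonic() - start) * 1000)
        emit("ubuntu-ci.%s:%d|ms" % (key, elapsed_ms))


if __name__ == "__main__":
    with timed("splines.reticulate"):
        time.sleep(0.05)  # stand-in for the real work
```

Wrapping the slow path this way before the move gives you the "before" half of the picture for free.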
Remember that this is all about making it really easy for you as a
developer to add new metrics, so don't block on me. If there's
something you think would be useful to measure, just start measuring
it.
For more on the why of 'measure anything, measure everything', this
makes for good reading:
http://codeascraft.com/2011/02/15/measure-anything-measure-everything/
== Instructions ==
If you're working in Python, please use the txstatsd client library.
The 1.0 version is in the IS archive:
http://archive.admin.canonical.com/pool/main/p/python-txstatsd/
An example of setting it up can be found in lp:daisy:
http://bazaar.launchpad.net/~daisy-pluckers/daisy/trunk/view/head:/daisy/metrics.py#L11
Use 'ubuntu-ci' as the namespace in place of 'whoopsie-daisy'. The port
is 10041 and the host is snakefruit.canonical.com.
You'll then have meters, gauges, and timings at your disposal:
https://github.com/etsy/statsd/blob/master/docs/metric_types.md
For the exact function format, consult the library code:
https://github.com/sidnei/txstatsd/blob/master/txstatsd/metrics/metrics.py
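For a feel of what these metric types look like on the wire, here is a minimal sketch of the plain-text line format from the statsd docs linked above. It only covers counters (c), gauges (g) and timings (ms). The helper names and the 'queue.depth' and 'splines.reticulate' keys are made up for illustration, and none of this is the txstatsd API; for that, the library code remains the reference.

```python
# Sketch of the statsd line format: <key>:<value>|<type>
# The helper names are hypothetical and are not part of txstatsd.

NAMESPACE = "ubuntu-ci"  # namespace given in this message


def counter(key, delta=1):
    """Format a counter increment, e.g. 'ubuntu-ci.testing.key:1|c'."""
    return "%s.%s:%d|c" % (NAMESPACE, key, delta)


def gauge(key, value):
    """Format a gauge: a value recorded as-is until the next update."""
    return "%s.%s:%s|g" % (NAMESPACE, key, value)


def timing(key, milliseconds):
    """Format a timing: how long something took, in milliseconds."""
    return "%s.%s:%d|ms" % (NAMESPACE, key, milliseconds)


if __name__ == "__main__":
    print(counter("testing.key"))
    print(gauge("queue.depth", 42))
    print(timing("splines.reticulate", 320))
```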
Note that you can also drive this from the shell:
evand@magners-orchestra:~$ echo "ubuntu-ci.testing.key:1|c" | nc -w 1 -u snakefruit.canonical.com 10041
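If you would rather do the same thing from Python without pulling in a client library, a rough equivalent of the shell command using only the standard library might look like this. The send_metric() name is hypothetical; the host and port are the ones given above. For real services txstatsd is still the recommended route.

```python
import socket

STATSD_HOST = "snakefruit.canonical.com"  # host given in this message
STATSD_PORT = 10041                       # port given in this message


def send_metric(line, host=STATSD_HOST, port=STATSD_PORT):
    """Send one statsd line over UDP and return the number of bytes sent.

    UDP is fire-and-forget: a successful return does not confirm that
    statsd received the datagram.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        return sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()
```

Calling send_metric("ubuntu-ci.testing.key:1|c") would send the same counter as the shell example.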
You should then see your results under the Graphite.statsd.ubuntu-ci
tree on the left hand side of
http://graphite.engineering.canonical.com
If a visual example would help, here's a dashboard I created a while
back from the metrics we collect for the Ubuntu error tracker:
http://ubuntuone.com/4tYfL5JsXzKhzoxtQdXf7l
If you need help with any of this, or you have ideas for improvement,
please let me know!
Thanks,
Evan