Velocity 2011 (#velocityconf): Look at Your Data

Another interesting keynote was John Rauser's talk about data. His point was clear: the summary metrics we spend so much time looking at are not that helpful on their own. Sure, I often look at the average, p50, p90, p99, and p99.9 times, but those only give you a vague idea of the distribution of the actual data. Using just these metrics, you may very well miss interesting features that occur between p50 and p75. The only way to truly understand the outliers is to actually look at them.

Histograms are typically the best way to look at and understand this data. I want to get back to the office next week and start playing around with some of our service metrics, building histograms that show me what is really going on. I have a problematic service right now that could benefit from this kind of in-depth analysis.
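As a starting point, here is a minimal sketch of the kind of histogram I have in mind, using only the Python standard library. The function names, the 10 ms bin width, and the latency values are all made up for illustration; they are not real service data.

```python
from collections import Counter


def histogram(samples, bin_width):
    """Count samples per fixed-width bin, keyed by each bin's lower edge."""
    counts = Counter(int(s // bin_width) * bin_width for s in samples)
    return dict(sorted(counts.items()))


def render(hist, bin_width, max_bar=40):
    """Draw one text bar per non-empty bin, scaled to the tallest bin."""
    peak = max(hist.values())
    lines = []
    for lo, n in hist.items():
        bar = "#" * max(1, round(n * max_bar / peak))
        lines.append(f"{lo:>6}-{lo + bin_width:<6} {n:>5} {bar}")
    return "\n".join(lines)


# Hypothetical latencies in milliseconds, invented for this example.
latencies_ms = [12, 14, 15, 15, 17, 18, 21, 22, 48, 51, 52, 55, 240]
hist = histogram(latencies_ms, bin_width=10)
print(render(hist, bin_width=10))
```

Even with this toy data, the second cluster around 50 ms and the lone sample at 240 ms are visible in a way a handful of percentiles would not make obvious. Note that this sketch skips empty bins, so gaps between clusters are not drawn to scale.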

John also made the point that there are no tools that really do this for you. He did have a really cool graph that showed a heat map overlaid on the time series data, which gave you an idea of the distribution along with the actual average latencies. I don't think this really shows you the true distribution, though. If there's a place where the curve does not fall off quite as quickly, a color comparison would have a hard time showing it. In any case, I expect to be generating some interesting histograms next week.
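For what it's worth, the data behind a heat map like that is just a histogram per time window. The sketch below is my own guess at how to compute it, not how John built his graph; the function name, the 60-second window, the 10 ms bins, and the sample points are all assumptions for illustration.

```python
from collections import defaultdict


def heatmap_counts(points, window_s, bin_width_ms):
    """Bucket (timestamp_s, latency_ms) pairs by time window and latency bin."""
    grid = defaultdict(lambda: defaultdict(int))
    for ts, latency in points:
        window = int(ts // window_s) * window_s
        lat_bin = int(latency // bin_width_ms) * bin_width_ms
        grid[window][lat_bin] += 1
    # Return plain, sorted dicts so the result is easy to print or compare.
    return {w: dict(sorted(bins.items())) for w, bins in sorted(grid.items())}


# Hypothetical (seconds since start, latency in ms) samples.
points = [(0, 12), (5, 15), (20, 48), (61, 14), (70, 52), (75, 55), (119, 240)]
grid = heatmap_counts(points, window_s=60, bin_width_ms=10)
for window, bins in grid.items():
    print(window, bins)
```

Printing the raw counts for each window sidesteps the color problem: a tail that falls off slowly shows up as actual numbers rather than as slightly different shades.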
