## Variance and Covariance as Geometrical Objects

When I first studied the concept of variance, I was shown a formula with sigma notation. Then I was taught that if you had a bunch of numbers and were asked to find their variance, you just threw the numbers into the formula and computed a final value.

Hardly inspiring, is it?

That was during my undergraduate days. When I started working on my MSc, I read that Fisher, the father of modern statistics, always looked at statistical quantities geometrically rather than analytically. He supposedly developed this skill when he was bedridden and had nothing else to do (so he did statistics in his head by manipulating geometrical objects!).

This is how Fisher understood the variance of a collection of numbers: as the average area of the squares whose sides are the distances between each number and the mean of the collection. Figure 1 shows four data points scattered around a mean value. Each induces a square, which is large if the data point is far from the mean and small if it is close. Adding up the areas of all four squares and dividing by four gives the variance. A small variance therefore indicates that the data points cluster around the mean, while a large variance indicates that quite a few data points deviate from the mean by a large amount.

Figure 1. The variance as the average area of the squares induced by the distance between each data point and the mean.
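To make the picture of squares concrete, here is a minimal Python sketch. The function name and the four data values are my own illustrative choices, not the points drawn in Figure 1. It follows the description above literally and divides by the number of points, so it is the population variance; the sample variance found in many textbooks divides by n - 1 instead.

```python
def variance_as_mean_square_area(points):
    """Average area of the squares whose side is |x - mean| (population variance)."""
    mean = sum(points) / len(points)
    # Each data point induces a square with side (x - mean), so its area is (x - mean)**2.
    areas = [(x - mean) ** 2 for x in points]
    return sum(areas) / len(areas)


# Hypothetical data: four points scattered around a mean of 3.
data = [1.0, 2.0, 4.0, 5.0]
print(variance_as_mean_square_area(data))  # 2.5
```

The two points at distance 2 from the mean each contribute a square of area 4, while the two points at distance 1 contribute only area 1 each, which is why far-away points dominate the variance.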

The covariance is a harder concept, and for years I just treated it analytically. A while ago a friend asked me what it actually was, without using any formula. I thought about it and came up with a geometric picture. I think this is how I am going to teach students about the covariance the next time they ask me :). Statistics becomes fun when you are able to see things geometrically.

Figure 2. The covariance between two variables X and Y is defined as the average of the product of the deviation of X from the mean of X and the deviation of Y from the mean of Y. Each product can be thought of as a signed rectangular area, where the rectangles in white are positively signed and those in grey are negatively signed.
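The signed-rectangle picture can be sketched in a few lines of Python as well. Again, the function name and the data are hypothetical, chosen only so that some rectangles come out positive and some negative. As with the description above, the sum is divided by the number of points (the population covariance) rather than by n - 1.

```python
def covariance_as_mean_signed_area(xs, ys):
    """Average signed area of the rectangles with sides (x - mean_x) and (y - mean_y)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # The area is positive when x and y lie on the same side of their means,
    # and negative when they lie on opposite sides.
    areas = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(areas) / len(areas)


# Hypothetical paired data; both variables have mean 3.
xs = [1.0, 2.0, 4.0, 5.0]
ys = [2.0, 4.0, 1.0, 5.0]
print(covariance_as_mean_signed_area(xs, ys))  # 0.75
```

Here the four signed areas are 2, -1, -2 and 4. The positive rectangles outweigh the negative ones, so the covariance is positive. Passing the same list twice, as in `covariance_as_mean_signed_area(xs, xs)`, turns every rectangle into a square and gives back the variance of `xs`.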