Geometric argument for constraints on corr(X,Z) given corr(X,Y) and corr(Y,Z)

(Note: This is one of three posts that I wrote some time ago that have just been languishing under the “Misc.” tab of my website for a while, because for whatever reason I didn’t feel that they were a good fit for my blog. Well, I’ve decided to go ahead and move them to the blog, so here you go!)

Link to original TalkStats thread, September 17, 2013.

Today a labmate asked the following question: if we have three random variables x, y, z, and we know the correlations r_{xy} and r_{yz}, what constraints if any does this place on the correlation r_{xz}?

At the time I reflexively answered that the remaining correlation must be the product of the two known correlations. Which of course is totally wrong. I think I was getting some mental interference from some of the equations for simple mediation floating around in my head. Anyway, after thinking about it for a while I have come up with a convincing geometric argument for what the constraints actually are. I have also verified that my answer agrees with a more complicated-looking answer to this question that I found elsewhere online.

Because I ended up spending a lot of time on this and I thought some of you would find the results interesting, I thought I would share my work here. Comments are welcome!

Okay. So imagine our variables x, y, z as vectors in n-dimensional space, one coordinate per observation, with each variable centered at its mean. The Pearson correlation coefficient between any two of these variables can be interpreted as the cosine of the angle between the corresponding centered vectors. This is an interesting and well-known geometric fact about correlation coefficients.
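Here is a minimal sketch of that fact in plain Python. The helper names (center, cosine, pearson) and the toy data are my own, chosen only for illustration. The point is that the cosine has to be taken between the mean-centered vectors; the cosine of the raw vectors is generally a different number.

```python
import math

def center(v):
    """Subtract the mean from every coordinate."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(u, v):
    """Pearson correlation from the sample covariance and standard deviations."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)
    sd_u = math.sqrt(sum((a - mu) ** 2 for a in u) / (n - 1))
    sd_v = math.sqrt(sum((b - mv) ** 2 for b in v) / (n - 1))
    return cov / (sd_u * sd_v)

# Toy data, made up for this example.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

print("pearson:           ", pearson(x, y))
print("cosine of centered:", cosine(center(x), center(y)))
print("cosine of raw:     ", cosine(x, y))
```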

So now imagine that the x and y vectors are fixed (and hence so is their correlation), but that the vector z is free to vary so long as r_{yz} is constant. This constraint on r_{yz} means that the set of possible z vectors will form a sort of “cone” around the y vector, as in the following image:

Now it is intuitively obvious (I know this is a sneaky phrase, but that’s why I call this just an “argument” and not a “proof”) that the two possible z vectors that will lead to the minimum/maximum values of r_{xz} are the z vectors that lie on the same plane as the x and y vectors. This leads to the following expression for the minimum/maximum values of r_{xz} given r_{xy} and r_{yz}:

cos[arccos(r_{xy}) \pm arccos(r_{yz})].
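To make the cone picture concrete, here is a rough numerical sketch, assuming we work with unit vectors in three dimensions so that a correlation is just a dot product. The function name cone_sweep and the parametrization by a rotation angle phi are my own. It walks z around the cone about y and records the smallest and largest values of r_{xz} it sees, which should match the expression above and occur when z lies in the plane of x and y.

```python
import math

def cone_sweep(r_xy, r_yz, steps=3600):
    """Sweep z around the cone about y; return (min, max) of r_xz."""
    a = math.acos(r_xy)  # angle between x and y
    b = math.acos(r_yz)  # angle between y and z
    # Put y along the first axis and x in the plane of the first two axes.
    x = (math.cos(a), math.sin(a), 0.0)
    values = []
    for k in range(steps):
        phi = 2 * math.pi * k / steps  # rotation of z around y
        # Unit vector at angle b from y; phi = 0 or pi keeps z in the x-y plane.
        z = (math.cos(b),
             math.sin(b) * math.cos(phi),
             math.sin(b) * math.sin(phi))
        values.append(sum(xi * zi for xi, zi in zip(x, z)))
    return min(values), max(values)

r_xy, r_yz = 0.6, 0.3
lo, hi = cone_sweep(r_xy, r_yz)
a, b = math.acos(r_xy), math.acos(r_yz)
print("sweep:  ", lo, hi)
print("formula:", math.cos(a + b), math.cos(a - b))
```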

One notable result following from this is that if x is orthogonal to y and y is orthogonal to z, then there is no constraint on r_{xz}: it can be anywhere from -1 to +1. But under any other circumstances, fixing r_{xy} and r_{yz} will place some constraint on the range of r_{xz}.
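A small sketch of the bounds as a function, so you can try a few cases yourself. The name rxz_bounds and the example pairs are mine, not from the original thread. Only the doubly orthogonal case should give back the full interval from -1 to +1.

```python
import math

def rxz_bounds(r_xy, r_yz):
    """Return (min, max) of r_xz as cos(arccos(r_xy) +/- arccos(r_yz))."""
    a = math.acos(r_xy)
    b = math.acos(r_yz)
    # Both angles lie in [0, pi], so cos(a + b) <= cos(a - b).
    return math.cos(a + b), math.cos(a - b)

# Example pairs chosen only for illustration.
for r_xy, r_yz in [(0.0, 0.0), (0.5, 0.0), (0.9, 0.9), (0.9, -0.9)]:
    lo, hi = rxz_bounds(r_xy, r_yz)
    print(f"r_xy={r_xy:+.1f}, r_yz={r_yz:+.1f}: {lo:+.3f} <= r_xz <= {hi:+.3f}")
```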

Okay, now for the verification part, which requires a bit of math.

So in this stats.stackexchange.com thread it is stated that the three correlations must satisfy

1+2r_{xy}r_{xz}r_{yz}-(r_{xy}^2+r_{xz}^2+r_{yz}^2) \ge 0,

the reasoning being that this is the determinant of the correlation matrix, which cannot be negative because a correlation matrix is positive semidefinite. Anyway, this can be viewed as a quadratic inequality in r_{xz}, already in standard form:

(-1)r_{xz}^2 + (2r_{xy}r_{yz})r_{xz} + (1 - r_{xy}^2 - r_{yz}^2) \ge 0.

So if we apply the quadratic formula and simplify the result, we get the following for the minimum/maximum values of r_{xz}:

r_{xy}r_{yz} \pm \sqrt{(1-r_{xy}^2)(1-r_{yz}^2)}.
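For anyone who wants the "simplify" step spelled out, here is my working, taking the quadratic coefficients to be a = -1, b = 2r_{xy}r_{yz} and c = 1 - r_{xy}^2 - r_{yz}^2. Because the leading coefficient is negative, the inequality holds between the two roots, so the roots are the minimum and maximum.

```latex
\begin{align*}
r_{xz} &= \frac{-2r_{xy}r_{yz} \pm \sqrt{4r_{xy}^2 r_{yz}^2 + 4\,(1 - r_{xy}^2 - r_{yz}^2)}}{-2} \\
       &= r_{xy}r_{yz} \mp \sqrt{1 - r_{xy}^2 - r_{yz}^2 + r_{xy}^2 r_{yz}^2} \\
       &= r_{xy}r_{yz} \mp \sqrt{(1 - r_{xy}^2)(1 - r_{yz}^2)}
\end{align*}
```

The sign flip from \pm to \mp does not matter here, since either way we get the same pair of values.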

Now taking my answer and applying the trig identity cos(a \pm b) = cos(a)cos(b) \mp sin(a)sin(b) we get

r_{xy}r_{yz} \pm sin(arccos(r_{xy}))sin(arccos(r_{yz})).

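Now applying the identity sin(\theta) = \sqrt{1 - [cos(\theta)]^2}, which is valid here because arccos returns angles between 0 and \pi, where the sine is non-negative, we get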

r_{xy}r_{yz} \pm \sqrt{(1-r_{xy}^2)(1-r_{yz}^2)},

which is the answer we got from the stackexchange post. So our simpler, geometrically based answer agrees with the more conventional answer that is harder to understand.
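If you would rather check the agreement numerically than trust the algebra, here is a quick sketch. The function names and the grid of test values are my own choices. It compares the two closed forms over a grid of correlations, and also checks that the determinant from the stackexchange thread is zero at both endpoints.

```python
import math

def bounds_geometric(r_xy, r_yz):
    """Bounds from the angle argument: cos(arccos(r_xy) +/- arccos(r_yz))."""
    a, b = math.acos(r_xy), math.acos(r_yz)
    return math.cos(a + b), math.cos(a - b)

def bounds_algebraic(r_xy, r_yz):
    """Bounds from the quadratic formula applied to the determinant condition."""
    mid = r_xy * r_yz
    half_width = math.sqrt((1 - r_xy ** 2) * (1 - r_yz ** 2))
    return mid - half_width, mid + half_width

def corr_det(r_xy, r_yz, r_xz):
    """Determinant of the 3x3 correlation matrix."""
    return 1 + 2 * r_xy * r_yz * r_xz - (r_xy ** 2 + r_yz ** 2 + r_xz ** 2)

def max_discrepancy():
    """Largest gap between the two forms, and largest |det| at the endpoints."""
    grid = [i / 10 for i in range(-9, 10)]  # -0.9, -0.8, ..., 0.9
    worst_gap, worst_det = 0.0, 0.0
    for p in grid:
        for q in grid:
            g_lo, g_hi = bounds_geometric(p, q)
            a_lo, a_hi = bounds_algebraic(p, q)
            worst_gap = max(worst_gap, abs(g_lo - a_lo), abs(g_hi - a_hi))
            worst_det = max(worst_det,
                            abs(corr_det(p, q, a_lo)),
                            abs(corr_det(p, q, a_hi)))
    return worst_gap, worst_det

gap, det = max_discrepancy()
print("largest difference between the two forms:", gap)
print("largest |determinant| at the endpoints:  ", det)
```

Both numbers should come out at the level of floating-point rounding error.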
