Finding Correlations in Data with Uncertainty: Classical Solution

Following up on my previous post as a result of an excellent suggestion from Andrej Spiess. The data are indeed very heteroscedastic! Andrej suggested that an alternative way to attack this problem would be to use weighted correlation with weights being the inverse of the measurement variance.

Let’s look at the synthetic data first.

> library(weights)
> 
> wtd.cor(synthetic$mu.x, synthetic$mu.y, weight = 1 / synthetic$sigma.y**2)
      correlation    std.err  t.value    p.value
V1.V1   0.1945633 0.09908485 1.963603 0.05240988

This is in excellent agreement with the bootstrap results. Now the original experimental data.

> wtd.cor(original$mu.x, original$mu.y, weight = 1 / original$sigma.y**2)
      correlation    std.err  t.value      p.value
V1.V1   0.2407686 0.04606181 5.227076 2.656016e-07

Here the agreement with the bootstrap result is not as good. I’m not quite sure why, but suspect that it might have something to do with the fact that the original data are quite severely skewed so that assumptions about normality would probably be voilated.

2 thoughts on “Finding Correlations in Data with Uncertainty: Classical Solution

  1. David McArthur

    A variety of robust correlation coefficients highly suitable for heteroscedastic data can be found in RR Wilcox, Introduction to Robust Estimation and Hypothesis Testing, Elsevier, 2012 (and they are all supported with available R code).

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>