Opened 2 years ago
Closed 2 years ago
#955 closed defect (fixed)
incorrect Kendall's tau p-value
Reported by: | Peter | Owned by: | Peter |
---|---|---|---|
Priority: | major | Milestone: | yat 0.17.2 |
Component: | statistics | Version: | 0.17 |
Keywords: | Cc: |
Description
With statistics::Kendall I get the following result:
Score: 0.260358
One-sided P: 0.447281
Two-sided P: 0.894562

whereas via https://astatsa.com/CorrelationTest I get:
Kendall's rank correlation
sample estimate τ: 0.249107
alternate hypothesis: true τ ≠ 0
z-statistic: 4.11244903545
p-value: 0.000039
The difference between score and tau is acceptable, but something is wrong with the p-value.
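As a sanity check on the reference value, the two-sided p-value can be recomputed from the reported z-statistic. This is only a sketch: it assumes the reference tool uses the usual normal approximation, and the function name `two_sided_p` is made up for illustration.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    # P(|Z| >= |z|) for Z ~ N(0, 1) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# z-statistic reported by the reference tool in the description.
z_reference = 4.11244903545
p_reference = two_sided_p(z_reference)
print(f"{p_reference:.6f}")
```

Under that assumption the result agrees with the reported 0.000039 to the printed precision, so the reference p-value is at least internally consistent with its z-statistic.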
Attachments (1)
Change History (6)
comment:3 Changed 2 years ago by
The problem is that the wrong formula is used in the correction of variance.
There is a term counting triplets
\sum t(t-1)(t-2) * \sum u(u-1)(u-2)
which should be divided by 9n(n-1)(n-2). Omitting that division overestimates the variance, and since the missing factor grows like n cubed, the error becomes disastrous for large n.
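A minimal sketch of the tie-corrected variance of S (concordant minus discordant pairs) in the standard normal approximation, written in Python rather than yat's C++. The names `tie_sums` and `variance_of_s` are hypothetical and not part of yat. The `divide_triplet_term=False` switch only mimics the omission described in this comment; that this matches the faulty code in every other respect is an assumption.

```python
from collections import Counter


def tie_sums(values):
    """Return the three sums over tie-group sizes t used in the correction."""
    counts = Counter(values).values()
    pairs = sum(t * (t - 1) for t in counts)
    triplets = sum(t * (t - 1) * (t - 2) for t in counts)
    weighted = sum(t * (t - 1) * (2 * t + 5) for t in counts)
    return pairs, triplets, weighted


def variance_of_s(x, y, divide_triplet_term=True):
    """Variance of S under the null hypothesis, corrected for ties."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    pairs_x, triplets_x, weighted_x = tie_sums(x)
    pairs_y, triplets_y, weighted_y = tie_sums(y)
    base = (n * (n - 1) * (2 * n + 5) - weighted_x - weighted_y) / 18
    pair_term = pairs_x * pairs_y / (2 * n * (n - 1))
    triplet_term = triplets_x * triplets_y
    if divide_triplet_term:
        # The division that the comment above says was missing.
        triplet_term /= 9 * n * (n - 1) * (n - 2)
    return base + pair_term + triplet_term


# Toy data (made up): one group of three tied values in each variable.
x = [1, 1, 1, 2, 3, 4]
y = [5, 5, 5, 6, 7, 8]
correct = variance_of_s(x, y)
undivided = variance_of_s(x, y, divide_triplet_term=False)
print(correct, undivided)
```

Even for these six points the undivided triplet term dominates the variance, which shrinks the z-statistic and inflates the p-value.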
comment:4 Changed 2 years ago by
Also, note that the term above is only non-zero when there is at least one triplet in both x and y (i.e. at least three data points share one x-value, and likewise for y). If ties do not occur in both variables, or every tie involves only two data points, this term is zero and the bug does not come into play.
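The condition can be checked directly on the data. The following Python sketch uses made-up values and hypothetical helper names (`triplet_sum`, `triplet_product`); it only evaluates the product of the two triplet sums, not the full variance.

```python
from collections import Counter


def triplet_sum(values):
    """Sum of t(t-1)(t-2) over tie groups; zero unless some value occurs 3+ times."""
    return sum(t * (t - 1) * (t - 2) for t in Counter(values).values())


def triplet_product(x, y):
    """Numerator of the triplet term: non-zero only with triple ties in both x and y."""
    return triplet_sum(x) * triplet_sum(y)


# Ties between pairs of points only, in both variables.
pairs_only = triplet_product([1, 1, 2, 2, 3], [1, 1, 2, 3, 3])
# A triple tie in x, but no ties in y.
one_sided = triplet_product([1, 1, 1, 2, 3], [1, 2, 3, 4, 5])
# A triple tie in x and a four-way tie in y.
both_sided = triplet_product([1, 1, 1, 2, 3], [4, 4, 4, 4, 5])
print(pairs_only, one_sided, both_sided)
```

Only the last case gives a non-zero product, so only data of that kind would have been affected by the missing division.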
data