Opened 2 years ago

Closed 2 years ago

## #955 closed defect (fixed)

# incorrect Kendall's tau p-value

| Reported by: | Peter | Owned by: | Peter |
|---|---|---|---|
| Priority: | major | Milestone: | yat 0.17.2 |
| Component: | statistics | Version: | 0.17 |
| Keywords: | | Cc: | |

### Description

With statistics::Kendall I get the following result: Score: 0.260358; One-sided P: 0.447281; Two-sided P: 0.894562

whereas via https://astatsa.com/CorrelationTest I get: Kendall's rank correlation sample estimate τ = 0.249107; alternate hypothesis: true τ ≠ 0; z-statistic: 4.11244903545; p-value: 0.000039

The difference between score and tau is acceptable, but something is wrong with the p-value.

### Attachments (1)

### Change History (6)

### comment:3 Changed 2 years ago by

The problem is that the tie correction of the variance uses the wrong formula.

There is a term counting triplets

\sum_i t_i(t_i-1)(t_i-2) \cdot \sum_j u_j(u_j-1)(u_j-2)

where t_i and u_j are the sizes of the tied groups in x and y. This term should be divided by 9n(n-1)(n-2). Not doing that overestimates the variance, which in turn inflates the p-value, and since the missing factor grows like n cubed, the error becomes disastrous for large n.

### comment:4 Changed 2 years ago by

Also, note that the term above is non-zero only when there is at least one triplet in both x and y (i.e. at least three data points share one x-value, and likewise for y). If either variable has no ties, or only ties between two data points, this term is zero and the bug does not come into play.
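To make the effect concrete, here is a minimal Python sketch of the standard tie-corrected variance of Kendall's score (concordant minus discordant pairs) under the null hypothesis. It is not the yat implementation: the function names `tie_sizes` and `kendall_variance` and the flag `divide_triplet_term` are hypothetical, and the sketch assumes the bug amounted to omitting the division of the triplet term, as the comments above suggest.

```python
from collections import Counter


def tie_sizes(values):
    """Return the sizes of groups of equal values with at least two members."""
    return [count for count in Counter(values).values() if count > 1]


def kendall_variance(x, y, divide_triplet_term=True):
    """Tie-corrected variance of Kendall's score under the null hypothesis.

    divide_triplet_term=False mimics the behaviour described in this ticket
    (an assumption about the bug, not code taken from yat).
    Requires at least three data points.
    """
    n = len(x)
    t = tie_sizes(x)  # tied group sizes in x
    u = tie_sizes(y)  # tied group sizes in y

    v0 = n * (n - 1) * (2 * n + 5)
    vt = sum(ti * (ti - 1) * (2 * ti + 5) for ti in t)
    vu = sum(uj * (uj - 1) * (2 * uj + 5) for uj in u)

    # Term counting tied pairs in both variables.
    pairs = sum(ti * (ti - 1) for ti in t) * sum(uj * (uj - 1) for uj in u)
    v1 = pairs / (2 * n * (n - 1))

    # Term counting tied triplets in both variables.
    v2 = (sum(ti * (ti - 1) * (ti - 2) for ti in t)
          * sum(uj * (uj - 1) * (uj - 2) for uj in u))
    if divide_triplet_term:
        v2 /= 9 * n * (n - 1) * (n - 2)

    return (v0 - vt - vu) / 18 + v1 + v2


# One triplet in x and one in y: the two variants disagree.
x3 = [1, 1, 1, 2, 3, 4]
y3 = [1, 2, 2, 2, 3, 4]
print(kendall_variance(x3, y3), kendall_variance(x3, y3, divide_triplet_term=False))

# Only pairwise ties: the triplet term is zero and both variants agree.
x2 = [1, 1, 2, 3, 4, 5]
y2 = [1, 2, 2, 3, 4, 5]
print(kendall_variance(x2, y2), kendall_variance(x2, y2, divide_triplet_term=False))
```

The z-statistic is the score divided by the square root of this variance, so an inflated variance shrinks z and pushes the p-value towards 1, which matches the direction of the discrepancy reported in the description.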

