# Ties in ROC

### Description

Behaviour when ties are present appears to be random.

Hollander and Wolfe say that the exact p-value is calculated by going through all permutations (ch. 4.1, comment 5). That seems naive to me; there has to be a way to trim down the tree (similar to the no-tie case).

This code

    theplu::yat::statistics::ROC roc;
    for (size_t i=0; i<20; ++i)
      roc.add(0, i<10);
    cout << roc.area() << "\n";

outputs 0, whereas I expected 0.5 since all values are tied.
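To make the expected value concrete, here is a minimal sketch of an area calculation in which a tied positive/negative pair counts as 1/2. It does not use yat; the function name `auc_with_ties` and the pair-based input are assumptions made for illustration only, and the quadratic loop is chosen for clarity rather than speed.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of yat.
// Each element is (value, is_positive). A positive/negative pair adds 1 if
// the positive has the larger value and 1/2 if the two values are tied.
double auc_with_ties(const std::vector<std::pair<double, bool>>& data)
{
  std::size_t n_pos = 0;
  std::size_t n_neg = 0;
  double score = 0.0;
  for (const auto& a : data) {
    if (!a.second) {
      ++n_neg;
      continue;
    }
    ++n_pos;
    for (const auto& b : data) {
      if (b.second)
        continue;
      if (a.first > b.first)
        score += 1.0;
      else if (a.first == b.first)
        score += 0.5;
    }
  }
  assert(n_pos > 0 && n_neg > 0);
  // Normalise by the number of positive/negative pairs.
  return score / (static_cast<double>(n_pos) * static_cast<double>(n_neg));
}
```

With 20 values that are all 0 and 10 of them labelled positive, every one of the 100 pairs is tied, so this sketch returns 0.5.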

In the large-sample approximation the mean of the null distribution is unaffected by ties, but the variance is modified to mn(N+1)/12 - [mn/(12N(N-1)) * sum(t(t-1)(t+1))], where the sum runs over the tied groups and t is the size of each group.
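The variance formula can be sketched as a small free function. The name `tie_corrected_variance` and its signature are hypothetical and not taken from yat; the sketch assumes that m and n are the two group sizes and that N = m + n.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of yat.
// Returns mn(N+1)/12 - mn/(12N(N-1)) * sum t(t-1)(t+1), assuming N = m + n.
// tie_sizes holds the size t of each tied group; groups of size 1
// contribute nothing and may be left out.
double tie_corrected_variance(std::size_t m, std::size_t n,
                              const std::vector<std::size_t>& tie_sizes)
{
  assert(m > 0 && n > 0);
  const double mn = static_cast<double>(m) * static_cast<double>(n);
  const double N = static_cast<double>(m + n);
  double tie_sum = 0.0;
  for (std::size_t t : tie_sizes) {
    const double td = static_cast<double>(t);
    tie_sum += td * (td - 1.0) * (td + 1.0);
  }
  return mn * (N + 1.0) / 12.0 - mn * tie_sum / (12.0 * N * (N - 1.0));
}
```

For m = n = 10 without ties this gives 175. If all 20 values form a single tied group, the correction term is also 175 and the variance becomes zero, so the Gaussian approximation is degenerate in that case.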

I noticed that in the presence of ties the distribution is no longer symmetric, which means that P_{two-sided} does not simply equal 2*P_{one-sided}. Instead, in the two-tailed case we need to sum over both tails. In the large-sample approximation there is no need for this, as we already approximate the distribution of score/area with a Gaussian (which is symmetric).
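The asymmetry can be checked by brute force on small inputs. The sketch below enumerates every distinct assignment of labels to values, which is the naive approach and not a pruned one. It is not yat code: the names `ExactP`, `twice_u` and `exact_p` are hypothetical, labels are assumed to be stored as 0/1 in a `std::vector<char>`, and the two-sided rule used here (count assignments at least as far from mn/2 as the observed one) is one possible reading of "sum over both tails".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical result type, not part of yat.
struct ExactP {
  double one_sided;
  double two_sided;
};

// Twice the pair-counting score, so that a tie contributes the integer 1.
long twice_u(const std::vector<double>& value, const std::vector<char>& is_pos)
{
  long u2 = 0;
  for (std::size_t i = 0; i < value.size(); ++i) {
    if (!is_pos[i])
      continue;
    for (std::size_t j = 0; j < value.size(); ++j) {
      if (is_pos[j])
        continue;
      if (value[i] > value[j])
        u2 += 2;
      else if (value[i] == value[j])
        u2 += 1;
    }
  }
  return u2;
}

// Exact p-values by enumerating all distinct label assignments.
// is_pos must contain only 0 (negative) and 1 (positive).
ExactP exact_p(const std::vector<double>& value, std::vector<char> is_pos)
{
  assert(value.size() == is_pos.size());
  const long observed = twice_u(value, is_pos);
  const long m = std::count(is_pos.begin(), is_pos.end(), char(1));
  const long n = static_cast<long>(is_pos.size()) - m;
  const long center = m * n;  // null mean of the doubled score
  std::sort(is_pos.begin(), is_pos.end());  // first permutation
  unsigned long total = 0, upper = 0, both = 0;
  do {
    const long u2 = twice_u(value, is_pos);
    ++total;
    if (u2 >= observed)
      ++upper;
    if (std::labs(u2 - center) >= std::labs(observed - center))
      ++both;
  } while (std::next_permutation(is_pos.begin(), is_pos.end()));
  return {static_cast<double>(upper) / total,
          static_cast<double>(both) / total};
}
```

For the values 1, 1, 2 with only the last one labelled positive, the three assignments give doubled scores 1, 1 and 4 around a mean of 2. Both the one-sided and the two-sided p-value then come out as 1/3, not 1/3 and 2/3.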

Needs more docs. The docs for function `ROC::p_value` are, for example, not accurate.

Ties for the score are taken care of in r998. The p-value was not modified. We should check in some textbook what is standard.