Opened 17 years ago
Closed 12 years ago
#144 closed defect (fixed)
Tie in ROC
Reported by: | Peter | Owned by: | Peter |
---|---|---|---|
Priority: | minor | Milestone: | yat 0.8 |
Component: | statistics | Version: | trunk |
Keywords: | Cc: |
Description
Behaviour when ties are present appears to be random.
Change History (13)
comment:1 Changed 16 years ago by
comment:2 Changed 13 years ago by
Hollander and Wolfe say that the exact p-value is calculated by going through all permutations (ch 4.1 comment 5). That seems naive to me; there has got to be a way to trim down the tree (similar to the no-tie case).
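For reference, the brute-force approach can be sketched as below. This is only an illustration, not yat code: `twice_score` and `exact_p_one_sided` are hypothetical names, labels are assumed to be 0 (negative) or 1 (positive), and the score is doubled so that a tied pair (worth 0.5) stays an integer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of yat. Returns twice the pair score:
// each (positive, negative) pair adds 2 if positive > negative and 1 if tied.
long twice_score(const std::vector<double>& value,
                 const std::vector<int>& label)
{
  long s = 0;
  for (std::size_t i = 0; i < value.size(); ++i)
    for (std::size_t j = 0; j < value.size(); ++j)
      if (label[i] == 1 && label[j] == 0) {
        if (value[i] > value[j])
          s += 2;
        else if (value[i] == value[j])
          s += 1;
      }
  return s;
}

// One-sided exact p-value: the fraction of distinct label assignments whose
// score is at least as large as the observed one. Labels must be 0 or 1.
double exact_p_one_sided(const std::vector<double>& value,
                         std::vector<int> label)
{
  assert(value.size() == label.size());
  assert(!value.empty());
  const long observed = twice_score(value, label);
  // Start from the lexicographically smallest arrangement so that
  // std::next_permutation visits every distinct assignment exactly once.
  std::sort(label.begin(), label.end());
  std::size_t total = 0;
  std::size_t extreme = 0;
  do {
    ++total;
    if (twice_score(value, label) >= observed)
      ++extreme;
  } while (std::next_permutation(label.begin(), label.end()));
  return static_cast<double>(extreme) / total;
}
```

With values 1, 2, 3, 4 and the two largest labelled positive, only 1 of the 6 assignments reaches the observed score, so the function returns 1/6. The cost grows as N choose m, which is why some pruning would be needed in practice.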
comment:3 Changed 12 years ago by
This code
theplu::yat::statistics::ROC roc; for (size_t i=0; i<20; ++i) roc.add(0, i<10); cout << roc.area() << "\n";
outputs 0, but I expected 0.5 since all values are tied.
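The expected behaviour can be illustrated with a small stand-alone sketch that counts a tied (positive, negative) pair as half a win. `area_with_ties` is a hypothetical helper written for this ticket, not the yat implementation, and it uses a plain pairwise loop rather than whatever ROC does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of yat. Each element is (value, is_positive).
// The area is the mean over all (positive, negative) pairs of:
// 1 if positive > negative, 0.5 if they are tied, 0 otherwise.
double area_with_ties(const std::vector<std::pair<double, bool>>& data)
{
  double score = 0.0;
  std::size_t n_pairs = 0;
  for (const auto& pos : data) {
    if (!pos.second)
      continue;
    for (const auto& neg : data) {
      if (neg.second)
        continue;
      ++n_pairs;
      if (pos.first > neg.first)
        score += 1.0;
      else if (pos.first == neg.first)
        score += 0.5;
    }
  }
  assert(n_pairs > 0); // need at least one sample in each class
  return score / n_pairs;
}
```

Feeding it the same 20 samples as above (value 0, the first 10 positive) gives 100 tied pairs and an area of 0.5.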
comment:4 Changed 12 years ago by
Milestone: | yat 0.x+ → yat 0.8 |
---|
comment:5 Changed 12 years ago by
Status: | new → assigned |
---|
comment:8 Changed 12 years ago by
In the large-sample approximation the mean of the null distribution is unaffected by ties, but the variance is modified to mn(N+1)/12 - [mn/(12N(N-1)) * sum(t(t-1)(t+1))], where m and n are the two group sizes, N = m+n, the sum runs over tied groups, and t is the size of the group.
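A direct transcription of that variance, as a sketch only: `tie_corrected_variance` is a hypothetical name, and `tie_sizes` is assumed to hold the size t of each tied group (groups of size 1 contribute nothing and may be omitted).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of yat. Computes
//   mn(N+1)/12 - mn/(12 N (N-1)) * sum t(t-1)(t+1)
// with N = m + n and the sum running over the tied groups.
double tie_corrected_variance(std::size_t m, std::size_t n,
                              const std::vector<std::size_t>& tie_sizes)
{
  const double N = static_cast<double>(m + n);
  assert(N > 1); // the correction term divides by N - 1
  double sum = 0.0;
  for (std::size_t t : tie_sizes) {
    const double td = static_cast<double>(t);
    sum += td * (td - 1) * (td + 1);
  }
  const double mn = static_cast<double>(m) * static_cast<double>(n);
  return mn * (N + 1) / 12.0 - mn * sum / (12.0 * N * (N - 1));
}
```

For m = n = 10 without ties this gives 175. If all 20 values are tied (one group with t = 20) the correction cancels the first term and the variance is 0, which is worth keeping in mind before dividing by its square root.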
comment:9 Changed 12 years ago by
I noticed that in the presence of ties the distribution is no longer symmetric, which means P(two-sided) does not simply equal 2*P(one-sided). Instead, in the two-tailed case we need to sum over both tails. In the large-sample approximation there is no need, as we already approximate the distribution of score/area with a Gaussian (which is symmetric).
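One possible reading of "sum over both tails" is to count every outcome that lies at least as far from the null mean as the observed score, on either side. The sketch below follows that reading; it is an assumption, not necessarily what ROC::p_value does. `one_sided_p` and `two_sided_p` are hypothetical names, `null_scores` is assumed to hold one score per equally likely label assignment, and scores are kept as integers (for example twice the pair count, so a tie worth 0.5 becomes 1) to avoid floating-point comparisons.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helpers, not part of yat.

// Upper-tail p-value: fraction of null scores >= observed.
double one_sided_p(const std::vector<long>& null_scores, long observed)
{
  assert(!null_scores.empty());
  std::size_t extreme = 0;
  for (long s : null_scores)
    if (s >= observed)
      ++extreme;
  return static_cast<double>(extreme) / null_scores.size();
}

// Two-sided p-value: fraction of null scores whose distance from `center`
// (the null mean of the score) is at least that of the observed score.
double two_sided_p(const std::vector<long>& null_scores, long observed,
                   long center)
{
  assert(!null_scores.empty());
  const long dev = std::labs(observed - center);
  std::size_t extreme = 0;
  for (long s : null_scores)
    if (std::labs(s - center) >= dev)
      ++extreme;
  return static_cast<double>(extreme) / null_scores.size();
}
```

As a small example, the values 0, 0, 1 with one positive sample give the doubled-score distribution {1, 1, 4} with mean 2. For an observed score of 4 both functions return 1/3, so doubling the one-sided value (2/3) would overstate the two-sided p-value.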
comment:10 Changed 12 years ago by
comment:11 Changed 12 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:12 Changed 12 years ago by
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Needs more docs. The docs for function ROC::p_value are, for example, not accurate.
comment:13 Changed 12 years ago by
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Ties in the score are taken care of in r998. The p-value was not modified. We should check in a textbook what is standard.