Opened 17 years ago

Closed 12 years ago

#144 closed defect (fixed)

Tie in ROC

Reported by: Peter Owned by: Peter
Priority: minor Milestone: yat 0.8
Component: statistics Version: trunk
Keywords: Cc:

Description

Behaviour when ties are present appears to be random.

Change History (13)

comment:1 Changed 16 years ago by Peter

Ties in the score are taken care of in r998. The p-value was not modified. We should check a textbook to see what is standard.

comment:2 Changed 13 years ago by Peter

Hollander and Wolfe say that the exact p-value is calculated by going through all permutations (ch. 4.1, comment 5). That seems naive to me; there must be a way to trim down the tree (similar to the no-tie case).

comment:3 Changed 12 years ago by Peter

This code

theplu::yat::statistics::ROC roc;
for (size_t i=0; i<20; ++i)
  roc.add(0, i<10);
cout << roc.area() << "\n";

outputs 0 when I expected 0.5, since all values are tied.
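For reference, the expected value of 0.5 follows from the common convention that a tied positive/negative pair counts as half a concordant pair. The following is a minimal standalone sketch of that convention; `auc_with_ties` is a hypothetical helper written for illustration, not part of yat and not necessarily how ROC::area computes the area.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not yat API): area under the ROC curve computed by
// pair counting. A pair where the positive value is larger counts 1, a tied
// pair counts 0.5, and a pair where the positive value is smaller counts 0.
double auc_with_ties(const std::vector<double>& pos,
                     const std::vector<double>& neg)
{
  assert(!pos.empty() && !neg.empty());
  double concordant = 0.0;
  for (double p : pos) {
    for (double n : neg) {
      if (p > n)
        concordant += 1.0;
      else if (p == n)
        concordant += 0.5;
    }
  }
  // Normalise by the number of positive/negative pairs.
  return concordant / (static_cast<double>(pos.size()) *
                       static_cast<double>(neg.size()));
}
```

With ten positives and ten negatives that all have value 0, as in the snippet above, every pair is tied and the sketch returns 0.5.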

comment:4 Changed 12 years ago by Peter

Milestone: yat 0.x → yat 0.8

comment:5 Changed 12 years ago by Peter

Status: new → assigned

comment:6 Changed 12 years ago by Peter

(In [2549]) ROC: adding a test for ties case. refs #144

comment:7 Changed 12 years ago by Peter

(In [2551]) refs #144. Fix ROC::area for the tied case

comment:8 Changed 12 years ago by Peter

In the large-sample approximation the mean of the null distribution is unaffected by ties, but the variance is modified to mn(N+1)/12 - [mn/(12N(N-1))] * sum(t(t-1)(t+1)), where the sum runs over tied groups and t is the size of each group.
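The variance formula can be sketched as a small standalone function. This is an illustration only: `tie_corrected_variance` is a hypothetical name, and it assumes that m and n are the two class sizes and N = m + n is the pooled sample size, as in the usual rank-sum setting.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper (not yat API): null variance of the rank-sum
// statistic with tie correction,
//   mn(N+1)/12 - [mn/(12N(N-1))] * sum(t(t-1)(t+1)),
// where `pooled` holds all N values and m is the size of one class.
// Assumption: n = N - m is the size of the other class.
double tie_corrected_variance(const std::vector<double>& pooled, std::size_t m)
{
  const std::size_t N = pooled.size();
  assert(N >= 2 && m >= 1 && m < N);
  const double total = static_cast<double>(N);
  const double mn = static_cast<double>(m) * static_cast<double>(N - m);

  // Count the size t of each group of equal values.
  std::map<double, std::size_t> group_size;
  for (double v : pooled)
    ++group_size[v];

  // Untied values have t = 1 and contribute zero to the sum.
  double tie_sum = 0.0;
  for (const auto& group : group_size) {
    const double t = static_cast<double>(group.second);
    tie_sum += t * (t - 1.0) * (t + 1.0);
  }
  return mn * (total + 1.0) / 12.0
         - mn / (12.0 * total * (total - 1.0)) * tie_sum;
}
```

Without ties the correction term vanishes and the result reduces to mn(N+1)/12; when all values are tied the variance drops to zero.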

comment:9 Changed 12 years ago by Peter

I noticed that in the presence of ties the distribution is no longer symmetric, which means P(two-sided) does not simply equal 2*P(one-sided). Instead, in the two-tailed case we need to sum over both tails. In the large-sample approximation there is no need, as we already approximate the distribution of score/area with a Gaussian (which is symmetric).
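To make the asymmetry concrete, here is a brute-force sketch that enumerates every assignment of class labels and sums over both tails. It is the naive full enumeration, not a trimmed tree search, and the names `twice_u` and `exact_two_sided_p` are hypothetical. One assumption to flag: "both tails" is taken to mean all outcomes whose statistic lies at least as far from the null mean mn/2 as the observed one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Twice the Mann-Whitney U statistic, kept integral: 2 for each pair where
// the positive value is larger, 1 for each tied pair. Labels are 0 or 1.
long twice_u(const std::vector<double>& values, const std::vector<int>& is_pos)
{
  long result = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (!is_pos[i])
      continue;
    for (std::size_t j = 0; j < values.size(); ++j) {
      if (is_pos[j])
        continue;
      if (values[i] > values[j])
        result += 2;
      else if (values[i] == values[j])
        result += 1;
    }
  }
  return result;
}

// Exact two-sided p-value by full enumeration of label arrangements.
// Assumption: an arrangement is "at least as extreme" when |2U - mn| is at
// least the observed |2U - mn| (mn is the null mean of 2U).
double exact_two_sided_p(const std::vector<double>& values,
                         const std::vector<int>& is_pos)
{
  assert(values.size() == is_pos.size());
  const long m = static_cast<long>(std::count(is_pos.begin(), is_pos.end(), 1));
  const long n = static_cast<long>(values.size()) - m;
  assert(m > 0 && n > 0);
  const long observed = std::labs(twice_u(values, is_pos) - m * n);

  // Start from the sorted labels so next_permutation visits every
  // distinct arrangement exactly once.
  std::vector<int> labels(is_pos);
  std::sort(labels.begin(), labels.end());
  long extreme = 0;
  long total = 0;
  do {
    ++total;
    if (std::labs(twice_u(values, labels) - m * n) >= observed)
      ++extreme;
  } while (std::next_permutation(labels.begin(), labels.end()));
  return static_cast<double>(extreme) / static_cast<double>(total);
}
```

For the values {1, 1, 2} with the single positive at value 2, the sketch gives a two-sided p-value of 1/3, which equals the one-sided value rather than twice it. The cost grows with the number of label arrangements, so this is only practical for small samples.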

comment:10 Changed 12 years ago by Peter

(In [2556]) ROC: implement correction for ties in large sample approximation of p-value. refs #144.

comment:11 Changed 12 years ago by Peter

Resolution: fixed
Status: assigned → closed

(In [2585]) handle ties in exact calculation of ROC p-value. fixes #144

comment:12 Changed 12 years ago by Peter

Resolution: fixed
Status: closed → reopened

Needs more docs. The docs for ROC::p_value, for example, are not accurate.

comment:13 Changed 12 years ago by Peter

Resolution: fixed
Status: reopened → closed

(In [2594]) improve docs for ROC and sister class AUC. closes #144
