Opened 15 years ago

Closed 15 years ago

# Fisher::p_value is incorrect

| Reported by | Owned by | Priority | Milestone | Component | Version |
|---|---|---|---|---|---|
| Peter | Peter | major | yat 0.4.3 | statistics | 0.4.2 |

### Description

The p-value behaves strangely, and the code looks even stranger. For instance, `p_value_one_sided` calls `p_value_exact`, and in `p_value_exact` there is a comment: "this makes the p_value two-sided".

### comment:1 Changed 15 years ago by Peter

Status: new → assigned

### comment:2 Changed 15 years ago by Peter

Looking into this, I've realized that it is not obvious how to define a two-tailed p-value when the distribution is not symmetric. A sloppy definition would be that the p-value is the probability of getting the observed outcome or a more extreme one. The problem when the distribution is not symmetric is choosing where to start the summation in the other tail.

Browsing the web, there seem to be three alternatives, which, of course, converge when the distribution becomes symmetric.

1) The first alternative is based on the one-sided p-values, specifically the right-sided p-value rp = P(X>=x) and the left-sided p-value lp = P(X<=x). The two-tailed p-value is calculated as p2 = 2 * min(rp, 0.5, lp). Since rp + lp = 1 + P(X=x), both one-sided p-values can exceed 0.5, and the 0.5 term caps p2 at 1.

2) The second alternative is based on the odds ratio, or rather its logarithm. Because the middle outcome gives a log odds ratio equal to zero, one can run the sum over outcomes whose absolute log odds ratio is at least as large as that of the observed one.

3) The third alternative focuses on the probabilities of the individual outcomes. It runs the sum over outcomes whose probability is no larger than that of the observed one.

Strategy 3 is a bit weird because it only makes sense for one-peaked distributions. The hypergeometric distribution is indeed one-peaked, but the requirement makes the strategy hard to generalize to other tests (although most distributions in practice are one-peaked).
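
To make the three alternatives concrete, here is a minimal C++ sketch. It is not the yat implementation: `Margins`, `pmf`, `log_odds_ratio` and the three `p_two_sided_*` functions are hypothetical names introduced for illustration. It also assumes that the observed outcome is included in every sum, that ties are resolved with an arbitrary relative tolerance `kTol`, and that an undefined 0/0 odds ratio counts as neutral.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical type: margins of a 2x2 table (first-row total, first-column
// total, grand total). X is the count in the top-left cell.
struct Margins {
  unsigned r1, c1, n;
  unsigned lo() const { return r1 + c1 > n ? r1 + c1 - n : 0; }  // smallest X
  unsigned hi() const { return std::min(r1, c1); }               // largest X
};

// log C(n, k) via lgamma, so large tables do not overflow.
double log_choose(unsigned n, unsigned k) {
  assert(k <= n);
  return std::lgamma(n + 1.0) - std::lgamma(k + 1.0) - std::lgamma(n - k + 1.0);
}

// Hypergeometric probability P(X = k) for fixed margins.
double pmf(const Margins& m, unsigned k) {
  assert(m.r1 <= m.n && m.lo() <= k && k <= m.hi());
  return std::exp(log_choose(m.r1, k) + log_choose(m.n - m.r1, m.c1 - k) -
                  log_choose(m.n, m.c1));
}

// Log odds ratio of the table whose top-left cell is k.
// Assumption: an undefined 0/0 ratio is treated as neutral (0).
double log_odds_ratio(const Margins& m, unsigned k) {
  const double a = k, b = m.r1 - k, c = m.c1 - k, d = (m.n + k) - m.r1 - m.c1;
  const double num = a * d, den = b * c;
  const double inf = std::numeric_limits<double>::infinity();
  if (num == 0 && den == 0) return 0.0;
  if (den == 0) return inf;
  if (num == 0) return -inf;
  return std::log(num / den);
}

const double kTol = 1e-7;  // assumed relative tolerance for ties

// Strategy 1: twice the smaller one-sided p-value, capped at 1.
double p_two_sided_doubling(const Margins& m, unsigned x) {
  double lp = 0, rp = 0;
  for (unsigned k = m.lo(); k <= m.hi(); ++k) {
    if (k <= x) lp += pmf(m, k);
    if (k >= x) rp += pmf(m, k);
  }
  return 2 * std::min({rp, 0.5, lp});
}

// Strategy 2: sum outcomes whose |log odds ratio| is at least the observed one.
double p_two_sided_odds(const Margins& m, unsigned x) {
  const double threshold = std::fabs(log_odds_ratio(m, x)) * (1 - kTol);
  double p = 0;
  for (unsigned k = m.lo(); k <= m.hi(); ++k)
    if (std::fabs(log_odds_ratio(m, k)) >= threshold) p += pmf(m, k);
  return p;
}

// Strategy 3: sum outcomes that are no more probable than the observed one.
double p_two_sided_prob(const Margins& m, unsigned x) {
  const double threshold = pmf(m, x) * (1 + kTol);
  double p = 0;
  for (unsigned k = m.lo(); k <= m.hi(); ++k)
    if (pmf(m, k) <= threshold) p += pmf(m, k);
  return p;
}
```

For the asymmetric margins `Margins{3, 3, 10}` with observed x = 2, the outcome probabilities are 35/120, 63/120, 21/120 and 1/120, and the three strategies disagree: 44/120, 57/120 and 22/120 respectively. For the symmetric margins `Margins{4, 4, 8}` with x = 3, all three give 34/70.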

### comment:3 Changed 15 years ago by Peter

Resolution: → fixed

Status: assigned → closed

(In ) fixes #461. Also modified implementation of cdf_hypergeometric_P, which may cause conflict with modifications done in trunk (refs #87). If so, go with the trunk version (which uses GSL 1.8).
