Opened 14 years ago

Closed 12 years ago

#192 closed enhancement (fixed)

kNNI and WeNNI have hard-coded small numbers; change to something standardized.

Reported by: Jari Häkkinen Owned by: Jari Häkkinen
Priority: minor Milestone: yat 0.5
Component: utility Version: trunk
Keywords: Cc:

Description

source:trunk/yat/utility/kNNI.cc at approximately line 73:

double d=(distance[*k].second ? distance[*k].second : 1e-10);

and source:trunk/yat/utility/WeNNI.cc at approximately line 66:

double d=(distance[*k].second ? distance[*k].second : 1e-10);

Change History (6)

comment:1 Changed 14 years ago by Jari Häkkinen

Milestone: 0.3 (Public release) → later

comment:2 Changed 13 years ago by Markus Ringnér

std::numeric_limits<double>::epsilon() should perhaps be used?

comment:3 Changed 13 years ago by Peter

Well, if we look at how d is used in WeNNI, for example,

...
  // Avoid division with zero (perfect match vectors)
  double d=(distance[*k].second ? distance[*k].second : 1e-10);
  new_value+=(weight_(distance[*k].first,j) *
  data_(distance[*k].first,j)/d);
  norm+=weight_(distance[*k].first,j)/d;
}
// No impute if no contributions from neighbours.
if (norm){
  imputed_data_raw_(i,j) = new_value/norm;
...

we see that d is set to a small number to avoid division by zero. So we want d to be small, but not so small that "new_value/norm" becomes Inf divided by Inf, or worse, that only one of norm and new_value becomes Inf. How can we avoid that? std::numeric_limits<double>::epsilon() does not really do the job here, because epsilon is defined as the difference between 1.0 and the next representable number. In other words, epsilon is roughly the smallest number you can add to 1.0 without getting back 1.0. I can't see the connection to wanting to divide by a small number while avoiding overflow. I'm not an expert in floating-point representation, but the latter must depend on the range of the exponent, whereas epsilon tells you about the precision of the mantissa (?).
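To make the precision-versus-range distinction concrete, here is a minimal sketch, assuming IEEE 754 doubles with round-to-nearest. The helper names reciprocal_is_finite and epsilon_is_gap_at_one are hypothetical and not part of yat.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper (not part of yat): true if 1/d is a finite double.
// Whether this holds depends on the exponent range, not on epsilon().
bool reciprocal_is_finite(double d)
{
  assert(d > 0.0);
  const double r = 1.0 / d;
  // Infinity compares greater than the largest finite double.
  return r <= std::numeric_limits<double>::max();
}

// Hypothetical helper: checks that epsilon() is the gap between 1.0 and
// the next representable double, i.e. a statement about precision.
bool epsilon_is_gap_at_one()
{
  const double eps = std::numeric_limits<double>::epsilon();
  const double above = 1.0 + eps;        // next double after 1.0
  const double same = 1.0 + eps / 2.0;   // rounds back to 1.0
  return above > 1.0 && same == 1.0;
}
```

Under these assumptions, dividing by epsilon(), by 1e-10, or even by std::numeric_limits<double>::min() stays finite, while dividing by denorm_min() overflows to Inf. So epsilon is not a natural overflow threshold, but it is comfortably on the safe side of one.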

comment:4 Changed 12 years ago by Jari Häkkinen

Milestone: yat 0.x+ → yat 0.5
Status: new → assigned

comment:5 Changed 12 years ago by Jari Häkkinen

Reading through http://www.bnikolic.co.uk/blog/cpp-rr-epsilon.html I think we should use Markus's suggestion of std::numeric_limits<double>::epsilon(). The number is small enough to not matter too much but large enough to avoid infinities.
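As an illustration of that trade-off, here is a minimal sketch of an inverse-distance weighted mean with a configurable zero-distance fallback. It is not the yat implementation: the Neighbour type and the impute function are hypothetical, the per-element weights used in WeNNI are left out, and returning NaN when there are no contributions is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical stand-in for one neighbour: (value, distance). Not yat's types.
typedef std::pair<double, double> Neighbour;

// Inverse-distance weighted mean of the neighbour values.
// A zero distance (perfect match) is replaced by `fallback` to avoid
// division by zero.
double impute(const std::vector<Neighbour>& nn, double fallback)
{
  assert(fallback > 0.0);
  double new_value = 0.0;
  double norm = 0.0;
  for (std::size_t k = 0; k < nn.size(); ++k) {
    const double d = (nn[k].second ? nn[k].second : fallback);
    new_value += nn[k].first / d;
    norm += 1.0 / d;
  }
  // Assumption for this sketch: signal "no imputation" with NaN.
  return norm ? new_value / norm : std::numeric_limits<double>::quiet_NaN();
}
```

With a perfect match of value 2.0 and a second neighbour of value 4.0 at distance 1.0, both fallbacks give a finite result close to 2.0, but the results are not identical: the 1e-10 fallback lets the second neighbour pull the mean up by roughly 2e-10, while with epsilon() the shift is on the order of 1e-16.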

comment:6 Changed 12 years ago by Jari Häkkinen

Resolution: fixed
Status: assigned → closed

(In [1554]) Fixes #192. Now using std::numeric_limits. Since the imputation algorithms are slightly changed, the template results also change.
