Opened 16 years ago
Closed 16 years ago
#245 closed defect (fixed)
Distance should take Iterators
Reported by: | Peter | Owned by: | Peter |
---|---|---|---|
Priority: | major | Milestone: | yat 0.4 |
Component: | statistics | Version: | trunk |
Keywords: | Cc: |
Description (last modified by )
needs ticket:244
Well, in fact it should not take Iterators, but be templatized such that Iterators are accepted. But also std::vector<double>::const_iterator should be accepted.
In [889] this was implemented for iterators to unweighted containers. Extending the solution to iterators to weighted containers depends on ticket:247 and ticket:246.
In [890] this was implemented for iterators to weighted containers.
Change History (8)
comment:1 Changed 16 years ago by
comment:2 Changed 16 years ago by
Well I guess then there is only one way to this becuase the ANSI/ISO Standard says (14.5.2 p 3): "A member function template shall not be virtual." So alternative 1. will not compile.
But I also do not like being limited to AveragerPair? based distances. So I will try to figure out a third alternative for discussion.
comment:3 Changed 16 years ago by
Ok, I have looked at the code example (provided off the record), and it looks good.
Still to decide though is where the actual calculation of the distance will be performed. If we trash the Distance inheritance structure, how will it be done? Switch case blocks on a distance_tag?
comment:4 follow-ups: 5 6 Changed 16 years ago by
No the traits pattern avoids needing such things as switch case blocks (I guess you can think of it as if the switch/case is done at compile and not run-time). Here are the key components, briefly. I ignore any additional traits for the iterator types to separate for example weighted and unweighted implementations.
A function that needs to calculate a distance takes an object of type distance_tag (all names are tentative; I will try to check in a version tomorrow).
So for example
void f(const distance_tag& disttype) {
Do some things including calculating the following distance: double dist=vector_distance(begin1,begin1,end2,disttype);
}
The general distance function vector_distance only finds out what trait disttype has and calls the corresponding specialized vector_distance accordingly:
template <typename Iter, typename Dist> double vector_distance(Iter beg1,Iter end1, Iter beg2,const Dist& disttype) {
return vector_distance(beg1,end1,beg2,
typename distance_traits<Dist>::distance());
the word typename before distance_traits is needed by the compiler to have a chance to parse properly
}
Based on which object distance_traits<Dist>::distance() returns, different vector_distance functions are called, and each of them implements a specific algorithm.
Since all traits in this case only are typedefs the only overhead is an additional call to the general vector_distance function (which perhaps is optimized away sometimes?). "Everyone" should call the general vector_distance. Either way this is comparable to calling a virtual function in the solution "virtual template member function" that would not work anyway.
For classes such as NCC they would take nad keep as a private member a const distance_tag just as the function f.
Finally for completeness here is what distance_traits would look like together with two different distance tags:
template <class T> struct distance_traits {
typedef typename T::distance distance;
};
struct distance_type {
typedef distance_type distance;
}; struct euclidean_type
: public distance_type { typedef euclidean_type distance;
}; struct pearson_type
: public distance_type { typedef pearson_type distance;
};
And all we would have to do to call f in a main program with pearson distance would be to do:
f(pearson_type());
Now f will call the general vector_distance and the general vector_distance will call:
template <typename Iter> double vector_distance(Iter beg1,Iter beg2,Iter end, const pearson_type& disttype) {
Calculate and return pearson distance using beg1, beg2 and beg3
}
because distance_traits<Dist>::distance() will in this case inside f return an object of type pearson_type.
comment:5 Changed 16 years ago by
Replying to markus:
Based on which object distance_traits<Dist>::distance() returns, different vector_distance functions are called, and each of them implements a specific algorithm.
It should say creates instead of returns above.
comment:6 Changed 16 years ago by
distance_tag should be replaced with distance_type everywhere:
Here ...
A function that needs to calculate a distance takes an object of type distance_tag (all names are tentative; I will try to check in a version tomorrow).
So for example
void f(const distance_tag& disttype)
... and here ...
For classes such as NCC they would take nad keep as a private member a const distance_tag just as the function f.
... and here
Finally for completeness here is what distance_traits would look like together with two different distance tags:
comment:7 Changed 16 years ago by
Description: | modified (diff) |
---|
comment:8 Changed 16 years ago by
Description: | modified (diff) |
---|---|
Resolution: | → fixed |
Status: | new → closed |
There are two ways to do this
I admit neither of these solutions are perfect, so if someone has a better solution, spread the word. The bad side of 2) is that it forces a Distance to be calculated grom an AveragerPair?. That is perhaps not such a big limitation, but it makes, e.g., rank-based distances such as Spearman correlation impossible. The bad side about 1) is that we have a virtual templatized function, which seems a bit fishy to me and I've read somewhere that "Avoid having..."