# Changeset 821 for trunk/yat/statistics/ROC.h

Timestamp:
Mar 18, 2007, 5:00:05 PM
Message:

Modified ROC class to use AUC class in calculation of ROC area. Refs #101

File:
1 edited

r779
*/
#include
#include
#include
#include
namespace theplu {
virtual ~ROC(void);
/**
Adding a data value to ROC.
*/
void add(double value, bool target, double weight=1.0);
/**
The area is defines as \f$\frac{\sum w^+w^-} {\sum w^+w^-}\f$, where the sum in the
numerator goes over all pairs where value+ is larger than value-.
The denominator goes over all pairs.
@return Area under curve.
*/
double area(void);
///
/// minimum_size is the threshold for when a normal
u_int& minimum_size(void);
/**
minimum_size is the threshold for when a normal
approximation is used for the p-value calculation.
@return const reference to minimum_size
*/
const u_int& minimum_size(void) const;
///
/// @return number of samples
/// @return sum of weights
///
size_t n(void) const;
///
/// @return number of positive samples (Target.binary()==true)
/// @return sum of weights with negative target
///
size_t n_neg(void) const;
///
/// @return sum of weights with positive target
///
size_t n_pos(void) const;
///the second distribution by a non-zero amount. If the smallest
///group size is larger than minimum_size (default = 10), then P
///is calculated using a normal approximation.  @return the
///one-sided p-value( if absolute true is used this is equivalent
///to the two-sided p-value.)
///is calculated using a normal approximation.
///
/// \note Weights should be either zero or unity, else present
/// implementation is nonsense.
///
/// @return One-sided p-value.
///
double p_value(void) const;
double p_value_one_sided(void) const;
/// Function taking \a value, \a target (+1 or -1) and vector
/// defining what samples to use. The score is equivalent to
/// Mann-Whitney statistics.
/// @return the area under the ROC curve. If the area is less
/// than 0.5 and absolute=true, 1-area is returned. Complexity is
/// \f$N\log N \f$ where \f$N \f$ is number of samples.
///
double score(const classifier::Target& target, const utility::vector& value);
/**
Function taking values, target, weight and a vector defining what samples to use.
The area is defines as \f$\frac{\sum w^+w^-}{\sum w^+w^-}\f$, where the sum in the
numerator goes over all pairs where value+ is larger than value-.
The denominator goes over all pairs.
If target is equal to 1, sample belonges to class + otherwise sample belongs to class -.
@return wheighted version of area under the ROC curve. If the area is less than 0.5
and absolute=true, 1-area is returned.
Complexity is \f$N^2 \f$ where \f$N \f$ is number of samples.
/**
@brief Two-sided p-value.
@return min(2*p_value_one_sided, 2-2*p_value_one_sided)
*/
double score(const classifier::Target& target, const classifier::DataLookupWeighted1D& value);
double p_value(void) const;
/**
Function taking values, target, weight and a vector defining what samples to use.
The area is defines as \f$\frac{\sum w^+w^-}{\sum w^+w^-}\f$, where the sum in the
numerator goes over all pairs where value+ is larger than value-.
The denominator goes over all pairs.
If target is equal to 1, sample belonges to class + otherwise sample belongs to class -.
@return wheighted version of area under the ROC curve. If the area is less than 0.5
and absolute=true, 1-area is returned.
Complexity is \f$N^2 \f$ where \f$N \f$ is number of samples.
/**
@brief Set everything to zero
*/
double score(const classifier::Target& target, const utility::vector& value, const utility::vector& weight);
///
/// Function returning true if target is positive (binary()) for
/// the sample with ith lowest data value, so i=0 corresponds to
/// the sample with the lowest data value and i=n()-1 the sample
/// with highest data value.
///
bool target(const size_t i) const;
void reset(void);
private:
/// Implemented as in MatLab 13.1
double get_p_approx(const double) const;
double get_p_approx(double) const;
/// Implemented as in MatLab 13.1
double area_;
u_int minimum_size_;
u_int nof_pos_;
std::vector > vec_pair_; // class-value-pair
bool weighted_;
double w_neg_;
double w_pos_;
// > std::multimap > multimap_;
};
///
/// The output operator for the ROC class. The output is an Nx2
/// matrix, where the first column is the sensitivity and second
/// is the specificity.
///
std::ostream& operator<< (std::ostream& s, const ROC&);
}}} // of namespace statistics, yat, and theplu
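
The doc comments in this changeset define the weighted area as the sum of w+ w- over pairs where value+ is larger than value-, divided by the same sum over all pairs, and the two-sided p-value as min(2*p_value_one_sided, 2-2*p_value_one_sided). The sketch below illustrates those two definitions directly. It is not the yat implementation: the names `weighted_roc_area` and `two_sided_p_value` are hypothetical, plain `std::vector` stands in for the yat containers, ties are counted as zero because the comment says "larger than", and returning 0.5 when no positive/negative pair has weight is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of yat): weighted ROC area computed
// straight from the pairwise definition, so the cost is O(N^2).
double weighted_roc_area(const std::vector<double>& value,
                         const std::vector<bool>& target,
                         const std::vector<double>& weight)
{
  assert(value.size() == target.size());
  assert(value.size() == weight.size());

  double numerator = 0.0;   // sum of w+ * w- over pairs with value+ > value-
  double denominator = 0.0; // sum of w+ * w- over all (+, -) pairs

  for (std::size_t i = 0; i < value.size(); ++i) {
    if (!target[i])
      continue; // i runs over positive samples only
    for (std::size_t j = 0; j < value.size(); ++j) {
      if (target[j])
        continue; // j runs over negative samples only
      const double w = weight[i] * weight[j];
      denominator += w;
      // Ties contribute nothing, following "larger than" in the doc comment.
      if (value[i] > value[j])
        numerator += w;
    }
  }
  // Assumption: report 0.5 when no pair carries any weight.
  return denominator > 0.0 ? numerator / denominator : 0.5;
}

// Hypothetical helper: two-sided p-value from a one-sided one, using the
// formula quoted in the @return comment of p_value().
double two_sided_p_value(double p_one_sided)
{
  return std::min(2.0 * p_one_sided, 2.0 - 2.0 * p_one_sided);
}
```

With values {0.1, 0.4, 0.35, 0.8}, targets {false, false, true, true} and unit weights, three of the four positive/negative pairs are ordered correctly, so the area is 0.75. Doubling the weight of the negative sample at 0.4 changes the ratio to 4/6.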