# Changeset 779 for trunk/yat/statistics/tScore.h

Timestamp:
Mar 5, 2007, 7:58:30 PM
Message:

Refs #101

File:
1 edited

    r767
    #include "Score.h"
    #include
    #include

    public:
      ///
      /// 2brief Default Constructor.
      /// @brief Default Constructor.
      ///
      tScore(bool absolute=true);

         \frac{ \sum_i (x_i-m_x)^2 + \sum_i (y_i-m_y)^2 }{ n_x + n_y - 2 } \f$

         @return t-score. If absolute=true absolute value of t-score
         is returned
      */
      double score(const classifier::Target& target,
                   const utility::vector& value);
                   const utility::vector& value) const;

      /**
         Calculates the value of t-score, i.e. the ratio between
         difference in mean and standard deviation of this
         difference. \f$ t = \frac{ m_x - m_y }
         {s\sqrt{\frac{1}{n_x}+\frac{1}{n_y}}} \f$ where \f$ m \f$ is the
         mean, \f$ n \f$ is the number of data points and \f$ s^2 =
         \frac{ \sum_i (x_i-m_x)^2 + \sum_i (y_i-m_y)^2 }{ n_x + n_y - 2 } \f$

         @param dof double pointer in which approximation of degrees of
         freedom is returned: pos.n()+neg.n()-2. See AveragerWeighted.

         @return t-score. If absolute=true absolute value of t-score
         is returned
      */
      double score(const classifier::Target& target,
                   const utility::vector& value, double* dof) const;

      /**
         Calculates the weighted t-score, i.e. the ratio between
         difference in mean and standard deviation of this
         difference. \f$ t = \frac{ m_x - m_y }{
         s\sqrt{\frac{1}{n_x}+\frac{1}{n_y}}} \f$ where \f$ m \f$ is the
         weighted mean, n is the weighted version of number of data
         points \f$ \frac{\left(\sum w_i\right)^2}{\sum w_i^2} \f$, and
         \f$ s^2 \f$ is an estimation of the variance \f$ s^2 =
         \frac{ \sum_i w_i(x_i-m_x)^2 + \sum_i w_i(y_i-m_y)^2 }{ n_x + n_y - 2 } \f$.
         See AveragerWeighted for details.

         @param dof double pointer in which approximation of degrees of
         freedom is returned: pos.n()+neg.n()-2. See AveragerWeighted.

         @return t-score. If absolute=true absolute value of t-score
         is returned
      */
      double score(const classifier::Target& target,
                   const classifier::DataLookupWeighted1D& value,
                   double* dof=0) const;

      /**
      */
      double score(const classifier::Target& target,
                   const classifier::DataLookupWeighted1D& value);
                   const classifier::DataLookupWeighted1D& value) const;

      ///
      /// Calculates the weighted t-score, i.e. the ratio between
      /// difference in mean and standard deviation of this
      /// difference. \f$ t = \frac{ m_x - m_y }{
      /// \frac{s2}{n_x}+\frac{s2}{n_y}} \f$ where \f$ m \f$ is the
      /// weighted mean, n is the weighted version of number of data
      /// points and \f$ s2 \f$ is an estimation of the variance \f$ s^2
      /// = \frac{ \sum_i w_i(x_i-m_x)^2 + \sum_i w_i(y_i-m_y)^2 }{ n_x
      /// + n_y - 2 } \f$. See AveragerWeighted for details.
      ///
      /// @return t-score if absolute=true absolute value of t-score
      /// is returned
      ///
      /**
         Calculates the weighted t-score, i.e. the ratio between
         difference in mean and standard deviation of this
         difference. \f$ t = \frac{ m_x - m_y }{
         \frac{s2}{n_x}+\frac{s2}{n_y}} \f$ where \f$ m \f$ is the
         weighted mean, n is the weighted version of number of data
         points and \f$ s2 \f$ is an estimation of the variance \f$ s^2
         = \frac{ \sum_i w_i(x_i-m_x)^2 + \sum_i w_i(y_i-m_y)^2 }{ n_x
         + n_y - 2 } \f$. See AveragerWeighted for details.

         @return t-score if absolute=true absolute value of t-score
         is returned
      */
      double score(const classifier::Target& target,
                   const utility::vector& value,
                   const utility::vector& weight);
                   const utility::vector& weight) const;

      ///
      /// Calculates the p-value, i.e. the probability of observing a
      /// t-score equally or larger if the null hypothesis is true. If P
      /// is near zero, this casts doubt on this hypothesis. The null
      /// hypothesis is that the means of the two distributions are
      /// equal. Assumtions for this test is that the two distributions
      /// are normal distributions with equal variance. The latter
      /// assumtion is dropped in Welch's t-test.
      ///
      /// @return the one-sided p-value( if absolute=true is used
      /// the two-sided p-value)
      ///
      double p_value() const;

      /**
         Calculates the weighted t-score, i.e. the ratio between
         difference in mean and standard deviation of this
         difference. \f$ t = \frac{ m_x - m_y }{
         \frac{s2}{n_x}+\frac{s2}{n_y}} \f$ where \f$ m \f$ is the
         weighted mean, n is the weighted version of number of data
         points and \f$ s2 \f$ is an estimation of the variance \f$ s^2
         = \frac{ \sum_i w_i(x_i-m_x)^2 + \sum_i w_i(y_i-m_y)^2 }{ n_x
         + n_y - 2 } \f$. See AveragerWeighted for details.

         @param dof double pointer in which approximation of degrees of
         freedom is returned: pos.n()+neg.n()-2. See AveragerWeighted.

         @return t-score if absolute=true absolute value of t-score
         is returned
      */
      double score(const classifier::Target& target,
                   const utility::vector& value,
                   const utility::vector& weight,
                   double* dof=0) const;

    private:
      double t_;
      double dof_;

      template
      double score(const T& pos, const T& neg, double* dof) const
      {
        double diff = pos.mean() - neg.mean();
        if (dof)
          *dof=pos.n()+neg.n()-2;
        double s2=( (pos.sum_xx_centered()+neg.sum_xx_centered())/
                    (pos.n()+neg.n()-2));
        double t=diff/sqrt(s2/pos.n()+s2/(neg.n()));
        if (t<0 && absolute_)
          return -t;
        return t;
      }
    };
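The private `score(pos, neg, dof)` template at the end of the diff only needs three things from its arguments: `mean()`, `n()` and `sum_xx_centered()`. The following is a minimal standalone sketch of that arithmetic, not yat's code. `SimpleAverager`, `WeightedAverager`, `t_score` and `averager_of` are hypothetical names introduced for illustration, and the weighted accumulator follows the formulas quoted in the doc comments (effective n of `(sum w)^2 / sum w^2`), which may differ from what yat's `AveragerWeighted` actually does.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical unweighted accumulator (an assumption, not yat's Averager).
struct SimpleAverager {
  double n_ = 0.0;       // number of data points
  double sum_x_ = 0.0;   // sum of x_i
  double sum_xx_ = 0.0;  // sum of x_i^2

  void add(double x) { n_ += 1.0; sum_x_ += x; sum_xx_ += x * x; }
  double n() const { return n_; }
  double mean() const { return sum_x_ / n_; }
  // sum_i (x_i - m)^2, rewritten as sum x^2 - (sum x)^2 / n
  double sum_xx_centered() const { return sum_xx_ - sum_x_ * sum_x_ / n_; }
};

// Hypothetical weighted accumulator following the documented formulas.
struct WeightedAverager {
  double sum_w_ = 0.0;    // sum of w_i
  double sum_ww_ = 0.0;   // sum of w_i^2
  double sum_wx_ = 0.0;   // sum of w_i * x_i
  double sum_wxx_ = 0.0;  // sum of w_i * x_i^2

  void add(double x, double w) {
    sum_w_ += w; sum_ww_ += w * w; sum_wx_ += w * x; sum_wxx_ += w * x * x;
  }
  // Weighted version of the number of data points: (sum w)^2 / sum w^2
  double n() const { return sum_w_ * sum_w_ / sum_ww_; }
  double mean() const { return sum_wx_ / sum_w_; }
  // sum_i w_i (x_i - m)^2, rewritten as sum w x^2 - (sum w x)^2 / sum w
  double sum_xx_centered() const { return sum_wxx_ - sum_wx_ * sum_wx_ / sum_w_; }
};

// Same arithmetic as the private template in the changeset, with the
// absolute flag passed as a parameter instead of read from a member.
template <typename T>
double t_score(const T& pos, const T& neg, bool absolute, double* dof = nullptr) {
  const double diff = pos.mean() - neg.mean();
  if (dof) *dof = pos.n() + neg.n() - 2;
  // Pooled variance estimate s^2
  const double s2 = (pos.sum_xx_centered() + neg.sum_xx_centered()) /
                    (pos.n() + neg.n() - 2);
  const double t = diff / std::sqrt(s2 / pos.n() + s2 / neg.n());
  return (absolute && t < 0) ? -t : t;
}

// Hypothetical convenience helper: fill a SimpleAverager from a vector.
inline SimpleAverager averager_of(const std::vector<double>& values) {
  SimpleAverager a;
  for (std::size_t i = 0; i < values.size(); ++i) a.add(values[i]);
  return a;
}
```

For the samples {1, 2, 3} and {4, 5, 6}, the means are 2 and 5, each centered sum of squares is 2, so s^2 = 4/4 = 1 and t = -3 / sqrt(2/3), with 4 degrees of freedom. With `absolute` set, the sign is dropped. Giving every point a weight of 1 in `WeightedAverager` reduces it to the unweighted case.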