# Changeset 1184

Timestamp:
Feb 28, 2008, 7:49:57 PM (14 years ago)
Message:

refs #335 predict fixed for NBC

Location:
trunk/yat/classifier
Files:
2 edited

    --- r1182
    +++ r1184
     Each class is modelled as a multinormal distribution with
    -features being independent: \f$ p(x|c) = \prod
    +features being independent: \f$ P(x|c) \propto \prod
     \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp \left(
    -\frac{(x_i-m_i)^2}{2\sigma_i^2)} \right)\f$
    +-\frac{(x_i-\mu_i)^2}{2\sigma_i^2)} \right)\f$
     */
     class NBC : public SupervisedClassifier
     …
    -///
    -/// Train the classifier using training data and targets.
    +/// \brief Train the %classifier using training data and targets.
     ///
     /// For each class mean and variance are estimated for each
    -/// feature (see Averager and AveragerWeighted for details).
    +/// feature (see statistics::Averager for details).
     ///
    -/// If variance can not be estimated (only one valid data point)
    -/// for a feature and label, then that feature is ignored for that
    -/// specific label.
    +/// If there is only one (or zero) samples in a class, parameters
    +/// cannot be estimated. In that case, parameters are set to NaN
    +/// for that particular class.
     ///
     void train(const MatrixLookup&, const Target&);

    -///
    -/// Train the classifier using weighted training data and targets.
    +/// \brief Train the %classifier using weighted training data and
    +/// targets.
     ///
    +/// For each class mean and variance are estimated for each
    +/// feature (see statistics::AveragerWeighted for details).
    +///
    +/// To estimate the parameters of a class, each feature of the
    +/// class must have at least two non-zero data points. Otherwise
    +/// the parameters are set to NaN and any prediction will result
    +/// in NaN for that particular class.
    +///
     void train(const MatrixLookupWeighted&, const Target&);

     /**
     \brief Predict samples using unweighted data

     Each sample (column) in \a data is predicted and predictions
    -are returned in the corresponding column in passed \a res. Each
    -row in \a res corresponds to a class. The prediction is the
    -estimated probability that sample belong to class \f$ j \f$
    +are returned in the corresponding column in passed \a result. Each
    +row in \a result corresponds to a class. The prediction is the
    +estimated probability that sample belong to class \f$ j \f$:

    -\f$ P_j = \frac{1}{Z}\prod_i{\frac{1}{\sqrt{2\pi\sigma_i^2}}}
    -\exp(\frac{(x_i-\mu_i)^2}{\sigma_i^2})\f$, where \f$ \mu_i
    +\f$ P_j = \frac{1}{Z}\prod_i\frac{1}{\sqrt{2\pi\sigma_i^2}}
    +\exp\left(-\frac{(x_i-\mu_i)^2}{2\sigma_i^2}\right)\f$, where \f$ \mu_i
     \f$ and \f$ \sigma_i^2 \f$ are the estimated mean and variance,
    -respectively. If a \f$ \sigma_i \f$ could not be estimated during
    -training, corresponding factor is set to unity, in other words,
    -that feature is ignored for the prediction of that particular
    -class. Z is chosen such that total probability, \f$ \sum P_j \f$,
    -equals unity.
    +respectively. Z is chosen such that total probability equals unity,
    +\f$ \sum P_j = 1 \f$.
    +
    +\note If parameters could not be estimated during training, due
    +to lack of number of sufficient data points, the output for that
    +class is NaN and not included in calculation of normalization
    +factor \f$ Z \f$.
     */
    -void predict(const MatrixLookup& data, utility::Matrix& res) const;
    +void predict(const MatrixLookup& data, utility::Matrix& result) const;

     /**
     \brief Predict samples using weighted data

     Each sample (column) in \a data is predicted and predictions
    -are returned in the corresponding column in passed \a res. Each
    -row in \a res corresponds to a class. The prediction is the
    -estimated probability that sample belong to class \f$ j \f$
    +are returned in the corresponding column in passed \a result. Each
    +row in \a result corresponds to a class. The prediction is the
    +estimated probability that sample belong to class \f$ j \f$:

    -\f$ P_j = \frac{1}{Z}\prod_i{\frac{1}{\sqrt{2\pi\sigma_i^2}}}
    -\exp(\frac{\sum{w_i(x_i-\mu_i)^2}{\sigma_i^2}}{\sum w_i})\f$,
    -where \f$ \mu_i \f$ and \f$ \sigma_i^2 \f$ are the estimated
    -mean and variance, respectively. If a \f$ \sigma_i \f$ could not
    -be estimated during training, corresponding factor is set to
    -unity, in other words, that feature is ignored for the prediction
    -of that particular class. Z is chosen such that total probability,
    -\f$ \sum P_j \f$, equals unity.
    +\f$ P_j = \frac{1}{Z} \exp\left(-N\frac{\sum
    +{w_i(x_i-\mu_i)^2}/(2\sigma_i^2)}{\sum w_i}\right)\f$,
    +where \f$ \mu_i \f$ and \f$ \sigma_i^2 \f$ are the estimated
    +mean and variance, respectively. Z is chosen such that
    +total probability equals unity, \f$ \sum P_j = 1 \f$.
    +
    +\note If parameters could not be estimated during training, due
    +to lack of number of sufficient data points, the output for that
    +class is NaN and not included in calculation of normalization
    +factor \f$ Z \f$.
     */
    -void predict(const MatrixLookupWeighted& data, utility::Matrix& res) const;
    +void predict(const MatrixLookupWeighted& data,utility::Matrix& result) const;
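For readers who want to see the documented unweighted behaviour in isolation, the following is a minimal standalone sketch. It is not the yat implementation. `ClassParams`, `estimate`, `likelihood` and `predict_sample` are hypothetical names, plain `std::vector` stands in for `MatrixLookup` and `utility::Matrix`, and the unbiased (n-1) variance estimator is an assumption. It only illustrates the r1184 doc comments: a class with fewer than two samples gets NaN parameters, its prediction is NaN, and it is left out of the normalization factor Z.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the per-class parameters set by train();
// this is not the data layout used by yat's NBC.
struct ClassParams {
  std::vector<double> mean;      // mu_i, one entry per feature
  std::vector<double> variance;  // sigma_i^2, one entry per feature
};

// Estimate mean and variance per feature from the samples of one class.
// Assumption: unbiased (n-1) variance. With fewer than two samples the
// parameters cannot be estimated and stay NaN.
ClassParams estimate(const std::vector<std::vector<double>>& samples,
                     std::size_t n_features) {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  ClassParams p{std::vector<double>(n_features, nan),
                std::vector<double>(n_features, nan)};
  const std::size_t n = samples.size();
  if (n < 2) return p;
  for (std::size_t i = 0; i < n_features; ++i) {
    double sum = 0.0;
    for (const auto& s : samples) {
      assert(s.size() == n_features);
      sum += s[i];
    }
    const double m = sum / static_cast<double>(n);
    double ss = 0.0;
    for (const auto& s : samples) ss += (s[i] - m) * (s[i] - m);
    p.mean[i] = m;
    p.variance[i] = ss / static_cast<double>(n - 1);
  }
  return p;
}

// Unnormalized product over features of
//   1/sqrt(2 pi sigma_i^2) * exp(-(x_i - mu_i)^2 / (2 sigma_i^2)),
// accumulated in log space. NaN parameters propagate to a NaN result.
double likelihood(const ClassParams& p, const std::vector<double>& x) {
  assert(x.size() == p.mean.size());
  const double two_pi = 2.0 * std::acos(-1.0);
  double log_p = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double d = x[i] - p.mean[i];
    log_p += -0.5 * std::log(two_pi * p.variance[i])
             - d * d / (2.0 * p.variance[i]);
  }
  return std::exp(log_p);
}

// Predict one sample: one probability per class, normalized so that the
// non-NaN entries sum to one. NaN classes are skipped when computing Z.
std::vector<double> predict_sample(const std::vector<ClassParams>& classes,
                                   const std::vector<double>& x) {
  std::vector<double> result(classes.size());
  double z = 0.0;
  for (std::size_t j = 0; j < classes.size(); ++j) {
    result[j] = likelihood(classes[j], x);
    if (!std::isnan(result[j])) z += result[j];
  }
  for (double& r : result) r /= z;  // NaN / z stays NaN
  return result;
}
```

With two well-populated classes and one single-sample class, the first two predictions sum to one and the third is NaN. The sketch does not guard against zero variance or against every likelihood underflowing to zero; both cases would need separate handling.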