Changeset 692 for trunk/yat


Timestamp:
Oct 17, 2006, 2:13:32 PM
Author:
Peter
Message:

Fixes #56 and some doxygen issues for SVM

File:
1 edited

Legend:

In the diff below, lines beginning with '-' show the r680 version, lines beginning with '+' show the r692 version, and lines beginning with a space are unchanged context.
  • trunk/yat/classifier/SVM.h

--- trunk/yat/classifier/SVM.h (r680)
+++ trunk/yat/classifier/SVM.h (r692)
@@ -53,11 +53,11 @@
 
   public:
-    ///
-    /// Constructor taking the kernel and the target vector as
-    /// input.
+    ///
+    /// Constructor taking the kernel and the target vector as
+    /// input.
     ///
     /// @note if the @a target or @a kernel
-    /// is destroyed the behaviour is undefined.
-    ///
+    /// is destroyed the behaviour is undefined.
+    ///
     SVM(const KernelLookup& kernel, const Target& target);
 
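
For orientation, a minimal usage sketch of the constructor documented in the hunk above. This is not part of the changeset; the kernel (a KernelLookup) and target (a Target) objects are assumed to have been built elsewhere.

    // hypothetical usage -- kernel and target are assumed to exist already
    using namespace theplu::yat;
    classifier::SVM svm(kernel, target);
    // per the @note above, kernel and target must outlive svm; destroying
    // either of them leaves the SVM with dangling references.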
     
@@ -73,10 +73,10 @@
     make_classifier(const DataLookup2D&, const Target&) const;
 
-    ///
-    /// @return \f$ \alpha \f$
-    ///
+    ///
+    /// @return \f$ \alpha \f$
+    ///
     inline const utility::vector& alpha(void) const { return alpha_; }
 
-    ///
+    ///
     /// The C-parameter is the balance term (see train()). A very
     /// large C means the training will be focused on getting samples
     
@@ -91,8 +91,8 @@
     ///
     /// @returns mean of vector \f$ C_i \f$
-    ///
+    ///
     inline double C(void) const { return 1/C_inverse_; }
 
-    ///
+    ///
     /// Default is max_epochs set to 10,000,000.
     ///
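
Continuing the hypothetical sketch from above, the two accessors documented in this hunk can be read back directly; since C() returns 1/C_inverse_, a stored inverse of 0.01 corresponds to C() == 100.

    double balance = svm.C();            // mean of the C_i vector, i.e. 1/C_inverse_
    long int epochs = svm.max_epochs();  // 10,000,000 unless changed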
     
@@ -101,27 +101,27 @@
     inline long int max_epochs(void) const {return max_epochs_;}
 
-    ///
-    /// The output is calculated as \f$ o_i = \sum \alpha_j t_j K_{ij}
-    /// + bias \f$, where \f$ t \f$ is the target.
-    ///
-    /// @return output
-    ///
+    /**
+       The output is calculated as \f$ o_i = \sum \alpha_j t_j K_{ij}
+       + bias \f$, where \f$ t \f$ is the target.
+
+       @return output
+    */
     inline const theplu::yat::utility::vector&
     output(void) const { return output_; }
 
-    ///
-    /// Generate prediction @a predict from @a input. The prediction
-    /// is calculated as the output times the margin, i.e., geometric
-    /// distance from decision hyperplane: \f$ \frac{ \sum \alpha_j
-    /// t_j K_{ij} + bias}{w} \f$ The output has 2 rows. The first row
-    /// is for binary target true, and the second is for binary target
-    /// false. The second row is superfluous as it is the first row
-    /// negated. It exist just to be aligned with multi-class
-    /// SupervisedClassifiers. Each column in @a input and @a output
-    /// corresponds to a sample to predict. Each row in @a input
-    /// corresponds to a training sample, and more exactly row i in @a
-    /// input should correspond to row i in KernelLookup that was used
-    /// for training.
-    ///
+    /**
+       Generate prediction @a predict from @a input. The prediction
+       is calculated as the output times the margin, i.e., the geometric
+       distance from the decision hyperplane: \f$ \frac{ \sum \alpha_j
+       t_j K_{ij} + bias}{w} \f$. The output has 2 rows. The first row
+       is for binary target true, and the second is for binary target
+       false. The second row is superfluous as it is the first row
+       negated. It exists just to be aligned with multi-class
+       SupervisedClassifiers. Each column in @a input and @a output
+       corresponds to a sample to predict. Each row in @a input
+       corresponds to a training sample; more exactly, row i in @a
+       input should correspond to row i in the KernelLookup that was
+       used for training.
+    */
     void predict(const DataLookup2D& input, utility::matrix& predict) const;
 
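
A hedged sketch of the predict() call documented above, assuming the svm object from the earlier sketch has been trained. Here validation is a hypothetical DataLookup2D (for example another KernelLookup) whose row i corresponds to row i of the training KernelLookup, with one column per sample to predict.

    utility::matrix result;
    svm.predict(validation, result);
    // result gets 2 rows and one column per sample: row 0 holds the
    // prediction for binary target true, row 1 is simply row 0 negated and
    // is kept only for alignment with multi-class SupervisedClassifiers.
    double score = result(0,0);   // first sample, output scaled by the margin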
     
@@ -152,7 +152,29 @@
        Training the SVM following Platt's SMO, with Keerthi's
        modification. Minimizing \f$ \frac{1}{2}\sum
-       y_iy_j\alpha_i\alpha_j(K_{ij}+\frac{1}{C_i}\delta_{ij}) \f$ ,
-       which corresponds to minimizing \f$ \sum w_i^2+\sum C_i\xi_i^2
-       \f$.
+       y_iy_j\alpha_i\alpha_j(K_{ij}+\frac{1}{C_i}\delta_{ij}) - \sum
+       \alpha_i \f$, which corresponds to minimizing \f$ \sum
+       w_i^2+\sum C_i\xi_i^2 \f$.
+
+       @note If the training problem is not linearly separable and C
+       is set to infinity, the minimum is located at infinity, and
+       thus the minimum will not be reached within the maximal
+       number of epochs. More exactly, when the problem is not
+       linearly separable, there exists an eigenvector of \f$
+       H_{ij}=y_iy_jK_{ij} \f$ within the space defined by the
+       conditions \f$ \alpha_i>0 \f$ and \f$ \sum \alpha_i y_i = 0
+       \f$. As the eigenvalue is zero in this direction, the quadratic
+       term does not contribute to the objective; the objective
+       consists only of the linear term and hence there is no
+       minimum. This problem only occurs when \f$ C \f$ is set to
+       infinity, because for a finite \f$ C \f$ all eigenvalues are
+       finite. However, for a large \f$ C \f$ (when the training
+       problem is not linearly separable) there exists an eigenvector
+       corresponding to a small eigenvalue, so the minimum has moved
+       from infinity to "very far away". In practice this will also
+       mean that the minimum is not reached within the maximal
+       number of epochs, and the value of \f$ C \f$ should be
+       decreased.
+
+       @return true if successful
     */
     bool train();
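
For reference, the dual problem described in the hunk above can be collected into a single display. The constraints below are the standard ones for this SMO formulation (they are also the conditions referred to in the @note); take them as an assumption rather than as part of the changeset:

    \min_{\alpha}\; \frac{1}{2}\sum_{i,j} y_i y_j \alpha_i \alpha_j
      \left( K_{ij} + \frac{1}{C_i}\,\delta_{ij} \right) - \sum_i \alpha_i
    \qquad\text{subject to}\qquad \alpha_i \ge 0, \quad \sum_i \alpha_i y_i = 0.

A hedged usage sketch of train() together with the accessors already shown, again reusing the hypothetical svm object from the earlier sketches:

    if (svm.train()) {                          // true if training was successful
      const utility::vector& a = svm.alpha();   // multipliers solving the problem above
      const utility::vector& o = svm.output();  // o_i = sum_j a_j t_j K_ij + bias
    }
    // A false return presumably means the maximal number of epochs was hit;
    // the @note above suggests decreasing C in that case.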
     
@@ -161,6 +183,6 @@
 
   private:
-    ///
-    /// Copy constructor. (not implemented)
+    ///
+    /// Copy constructor. (not implemented)
     ///
     SVM(const SVM&);
     
@@ -184,13 +206,13 @@
 
     ///
-    ///   Private function choosing which two elements that should be
-    ///   updated. First checking for the biggest violation (output - target =
-    ///   0) among support vectors (alpha!=0). If no violation was found check
-    ///   sequentially among the other samples. If no violation there as
-    ///   well training is completed
+    ///   Private function choosing which two elements should be
+    ///   updated. First check for the biggest violation of the condition
+    ///   output - target = 0 among support vectors (alpha!=0). If no
+    ///   violation is found, check sequentially among the other samples.
+    ///   If there is no violation there either, training is completed.
     ///
     ///  @return true if a pair of samples that violate the conditions
     ///  can be found
-    ///
+    ///
     bool choose(const theplu::yat::utility::vector&);
 
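
The selection heuristic described for choose() can be illustrated with a small standalone sketch in plain C++. This is an assumed illustration, not the yat implementation, and it only shows how the first element of the pair would be scanned for:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Return the index of the worst violator, or -1 when no sample violates
    // output - target = 0 beyond the tolerance, i.e. training is completed.
    long pick_violator(const std::vector<double>& alpha,
                       const std::vector<double>& output,
                       const std::vector<double>& target,
                       double tolerance)
    {
      long best = -1;
      double worst = tolerance;
      // first look among the support vectors (alpha != 0) for the biggest violation
      for (std::size_t i = 0; i < alpha.size(); ++i)
        if (alpha[i] != 0 && std::abs(output[i] - target[i]) > worst) {
          worst = std::abs(output[i] - target[i]);
          best = static_cast<long>(i);
        }
      if (best != -1)
        return best;
      // otherwise scan the remaining samples sequentially
      for (std::size_t i = 0; i < alpha.size(); ++i)
        if (alpha[i] == 0 && std::abs(output[i] - target[i]) > tolerance)
          return static_cast<long>(i);
      return -1;   // no violation anywhere: training is completed
    }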