Opened 15 years ago

Closed 15 years ago

## #56 closed request (fixed)

# SVM convergence problem

Reported by: | Peter | Owned by: | Peter |
---|---|---|---|

Priority: | trivial | Milestone: | yat 0.2 |

Component: | documentation | Version: | |

Keywords: | Cc: |

### Description

In maximal margin case ( C=infinity ) SVM sometimes do keep on training until maximal number of epochs is reached. The problem is most likely that the training problem is not linear separable (in feature space), hence the KKT conditions are never fulfilled.

The same problem arises with finite C. With finite C the only change is that the kernel is modified by a term 1/C in the diagonal. This new kernel corresponds to positions of data points in feature space that are slightly changed compared to the original position. If these new positions are not linearly separable the same problem as in paragraph above should arise.

### Change History (3)

### comment:1 Changed 15 years ago by

Component: | classifier → documentation |
---|---|

Milestone: | SVM extension → later |

Priority: | major → trivial |

Status: | new → assigned |

Type: | defect → task |

### comment:2 Changed 15 years ago by

Milestone: | later → 0.2 |
---|

### comment:3 Changed 15 years ago by

Resolution: | → fixed |
---|---|

Status: | assigned → closed |

**Note:**See TracTickets for help on using tickets.

Well, if I remember correctly, not linearly separable problems corresponds to a Hamiltonian ( H_ij = t_i K_ij t_j ) with zero eigenvalue with eigenvector in a feasible direction, i.e. x_i >= 0 and sum x_i*t_i = 0. In this direction the contribution from quadratic term is zero and the linear term wins, and thus the object function (we are minimising) goes to -Inf. Therefore the minimum is unreachable with SMO because we are only updating two variables in each epoch and consequently zigzagging outwards infinity.

In the case of a finite C this problem in theory does not arise because the contribution (1/C) from the C makes the Hamiltonian positive definite. However, if C is large this only moves the minimum from infinity to very far away and in practice we have the same problem.

CONCLUSION: This is not an implementation issue, but rather a documentation issue.