We develop a fast online kernel algorithm for classification that can be viewed as an improvement over the one suggested by Crammer, Kandola & Singer, 2003, titled "Online Classification on a Budget". In that previous work, the authors introduced an on-the-fly compression of the number of examples used in the prediction function, using the size of the margin as a quality measure. Although their algorithm displays impressive results on relatively noise-free data, we show that it is susceptible to noise. By using a new quality measure for an included example, namely the error induced on a selected subset of the training data, we gain improved compression rates and insensitivity to noise over the existing approach. Our method also extends to the batch training setting.
Online (and Offline) on an Even Tighter Budget
Jason Weston, Antoine Bordes and Leon Bottou
Source code (Matlab objects): The following objects are for use with the Spider Matlab Machine Learning Library.
- Budget Perceptron
- Tighter Budget Perceptron (no secondary cache)
- Tighter Budget Perceptron (with secondary cache)
Note that this code is written in Matlab and is therefore far from optimal in performance. Nevertheless, a timing comparison is given here. See the paper for a discussion of the complexity of the given algorithms.
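To illustrate the idea described in the abstract, here is a minimal sketch of a budget kernel perceptron where, once the budget is exceeded, the support vector removed is the one whose deletion induces the fewest errors on a small cache of recent examples. This is written in Python rather than Matlab, and all names (`train_budget_perceptron`, the RBF kernel choice, the cache policy) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    # Gaussian RBF kernel (an assumed choice for this sketch)
    return np.exp(-gamma * np.sum((x - y) ** 2))

def predict(x, sv_x, sv_y, alpha, gamma=1.0):
    # Kernel expansion f(x) = sum_i alpha_i * y_i * k(x_i, x)
    if not sv_x:
        return 0.0
    return sum(a * y * rbf(s, x, gamma) for s, y, a in zip(sv_x, sv_y, alpha))

def train_budget_perceptron(X, Y, budget=20, gamma=1.0, cache_size=30):
    sv_x, sv_y, alpha = [], [], []
    cache = []  # recent examples used to score removal candidates
    for x, y in zip(X, Y):
        if y * predict(x, sv_x, sv_y, alpha, gamma) <= 0:  # mistake: add SV
            sv_x.append(x); sv_y.append(y); alpha.append(1.0)
            if len(sv_x) > budget:
                # Remove the SV whose deletion causes the fewest cache errors
                best_i, best_err = 0, float('inf')
                for i in range(len(sv_x)):
                    keep = [j for j in range(len(sv_x)) if j != i]
                    err = sum(
                        cy * predict(cx,
                                     [sv_x[j] for j in keep],
                                     [sv_y[j] for j in keep],
                                     [alpha[j] for j in keep], gamma) <= 0
                        for cx, cy in cache)
                    if err < best_err:
                        best_err, best_i = err, i
                del sv_x[best_i], sv_y[best_i], alpha[best_i]
        cache.append((x, y))
        cache = cache[-cache_size:]  # keep only the most recent examples
    return sv_x, sv_y, alpha
```

The removal loop is the expensive step; caching kernel values, as discussed in the paper for the secondary-cache variant, is one way to reduce its cost.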