This technology is a boosting algorithm for accelerated machine learning in the presence of misclassification noise.
Although boosting algorithms can reduce predictive error, they often perform poorly when the training data set contains errors or noise. The poor performance usually results from over-fitting the training data, because the later resampled training sets can over-emphasize examples that are noise. There is therefore a need for boosting procedures that maintain good predictive characteristics when applied to noisy data sets.
The martingale boosting algorithm combines simple predictors into more sophisticated aggregate predictors for automated learning systems. Learning proceeds in stages, and at each stage the algorithm segments the training examples into bins. It then chooses a separate base classifier for each bin, which supports noise-tolerant, probability-based prediction. The approach is relatively simple and easy to understand, significantly reduces predictive error, and achieves optimal accuracy despite noise in the data.
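The staged, bin-based procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the licensed implementation: the base learner is a hypothetical one-feature threshold rule (`fit_stump`), an example's bin is assumed to be the number of earlier base classifiers that labeled it positive, and the final label is assumed to be a majority vote over the stages. The sketch also leaves out any rebalancing or noise-specific handling, and all function names are illustrative.

```python
from collections import defaultdict


def stump_predict(stump, x):
    """Apply a stump (feature index, threshold, polarity) to one example."""
    j, thr, pol = stump
    if j is None:  # constant classifier: always predicts `pol`
        return pol
    return pol if x[j] > thr else 1 - pol


def fit_stump(X, y):
    """Hypothetical base learner: the threshold rule with the fewest errors."""
    majority = 1 if 2 * sum(y) >= len(y) else 0
    best = (None, 0.0, majority)
    best_errs = sum(1 for v in y if v != majority)
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (0, 1):
                cand = (j, thr, pol)
                errs = sum(1 for x, v in zip(X, y) if stump_predict(cand, x) != v)
                if errs < best_errs:
                    best, best_errs = cand, errs
    return best


def train(X, y, stages):
    """Fit one base classifier per (stage, bin) pair."""
    model = {}
    position = [0] * len(X)  # current bin of each training example
    for t in range(stages):
        bins = defaultdict(list)
        for idx, b in enumerate(position):
            bins[b].append(idx)
        for b, members in bins.items():
            stump = fit_stump([X[i] for i in members], [y[i] for i in members])
            model[(t, b)] = stump
            # Assumption: a positive prediction moves the example up one bin.
            for i in members:
                position[i] = b + stump_predict(stump, X[i])
    return model


def predict(model, x, stages):
    """Walk x through the stages, then take a majority vote (assumed rule)."""
    b = 0
    for t in range(stages):
        stump = model.get((t, b))
        if stump is not None:  # assumption: bins unseen in training count as 0
            b += stump_predict(stump, x)
    return 1 if 2 * b >= stages else 0


# Toy usage: one feature, labels 0/1, three stages.
X = [(0.0,), (1.0,), (2.0,), (3.0,)]
y = [0, 0, 1, 1]
model = train(X, y, stages=3)
```

Because each bin receives its own base classifier, examples that earlier stages treated differently are handled separately in later stages, rather than all examples sharing one reweighted training set.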
IR M06-005
Licensing Contact: Richard Nguyen