Colloquium by Dr. Jia Xu, Monday Nov. 16th 10 am-12 pm
Dr. Jia Xu will be giving a talk in the Linguistics Department on Monday, November 16th, from 10 am to 12 pm in Kerr Hall, Room 273.
Title: Better bootstraps, better accuracy: in theory, in practice, in translation.
Bagging (Breiman, 96) and its variants are among the most popular methods for aggregating classifiers and regressors. Its original analysis assumes that the bootstraps are built from an unlimited, independent source of samples. In the real world this analysis fails because there is a limited number of training samples. We analyze the effect of intersections between the bootstraps used to train different base predictors, and show that real-world bagging behaves very differently from its ideal analog (Breiman, 96). Most importantly, we provide an alternative subsampling method called design-bagging, based on a new construction of combinatorial designs. We prove that this is universally better than bagging. Our analytical results are backed up by experiments in general classification and regression settings, and the method significantly improved all machine translation systems we used in the NIST-15 C-E competition.
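
As a rough illustration of the intersections the abstract refers to, the sketch below draws ordinary bootstraps from a finite training set and measures how many distinct training indices each pair of bootstraps shares. It is not the design-bagging construction from the talk, which is not described here; the function names, the set size, and the seed are assumptions chosen for the example.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n indices uniformly with replacement from range(n)."""
    return [rng.randrange(n) for _ in range(n)]


def mean_pairwise_overlap(n, num_bootstraps, seed=0):
    """Average fraction of the n training indices shared by two bootstraps.

    Hypothetical helper for illustration only: it compares the sets of
    distinct indices in every pair of standard bootstraps.
    """
    rng = random.Random(seed)
    supports = [set(bootstrap_indices(n, rng)) for _ in range(num_bootstraps)]
    total_shared = 0
    num_pairs = 0
    for i in range(num_bootstraps):
        for j in range(i + 1, num_bootstraps):
            total_shared += len(supports[i] & supports[j])
            num_pairs += 1
    return total_shared / num_pairs / n


if __name__ == "__main__":
    # Assumed sizes: 1000 training samples, 10 bootstraps.
    print(mean_pairwise_overlap(1000, 10))
```

With a finite training set, each bootstrap covers roughly 63% of the distinct samples, so two independently drawn bootstraps should share around 40% of the training set. The base predictors are therefore trained on overlapping data, unlike in the idealized setting with an unlimited source of samples.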