Enhance AdaBoost Algorithm by Integrating LDA Topic Model


AdaBoost is an ensemble method that is considered one of the most influential algorithms for multi-label classification. It has been applied successfully in diverse domains because of its simplicity and accurate predictions. To choose the weak hypotheses, however, AdaBoost has to examine every feature individually, which dramatically increases the computational time of classification, especially for large-scale datasets. To tackle this problem, we introduce the Latent Dirichlet Allocation (LDA) model to improve the efficiency and effectiveness of AdaBoost by mapping the word matrix into a topic matrix. In this paper, we propose a framework that integrates LDA and AdaBoost, and we test it on two Chinese-language corpora. Experiments show that our method outperforms traditional AdaBoost using the bag-of-words (BOW) model.
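
The word-matrix to topic-matrix mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, a hypothetical six-document English toy corpus in place of the Chinese corpora (which would first need word segmentation), and a single-label binary task rather than the multi-label setting. The number of topics and the number of boosting rounds are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import AdaBoostClassifier

# Hypothetical toy corpus (assumption, not data from the paper).
# Labels: 0 = sports, 1 = finance.
docs = [
    "team wins match goal player",
    "player scores goal in final match",
    "coach team training season",
    "stock market price rises bank",
    "bank interest rate market",
    "investor stock fund price",
]
labels = [0, 0, 0, 1, 1, 1]

# Step 1: bag-of-words representation, one column per vocabulary word.
bow = CountVectorizer()
word_matrix = bow.fit_transform(docs)

# Step 2: LDA maps each document to a distribution over a small number
# of topics, so the feature count drops from |vocabulary| to n_components.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_matrix = lda.fit_transform(word_matrix)

# Step 3: AdaBoost (decision stumps by default) now searches only the
# topic features when it picks a weak hypothesis in each round.
clf = AdaBoostClassifier(n_estimators=20, random_state=0)
clf.fit(topic_matrix, labels)
predictions = clf.predict(topic_matrix)

print("word matrix shape: ", word_matrix.shape)
print("topic matrix shape:", topic_matrix.shape)
```

In this sketch each boosting round scans two topic columns instead of the full vocabulary, which is the source of the efficiency gain the abstract refers to. Whether accuracy also improves depends on the corpus and on the number of topics, so both would need to be tuned on held-out data.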

Data Mining and Big Data, First International Conference, DMBD 2016, Bali, Indonesia, June 25-30, 2016. Proceedings
Fangyu Gai
A Ph.D. student at School of Engineering, UBC