additional data for various machine learning tasks. In this paper, we propose a method of
improving text classification accuracy by using such an additional corpus, one that can easily be
obtained from the web. This additional corpus can be unlabeled and independent of the
given classification task. The proposed method uses topic modeling to extract a set of
topics from the additional corpus. These extracted topics then act as additional features of …