Comparative study between traditional machine learning and deep learning approaches for text classification

CN Kamath, SS Bukhari, A Dengel - Proceedings of the ACM …, 2018 - dl.acm.org
Proceedings of the ACM Symposium on Document Engineering 2018, 2018 - dl.acm.org
Any organization that works with documents must eventually face the tedious task of classifying large volumes of them, which is an early step toward information retrieval and data mining. Classifying such large document collections into multiple classes, however, demands considerable time and labor. A system that could classify these documents with acceptable accuracy would therefore be of great help in document engineering. We have created multiple classifiers for document classification and compared their accuracy on raw and processed data. We collected data used in a corporate organization as well as publicly available data for comparison. The data is processed by removing stop words and applying stemming to reduce words to their roots. Several traditional machine learning techniques, namely Naive Bayes, Logistic Regression, Support Vector Machine, Random Forest classifier, and Multi-Layer Perceptron, are used to classify the documents. The classifiers are applied to raw and processed data separately, and their accuracy is recorded. In addition, a deep learning technique, a Convolutional Neural Network, is used to classify the data, and its accuracy is compared with that of the traditional machine learning techniques. We are also exploring hierarchical classifiers for classifying classes and subclasses. The system classifies the data faster and with better accuracy than manual classification. The results are discussed in the results and evaluation section.
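The raw-versus-processed comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the stop-word list, the crude suffix stripper `crude_stem` (a stand-in for a real stemmer such as Porter's), the hand-written `TRAIN` and `TEST` sentences, and the small multinomial Naive Bayes class. The paper's own implementation, data, and other classifiers are not reproduced here.

```python
import math
import re
from collections import Counter, defaultdict

# Assumed, illustrative stop-word list (the abstract does not specify one).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "for", "by"}


def tokenize(text):
    """Raw features: lowercase alphabetic tokens, nothing removed."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Toy suffix stripper; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        long_enough = len(token) - len(suffix) >= 3
        if token.endswith(suffix) and not token.endswith("ss") and long_enough:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Processed features: stop words removed, remaining tokens stemmed."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for t in tokens:
                if t in self.vocab:  # words unseen in training are ignored
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hand-written toy data, purely illustrative.
TRAIN = [
    ("Invoice for the payment of services", "finance"),
    ("Payment reminder: the invoice is overdue", "finance"),
    ("The contract is signed by both parties", "legal"),
    ("Parties agreed to the contract terms", "legal"),
]
TEST = [
    ("Invoices and payments pending", "finance"),
    ("Signing of contracts", "legal"),
]


def evaluate(feature_fn):
    """Train on TRAIN and return accuracy on TEST for one feature function."""
    model = NaiveBayes().fit(
        [feature_fn(text) for text, _ in TRAIN],
        [label for _, label in TRAIN],
    )
    hits = sum(model.predict(feature_fn(text)) == label for text, label in TEST)
    return hits / len(TEST)


if __name__ == "__main__":
    print("raw accuracy:      ", evaluate(tokenize))
    print("processed accuracy:", evaluate(preprocess))
```

Swapping `tokenize` for `preprocess` is the only difference between the two runs, which mirrors the abstract's setup of applying each classifier to raw and processed data separately. The toy sentences are far too few to say anything about real accuracy; they only show how inflected forms such as "invoices" and "signing" fail to match training vocabulary until they are stemmed.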