Text mining without document context

E SanJuan, F Ibekwe-SanJuan - Information Processing & Management, 2006 - Elsevier
… as classes henceforth. Of course, the GENIA ontology’s hierarchy, the number of classes and
… Hence, we focus on the two elementary operations: merges which is the union of disjoint

Concept decompositions for large sparse text data using clustering

IS Dhillon, DS Modha - Machine learning, 2001 - Springer
… a set of unstructured text documents as a vector space model… and sparse text data sets
such as CLASSIC3 and NSF into disjoint … indeed “discovers” the class structure underlying the …

Text clustering using frequent itemsets

W Zhang, T Yoshida, X Tang, Q Wang - Knowledge-Based Systems, 2010 - Elsevier
… Other processes in FIHC, such as making clusters disjoint and pruning, are … documents of
class i, n_j is the number of documents of cluster j, and n_ij is the number of documents of class i …

X-class: Associative classification of xml documents by structure

G Costa, R Ortale, E Ritacco - ACM Transactions on Information Systems …, 2013 - dl.acm.org
text processors, for instance, LibreOffice and Microsoft Word, are essentially XML files. So,
working with XML documents might be working with regular text-… into two disjoint subsets: the …

Unsupervised document classification using sequential information maximization

N Slonim, N Friedman, N Tishby - … of the 25th annual international ACM …, 2002 - dl.acm.org
… which are especially designed for text classification. Additionally, our un… disjoint classes
where each class is characterized through a distribution p(y|c_k), c_k ∈ C. Denoting the class

Text and non-text separation in offline document images: a survey

S Bhowmik, R Sarkar, M Nasipuri… - … Journal on Document …, 2018 - Springer
disjoint makes them simpler to process from text and non-… for text/non-text separation for
general offline document … are grouped according to the class of documents they will most likely …

Coalition game based feature selection for text non-text separation in handwritten documents using LBP based features

M Ghosh, KK Ghosh, S Bhowmik, R Sarkar - Multimedia Tools and …, 2021 - Springer
… In our daily life, we come across many handwritten documents in the form of class notes,
handwritten reports and … CCs are basically disjoint segments. To extract CCs from the binarized …

Classification and clustering of arxiv documents, sections, and abstracts, comparing encodings of natural and mathematical language

P Scharpf, M Schubotz, A Youssef, F Hamborg… - Proceedings of the …, 2020 - dl.acm.org
… Using the selection of 4900 documents, 3500 sections, and 1400 abstracts from the arXiv,
we compared the influence of text and formulae on the performance of a subject class [’math’, ’…

Text document clustering based on frequent word meaning sequences

Y Li, SM Chung, JD Holt - Data & Knowledge Engineering, 2008 - Elsevier
text documents. For experiments, we used the Reuters-21578 text collection, CISI documents
… They are different in terms of document size, cluster size, number of classes, dimension of …

Metric learning for text documents

G Lebanon - IEEE Transactions on Pattern Analysis and …, 2006 - ieeexplore.ieee.org
… terms are similar for the two methods while the top scored terms are completely disjoint. …
… differentiable functions or as equivalence classes of curves having the same velocity vectors …