Text Categorization (TC) is a task of classifying a set of documents into one or more predefined categories. Centroid-based method, a very popular TC method, aims to make classifiers simple and efficient by constructing one prototype vector for each class. It classifies a document into the class that owns the prototype vector nearest to the document. Many studies have been done on constructing prototype vectors. However, the basic philosophies of these methods are quite different from each other
2021-02-09 18:05:51 651KB text categorization; centroid-based methods;
1