Creat membership Creat membership
Sign in

Forgot password?

Confirm
  • Forgot password?
    Sign Up
  • Confirm
    Sign In
Creat membership Creat membership
Sign in

Forgot password?

Confirm
  • Forgot password?
    Sign Up
  • Confirm
    Sign In
Collection
For ¥0.57 per day, unlimited downloads CREATE MEMBERSHIP Download

toTop

If you have any feedback, Please follow the official account to submit feedback.

Turn on your phone and scan

home > search >

A NEW FEATURE SELECTION METHOD FOR TEXT CLASSIFICATION

Author:
UCHYIGIT, GULDEN   CLARK, KEITH  


Journal:
International Journal of Pattern Recognition and Artificial Intelligence


Issue Date:
2007


Abstract(summary):

Text classification is the problem of classifying a set of documents into a pre-defined set of classes. A major problem with text classification problems is the high dimensionality of the feature space. Only a small subset of these words are feature words which can be used in determining a document's class, while the rest adds noise and can make the results unreliable and significantly increase computational time. A common approach in dealing with this problem is feature selection where the number of words in the feature space are significantly reduced. In this paper we present the experiments of a comparative study of feature selection methods used for text classification. Ten feature selection methods were evaluated in this study including the new feature selection method, called the GU metric. The other feature selection methods evaluated in this study are: Chi-Squared (x(2)) statistic, NGL coefficient, GSS coefficient, Mutual Information, Information Gain, Odds Ratio, Term Frequency, Fisher Criterion, BSS/WSS coefficient. The experimental evaluations show that the GU metric obtained the best F(1) and F(2) scores. The experiments were performed on the 20 Newsgroups data sets with the Naive Bayesian Probabilistic Classifier.


Page:
423-438


VIEW PDF

The preview is over

If you wish to continue, please create your membership or download this.

Create Membership

Similar Literature

Submit Feedback

This function is a member function, members do not limit the number of downloads