This paper proposes an effective scoring scheme for feature selection in text mining that exploits characteristics of the small-world phenomenon in the semantic networks of documents. Our focus is on preserving both the syntactic and the statistical information of words, rather than relying solely on simple frequency summaries, as prevailing scoring schemes such as TF-IDF do. Experimental results on a TREC dataset show that our scoring scheme outperforms the prevailing schemes.
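For orientation, the frequency-based baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of one common TFIDF variant (relative term frequency times the natural log of inverse document frequency), not the scoring scheme proposed in the paper; the function name `tfidf_scores` and the three toy documents are assumptions made up for this example.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one {term: score} dict per tokenized document.

    Hypothetical helper for illustration: tf is the relative frequency
    of a term in its document, idf is log(N / document frequency).
    """
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    scores = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores


# Toy corpus (invented for this sketch, not taken from TREC).
docs = [
    "small world networks have short paths".split(),
    "semantic networks link related words".split(),
    "feature selection ranks words by score".split(),
]
scores = tfidf_scores(docs)
```

Because the score depends only on counts, a word's position and its relations to neighbouring words play no role, which is the limitation the abstract points to when it speaks of "simple frequency summarization".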