Categorization and novelty mining of chronologically ordered documents is an important data mining problem. This paper covers the entire process of Chinese novelty mining, from preprocessing and categorization to the detection of novel information, a process that has rarely been studied. First, we discuss and compare preprocessing techniques for detecting novel Chinese text. Next, we compare categorization and novelty mining performance on English and Chinese sentences, and examine how novelty mining performance depends on the retrieval results. We also propose four new evaluation measures for novelty mining: Novelty-Precision, Novelty-Recall, Novelty-F Score, and Sensitivity, the last of which measures how sensitive the novelty mining system is to incorrectly classified sentences. The results indicate that sentence-level novelty mining performs similarly on Chinese and English when the sentences are perfectly categorized. The new measures allow a fairer assessment of how retrieval results influence novelty mining performance.
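To make the sentence-level setting concrete, the following is a minimal sketch of one common way to flag novel sentences in a chronologically ordered stream and to score the result. It rests on several assumptions not stated in this abstract: sentences arrive already segmented into tokens (for Chinese, the output of a word segmenter), similarity is plain bag-of-words cosine, and the threshold of 0.5 is arbitrary. The function names `cosine`, `detect_novel`, and `precision_recall_f` are hypothetical. The scoring function computes ordinary precision, recall, and F score over the novel class; it is not the paper's Novelty-Precision, Novelty-Recall, Novelty-F Score, or Sensitivity, whose definitions are not given here.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_novel(sentences: list[list[str]], threshold: float = 0.5) -> list[bool]:
    """Flag a sentence as novel when its highest similarity to any
    earlier sentence is below the threshold (an assumed value)."""
    history: list[Counter] = []
    flags: list[bool] = []
    for tokens in sentences:
        vec = Counter(tokens)
        max_sim = max((cosine(vec, h) for h in history), default=0.0)
        flags.append(max_sim < threshold)
        history.append(vec)  # every sentence joins the history, novel or not
    return flags


def precision_recall_f(predicted: list[bool], truth: list[bool]) -> tuple[float, float, float]:
    """Standard precision, recall, and F score for the 'novel' class."""
    true_pos = sum(1 for p, t in zip(predicted, truth) if p and t)
    n_predicted = sum(predicted)
    n_truth = sum(truth)
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / n_truth if n_truth else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Usage: the second sentence repeats the first, the third is unrelated.
stream = [["a", "b", "c"], ["a", "b", "c"], ["x", "y"]]
flags = detect_novel(stream)
scores = precision_recall_f(flags, [True, False, False])
```

In this sketch the detector only sees whatever sentences the upstream categorization step passes to it, so a misclassified sentence changes both the history and the set being scored, which is the kind of dependence on retrieval results that the abstract's proposed measures are meant to account for.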