Computation on sentence semantic distance for novelty detection

Authors:
Hua-Ping Zhang;Jian Sun;Bing Wang;Shuo Bai
Affiliations:
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, P.R. China and Graduate School of the Chinese Academy of Sciences, Beijing, P.R. China;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, P.R. China;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, P.R. China;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, P.R. China
Venue:
Journal of Computer Science and Technology
Year:
2005

Abstract

Novelty detection aims to retrieve new information and filter out redundancy from given sentences that are relevant to a specific topic. For TREC 2003, the authors tried an approach to novelty detection based on semantic distance computation. The motivation is to expand a sentence by introducing semantic information. The semantic distance between sentences is computed by combining WordNet with statistical information. Novelty detection is treated as a binary classification problem: a sentence is either new or not. The feature vector, used in the vector space model for classification, consists of several factors, including the semantic distance from the sentence to the topic and the distance from the sentence to the relevant context that precedes it. New sentences are then detected with Winnow and support vector machine classifiers, respectively. Several experiments are conducted to examine the relationship between the different factors and performance. The results indicate that semantic computation is promising for novelty detection. The ratio of new sentence size to relevant size is further studied for different relevant document sizes. The ratio is found to decrease at a certain rate (about 0.86). Another group of experiments is then performed under the supervision of this ratio, and the results show that the ratio helps improve novelty detection performance.
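
The two distance features named in the abstract (sentence to topic, and sentence to the preceding relevant context) can be illustrated with a minimal sketch. This is not the authors' method: a plain Jaccard word-overlap distance stands in for the WordNet-based semantic distance, and a fixed threshold stands in for the Winnow and SVM classifiers. The function names and the threshold value are assumptions made for illustration only.

```python
def tokens(sentence):
    """Lowercased whitespace tokens as a set (a crude stand-in for real preprocessing)."""
    return set(sentence.lower().split())


def jaccard_distance(a, b):
    """Word-overlap distance in [0, 1]; assumed substitute for a semantic distance."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def novelty_features(sentence, topic, previous):
    """Return [distance to topic, distance to the closest preceding sentence]."""
    d_topic = jaccard_distance(sentence, topic)
    # With no preceding context, treat the sentence as maximally distant.
    d_context = min((jaccard_distance(sentence, p) for p in previous), default=1.0)
    return [d_topic, d_context]


def label_new(sentences, topic, threshold=0.5):
    """Label each sentence as new (True) or not (False), in reading order.

    The threshold rule is a placeholder for a trained binary classifier.
    """
    previous, labels = [], []
    for s in sentences:
        _, d_context = novelty_features(s, topic, previous)
        labels.append(d_context >= threshold)
        previous.append(s)
    return labels
```

In a fuller version, the feature vectors returned by `novelty_features` would be passed to a trained classifier instead of being compared against a hand-picked threshold.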