Using the structure of documents to improve the discovery of unexpected information

Authors:
François Jacquenet;Christine Largeron
Affiliations:
University of Saint-Etienne, Saint-Etienne, France;University of Saint-Etienne, Saint-Etienne, France
Venue:
Proceedings of the 2006 ACM symposium on Applied computing
Year:
2006

Cites (6):
Discovering unexpected information from your competitors' web sites. Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Introduction to Modern Information Retrieval
Emerging Topic Tracking System. WI '01 Proceedings of the First Asia-Pacific Conference on Web Intelligence: Research and Development
Discovery of Emerging Topics between Communities on WWW. WI '01 Proceedings of the First Asia-Pacific Conference on Web Intelligence: Research and Development
KeyGraph: Automatic Indexing by Co-occurrence Graph based on Building Construction Metaphor. ADL '98 Proceedings of the Advances in Digital Libraries Conference
Discovering unexpected information for technology watch. PKDD '04 Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases

Cited by (1):
Extracting Advantage Phrases That Hint at a New Technology's Potentials. PAKM '08 Proceedings of the 7th International Conference on Practical Aspects of Knowledge Management

Abstract

This paper addresses how the structure of documents can be taken into account when discovering unexpected information in textual databases. Earlier work designed measures for evaluating the unexpectedness of documents and integrated them into the UnexpectedMiner system; here we improve that system by exploiting the structure of the documents it processes. Each part of a document is weighted by a coefficient whose value is determined by optimization techniques. The system then uses these coefficients in the unexpectedness measures to decide whether a document contains unexpected information. We evaluate the efficiency of the new system, and the experiments show the improvements gained by using the structure of the documents.
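
The abstract does not spell out the unexpectedness measures or the optimization procedure, so the following is only an illustrative sketch of the general idea of weighting document parts. It assumes a document is a mapping from part names to text, that each term occurrence counts as much as the coefficient of the part it appears in, and that unexpectedness is the frequency mass a document carries beyond a set of reference documents. The names `PART_WEIGHTS`, `weighted_term_freq`, and `unexpectedness`, the part names, and the coefficient values are all hypothetical and are not taken from the paper or from UnexpectedMiner.

```python
from collections import Counter

# Hypothetical coefficients for each document part. The paper determines its
# coefficients by optimization; these values are placeholders only.
PART_WEIGHTS = {"title": 3.0, "heading": 2.0, "body": 1.0}


def weighted_term_freq(doc, weights=PART_WEIGHTS):
    """Normalized term frequencies in which each occurrence counts as much
    as the coefficient of the part it appears in (default 1.0)."""
    counts = Counter()
    for part, text in doc.items():
        w = weights.get(part, 1.0)
        for term in text.lower().split():
            counts[term] += w
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def unexpectedness(doc, reference_docs, weights=PART_WEIGHTS):
    """Toy measure (an assumption, not the paper's): sum over terms of the
    amount by which the weighted frequency in `doc` exceeds the average
    weighted frequency in the reference documents."""
    doc_tf = weighted_term_freq(doc, weights)
    ref_tfs = [weighted_term_freq(r, weights) for r in reference_docs]
    score = 0.0
    for term, tf in doc_tf.items():
        ref_avg = sum(r.get(term, 0.0) for r in ref_tfs) / max(len(ref_tfs), 1)
        score += max(0.0, tf - ref_avg)
    return score


reference = {"title": "data mining", "body": "data mining methods"}
new_in_title = {"title": "quantum mining", "body": "data mining methods"}
new_in_body = {"title": "data mining", "body": "quantum mining methods"}

score_title = unexpectedness(new_in_title, [reference])
score_body = unexpectedness(new_in_body, [reference])
```

Under these assumptions, a new term that appears in the title contributes more to the score than the same term appearing only in the body, which is the kind of effect the part coefficients are meant to capture. Choosing the coefficients so that the scores best separate known unexpected documents from ordinary ones would be the optimization step, which this sketch leaves out.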