Use of ranked cross document evidence trails for hypothesis generation

Authors:
Rohini K. Srihari;Li Xu;Tushar Saxena
Affiliations:
State University of New York at Buffalo;State University of New York at Buffalo;State University of New York at Buffalo
Venue:
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2007

Abstract

This paper focuses on detecting how concepts are linked across multiple text documents by generating an evidence trail explaining the connection. A traditional search involving, for example, two or more person names will attempt to find documents mentioning both of these individuals. This research focuses on a different interpretation of such a query: what is the best evidence trail across documents that explains a connection between these individuals? For example, all may be good golfers. A generalization of this task involves query terms representing general concepts (e.g. indictment, foreign policy). Such queries reflect a special case of text mining. Previous attempts to solve this problem have focused on graph approaches involving hyperlinked documents, and link analysis tools exploiting named entities. A new robust framework is presented, based on (i) generating concept chain graphs, a hybrid content representation, (ii) performing graph matching to select candidate subgraphs, and (iii) subsequently using graphical models to validate hypotheses using ranked evidence trails. We adapt the DUC data set for cross-document summarization to evaluate evidence trails generated by this approach.
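
To make the idea of a cross-document evidence trail concrete, the following toy sketch may help. It is not the framework described in the abstract: it assumes a plain co-occurrence graph in place of concept chain graphs, and it ranks trails by hop count and supporting-document count rather than with graphical models. The function names, the `max_len` parameter, and the sample documents are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_concept_graph(docs):
    """Link two concepts whenever they co-occur in a document.

    docs maps a document id to the set of concepts found in it.
    Returns graph[a][b] = set of document ids supporting the a-b link.
    """
    graph = defaultdict(lambda: defaultdict(set))
    for doc_id, concepts in docs.items():
        for a, b in combinations(sorted(concepts), 2):
            graph[a][b].add(doc_id)
            graph[b][a].add(doc_id)
    return graph


def evidence_trails(graph, source, target, max_len=4):
    """Enumerate simple paths of at most max_len edges from source to target.

    Ranking is an assumption made for illustration: shorter trails first,
    then trails backed by more documents.
    """
    trails = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            trails.append(path)
            continue
        if len(path) > max_len:  # path already uses max_len edges
            continue
        for nxt in graph[node]:
            if nxt not in path:  # keep paths simple (no repeated concept)
                stack.append((nxt, path + [nxt]))

    def support(path):
        return sum(len(graph[a][b]) for a, b in zip(path, path[1:]))

    return sorted(trails, key=lambda p: (len(p), -support(p), p))


# Hypothetical corpus: no single document mentions both Alice and Bob.
docs = {
    "d1": {"Alice", "golf club"},
    "d2": {"golf club", "Bob"},
    "d3": {"Alice", "charity"},
    "d4": {"charity", "fund"},
    "d5": {"fund", "Bob"},
}
graph = build_concept_graph(docs)
trails = evidence_trails(graph, "Alice", "Bob")
for trail in trails:
    print(" -> ".join(trail))
```

A conventional conjunctive search for "Alice" and "Bob" would return nothing on this corpus, while the trail through "golf club" connects them across documents d1 and d2.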