Capturing term dependencies using a language model based on sentence trees

Authors:
Ramesh Nallapati;James Allan
Affiliations:
University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA
Venue:
Proceedings of the eleventh international conference on Information and knowledge management
Year:
2002

Abstract

We describe a new probabilistic Sentence Tree Language Modeling approach that captures term dependency patterns in the Story Link Detection task of Topic Detection and Tracking (TDT). New features of the approach include modeling the syntactic structure of sentences in documents by a sentence-bin approach, and a computationally efficient algorithm for capturing the most significant sentence-level term dependencies using a Maximum Spanning Tree approach, similar to Van Rijsbergen's modeling of document-level term dependencies. The new model is a good discriminator of on-topic and off-topic story pairs, providing evidence that sentence-level term dependencies contain significant information about relevance. Although runs on a subset of the TDT2 corpus show that the model is outperformed by the unigram language model, a mixture of the unigram and the Sentence Tree models is shown to improve on the best performance, especially in the regions of low false alarms.
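
To make the Maximum Spanning Tree idea concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not state how term pairs are weighted, so the sketch assumes sentence-level co-occurrence counts as a stand-in dependency strength. The function names `cooccurrence_weights` and `maximum_spanning_tree` and the toy corpus are hypothetical, introduced only for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(sentences):
    """Count, for each unordered term pair, the sentences containing both.

    Assumption: co-occurrence counts stand in for the dependency strength;
    the abstract does not specify the actual weighting.
    """
    weights = Counter()
    for sentence in sentences:
        # Sorting makes every key an (a, b) pair with a < b.
        for a, b in combinations(sorted(set(sentence)), 2):
            weights[(a, b)] += 1
    return weights


def maximum_spanning_tree(terms, weights):
    """Prim's algorithm: repeatedly add the heaviest edge leaving the tree."""
    terms = sorted(set(terms))
    if not terms:
        return []
    in_tree = {terms[0]}
    edges = []
    while len(in_tree) < len(terms):
        best = None
        for u in sorted(in_tree):
            for v in terms:
                if v in in_tree:
                    continue
                w = weights.get((min(u, v), max(u, v)), 0)
                if best is None or w > best[0]:
                    best = (w, u, v)
        w, u, v = best
        in_tree.add(v)
        edges.append((u, v, w))
    return edges


# Toy corpus (hypothetical): each sentence is a list of terms.
corpus = [
    ["oil", "price", "rise"],
    ["oil", "price", "fall"],
    ["price", "rise", "market"],
]
weights = cooccurrence_weights(corpus)
tree = maximum_spanning_tree(["oil", "price", "rise"], weights)
```

For a sentence with n distinct terms, the tree keeps n - 1 edges, so only the strongest pairwise dependencies are retained instead of all n(n - 1)/2 pairs.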