We now have incrementally grown databases of text documents that reach back more than a decade, in areas ranging from personal email to news articles and conference proceedings. While accessing individual documents is easy, there are few methods for overviewing and understanding these collections as a whole, and those that exist are limited in scope. In this paper, we address one such global analysis task: automatically uncovering how ideas spread through a collection over time. We refer to this problem as Information Genealogy. In contrast to bibliometric methods, which are limited to collections with explicit citation structure, we investigate content-based methods that require only the text and timestamps of the documents. In particular, we propose a language-modeling approach and a likelihood ratio test to detect influence between documents in a statistically well-founded way. Furthermore, we show how this method can be used to infer citation graphs and to identify the most influential documents in the collection. Experiments on the NIPS conference proceedings and the Physics ArXiv show that our method is more effective than methods based on document similarity.
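To make the idea of a likelihood ratio test over language models concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the formulation used in the paper: the abstract does not specify the models, so the unigram models, the two-component mixture, the grid search over the mixing weight, and all function and parameter names (`background_model`, `influence_statistic`, `alpha`, `steps`) are hypothetical choices. The null hypothesis is that a newer document is generated by a background model alone; the alternative lets it borrow probability mass from one older candidate document.

```python
import math
from collections import Counter


def background_model(docs, vocab, alpha=0.1):
    """Additively smoothed unigram model over a fixed vocabulary (assumed form)."""
    counts = Counter(w for doc in docs for w in doc)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def source_model(doc):
    """Unsmoothed maximum-likelihood unigram model of one older document."""
    counts = Counter(doc)
    return {w: c / len(doc) for w, c in counts.items()}


def log_likelihood(tokens, background, source, lam):
    """Log-likelihood under the mixture (1 - lam) * background + lam * source."""
    return sum(
        math.log((1 - lam) * background[w] + lam * source.get(w, 0.0))
        for w in tokens
    )


def influence_statistic(new_doc, old_doc, background_docs, steps=100):
    """Return (2 * log likelihood ratio, best mixing weight).

    Null: new_doc is drawn from the background model (lam = 0).
    Alternative: new_doc is drawn from a mixture with old_doc's model,
    with lam chosen by a simple grid search over [0, 1).
    """
    vocab = set(new_doc) | set(old_doc)
    for doc in background_docs:
        vocab |= set(doc)
    background = background_model(background_docs, vocab)
    source = source_model(old_doc)

    null_ll = log_likelihood(new_doc, background, source, 0.0)
    best_ll, best_lam = null_ll, 0.0
    for i in range(1, steps):
        lam = i / steps
        ll = log_likelihood(new_doc, background, source, lam)
        if ll > best_ll:
            best_ll, best_lam = ll, lam
    return 2.0 * (best_ll - null_ll), best_lam


# Toy data, invented purely for illustration.
background_docs = [
    ["kernel", "svm", "margin", "kernel"],
    ["neural", "network", "training", "network"],
    ["bayes", "prior", "posterior", "inference"],
]
older = ["topic", "model", "latent", "topic", "dirichlet"]
unrelated = ["kernel", "svm", "margin", "kernel", "svm"]
newer = ["latent", "topic", "model", "inference", "topic"]

print(influence_statistic(newer, older, background_docs))
print(influence_statistic(newer, unrelated, background_docs))
```

A large statistic suggests that the older document explains the newer one better than the background alone. Because the mixing weight sits on the boundary of its range under the null hypothesis, turning the statistic into a p-value needs more care than a plain chi-square lookup, so the sketch stops at the statistic itself.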