Integrating multiple document features in language models for expert finding

Authors:
Jianhan Zhu;Xiangji Huang;Dawei Song;Stefan Rüger
Affiliations:
University College London, Department of Computer Science, Gower Street, WC1E 6BT, London, UK;York University, School of Information Technology, M3J 1P3, Toronto, Canada;The Robert Gordon University, School of Computing, AB25 1HG, Aberdeen, UK;The Open University, Knowledge Media Institute, MK7 6AA, Milton Keynes, UK
Venue:
Knowledge and Information Systems
Year:
2010

Abstract

We argue that expert finding is sensitive to multiple document features in an organizational intranet. These features include associations between experts and a query topic at multiple levels, from sentence and paragraph up to document level; document authority information such as PageRank, indegree, and URL length; and internal document structures that indicate an expert’s relationship with the content of a document. We assume that expert finding can benefit substantially from incorporating these features. However, existing language modeling approaches for expert finding have not taken them sufficiently into account. We propose a novel language modeling approach to expert finding that integrates multiple document features. Our experiments on two large-scale TREC Enterprise Track datasets, the W3C and CSIRO datasets, demonstrate that the nature of the two organizational intranets and of the two types of expert finding task, i.e., key contact finding for CSIRO and knowledgeable person finding for W3C, influences the effectiveness of different document features. Our work provides insights into which document features work for which types of expert finding task, and helps in designing expert finding strategies that are effective in different scenarios. Our main contribution is to develop an effective formal method for modeling multiple document features in expert finding, and to conduct a systematic investigation of their effects. Our approach achieves better results in terms of MAP than previous language-model-based approaches and than the best automatic runs in both the TREC 2006 and TREC 2007 expert search tasks.
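
To make the idea of combining these feature families concrete, the following is a minimal sketch in Python. It is not the authors' model, whose exact form the abstract does not give. It assumes a generic document-centric scoring scheme in which each document contributes its query likelihood, weighted by an authority prior and by an interpolation of candidate associations at sentence, paragraph, and document level. The names `Doc`, `WINDOW_WEIGHTS`, `candidate_score`, and `rank_candidates`, and all numeric values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    # P(q | d): assumed to be precomputed by some retrieval model.
    query_likelihood: float
    # P(d): assumed prior, e.g. derived from PageRank, indegree, or URL length.
    authority_prior: float
    # candidate -> {window level -> association strength in [0, 1]}
    associations: dict


# Hypothetical interpolation weights over window levels (illustrative values).
WINDOW_WEIGHTS = {"sentence": 0.5, "paragraph": 0.3, "document": 0.2}


def candidate_score(candidate, docs, weights=WINDOW_WEIGHTS):
    """Sum, over documents, query likelihood * prior * interpolated association."""
    total = 0.0
    for d in docs:
        levels = d.associations.get(candidate, {})
        # Interpolate association evidence across the window levels.
        assoc = sum(w * levels.get(level, 0.0) for level, w in weights.items())
        total += d.query_likelihood * d.authority_prior * assoc
    return total


def rank_candidates(docs, weights=WINDOW_WEIGHTS):
    """Return (candidate, score) pairs, highest score first, ties broken by name."""
    candidates = {c for d in docs for c in d.associations}
    scored = [(c, candidate_score(c, docs, weights)) for c in candidates]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Toy data: two documents and two candidates, invented for illustration.
    docs = [
        Doc(0.02, 0.6, {
            "alice": {"sentence": 1.0, "paragraph": 1.0, "document": 1.0},
            "bob": {"document": 1.0},
        }),
        Doc(0.01, 0.4, {"bob": {"paragraph": 1.0, "document": 1.0}}),
    ]
    for name, score in rank_candidates(docs):
        print(name, round(score, 6))
```

In this toy setup, a candidate who co-occurs with the query topic within a sentence of a high-authority document outranks one who appears only elsewhere in the same document. Changing `WINDOW_WEIGHTS` or the priors shifts that balance, which is the kind of feature sensitivity the abstract describes.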