Accurate entity resolution is sometimes impossible simply because the available information is insufficient. For example, in research paper author name resolution, even clever use of venue, title and coauthorship relations is often not enough to make a confident coreference decision. This paper presents several methods for increasing accuracy by gathering and integrating additional evidence from the web. We formulate the coreference problem as one of graph partitioning with discriminatively trained edge weights, and then incorporate web information either as additional features or as additional nodes in the graph. Since the web is too large for all of its data to be incorporated, we need an efficient procedure for selecting a subset of web queries and data. We formally describe the problem of resource-bounded information gathering in each of these contexts, and show significant accuracy improvement at low cost.
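
The two ideas in the abstract, partitioning a weighted graph of mentions and spending a limited query budget on only some of its edges, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `select_queries` and `partition`, the rule of querying the edges whose weights are closest to zero, and the greedy merging procedure are hypothetical stand-ins for whatever selection criteria and partitioning algorithm the authors actually use. Edge weights are assumed to be positive when two mentions probably refer to the same author and negative otherwise.

```python
from itertools import combinations


def select_queries(weights, budget):
    """Return up to `budget` edges whose weights are closest to zero.

    Assumption (not from the paper): an edge with a near-zero weight is
    the least certain, so it is the best place to spend a web query.
    """
    ranked = sorted(weights, key=lambda edge: (abs(weights[edge]), edge))
    return ranked[:budget]


def partition(nodes, weights):
    """Greedy agglomerative partitioning of a weighted mention graph.

    Repeatedly merges the two clusters with the largest positive total
    inter-cluster weight and stops when no merge has a positive score.
    This is a simple heuristic, not the paper's algorithm.
    """
    clusters = [{n} for n in nodes]

    def score(a, b):
        # Sum the weights of all edges crossing the two clusters;
        # edges are keyed as (smaller_name, larger_name).
        return sum(
            weights.get((min(u, v), max(u, v)), 0.0) for u in a for v in b
        )

    while True:
        best, best_pair = 0.0, None
        for i, j in combinations(range(len(clusters)), 2):
            s = score(clusters[i], clusters[j])
            if s > best:
                best, best_pair = s, (i, j)
        if best_pair is None:
            return clusters
        i, j = best_pair
        clusters[i] |= clusters[j]
        del clusters[j]


# Hypothetical edge weights between four author mentions.
weights = {
    ("a", "b"): 0.9, ("a", "c"): 0.1, ("a", "d"): -0.7,
    ("b", "c"): -0.05, ("b", "d"): -0.6, ("c", "d"): 0.8,
}

# Spend a budget of two queries on the least certain edges.
to_query = select_queries(weights, budget=2)

# Hypothetical evidence returned by those queries, used here as
# replacement weights for the queried edges.
web_evidence = {("b", "c"): -0.5, ("a", "c"): -0.4}
for edge in to_query:
    weights[edge] = web_evidence[edge]

clusters = partition(["a", "b", "c", "d"], weights)
```

In this sketch the web evidence simply overwrites the weights of the queried edges, which loosely corresponds to using web information as additional features; the alternative mentioned in the abstract, adding web results as new nodes in the graph, is not shown.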