The web has attracted much attention as a new medium that reflects real-time interest around the world. This attention is driven by the proliferation of tools such as bulletin boards and weblogs. The web is a source from which we can collect and summarize information about a particular object (e.g., a business organization, product, or person). For example, the extraction of reputation information is a major research topic in information extraction and knowledge extraction from the web. The ability to collect web pages about a particular object is essential for obtaining such information and extracting knowledge from it. A major problem in collecting web pages is that the same object is referred to in different ways in different web documents. For example, a person may be referred to by full name, first name, affiliation and title, or nickname. This paper proposes a method for extracting these mnemonic names of people from the web and presents experimental results on real web data.