Learning to find comparable entities on the web

Authors:
Xiaojiang Huang;Xiaojun Wan;Jianguo Xiao
Affiliations:
Institute of Computer Science and Technology & The MOE Key Laboratory of Computational Linguistics, Peking University, Beijing, China;Institute of Computer Science and Technology & The MOE Key Laboratory of Computational Linguistics, Peking University, Beijing, China;Institute of Computer Science and Technology & The MOE Key Laboratory of Computational Linguistics, Peking University, Beijing, China
Venue:
WISE'12 Proceedings of the 13th international conference on Web Information Systems Engineering
Year:
2012

Abstract

Comparison is a common way for people to discover the commonalities and differences between two entities (e.g., products, people, companies, or events). Automatically providing comparison results to users would therefore be very useful, and a prerequisite for this task is finding comparable entities. In this paper, we propose a novel Web mining system that finds comparable entities for a single given entity. First, the system uses a bootstrapping method to find candidate entities for the given entity through natural language analysis of search engine result snippets. Then, the system uses set expansion techniques to find more candidate entities through semi-structured HTML analysis of the downloaded web pages. Finally, the system uses a supervised learning method that incorporates linguistic, statistical, and semantic features to classify the candidate entities as either comparable or incomparable. Experimental results demonstrate that the proposed framework outperforms the baseline systems.
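
To make the first step more concrete, the following is a minimal sketch of pattern bootstrapping over search snippets. It is not the authors' implementation: the abstract does not specify the patterns, the candidate grammar, or the stopping rule, so the connector strings, the capitalized-phrase regex `CANDIDATE`, the toy `SNIPPETS`, and the function names below are all assumptions made for illustration. The set expansion and supervised classification steps are not covered.

```python
import re
from collections import Counter

# Assumed candidate grammar: a run of capitalized or alphanumeric tokens
# such as "Canon 50D". The paper's actual entity recognition may differ.
CANDIDATE = r"(?P<cand>[A-Z][\w-]*(?: [A-Z0-9][\w-]*)*)"

# Toy snippets standing in for search engine results (hypothetical data).
SNIPPETS = [
    "Nikon D90 vs Canon 50D: which DSLR should you buy?",
    "Nikon D90 compared to Canon 50D in low light",
    "Nikon D90 compared to Pentax K20D: a field test",
    "Should I get a Nikon D90 or Sony A700 for travel?",
]


def extract_candidates(entity, snippets, connectors):
    """Count phrases that follow '<entity><connector>' in the snippets."""
    counts = Counter()
    for connector in connectors:
        regex = re.compile(re.escape(entity) + re.escape(connector) + CANDIDATE)
        for snippet in snippets:
            for match in regex.finditer(snippet):
                counts[match.group("cand")] += 1
    return counts


def induce_connectors(entity, candidates, snippets, max_len=15):
    """Collect the short lowercase text found between the entity and a known candidate."""
    gap = r"(?P<conn> [a-z. ]{1," + str(max_len) + r"} )"
    learned = set()
    for cand in candidates:
        regex = re.compile(re.escape(entity) + gap + re.escape(cand))
        for snippet in snippets:
            for match in regex.finditer(snippet):
                learned.add(match.group("conn"))
    return learned


def bootstrap(entity, snippets, seed_connectors, max_rounds=3):
    """Alternate extraction and connector induction until no new connector appears.

    The returned counts come from the last extraction pass.
    """
    connectors = set(seed_connectors)
    counts = Counter()
    for _ in range(max_rounds):
        counts = extract_candidates(entity, snippets, connectors)
        learned = induce_connectors(entity, counts, snippets)
        if learned <= connectors:
            break  # nothing new was learned, so stop
        connectors |= learned
    return counts, connectors


counts, connectors = bootstrap("Nikon D90", SNIPPETS, [" vs "])
print(counts.most_common())
print(sorted(connectors))
```

On these toy snippets, the seed connector " vs " finds "Canon 50D", which lets the loop learn " compared to " and then pick up "Pentax K20D". "Sony A700" is never found because no known candidate co-occurs with " or ", which illustrates why a later set expansion step can be useful for recall.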