As the web evolves, increasing quantities of structured information are embedded in web pages in disparate formats. For example, a digital camera's description may include its price and megapixels, whereas a professor's description may include her name, university, and research interests. Both types of pages may include additional ambiguous information. General search engines (GSEs) do not support queries over these types of data because they ignore the web document semantics. Conversely, describing the requisite semantics through structured queries into databases populated by information extraction (IE) techniques is expensive and not easily adaptable to new domains. This paper describes a methodology for rapidly developing search engines capable of answering structured queries over unstructured corpora by utilizing machine learning to avoid explicit IE. We empirically show that, with minimal additional human effort, our system outperforms a GSE with respect to structured queries with clear object semantics.