Domain-Specific Web Search with Keyword Spices
IEEE Transactions on Knowledge and Data Engineering
This paper presents a new method for building domain-specific web search engines. Previous methods eliminate irrelevant documents from the accessed pages using heuristics based on human knowledge of the domain in question. Consequently, such engines are hard to build, and their heuristics cannot be reused in other domains. The keyword spice method, in contrast, improves search performance by adding domain-specific keywords, called keyword spices, to the user's input query; the modified query is then forwarded to a general-purpose search engine. Keyword spices can be discovered automatically from web documents, which allows us to build high-quality domain-specific search engines in various domains without collecting heuristic knowledge. We describe a machine learning algorithm, a type of decision-tree learning algorithm, that extracts keyword spices. To demonstrate the value of the proposed approach, we conduct experiments in the domain of cooking. The results confirm the excellent performance of our method in terms of both precision and recall.
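The query-modification step described above can be illustrated with a minimal sketch. It assumes a keyword spice is represented as a disjunction of conjunctions of required and excluded keywords, and that the target search engine accepts AND/OR/NOT operators; the class name, function names, and the example cooking keywords are all hypothetical and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Conjunction:
    """One AND-clause of a spice: keywords that must and must not appear."""
    required: tuple[str, ...] = ()
    excluded: tuple[str, ...] = ()

    def render(self) -> str:
        terms = list(self.required) + [f"NOT {w}" for w in self.excluded]
        return " AND ".join(terms)


def render_spice(spice: list[Conjunction]) -> str:
    """Join the AND-clauses with OR to form the full spice expression."""
    return " OR ".join(f"({clause.render()})" for clause in spice)


def spiced_query(user_query: str, spice: list[Conjunction]) -> str:
    """Append the spice to the user's query before forwarding it."""
    return f"({user_query}) AND ({render_spice(spice)})"


# Hypothetical spice for the cooking domain (illustrative keywords only).
COOKING_SPICE = [
    Conjunction(required=("ingredients",), excluded=("shop",)),
    Conjunction(required=("tablespoon", "recipe")),
]

if __name__ == "__main__":
    print(spiced_query("beef", COOKING_SPICE))
    # (beef) AND ((ingredients AND NOT shop) OR (tablespoon AND recipe))
```

A disjunction of conjunctions is a convenient representation here because each root-to-leaf path of a decision tree that ends in a "relevant" leaf corresponds to one conjunction of keyword tests, though the exact extraction procedure is the one given in the paper, not this sketch.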