The amount of information on the Internet is increasing dramatically, so helping users find useful information is becoming more and more important in the search field. This paper proposes CDSE, a model for a domain-based intelligent search engine. The model helps users retrieve what they need by combining text classification with keyword extraction. Several algorithms are proposed that build on key technologies such as statistics, data mining, and agents. A new criterion, named ranking error, is then introduced to address the weakness of traditional performance evaluation methodologies in evaluating ranking. The experimental results indicate that the proposed model can effectively improve retrieval precision and alleviate the problem of relevant documents being ranked too low in current search engines.