We present the ChatNoir search engine, which indexes the entire English part of the ClueWeb09 corpus. After Carnegie Mellon's Indri system, ChatNoir is the second publicly available search engine for this corpus. It implements the classic BM25F information retrieval model and additionally takes PageRank and spam likelihood into account. The search engine is scalable and returns the first results within three seconds, which is significantly faster than Indri. A convenient API supports reproducible experiments that retrieve documents from the ClueWeb09 corpus. The search engine has passed a load test involving 100,000 queries.
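
To illustrate the kind of scoring BM25F performs, the following is a minimal sketch of one common textbook formulation, not ChatNoir's actual implementation. The function name, the field names, the field weights, the smoothed non-negative IDF, and the parameter defaults (k1 = 1.2, b = 0.75) are all assumptions made for this example. How ChatNoir combines the BM25F score with PageRank and spam likelihood is not described here, so that step is left out.

```python
import math


def bm25f_score(query_terms, doc_fields, field_weights, avg_field_len,
                doc_freq, num_docs, k1=1.2, b=0.75):
    """Score one document with a common BM25F formulation (illustrative only).

    doc_fields    : dict mapping field name -> list of tokens in that field
    field_weights : dict mapping field name -> boost (assumed values)
    avg_field_len : dict mapping field name -> average token count in corpus
    doc_freq      : dict mapping term -> number of documents containing it
    """
    score = 0.0
    for term in query_terms:
        # Combine length-normalized, weighted term frequencies across fields
        # into a single pseudo-frequency before applying saturation.
        pseudo_tf = 0.0
        for field, tokens in doc_fields.items():
            tf = tokens.count(term)
            if tf == 0:
                continue
            norm = 1.0 - b + b * len(tokens) / avg_field_len[field]
            pseudo_tf += field_weights.get(field, 1.0) * tf / norm
        if pseudo_tf == 0.0:
            continue
        # Smoothed IDF that stays non-negative (an assumption of this sketch).
        df = doc_freq.get(term, 0)
        idf = math.log(1.0 + (num_docs - df + 0.5) / (df + 0.5))
        # BM25 saturation applied once to the combined pseudo-frequency.
        score += idf * pseudo_tf / (k1 + pseudo_tf)
    return score


# Hypothetical toy data: the same term appears in the title of one document
# and in the body of the other.
weights = {"title": 3.0, "body": 1.0}
avg_len = {"title": 2.0, "body": 2.0}
doc_a = {"title": ["chat", "noir"], "body": ["search", "engine"]}
doc_b = {"title": ["search", "engine"], "body": ["chat", "noir"]}
score_a = bm25f_score(["chat"], doc_a, weights, avg_len, {"chat": 1}, 10)
score_b = bm25f_score(["chat"], doc_b, weights, avg_len, {"chat": 1}, 10)
```

With these assumed weights, a match in the title contributes more than the same match in the body, so `score_a` exceeds `score_b`. Because saturation is applied after the fields are combined, repeated matches across several fields are not rewarded as if they were independent evidence.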