Collecting domain-specific documents from the Web with focused crawlers is considered one of the most important strategies for building digital libraries that serve the scientific community. However, because most focused crawlers use local search algorithms to traverse the Web, they can easily become trapped within a limited sub-graph surrounding the starting URLs and build domain-specific collections that are not comprehensive or diverse enough for scientists and researchers. In this study, we investigated the problems that local search algorithms cause in traditional focused crawlers and proposed a new crawling approach, meta-search enhanced focused crawling, to address them. We conducted two user evaluation experiments to examine the performance of the proposed approach; the results showed that it could build domain-specific collections of higher quality than traditional focused crawling techniques.
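The trapping problem described above can be illustrated with a small sketch. It is not the crawler evaluated in the study: the abstract does not say how meta-search results are integrated, so the sketch simply assumes they are added to the crawl frontier as extra starting points. The toy link graph `TOY_WEB`, the function names, and the stand-in `fake_meta_search` are all hypothetical, and a plain breadth-first traversal stands in for a local search algorithm (no relevance scoring is modeled).

```python
from collections import deque

# Hypothetical toy Web: two on-topic clusters ("a*" and "b*") with no
# hyperlinks between them. Keys are URLs, values are their out-links.
TOY_WEB = {
    "a1": ["a2", "a3"],
    "a2": ["a1"],
    "a3": ["a2"],
    "b1": ["b2"],
    "b2": ["b1", "b3"],
    "b3": [],
}


def crawl(seeds, web, extra_seeds=()):
    """Follow links breadth-first from the seeds (plus any extra seeds).

    A link-following crawler can only reach pages connected to its
    starting points, which is the limitation the abstract describes.
    """
    frontier = deque(list(seeds) + list(extra_seeds))
    visited = set()
    while frontier:
        url = frontier.popleft()
        if url in visited or url not in web:
            continue
        visited.add(url)
        frontier.extend(web[url])
    return visited


def fake_meta_search(query):
    """Stand-in for querying several search engines (assumption).

    A real meta-search step would merge result lists from multiple
    engines; here it returns one fixed URL from the unlinked cluster.
    """
    return ["b2"]


# Link-following alone stays inside the cluster around the start URL.
local_only = crawl(["a1"], TOY_WEB)

# Injecting meta-search results lets the crawl reach the second cluster.
enhanced = crawl(["a1"], TOY_WEB, extra_seeds=fake_meta_search("topic"))

print(sorted(local_only))
print(sorted(enhanced))
```

In this toy setting, the link-only crawl never leaves the cluster around `a1`, while the crawl seeded with the assumed meta-search result also covers the cluster that no hyperlink path reaches.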