The invention of the hyperlink and the HTTP protocol caused an amazing new structure to appear on the Internet: the World Wide Web. With the Web, there came spiders, robots, and Web crawlers, which go from one link to the next checking Web health, ferreting out information and resources, and imposing organization on the huge collection of information (and dross) residing on the net. This paper reports on the use of one such crawler to synthesize document collections on various topics in science, mathematics, engineering and technology. Such collections could be part of a digital library.