Collection synthesis

Authors: Donna Bergmark
Affiliations: Cornell Digital Library Research Group, Ithaca, NY
Venue: Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries
Year: 2002

Abstract

The invention of the hyperlink and the HTTP transmission protocol caused an amazing new structure to appear on the Internet -- the World Wide Web. With the Web, there came spiders, robots, and Web crawlers, which go from one link to the next checking Web health, ferreting out information and resources, and imposing organization on the huge collection of information (and dross) residing on the net. This paper reports on the use of one such crawler to synthesize document collections on various topics in science, mathematics, engineering and technology. Such collections could be part of a digital library.