Large quantities of historical newspapers are being digitized and OCR'd. We describe a framework for processing the OCR'd text to identify articles and extract metadata for them. We present the article schema and give examples of features that facilitate automatic indexing of the articles. For this processing, we employ lexical semantics, structural models, and community content. We also describe visualization and summarization techniques that can be used to present the extracted events.