The services that digital libraries provide to users can be greatly enhanced by automatically gleaning certain kinds of information from the full text of the documents they contain. This paper reviews some recent work that applies novel techniques of machine learning (broadly interpreted) to extract information from plain text, and puts it in the context of digital library applications. We describe three areas: hierarchical phrase browsing, including efficient methods for inferring a phrase hierarchy from a large corpus of text; text mining using adaptive compression techniques, giving a new approach to generic entity extraction, word segmentation, and acronym extraction; and keyphrase extraction.