Relevance: communication and cognition
Attention, intentions, and the structure of discourse
Computational Linguistics
Topic parsing: accounting for text macro structures in full-text analysis
Information Processing and Management: an International Journal - Special issue on natural language processing and information retrieval
Robust automated topic identification
Topic segmentation: algorithms and applications
Lexical cohesion computed by thesaural relations as an indicator of the structure of text
Computational Linguistics
TextTiling: segmenting text into multi-paragraph subtopic passages
Computational Linguistics
Automatic word sense discrimination
Computational Linguistics - Special issue on word sense disambiguation
Combining multiple knowledge sources for discourse segmentation
ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Using micro information units for internet search
Proceedings of the eleventh international conference on Information and knowledge management
Topic Detection Using Lexical Chains
Proceedings of the 14th International conference on Industrial and engineering applications of artificial intelligence and expert systems: engineering of intelligent systems
A bootstrapping approach for robust topic analysis
Natural Language Engineering
Advances in domain independent linear text segmentation
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Using collocations for topic segmentation and link detection
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
A novel document similarity measure based on earth mover's distance
Information Sciences: an International Journal
Question-driven segmentation of lecture speech text: Towards intelligent e-learning systems
Journal of the American Society for Information Science and Technology
Towards a unified approach to document similarity search using manifold-ranking of blocks
Information Processing and Management: an International Journal
Beyond topical similarity: a structural similarity measure for retrieving highly similar documents
Knowledge and Information Systems
Text Entailment for Logical Segmentation and Summarization
NLDB '08 Proceedings of the 13th international conference on Natural Language and Information Systems: Applications of Natural Language to Information Systems
Computational measures for language similarity across time in online communities
ACTS '09 Proceedings of the HLT-NAACL 2006 Workshop on Analyzing Conversations in Text and Speech
Unsupervised methods of topical text segmentation for Polish
ACL '07 Proceedings of the Workshop on Balto-Slavonic Natural Language Processing: Information Extraction and Enabling Technologies
Web page cleaning for web mining through feature weighting
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Editorial: Narrative-based taxonomy distillation for effective indexing of text collections
Data & Knowledge Engineering
Document similarity search based on manifold-ranking of texttiles
AIRS'06 Proceedings of the Third Asia conference on Information Retrieval Technology
Using proportional transportation distances for measuring document similarity
ECIR'06 Proceedings of the 28th European conference on Advances in Information Retrieval
Collocational word similarity is considered a source of text cohesion that is hard to measure and quantify. The work presented here explores the use of information from a training corpus to measure word similarity and evaluates the method on the text segmentation task. An implementation, the VecTile system, produces similarity curves over texts using pre-compiled vector representations of the contextual behavior of words. The system is shown to outperform the purely string-based TextTiling algorithm (Hearst, 1997).
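The idea of a similarity curve built from corpus-derived word vectors can be illustrated with a minimal sketch. This is not the VecTile implementation: the co-occurrence counting scheme, the window and block sizes, and all function names (`build_context_vectors`, `similarity_curve`, `deepest_gap`) are assumptions made for illustration. It only shows how adjacent blocks of text might be compared through the contextual vectors of their words rather than through matching strings.

```python
import math
from collections import defaultdict


def build_context_vectors(corpus_sentences, window=2):
    """Toy 'pre-compiled' vectors: co-occurrence counts within a word window.

    Assumption: a real system would use a large training corpus and
    likely some form of weighting or dimensionality reduction.
    """
    vectors = defaultdict(lambda: defaultdict(float))
    for sent in corpus_sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][sent[j]] += 1.0
    return vectors


def block_vector(tokens, vectors):
    """Represent a block of text as the sum of its words' context vectors."""
    total = defaultdict(float)
    for tok in tokens:
        for dim, val in vectors.get(tok, {}).items():
            total[dim] += val
    return total


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(val * b.get(dim, 0.0) for dim, val in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_curve(tokens, vectors, block_size=2):
    """Similarity between the blocks left and right of each token gap."""
    curve = []
    for gap in range(block_size, len(tokens) - block_size + 1):
        left = block_vector(tokens[gap - block_size:gap], vectors)
        right = block_vector(tokens[gap:gap + block_size], vectors)
        curve.append((gap, cosine(left, right)))
    return curve


def deepest_gap(curve):
    """Return the gap position with the lowest similarity (candidate boundary)."""
    return min(curve, key=lambda point: point[1])[0]


# Hypothetical toy training corpus with two unrelated topics.
corpus = [
    ["cat", "dog", "pet", "fur"],
    ["dog", "cat", "fur", "pet"],
    ["stock", "bond", "market", "trade"],
    ["bond", "stock", "trade", "market"],
]
vectors = build_context_vectors(corpus)
text = ["cat", "dog", "pet", "fur", "stock", "bond", "market", "trade"]
curve = similarity_curve(text, vectors, block_size=2)
boundary = deepest_gap(curve)
```

In this toy text no word is repeated, so a purely string-based comparison of adjacent blocks would find no overlap at any gap. The vector-based curve still separates the two topics because words from the same topic share contexts in the training corpus, and the curve reaches its minimum where the vocabulary shifts.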