Extended Boolean information retrieval
Communications of the ACM
Topic-conditioned novelty detection
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Retrieval and novelty detection at the sentence level
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Topic segmentation of message hierarchies for indexing and navigation support
WWW '05 Proceedings of the 14th international conference on World Wide Web
Information retrieval system evaluation: effort, sensitivity, and reliability
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
A study of relevance propagation for web search
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Tracking Information Epidemics in Blogspace
WI '05 Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence
Similarity measures for tracking information flow
Proceedings of the 14th ACM international conference on Information and knowledge management
A probabilistic approach to spatiotemporal theme pattern mining on weblogs
Proceedings of the 15th international conference on World Wide Web
Improved annotation of the blogosphere via autotagging and hierarchical clustering
Proceedings of the 15th international conference on World Wide Web
AutoTag: a collaborative approach to automated tag assignment for weblog posts
Proceedings of the 15th international conference on World Wide Web
CUTS: CUrvature-based development pattern analysis and segmentation for blogs and other Text Streams
Proceedings of the seventeenth conference on Hypertext and hypermedia
CP/CV: concept similarity mining without frequency information from domain describing taxonomies
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
CDIP: Collection-Driven, yet Individuality-Preserving Automated Blog Tagging
ICSC '07 Proceedings of the International Conference on Semantic Computing
Efficient overlap and content reuse detection in blogs and online news articles
Proceedings of the 18th international conference on World wide web
Probabilistic latent semantic analysis
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Hypergeometric language models for republished article finding
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Hi-index | 0.00 |
As their popularity as dynamic platforms for information dissemination and sharing increases, the use of Weblogs (blogs) which track and comment on real world (political, news, entertainment) events is also growing. The success of the blog as a popular medium for information sharing, on the other hand, is also its weakest spot in that there is little support beyond keyword based searches for blog entries. Consequently, there is impending need for navigational support, which can help users relate a large, diverse, and inherently distributed collection of blogosphere. In this paper, we first note that the existence of large degrees of content overlaps in the form of quotation/commentary pairs (as well as content borrowings across media outlets) can be leveraged for tracking the topic development patterns within the blogosphere. Relying on this observation, we first propose focus and flow analysis techniques that rely on reuse detection and focus and flow to help place blog entries into logical organizations. We then show that these implicit or explicit quotations as well as focus analysis could be leveraged to identify the contexts in which entries occur; thus, resulting in more effective tagging. Thus, we propose CDIP (a collection-driven, yet individuality-preserving tagging system) which relies on relationships provided by quotation/reuse detection and semantic-focus analysis to automatically tag the blogs in such a way that, not-only the related blogs share tags, but also individuality of the entries is preserved for discriminating tag-based accesses.