A web document can be seen as composed of textual content together with social metadata of various forms (e.g., anchor text, search queries, and social annotations), both of which are valuable indicators of the document's semantic content. However, owing to the free nature of the web, both streams of web data suffer from serious noise and sparseness, which have become major challenges for many web mining applications. Previous work has shown that the content of a web document can be enhanced by integrating anchor text and search queries. In this paper, we study the problem of exploring emergent social annotations for document enhancement and propose a novel reinforcement framework to generate a "social representation" of a document. Unlike prior work, our framework enhances textual content and social annotations simultaneously by exploiting a mutual reinforcement relationship between them. Two convergent models, the social content model and the social annotation model, are symmetrically derived from the framework to represent the enhanced textual content and the enhanced social annotations, respectively. The enhanced document is referred to as a Social Document, or sDoc, in that it embeds complementary viewpoints from many web authors and many web visitors. In this sense, the document's semantics are enhanced by exploring social wisdom. We build the framework on a large Del.icio.us dataset and evaluate it on three typical web mining applications: annotation, classification, and retrieval. Experimental results demonstrate that the social representation of web documents significantly boosts the performance of these applications.