This work investigates how cluster labeling can be enhanced by using Wikipedia, the free online encyclopedia. We describe a general framework for cluster labeling that extracts candidate labels from Wikipedia, in addition to important terms extracted directly from the text. The "labeling quality" of each candidate is then evaluated by several independent judges, and the top-rated candidates are recommended for labeling. Our experimental results reveal that the Wikipedia labels agree with the manual labels that humans assign to a cluster much more than the significant terms extracted directly from the text do. We show that in most cases, even when the human-assigned label appears in the text, purely statistical methods have difficulty identifying it as a good descriptor. Furthermore, our experiments show that for more than 85% of the clusters in our test collection, the manual label (or an inflection or a synonym of it) appears among the top five labels recommended by our system.
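
The candidate-scoring step described above can be illustrated with a minimal sketch. It is not the framework's actual implementation: the two judges (`term_frequency_judge`, `brevity_judge`), the averaging rule in `recommend_labels`, and the `match_at_k` check are hypothetical stand-ins, chosen only to show how several independent judges might score a pooled candidate list and how "the manual label appears in the top five" could be tested.

```python
from typing import Callable, Iterable, List, Sequence

# A judge maps (candidate label, cluster documents) to a score in [0, 1].
# Both judges below are hypothetical toys, not the judges used in the paper.
Judge = Callable[[str, Sequence[str]], float]


def term_frequency_judge(candidate: str, cluster_docs: Sequence[str]) -> float:
    """Fraction of cluster documents that mention the candidate (case-insensitive)."""
    if not cluster_docs:
        return 0.0
    needle = candidate.lower()
    hits = sum(needle in doc.lower() for doc in cluster_docs)
    return hits / len(cluster_docs)


def brevity_judge(candidate: str, cluster_docs: Sequence[str]) -> float:
    """Toy preference for short labels: 1 / number of words in the candidate."""
    return 1.0 / max(1, len(candidate.split()))


def recommend_labels(
    candidates: Iterable[str],
    cluster_docs: Sequence[str],
    judges: Sequence[Judge],
    k: int = 5,
) -> List[str]:
    """Average the judges' scores (an assumed aggregation rule) and return the top k."""
    scores = {
        c: sum(judge(c, cluster_docs) for judge in judges) / len(judges)
        for c in candidates
    }
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


def match_at_k(recommended: Sequence[str], manual_label: str, k: int = 5) -> bool:
    """True if the manual label is among the first k recommendations (exact match only)."""
    return manual_label.lower() in (r.lower() for r in recommended[:k])
```

In this sketch the candidate list would pool terms taken from the cluster text with labels drawn from Wikipedia. `match_at_k` compares exact strings only, so handling inflections or synonyms of the manual label, as the evaluation above allows, would need an additional normalization step.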