Repurposing social tagging data for extraction of domain-level concepts

Authors:
Sandeep Purao;Veda C. Storey;Vijayan Sugumaran;Jordi Conesa;Julià Minguillón;Joan Casas
Affiliations:
College of Information Sciences & Technology, The Pennsylvania State University, University Park State College, PA;Department of Computer Information Systems, J. Mack Robinson College of Business, Georgia State University, Atlanta, GA;School of Business Administration, Oakland University, Rochester, MI;Estudis d'Informatica i Multimedia, Universitat Oberta de Catalunya, Barcelona, Spain;Estudis d'Informatica i Multimedia, Universitat Oberta de Catalunya, Barcelona, Spain;Estudis d'Informatica i Multimedia, Universitat Oberta de Catalunya, Barcelona, Spain
Venue:
NLDB'11 Proceedings of the 16th international conference on Natural language processing and information systems
Year:
2011

Abstract

The World Wide Web, the world's largest resource for information, has evolved from organizing information using controlled, top-down taxonomies to a bottom-up approach that emphasizes assigning meaning to data via mechanisms such as the Social Web (Web 2.0). Tagging adds meta-data (weak semantics) to the content available on the web. This research investigates the potential for repurposing this layer of meta-data. We propose a multi-phase approach that exploits user-defined tags to identify and extract domain-level concepts. We operationalize this approach and assess its feasibility by applying it to a publicly available tag repository. The paper describes insights gained from implementing and applying the heuristics contained in the approach, as well as the challenges and implications of repurposing tags for the extraction of domain-level concepts.