Topic classification in social media using metadata from hyperlinked objects
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
Social media presents unique challenges for topic classification, including the brevity of posts, the informal nature of conversations, and the frequent reliance on external hyperlinks to give context to a conversation. In this paper we investigate the usefulness of these external hyperlinks for categorising the topic of individual posts. We focus our analysis on objects that have related metadata available on the Web, either via APIs or as Linked Data. Our experiments show that the inclusion of metadata from hyperlinked objects in addition to the original post content significantly improved classifier performance on two disparate datasets. We found that including selected metadata from APIs and Linked Data gave better results than including text from HTML pages. We investigate how this improvement varies across different topics. We also make use of the structure of the data to compare the usefulness of different types of external metadata for topic classification in a social media dataset.
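The core idea, adding metadata from hyperlinked objects to the original post content before classification, can be illustrated with a minimal sketch. This is not the authors' implementation: the field names (`title`, `tags`), the tokenizer, and the bag-of-words representation are all assumptions chosen for illustration, and the metadata is supplied by hand rather than fetched from an API or a Linked Data source.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_features(post_text, linked_metadata=None, fields=("title", "tags")):
    """Return a bag-of-words Counter for a post.

    linked_metadata is a list of dicts, one per hyperlinked object.
    The field names are hypothetical; only the selected fields are used,
    mirroring the idea of including selected metadata rather than all
    of the text on the linked page.
    """
    tokens = tokenize(post_text)
    for obj in linked_metadata or []:
        for field in fields:
            value = obj.get(field, "")
            # Multi-valued fields (e.g. tags) are joined into one string.
            if isinstance(value, (list, tuple)):
                value = " ".join(value)
            tokens.extend(tokenize(value))
    return Counter(tokens)


# A short post whose topic is only evident from the linked object.
post = "Check this out http://example.org/v/1"
metadata = [{"title": "Guitar lesson for beginners", "tags": ["music", "guitar"]}]

content_only = build_features(post)
augmented = build_features(post, metadata)
```

In this toy case the content-only features carry no topical terms, while the augmented features contain terms such as "guitar" and "music" that a topic classifier could use. Either feature set could then be passed to any standard text classifier to compare the two settings.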