Investigating the statistical properties of user-generated documents
FQAS'11 Proceedings of the 9th international conference on Flexible Query Answering Systems
User-generated short documents play an important role in online communication, owing to the widespread use of social networks and real-time text messaging on the Internet. In this paper we compare the statistics of several online user-generated datasets with those of traditional TREC collections, investigating their similarities and differences. Our results indicate that traditional techniques can also be applied to user-generated short documents, provided that the documents are properly preprocessed.