Attachment prediction is the task of automatically identifying email messages that should contain an attachment. It addresses a common problem: sending an email but forgetting to include the relevant attachment. A common Information Retrieval (IR) approach to analyzing documents such as emails is to treat the entire document as a bag of words. Here we propose a finer-grained analysis: we aim to identify individual sentences within an email that refer to an attachment. If we detect any such sentence, we predict that the email should have an attachment. Using part of the Enron corpus for evaluation, we find that our finer-grained approach outperforms previously reported document-level attachment prediction in similar evaluation settings. A second contribution of this paper is another successful example of the ‘wisdom of the crowd’ in collecting the annotations needed to train the attachment prediction algorithm: the aggregated non-expert judgements collected on Amazon’s Mechanical Turk can be used as a substitute for much more costly expert judgements.
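The sentence-level decision rule and the aggregation of crowd judgements described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the cue pattern `ATTACHMENT_CUES`, the naive sentence splitter, and the majority-vote rule in `aggregate_judgements` are all assumptions made for the example. The paper trains a sentence classifier on annotated data, and the abstract does not say how the judgements were aggregated.

```python
import re
from collections import Counter

# Hypothetical cue pattern standing in for a trained sentence classifier.
ATTACHMENT_CUES = re.compile(r"\b(attach\w*|enclosed)\b", re.IGNORECASE)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_refers_to_attachment(sentence):
    """Placeholder sentence-level classifier (assumption: keyword match)."""
    return bool(ATTACHMENT_CUES.search(sentence))


def email_needs_attachment(body, sentence_classifier=sentence_refers_to_attachment):
    """Predict an attachment if ANY sentence is flagged by the classifier."""
    return any(sentence_classifier(s) for s in split_sentences(body))


def aggregate_judgements(labels):
    """Majority vote over non-expert True/False labels for one sentence.

    Ties resolve to False (an assumption made for this sketch).
    """
    counts = Counter(labels)
    return counts[True] > counts[False]
```

Any callable that maps a sentence to a boolean can be passed as `sentence_classifier`, so a trained model could replace the keyword placeholder without changing the email-level rule.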