How can we cull the facts we need from the overwhelming mass of information and misinformation that is the Web? The TextRunner extraction engine represents one approach, in which people pose keyword queries or simple questions and TextRunner returns concise answers based on tuples extracted from Web text. Unfortunately, the results returned by engines such as TextRunner include both informative facts (e.g., the FDA banned ephedra) and less useful statements (e.g., the FDA banned products). This paper therefore investigates filtering TextRunner results to enable people to better focus on interesting assertions. We first develop three distinct models of what assertions are likely to be interesting in response to a query. We then fully operationalize each of these models as a filter over TextRunner results. Finally, we develop a more sophisticated filter that combines the different models using relevance feedback. In a study of human ratings of the interestingness of TextRunner assertions, we show that our approach substantially enhances the quality of TextRunner results. Our best filter raises the fraction of interesting results in the top thirty from 41.6% to 64.1%.
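The headline metric above, the fraction of interesting results among the top thirty, can be illustrated with a minimal sketch. The function name `fraction_interesting_at_k` and the toy ratings are hypothetical and chosen only for illustration; they are not the paper's evaluation code or data. The sketch assumes each ranked result carries a single binary human rating, and it divides by the number of results actually returned when fewer than `k` are available.

```python
from typing import Sequence


def fraction_interesting_at_k(labels: Sequence[bool], k: int = 30) -> float:
    """Fraction of the top-k ranked results that raters marked interesting."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = labels[:k]
    if not top:
        return 0.0
    # Divide by the number of results actually present (assumption: short
    # result lists are not penalized for having fewer than k entries).
    return sum(top) / len(top)


# Hypothetical ratings for one ranked result list (True = rated interesting).
unfiltered = [True, False, False, True, False, False]
# The same query after a hypothetical filter has reordered the results.
filtered = [True, True, False, True, False, False]

before = fraction_interesting_at_k(unfiltered, k=4)
after = fraction_interesting_at_k(filtered, k=4)
print(f"before={before:.2f} after={after:.2f}")
```

Read this way, the reported change from 41.6% to 64.1% is a gain of 22.5 percentage points, or roughly 54% relative to the unfiltered baseline.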