As information extraction (IE) becomes more central to enterprise applications, rule-based IE engines have become increasingly important. In this paper, we describe SystemT, a rule-based IE system whose basic design removes the expressivity and performance limitations of current systems based on cascading grammars. SystemT uses a declarative rule language, AQL, and an optimizer that generates high-performance algebraic execution plans for AQL rules. We compare SystemT's approach against cascading grammars, both theoretically and through a thorough experimental evaluation. Our results show that SystemT can deliver result quality comparable to the state of the art while achieving an order of magnitude higher annotation throughput.
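
To make the idea of an algebraic execution plan more concrete, the following is a minimal sketch in plain Python. It is not AQL and does not reproduce SystemT's actual operators or optimizer; the names `Span`, `regex_extract`, `dict_extract`, and `follows_join` are hypothetical and chosen only for illustration. The sketch assumes that each operator consumes and produces sets of text spans, so that a rule becomes a composition of operators rather than a fixed left-to-right grammar pass.

```python
import re
from typing import List, NamedTuple, Set


class Span(NamedTuple):
    """A region of the document: [begin, end) offsets plus the covered text."""
    begin: int
    end: int
    text: str


def regex_extract(pattern: str, doc: str) -> List[Span]:
    """Extraction operator (illustrative): one span per regex match."""
    return [Span(m.start(), m.end(), m.group()) for m in re.finditer(pattern, doc)]


def dict_extract(entries: Set[str], doc: str) -> List[Span]:
    """Extraction operator (illustrative): case-insensitive dictionary matches
    on word boundaries, returned in document order."""
    spans = []
    for entry in entries:
        for m in re.finditer(r"\b" + re.escape(entry) + r"\b", doc, re.IGNORECASE):
            spans.append(Span(m.start(), m.end(), m.group()))
    return sorted(spans)


def follows_join(left: List[Span], right: List[Span], max_gap: int, doc: str) -> List[Span]:
    """Join operator (illustrative): pair each left span with every right span
    that starts at most max_gap characters after the left span ends, and
    return the merged span covering both."""
    out = []
    for l in left:
        for r in right:
            gap = r.begin - l.end
            if 0 <= gap <= max_gap:
                out.append(Span(l.begin, r.end, doc[l.begin:r.end]))
    return out


doc = "Call John Smith at 555-1234 or Mary at 555-9876."

# A tiny "plan": two independent extractions feeding one join.
first_names = dict_extract({"john", "mary"}, doc)
phones = regex_extract(r"\d{3}-\d{4}", doc)
person_phone = follows_join(first_names, phones, max_gap=12, doc=doc)

for span in person_phone:
    print(span.begin, span.end, span.text)
```

Because the two extraction steps are independent and the join is an ordinary function over span sets, a planner could in principle evaluate them in whichever order is cheaper. The nested-loop join here is the simplest possible strategy and is used only to keep the sketch short.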