C4.5: programs for machine learning
C4.5: programs for machine learning
Information extraction as a basis for high-precision text classification
ACM Transactions on Information Systems (TOIS)
Some advances in transformation-based part of speech tagging
AAAI '94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 1)
A scalable comparison-shopping agent for the World-Wide Web
AGENTS '97 Proceedings of the first international conference on Autonomous agents
Information extraction from HTML: application of a general machine learning approach
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
A hierarchical approach to wrapper induction
Proceedings of the third annual conference on Autonomous Agents
Learning Information Extraction Rules for Semi-Structured and Free Text
Machine Learning - Special issue on natural language learning
PKDD '97 Proceedings of the First European Symposium on Principles of Data Mining and Knowledge Discovery
Where to Position the Precision in Knowledge Extraction from Text
Proceedings of the 14th International conference on Industrial and engineering applications of artificial intelligence and expert systems: engineering of intelligent systems
Information Extraction from HTML: Combining XML and Standard Techniques for IE from the Web
Proceedings of the 14th International conference on Industrial and engineering applications of artificial intelligence and expert systems: engineering of intelligent systems
XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources
ICDE '00 Proceedings of the 16th International Conference on Data Engineering
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Introduction to the special issue on word sense disambiguation: the state of the art
Computational Linguistics - Special issue on word sense disambiguation
A definition and short history of Language Engineering
Natural Language Engineering
Software infrastructure for natural language processing
ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Adaptive information extraction from text by rule induction and generalisation
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
CRYSTAL inducing a conceptual dictionary
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Incremental development of CBR strategies for computing project cost probabilities
Advanced Engineering Informatics
Named entities for hot topics ranking and ontology navigation aid
Proceedings of the 20th ACM conference on Hypertext and hypermedia
Seeking Acronym Definitions: a Web-based Approach
Proceedings of the 2009 conference on Artificial Intelligence Research and Development: Proceedings of the 12th International Conference of the Catalan Association for Artificial Intelligence
Seeking Acronym Definitions: a Web-based Approach
Proceedings of the 2009 conference on Artificial Intelligence Research and Development: Proceedings of the 12th International Conference of the Catalan Association for Artificial Intelligence
A method for web information extraction
APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
Tree-based overlay networks for scalable applications
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Automatic extraction of acronym definitions from the Web
Applied Intelligence
Learning to adapt cross language information extraction wrapper
Applied Intelligence
Hi-index | 0.00 |
Information Extraction (IE) systems that can exploit the vast source of textual information that is the internet would provide a revolutionary step forward in terms of delivering large volumes of content cheaply and precisely, thus enabling a wide range of new knowledge driven applications and services. However, despite this enormous potential, few IE systems have successfully made the transition from laboratory to commercial application. The reason may be a purely practical one—to build useable, scaleable IE systems requires bringing together a range of different technologies as well as providing clear and reproducible guidelines as to how to collectively configure and deploy those technologies.This paper is an attempt to address these issues. The paper focuses on two primary goals. Firstly, we show that an information extraction system which is used for real world applications and different domains can be built using some autonomous, corporate components (agents). Such a system has some advanced properties: clear separation to different extraction tasks and steps, portability to multiple application domain, trainability, extensibility, etc. Secondly, we show that machine learning and, in particular, learning in different ways and at different levels, can be used to build practical IE systems. We show that carefully selecting the right machine learning technique for the right task and selective sampling can be used to reduce the human effort required to annotate examples for building such systems.