Historically, tailoring language processing systems to specific domains and languages for which they were not originally built has required a great deal of effort. Recent advances in corpus-based manual and automatic training methods have shown promise in reducing the time and cost of this porting process. These developments have focused even greater attention on the bottleneck of acquiring reliable, manually tagged training data. This paper describes a new set of integrated tools, collectively called the Alembic Workbench, that uses a mixed-initiative approach to "bootstrapping" the manual tagging process, with the goal of reducing the overhead associated with corpus development. Initial empirical studies using the Alembic Workbench to annotate "named entities" demonstrate that this approach can approximately double the production rate. As an added benefit, the combined efforts of machine and user produce domain-specific annotation rules that can be used to annotate similar texts automatically through the Alembic-NLP system. The ultimate goal of this project is to enable end users to generate a practical domain-specific information extraction system within a single session.