Automatically extracting chemical names from text is of significant value to biomedical and life science research. A major barrier in this task is the difficulty of obtaining a sizable, high-quality training set with which to train a reliable entity extraction model. Leveraging well-studied random text generation techniques based on formal grammars, we explore the idea of automatically creating training sets for chemical named entity extraction. Assuming the availability of an incomplete list of chemical names, we generate well-controlled, random, yet realistic chemical-like training documents. Compared to state-of-the-art models learned from manually labeled data and to rule-based systems applied to real-world data, our solutions achieve comparable or better results with the least human effort.
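The generation idea described above can be sketched as a small probabilistic grammar whose chemical nonterminal draws from the incomplete name list, emitting token/label pairs that can serve directly as labeled training data. Everything in this sketch — the grammar rules, the seed names, and the BIO label scheme — is an illustrative assumption, not the paper's actual grammar.

```python
import random

# Hypothetical seed list standing in for the incomplete list of
# known chemical names assumed by the approach.
CHEMICAL_NAMES = ["2-aminoethanol", "benzene", "sodium chloride"]

# A toy grammar: each nonterminal expands to one of several
# right-hand sides, chosen uniformly at random.
GRAMMAR = {
    "S": [["NP", "VP", "."]],
    "NP": [["the", "CHEM"], ["CHEM"]],
    "VP": [["reacts", "with", "CHEM"], ["is", "synthesized"]],
}

def generate(symbol="S"):
    """Expand a symbol into a list of (token, label) pairs.

    Chemical names drawn from the seed list are labeled with the
    BIO scheme (B-CHEM/I-CHEM); all other tokens are labeled O.
    """
    if symbol == "CHEM":
        tokens = random.choice(CHEMICAL_NAMES).split()
        return [(tok, "B-CHEM" if i == 0 else "I-CHEM")
                for i, tok in enumerate(tokens)]
    if symbol in GRAMMAR:
        pairs = []
        for child in random.choice(GRAMMAR[symbol]):
            pairs.extend(generate(child))
        return pairs
    return [(symbol, "O")]  # terminal word

random.seed(0)
sentence = generate()
print(sentence)
```

Repeating `generate()` many times yields an arbitrarily large, automatically labeled corpus; a sequence model (e.g. a CRF) could then be trained on such synthetic documents in place of manually annotated text.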