Although a huge number of biomedical texts are available online, tools good enough to help people extract information or knowledge from them are lacking. Named entity recognition (NER) is therefore an important step for further processing such as information retrieval, information extraction and knowledge discovery. We introduce a Hidden Markov Model (HMM) for NER with word similarity-based smoothing. Our experiment shows that word similarity-based smoothing can improve performance by exploiting large amounts of unlabeled data. While many systems rely on laboriously hand-coded rules for all kinds of word features, we show that word similarity is a potential way to obtain word formation, prefix, suffix and abbreviation information automatically from biomedical texts, along with useful word distribution information.
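
As a rough illustration of what word similarity-based smoothing of an HMM could look like, the sketch below interpolates a word's maximum-likelihood emission estimate with a similarity-weighted average over its distributionally similar words. The abstract does not give the smoothing formula, so the interpolation scheme, the weight `lam`, the helper names and the toy data are all assumptions made for illustration, not the authors' method. The resulting scores are also not renormalized into a proper probability distribution.

```python
from collections import defaultdict


def emission_counts(tagged_sentences):
    """Count how often each word is emitted by each tag in labeled data."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[tag][word] += 1
    return counts


def mle(counts, tag, word):
    """Maximum-likelihood estimate of P(word | tag); 0.0 for unseen pairs."""
    tag_counts = counts.get(tag, {})
    total = sum(tag_counts.values())
    return tag_counts.get(word, 0) / total if total else 0.0


def smoothed_emission(counts, similar, tag, word, lam=0.7):
    """Hypothetical smoothing: mix the direct estimate with the estimates
    of similar words, weighted by their similarity scores.

    `similar` maps a word to {neighbour: similarity}; in practice it would
    be computed from unlabeled text (assumption, not given in the source).
    """
    direct = mle(counts, tag, word)
    neighbours = similar.get(word, {})
    norm = sum(neighbours.values())
    if norm == 0:
        return direct  # no similar words known: fall back to the raw estimate
    backoff = sum(s * mle(counts, tag, n) for n, s in neighbours.items()) / norm
    return lam * direct + (1 - lam) * backoff


# Toy labeled data: "IL-4" never appears, but it is similar to "IL-2".
labeled = [[("IL-2", "PROTEIN"), ("binds", "O")]]
counts = emission_counts(labeled)
similar = {"IL-4": {"IL-2": 0.8, "binds": 0.2}}

unseen_raw = mle(counts, "PROTEIN", "IL-4")
unseen_smoothed = smoothed_emission(counts, similar, "PROTEIN", "IL-4")
print(unseen_raw, unseen_smoothed)
```

In this toy setup the unseen word "IL-4" has a raw emission estimate of zero for the PROTEIN tag, while the smoothed estimate is positive because its most similar word was seen as a protein name. That is the intuition behind letting unlabeled data inform the labeled model.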