We propose a hybrid approach to bilingual multiword expression extraction. The extraction process has two phases. In the first phase, a large number of candidates are extracted from the corpus by statistical methods. The multiple sequence alignment algorithm used here is sensitive to flexible multiword expressions. In the second phase, error-driven rules and patterns are extracted from the corpus, and these trained rules are used to filter the candidates. Because the system has many parameters, we designed a set of experiments to find the best-performing configuration. Experimental results show that our approach performs well.
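
The two-phase structure described above (statistical candidate extraction followed by rule-based filtering) can be illustrated with a minimal sketch. It is not the authors' system: it is monolingual, it swaps in pointwise mutual information over adjacent word pairs in place of multiple sequence alignment, and it uses a fixed stopword rule in place of rules learned from errors. All function names, thresholds, and the stopword list are hypothetical.

```python
import math
from collections import Counter


def extract_candidates(sentences, min_count=2, min_pmi=1.0):
    """Phase 1 (assumed stand-in): score adjacent word pairs by PMI.

    The abstract uses statistical methods with multiple sequence alignment;
    PMI over bigrams is swapped in here only to keep the sketch short.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    candidates = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # too rare to score reliably
        p_pair = count / total_bigrams
        p_w1 = unigrams[w1] / total_unigrams
        p_w2 = unigrams[w2] / total_unigrams
        pmi = math.log2(p_pair / (p_w1 * p_w2))
        if pmi >= min_pmi:
            candidates[(w1, w2)] = pmi
    return candidates


# Hypothetical hand-written rule; the abstract learns its rules from errors.
STOPWORDS = {"the", "a", "of", "in", "to", "is"}


def filter_candidates(candidates, stopwords=STOPWORDS):
    """Phase 2 (assumed stand-in): drop pairs that start or end with a stopword."""
    return {
        pair: score
        for pair, score in candidates.items()
        if pair[0] not in stopwords and pair[-1] not in stopwords
    }


def extract_mwes(sentences, **kwargs):
    """Run both phases: over-generate candidates, then filter them."""
    return filter_candidates(extract_candidates(sentences, **kwargs))


sentences = [
    "the machine translation system uses machine translation".split(),
    "in the lab the machine translation works".split(),
    "in the lab".split(),
]
candidates = extract_candidates(sentences)
mwes = extract_mwes(sentences)
```

In this toy corpus the first phase deliberately over-generates (pairs such as "the machine" pass the statistical threshold), and the second phase removes them. That mirrors the division of labour in the abstract, where the thresholds `min_count` and `min_pmi` stand in for the kind of parameters the authors tune experimentally.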