We discuss an approach to the automatic expansion of domain-specific lexicons, that is, to the problem of extending, for each $c_i$ in a predefined set $C = \{c_1, \ldots, c_m\}$ of semantic domains, an initial lexicon $L_i^0$ into a larger lexicon $L_i^1$. Our approach relies on term categorization, defined as the task of labeling previously unlabeled terms according to a predefined set of domains. We approach this as a supervised learning problem in which term classifiers are built using the initial lexicons as training data. Dually to classic text categorization tasks, in which documents are represented as vectors in a space of terms, we represent terms as vectors in a space of documents. We present the results of a number of experiments in which we use a boosting-based learning device for training our term classifiers. We test the effectiveness of our method by using WordNetDomains, a well-known large set of domain-specific lexicons, as a benchmark. Our experiments are performed using the documents in the Reuters Corpus Volume 1 as implicit representations for our terms.
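The dual representation described above, with terms as vectors in a space of documents, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy corpus, the seed lexicons, the function names, and the threshold are hypothetical. A simple nearest-centroid rule with cosine similarity stands in for the boosting-based learner used in the experiments, and each term is assigned to at most one domain to keep the example short.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy corpus; the experiments use Reuters Corpus Volume 1.
docs = [
    "the striker scored a goal in the match",
    "the goalkeeper saved a penalty in the match",
    "the bank raised its interest rate",
    "investors fear the interest rate and inflation",
]

# Hypothetical seed lexicons, playing the role of the initial lexicons L_i^0.
seed_lexicons = {
    "sport": {"goal", "match"},
    "economy": {"interest", "rate"},
}


def term_vectors(docs):
    """Represent each term as a sparse vector indexed by document id."""
    vectors = defaultdict(Counter)
    for doc_id, text in enumerate(docs):
        for term in text.split():
            vectors[term][doc_id] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(d, 0) for d, w in u.items())
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def centroid(terms, vectors):
    """Sum the document-space vectors of a domain's seed terms."""
    total = Counter()
    for term in terms:
        total.update(vectors[term])
    return total


def expand(seed_lexicons, vectors, candidates, threshold=0.3):
    """Add each candidate term to its most similar domain, if similar enough.

    Nearest-centroid stand-in for a trained term classifier (assumption).
    """
    centroids = {c: centroid(ts, vectors) for c, ts in seed_lexicons.items()}
    expanded = {c: set(ts) for c, ts in seed_lexicons.items()}
    for term in candidates:
        scores = {c: cosine(vectors[term], cen) for c, cen in centroids.items()}
        best = max(scores, key=scores.get)
        if scores[best] >= threshold:
            expanded[best].add(term)
    return expanded


vectors = term_vectors(docs)
expanded = expand(seed_lexicons, vectors, ["striker", "penalty", "inflation", "bank"])
```

Here `expanded` plays the role of the larger lexicons $L_i^1$: unlabeled terms that occur in the same documents as a domain's seed terms are pulled into that domain. In a setting closer to the one described, the similarity rule would be replaced by one trained classifier per domain, and a term could be assigned to several domains.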