DIRT @SBT@discovery of inference rules from text
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Text mining for product attribute extraction
ACM SIGKDD Explorations Newsletter
Unsupervised learning of field segmentation models for information extraction
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Bidirectional inference with the easiest-first strategy for tagging sequence data
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Prototype-driven learning for sequence models
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
On-demand information extraction
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Freebase: a collaboratively created graph database for structuring human knowledge
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Joint inference in information extraction
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Semantic annotation of unstructured and ungrammatical text
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Exploiting background knowledge to build reference sets for information extraction
IJCAI'09 Proceedings of the 21st international jont conference on Artifical intelligence
Context and Domain Knowledge Enhanced Entity Spotting in Informal Text
ISWC '09 Proceedings of the 8th International Semantic Web Conference
DBpedia: a nucleus for a web of open data
ISWC'07/ASWC'07 Proceedings of the 6th international The semantic web and 2nd Asian conference on Asian semantic web conference
Unsupervised ontology induction from text
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
A hybrid rule/model-based finite-state framework for normalizing SMS messages
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
From frequency to meaning: vector space models of semantics
Journal of Artificial Intelligence Research
Recognizing named entities in tweets
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
In-domain relation discovery with meta-constraints via posterior regularization
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Template-based information extraction without the templates
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Fine-grained class label markup of search queries
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Bootstrapped named entity recognition for product attribute extraction
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Hi-index | 0.00 |
Short listings such as classified ads or product listings abound on the web. If a computer can reliably extract information from them, it will greatly benefit a variety of applications. Short listings are, however, challenging to process due to their informal styles. In this paper, we present an unsupervised information extraction system for short listings. Given a corpus of listings, the system builds a semantic model that represents typical objects and their attributes in the domain of the corpus, and then uses the model to extract information. Two key features in the system are a semantic parser that extracts objects and their attributes and a listing-focused clustering module that helps group together extracted tokens of same type. Our evaluation shows that the semantic model learned by these two modules is effective across multiple domains.