A common vocabulary is vital to smooth business operation, yet codifying and maintaining an enterprise vocabulary is an arduous, manual task. We describe a process that automatically extracts a domain-specific vocabulary (terms and types) from unstructured enterprise data, guided by term definitions in Linked Open Data (LOD). We validate our techniques by applying them to the Information Technology (IT) domain, using 58 Gartner analyst reports and two LOD sources: DBpedia and Freebase. We also present initial findings on how well these techniques generalize to vocabulary extraction in new domains, such as the energy industry.