Automatic metadata extraction from multilingual enterprise content

Authors:
Melike Şah;Vincent Wade
Affiliations:
Trinity College Dublin, Dublin, Ireland;Trinity College Dublin, Dublin, Ireland
Venue:
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Year:
2010

Abstract

Enterprises provide professionally authored content about their products and services in different languages for use on websites and in customer care. In customer care, personalized information delivery is becoming important because it encourages users to return to the service provider. Personalization usually requires both contextual and descriptive metadata, but the metadata currently authored by content developers is usually quite simple. In this paper, we introduce an automatic metadata extraction framework that extracts multilingual metadata from enterprise content for a personalized information retrieval system. We introduce two new ontologies for metadata creation and a novel semi-automatic topic vocabulary extraction algorithm. We demonstrate and evaluate our approach on the English and German Symantec Norton 360 technical content. The evaluations indicate that the proposed approach produces rich, high-quality metadata for a personalized information retrieval system.