VN-KIM IE: automatic extraction of Vietnamese named-entities on the web

Authors:
Truc-Vien T. Nguyen; Tru H. Cao
Affiliations:
Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Vietnam; Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Vietnam
Venue:
New Generation Computing
Year:
2007

Abstract

The most fascinating advantage of the semantic web would be its capability to understand and process the contents of web pages automatically. Realizing the semantic web involves two main tasks: (1) representation and management of a large amount of data and metadata for web contents; and (2) information extraction and annotation on web pages. On the one hand, recognition of named entities is regarded as a basic and important problem that must be solved before the deeper semantics of a web page can be extracted. On the other hand, semantic web information extraction is a language-dependent problem that requires language-specific natural language processing techniques. This paper introduces VN-KIM IE, the information extraction module of VN-KIM, the semantic web system that we have developed. VN-KIM IE automatically recognizes named entities in Vietnamese web pages by identifying their classes and, where they exist, their addresses in the knowledge base of discourse. That information is then used to annotate those web pages, providing a basis for NE-based searching on them, in contrast to the current keyword-based searching. The design, implementation, and performance of VN-KIM IE are presented and discussed.