Experimental study on the extraction and distribution of textual domain keywords

Authors:
Xiangfeng Luo; Ning Fang; Weimin Xu; Sheng Yu; Kai Yan; Huizhe Xiao
Affiliations:
Digital Content Computing and Semantic Grid Group, Key Lab of Grid Technology, Shanghai University, Shanghai 200072, China (all authors)
Venue:
Concurrency and Computation: Practice & Experience
Year:
2008

Abstract

Domain keywords play a primary role in text classification, clustering and personalized services. This paper proposes a method based on term frequency-inverse document frequency (TFIDF), called TDDF (TFIDF direct document frequency of domain), to extract domain keywords from multiple texts. First, we discuss the optimal parameters of TFIDF for extracting textual keywords and domain keywords. Second, we propose TDDF, which takes the document frequency within the domain into account when extracting domain keywords from multiple texts. Finally, we study the distribution of domain keywords in scientific texts. Experiments and applications show that TDDF is more effective than optimally parameterized TFIDF at extracting domain keywords. After the ubiquitous domain keywords are deleted, the remaining domain keywords follow a normal distribution within a single text. Copyright © 2008 John Wiley & Sons, Ltd.
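
For readers unfamiliar with the baseline, the following is a minimal Python sketch of plain TFIDF keyword ranking, together with a separate per-term document frequency computed over a set of domain texts. The abstract does not give the TDDF formula, so the sketch does not attempt to combine the two quantities. The function names, the length-normalized term frequency, the natural-log IDF and the toy corpus are all illustrative assumptions, not the authors' implementation or parameter choices.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring terms for each tokenized document.

    Assumed weighting (not taken from the paper): tf = count / doc length,
    idf = ln(N / df). Every document is assumed to be non-empty.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        counts = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


def domain_document_frequency(domain_docs):
    """Fraction of the domain's documents in which each term appears."""
    df = Counter(term for doc in domain_docs for term in set(doc))
    return {term: n / len(domain_docs) for term, n in df.items()}


# Hypothetical toy corpus, already tokenized.
docs = [
    "grid computing schedules grid jobs".split(),
    "semantic grid links semantic resources".split(),
    "keyword extraction ranks keyword candidates".split(),
]

top_terms = tfidf_keywords(docs, top_k=1)
domain_df = domain_document_frequency(docs[:2])
```

In this toy corpus "grid" is the most frequent term in the first document, yet plain TFIDF ranks it below terms that occur only once, because it appears in two of the three documents. Its document frequency within the two grid-related texts is 1.0, which is the kind of domain-level signal the abstract says TDDF takes into account.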