Using SOFM to improve web site text content

Authors:
Sebastián A. Ríos;Juan D. Velásquez;Eduardo S. Vera;Hiroshi Yasuda;Terumasa Aoki
Affiliations:
Research Center for Advanced Science and Technology, University of Tokyo;Department of Industrial Engineering, University of Chile;Center for Collaborative Research, University of Tokyo;Research Center for Advanced Science and Technology, University of Tokyo;Research Center for Advanced Science and Technology, University of Tokyo
Venue:
ICNC'05 Proceedings of the First international conference on Advances in Natural Computation - Volume Part II
Year:
2005

Abstract

We introduce a new method to improve web site text content by identifying the most relevant free text in the web pages. To capture variations in web page text, we collect pages over a period of time. The text content of each page is then transformed into a feature vector and used as input to a clustering algorithm, a self-organizing feature map (SOFM), which groups the vectors by common text content. From each cluster, a centroid and its neighbor vectors are extracted. Then, using a reverse clustering analysis, the pages represented by each vector are reviewed in order to find those with similar text content. Finally, the proposed method was tested on a real web site, demonstrating the effectiveness of the approach.
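
The pipeline outlined in the abstract (page text to feature vector, SOFM clustering, then mapping clusters back to pages) can be illustrated with a minimal sketch. It is not the authors' implementation. The TF-IDF weighting, the 2x2 grid, the Gaussian neighborhood, the linear decay schedules, and all function names (`tfidf_vectors`, `train_sofm`, `pages_by_neuron`) are assumptions chosen for illustration, and the sample pages are invented.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Turn each page's text into a unit-length TF-IDF vector (assumed weighting)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    df = Counter(w for toks in tokenized for w in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[w] * math.log(n / df[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid division by zero
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, vec):
    """Index of the neuron whose weight vector is closest to vec."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], vec))


def train_sofm(vectors, rows=2, cols=2, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOFM; returns one weight vector per neuron."""
    rng = random.Random(seed)
    # Initialise each neuron from a randomly chosen input vector.
    weights = [list(rng.choice(vectors)) for _ in range(rows * cols)]
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    radius0 = max(rows, cols) / 2
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1 - frac), 0.5)  # neighborhood shrinks over time
        for vec in rng.sample(vectors, len(vectors)):
            br, bc = coords[best_matching_unit(weights, vec)]
            for i, (r, c) in enumerate(coords):
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2 * radius ** 2))
                weights[i] = [w + lr * h * (x - w) for w, x in zip(weights[i], vec)]
    return weights


def pages_by_neuron(weights, vectors):
    """Reverse step: map each neuron back to the pages it represents."""
    clusters = {}
    for page_id, vec in enumerate(vectors):
        clusters.setdefault(best_matching_unit(weights, vec), []).append(page_id)
    return clusters


# Invented sample pages, for illustration only.
pages = [
    "tokyo university research laboratory",
    "tokyo university research center",
    "cheap flights hotel booking",
    "hotel booking cheap travel deals",
]
vocab, vectors = tfidf_vectors(pages)
weights = train_sofm(vectors)
clusters = pages_by_neuron(weights, vectors)
for neuron, page_ids in sorted(clusters.items()):
    print(neuron, page_ids)
```

Each printed line pairs a neuron index with the pages assigned to it. Reviewing the pages grouped under one neuron, and under its grid neighbors, is one way to read the reverse clustering analysis described above.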