Constructing Web Corpora through Topical Web Partitioning for Term Recognition

Authors:
Wilson Wong;Wei Liu;Mohammed Bennamoun
Affiliations:
School of Computer Science and Software Engineering, University of Western Australia, Crawley, WA 6009;School of Computer Science and Software Engineering, University of Western Australia, Crawley, WA 6009;School of Computer Science and Software Engineering, University of Western Australia, Crawley, WA 6009
Venue:
AI '08 Proceedings of the 21st Australasian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
Year:
2008

Abstract

The need for on-demand discovery of very large, incremental text corpora covering an unrestricted range of domains, for term recognition in ontology learning, is becoming increasingly pressing. In this paper, we introduce a new 3-phase web partitioning approach for automatically constructing web corpora to support term recognition. An evaluation of the web corpora constructed using our approach demonstrated high precision in term recognition, comparable to the precision obtained with manually created local corpora.