An approach for selecting seed URLs of focused crawler based on user-interest ontology

Authors:
Yajun Du;Yufeng Hai;Chunzhi Xie;Xiaoming Wang
Venue:
Applied Soft Computing
Year:
2014

Abstract

Seed URL selection for a focused Web crawler aims to guide the crawler toward relevant, valuable information that meets a user's personal information needs, and so to make information retrieval more effective. In this paper, we propose a seed URL selection approach based on a user-interest ontology. To enrich semantic queries, we first apply Formal Concept Analysis to construct user-interest concept lattices from user log profiles. By merging these concept lattices, we construct the user-interest ontology, which describes implicit concepts and the relationships between them more appropriately for semantic representation and query matching. We then use the user-interest ontology to extract the user's interest topic area and to expand user queries, retrieving the most related pages as seed URLs, which serve as the entry points of the focused crawler. In particular, we focus on how to refine the user topic area using a bipartite directed graph. Experiments show that the user-interest ontology can be constructed effectively by merging concept lattices, and that the proposed approach selects a high-quality collection of seed URLs and improves the average precision of the focused Web crawler.
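
To make the Formal Concept Analysis step more concrete, the following is a minimal sketch of how formal concepts (the nodes of a concept lattice) can be derived from a small user log. It is not the authors' algorithm: it uses brute-force enumeration, which is only practical for toy inputs, and the log contents, the query names `q1` to `q3`, and the function name `formal_concepts` are all hypothetical and chosen for illustration.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts of a small formal context.

    `context` maps each object (here: a logged query) to the set of
    attributes (here: terms) it has. Each concept is returned as an
    (extent, intent) pair of frozensets. Brute force: exponential in
    the number of objects, so suitable for toy examples only.
    """
    objects = list(context)
    all_attrs = set().union(*context.values()) if context else set()

    def intent_of(objs):
        # Attributes shared by every object in objs
        # (all attributes when objs is empty).
        shared = set(all_attrs)
        for o in objs:
            shared &= context[o]
        return frozenset(shared)

    def extent_of(attrs):
        # Objects that have every attribute in attrs.
        return frozenset(o for o in objects if attrs <= context[o])

    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = intent_of(subset)
            # Closing the subset yields a pair where extent and
            # intent determine each other, i.e. a formal concept.
            concepts.add((extent_of(intent), intent))
    return concepts


# Hypothetical user log: each query and the terms it contained.
log = {
    "q1": {"crawler", "ontology"},
    "q2": {"crawler", "seed"},
    "q3": {"ontology", "lattice"},
}

concepts = formal_concepts(log)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Each printed pair groups the queries that share a set of terms; ordering these pairs by inclusion of their extents gives the concept lattice. How lattices from several users are merged into one ontology is specific to the paper and is not sketched here.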