As the Internet grows exponentially, topic-specific web crawlers are becoming increasingly popular in web data mining. How to order unvisited URLs has been studied extensively. We introduce the notion of a concept similarity context graph and propose a novel approach to topic-specific web crawling that computes the prediction score of unvisited URLs from the similarity of concepts in Formal Concept Analysis (FCA), improving retrieval precision and recall. We first build a concept lattice from the visited pages, then extract from the lattice the core concepts that reflect the user's query topic, and finally construct the concept similarity context graph based on the semantic similarities between the core concepts and the other concepts.
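The pipeline described above (concept lattice from visited pages, core concept for the query, similarity-based scoring) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the toy context of pages and terms, the brute-force concept enumeration, the weighted Jaccard similarity over extents and intents, and the rule that scores a link by the best-matching concept containing its parent page.

```python
from itertools import combinations

# Hypothetical formal context: visited pages (objects) -> index terms (attributes).
context = {
    "p1": {"crawler", "web", "topic"},
    "p2": {"crawler", "web"},
    "p3": {"lattice", "concept"},
    "p4": {"crawler", "concept", "topic"},
}
ALL_TERMS = frozenset().union(*context.values())


def extent(terms):
    """Pages that contain every term in `terms`."""
    return frozenset(p for p, t in context.items() if set(terms) <= t)


def intent(pages):
    """Terms shared by every page in `pages` (all terms if `pages` is empty)."""
    return frozenset(t for t in ALL_TERMS if all(t in context[p] for p in pages))


def concepts():
    """Enumerate all formal concepts (extent, intent) by closing every page subset.

    Brute force, exponential in the number of pages; fine for a toy context only.
    """
    found = set()
    pages = list(context)
    for r in range(len(pages) + 1):
        for subset in combinations(pages, r):
            b = intent(subset)
            found.add((extent(b), b))
    return found


def jaccard(x, y):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    union = x | y
    return len(x & y) / len(union) if union else 1.0


def similarity(c1, c2, w=0.5):
    """Assumed similarity: weighted Jaccard over extents and intents."""
    return w * jaccard(c1[0], c2[0]) + (1 - w) * jaccard(c1[1], c2[1])


def core_concept(query_terms):
    """Concept generated by the query terms: close the terms to a full concept."""
    a = extent(query_terms)
    return (a, intent(a))


def score(parent_page, core, lattice):
    """Assumed prediction score for links found on `parent_page`:
    best similarity between the core concept and any concept covering that page."""
    return max(similarity(core, c) for c in lattice if parent_page in c[0])


lattice = concepts()
core = core_concept({"crawler", "topic"})
ranking = sorted(context, key=lambda p: score(p, core, lattice), reverse=True)
print(ranking)
```

In this toy setting, links found on pages that belong to the core concept's extent are crawled first, and links from pages that share no terms with the query fall to the end of the queue. A real crawler would replace the brute-force enumeration with an incremental lattice construction algorithm and would use the semantic similarity measure defined in the paper.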