Evaluating topic-driven web crawlers
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Formal Concept Analysis: Mathematical Foundations
Exploiting hierarchical domain structure to compute similarity
ACM Transactions on Information Systems (TOIS)
Focused Crawling Using Context Graphs
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Link Contexts in Classifier-Guided Topical Crawlers
IEEE Transactions on Knowledge and Data Engineering
Pair-Wise entity resolution: overview and challenges
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Using HMM to learn user browsing patterns for focused web crawling
Data & Knowledge Engineering - Special issue: WIDM 2004
A machine learning approach to web page filtering using content and structure analysis
Decision Support Systems
Concept similarity in Formal Concept Analysis: An information content approach
Knowledge-Based Systems
Search Engines: Information Retrieval in Practice
Partially constructed knowledge for semantic query
Expert Systems with Applications: An International Journal
Many-Valued Concept Lattices for Conceptual Clustering and Information Retrieval
Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
Improving the performance of focused web crawlers
Data & Knowledge Engineering
SCTWC: An online semi-supervised clustering approach to topical web crawlers
Applied Soft Computing
Strategy for mining association rules for web pages based on formal concept analysis
Applied Soft Computing
OntoCrawler: A focused crawler with ontology-supported website models for information agents
Expert Systems with Applications: An International Journal
Scaling up top-K cosine similarity search
Data & Knowledge Engineering
A relational vector space model using an advanced weighting scheme for image retrieval
Information Processing and Management: an International Journal
Information Sciences: an International Journal
Using concept lattices for text retrieval and mining
Formal Concept Analysis
Updating broken web links: An automatic recommendation system
Information Processing and Management: an International Journal
ICDM '06: Proceedings of the 6th Industrial Conference on Data Mining, Advances in Data Mining: Applications in Medicine, Web Mining, Marketing, Image and Signal Mining
Ontology-based concept similarity in Formal Concept Analysis
Information Sciences: an International Journal
Probabilistic Models for Focused Web Crawling
Computational Intelligence
Semantic ranking of web pages based on formal concept analysis
Journal of Systems and Software
Reprint of: The anatomy of a large-scale hypertextual web search engine
Computer Networks: The International Journal of Computer and Telecommunications Networking
Reprint of: Efficient crawling through URL ordering
Computer Networks: The International Journal of Computer and Telecommunications Networking
A new case-based classification using incremental concept lattice knowledge
Data & Knowledge Engineering
FoCUS: Learning to Crawl Web Forums
IEEE Transactions on Knowledge and Data Engineering
Review: Formal Concept Analysis in knowledge processing: A survey on models and techniques
Expert Systems with Applications: An International Journal
With the Internet growing exponentially, search engines face unprecedented challenges. A focused search engine selectively seeks out web pages that are relevant to a user's topic, and determining the best crawling strategy for such a search is a crucial and active research topic. At present, the rank values of unvisited web pages are computed from hyperlinks (as in the PageRank algorithm), from a Vector Space Model, or from a combination of the two, rather than from the semantic relations between the user's topic and the unvisited pages. In this paper, we propose a concept context graph that stores the knowledge context derived from the user's history of clicked web pages and guides a focused crawler in its next crawl. The concept context graph provides a novel semantic ranking that steers the crawler toward pages highly relevant to the user's topic. By computing the concept distance and concept similarity among the concepts of the concept context graph, and by matching unvisited web pages against the graph, we compute rank values for the unvisited pages and pick out the relevant hyperlinks. We also implement the focused crawling system and measure the precision, recall, average harvest rate, and F-measure of our approach against Breadth-First, Cosine Similarity, the Link Context Graph, and the Relevancy Context Graph. The results show that the proposed method outperforms the other methods.
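To make the ranking idea concrete, the following is a minimal sketch, not the paper's actual algorithm: it assumes the concept context graph is a plain adjacency dict, takes concept distance to be shortest-path length, maps distance to a similarity in (0, 1], and ranks an unvisited page by the best similarity between the user's topic concept and any concept matched on the page. All names, the example graph, and the 1/(1+d) similarity form are illustrative assumptions.

```python
from collections import deque

def concept_distance(graph, a, b):
    """Shortest-path length between two concepts in the concept context
    graph (BFS over an adjacency dict); None if unreachable."""
    if a == b:
        return 0
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, d = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == b:
                return d + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None  # no path: the concepts are unrelated in this graph

def concept_similarity(graph, a, b):
    """Map distance to a (0, 1] similarity: closer concepts score higher.
    The 1/(1+d) form is an illustrative choice, not the paper's measure."""
    d = concept_distance(graph, a, b)
    return 0.0 if d is None else 1.0 / (1 + d)

def rank_page(graph, topic, page_concepts):
    """Rank an unvisited page by the best similarity between the user's
    topic concept and any concept matched on the page."""
    return max((concept_similarity(graph, topic, c) for c in page_concepts),
               default=0.0)

# Hypothetical concept context graph built from the user's clicked pages
# (edges listed symmetrically so the graph is undirected).
graph = {
    "machine learning": ["classification", "clustering"],
    "classification": ["machine learning", "svm"],
    "clustering": ["machine learning"],
    "svm": ["classification"],
}
score = rank_page(graph, "machine learning", ["svm", "databases"])
```

A crawler frontier would then sort unvisited hyperlinks by this score and fetch the highest-ranked ones first; here "svm" lies two hops from the topic (similarity 1/3) while "databases" is unreachable and contributes nothing.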