Despite the increased prevalence of sentiment-related information on the Web, there has been limited work on focused crawlers capable of effectively collecting not only topic-relevant but also sentiment-relevant content. In this article, we propose a novel focused crawler that incorporates topic and sentiment information as well as a graph-based tunneling mechanism for enhanced collection of opinion-rich Web content regarding a particular topic. The graph-based sentiment (GBS) crawler uses a text classifier that employs both topic and sentiment categorization modules to assess the relevance of candidate pages. This information is also used to label nodes in Web graphs that are employed by the tunneling mechanism to improve collection recall. Experimental results on two test beds revealed that GBS provided better precision and recall than seven comparison crawlers. Moreover, GBS collected a large proportion of the relevant content after traversing far fewer pages than the comparison methods. GBS outperformed the comparison methods on various categories of Web pages in the test beds, including blogs, Web forums, and social networking Web site content. Further analysis revealed that both the sentiment classification module and the graph-based tunneling mechanism played an integral role in the overall effectiveness of the GBS crawler.
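To make the general idea concrete, the following is a minimal sketch of a focused crawl that scores pages on both topic and sentiment and tolerates a bounded number of irrelevant pages along a path. It is not the GBS implementation: the toy page set, the keyword-based scorers, and the `max_tunnel` parameter are all hypothetical, and a simple depth-bounded tunneling rule stands in for the graph-based tunneling mechanism described in the abstract.

```python
import heapq

# Hypothetical toy Web: page -> (text, outgoing links). Not data from the article.
PAGES = {
    "seed": ("acme phone review links", ["hub", "offtopic"]),
    "hub": ("directory of pages", ["review1", "spec"]),
    "review1": ("acme phone is great, love it", []),
    "spec": ("acme phone has 4gb memory", []),
    "offtopic": ("weather is terrible today", ["review2"]),
    "review2": ("acme phone battery is awful", []),
}

# Assumed keyword lists standing in for trained topic and sentiment classifiers.
TOPIC_TERMS = {"acme", "phone"}
SENTIMENT_TERMS = {"great", "love", "awful", "terrible"}


def _words(text):
    return set(text.replace(",", " ").split())


def topic_score(text):
    """Fraction of topic terms that appear in the page text."""
    return len(_words(text) & TOPIC_TERMS) / len(TOPIC_TERMS)


def sentiment_score(text):
    """1.0 if the page contains any opinion-bearing term, else 0.0."""
    return 1.0 if _words(text) & SENTIMENT_TERMS else 0.0


def is_relevant(text):
    """A page counts as relevant only if it is on topic AND opinionated."""
    return topic_score(text) == 1.0 and sentiment_score(text) == 1.0


def crawl(seed, max_tunnel=1):
    """Best-first crawl; a path is abandoned once it contains more than
    max_tunnel consecutive irrelevant pages (simplified tunneling)."""
    # Heap entries: (negated parent score, url, consecutive irrelevant pages so far)
    frontier = [(0.0, seed, 0)]
    seen = {seed}
    collected = []
    while frontier:
        _, url, misses = heapq.heappop(frontier)
        text, links = PAGES[url]
        if is_relevant(text):
            collected.append(url)
            misses = 0
        else:
            misses += 1
        if misses > max_tunnel:
            continue  # too many irrelevant pages in a row; do not expand
        priority = topic_score(text) + sentiment_score(text)
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-priority, link, misses))
    return collected
```

In this toy graph both opinionated, on-topic pages sit behind irrelevant pages, so `crawl("seed", max_tunnel=0)` collects nothing, while `crawl("seed", max_tunnel=2)` reaches `review1` and `review2` and still skips the on-topic but opinion-free `spec` page.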