We consider the problem of segmenting a webpage into visually and semantically cohesive pieces. Our approach formulates an optimization problem on weighted graphs, where the weights capture whether two nodes in the DOM tree should be placed together or apart in the segmentation. We present a framework for learning these weights from manually labeled data in a principled manner. Our work is a significant departure from previous heuristic and rule-based solutions to the segmentation problem. Our empirical analysis brings out interesting aspects of the framework, including variants of the optimization problem and the role of learning.
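To make the idea of "together or apart" weights concrete, the following is a minimal sketch of one possible objective of this kind, written in a correlation-clustering style. The abstract does not spell out the exact formulation, so everything here is an assumption for illustration: the function names `segmentation_cost` and `greedy_pivot`, the node names, and the hand-set weights (which in the described framework would be learned from labeled data). The greedy pivot heuristic is a simple stand-in, not the optimization method of the paper.

```python
from itertools import combinations


def segmentation_cost(labels, together, apart):
    """Total penalty of a segmentation (hypothetical objective).

    labels:   dict mapping DOM node name -> segment id
    together: dict mapping a sorted node pair -> penalty paid when the
              two nodes end up in DIFFERENT segments
    apart:    dict mapping a sorted node pair -> penalty paid when the
              two nodes end up in the SAME segment
    """
    cost = 0.0
    # combinations over sorted names yields pairs (u, v) with u < v,
    # matching the sorted-pair keys of the weight dictionaries.
    for u, v in combinations(sorted(labels), 2):
        if labels[u] == labels[v]:
            cost += apart.get((u, v), 0.0)
        else:
            cost += together.get((u, v), 0.0)
    return cost


def greedy_pivot(nodes, together, apart):
    """Greedy heuristic: take the first unassigned node as a pivot and
    pull in every remaining node that prefers being with the pivot."""
    remaining = sorted(nodes)
    labels = {}
    segment = 0
    while remaining:
        pivot = remaining.pop(0)
        labels[pivot] = segment
        leftover = []
        for v in remaining:
            key = (pivot, v)  # pivot < v because the list is sorted
            if together.get(key, 0.0) > apart.get(key, 0.0):
                labels[v] = segment
            else:
                leftover.append(v)
        remaining = leftover
        segment += 1
    return labels


# Toy page with two body nodes and two navigation nodes (illustrative).
nodes = ["body1", "body2", "nav1", "nav2"]
together = {
    ("body1", "body2"): 0.9, ("nav1", "nav2"): 0.8,
    ("body1", "nav1"): 0.1, ("body1", "nav2"): 0.1,
    ("body2", "nav1"): 0.2, ("body2", "nav2"): 0.1,
}
apart = {
    ("body1", "body2"): 0.1, ("nav1", "nav2"): 0.2,
    ("body1", "nav1"): 0.9, ("body1", "nav2"): 0.9,
    ("body2", "nav1"): 0.8, ("body2", "nav2"): 0.9,
}

labels = greedy_pivot(nodes, together, apart)
one_segment = {n: 0 for n in nodes}
print(labels)
print(segmentation_cost(labels, together, apart))
print(segmentation_cost(one_segment, together, apart))
```

On this toy input the heuristic groups the two body nodes in one segment and the two navigation nodes in another, which costs less under the sketched objective than keeping the whole page as a single segment.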