Web page segmentation is an important task that benefits a variety of applications, ranging from data extraction to accessibility improvement. By focusing on the smallest content units of a web page, page segmentation can be reduced to clustering web content into structurally and semantically cohesive groups. To investigate the web page segmentation task from the clustering point of view, we define three distance measures for content units, based on their DOM, geometric, and semantic properties. We combine these distance measures with common clustering techniques and evaluate segmentation accuracy on a labelled collection using widely used validity measures.
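The idea of clustering content units under DOM, geometric and semantic distances can be illustrated with a minimal sketch. The abstract does not give the actual definitions, so everything below is an assumption made for illustration: the `Unit` record, the tree-path DOM distance, the rectangle-gap geometric distance, the Jaccard-based semantic distance, the equal weights, and the average-linkage clustering with a stop threshold are stand-ins, not the measures or techniques evaluated in the paper.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Unit:
    """Hypothetical content unit: a leaf of the page with its DOM path, box and text."""
    dom_path: tuple  # tag path from the root, e.g. ("html", "body", "nav", "a[1]")
    box: tuple       # (x, y, width, height) in pixels
    text: str


def dom_distance(a, b):
    # Assumed measure: tree steps between the two nodes via their lowest
    # common ancestor, normalised by the sum of both path lengths.
    common = 0
    for x, y in zip(a.dom_path, b.dom_path):
        if x != y:
            break
        common += 1
    steps = (len(a.dom_path) - common) + (len(b.dom_path) - common)
    return steps / (len(a.dom_path) + len(b.dom_path))


def geometric_distance(a, b, page_diag):
    # Assumed measure: gap between the two rectangles, relative to the page diagonal.
    ax, ay, aw, ah = a.box
    bx, by, bw, bh = b.box
    dx = max(bx - (ax + aw), ax - (bx + bw), 0)
    dy = max(by - (ay + ah), ay - (by + bh), 0)
    return min(1.0, (dx * dx + dy * dy) ** 0.5 / page_diag)


def semantic_distance(a, b):
    # Assumed measure: Jaccard distance over lower-cased word sets.
    ta, tb = set(a.text.lower().split()), set(b.text.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def combined_distance(a, b, page_diag, weights=(1 / 3, 1 / 3, 1 / 3)):
    # Weighted sum of the three distances; equal weights are an arbitrary choice.
    parts = (dom_distance(a, b), geometric_distance(a, b, page_diag), semantic_distance(a, b))
    return sum(w * d for w, d in zip(weights, parts))


def cluster(units, threshold, page_diag):
    # Average-linkage agglomerative clustering: repeatedly merge the closest
    # pair of clusters until the closest pair is farther apart than threshold.
    clusters = [[u] for u in units]

    def linkage(c1, c2):
        total = sum(combined_distance(a, b, page_diag) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Toy page: two navigation links and two article paragraphs (made-up data).
units = [
    Unit(("html", "body", "nav", "a[1]"), (0, 0, 100, 20), "home news"),
    Unit(("html", "body", "nav", "a[2]"), (110, 0, 100, 20), "news sport"),
    Unit(("html", "body", "main", "p[1]"), (0, 400, 800, 100),
         "web page segmentation clusters content"),
    Unit(("html", "body", "main", "p[2]"), (0, 510, 800, 100),
         "segmentation of web content by clustering"),
]
page_diag = (1000 ** 2 + 1000 ** 2) ** 0.5  # assumed 1000 x 1000 px page
segments = cluster(units, threshold=0.45, page_diag=page_diag)
for seg in segments:
    print([u.text for u in seg])
```

On this toy input the navigation links and the article paragraphs end up in separate groups. The threshold and weights were picked by hand here; in a real evaluation they would be tuned against a labelled collection and scored with cluster validity measures, which this sketch does not implement.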