We demonstrate myPortal, an application for extracting and aggregating web content blocks. We also explain the research issues behind the tool, with an emphasis on the robustness of web content extraction.