This paper proposes a new web content structure based on visual representation. Many web applications, such as information retrieval, information extraction, and automatic page adaptation, can benefit from this structure. The paper presents an automatic, top-down, tag-tree-independent approach to detecting web content structure. It simulates how a user understands web layout structure based on visual perception. Compared with existing techniques, the approach is independent of the underlying document representation, such as HTML, and works well even when the HTML structure differs greatly from the layout structure. Experiments show satisfactory results.
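To make the idea of top-down, tag-tree-independent segmentation concrete, the sketch below recursively splits a set of rendered bounding boxes wherever an empty visual gap separates them. It is not the algorithm from the paper, whose details the abstract does not give. It is a simple recursive XY-cut-style illustration, and the names `Block`, `split_on_gaps`, `segment`, and the `min_gap` threshold are all assumptions introduced here. It works only on pixel rectangles and never consults the HTML tree.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical input: one rendered rectangle per visible element,
# given as (left, top, right, bottom) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Block:
    """A node of the visual content structure (illustrative, not from the paper)."""
    boxes: List[Box]
    children: List["Block"] = field(default_factory=list)


def split_on_gaps(boxes: List[Box], axis: int, min_gap: int) -> List[List[Box]]:
    """Group boxes into runs separated by empty gaps of at least min_gap pixels.

    axis 0 projects onto x (finds vertical separators),
    axis 1 projects onto y (finds horizontal separators).
    """
    ordered = sorted(boxes, key=lambda b: b[axis])
    groups = [[ordered[0]]]
    reach = ordered[0][axis + 2]  # farthest edge covered so far on this axis
    for box in ordered[1:]:
        if box[axis] - reach >= min_gap:
            groups.append([box])  # a wide enough gap starts a new group
        else:
            groups[-1].append(box)
        reach = max(reach, box[axis + 2])
    return groups


def segment(boxes: List[Box], min_gap: int = 10) -> Block:
    """Top-down segmentation: split on horizontal gaps first, then vertical ones."""
    block = Block(boxes)
    if len(boxes) < 2:
        return block
    for axis in (1, 0):
        groups = split_on_gaps(boxes, axis, min_gap)
        if len(groups) > 1:
            block.children = [segment(g, min_gap) for g in groups]
            break
    return block


# Usage: a header, a navigation column beside an article, and a footer.
page = [(0, 0, 800, 50), (0, 70, 150, 600), (170, 70, 800, 600), (0, 620, 800, 660)]
tree = segment(page)
```

On this hypothetical layout the root splits into three horizontal bands, and the middle band splits again into the navigation column and the article, even though nothing about the markup nesting was used. A practical system would also need visual cues beyond whitespace, such as background color or font changes, which this sketch omits.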