Today, surfing the net is no longer limited to searching for scientific information: a generic user is interested in many kinds of information, about business, music, travel, and so on. When accessing web documents, however, the lack of explicit structure makes it hard to understand data semantics, so comprehending the logical organization of web data relies on the user's intuition of the author's underlying schema. In this paper, we present an approach to web structuring based on analyzing the structure and semantics of both web pages and sites, in order to discover hidden schemas and provide them to users. The intended benefits of this work are to facilitate navigation inside web documents and sites, to promote the use of more powerful, semantic-based search methods, and to allow better management and redesign of pages and sites.