The logical hierarchies of Web sites (i.e., Web site taxonomies) are obvious to humans, because humans can distinguish different menu levels and their relationships. Such accurate information about the logical structure is not yet available to machines, however. Many applications would benefit if Web site taxonomies could be mined from menus, but in the past this was an almost unsolvable problem. Although a tag newly introduced in HTML5 and novel mining methods now make it possible to distinguish menus from other content, how the underlying taxonomies can be extracted from given menus has not yet been researched. In this paper we present the first detailed analysis of the problem and introduce rule-based concepts for addressing each identified subproblem. We report on a large-scale study on mining the hierarchical menus of 350 randomly selected domains. Our methods extract Web site taxonomy information that was not previously available, with high precision and high recall.
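To make the task concrete, the following is a minimal sketch of reading a taxonomy from a menu. It is not the rule-based method of the paper. It assumes that the HTML5 tag in question is the `nav` element and that menu levels are encoded as nested `ul`/`ol` lists of links, which real sites often violate. The names `NavMenuParser`, `build_taxonomy`, and `extract_menu_taxonomy` are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class NavMenuParser(HTMLParser):
    """Collect (depth, label) pairs for links in nested lists inside <nav>."""

    def __init__(self):
        super().__init__()
        self.nav_depth = 0     # > 0 while inside a <nav> element
        self.list_depth = 0    # nesting level of <ul>/<ol> inside <nav>
        self.in_link = False   # True while inside a menu <a> element
        self.label_parts = []  # text fragments of the current link
        self.items = []        # (depth, label) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self.nav_depth += 1
        elif self.nav_depth and tag in ("ul", "ol"):
            self.list_depth += 1
        elif self.nav_depth and self.list_depth and tag == "a":
            self.in_link = True
            self.label_parts = []

    def handle_endtag(self, tag):
        if tag == "nav" and self.nav_depth:
            self.nav_depth -= 1
            if self.nav_depth == 0:
                self.list_depth = 0  # reset in case of unclosed lists
        elif self.nav_depth and tag in ("ul", "ol") and self.list_depth:
            self.list_depth -= 1
        elif tag == "a" and self.in_link:
            self.in_link = False
            # Normalize whitespace inside the link text.
            label = " ".join("".join(self.label_parts).split())
            if label:
                self.items.append((self.list_depth, label))

    def handle_data(self, data):
        if self.in_link:
            self.label_parts.append(data)


def build_taxonomy(items):
    """Turn (depth, label) pairs into a nested {label: subtree} dict.

    Assumption: a deeper item belongs to the closest preceding shallower
    item. Duplicate labels under the same parent overwrite each other.
    """
    root = {}
    stack = [(0, root)]  # (depth, children dict); the root has depth 0
    for depth, label in items:
        while stack[-1][0] >= depth:
            stack.pop()
        node = {}
        stack[-1][1][label] = node
        stack.append((depth, node))
    return root


def extract_menu_taxonomy(html):
    """Parse an HTML string and return the taxonomy of its <nav> menus."""
    parser = NavMenuParser()
    parser.feed(html)
    parser.close()
    return build_taxonomy(parser.items)


SAMPLE = """
<nav><ul>
  <li><a href="/products">Products</a>
    <ul>
      <li><a href="/products/laptops">Laptops</a></li>
      <li><a href="/products/phones">Phones</a></li>
    </ul>
  </li>
  <li><a href="/about">About</a></li>
</ul></nav>
<ul><li><a href="/imprint">Imprint</a></li></ul>
"""

taxonomy = extract_menu_taxonomy(SAMPLE)
```

In this sample, `Laptops` and `Phones` end up as children of `Products`, while the `Imprint` link is ignored because it lies outside any `nav` element. Menus built from `div` elements, flat lists styled as hierarchies, or levels spread across several pages would need additional rules that this sketch does not attempt.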