The Web is transforming from a mere information-dissemination platform into a distributed knowledge-based platform for supporting complex problem solving. However, the existing Web contains a large amount of knowledge that is tagged only with layout-related markup, making it hard to discover and use. In this paper, we propose to model semantic-rich and self-contained knowledge units embedded in a web site as a mixture of bipartite sub-graphs, and to extract these sub-graphs as an abstraction of the web site via hyperlink structure and file hierarchy analysis. We derive a recursive algorithm, named ReHITS, that identifies bipartite sub-graphs with a hierarchical organization. Each identified sub-graph contains a set of associated authorities and hubs that serve as its summarized semantic description. The effectiveness of the algorithm has been evaluated on three real web sites (containing approximately 10,000 web pages) with promising results. A detailed interpretation of the experimental results and a qualitative comparison with related work are also included.
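To make the notions of hubs and authorities concrete, the following is a minimal sketch of the standard HITS hub/authority iteration on a small directed link graph. It is not the ReHITS algorithm itself, whose recursive step and file hierarchy analysis are not specified in this abstract; the function name `hits`, the page names, and the toy link structure are hypothetical and chosen only for illustration.

```python
import math


def hits(edges, iterations=50):
    """Standard HITS power iteration (illustrative only, not ReHITS).

    edges: iterable of (source_page, target_page) hyperlinks.
    Returns two dicts mapping each page to its hub and authority score.
    """
    edges = list(edges)
    nodes = sorted({page for edge in edges for page in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]

        # Hub score: sum of the authority scores of pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        hub = new_hub

        # L2-normalise both score vectors so the values stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm

    return hub, auth


# Hypothetical toy site: two menu pages (hubs) linking to content pages.
example_edges = [
    ("menu1", "a"), ("menu1", "b"),
    ("menu2", "a"), ("menu2", "b"), ("menu2", "c"),
]
hub_scores, auth_scores = hits(example_edges)
```

In this toy graph the menu pages and the content pages form a bipartite sub-graph: the menu pages receive all of the hub weight and the content pages all of the authority weight, which is the kind of structure the abstract describes as a knowledge unit.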