Constructing Web pages from fragments has been shown to provide significant benefits for both content generation and caching. For a Web site to use fragment-based content generation, however, it needs good methods for fragmenting its Web pages. Manual fragmentation is expensive, error-prone, and does not scale. This paper proposes a novel scheme to automatically detect and flag fragments that are cost-effective cache units in Web sites serving dynamic content. Our approach analyzes Web pages with respect to their information-sharing behavior, personalization characteristics, and change patterns. We identify fragments that are shared among multiple documents or that differ from the rest of the page in lifetime or personalization characteristics. Our approach has three unique features. First, we propose a framework for fragment detection, which includes a hierarchical, fragment-aware model for dynamic Web pages and a compact and effective data structure for fragment detection. Second, we present an efficient algorithm to detect maximal fragments that are shared among multiple documents. Third, we develop a practical algorithm that effectively detects fragments based on their lifetime and personalization characteristics. We apply the algorithms to real Web sites and evaluate the proposed scheme through a series of experiments that show the benefits and costs of the algorithms. We also study the impact of using the fragments detected by our system on key parameters such as disk space utilization, network bandwidth consumption, and load on the origin servers.
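To make the idea of maximal shared fragments concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each page is already parsed into a tree, and it treats two subtrees as the same fragment only when their content hashes match exactly. The `Node` class, the `digest` and `shared_fragments` functions, and the `min_docs` parameter are all hypothetical names introduced for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical page-tree node: a tag, optional text, and child nodes."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)


def digest(node, hashes):
    """Hash a subtree bottom-up and record the hash under the node's id."""
    parts = [node.tag, node.text] + [digest(c, hashes) for c in node.children]
    h = hashlib.sha1("\x00".join(parts).encode("utf-8")).hexdigest()
    hashes[id(node)] = h
    return h


def shared_fragments(pages, min_docs=2):
    """Return {hash: (tag, page names)} for maximal exactly-shared subtrees.

    Assumption: "shared" means an identical subtree occurs in at least
    min_docs pages; near-duplicate matching is out of scope for this sketch.
    """
    hashes, owners = {}, {}
    # Pass 1: hash every subtree and note which pages contain each hash.
    for name, root in pages.items():
        digest(root, hashes)
        stack = [root]
        while stack:
            node = stack.pop()
            owners.setdefault(hashes[id(node)], set()).add(name)
            stack.extend(node.children)
    # Pass 2: walk top-down and stop at the first shared subtree on each
    # path, so nested pieces of an already-reported fragment are skipped.
    found = {}
    for root in pages.values():
        stack = [root]
        while stack:
            node = stack.pop()
            h = hashes[id(node)]
            if len(owners[h]) >= min_docs:
                found[h] = (node.tag, sorted(owners[h]))
            else:
                stack.extend(node.children)
    return found


# Two toy pages that share a navigation bar but differ in their bodies.
pages = {
    "home": Node("html", children=[Node("nav", "Home | News"),
                                   Node("div", "Welcome, alice")]),
    "news": Node("html", children=[Node("nav", "Home | News"),
                                   Node("div", "Headlines")]),
}
print(list(shared_fragments(pages).values()))
```

In this toy input only the `nav` subtree is reported, since the two `div` bodies differ and so do the enclosing `html` roots. Detecting fragments by lifetime or personalization would additionally require comparing versions of the same page over time or across users, which this sketch does not attempt.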