Web caching and replication
Zipf and Heaps Laws' Coefficients Depend on Language
CICLing '01 Proceedings of the Second International Conference on Computational Linguistics and Intelligent Text Processing
Characteristics of WWW Client-based Traces
Web Page Downloading and Classification
CBMS '01 Proceedings of the Fourteenth IEEE Symposium on Computer-Based Medical Systems
Analysis of web page image tag distribution characteristics
Information Processing and Management: an International Journal
Changeability of Web Objects - Browser Perspective
ISDA '05 Proceedings of the 5th International Conference on Intelligent Systems Design and Applications
Data for Web mining are usually extracted from WWW server or proxy server log files. This paper examines the advantages and disadvantages of another source of input data, the browser buffer, and compares the properties of data extracted from the different types of sources. The browser buffer holds data about users' navigational habits as well as the formal properties and content of all recently accessed WWW objects. The paper uses data obtained from this source to examine the statistical properties of different types of text extracted from HTML pages.