PARC HTTP-NG: Web Characterization Reading List

This page is the beginning of a reading list on Web characterization. The readings are separated into four sections: clients, servers, proxy caches, and the Web. The clients section covers characterizations based upon instrumented browsers and/or application-level caching. Readings on the analysis of usage and the design of server-based caches are found in the servers section. Research related to the design and performance of proxy caches and/or networks of proxy caches is contained in the proxy caches section. Finally, attempts to characterize the entire Web are listed in the Web section.

Several warnings are in order.

  1. The list is neither complete nor guaranteed to be correct. Please send additional readings, suggestions, and/or comments to pitkow@parc.xerox.com.
  2. Although an attempt has been made to properly classify each paper, some papers deal with multiple issues.
  3. Some papers contain antiquated research. These have been included to support longitudinal characterization analysis and trend projection, and to give credit where credit is due.
  4. Serious effort has been made to cite the most definitive version of each work; for example, if a paper first appeared as a technical report and was later published in a conference, the conference citation is listed.

Clients

Bestavros, A., R. L. Carver, et al. (1995). Application-level document caching in the Internet. Proceedings of Second International Workshop on Services in Distributed and Networked Environments (SDNE '95).

Catledge, L. D. and J. E. Pitkow (1995). "Characterizing browsing strategies in the World-Wide Web." Computer Networks and ISDN Systems 27(6): 1065-1073.

Cunha, C. R., A. Bestavros, et al. (1995). Characteristics of WWW client-based traces. Boston, MA, Computer Science Dept., Boston University.

Kuenning, G. H., G. J. Popek, et al. (1994). An analysis of trace data for predictive file caching in mobile computing. Proceedings of the 1994 Summer Usenix Conference.

Tauscher, L. (1996). Evaluating history mechanisms: an empirical study of reuse patterns in World Wide Web navigation. Department of Computer Science. Alberta, Canada, University of Calgary.

Tauscher, L. and S. Greenberg (1997). "How people revisit Web pages: empirical findings and implications for the design of history systems." International Journal of Human Computer Studies 47(1).

Tauscher, L. and S. Greenberg (1997). Revisitation patterns in World Wide Web navigation. Proceedings of the ACM SIGCHI'97 Conference on Human Factors in Computing Systems, Atlanta, GA, ACM.

Servers

Almeida, V. A. F. and A. Oliveira (1996). On the fractal nature of WWW and its applications to cache modeling. Boston, MA, Computer Science Department, Boston University.

Almeida, V., A. Bestavros, et al. (1996). Characterizing reference locality in the WWW. Proceedings of PDIS'96: The IEEE Conference on Parallel and Distributed Information Systems, Miami Beach, FL, IEEE.

Arlitt, M. F. and C. L. Williamson (1995). A synthetic workload model for Internet Mosaic traffic. Proceedings of the 1995 Summer Computer Simulation Conference, Ottawa, Canada.

Arlitt, M. F. and C. L. Williamson (1996). Web server workload characterization: the search for invariants. Proceedings of the 1996 ACM SIGMETRICS Conference on the Measurement and Modeling of Computer Systems, Philadelphia, PA, ACM.

Bestavros, A. (1995). Using speculation to reduce server load and service time on the WWW. Proceedings of CIKM'95: The 4th ACM International Conference on Information and Knowledge Management, Baltimore, MD.

Bestavros, A. (1995). Demand-based document dissemination to reduce traffic and balance load in distributed information systems. Proceedings of SPDP'95: The Seventh IEEE Symposium on Parallel and Distributed Processing, San Antonio, TX.

Bestavros, A. and C. Cunha (1996). "Server-initiated document dissemination for the WWW." IEEE Data Engineering Bulletin September.

Bestavros, A. (1997). "WWW traffic reduction and load balancing through server-based caching." IEEE Concurrency: Special Issue on Parallel and Distributed Technology 5(Jan-Mar).

Blumson, S. (1994). Workload characterization in a large distributed file system. Ann Arbor, MI, Center for Information Technology Integration, University of Michigan.

Bolot, J.-C. and P. Hoschka (1996). "Performance engineering of the World Wide Web: application to dimensioning and cache design." The World Wide Web Journal 1(3): 185-195.

Braun, H.-W. and K. Claffy (1994). Web traffic characterization: an assessment of the impact of caching documents from NCSA's Web server. Proceedings of the Second International World Wide Web Conference, Chicago, IL.

Crovella, M. and A. Bestavros (1995). Explaining World Wide Web traffic self-similarity. Boston, MA, Computer Science Department, Boston University.

Crovella, M. and A. Bestavros (1996). Self-similarity in World Wide Web traffic: evidence and possible causes. Proceedings of SIGMETRICS'96: The ACM International Conference on Measurement and Modeling of Computer Systems, Philadelphia, PA.

Deng, S. (1996). Empirical model of WWW document arrivals at access link. Proceedings of the International Communications Conference (ICC'96), Dallas, TX, IEEE Press.

Kwan, T., R. McGrath, et al. (1995). "NCSA's World Wide Web server: design and performance." IEEE Computer 28(11): 68-74.

Maffeis, S. (1993). "File access patterns in public FTP archives and an index for locality of reference." ACM Sigmetrics Performance Evaluation Review 20(3).

Mogul, J. C. (1995). Network behavior of a busy Web server and its clients. CA, Digital Western Research Laboratory.

Mogul, J. (1995). "The case for persistent-connection HTTP." Computer Communication Review 25(4): 299-313.

Pitkow, J. E. and M. M. Recker (1994). A simple yet robust caching algorithm based upon dynamic access patterns. Proceedings of the Second International World Wide Web Conference, Chicago, IL, NCSA.

Recker, M. M. and J. E. Pitkow (1996). "Predicting document access in large multimedia repositories." Transactions on Computer-Human Interaction 3(4): 352-375.

Thiebaut, D. (1989). "On the fractal dimension of computer programs and its applications to the prediction of the cache miss ratio." IEEE Transactions on Computers 38(7): 1012-1026.

Yan, T. W., M. Jacobsen, et al. (1996). "From user access patterns to dynamic hypertext linking." The World Wide Web Journal 1(3).

Proxy Caches

Abrams, M., C. R. Standridge, et al. (1995). Multimedia traffic analysis using Chitra95. ACM Multimedia '95, San Francisco, CA, ACM.

Abrams, M., C. R. Standridge, et al. (1995). "Caching proxies: limitations and potentials." The World Wide Web Journal 1(1).

Alexander, T. (1995). A distributed predictive cache for high performance computer system. Computer Science. Durham, NC, Duke University.

Chankhunthod, A., P. Danzig, et al. (1996). A hierarchical Internet object cache. Proceedings of the USENIX Annual Technical Conference, San Diego, CA.

Danzig, P., R. Hall, et al. (1993). A case for caching file objects inside networks. Proceedings of ACM SIGCOMM '93.

Glassman, S. (1994). "A caching relay for the World Wide Web." Computer Networks and ISDN Systems 27(2).

Gwertzman, J. (1996). World Wide Web cache consistency. 1996 Proceedings of the Usenix Technical Conference, Boston, MA, Harvard University.

Luotonen, A. and K. Altis (1994). "World-Wide Web proxies." Computer Networks and ISDN Systems 27(2).

Malpani, R., J. Lorch, et al. (1995). "Making World Wide Web caching servers cooperate." The World Wide Web Journal 1(1): 107-117.

Markatos, E. P. (1996). "Main memory caching of Web documents." The World Wide Web Journal 1(3).

Nabeshima, M. (1997). The Japan cache project: an experiment on domain cache. The Sixth International World Wide Web Conference, Santa Clara, CA.

O'Callaghan, D. (1995). A central caching proxy server for WWW users at the University of Melbourne. Proceedings of AusWeb95, the First Australian World Wide Web Conference, University of Melbourne, Australia.

Smith, N. (1994). What can archives offer the World-Wide Web. The First International World Wide Web Conference, Geneva, Switzerland, CERN.

Wessels, D. (1995). Intelligent caching for World-Wide Web objects. Proceedings of INET '95, Honolulu, HI.

Williams, S., M. Abrams, et al. (1996). Removal policies in network caches for World-Wide Web documents. Proceedings of SIGCOMM.

Wooster, R. P. (1996). Optimizing response time, rather than hit rates, of WWW proxy caches. Computer Science. Blacksburg, Virginia, Virginia Polytechnic Institute and State University: 114.

Wooster, R. P. and M. Abrams (1997). Proxy caching that estimates page load delays. The Sixth International World Wide Web Conference, Santa Clara, CA.

The Web

Bray, T. (1996). "Measuring the Web." The World Wide Web Journal 1(3).

Woodruff, A., P. M. Aoki, et al. (1996). "An investigation of documents from the World Wide Web." The World Wide Web Journal 1(3).


updated: $Date: 1997/08/16 00:28:33 $ GMT by $Author: janssen $