Semester

Spring

Date of Graduation

2006

Document Type

Thesis

Degree Type

MS

College

Statler College of Engineering and Mineral Resources

Department

Lane Department of Computer Science and Electrical Engineering

Committee Chair

Katerina Goseva Popstojanova

Abstract

Web servers have a significant presence on today's Internet. Corporations want their Web systems to achieve high availability, scalability, and consistent performance while maintaining high customer service standards. Web workload characterization and the analysis of Web log files provide the basis for planning Web server models for efficiency, scalability, and availability. This thesis analyzes the Web access logs of six public Web sites: the Department of Computer Science and Electrical Engineering at West Virginia University, West Virginia University, three NASA IVV servers, and the Clarknet server. Three private NASA IVV servers are also analyzed.

We characterize sessions using several attributes, such as the number of requests per session, session length in time units, number of bytes transferred per session, and number of erroneous requests per session. We use clustering, an unsupervised learning method, to classify Web server sessions. Unlike most other studies, which focused on building user profiles based on navigational patterns, we use session attributes as the basis for clustering. We also study the effectiveness of Principal Component Analysis for session classification based on clustering.
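As an illustration of the four session attributes named in the abstract, the following is a minimal sketch and not the thesis's actual procedure. It assumes that a session is a run of requests from one client with no gap longer than 30 minutes, and that a request is erroneous when its HTTP status code is 400 or higher. The abstract states neither rule, and the function names and record layout are hypothetical.

```python
from collections import defaultdict

# Assumed inactivity threshold (seconds); the abstract does not state one.
SESSION_TIMEOUT = 30 * 60


def sessionize(requests, timeout=SESSION_TIMEOUT):
    """Group (client, timestamp, status, nbytes) records into per-client sessions.

    A new session starts when the gap between consecutive requests from the
    same client exceeds `timeout` seconds (an assumed rule).
    """
    by_client = defaultdict(list)
    for client, ts, status, nbytes in requests:
        by_client[client].append((ts, status, nbytes))

    sessions = []
    for client, recs in by_client.items():
        recs.sort()  # order each client's requests by timestamp
        current = [recs[0]]
        for rec in recs[1:]:
            if rec[0] - current[-1][0] > timeout:
                sessions.append((client, current))
                current = [rec]
            else:
                current.append(rec)
        sessions.append((client, current))
    return sessions


def session_attributes(session):
    """Compute the four attributes listed in the abstract for one session."""
    client, recs = session
    return {
        "client": client,
        "requests": len(recs),                       # requests per session
        "length_seconds": recs[-1][0] - recs[0][0],  # session length in time units
        "bytes": sum(r[2] for r in recs),            # bytes transferred
        # Assumption: status >= 400 counts as an erroneous request.
        "errors": sum(1 for r in recs if r[1] >= 400),
    }
```

Each resulting attribute record could then serve as one feature vector for a clustering algorithm, optionally after a Principal Component Analysis step.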
