Performance Issues of Designing a High Performance Intranet

By Sudha Jamthe and Subhash Agrawal

Abstract: This paper provides an analysis of the factors affecting performance of an Intranet.
It begins by looking at the working of the Web Server and the evolving performance issues of
the HTTP Protocol. It addresses the impact of proposed new features on the performance of
your enterprise solutions and offers tips and techniques for building a high performance Intranet.

1. Introduction

1.1 What technologies make up the Intranet?

An intranet is a collection of Web Servers, client browsers, and Web-enabled databases and applications set up within the firewall of an enterprise. This client-server model builds upon a distributed environment and enables hardware-independent information dissemination across the company. Intranets are used for storing and sharing intra-departmental information and for encouraging employee interaction through chats and threaded discussions. They are also used to deploy new applications to different business units of an enterprise. The applications may be developed for the web, or they may be independent applications integrated with the web through Common Gateway Interface (CGI) programs.

The Web operates in a multi-protocol environment. It serves as a common interface to interact with email, ftp, news and gopher and to open files on your systems. Email packages depend on the Simple Mail Transfer Protocol (SMTP) and the Post Office Protocol (POP), the news reader connects through the Network News Transfer Protocol (NNTP), file downloads happen through the File Transfer Protocol (FTP), and interactions with your browser happen through the Hypertext Transfer Protocol (HTTP).

1.2. Performance Issues of Intranets

Web Servers are the primary components of an intranet because they serve as the central node communicating with the clients and web applications. Web Servers and most clients use the HTTP/1.0 protocol. Web transactions transfer a variety of objects such as data, images, audio and video. The main factors that govern the performance of an Intranet are:

  • Web Server protocol
  • Underlying Operating System tuning
  • CPU power of the underlying hardware
  • Performance Tuning of Web Server

The Intranet builds upon multiple layers of independently designed protocols. Most Intranets are built around a single Web Server and scaled as the Web Server load increases. The ever-increasing load, made worse by the mix of audio, video and data, burdens the web servers and the network. Response times for web access grow to the extent that the World Wide Web (WWW) is sometimes referred to as the World Wide Wait.

1.3 How the Web Works: HTTP/1.0 Protocol

HTTP/1.0 is the protocol on which the web works. HTTP uses TCP as its transport layer to transfer information across a distributed client-server environment. On a web server, the httpd daemon listens at a port (port 80 by default) for requests from a client, the web browser. The server responds to the request and closes the connection. HTTP was designed to be stateless, i.e. the server keeps no memory of earlier requests, and under HTTP/1.0 the client opens a new connection for each request.


1.3.1 An HTTP Transaction

In its simplest form, an HTTP transaction is made up of 4 steps:

  1. Connection
  2. Client Request
  3. Server Response
  4. Disconnection

1.3.1.1 Connection

The client (browser) receives a ‘click’ or a URL (Uniform Resource Locator) to fetch for the user. For example:

http://www.somename.com/pub/thefile.html

The browser separates the domain name "www.somename.com" from the file name or URI (Uniform Resource Identifier) "/pub/thefile.html". It passes the domain name to the local Domain Name System (DNS) server and gets it translated to an IP address.
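
The separation step can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular browser is written; the helper name split_url is an assumption of this sketch. The DNS lookup itself is left out, since it needs a live name server (socket.gethostbyname would perform it).

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return (host, port, path) for an http URL, defaulting to port 80."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80      # HTTP default port when the URL names none
    path = parts.path or "/"     # an empty path means the server root
    return host, port, path

host, port, path = split_url("http://www.somename.com/pub/thefile.html")
# host is the name handed to DNS; path is what is later sent to the server
```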

It establishes the connection with the server by opening a socket to the HTTP port, which defaults to 80 unless explicitly overridden. It does this by making the following system calls:

  • socket() to open a network endpoint
  • bind() to bind the endpoint to the local address of the browser’s machine (a client may skip this call, in which case the operating system picks the local port)
  • connect() to specify the remote address and attempt to talk.

Inside the Operating System, connect() builds a TCP segment and hands it to the lower-level IP layer, which wraps it in an IP packet. This then goes to the device driver, which wraps it in an Ethernet frame. The frame is released onto the intranet to find the server.
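
The same three calls are available from Python's socket module, so the connection step can be sketched as follows. The function name open_connection and the timeout value are assumptions of this sketch; the wrapping into IP packets and Ethernet frames happens inside the operating system and is not visible here.

```python
import socket

def open_connection(host, port=80, timeout=5.0):
    """Open a TCP connection the way a browser does before sending a request."""
    # socket(): create a TCP (stream) endpoint over IPv4.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    # bind(): port 0 asks the operating system to choose any free local port.
    sock.bind(("", 0))
    # connect(): name the remote address and perform the TCP handshake.
    sock.connect((host, port))
    return sock
```

A caller would use it as open_connection("www.somename.com"), then send the request over the returned socket and close it when the response has arrived.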

In a simple intranet with one Web Server, the DNS maps the server name to its IP address and the packet reaches it directly. In an intranet with multiple servers, the packet may hop across intermediate machines such as proxies, gateways and firewalls before it reaches the Web Server host.

The Operating System of the host machine receives and decodes the packet. If the Web Server is running, it has declared itself to be the contact point for all requests arriving at its port (default port 80), and it receives the packet from the Operating System. If the Server is not up, the Operating System calls a daemon called "inetd", which finds that port 80 belongs to the Web Server, starts it and gives it the packet.

Now, the client has established a connection with the Server.

1.3.1.2 Client Request

Once a connection is established with the Server, the client is ready to send a request for the file that the user requested as part of the URL. The client uses a set of keywords called "Request Method" to communicate with the Server.

The request methods included in HTTP/1.0 are:

  • GET - Used to retrieve documents identified by the URL
  • HEAD - Retrieves only the header information about the document, without the document itself.
  • POST - Allows the client to send data to the server for inclusion as a subordinate resource to the one referred to in the URL.
  • DELETE - Allows deletion of files/objects on the server; rarely used.
  • PUT - Allows creation of new resources on the server; rarely used.

The client may issue a simple request with the request method to get the relevant file information. For example: GET /pub/thefile.html HTTP/1.0 gets the file "thefile.html" from the directory "pub" in the server’s host machine.

The browser (client) may issue a full request with some optional header information and direct the server to take specific actions, such as returning additional header information about a file. For example, a search engine robot collecting web page information may query web servers to check the last modified date of a file without actually requesting the file.
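
As a sketch of what goes over the wire, the two kinds of request can be built as plain strings. The helper names and the robot's User-Agent value are hypothetical; the request line follows the GET example given above.

```python
def simple_request(path):
    """Request line only, as in the GET example above."""
    return f"GET {path} HTTP/1.0\r\n\r\n"

def full_request(method, path, headers):
    """Request line, then optional header lines, then a blank line."""
    lines = [f"{method} {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"

# A robot that only wants header information (such as the last modified
# date) can use HEAD, so the file itself is never transferred.
robot_check = full_request("HEAD", "/pub/thefile.html",
                           {"User-Agent": "example-robot/0.1"})
```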


1.3.1.3 Server Response

The Server responds to each client request with a server response. It responds with a simple response for a simple request or a full response for a full request. A simple response contains only the object requested by the client as a stream of bytes, e.g. the content of the requested web page. A full response contains a status line that includes the status code (sometimes called the error code), optional headers and optional data.
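
The layout of a full response can be illustrated by a small parser. It is a minimal sketch that assumes a well-formed response held entirely in memory; the sample response and the function name are made up for the illustration.

```python
def parse_full_response(raw):
    """Split a full response into (status_code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")      # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = lines[0].split(" ", 2)  # e.g. HTTP/1.0 200 OK
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body

sample = (b"HTTP/1.0 200 OK\r\n"
          b"Content-Type: text/html\r\n"
          b"Content-Length: 18\r\n"
          b"\r\n"
          b"<html>hello</html>")
```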

1.3.1.4 Disconnection

The server closes the connection when the transfer is complete, or the client closes it when the user presses the Stop, Back or Forward button of the browser.

1.3.2 A browser "Click"

A browser 'click' can be just one transaction if the page loaded is the simplest form of a web page with no images. If there are inline images, there is a transaction with the server for each image. The browser requests each transaction separately and sequentially. For each transaction it establishes a new connection, i.e. handshakes are done many times, consuming network bandwidth and real time. The total elapsed time to bring up the complete page can therefore be the sum of the times of several HTTP transactions.
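
A rough model makes the cost concrete. The sketch below assumes, for simplicity, that every object costs the same connection time and the same transfer time and that transactions run strictly one after another; the function name and the sample figures are illustrative only.

```python
def page_load_time(num_inline_images, connect_time, transfer_time):
    """Elapsed seconds for one click when every object needs its own transaction.

    connect_time covers connection setup and teardown; transfer_time covers
    the request, the server's work and the response.
    """
    transactions = 1 + num_inline_images   # the HTML page plus each image
    return transactions * (connect_time + transfer_time)

plain_page = page_load_time(0, connect_time=0.1, transfer_time=0.2)
page_with_four_images = page_load_time(4, connect_time=0.1, transfer_time=0.2)
```

With these sample figures a page with four inline images takes five times as long as the plain page, because it needs five transactions instead of one.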

1.4 Web Performance Metrics

Web Server Performance is measured by the following metrics:

Web Response Time (RT) of an HTTP Transaction

The time from the beginning of transmission of the click to the availability of the last byte of the desired page on the browser (client) machine is called the Latency or Response Time. The following figure illustrates this:

Figure 1: Response Time of an HTTP Transaction. The stages shown, in order: Locate Server, Establish Connection, Transmit Request, Server Locates Data, Server Transmits Data, Close Connection.

Other metrics are:

  • The total throughput of the web server, or Requests per second (RPS), is the number of connections or transactions a server serves per second for one or more clients.
  • Error rate is the ratio of errors to transactions served within a certain time frame. It measures how robust the web server is in handling errors.
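
Both metrics reduce to simple ratios over a measurement interval. The sketch below assumes the counts have already been collected, for instance from the server log; the function names and sample counts are illustrative.

```python
def requests_per_second(transactions_served, interval_seconds):
    """Throughput over the measurement interval."""
    return transactions_served / interval_seconds

def error_rate(error_count, transactions_served):
    """Fraction of transactions that ended in an error."""
    if transactions_served == 0:
        return 0.0                      # nothing served, nothing to fail
    return error_count / transactions_served

rps = requests_per_second(1200, 60)     # 1200 transactions in one minute
errors = error_rate(30, 1200)           # 30 of them failed
```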

2. Performance Problems

2.1 HTTP and TCP Interactions

2.1.1 Separate TCP Connection for each request

Each click can cause many transactions with the server. For example, if a document has inline (or embedded) images, the browser initiates a new request with the Web Server for each image, and each request involves a connection, request, response and disconnection. This multiplies the number of RTs (Response Times) for a document and makes the response very slow. It is particularly painful because the time to establish a connection is the largest component of the response time.

2.1.2 TCP Slow Start

TCP uses a ‘sliding window’ protocol to ensure that the data segments are delivered reliably. To avoid congestion, it uses a mechanism called slow start. [Ref 10] In this mechanism, for every new connection, the sender sends one segment and waits for its acknowledgment, i.e. window size = 1 segment. It then increases its sliding window size as acknowledgments arrive. As the typical maximum segment size (MSS) is 536 bytes, any web page longer than 536 bytes suffers additional delay due to slow start.
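
The delay can be estimated with an idealized model. The sketch assumes the window starts at one segment and doubles every round trip (each acknowledgment adds one segment), and it ignores packet loss, delayed acknowledgments and the receiver's window, so real connections will differ.

```python
import math

def slow_start_round_trips(size_bytes, mss=536):
    """Round trips needed to deliver size_bytes on a brand-new connection."""
    segments_left = math.ceil(size_bytes / mss)
    window = 1                      # slow start begins with one segment
    round_trips = 0
    while segments_left > 0:
        segments_left -= window     # send a full window, wait for its ACKs
        window *= 2                 # one extra segment per ACK received
        round_trips += 1
    return round_trips
```

Under these assumptions a page that fits in one segment needs a single round trip, while a 10,000 byte page (19 segments) needs five, since the windows sent are 1, 2, 4, 8 and 16 segments.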

2.1.3 TCP Time Wait

When a server closes a TCP connection it keeps information about that connection for a period of time (generally for 240 seconds) in case a delayed packet turns up. [Ref 7] On a busy server, this can add up to thousands of control blocks increasing the load on the server.
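
The size of this backlog follows from the closing rate. As a back-of-the-envelope sketch, assuming a steady rate of closed connections and the 240 second period quoted above, the number of control blocks held at any moment is the product of the two; the sample rate of 50 connections per second is made up.

```python
def time_wait_blocks(connections_closed_per_second, time_wait_seconds=240):
    """Control blocks held in the time-wait state at a steady closing rate."""
    return connections_closed_per_second * time_wait_seconds

busy_server = time_wait_blocks(50)   # 50 closes per second -> 12000 blocks
```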

2.1.4 Multiple IP entries in DNS

HTTP/1.0 in effect requires that each web site use a separate IP address, because an HTTP/1.0 request does not carry the name of the site it is addressed to. This creates a problem when a single machine is used to host many web sites, each associated with a separate domain name. Each domain name translates to a different IP address on the same machine, so a single host now has multiple IP addresses. This wastes the IP addresses available on the Internet. In an intranet setting, a company may spend more on IP addresses and face the associated difficulty of managing them.

2.2 Simplicity of HTTP/1.0 Protocol

2.2.1 Lack of data compression

HTTP/1.0 does not support compression of web pages transported. When web pages are longer, the file size can be huge and cause higher network traffic.

2.2.2 Unoptimized use of server cache

HTTP/1.0 is cache enabled but it does not specify how the server or other applications can extend the cache. As a result, web application developers cannot control the placement of information in the cache and use the cache optimally. For example, the results of searches in a web site could be maintained in cache and re-used when other users request the same search, reducing repeated searches and thereby saving time.
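
The search example can be sketched at the application level. This is an illustration of the idea only, not a feature of any HTTP version: the class, the stand-in search function and the page list are all hypothetical, and a real cache would also need a policy for discarding stale results.

```python
class SearchCache:
    """Remember search results so that repeated queries skip the search."""

    def __init__(self, search_fn):
        self._search_fn = search_fn
        self._results = {}
        self.misses = 0                  # how often the real search ran

    def search(self, query):
        key = query.strip().lower()      # treat "Sales" and "sales " alike
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._search_fn(key)
        return self._results[key]

def site_search(query):
    """Stand-in for an expensive search over the site's pages."""
    pages = {"/hr/benefits.html": "employee benefits",
             "/sales/q1.html": "sales report"}
    return [url for url, text in pages.items() if query in text]
```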

2.2.3 Broken requests

HTTP/1.0 is stateless. It establishes a connection for each HTTP transaction. If a request is broken half-way during a data transfer, when the connection is re-established, HTTP starts all over again and causes multiple transfers. It would improve response time if it could start from where it left off by remembering the state.


2.3 Content Complexity

In addition to the problems caused by a large number of pages or by large pages themselves, the performance of an intranet can be affected by how the server retrieves directory listings and complex web pages that contain graphics, CGI scripts and META tags.

2.3.1 Complex pages with inline graphics

Inline graphics cause multiple HTTP transactions as discussed earlier. So, if a web page contains many inline graphics, it will take a long time to transfer and create the impression that the server has stalled. If one of the images is very large, it can stall the rest of the web page from loading.

2.3.2 Complex pages with CGI scripts

HTML (Hypertext Markup Language), used in the development of web pages, is limited in its capabilities. It can only display static content on the web. Web Servers offer a limited capability called CGI (Common Gateway Interface) that allows the Web Server to create a process to execute any executable or gateway extension and display the results on the client browser. For example, a web page can display the results of a dynamic database query using a CGI script. CGI programs can be very slow, since each one runs in a newly created process, and they keep the server busy while they execute.

2.3.3 Content Negotiation

Content Negotiation is the process by which the server searches the disk to get the correct file from the URL requested by the client. Because of the flexibility offered by HTTP/HTML, this process can be very complex.

For example,

  • HTTP allows web site authors to put multiple versions of the same information under a single URL. The server has to decode the URL and select one file to send.
  • HTTP allows the client to send some tags called "META tags" to direct the server to take special actions. e.g. jump to a new location and display the new file from that location.

The algorithm a server uses to negotiate this content and get to the data directly affects the server response time. Currently, different servers, all following HTTP/1.0, implement content negotiation differently, so web servers and HTML code are not truly portable. This has a profound impact on an intranet when you move web servers from one system to another, or scale them, and attempt to run the same web applications on them.

2.4 Server Inefficiencies

2.4.1 Too many processes

Many Web Servers use processes and not threads. The Web Server (httpd process) forks an httpd process for each transaction it is processing. With HTTP/1.0 being stateless and causing many transactions per click, many httpd processes can be forked. This is a performance overhead, as time is spent requesting a process from the Operating System for each transaction. These processes may also soon fill up the process table of the server host.

2.4.2 Inefficiencies in server algorithms

Web servers are differentiated by the algorithm they use for handling content and searching data requested by the client. The efficiency of these algorithms eventually affects the response time of a user click.


3. Proposed Solutions

A number of solutions are being proposed to address these problems. These are in different stages of implementation. Below we provide a brief overview of the proposed solutions for the problems related to HTTP and TCP Interactions, Simplicity of HTTP/1.0 protocol and Content Complexity.

3.1 HTTP/1.1

W3C (note 1) and IETF (note 2) have proposed the following for the next version of the HTTP protocol, HTTP/1.1. They have looked at many proposals and are considering some of them, many of which are discussed below. [Ref 8]

3.1.1 Persistent-connections

Persistent HTTP, commonly known as P-HTTP [Ref 9], proposes keeping a TCP connection open after the first HTTP request from the client to the server. This removes the need for many new connections to be established between the same server and client. It can minimize the TCP Slow Start problem if there are many inline images in an HTML page. However, it can cause an overhead if subsequent requests arrive with a significant delay between them. In such a case, the forked server process or thread is unavailable for other requests while it waits on a single client. Re-using a single connection can also add to the application-layer complexity in the case of large file transfers.

A version of persistent TCP connections called "Keep-Alive" extensions is available in current Web Servers. They are discussed later in this paper.

3.1.2 Pipelining of Requests to HTTP Server

Even with persistent connections there is a network round trip for each inline image in a web page. The client interacts with the server in a stop-and-wait fashion: it sends a request for an inline image only after having received the data for the previous one. Performance can improve if the client requests all files (or a large number of image files) at the same time and the server sends them together. This batching of requests is called pipelining and can be implemented in one of the following ways [Ref 2].

a. The GETALL method

This method allows a client to request a document and specify the component inline images in the content-length fields and the server will respond by sending the document and content of all inline images in that document in a single connection. This method is not fully effective as it does not use any previously cached images from the client cache.

b. The GETLIST method

This method allows a client to request a set of documents or images from a server. The client can request only those images that are not currently in its cache and still get all images in a single connection. The requests are handled in a batch in a single connection and the latency is reduced.
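
The client-side bookkeeping behind GETLIST can be sketched without committing to any wire format, which this paper does not describe. The function names are hypothetical, and the round-trip count is a simplification that charges one round trip per request or per batch.

```python
def images_to_request(inline_images, cached):
    """Only the images that are not already in the client cache."""
    return [url for url in inline_images if url not in cached]

def round_trips(inline_images, cached, batched):
    """Round trips for the images: one per image, or one for the whole batch."""
    needed = images_to_request(inline_images, cached)
    if not needed:
        return 0
    return 1 if batched else len(needed)

page_images = ["/img/logo.gif", "/img/chart.gif", "/img/photo.gif"]
client_cache = {"/img/logo.gif"}
```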

3.1.3 Data Compression

Large files may be transmitted by the server to the client when the client requests a file, or by the client to the server when the client issues a PUT to place a file on the server. If the client and server compress and decompress the data, they can reduce the file size, the transmission time and the storage space required for the file. It is being debated whether the compression should occur at the server side or the client side. There are many proposed algorithms for HTML compression across a distributed client-server setup.
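
The effect on file size is easy to observe. The sketch below uses Python's zlib module as a stand-in, since this paper does not name a specific algorithm; the sample page is artificial and highly repetitive, so real pages will compress less.

```python
import zlib

def compress_page(html_bytes):
    """Compress a page before it is sent."""
    return zlib.compress(html_bytes)

def decompress_page(data):
    """Restore the original page on the receiving side."""
    return zlib.decompress(data)

page = (b"<html><body>"
        + b"<p>Quarterly sales figures for the intranet.</p>" * 200
        + b"</body></html>")
packed = compress_page(page)
saving = 1 - len(packed) / len(page)   # fraction of bytes not transmitted
```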

3.1.4 Caching Extensions

HTTP/1.1 proposed cache extensions to allow marking of certain areas as cache. This will enable optimal usage of cache by web applications and result in better response times.

 

 


3.1.5 Range Validation

A browser needs to know whether a document has changed. Range requests let the client ask for only a specified range of the data. They may be very useful for retrieving the remainder of a cached image after a communications failure or a user-interrupted transfer, avoiding retransmission of data already successfully transferred. They may also be used to avoid excessive serialization of requests behind a large transfer.
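
Resuming an interrupted transfer can be sketched as follows. The header uses the byte-range form "Range: bytes=first-last" defined for HTTP/1.1; the helper names are assumptions, and the parser handles only the two simple forms shown, not every range syntax the protocol allows.

```python
def resume_header(bytes_already_received):
    """Ask for everything from the first missing byte to the end."""
    return f"Range: bytes={bytes_already_received}-"

def serve_range(document, header):
    """Return the slice of document named by a simple Range header."""
    spec = header.split("=", 1)[1]
    first, _, last = spec.partition("-")
    start = int(first)
    end = int(last) + 1 if last else len(document)   # last byte is inclusive
    return document[start:end]

document = b"0123456789"
received = document[:4]                  # transfer broke after 4 bytes
rest = serve_range(document, resume_header(len(received)))
```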

3.1.6 Multiple IP Support for DNS Entry

Domain Name Servers (DNS) contain a list of IP addresses and map them to domain names. The new proposal includes server support for a single DNS entry and a single IP address shared by multiple web sites. This allows multiple domain names to reside on a single machine and lets the DNS administrator maintain a single entry in the host table.
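
HTTP/1.1 makes this possible with a Host request header that names the site being addressed (RFC 2068 [Ref 14]). The sketch below shows how a server might pick a document root from that header; the site names, directory paths and function name are hypothetical.

```python
def select_site(request_text, sites, default_root):
    """Pick the document root for a request from its Host header."""
    for line in request_text.split("\r\n")[1:]:    # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower().split(":")[0]   # drop any port
            return sites.get(host, default_root)
    return default_root                            # no Host header present

SITES = {"sales.somename.com": "/web/sales", "hr.somename.com": "/web/hr"}
request = "GET /index.html HTTP/1.1\r\nHost: hr.somename.com\r\n\r\n"
```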

3.1.7 Transparent Content Negotiation

Transparent content negotiation is a mechanism, layered on top of HTTP, for automatically selecting the best version when the URL is accessed. This enables the smooth deployment of new web data formats and markup (html) tags.

3.1.8 Remote Variant Selection

A remote variant selection algorithm is proposed to speed up the transparent negotiation process. [Ref 16]

3.2 HTTP-NG from PARC

PARC HTTP-NG Project [Ref 17] from Xerox Corporation set out to redesign HTTP. They are now interested in developing a binary distributed object protocol, for use with the Web, which is

  1. optimized for Internet use;
  2. at least as efficient as HTTP 1.1 for World Wide Web use; and
  3. able to provide direct support for remote service invocation models such as Distributed Component Object Model (DCOM) Binary Protocol [Ref 19] and CORBA (Common Object Request Broker Architecture) [Ref 20].

3.3 Protocol Extension Protocol (PEP)

The Protocol Extension Protocol (PEP) [Ref 15] is an extension mechanism to accommodate extension of HTTP clients and servers by software components. PEP allows applications to employ extensions dynamically by providing a mechanism for mapping the global definition of an extension to its local representation in a particular transaction.

 


4. What can you do today?

4.1 Set up web servers correctly

4.1.1 Size for load

Choose the right machine to host your server. Look at SPECrate and SPECweb benchmarks. Choose the right amount of memory and CPU power depending on the planned load on the server. For example, a server expected to do many CGI calls should have a faster CPU.

4.1.2 Stand alone server process

A standalone server starts once when it is invoked and stops when the system shuts down. Web servers should not be invoked by inetd like other daemons, because starting the server for each client request greatly increases latency.

4.2 Design Web Application and Servers correctly

4.2.1 Use API extensions

CGI scripts are used to execute commands and show the output dynamically as web pages. CGIs cause extra load on the Operating System from the communication between CGI scripts and the Web Server, and they increase the number of external processes required for a request. Server API extensions are available to integrate new functionality with the server by programming, e.g. NSAPI in Netscape servers and ISAPI in Microsoft Internet Information Server. It is advisable to use API extensions instead of CGIs, as this can be faster and reduces the number of external processes required for a request. However, API extensions require development time and effort.

4.2.2 Use new technologies to match the data being handled

a. Use ODBMS for Data Objects

ODBMS (Object Database Management Systems), which store data as objects and not just as rows and columns, are evolving to integrate with object-oriented development languages like Java and C++ and visual languages like Visual Basic and Delphi. Support for extended relationships yields exceptional performance for distributed applications that combine many associated data types, such as the components of a dynamically generated Web page. A distributed cache provides local, in-memory data access to the various services and components that make up the application, and both data and processing are distributed across each application tier. This cuts down on repetitive and unnecessary network traffic, removes database server bottlenecks, and allows scaling without adding expensive hardware.

b. Use RAID disks

The server response involves searching and accessing of data from disks by the server. Mirroring the data can speed up data access.

4.3 Partition Load (Modular design)

Web Servers need more processing power to handle more load. As the load increases, they should be scaled out to more web servers, as this is more economical than moving to a faster machine with more processing power. When multiple servers are set up, a user request can be routed to the right server

  • by using intermediate proxy servers that cache repeated requests, or
  • in a round robin fashion, with the DNS routing the requests to servers, or
  • manually, by asking the user to choose an appropriate server.

Such a modular design of multiple servers can be achieved in many ways such as the following.


4.3.1 By replicating data

The same data can be mirrored in two or more servers and the load can be spread among them. Such a setup is called a Server Farm. This requires management of consistency among the mirrors which can be done by periodic syncs. Server Farms may not handle loads in a performance effective manner when each server is assigned equal load because a server may be overloaded at a time when the other is free. If requests are assigned equally in round robin fashion, a loaded server may simply lose a request.
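
The weakness of equal assignment can be seen in a small sketch. Round robin hands out servers in turn without looking at their load; the least-loaded selection shown next to it is offered here only as a contrasting illustration and is not a method described in this paper. The server names and load figures are made up.

```python
import itertools

def round_robin(servers):
    """Hand out servers in turn, regardless of how busy each one is."""
    return itertools.cycle(servers)

def least_loaded(active_requests):
    """Pick the server that currently has the fewest requests in progress."""
    return min(active_requests, key=active_requests.get)

rotation = round_robin(["web1", "web2"])
load = {"web1": 40, "web2": 3}     # web1 is busy with long-running requests
```

With the sample load above, round robin still sends every second request to the overloaded web1, while a load-aware choice would send new requests to web2.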

4.3.2 By partitioning data

The data can be manually partitioned among multiple servers depending on several factors such as logical affinity of data and their proximity to clients. For example, sales data may be located closer to Sales department. In geographically separate client (browser) locations, a manual page can guide the user to choose a server closer to them.

4.3.3 By partitioning tasks

Web Servers can be assigned different tasks during the design of the intranet. Some servers can handle simple page requests. Servers hosted on powerful CPU systems can be assigned as CGI Servers and set up to handle only CGI requests. Systems hosting databases can handle database queries in APIs or CGIs. Such a design handles requests more efficiently. It also helps in capacity planning, as we can predict the number of each type of request that can occur in a certain time frame.

4.4 Choose a Web Server that supports HTTP/1.1 improvements

Two features are available for you to use today:

4.4.1 Keep Alive Extensions

Recent Web Servers such as Apache, Netscape and Jigsaw provide a configurable option to set the server to operate in a mode called "Keep-Alive" that will retain the TCP connections thereby reducing the TCP bottlenecks of one connection per request.

  • Keep-Alive is effective only if the browser supports it. Many browsers such as Netscape Navigator 2.0 or later and Microsoft Internet Explorer 2.0 or later support Keep-Alive. Older browsers do not support Keep-Alive and occasionally hang on a connect.
  • Keep-Alive support is active only where the length of the file is known. So, most CGI scripts, directory listings and server-side includes will not use Keep-Alive even if the server and browser support it. They will cause a large latency but will not fail the connection.
  • Keep-Alive fails in an intranet if more than one Proxy Server is used.
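
Connection reuse can be observed with a small local experiment. The sketch uses the persistent connections of Python's standard http.server and http.client modules (the HTTP/1.1 form of the idea) rather than the original Keep-Alive extension; the handler and function names are assumptions. The server sends a Content-Length header, because the connection can stay open only when the length of the response is known.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"    # lets the server keep connections open
    client_ports = []                # client source port seen for each request

    def do_GET(self):
        body = b"<html>hello</html>"
        KeepAliveHandler.client_ports.append(self.client_address[1])
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # length is known
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass                         # keep the example quiet

def fetch_twice_on_one_connection():
    """Fetch two objects from a local test server over one TCP connection."""
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    bodies = []
    for path in ("/page.html", "/logo.gif"):
        conn.request("GET", path)
        bodies.append(conn.getresponse().read())   # read fully before reuse
    conn.close()
    server.shutdown()
    server.server_close()
    return bodies
```

After a call to fetch_twice_on_one_connection(), the two entries in KeepAliveHandler.client_ports are the same, which shows that both requests travelled over one connection.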

4.4.2 Multi-threaded or Pooled Process Web Server

Most recent web servers can be configured to use a pool of processes, as the NFS daemon or Oracle Shared Server do. Multiple instances of the server process httpd then run continuously. When the client requests a connection, one process listens at port 80 and quickly passes the request to another server process, eliminating the need to fork a new process for each request. Similarly, many recent servers support multi-threading. A pool of threads can be assigned to the server so that it does not have to request a thread for each transaction, thereby improving performance.
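
The pooling idea can be sketched with a fixed set of worker threads that are created once and reused for every request. The request handler below is a stand-in that does no real work, and the names and the pool size of four are assumptions of this sketch.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def handle_request(path):
    """Stand-in for serving one request; reports which worker handled it."""
    return threading.current_thread().name, f"200 OK {path}"

def serve_with_pool(paths, pool_size=4):
    """Serve all requests with a fixed pool instead of one thread per request."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(handle_request, paths))

results = serve_with_pool([f"/page{i}.html" for i in range(20)])
workers_used = {worker for worker, _ in results}   # at most pool_size names
```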

4.5 Establish a proactive program of performance management

Look at Web Server logs and watch for increasing error rates. Use commercial tools e.g. BEST/1, HP, BMC and system utilities to monitor your computers and networks for performance problems. [Ref 6]

5. Conclusion

Most Web Servers available today work on HTTP/1.0, which works inefficiently with TCP and has many other limitations that impact the performance of intranet servers and applications. The performance of an Intranet can be improved by developing an understanding of the way the web works and of how the proposed solutions in upcoming new servers will impact your intranet. Selecting the right servers, partitioning the load across a modular design of servers, designing applications for the web and establishing a proactive program of performance management will greatly improve intranet performance.


6. References

[1] "Network Performance Effects of HTTP/1.1, CSS1, and PNG", Working NOTE 14-February-1997 by Jim Gettys, Henrik Frystyk Nielsen, Anselm Baird-Smith, Eric Prud'hommeaux, Håkon Wium Lie and Chris Lilley of W3C.

[2] V.N. Padmanabhan, J. Mogul, "Improving HTTP Latency", Computer Networks and ISDN Systems, v.28, pp. 25-35, Dec. 1995. Slightly revised version of paper in Proc. 2nd International WWW Conference '94: Mosaic and the Web, Oct. 1994

[3] T. Berners-Lee, R. Fielding, H. Frystyk. "Informational RFC 1945 - Hypertext Transfer Protocol -- HTTP/1.0," MIT/LCS, UC Irvine, May 1996

[4] "Transmission Control Protocol," J. Postel, RFC-793, September 1981.

[5] "What happens when you Click" by Neil Randall , PC Magazine, Oct. 22, 1996

[6] Cockcroft, Adrian, "Watching your Web Server: How do I monitor my Web server’s performance and what can I do about it" in SunWorld Online, March , 1996

[7] Analysis of HTTP Performance problems by Simon E Spero, July 1994

[8] Fielding, et al., "Hypertext Transfer Protocol - HTTP/1.1," (working draft), June 7, 1996.

[9] Mogul, J., "The Case for Persistent-Connection HTTP," ACM Sigcomm '95, August 1995, pp. 299-313.



[10] Internetworking with TCP/IP by Douglas Comer.

[11] The Effect of HTML Compression on a PPP Modem Line by Henrik Frystyk Nielsen, IETF HTTP working group., April 1997

[12] The Effect of HTML Compression on a LAN by Henrik Frystyk Nielsen, IETF HTTP working group., April 1997

[13] A Note on Distributed Computing. Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall TR-94-29 (November 1994)

[14] RFC 2068 - HTTP/1.1 Proposal by Network Working Group, R. Fielding Request for Comments: 2068 UC Irvine Category: Standards Track J. Gettys J. Mogul, H. Frystyk T. Berners-Lee, Jan 1997

[15] PEP - an Extension Mechanism for HTTP by H. Frystyk Internet Draft W3C April 1997

[16] HTTP Remote Variant Selection Algorithm -- RVSA/1.0 by Koen Holtman, TUE Internet-Draft Andrew Mutz, Hewlett-Packard, March 23, 1997

[17] PARC HTTP-NG Project

[18] "Analysis of HTTP Performance" by Joe Touch, John Heidemann, and Katia Obraczka of USC/Information Sciences Institute, August 16, 1996

[19] Distributed Component Object Model (DCOM) Binary Protocol, May 1996

[20] CORBA (ftp.omg.org.pub/docs), April 1996

1 W3C is the World Wide Web Consortium, a body that researches the future direction of the web.

2 IETF is the Internet Engineering Task Force, which works with W3C to set Internet Protocol Standards.