Info Guide NET002: NetWare Server Tuning and Optimisation

This document was written to give an overview of the factors that affect the performance of a server, and the changes that can be made in both hardware and software to optimise the performance of the server. It is NOT intended to be an in-depth guide, and does not cover all the factors that may relate to performance-tuning a server. It is intended that this document be used as a guide, with reference being made to the relevant product manuals for a more in-depth description of the options available. Also worth reading are a number of Novell Application Notes that have been issued on server tuning and performance optimisation.

The document is split into three sections, namely the server hardware, operating system tuning parameters, and the application software in use. Not covered are aspects such as the workstation hardware, the topology used and the network protocols used, although these will also play an important part in the performance of the system.

1.1. CPU
2. Operating System
2.1. File Cache Parameters
3. User Software
3.1 Introduction
The first stage of server performance tuning and optimisation is to identify potential bottlenecks in the hardware of the server in question. These can be the CPU, the memory, the network adapter, and/or the disk sub-system.

During the following discussions, the basic file server performance graph below will be used to illustrate the various aspects of server tuning. This graph assumes that, for a given configuration, the server hardware is optimised. It shows how, as server usage increases, the number of transactions per second increases until a threshold is reached where the server disk cache hits are at their maximum and the throughput on the network adapter is also at its maximum. Once this threshold is passed, transactions per second start to decrease, as the disk cache hit rate falls and more and more data has to be pulled directly from the disk sub-system.
1.1. CPU

A general statement made by Novell regarding CPUs under NetWare is that the faster the CPU the better, especially when the server is acting as a database server, where a 486/66 or greater is recommended. Having said that, not all server operations are CPU intensive, and there are a number of steps that can be taken to reduce CPU loading, thereby reducing the need for faster processors in the server. The most important of these steps is the use of bus-mastering or parallel tasking adapters for the disk and network interfaces.

The CPU must execute instructions for each user I/O request. Insufficient processor speed degrades the server's I/O performance. Increasing processor speed beyond what is required will have little impact on response time and performance, but it does improve the capacity to add extra users. Also, as the I/O performance of the network and disk sub-systems improves, CPU load increases, requiring either faster CPUs or some means of reducing the workload of the CPU.
Bus-mastering cards reduce the CPU loading on servers dramatically. Tests by IBM show that, for a given operation, a typical CPU load can be reduced to less than 20% (compared with normal non-bus-mastering adapters) by implementing bus-mastering technology in servers. The effect of adding bus-mastering cards is typically increased user capacity, as there is less CPU load involved in disk and network I/O. On a server that is already short of CPU resources, there should also be a noticeable increase in server performance, as the processor spends fewer cycles executing I/O instructions.

Parallel tasking LAN cards, such as the 3C5x9 EtherLink III range from 3Com, reduce the CPU loading by performing separate tasks in parallel to transfer data with greater efficiency. Tests carried out by LANQuest Labs in an IBM PS/2 model 95 show that for most typical workloads, the parallel tasking 3C529 will out-perform the single tasking 3C527 (a bus-mastering card) by up to 52%, only falling behind when the network is at near saturation (when the 3C527 is 10% faster).

In general, most currently installed 486 based servers have ample CPU performance; typical bottlenecks are in the network and disk sub-systems.

1.2. MEMORY

Adding memory to a NetWare server improves the disk cache hit rate, temporarily improving the sustained throughput rate of the server. The performance improvements will however degrade as additional users are added. Also, the benefits of extra memory are dependent on other server hardware factors. A higher disk cache hit rate implies an increase in network utilisation, so a slow network adapter will minimise the impact of adding memory. A higher disk cache hit rate also implies higher CPU utilisation, so poor CPU performance can likewise reduce the impact of adding extra memory.
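
The relationship between cache hit rate and average request time can be illustrated with a simple weighted-average model. This is only a sketch: the function name and the timing figures (0.1 ms for a cache hit, 20 ms for a disk read) are illustrative assumptions, not measurements from a NetWare server.

```python
def average_service_time_ms(hit_rate, cache_ms, disk_ms):
    """Weighted average of cache and disk service times.

    hit_rate is the fraction of requests served from the disk cache (0.0 to 1.0).
    cache_ms and disk_ms are assumed, illustrative service times.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0.0 and 1.0")
    return hit_rate * cache_ms + (1.0 - hit_rate) * disk_ms


# Hypothetical figures: 0.1 ms per cache hit, 20 ms per disk read.
for hit_rate in (0.50, 0.80, 0.95):
    t = average_service_time_ms(hit_rate, cache_ms=0.1, disk_ms=20.0)
    print(f"hit rate {hit_rate:.0%}: {t:.2f} ms per request")
```

Under these assumptions, a higher hit rate lowers the average time per request. Requests that complete sooner reach the network adapter and CPU sooner, which is why a slow adapter or CPU can limit the benefit of extra memory.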
1.3. NETWORK ADAPTER

Adding faster network adapters will affect server performance by improving the maximum peak transaction rate. Under heavy loads, however, the improvement will be slight, as the server will be disk-bound. Also, as network adapter performance improves, increased CPU time is required to service the additional requests, so an increase in server CPU performance is required. Using bus-mastering or parallel tasking network adapters will reduce the CPU impact while still preserving the performance benefits of faster cards.
1.4. DISK SUB-SYSTEM

Adding a faster disk sub-system has the greatest impact on a server under heavy loads, as it improves the minimum sustained transfer rate. It will however have minimal performance impact under light server loads, as most requests are serviced directly from the disk cache (network transfer times form a relatively large component of the overall performance on a lightly loaded server), and disk transfer times are hidden by the use of the disk cache.

Ways of improving the performance of the disk sub-system include upgrading to physically faster disks, and implementing data striping (in simple terms, spanning a NetWare volume over multiple disks) with multiple controller cards. If a server is currently running with disk mirroring (two sets of disks off one disk controller), upgrading this to a duplexed configuration (two sets of disks and two disk controllers, one set of disks off each controller) will maintain data integrity without loss of performance (as the disk controller does not have to issue two sets of commands for each disk write).

As server disk performance improves, increased network adapter performance and CPU performance are required to support greater disk I/O transaction rates. Using bus-master disk controller cards will reduce the impact of faster controllers on CPU loading.
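
One way to picture the interdependence of the sub-systems discussed above is to treat the server as a chain whose sustainable transaction rate is limited by its slowest stage. The sketch below is a deliberate simplification; the stage names and the capacity figures are hypothetical and chosen only for illustration.

```python
def sustainable_rate(capacities):
    """Return (limiting stage, rate) for a dict of stage -> max transactions/sec.

    Simplifying assumption: every transaction passes through every stage,
    so the slowest stage sets the overall rate.
    """
    stage = min(capacities, key=capacities.get)
    return stage, capacities[stage]


# Hypothetical capacities in transactions per second under heavy load.
server = {"cpu": 900, "network": 600, "disk": 250}
print(sustainable_rate(server))          # disk is the limiting stage

# Upgrading the network adapter alone leaves the disk as the bottleneck.
faster_network = dict(server, network=1200)
print(sustainable_rate(faster_network))

# Upgrading the disk moves the bottleneck to the network adapter.
faster_disk = dict(server, disk=800)
print(sustainable_rate(faster_disk))
```

In this toy model, an upgrade only pays off when it is applied to the stage that is currently limiting the server, and a large enough upgrade simply exposes the next-slowest stage.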
1.5. SUMMARY

Overall, it can be seen that upgrading just one of the sub-systems in a server will not have a marked effect on all aspects of the server's performance. It is only when multiple improvements are made that overall performance gains can be made. However, selective server hardware upgrading can help improve the performance of a poorly performing server, so long as there is careful consideration of where the bottlenecks on the server are, and of the applications that are suffering as a result.

What follows is an explanation of some of the NetWare tuneable SET parameters, and how changing them can affect server performance. The list is NOT complete, but should be used to illustrate how the various parameters can be used to optimise the performance of NetWare for particular environments. For full details, please consult the relevant NetWare manuals.

WARNING.....

2.1. FILE CACHE PARAMETERS

2.1.1. Set Cache Buffer Size

By default, under NetWare 3.x, the block size is 4K and the cache buffer size is also 4K.

2.1.2. Set Dirty Disk Cache Delay Time

2.1.3. Set Maximum Concurrent Disk Cache Writes

2.2. DIRECTORY CACHE PARAMETERS

2.2.1. Set Minimum and Maximum Directory Cache Buffers

2.2.2. Set Maximum Concurrent Directory Cache Writes

2.2.3. Set Dirty Directory Cache Delay Time

2.2.4. Set Directory Cache Buffer Nonreferenced Delay

2.3. BLOCK SIZES

2.3.1. NetWare 3.X

Also, remember that the block size can only be set when the volume is created. It cannot be changed later!

2.3.2. NetWare 4.X

2.4. NAME SPACE CONSIDERATIONS

Adding name space support to a volume creates an additional file directory entry for each additional name space loaded, for every file on that volume. As a result, directory writes on a volume with added name space support take longer than on a volume without it. Added name spaces should therefore only be used on volumes that actually need them. This is without taking into consideration the extra RAM required for the FATs of a volume with added name space support.

2.5. PACKET RECEIVE BUFFERS

2.5.1. Set Maximum and Minimum Packet Receive Buffers

In a NetWare 4.x environment, the server has increased priority for packet routing, so the packet receive buffer requirements are lower than those for an equivalent NetWare 3.x configuration.
Servers routing between fast and slow network topologies (such as between LAN and WAN, or between 16 and 4Mb/s Token Ring) will have increased needs for receive buffers, and so again, the maximum receive buffer allocation may need to be increased. Setting the MINIMUM PACKET RECEIVE BUFFER value higher than default has the effect of eliminating any delay before the operating system allocates packet receive buffers. Setting the MAXIMUM PACKET RECEIVE BUFFER value has the effect of limiting the amount of server memory allocated to the receive buffers.

2.5.2. Set New Packet Receive Buffer Wait Time

2.6. NCP PACKET SIGNATURES

As of the release of NetWare 3.12 (and optionally in NetWare 3.11 if Packet Burst is enabled), NetWare adds a unique signature to each NCP packet sent across the network, in an attempt to make the system more secure. Each network request that carries an NCP packet signature requires an additional 400 instructions to be carried out on the server to sign and validate the packet. In an environment where this extra level of security is not vital, turning off the NCP Packet Signature option can reclaim CPU resources on a busy server. This needs to be set on the server (SET NCP PACKET SIGNATURE OPTION = 0 or 1) and in the NET.CFG file of each workstation using VLMs (SIGNATURE LEVEL = 0).

2.7. SOME SERVER MONITORING TOOLS

NetWare 3.x and above is supplied as standard with MONITOR.NLM. This tool can be used to watch a number of the factors that affect server performance, such as CPU utilisation, the number of service processes, cache buffer values, packet receive buffers, and so on. The main disadvantage of using MONITOR is that the values shown are those at a particular moment in time, rather than historical trends (which are more useful when charting overall server utilisation). To get a better feel for the trends of the various performance statistics, software such as Novell's NetWare Management Services (NMS) can be used.
This allows the trends of multiple servers to be monitored from a central location, and can output the data in a format that can be read directly into spreadsheets. The main disadvantage of this approach is cost.

In conjunction with the above, LANalyzer for Windows, or the NetWare LANalyzer Agents for NMS, can be used to give an indication of the overall network bandwidth utilisation, and can allow a more detailed look at the types of server requests being made across the network, so that application behaviour can be more effectively monitored (to see, for example, whether there are more read than write requests during a large file update).

3.1 INTRODUCTION

The third factor in optimising the performance of a server is understanding how the applications used on the network affect the performance of that network. The impact of software on server performance should not be under-estimated, as poorly designed software can have a major impact on the overall performance of the system.

In real terms, understanding the behaviour of the software in use on the network should be the first step in optimising the performance of a server, as the steps needed to improve performance will vary depending on the workload placed on the server by the applications in use. Adding cache RAM to a server where the primary applications do not take advantage of cache technology will not improve performance dramatically, whereas speeding up the disk sub-system or the LAN interface could offer better overall performance gains.

3.2 SOME EXAMPLES

3.2.1 DOS Copy

The DOS COPY command works by issuing 64KB (128 sector) sequential I/O requests, which does not reflect the way most popular applications operate. Because of the way the COPY command works, high performance figures can easily be manipulated by using read-ahead caching disk controllers. These controllers are however very likely to perform differently when placed in a real application environment.

3.2.2 Lotus 1-2-3

As the majority of requests are NetWare Disk Cache hits, the primary performance bottleneck is getting the data onto the network. Adding higher performance network cards, or splitting the network load between multiple network cards, would therefore offer the best performance gains.

3.2.3 WordPerfect

If a read-ahead cache is in use, a read request from WordPerfect would be SLOWER than on a non-cached system, as for each block read, the cache would proceed to read in the next block of data (which WordPerfect has already read), causing the number of disk reads to increase.
Also, when writing, WordPerfect uses large sequential blocks, and so the effect of having a write-cache would be limited, as the data throughput would be sustained at a high level for the duration of the write. This type of application would benefit most from increasing the throughput of the disk sub-system.

3.2.4 cc:Mail

The overall result of all these different requests for a single transaction is that CPU utilisation has much more of an influence on performance than other aspects. As an example, moving a defined block of data via cc:Mail may put the server CPU utilisation at 100%, whereas an equivalent amount of data under WordPerfect may only cause 20% CPU utilisation. In order to optimise the server performance for this kind of application, improving the server CPU performance (to counter the large number of I/O requests) and network adapter performance (to counter the small network frames) should be considered.

3.3 UPGRADING SOFTWARE

Upgrading software versions can also have a marked effect on the performance of a network. Although the upgraded software may have more functionality, the system performance hit taken as a result of the upgrade may outweigh the benefits gained from the new features.

3.3.1 DOS vs Windows Lotus 1-2-3
In summary, 57% more requests are made through the network to perform a load/save operation under Windows than under DOS. The bulk of the increased overhead is because of the use of DLL and font files under Windows, as the files are only accessed when needed.

3.3.2 Quattro V1.01 to Quattro Pro V4.0
In summary, there are 31 times more requests through the network to perform a load/save operation under version 4.0 compared to version 1.01.

3.3.3 DOS vs Windows WordPerfect
For a given load/save operation, the Windows version of WordPerfect will result in 45% more network requests than the DOS version. Note that the Windows version of WordPerfect also reads files backwards!
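
To put the figures from section 3.3 in context, the following sketch applies them to a baseline request count. The baseline of 100 requests per load/save operation is a hypothetical figure, not a measurement; only the ratios come from the text, and "31 times more" is read here as 31 times as many requests, which is an assumption.

```python
# Ratios of network requests after/before the upgrade, from section 3.3.
REQUEST_RATIO = {
    "Lotus 1-2-3, DOS to Windows": 1.57,        # 57% more requests
    "Quattro V1.01 to Quattro Pro V4.0": 31.0,  # assumed: 31 times as many
    "WordPerfect, DOS to Windows": 1.45,        # 45% more requests
}


def requests_after_upgrade(baseline_requests, ratio):
    """Scale a baseline request count by the after/before ratio."""
    return round(baseline_requests * ratio)


BASELINE = 100  # hypothetical requests per load/save before the upgrade
for upgrade, ratio in REQUEST_RATIO.items():
    print(f"{upgrade}: {BASELINE} -> {requests_after_upgrade(BASELINE, ratio)}")
```

Multiplying the result by the number of users and the number of load/save operations per day gives a rough idea of the extra request load an upgrade places on the server.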