.\" Copyright (c) 2001, Matthew Dillon. Terms and conditions are those of
.\" the BSD Copyright as specified in the file "/usr/src/COPYRIGHT" in
.\" the source tree.
.\"
.\" $FreeBSD$
.\"
.Dd June 25, 2002
.Dt TUNING 7
.Os
.Sh NAME
.Nm tuning
.Nd performance tuning under FreeBSD
.Sh SYSTEM SETUP - DISKLABEL, NEWFS, TUNEFS, SWAP
When using
.Xr disklabel 8
or
.Xr sysinstall 8
to lay out your filesystems on a hard disk it is important to remember
that hard drives can transfer data much more quickly from outer tracks
than they can from inner tracks.
To take advantage of this you should
try to pack your smaller filesystems and swap closer to the outer tracks,
follow with the larger filesystems, and end with the largest filesystems.
It is also important to size system standard filesystems such that you
will not be forced to resize them later as you scale the machine up.
I usually create, in order, a 128M root, 1G swap, 128M
.Pa /var ,
128M
.Pa /var/tmp ,
3G
.Pa /usr ,
and use any remaining space for
.Pa /home .
.Pp
You should typically size your swap space to approximately 2x main memory.
If you do not have a lot of RAM, though, you will generally want a lot
more swap.
It is not recommended that you configure any less than
256M of swap on a system and you should keep in mind future memory
expansion when sizing the swap partition.
The kernel's VM paging algorithms are tuned to perform best when there is
at least 2x swap versus main memory.
Configuring too little swap can lead
to inefficiencies in the VM page scanning code as well as create issues
later on if you add more memory to your machine.
Finally, on larger systems
with multiple SCSI disks (or multiple IDE disks operating on different
controllers), we strongly recommend that you configure swap on each drive
(up to four drives).
The swap partitions on the drives should be approximately the same size.
The kernel can handle arbitrary sizes but
internal data structures scale to 4 times the largest swap partition.
Keeping
the swap partitions near the same size will allow the kernel to optimally
stripe swap space across the N disks.
Do not worry about overdoing it a
little; swap space is the saving grace of
.Ux
and even if you do not normally use much swap, it can give you more time to
recover from a runaway program before being forced to reboot.
.Pp
How you size your
.Pa /var
partition depends heavily on what you intend to use the machine for.
This
partition is primarily used to hold mailboxes, the print spool, and log
files.
Some people even make
.Pa /var/log
its own partition (but except for extreme cases it is not worth the waste
of a partition ID).
If your machine is intended to act as a mail
or print server,
or you are running a heavily visited web server, you should consider
creating a much larger partition \(en perhaps a gig or more.
It is very easy
to underestimate log file storage requirements.
.Pp
Sizing
.Pa /var/tmp
depends on the kind of temporary file usage you think you will need.
128M is
the minimum we recommend.
Also note that sysinstall will create a
.Pa /tmp
directory.
Dedicating a partition for temporary file storage is important for
two reasons: first, it reduces the possibility of filesystem corruption
in a crash, and second, it reduces the chance of a runaway process that
fills up
.Oo Pa /var Oc Ns Pa /tmp
from blowing up more critical subsystems (mail,
logging, etc).
Filling up
.Oo Pa /var Oc Ns Pa /tmp
is a very common problem to have.
.Pp
In the old days there were differences between
.Pa /tmp
and
.Pa /var/tmp ,
but the introduction of
.Pa /var
(and
.Pa /var/tmp )
led to massive confusion
by program writers so today programs haphazardly use one or the
other and thus no real distinction can be made between the two.
So it makes sense to have just one temporary directory.
However you handle
.Pa /tmp ,
the one thing you do not want to do is leave it sitting
on the root partition where it might cause root to fill up or possibly
corrupt root in a crash/reboot situation.
.Pp
The
.Pa /usr
partition holds the bulk of the files required to support the system and
a subdirectory within it called
.Pa /usr/local
holds the bulk of the files installed from the
.Xr ports 7
hierarchy.
If you do not use ports all that much and do not intend to keep
system source
.Pq Pa /usr/src
on the machine, you can get away with
a 1 gigabyte
.Pa /usr
partition.
However, if you install a lot of ports
(especially window managers and Linux-emulated binaries), we recommend
at least a 2 gigabyte
.Pa /usr
and if you also intend to keep system source
on the machine, we recommend a 3 gigabyte
.Pa /usr .
Do not underestimate the
amount of space you will need in this partition; it can creep up and
surprise you!
.Pp
The
.Pa /home
partition is typically used to hold user-specific data.
I usually size it to the remainder of the disk.
.Pp
Why partition at all?
Why not create one big
.Pa /
partition and be done with it?
Then I do not have to worry about undersizing things!
Well, there are several reasons this is not a good idea.
First,
each partition has different operational characteristics and separating them
allows the filesystem to tune itself to those characteristics.
For example,
the root and
.Pa /usr
partitions are read-mostly, with very little writing, while
a lot of reading and writing could occur in
.Pa /var
and
.Pa /var/tmp .
By properly
partitioning your system, fragmentation introduced in the smaller, more
heavily write-loaded partitions will not bleed over into the mostly-read
partitions.
Additionally, keeping the write-loaded partitions closer to
the edge of the disk (i.e., before the really big partitions instead of after
in the partition table) will increase I/O performance in the partitions
where you need it the most.
Now it is true that you might also need I/O
performance in the larger partitions, but they are so large that shifting
them more towards the edge of the disk will not lead to a significant
performance improvement whereas moving
.Pa /var
to the edge can have a huge impact.
Finally, there are safety concerns.
Having a small neat root partition that
is essentially read-only gives it a greater chance of surviving a bad crash
intact.
.Pp
Properly partitioning your system also allows you to tune
.Xr newfs 8
and
.Xr tunefs 8
parameters.
Tuning
.Xr newfs 8
requires more experience but can lead to significant improvements in
performance.
There are three parameters that are relatively safe to tune:
.Em blocksize , bytes/i-node ,
and
.Em cylinders/group .
.Pp
.Fx
performs best when using 8K or 16K filesystem block sizes.
The default filesystem block size is 16K,
which provides best performance for most applications,
with the exception of those that perform random access on large files
(such as database server software).
Such applications tend to perform better with a smaller block size,
although modern disk characteristics are such that the performance
gain from using a smaller block size may not be worth consideration.
Using a block size larger than 16K
can cause fragmentation of the buffer cache and
lead to lower performance.
.Pp
The defaults may be unsuitable
for a filesystem that requires a very large number of i-nodes
or is intended to hold a large number of very small files.
Such a filesystem should be created with an 8K or 4K block size.
This also requires you to specify a smaller
fragment size.
We recommend always using a fragment size that is 1/8
the block size (less testing has been done on other fragment size factors).
The
.Xr newfs 8
options for this would be
.Dq Li "newfs -f 1024 -b 8192 ..." .
.Pp
If a large partition is intended to be used to hold fewer, larger files, such
as database files, you can increase the
.Em bytes/i-node
ratio, which reduces the number of i-nodes (maximum number of files and
directories that can be created) for that partition.
Decreasing the number
of i-nodes in a filesystem can greatly reduce
.Xr fsck 8
recovery times after a crash.
Do not use this option
unless you are actually storing large files on the partition, because if you
overcompensate you can wind up with a filesystem that has lots of free
space remaining but cannot accommodate any more files.
Using 32768, 65536, or 262144 bytes/i-node is recommended.
You can go higher but
it will have only incremental effects on
.Xr fsck 8
recovery times.
For example,
.Dq Li "newfs -i 32768 ..." .
.Pp
.Xr tunefs 8
may be used to further tune a filesystem.
This command can be run in
single-user mode without having to reformat the filesystem.
However, this is possibly the most abused program in the system.
Many people attempt to
increase available filesystem space by setting the min-free percentage to 0.
This can lead to severe filesystem fragmentation and we do not recommend
that you do this.
Really the only
.Xr tunefs 8
option worthwhile here is turning on
.Em softupdates
with
.Dq Li "tunefs -n enable /filesystem" .
(Note: in
.Fx 4.5
and later, softupdates can be turned on using the
.Fl U
option to
.Xr newfs 8 ,
and
.Xr sysinstall 8
will typically enable softupdates automatically for non-root filesystems).
Softupdates drastically improves meta-data performance, mainly file
creation and deletion.
We recommend enabling softupdates on most filesystems; however, there
are two limitations to softupdates that you should be aware of when
determining whether to use it on a filesystem.
First, softupdates guarantees filesystem consistency in the
case of a crash but could very easily be several seconds (even a minute!)
behind updating the physical disk.
If you crash you may lose more work
than otherwise.
Secondly, softupdates delays the freeing of filesystem
blocks.
If you have a filesystem (such as the root filesystem) which is
close to full, doing a major update of it, e.g.\&
.Dq Li "make installworld" ,
can run it out of space and cause the update to fail.
For this reason, softupdates will not be enabled on the root filesystem
during a typical install.
.Pp
A number of run-time
.Xr mount 8
options exist that can help you tune the system.
The most obvious and most dangerous one is
.Cm async .
Do not ever use it; it is far too dangerous.
A less dangerous and more
useful
.Xr mount 8
option is called
.Cm noatime .
.Ux
filesystems normally update the last-accessed time of a file or
directory whenever it is accessed.
This operation is handled in
.Fx
with a delayed write and normally does not create a burden on the system.
However, if your system is accessing a huge number of files on a continuing
basis the buffer cache can wind up getting polluted with atime updates,
creating a burden on the system.
For example, if you are running a heavily
loaded web site, or a news server with lots of readers, you might want to
consider turning off atime updates on your larger partitions with this
.Xr mount 8
option.
However, you should not gratuitously turn off atime
updates everywhere.
For example, the
.Pa /var
filesystem customarily
holds mailboxes, and atime (in combination with mtime) is used to
determine whether a mailbox has new mail.
You might as well leave
atime turned on for mostly read-only partitions such as
.Pa /
and
.Pa /usr .
This is especially useful for
.Pa /
since some system utilities
use the atime field for reporting.
.Sh STRIPING DISKS
In larger systems you can stripe partitions from several drives together
to create a much larger overall partition.
Striping can also improve
the performance of a filesystem by splitting I/O operations across two
or more disks.
The
.Xr vinum 8
and
.Xr ccdconfig 8
utilities may be used to create simple striped filesystems.
Generally
speaking, striping smaller partitions such as the root and
.Pa /var/tmp ,
or essentially read-only partitions such as
.Pa /usr
is a complete waste of time.
You should only stripe partitions that require serious I/O performance,
typically
.Pa /var , /home ,
or custom partitions used to hold databases and web pages.
Choosing the proper stripe size is also
important.
Filesystems tend to store meta-data on power-of-2 boundaries
and you usually want to reduce seeking rather than increase seeking.
This
means you want to use a large off-center stripe size, that is, one that is
not a power of 2, such as 1152 sectors,
so sequential I/O does not seek both disks and so meta-data is distributed
across both disks rather than concentrated on a single disk.
If
you really need to get sophisticated, we recommend using a real hardware
RAID controller from the list of
.Fx
supported controllers.
.Sh SYSCTL TUNING
.Xr sysctl 8
variables permit system behavior to be monitored and controlled at
run-time.
Some sysctls simply report on the behavior of the system; others allow
the system behavior to be modified;
some may be set at boot time using
.Xr rc.conf 5 ,
but most will be set via
.Xr sysctl.conf 5 .
There are several hundred sysctls in the system, including many that appear
to be candidates for tuning but actually are not.
In this document we will only cover the ones that have the greatest effect
on the system.
.Pp
The
.Va kern.ipc.shm_use_phys
sysctl defaults to 0 (off) and may be set to 0 (off) or 1 (on).
Setting
this parameter to 1 will cause all System V shared memory segments to be
mapped to unpageable physical RAM.
This feature only has an effect if you
are either (A) mapping small amounts of shared memory across many (hundreds)
of processes, or (B) mapping large amounts of shared memory across any
number of processes.
This feature allows the kernel to remove a great deal
of internal memory management page-tracking overhead at the cost of wiring
the shared memory into core, making it unswappable.
.Pp
The
.Va vfs.vmiodirenable
sysctl defaults to 1 (on).
This parameter controls how directories are cached
by the system.
Most directories are small and use but a single fragment
(typically 1K) in the filesystem and even less (typically 512 bytes) in
the buffer cache.
However, with this sysctl turned off the buffer
cache will only cache a fixed number of directories even if you have a huge
amount of memory.
Turning on this sysctl allows the buffer cache to use
the VM Page Cache to cache the directories.
The advantage is that all of
memory is now available for caching directories.
The disadvantage is that
the minimum in-core memory used to cache a directory is the physical page
size (typically 4K) rather than 512 bytes.
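.Pp
As a rough illustration of that trade-off, the Python sketch below compares
the in-core cost of caching small directories at about 512 bytes each
(buffer cache) and at one 4K page each (VM page cache), the two figures
quoted for
.Va vfs.vmiodirenable
above.
The directory count and the helper name
.Fn cache_cost_mib
are made up for the example; this is back-of-the-envelope arithmetic, not a
measurement of any particular system.
.Bd -literal -offset indent
```python
# Rough in-core cost of caching small directories, using the sizes quoted
# in the text: about 512 bytes per directory in the buffer cache versus
# one physical page (typically 4K) per directory in the VM page cache.

BUFFER_CACHE_BYTES = 512   # per small directory, figure from the text
PAGE_BYTES = 4096          # typical physical page size, figure from the text


def cache_cost_mib(directories, bytes_each):
    """Return the memory needed to cache `directories` entries, in MiB."""
    return directories * bytes_each / (1024 * 1024)


if __name__ == "__main__":
    count = 100_000  # hypothetical number of small directories
    small = cache_cost_mib(count, BUFFER_CACHE_BYTES)
    paged = cache_cost_mib(count, PAGE_BYTES)
    print(f"buffer cache: {small:.1f} MiB")
    print(f"page cache:   {paged:.1f} MiB")
```
.Ed
.Pp
The per-directory cost is eight times higher with the page cache, but the
page cache is not limited to a fixed number of directories.
.Pp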
We recommend turning
.Va vfs.vmiodirenable
off in memory-constrained environments;
however, when on, it will substantially improve the performance of services
that manipulate a large number of files.
Such services can include web caches, large mail systems, and news systems.
Turning on this option will generally not reduce performance even with the
wasted memory but you should experiment to find out.
.Pp
The
.Va vfs.write_behind
sysctl defaults to 1 (on).
This tells the filesystem to issue media
writes as full clusters are collected, which typically occurs when writing
large sequential files.
The idea is to avoid saturating the buffer
cache with dirty buffers when it would not benefit I/O performance.
However,
this may stall processes and under certain circumstances you may wish to turn
it off.
.Pp
The
.Va vfs.hirunningspace
sysctl determines how much outstanding write I/O may be queued to
disk controllers system-wide at any given instant.
The default is
usually sufficient but on machines with lots of disks you may want to bump
it up to four or five megabytes.
Note that setting too high a value
(exceeding the buffer cache's write threshold) can lead to extremely
bad clustering performance.
Do not set this value arbitrarily high!
Also,
higher write queueing values may add latency to reads occurring at the same
time.
.Pp
There are various other buffer-cache and VM page cache related sysctls.
We do not recommend modifying these values.
As of
.Fx 4.3 ,
the VM system does an extremely good job tuning itself.
.Pp
The
.Va net.inet.tcp.sendspace
and
.Va net.inet.tcp.recvspace
sysctls are of particular interest if you are running network intensive
applications.
These control the amount of send and receive buffer space
allowed for any given TCP connection.
The default sending buffer is 32K; the default receiving buffer
is 64K.
You can often
improve bandwidth utilization by increasing the defaults at the cost of
eating up more kernel memory for each connection.
We do not recommend
increasing the defaults if you are serving hundreds or thousands of
simultaneous connections because it is possible to quickly run the system
out of memory due to stalled connections building up.
But if you need
high bandwidth over a smaller number of connections, especially if you have
gigabit Ethernet, increasing these defaults can make a huge difference.
You can adjust the buffer size for incoming and outgoing data separately.
For example, if your machine is primarily doing web serving you may want
to decrease the recvspace in order to be able to increase the
sendspace without eating too much kernel memory.
Note that the routing table (see
.Xr route 8 )
can be used to introduce route-specific send and receive buffer size
defaults.
.Pp
As an additional management tool you can use pipes in your
firewall rules (see
.Xr ipfw 8 )
to limit the bandwidth going to or from particular IP blocks or ports.
For example, if you have a T1 you might want to limit your web traffic
to 70% of the T1's bandwidth in order to leave the remainder available
for mail and interactive use.
Normally a heavily loaded web server
will not introduce significant latencies into other services even if
the network link is maxed out, but enforcing a limit can smooth things
out and lead to longer term stability.
Many people also enforce artificial
bandwidth limitations in order to ensure that they are not charged for
using too much bandwidth.
.Pp
Setting the send or receive TCP buffer to values larger than 65535 will result
in only a marginal performance improvement unless both hosts support the window
scaling extension of the TCP protocol, which is controlled by the
.Va net.inet.tcp.rfc1323
sysctl.
These extensions should be enabled and the TCP buffer size should be set
to a value larger than 65535 in order to obtain good performance out of
certain types of network links; specifically, gigabit WAN links and
high-latency satellite links.
RFC1323 support is enabled by default.
.Pp
The
.Va net.inet.tcp.always_keepalive
sysctl determines whether or not the TCP implementation should attempt
to detect dead TCP connections by intermittently delivering
.Dq keepalives
on the connection.
By default, this is enabled for all applications; by setting this
sysctl to 0, only applications that specifically request keepalives
will use them.
In most environments, TCP keepalives will improve the management of
system state by expiring dead TCP connections, particularly for
systems serving dialup users who may not always terminate individual
TCP connections before disconnecting from the network.
However, in some environments, temporary network outages may be
incorrectly identified as dead sessions, resulting in unexpectedly
terminated TCP connections.
In such environments, setting the sysctl to 0 may reduce the occurrence of
TCP session disconnections.
.Pp
The
.Va net.inet.tcp.inflight_enable
sysctl turns on bandwidth delay product limiting for all TCP connections.
The system will attempt to calculate the bandwidth delay product for each
connection and limit the amount of data queued to the network to just the
amount required to maintain optimum throughput.
This feature is useful
if you are serving data over modems, GigE, or high speed WAN links (or
any other link with a high bandwidth*delay product), especially if you are
also using window scaling or have configured a large send window.
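.Pp
The bandwidth delay product that
.Va net.inet.tcp.inflight_enable
tries to estimate is simply link bandwidth multiplied by round-trip time.
The Python sketch below works it out for a few links and checks the result
against the 65535-byte limit for TCP windows without window scaling.
The link speeds, round-trip times, and function names are illustrative
assumptions, not values taken from the kernel.
.Bd -literal -offset indent
```python
# Bandwidth-delay product: the amount of data that must be in flight to
# keep a link busy. Link speeds and round-trip times are illustrative only.

WINDOW_WITHOUT_SCALING = 65535  # largest TCP window without window scaling


def bdp_bytes(bits_per_second, rtt_seconds):
    """Return the bandwidth-delay product in bytes."""
    return int(bits_per_second * rtt_seconds / 8)


def needs_window_scaling(bits_per_second, rtt_seconds):
    """True if filling the link needs a window larger than 65535 bytes."""
    return bdp_bytes(bits_per_second, rtt_seconds) > WINDOW_WITHOUT_SCALING


if __name__ == "__main__":
    links = [
        ("56k modem, 150 ms", 56_000, 0.150),
        ("100 Mbit/s LAN, 1 ms", 100_000_000, 0.001),
        ("1 Gbit/s WAN, 50 ms", 1_000_000_000, 0.050),
    ]
    for name, bps, rtt in links:
        print(name, bdp_bytes(bps, rtt), needs_window_scaling(bps, rtt))
```
.Ed
.Pp
A link whose product exceeds 65535 bytes cannot be kept full without window
scaling and a correspondingly large send buffer, which is where
.Va net.inet.tcp.inflight_enable
is most useful.
.Pp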
If you enable
.Va net.inet.tcp.inflight_enable
you should also be sure to set
.Va net.inet.tcp.inflight_debug
to 0 (disable debugging), and for production use setting
.Va net.inet.tcp.inflight_min
to at least 6144 may be beneficial.
Note, however, that setting high
minimums may effectively disable bandwidth limiting depending on the link.
The limiting feature reduces the amount of data built up in intermediate
router and switch packet queues as well as the amount of data built
up in the local host's interface queue.
With fewer packets queued up,
interactive connections, especially over slow modems, will also be able
to operate with lower round trip times.
However, note that this feature
only affects data transmission (uploading / server-side).
It does not
affect data reception (downloading).
.Pp
The
.Va kern.ipc.somaxconn
sysctl limits the size of the listen queue for accepting new TCP connections.
The default value of 128 is typically too low for robust handling of new
connections in a heavily loaded web server environment.
For such environments,
we recommend increasing this value to 1024 or higher.
The service daemon
may itself limit the listen queue size (e.g.\&
.Xr sendmail 8 ,
apache) but will
often have a directive in its configuration file to adjust the queue size up.
Larger listen queues also do a better job of fending off denial of service
attacks.
.Pp
The
.Va kern.maxfiles
sysctl determines how many open files the system supports.
The default is
typically a few thousand but you may need to bump this up to ten or twenty
thousand if you are running databases or large descriptor-heavy daemons.
The read-only
.Va kern.openfiles
sysctl may be interrogated to determine the current number of open files
on the system.
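.Pp
One way to judge whether
.Va kern.maxfiles
needs raising is to compare it with
.Va kern.openfiles .
The Python sketch below parses
.Dq name: value
lines of the kind printed by
.Xr sysctl 8
and reports how full the file table is.
The sample values and the function names are invented for illustration.
.Bd -literal -offset indent
```python
# Compare kern.openfiles against kern.maxfiles using lines in the
# "name: value" form printed by sysctl(8).
# The sample values below are invented for illustration.


def parse_sysctl(lines):
    """Parse 'name: value' lines into a dict of integers."""
    values = {}
    for line in lines:
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = int(value.strip())
    return values


def file_table_usage(values):
    """Return the fraction of kern.maxfiles currently in use."""
    return values["kern.openfiles"] / values["kern.maxfiles"]


if __name__ == "__main__":
    sample = ["kern.maxfiles: 8232", "kern.openfiles: 6174"]
    usage = file_table_usage(parse_sysctl(sample))
    print(f"{usage:.0%} of the file table is in use")
```
.Ed
.Pp
In practice the list would be filled from the output of
.Dq Li "sysctl kern.maxfiles kern.openfiles" ;
a usage figure that stays close to 100% under normal load suggests the limit
is too low for the workload.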
.Pp
The
.Va vm.swap_idle_enabled
sysctl is useful in large multi-user systems where you have lots of users
entering and leaving the system and lots of idle processes.
Such systems
tend to generate a great deal of continuous pressure on free memory reserves.
Turning this feature on and adjusting the swapout hysteresis (in idle
seconds) via
.Va vm.swap_idle_threshold1
and
.Va vm.swap_idle_threshold2
allows you to depress the priority of pages associated with idle processes
more quickly than the normal pageout algorithm.
This gives a helping hand
to the pageout daemon.
Do not turn this option on unless you need it,
because the tradeoff you are making is to essentially pre-page memory sooner
rather than later, eating more swap and disk bandwidth.
In a small system
this option will have a detrimental effect but in a large system that is
already doing moderate paging this option allows the VM system to stage
whole processes into and out of memory more easily.
.Sh LOADER TUNABLES
Some aspects of the system behavior may not be tunable at runtime because
memory allocations they perform must occur early in the boot process.
To change loader tunables, you must set their values in
.Xr loader.conf 5
and reboot the system.
.Pp
.Va kern.maxusers
controls the scaling of a number of static system tables, including defaults
for the maximum number of open files, sizing of network memory resources, etc.
As of
.Fx 4.5 ,
.Va kern.maxusers
is automatically sized at boot based on the amount of memory available in
the system, and may be determined at run-time by inspecting the value of the
read-only
.Va kern.maxusers
sysctl.
Some sites will require larger or smaller values of
.Va kern.maxusers
and may set it as a loader tunable; values of 64, 128, and 256 are not
uncommon.
We do not recommend going above 256 unless you need a huge number
of file descriptors; many of the tunable values set to their defaults by
.Va kern.maxusers
may be individually overridden at boot-time or run-time as described
elsewhere in this document.
Systems older than
.Fx 4.4
must set this value via the kernel
.Xr config 8
option
.Cd maxusers
instead.
.Pp
.Va kern.ipc.nmbclusters
may be adjusted to increase the number of network mbufs the system is
willing to allocate.
Each cluster represents approximately 2K of memory,
so a value of 1024 represents 2M of kernel memory reserved for network
buffers.
You can do a simple calculation to figure out how many you need.
If you have a web server which maxes out at 1000 simultaneous connections,
and each connection eats a 16K receive and 16K send buffer, you need
approximately 32MB worth of network buffers to deal with it.
A good rule of
thumb is to multiply by 2: 32MB x 2 = 64MB, and 64MB / 2K = 32768 clusters.
So for this case
you would want to set
.Va kern.ipc.nmbclusters
to 32768.
We recommend values between
1024 and 4096 for machines with moderate amounts of memory, and between 4096
and 32768 for machines with greater amounts of memory.
Under no circumstances
should you specify an arbitrarily high value for this parameter; it could
lead to a boot-time crash.
The
.Fl m
option to
.Xr netstat 1
may be used to observe network cluster use.
Older versions of
.Fx
do not have this tunable and require that the
kernel
.Xr config 8
option
.Dv NMBCLUSTERS
be set instead.
.Pp
More and more programs are using the
.Xr sendfile 2
system call to transmit files over the network.
The
.Va kern.ipc.nsfbufs
sysctl controls the number of filesystem buffers
.Xr sendfile 2
is allowed to use to perform its work.
This parameter nominally scales
with
.Va kern.maxusers
so you should not need to modify this parameter except under extreme
circumstances.
.Sh KERNEL CONFIG TUNING
There are a number of kernel options that you may have to fiddle with in
a large scale system.
In order to change these options you need to be
able to compile a new kernel from source.
The
.Xr config 8
manual page and the handbook are good starting points for learning how to
do this.
Generally the first thing you do when creating your own custom
kernel is to strip out all the drivers and services you do not use.
Removing things like
.Dv INET6
and drivers you do not have will reduce the size of your kernel, sometimes
by a megabyte or more, leaving more memory available for applications.
.Pp
.Dv SCSI_DELAY
and
.Dv IDE_DELAY
may be used to reduce system boot times.
The defaults are fairly high and
can be responsible for 15+ seconds of delay in the boot process.
Reducing
.Dv SCSI_DELAY
to 5 seconds usually works (especially with modern drives).
Reducing
.Dv IDE_DELAY
also works but you have to be a little more careful.
.Pp
There are a number of
.Dv *_CPU
options that can be commented out.
If you only want the kernel to run
on a Pentium class CPU, you can easily remove
.Dv I386_CPU
and
.Dv I486_CPU ,
but only remove
.Dv I586_CPU
if you are sure your CPU is being recognized as a Pentium II or better.
Some clones may be recognized as a Pentium or even a 486 and not be able
to boot without those options.
If it works, great!
The operating system
will be able to make better use of higher-end CPU features for MMU,
task switching, timebase, and even device operations.
Additionally, higher-end CPUs support
4MB MMU pages, which the kernel uses to map the kernel itself into memory,
which increases its efficiency under heavy syscall loads.
.Sh IDE WRITE CACHING
.Fx 4.3
flirted with turning off IDE write caching.
This reduced write bandwidth
to IDE disks but was considered necessary due to serious data consistency
issues introduced by hard drive vendors.
Basically the problem is that
IDE drives lie about when a write completes.
With IDE write caching turned
on, IDE hard drives will not only write data to disk out of order, they
will sometimes delay some of the blocks indefinitely when under heavy disk
loads.
A crash or power failure can result in serious filesystem
corruption.
So our default was changed to be safe.
Unfortunately, the
result was such a huge loss in performance that we caved in and changed the
default back to on after the release.
You should check the default on
your system by observing the
.Va hw.ata.wc
sysctl variable.
If IDE write caching is turned off, you can turn it back
on by setting the
.Va hw.ata.wc
loader tunable to 1.
More information on tuning the ATA driver system may be found in
.Xr ata 4 .
.Pp
There is a new experimental feature for IDE hard drives called
.Va hw.ata.tags
(you also set this in the boot loader) which allows write caching to be safely
turned on.
This brings SCSI tagging features to IDE drives.
As of this
writing only IBM DPTA and DTLA drives support the feature.
Warning!
These
drives apparently have quality control problems and I do not recommend
purchasing them at this time.
If you need performance, go with SCSI.
.Sh CPU, MEMORY, DISK, NETWORK
The type of tuning you do depends heavily on where your system begins to
bottleneck as load increases.
If your system runs out of CPU (idle times
are perpetually 0%) then you need to consider upgrading the CPU or moving to
an SMP motherboard (multiple CPUs), or perhaps you need to revisit the
programs that are causing the load and try to optimize them.
If your system
is paging to swap a lot you need to consider adding more memory.
If your
system is saturating the disk you typically see high CPU idle times and
total disk saturation.
.Xr systat 1
can be used to monitor this.
There are many solutions to saturated disks:
increasing memory for caching, mirroring disks, distributing operations across
several machines, and so forth.
If disk performance is an issue and you
are using IDE drives, switching to SCSI can help a great deal.
While modern
IDE drives compare with SCSI in raw sequential bandwidth, the moment you
start seeking around the disk SCSI drives usually win.
.Pp
Finally, you might run out of network suds.
The first line of defense for
improving network performance is to make sure you are using switches instead
of hubs, especially these days when switches are almost as cheap.
Hubs
have severe problems under heavy loads due to collision backoff and one bad
host can severely degrade the entire LAN.
Second, optimize the network path
as much as possible.
For example, in
.Xr firewall 7
we describe a firewall protecting internal hosts with a topology where
the externally visible hosts are not routed through it.
Use 100BaseT rather
than 10BaseT, or use 1000BaseT rather than 100BaseT, depending on your needs.
Most bottlenecks occur at the WAN link (e.g.\&
modem, T1, DSL, whatever).
If expanding the link is not an option it may be possible to use the
.Xr dummynet 4
feature to implement peak shaving or other forms of traffic shaping to
prevent the overloaded service (such as web services) from affecting other
services (such as email), or vice versa.
In home installations this could
be used to give interactive traffic (your browser,
.Xr ssh 1
logins) priority
over services you export from your box (web services, email).
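.Pp
As a small worked example of sizing such a limit, the Python sketch below
computes a cap for web traffic as a share of a T1, using the 70% figure
mentioned earlier in this page and the nominal T1 line rate of 1544 kbit/s.
The function name is hypothetical and the result is only a starting point
for experimentation.
.Bd -literal -offset indent
```python
# Work out a traffic-shaping cap as a percentage of a WAN link's rate.
# 1544 kbit/s is the nominal T1 line rate; 70 percent is the example
# share used earlier in this page.

T1_KBITS = 1544


def shaping_cap_kbits(link_kbits, percent):
    """Return `percent` of the link rate, rounded down to whole kbit/s."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    return link_kbits * percent // 100


if __name__ == "__main__":
    web = shaping_cap_kbits(T1_KBITS, 70)
    print(f"web traffic cap: {web} kbit/s")
    print(f"left for other services: {T1_KBITS - web} kbit/s")
```
.Ed
.Pp
The resulting figure is the kind of value one would configure as the
bandwidth of a
.Xr dummynet 4
pipe; see
.Xr ipfw 8
for the rule syntax.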
.Sh SEE ALSO
.Xr netstat 1 ,
.Xr systat 1 ,
.Xr ata 4 ,
.Xr dummynet 4 ,
.Xr login.conf 5 ,
.Xr rc.conf 5 ,
.Xr sysctl.conf 5 ,
.Xr firewall 7 ,
.Xr hier 7 ,
.Xr ports 7 ,
.Xr boot 8 ,
.Xr ccdconfig 8 ,
.Xr config 8 ,
.Xr disklabel 8 ,
.Xr fsck 8 ,
.Xr ifconfig 8 ,
.Xr ipfw 8 ,
.Xr loader 8 ,
.Xr mount 8 ,
.Xr newfs 8 ,
.Xr route 8 ,
.Xr sysctl 8 ,
.Xr sysinstall 8 ,
.Xr tunefs 8 ,
.Xr vinum 8
.Sh HISTORY
The
.Nm
manual page was originally written by
.An Matthew Dillon
and first appeared
in
.Fx 4.3 ,
May 2001.