.\" Copyright (c) 2001, Matthew Dillon.  Terms and conditions are those of
.\" the BSD Copyright as specified in the file "/usr/src/COPYRIGHT" in
.\" the source tree.
.\"
.\" $FreeBSD$
.\"
.Dd June 25, 2002
.Dt TUNING 7
.Os
.Sh NAME
.Nm tuning
.Nd performance tuning under FreeBSD
.Sh SYSTEM SETUP - DISKLABEL, NEWFS, TUNEFS, SWAP
When using
.Xr disklabel 8
or
.Xr sysinstall 8
to lay out your filesystems on a hard disk it is important to remember
that hard drives can transfer data much more quickly from outer tracks
than they can from inner tracks.
To take advantage of this you should
try to pack your smaller filesystems and swap closer to the outer tracks,
follow with the larger filesystems, and end with the largest filesystems.
It is also important to size system standard filesystems such that you
will not be forced to resize them later as you scale the machine up.
I usually create, in order, a 128M root, 1G swap, 128M
.Pa /var ,
128M
.Pa /var/tmp ,
3G
.Pa /usr ,
and use any remaining space for
.Pa /home .
.Pp
You should typically size your swap space to approximately 2x main memory.
If you do not have a lot of RAM, though, you will generally want a lot
more swap.
It is not recommended that you configure any less than
256M of swap on a system and you should keep in mind future memory
expansion when sizing the swap partition.
The kernel's VM paging algorithms are tuned to perform best when there is
at least 2x swap versus main memory.
Configuring too little swap can lead
to inefficiencies in the VM page scanning code as well as create issues
later on if you add more memory to your machine.
Finally, on larger systems
with multiple SCSI disks (or multiple IDE disks operating on different
controllers), we strongly recommend that you configure swap on each drive
(up to four drives).
The swap partitions on the drives should be approximately the same size.
The kernel can handle arbitrary sizes but
internal data structures scale to 4 times the largest swap partition.
Keeping
the swap partitions near the same size will allow the kernel to optimally
stripe swap space across the N disks.
Do not worry about overdoing it a
little; swap space is the saving grace of
.Ux
and even if you do not normally use much swap, it can give you more time to
recover from a runaway program before being forced to reboot.
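.Pp
As a rough sketch of the multi-drive swap layout described above, the
.Xr fstab 5
entries might look like the following.
The device names are assumptions (two SCSI disks, with the
.Dq b
partition of the first slice on each used for swap) and must be adapted
to the actual disk layout.
.Bd -literal -offset indent
# Device        Mountpoint  FStype  Options  Dump  Pass#
/dev/da0s1b     none        swap    sw       0     0
/dev/da1s1b     none        swap    sw       0     0
.Ed
.Pp
Once the system is up,
.Xr swapinfo 8
lists the configured swap devices and how much of each is in use.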
.Pp
How you size your
.Pa /var
partition depends heavily on what you intend to use the machine for.
This
partition is primarily used to hold mailboxes, the print spool, and log
files.
Some people even make
.Pa /var/log
its own partition (but except for extreme cases it is not worth the waste
of a partition ID).
If your machine is intended to act as a mail
or print server,
or you are running a heavily visited web server, you should consider
creating a much larger partition \(en perhaps a gig or more.
It is very easy
to underestimate log file storage requirements.
.Pp
Sizing
.Pa /var/tmp
depends on the kind of temporary file usage you think you will need.
128M is
the minimum we recommend.
Also note that sysinstall will create a
.Pa /tmp
directory.
Dedicating a partition for temporary file storage is important for
two reasons: first, it reduces the possibility of filesystem corruption
in a crash, and second it reduces the chance of a runaway process that
fills up
.Oo Pa /var Oc Ns Pa /tmp
from blowing up more critical subsystems (mail,
logging, etc).
Filling up
.Oo Pa /var Oc Ns Pa /tmp
is a very common problem to have.
.Pp
In the old days there were differences between
.Pa /tmp
and
.Pa /var/tmp ,
but the introduction of
.Pa /var
(and
.Pa /var/tmp )
led to massive confusion
by program writers so today programs haphazardly use one or the
other and thus no real distinction can be made between the two.
So it makes sense to have just one temporary directory.
However you handle
.Pa /tmp ,
the one thing you do not want to do is leave it sitting
on the root partition where it might cause root to fill up or possibly
corrupt root in a crash/reboot situation.
.Pp
The
.Pa /usr
partition holds the bulk of the files required to support the system and
a subdirectory within it called
.Pa /usr/local
holds the bulk of the files installed from the
.Xr ports 7
hierarchy.
If you do not use ports all that much and do not intend to keep
system source
.Pq Pa /usr/src
on the machine, you can get away with
a 1 gigabyte
.Pa /usr
partition.
However, if you install a lot of ports
(especially window managers and Linux-emulated binaries), we recommend
at least a 2 gigabyte
.Pa /usr
and if you also intend to keep system source
on the machine, we recommend a 3 gigabyte
.Pa /usr .
Do not underestimate the
amount of space you will need in this partition; it can creep up and
surprise you!
.Pp
The
.Pa /home
partition is typically used to hold user-specific data.
I usually size it to the remainder of the disk.
.Pp
Why partition at all?
Why not create one big
.Pa /
partition and be done with it?
Then I do not have to worry about undersizing things!
Well, there are several reasons this is not a good idea.
First,
each partition has different operational characteristics and separating them
allows the filesystem to tune itself to those characteristics.
For example,
the root and
.Pa /usr
partitions are read-mostly, with very little writing, while
a lot of reading and writing could occur in
.Pa /var
and
.Pa /var/tmp .
By properly
partitioning your system, fragmentation introduced in the smaller, more
heavily write-loaded partitions will not bleed over into the mostly-read
partitions.
Additionally, keeping the write-loaded partitions closer to
the edge of the disk (i.e. before the really big partitions instead of after
in the partition table) will increase I/O performance in the partitions
where you need it the most.
Now it is true that you might also need I/O
performance in the larger partitions, but they are so large that shifting
them more towards the edge of the disk will not lead to a significant
performance improvement whereas moving
.Pa /var
to the edge can have a huge impact.
Finally, there are safety concerns.
Having a small neat root partition that
is essentially read-only gives it a greater chance of surviving a bad crash
intact.
.Pp
Properly partitioning your system also allows you to tune
.Xr newfs 8
and
.Xr tunefs 8
parameters.
Tuning
.Xr newfs 8
requires more experience but can lead to significant improvements in
performance.
There are three parameters that are relatively safe to tune:
.Em blocksize , bytes/i-node ,
and
.Em cylinders/group .
.Pp
.Fx
performs best when using 8K or 16K filesystem block sizes.
The default filesystem block size is 16K,
which provides best performance for most applications,
with the exception of those that perform random access on large files
(such as database server software).
Such applications tend to perform better with a smaller block size,
although modern disk characteristics are such that the performance
gain from using a smaller block size may not be worth consideration.
Using a block size larger than 16K
can cause fragmentation of the buffer cache and
lead to lower performance.
.Pp
The defaults may be unsuitable
for a filesystem that requires a very large number of i-nodes
or is intended to hold a large number of very small files.
Such a filesystem should be created with an 8K or 4K block size.
This also requires you to specify a smaller
fragment size.
We recommend always using a fragment size that is 1/8
the block size (less testing has been done on other fragment size factors).
The
.Xr newfs 8
options for this would be
.Dq Li "newfs -f 1024 -b 8192 ..." .
.Pp
If a large partition is intended to be used to hold fewer, larger files, such
as database files, you can increase the
.Em bytes/i-node
ratio which reduces the number of i-nodes (maximum number of files and
directories that can be created) for that partition.
Decreasing the number
of i-nodes in a filesystem can greatly reduce
.Xr fsck 8
recovery times after a crash.
Do not use this option
unless you are actually storing large files on the partition, because if you
overcompensate you can wind up with a filesystem that has lots of free
space remaining but cannot accommodate any more files.
Using 32768, 65536, or 262144 bytes/i-node is recommended.
You can go higher but
it will have only incremental effects on
.Xr fsck 8
recovery times.
For example,
.Dq Li "newfs -i 32768 ..." .
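.Pp
Putting the options above together, two illustrative invocations follow.
The device names are hypothetical, and
.Xr newfs 8
destroys any existing data on the partition it is given, so treat this as
a sketch rather than something to paste in.
.Bd -literal -offset indent
# Many very small files: 8K blocks, 1K fragments (1/8 the block size)
newfs -f 1024 -b 8192 /dev/da0s1f

# A few large files: one i-node per 64K of data space
newfs -i 65536 /dev/da1s1e
.Ed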
.Pp
.Xr tunefs 8
may be used to further tune a filesystem.
This command can be run in
single-user mode without having to reformat the filesystem.
However, this is possibly the most abused program in the system.
Many people attempt to
increase available filesystem space by setting the min-free percentage to 0.
This can lead to severe filesystem fragmentation and we do not recommend
that you do this.
Really the only
.Xr tunefs 8
option worthwhile here is turning on
.Em softupdates
with
.Dq Li "tunefs -n enable /filesystem" .
(Note: in
.Fx 4.5
and later, softupdates can be turned on using the
.Fl U
option to
.Xr newfs 8 ,
and
.Xr sysinstall 8
will typically enable softupdates automatically for non-root filesystems).
Softupdates drastically improves meta-data performance, mainly file
creation and deletion.
We recommend enabling softupdates on most filesystems; however, there
are two limitations to softupdates that you should be aware of when
determining whether to use it on a filesystem.
First, softupdates guarantees filesystem consistency in the
case of a crash but could very easily be several seconds (even a minute!)
behind updating the physical disk.
If you crash you may lose more work
than otherwise.
Secondly, softupdates delays the freeing of filesystem
blocks.
If you have a filesystem (such as the root filesystem) which is
close to full, doing a major update of it, e.g.\&
.Dq Li "make installworld" ,
can run it out of space and cause the update to fail.
For this reason, softupdates will not be enabled on the root filesystem
during a typical install.
.Pp
A number of run-time
.Xr mount 8
options exist that can help you tune the system.
The most obvious and most dangerous one is
.Cm async .
Do not ever use it; it is far too dangerous.
A less dangerous and more
useful
.Xr mount 8
option is called
.Cm noatime .
.Ux
filesystems normally update the last-accessed time of a file or
directory whenever it is accessed.
This operation is handled in
.Fx
with a delayed write and normally does not create a burden on the system.
However, if your system is accessing a huge number of files on a continuing
basis the buffer cache can wind up getting polluted with atime updates,
creating a burden on the system.
For example, if you are running a heavily
loaded web site, or a news server with lots of readers, you might want to
consider turning off atime updates on your larger partitions with this
.Xr mount 8
option.
However, you should not gratuitously turn off atime
updates everywhere.
For example, the
.Pa /var
filesystem customarily
holds mailboxes, and atime (in combination with mtime) is used to
determine whether a mailbox has new mail.
You might as well leave
atime turned on for mostly read-only partitions such as
.Pa /
and
.Pa /usr
as well.
This is especially useful for
.Pa /
since some system utilities
use the atime field for reporting.
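.Pp
For example, turning off atime updates on a large
.Pa /home
partition could look like the sketch below.
The device name is an assumption.
.Bd -literal -offset indent
# /etc/fstab entry
/dev/da0s1g     /home       ufs     rw,noatime   2     2

# or, on an already mounted filesystem
mount -u -o noatime /home
.Ed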
.Sh STRIPING DISKS
In larger systems you can stripe partitions from several drives together
to create a much larger overall partition.
Striping can also improve
the performance of a filesystem by splitting I/O operations across two
or more disks.
The
.Xr vinum 8
and
.Xr ccdconfig 8
utilities may be used to create simple striped filesystems.
Generally
speaking, striping smaller partitions such as the root and
.Pa /var/tmp ,
or essentially read-only partitions such as
.Pa /usr
is a complete waste of time.
You should only stripe partitions that require serious I/O performance,
typically
.Pa /var , /home ,
or custom partitions used to hold databases and web pages.
Choosing the proper stripe size is also
important.
Filesystems tend to store meta-data on power-of-2 boundaries
and you usually want to reduce seeking rather than increase seeking.
This
means you want to use a large off-center stripe size such as 1152 sectors
so sequential I/O does not seek both disks and so meta-data is distributed
across both disks rather than concentrated on a single disk.
If
you really need to get sophisticated, we recommend using a real hardware
RAID controller from the list of
.Fx
supported controllers.
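.Pp
A minimal sketch of a two-disk stripe built with
.Xr ccdconfig 8
using the 1152-sector interleave mentioned above.
The
.Pa ccd0
unit and the component partitions are assumptions, and the kernel must
include the
.Xr ccd 4
driver.
.Bd -literal -offset indent
# arguments: ccd unit, interleave in sectors, flags, components
ccdconfig ccd0 1152 0 /dev/da0s1e /dev/da1s1e
.Ed
.Pp
The resulting device still has to be labeled with
.Xr disklabel 8
and given a filesystem with
.Xr newfs 8
before it can be mounted.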
.Sh SYSCTL TUNING
.Xr sysctl 8
variables permit system behavior to be monitored and controlled at
run-time.
Some sysctls simply report on the behavior of the system; others allow
the system behavior to be modified;
some may be set at boot time using
.Xr rc.conf 5 ,
but most will be set via
.Xr sysctl.conf 5 .
There are several hundred sysctls in the system, including many that appear
to be candidates for tuning but actually are not.
In this document we will only cover the ones that have the greatest effect
on the system.
.Pp
The
.Va kern.ipc.shm_use_phys
sysctl defaults to 0 (off) and may be set to 0 (off) or 1 (on).
Setting
this parameter to 1 will cause all System V shared memory segments to be
mapped to unpageable physical RAM.
This feature only has an effect if you
are either (A) mapping small amounts of shared memory across many (hundreds
of) processes, or (B) mapping large amounts of shared memory across any
number of processes.
This feature allows the kernel to remove a great deal
of internal memory management page-tracking overhead at the cost of wiring
the shared memory into core, making it unswappable.
.Pp
The
.Va vfs.vmiodirenable
sysctl defaults to 1 (on).
This parameter controls how directories are cached
by the system.
Most directories are small and use but a single fragment
(typically 1K) in the filesystem and even less (typically 512 bytes) in
the buffer cache.
However, when this sysctl is turned off the buffer
cache will only cache a fixed number of directories even if you have a huge
amount of memory.
Turning on this sysctl allows the buffer cache to use
the VM Page Cache to cache the directories.
The advantage is that all of
memory is now available for caching directories.
The disadvantage is that
the minimum in-core memory used to cache a directory is the physical page
size (typically 4K) rather than 512 bytes.
We recommend turning this option off in memory-constrained environments;
however, when on, it will substantially improve the performance of services
that manipulate a large number of files.
Such services can include web caches, large mail systems, and news systems.
Turning on this option will generally not reduce performance even with the
wasted memory but you should experiment to find out.
.Pp
The
.Va vfs.write_behind
sysctl defaults to 1 (on).
This tells the filesystem to issue media
writes as full clusters are collected, which typically occurs when writing
large sequential files.
The idea is to avoid saturating the buffer
cache with dirty buffers when it would not benefit I/O performance.
However,
this may stall processes and under certain circumstances you may wish to turn
it off.
.Pp
The
.Va vfs.hirunningspace
sysctl determines how much outstanding write I/O may be queued to
disk controllers system-wide at any given instant.
The default is
usually sufficient but on machines with lots of disks you may want to bump
it up to four or five megabytes.
Note that setting too high a value
(exceeding the buffer cache's write threshold) can lead to extremely
bad clustering performance.
Do not set this value arbitrarily high!
Also,
higher write queueing values may add latency to reads occurring at the same
time.
.Pp
There are various other buffer-cache and VM page cache related sysctls.
We do not recommend modifying these values.
As of
.Fx 4.3 ,
the VM system does an extremely good job tuning itself.
.Pp
The
.Va net.inet.tcp.sendspace
and
.Va net.inet.tcp.recvspace
sysctls are of particular interest if you are running network intensive
applications.
These control the amount of send and receive buffer space
allowed for any given TCP connection.
The default sending buffer is 32K; the default receiving buffer
is 64K.
You can often
improve bandwidth utilization by increasing the default at the cost of
eating up more kernel memory for each connection.
We do not recommend
increasing the defaults if you are serving hundreds or thousands of
simultaneous connections because it is possible to quickly run the system
out of memory due to stalled connections building up.
But if you need
high bandwidth over fewer connections, especially if you have
gigabit Ethernet, increasing these defaults can make a huge difference.
You can adjust the buffer size for incoming and outgoing data separately.
For example, if your machine is primarily doing web serving you may want
to decrease the recvspace in order to be able to increase the
sendspace without eating too much kernel memory.
Note that the routing table (see
.Xr route 8 )
can be used to introduce route-specific send and receive buffer size
defaults.
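.Pp
As an illustration of the web-serving case above, the following
.Xr sysctl.conf 5
fragment raises the send buffer and lowers the receive buffer.
The values are assumptions chosen only to show the syntax, not
recommendations.
.Bd -literal -offset indent
# /etc/sysctl.conf
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=16384
.Ed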
.Pp
As an additional management tool you can use pipes in your
firewall rules (see
.Xr ipfw 8 )
to limit the bandwidth going to or from particular IP blocks or ports.
For example, if you have a T1 you might want to limit your web traffic
to 70% of the T1's bandwidth in order to leave the remainder available
for mail and interactive use.
Normally a heavily loaded web server
will not introduce significant latencies into other services even if
the network link is maxed out, but enforcing a limit can smooth things
out and lead to longer term stability.
Many people also enforce artificial
bandwidth limitations in order to ensure that they are not charged for
using too much bandwidth.
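.Pp
A sketch of the T1 example above using a
.Xr dummynet 4
pipe.
A T1 carries 1544 Kbit/s, so 70% is roughly 1080 Kbit/s.
The rule and pipe numbers are arbitrary, and the rule assumes the web
server listens on port 80 and that the kernel was built with
.Xr dummynet 4
support.
.Bd -literal -offset indent
# create a pipe limited to about 70% of a T1
ipfw pipe 1 config bw 1080Kbit/s

# send outgoing web traffic through the pipe
ipfw add 1000 pipe 1 tcp from any 80 to any out
.Ed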
.Pp
Setting the send or receive TCP buffer to values larger than 65535 will result
in only a marginal performance improvement unless both hosts support the window
scaling extension of the TCP protocol, which is controlled by the
.Va net.inet.tcp.rfc1323
sysctl.
These extensions should be enabled and the TCP buffer size should be set
to a value larger than 65535 in order to obtain good performance out of
certain types of network links; specifically, gigabit WAN links and
high-latency satellite links.
RFC1323 support is enabled by default.
.Pp
The
.Va net.inet.tcp.always_keepalive
sysctl determines whether or not the TCP implementation should attempt
to detect dead TCP connections by intermittently delivering
.Dq keepalives
on the connection.
By default, this is enabled for all applications; by setting this
sysctl to 0, only applications that specifically request keepalives
will use them.
In most environments, TCP keepalives will improve the management of
system state by expiring dead TCP connections, particularly for
systems serving dialup users who may not always terminate individual
TCP connections before disconnecting from the network.
However, in some environments, temporary network outages may be
incorrectly identified as dead sessions, resulting in unexpectedly
terminated TCP connections.
In such environments, setting the sysctl to 0 may reduce the occurrence of
TCP session disconnections.
.Pp
The
.Va net.inet.tcp.inflight_enable
sysctl turns on bandwidth delay product limiting for all TCP connections.
The system will attempt to calculate the bandwidth delay product for each
connection and limit the amount of data queued to the network to just the
amount required to maintain optimum throughput.
This feature is useful
if you are serving data over modems, GigE, or high speed WAN links (or
any other link with a high bandwidth*delay product), especially if you are
also using window scaling or have configured a large send window.
If you enable this option you should also be sure to set
.Va net.inet.tcp.inflight_debug
to 0 (disable debugging), and for production use setting
.Va net.inet.tcp.inflight_min
to at least 6144 may be beneficial.
Note, however, that setting high
minimums may effectively disable bandwidth limiting depending on the link.
The limiting feature reduces the amount of data built up in intermediate
router and switch packet queues as well as reduces the amount of data built
up in the local host's interface queue.
With fewer packets queued up,
interactive connections, especially over slow modems, will also be able
to operate with lower round trip times.
However, note that this feature
only affects data transmission (uploading / server-side).
It does not
affect data reception (downloading).
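.Pp
The settings discussed above, written as a
.Xr sysctl.conf 5
fragment.
This is only a sketch; whether the limiter helps depends on the link.
.Bd -literal -offset indent
# /etc/sysctl.conf
net.inet.tcp.inflight_enable=1
net.inet.tcp.inflight_debug=0
net.inet.tcp.inflight_min=6144
.Ed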
.Pp
The
.Va kern.ipc.somaxconn
sysctl limits the size of the listen queue for accepting new TCP connections.
The default value of 128 is typically too low for robust handling of new
connections in a heavily loaded web server environment.
For such environments,
we recommend increasing this value to 1024 or higher.
The service daemon
may itself limit the listen queue size (e.g.\&
.Xr sendmail 8 ,
apache) but will
often have a directive in its configuration file to adjust the queue size up.
Larger listen queues also do a better job of fending off denial of service
attacks.
.Pp
The
.Va kern.maxfiles
sysctl determines how many open files the system supports.
The default is
typically a few thousand but you may need to bump this up to ten or twenty
thousand if you are running databases or large descriptor-heavy daemons.
The read-only
.Va kern.openfiles
sysctl may be interrogated to determine the current number of open files
on the system.
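.Pp
For a heavily loaded server the two limits above might be raised with a
.Xr sysctl.conf 5
fragment such as the following.
The
.Va kern.maxfiles
value is an assumption picked from the range suggested above.
.Bd -literal -offset indent
# /etc/sysctl.conf
kern.ipc.somaxconn=1024
kern.maxfiles=20000
.Ed
.Pp
Running
.Dq Li "sysctl kern.maxfiles kern.openfiles"
shows the limit next to the current usage.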
.Pp
The
.Va vm.swap_idle_enabled
sysctl is useful in large multi-user systems where you have lots of users
entering and leaving the system and lots of idle processes.
Such systems
tend to generate a great deal of continuous pressure on free memory reserves.
Turning this feature on and adjusting the swapout hysteresis (in idle
seconds) via
.Va vm.swap_idle_threshold1
and
.Va vm.swap_idle_threshold2
allows you to depress the priority of pages associated with idle processes
more quickly than the normal pageout algorithm.
This gives a helping hand
to the pageout daemon.
Do not turn this option on unless you need it,
because the tradeoff you are making is to essentially pre-page memory sooner
rather than later, eating more swap and disk bandwidth.
In a small system
this option will have a detrimental effect but in a large system that is
already doing moderate paging this option allows the VM system to stage
whole processes into and out of memory more easily.
.Sh LOADER TUNABLES
Some aspects of the system behavior may not be tunable at runtime because
memory allocations they perform must occur early in the boot process.
To change loader tunables, you must set their values in
.Xr loader.conf 5
and reboot the system.
.Pp
.Va kern.maxusers
controls the scaling of a number of static system tables, including defaults
for the maximum number of open files, sizing of network memory resources, etc.
As of
.Fx 4.5 ,
.Va kern.maxusers
is automatically sized at boot based on the amount of memory available in
the system, and may be determined at run-time by inspecting the value of the
read-only
.Va kern.maxusers
sysctl.
Some sites will require larger or smaller values of
.Va kern.maxusers
and may set it as a loader tunable; values of 64, 128, and 256 are not
uncommon.
We do not recommend going above 256 unless you need a huge number
of file descriptors; many of the tunable values set to their defaults by
.Va kern.maxusers
may be individually overridden at boot-time or run-time as described
elsewhere in this document.
Systems older than
.Fx 4.4
must set this value via the kernel
.Xr config 8
option
.Cd maxusers
instead.
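.Pp
For example, to override the automatic sizing with one of the common
values above, a
.Xr loader.conf 5
entry along these lines could be used (the value is illustrative):
.Bd -literal -offset indent
# /boot/loader.conf
kern.maxusers="128"
.Ed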
.Pp
.Va kern.ipc.nmbclusters
may be adjusted to increase the number of network mbuf clusters the system is
willing to allocate.
Each cluster represents approximately 2K of memory,
so a value of 1024 represents 2M of kernel memory reserved for network
buffers.
You can do a simple calculation to figure out how many you need.
If you have a web server which maxes out at 1000 simultaneous connections,
and each connection eats a 16K receive and 16K send buffer, you need
approximately 32MB worth of network buffers to deal with it.
A good rule of
thumb is to multiply by 2, so 2 x 32MB = 64MB, and 64MB / 2K = 32768.
So for this case
you would want to set
.Va kern.ipc.nmbclusters
to 32768.
We recommend values between
1024 and 4096 for machines with moderate amounts of memory, and between 4096
and 32768 for machines with greater amounts of memory.
Under no circumstances
should you specify an arbitrarily high value for this parameter; it could
lead to a boot-time crash.
The
.Fl m
option to
.Xr netstat 1
may be used to observe network cluster use.
Older versions of
.Fx
do not have this tunable and require that the
kernel
.Xr config 8
option
.Dv NMBCLUSTERS
be set instead.
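.Pp
The calculation above can be spelled out with
.Xr sh 1
arithmetic.
The variable names are purely illustrative, and the connection count is
rounded up from 1000 to 1024 here so that the result matches the 32MB
approximation used in the text.
.Bd -literal -offset indent
conns=1024        # about 1000 connections, rounded up
per_conn_kb=32    # 16K receive + 16K send
cluster_kb=2      # approximate size of one cluster
echo $(( conns * per_conn_kb * 2 / cluster_kb ))    # prints 32768
.Ed
.Pp
The result would then go into
.Xr loader.conf 5 :
.Bd -literal -offset indent
# /boot/loader.conf
kern.ipc.nmbclusters="32768"
.Ed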
.Pp
More and more programs are using the
.Xr sendfile 2
system call to transmit files over the network.
The
.Va kern.ipc.nsfbufs
sysctl controls the number of filesystem buffers
.Xr sendfile 2
is allowed to use to perform its work.
This parameter nominally scales
with
.Va kern.maxusers
so you should not need to modify this parameter except under extreme
circumstances.
.Sh KERNEL CONFIG TUNING
There are a number of kernel options that you may have to fiddle with in
a large scale system.
In order to change these options you need to be
able to compile a new kernel from source.
The
.Xr config 8
manual page and the handbook are good starting points for learning how to
do this.
Generally the first thing you do when creating your own custom
kernel is to strip out all the drivers and services you do not use.
Removing things like
.Dv INET6
and drivers you do not have will reduce the size of your kernel, sometimes
by a megabyte or more, leaving more memory available for applications.
.Pp
.Dv SCSI_DELAY
and
.Dv IDE_DELAY
may be used to reduce system boot times.
The defaults are fairly high and
can be responsible for 15+ seconds of delay in the boot process.
Reducing
.Dv SCSI_DELAY
to 5 seconds usually works (especially with modern drives).
Reducing
.Dv IDE_DELAY
also works but you have to be a little more careful.
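.Pp
In the kernel configuration file the SCSI delay is given in milliseconds,
so the 5 second setting above would be written as in this sketch:
.Bd -literal -offset indent
options         SCSI_DELAY=5000
.Ed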
.Pp
There are a number of
.Dv *_CPU
options that can be commented out.
If you only want the kernel to run
on a Pentium class CPU, you can easily remove
.Dv I386_CPU
and
.Dv I486_CPU ,
but only remove
.Dv I586_CPU
if you are sure your CPU is being recognized as a Pentium II or better.
Some clones may be recognized as a Pentium or even a 486 and not be able
to boot without those options.
If it works, great!
The operating system
will be able to make better use of higher-end CPU features for MMU, task
switching, timebase, and even device operations.
Additionally, higher-end CPUs support
4MB MMU pages which the kernel uses to map the kernel itself into memory,
which increases its efficiency under heavy syscall loads.
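.Pp
For a kernel meant to run only on a Pentium II or better, the CPU section
of the configuration file might be reduced as follows.
This is a sketch based on the layout of the stock i386
.Pa GENERIC
file.
.Bd -literal -offset indent
#cpu            I386_CPU
#cpu            I486_CPU
#cpu            I586_CPU
cpu             I686_CPU
.Ed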
.Sh IDE WRITE CACHING
.Fx 4.3
flirted with turning off IDE write caching.
This reduced write bandwidth
to IDE disks but was considered necessary due to serious data consistency
issues introduced by hard drive vendors.
Basically the problem is that
IDE drives lie about when a write completes.
With IDE write caching turned
on, IDE hard drives will not only write data to disk out of order, they
will sometimes delay some of the blocks indefinitely when under heavy disk
loads.
A crash or power failure can result in serious filesystem
corruption.
So our default was changed to be safe.
Unfortunately, the
result was such a huge loss in performance that we caved in and changed the
default back to on after the release.
You should check the default on
your system by observing the
.Va hw.ata.wc
sysctl variable.
If IDE write caching is turned off, you can turn it back
on by setting the
.Va hw.ata.wc
loader tunable to 1.
More information on tuning the ATA driver system may be found in
.Xr ata 4 .
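.Pp
For example, the current setting can be read with
.Dq Li "sysctl hw.ata.wc" ,
and write caching can be turned back on at the next boot with a
.Xr loader.conf 5
entry such as:
.Bd -literal -offset indent
# /boot/loader.conf
hw.ata.wc="1"
.Ed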
.Pp
There is a new experimental feature for IDE hard drives called
.Va hw.ata.tags
(you also set this in the boot loader) which allows write caching to be safely
turned on.
This brings SCSI tagging features to IDE drives.
As of this
writing only IBM DPTA and DTLA drives support the feature.
Warning!
These
drives apparently have quality control problems and I do not recommend
purchasing them at this time.
If you need performance, go with SCSI.
.Sh CPU, MEMORY, DISK, NETWORK
The type of tuning you do depends heavily on where your system begins to
bottleneck as load increases.
If your system runs out of CPU (idle times
are perpetually 0%) then you need to consider upgrading the CPU or moving to
an SMP motherboard (multiple CPUs), or perhaps you need to revisit the
programs that are causing the load and try to optimize them.
If your system
is paging to swap a lot you need to consider adding more memory.
If your
system is saturating the disk you typically see high CPU idle times and
total disk saturation.
.Xr systat 1
can be used to monitor this.
There are many solutions to saturated disks:
increasing memory for caching, mirroring disks, distributing operations across
several machines, and so forth.
If disk performance is an issue and you
are using IDE drives, switching to SCSI can help a great deal.
While modern
IDE drives compare with SCSI in raw sequential bandwidth, the moment you
start seeking around the disk SCSI drives usually win.
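.Pp
As a starting point for finding the bottleneck, the following commands
show CPU idle time, paging activity, and per-disk load.
The one second refresh interval is an arbitrary choice.
.Bd -literal -offset indent
systat -vmstat 1    # CPU, memory, paging, and disk activity on one screen
iostat -w 1         # per-device transfer statistics
swapinfo            # swap space in use
.Ed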
.Pp
Finally, you might run out of network suds.
The first line of defense for
improving network performance is to make sure you are using switches instead
of hubs, especially these days where switches are almost as cheap.
Hubs
have severe problems under heavy loads due to collision backoff and one bad
host can severely degrade the entire LAN.
Second, optimize the network path
as much as possible.
For example, in
.Xr firewall 7
we describe a firewall protecting internal hosts with a topology where
the externally visible hosts are not routed through it.
Use 100BaseT rather
than 10BaseT, or use 1000BaseT rather than 100BaseT, depending on your needs.
Most bottlenecks occur at the WAN link (e.g.\&
modem, T1, DSL, whatever).
If expanding the link is not an option it may be possible to use the
.Xr dummynet 4
feature to implement peak shaving or other forms of traffic shaping to
prevent the overloaded service (such as web services) from affecting other
services (such as email), or vice versa.
In home installations this could
be used to give interactive traffic (your browser,
.Xr ssh 1
logins) priority
over services you export from your box (web services, email).
.Sh SEE ALSO
.Xr netstat 1 ,
.Xr systat 1 ,
.Xr ata 4 ,
.Xr dummynet 4 ,
.Xr login.conf 5 ,
.Xr rc.conf 5 ,
.Xr sysctl.conf 5 ,
.Xr firewall 7 ,
.Xr hier 7 ,
.Xr ports 7 ,
.Xr boot 8 ,
.Xr ccdconfig 8 ,
.Xr config 8 ,
.Xr disklabel 8 ,
.Xr fsck 8 ,
.Xr ifconfig 8 ,
.Xr ipfw 8 ,
.Xr loader 8 ,
.Xr mount 8 ,
.Xr newfs 8 ,
.Xr route 8 ,
.Xr sysctl 8 ,
.Xr sysinstall 8 ,
.Xr tunefs 8 ,
.Xr vinum 8
.Sh HISTORY
The
.Nm
manual page was originally written by
.An Matthew Dillon
and first appeared
in
.Fx 4.3 ,
May 2001.
853May 2001.
854