1<?xml version="1.0"?>
2<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V5.0//EN" "http://www.oasis-open.org/docbook/xml/5.0b5/dtd/docbook.dtd" [
3    <!ENTITY tag_bourneonly   '<inlinemediaobject><imageobject><imagedata fileref="images/tag_bourne.png"></imagedata></imageobject><textobject><phrase>[Bourne]</phrase></textobject></inlinemediaobject> '>
4    <!ENTITY tag_kshonly      '<inlinemediaobject><imageobject><imagedata fileref="images/tag_ksh.png"></imagedata></imageobject><textobject><phrase>[ksh]</phrase></textobject></inlinemediaobject> '>
5    <!ENTITY tag_ksh88only    '<inlinemediaobject><imageobject><imagedata fileref="images/tag_ksh88.png"></imagedata></imageobject><textobject><phrase>[ksh88]</phrase></textobject></inlinemediaobject> '>
6    <!ENTITY tag_ksh93only    '<inlinemediaobject><imageobject><imagedata fileref="images/tag_ksh93.png"></imagedata></imageobject><textobject><phrase>[ksh93]</phrase></textobject></inlinemediaobject> '>
7    <!ENTITY tag_performance  '<inlinemediaobject><imageobject><imagedata fileref="images/tag_perf.png"></imagedata></imageobject><textobject><phrase>[perf]</phrase></textobject></inlinemediaobject> '>
8    <!ENTITY tag_i18n         '<inlinemediaobject><imageobject><imagedata fileref="images/tag_i18n.png"></imagedata></imageobject><textobject><phrase>[i18n]</phrase></textobject></inlinemediaobject> '>
9    <!ENTITY tag_l10n         '<inlinemediaobject><imageobject><imagedata fileref="images/tag_l10n.png"></imagedata></imageobject><textobject><phrase>[l10n]</phrase></textobject></inlinemediaobject> '>
10]>
11<!--
12
13 CDDL HEADER START
14
15 The contents of this file are subject to the terms of the
16 Common Development and Distribution License (the "License").
17 You may not use this file except in compliance with the License.
18
19 You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
20 or http://www.opensolaris.org/os/licensing.
21 See the License for the specific language governing permissions
22 and limitations under the License.
23
24 When distributing Covered Code, include this CDDL HEADER in each
25 file and include the License file at usr/src/OPENSOLARIS.LICENSE.
26 If applicable, add the following below this CDDL HEADER, with the
27 fields enclosed by brackets "[]" replaced with your own identifying
28 information: Portions Copyright [yyyy] [name of copyright owner]
29
30 CDDL HEADER END
31
32-->
33
34<!--
35
36 Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
37 Use is subject to license terms.
38
39-->
40
41<!-- tag images were created like this:
42$ (text="perf" ;
43   pbmtext -nomargins -lspace 0 -builtin fixed "${text}" |
44       pbmtopgm 1 1 |
45       pgmtoppm 1.0,1.0,1.0-0,0,0 /dev/stdin |
46       ppmtogif |
47       giftopnm |
48       pnmtopng >"tag_${text}.png")
49-->
50
51<!-- compile with:
52xsltproc &minus;&minus;stringparam generate.section.toc.level 0 \
53         &minus;&minus;stringparam toc.max.depth 3 \
54         &minus;&minus;stringparam toc.section.depth 12 \
55         &minus;&minus;xinclude -o opensolaris_shell_styleguide.html /usr/share/sgml/docbook/docbook-xsl-stylesheets-1.69.1/html/docbook.xsl opensolaris_shell_styleguide.docbook
56-->
57
58<article
59    xmlns:xlink="http://www.w3.org/1999/xlink"
60    xmlns="http://docbook.org/ns/docbook"
61    xml:lang="en">
62    <!-- xmlns:xi="http://www.w3.org/2001/XInclude" -->
63
64  <info>
65    <title><emphasis>[DRAFT]</emphasis> Bourne/Korn Shell Coding Conventions</title>
66
67    <!-- subtitle abuse -->
68    <subtitle>
69      This page is currently work-in-progress until it is approved by the OS/Net community. Please send any comments to
70      <email>shell-discuss@opensolaris.org</email>.
71    </subtitle>
72
73
74    <authorgroup>
75<!--
76        <author><personname>David G. Korn</personname><email>dgk@research.att.com</email></author>
77        <author><personname>Roland Mainz</personname><email>roland.mainz@nrubsig.org</email></author>
78        <author><personname>Mike Shapiro</personname><email>mike.shapiro@sun.com</email></author>
79-->
80        <author><orgname>OpenSolaris.org</orgname></author>
81    </authorgroup>
82  </info>
83
84<section xml:id="intro">
85  <title>Intro</title>
86  <para>This document describes the shell coding style used for all the SMF script changes integrated into (Open)Solaris.</para>
87  <para>All new SMF shell code should conform to this coding standard, which is intended to match our existing C coding standard.</para>
  <para>When in doubt, think "What would be the C-Style equivalent?" and "What does the POSIX (shell) standard say?"</para>
89</section><!-- end of intro -->
90
91
92<section xml:id="rules">
93  <title>Rules</title>
94
95
96
97  <section xml:id="general">
98  <title>General</title>
99
100      <section xml:id="basic_format">
101          <title>Basic Format</title>
          <para>Similar to <literal>cstyle</literal>, the basic format is that all
          lines are indented by TABs or eight spaces, and continuation lines (which
          in the shell end with "\") are indented by an equivalent number of TABs
          and then an additional four spaces, e.g.
<programlisting>
cp foo bar
cp some_realllllllllllllllly_realllllllllllllly_long_path \
    to_another_really_long_path
</programlisting>
          </para>
          <para>The encoding used for shell scripts is either <literal>ASCII</literal>
          or <literal>UTF-8</literal>; alternative encodings are only allowed when the
          application requires them.</para>
115      </section>
116
117
118      <section xml:id="commenting">
119          <title>Commenting</title>
120          <para>Shell comments are preceded by the '<literal>#</literal>' character. Place
121          single-line comments in the right-hand margin. Use an extra '<literal>#</literal>'
122          above and below the comment in the case of multi-line comments:
123<programlisting>
124cp foo bar		# Copy foo to bar
125
126#
127# Modify the permissions on bar.  We need to set them to root/sys
128# in order to match the package prototype.
129#
130chown root bar
131chgrp sys bar
132</programlisting>
133          </para>
134      </section>
135
136
137      <section xml:id="interpreter_magic">
138          <title>Interpreter magic</title>
139          <para>The proper interpreter magic for your shell script should be one of these:
140<programlisting>
141#!/bin/sh        Standard Bourne shell script
142#!/bin/ksh -p    Standard Korn shell 88 script.  You should always write ksh
143                 scripts with -p so that ${ENV} (if set by the user) is not
144                 sourced into your script by the shell.
145#!/bin/ksh93     Standard Korn shell 93 script (-p is not needed since ${ENV} is
146                 only used for interactive shell sessions).
147</programlisting>
148          </para>
149      </section>
150
151
152      <section xml:id="harden_your_script_against_unexpected_input">
153          <title>Harden the script against unexpected (user) input</title>
          <para>Harden your script against unexpected (user) input, including
          command line options, filenames with blanks (or other special
          characters) in the name, and file input.</para>
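          <para>The following is a minimal sketch, not part of the original guide, of why
          quoting matters for filenames with blanks. The helper <literal>count_args</literal>
          and the file name are hypothetical and exist only for illustration.</para>
<programlisting>
```shell
# Hypothetical helper: prints how many arguments it received.
count_args() {
    printf '%s\n' "$#"
}

file="my report.txt"                  # a file name containing a blank

unquoted_count=$(count_args $file)    # word splitting: two arguments
quoted_count=$(count_args "$file")    # quoting: one argument

printf 'unquoted: %s, quoted: %s\n' "$unquoted_count" "$quoted_count"
```
</programlisting>
          <para>With the default <envar>IFS</envar> the unquoted expansion is split into two
          words, so a command such as <literal>rm</literal> would operate on two different names.</para>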
157      </section>
158
159
160      <section xml:id="use_builtin_commands">
161          <title>&tag_kshonly;&tag_performance;Use builtin commands if the shell provides them</title>
162          <para>
163          Use builtin commands if the shell provides them. For example ksh93s+
164          (ksh93, version 's+') delivered with Solaris (as defined by PSARC 2006/550)
165          supports the following builtins:
166          <simplelist type="inline">
167          <member>basename</member>
168          <member>cat</member>
169          <member>chgrp</member>
170          <member>chmod</member>
171          <member>chown</member>
172          <member>cmp</member>
173          <member>comm</member>
174          <member>cp</member>
175          <member>cut</member>
176          <member>date</member>
177          <member>dirname</member>
178          <member>expr</member>
179          <member>fds</member>
180          <member>fmt</member>
181          <member>fold</member>
182          <member>getconf</member>
183          <member>head</member>
184          <member>id</member>
185          <member>join</member>
186          <member>ln</member>
187          <member>logname</member>
188          <member>mkdir</member>
189          <member>mkfifo</member>
190          <member>mv</member>
191          <member>paste</member>
192          <member>pathchk</member>
193          <member>rev</member>
194          <member>rm</member>
195          <member>rmdir</member>
196          <member>stty</member>
197          <member>tail</member>
198          <member>tee</member>
199          <member>tty</member>
200          <member>uname</member>
201          <member>uniq</member>
202          <member>wc</member>
203          <member>sync</member>
204          </simplelist>
          Those builtins can be enabled via <literal>$ builtin name_of_builtin #</literal> in shell
          scripts. Note that ksh93 builtins implement exact POSIX behaviour, while some
          commands in the Solaris <filename>/usr/bin/</filename> directory implement pre-POSIX behaviour.
          Add <literal>/usr/xpg6/bin/:/usr/xpg4/bin</literal> before
          <filename>/usr/bin/</filename> in <envar>${PATH}</envar> to test whether your script works with
          the XPG6/POSIX versions.
211          </para>
212      </section>
213
214
215      <section xml:id="use_blocks_not_subshells">
216          <title>&tag_performance;Use blocks and not subshells if possible</title>
          <para>Use blocks and not subshells if possible, e.g. use
          <literal>$ { print "foo" ; print "bar" ; } #</literal> instead of
          <literal>$ (print "foo" ; print "bar") #</literal>. Blocks are
          faster since they do not need to save the subshell context (ksh93) or
          spawn a shell child process (Bourne shell, bash, ksh88 etc.).
222          </para>
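          <para>Besides speed, blocks and subshells differ in scope, which matters when
          converting one into the other. A small sketch, assuming a hypothetical variable
          named <literal>counter</literal>:</para>
<programlisting>
```shell
counter=0

# Subshell: the assignment happens in a separate context and is lost.
( counter=1 )
after_subshell=$counter

# Block: runs in the current shell, so the assignment persists.
{ counter=2 ; }
after_block=$counter

printf 'after subshell: %s, after block: %s\n' "$after_subshell" "$after_block"
```
</programlisting>
          <para>Only replace a subshell with a block when the code inside does not rely on
          this isolation (for example for <literal>cd</literal> or temporary variable changes).</para>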
223      </section>
224
225
226      <section xml:id="use_long_options_for_set_builtin">
           <title>&tag_kshonly; Use long options for "<literal>set</literal>"</title>
           <para>Use long options for "<literal>set</literal>", for example instead of <literal>$ set -x #</literal>
           use <literal>$ set -o xtrace #</literal>, to make the code more readable.</para>
230      </section>
231
232
233      <section xml:id="use_posix_command_substitutions_syntax">
234          <title>&tag_kshonly; Use <literal>$(...)</literal> instead of <literal>`...`</literal> command substitutions</title>
          <para>Use <literal>$(...)</literal> instead of <literal>`...`</literal>. The <literal>`...`</literal>
          form is an obsolete construct in ksh and POSIX sh scripts, while <literal>$(...)</literal> is a cleaner design:
          it requires no escaping rules, allows easy nesting etc.</para>
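          <para>A short sketch of the nesting difference. The path is only an example string,
          and <literal>basename</literal> and <literal>dirname</literal> are assumed to be available
          (as builtins or as external commands):</para>
<programlisting>
```shell
path="/usr/src/lib/libshell/misc/shell_styleguide.docbook"

# Nested $(...) needs no extra escaping.
parent_new=$(basename "$(dirname "$path")")

# The backquote form needs the inner backquotes escaped.
parent_old=`basename \`dirname "$path"\``

printf '%s %s\n' "$parent_new" "$parent_old"
```
</programlisting>
          <para>Both forms yield the same directory name, but only the first one stays
          readable when a third level of nesting is added.</para>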
238
239          <note><title>&tag_ksh93only; <literal>${ ...;}</literal>-style command substitutions</title>
          <para>ksh93 has support for an alternative version of command substitution with the
          syntax <literal>${ ...;}</literal> which does not run in a subshell.
242          </para></note>
243      </section>
244
245
246      <section xml:id="put_command_substitution_result_in_quotes">
          <title>&tag_kshonly; Always put the result of a <literal>$(...)</literal> or
          <literal>${ ...;}</literal> command substitution in quotes</title>
          <para>Always put the result of <literal>$( ... )</literal> or <literal>${ ...;}</literal> in
          quotes (e.g. <literal>foo="$( ... )"</literal> or <literal>foo="${ ...;}"</literal>) unless
          there is a very good reason for not doing it.</para>
252      </section>
253
254
255      <section xml:id="always_set_path">
256          <title>Scripts should always set their <envar>PATH</envar></title>
          <para>Scripts should always set their <envar>PATH</envar> to make sure they do not use
          alternative commands by accident (unless the value of <envar>PATH</envar> is well-known
          and guaranteed to be set by the caller).</para>
260      </section>
261
262
263      <section xml:id="make_sure_commands_are_available">
264          <title>Make sure that commands from other packages/applications are really installed on the machine</title>
          <para>Scripts should make sure that commands in optional packages are really
          there, e.g. add a "precheck" block in scripts to avoid later failure when
          doing the main job.</para>
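          <para>One possible shape for such a "precheck" block is sketched below. The function
          name <literal>require_commands</literal> and the variable <literal>precheck_result</literal>
          are hypothetical, and the sketch assumes a shell that provides the POSIX
          <literal>command -v</literal> builtin:</para>
<programlisting>
```shell
# Hypothetical helper: succeeds only if every named command can be found.
# Note: "missing" and "cmd" are global here; use typeset in ksh functions.
require_commands() {
    missing=""
    for cmd in "$@" ; do
        if ! command -v "$cmd" >/dev/null 2>/dev/null ; then
            missing="${missing} ${cmd}"
        fi
    done
    if [ -n "$missing" ] ; then
        printf 'missing commands:%s\n' "$missing"
        return 1
    fi
    return 0
}

# Precheck before the main job starts.
if require_commands ls cat ; then
    precheck_result="ok"
else
    precheck_result="failed"
fi
printf 'precheck: %s\n' "$precheck_result"
```
</programlisting>
          <para>A real script would list the commands from optional packages it depends on
          and abort with an error message when the precheck fails.</para>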
268      </section>
269
270
271      <section xml:id="check_usage_of_boolean_variables">
272          <title>Check how boolean values are used/implemented in your application</title>
273          <para>Check how boolean values are used in your application.</para>
274          <para>For example:
275<programlisting>
276mybool=0
277# do something
278if [ $mybool -eq 1 ] ; then do_something_1 ; fi
279</programlisting>
280could be rewritten like this:
281<programlisting>
282mybool=false # (valid values are "true" or "false", pointing
283# to the builtin equivalents of /bin/true or /bin/false)
284# do something
285if ${mybool} ; then do_something_1 ; fi
286</programlisting>
287or
288<programlisting>
289integer mybool=0 # values are 0 or 1
290# do something
291if (( mybool==1 )) ; then do_something_1 ; fi
292</programlisting>
293          </para>
294      </section>
295
296      <section xml:id="shell_uses_characters_not_bytes">
297          <title>&tag_i18n;The shell always operates on <emphasis>characters</emphasis> not bytes</title>
          <para>Shell scripts operate on characters and <emphasis>not</emphasis> bytes.
          Some locales (called "multibyte locales") use multiple bytes to represent one character.</para>
300
301          <note><para>ksh93 has support for binary variables which explicitly
302          operate on bytes, not characters. This is the <emphasis>only</emphasis> allowed
303          exception.</para></note>
304      </section>
305
306
307      <section xml:id="multibyte_locale_input">
308          <title>&tag_i18n;Multibyte locales and input</title>
          <para>Think about whether your application has to handle file names or
          variables in multibyte locales and make sure all commands used in your
          script can handle such characters. Lots of commands in Solaris's
          <filename>/usr/bin/</filename> are <emphasis>not</emphasis> able to handle such values, so either use ksh93
          builtin constructs (which are guaranteed to be multibyte-aware) or
          commands from <filename>/usr/xpg4/bin/</filename> and/or <filename>/usr/xpg6/bin</filename>.
          </para>
316      </section>
317
318
319      <section xml:id="use_external_filters_only_for_large_datasets">
320          <title>&tag_performance;Only use external filters like <literal>grep</literal>/<literal>sed</literal>/<literal>awk</literal>/etc.
321          if you want to process lots of data with them</title>
          <para>Only use external filters like <literal>grep</literal>/<literal>sed</literal>/<literal>awk</literal>/etc.
          if a significant amount of data is processed by the filter or if
          benchmarking shows that the use of builtin commands is significantly slower
          (otherwise the time and resources needed to start the filter
          far outweigh the cost of processing the data itself,
          creating a performance problem).</para>
          <para>For example:
<programlisting>
if [ "$(echo "$x" | egrep '.*foo.*')" != "" ] ; then
    do_something ;
fi
</programlisting>
can be re-written using ksh93 builtin constructs, saving several
<literal>fork()+exec()</literal>'s:
<programlisting>
if [[ "${x}" == ~(E).*foo.* ]] ; then
    do_something ;
fi
</programlisting>
341          </para>
342      </section>
343
344
345      <section xml:id="use_dashdash_if_first_arg_is_variable">
346          <title>If the first operand of a command is a variable, use <literal>--</literal></title>
          <para>If the first operand of a command is a variable, use <literal>--</literal>
          for any command that accepts it as the end-of-options marker, to
          avoid problems if the variable expands to a value starting with <literal>-</literal>.
350          </para>
351          <note><para>
352          At least
353          <simplelist type="inline">
354              <member>print</member>
355              <member>/usr/bin/fgrep</member><member>/usr/xpg4/bin/fgrep</member>
356              <member>/usr/bin/grep</member> <member>/usr/xpg4/bin/grep</member>
357              <member>/usr/bin/egrep</member><member>/usr/xpg4/bin/egrep</member>
358          </simplelist>
          support <literal>--</literal> as an "end of arguments" terminator.
360          </para></note>
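          <para>A minimal sketch of the problem and the fix, using <literal>grep</literal> from
          the list above. The variable names and the sample text are made up for illustration:</para>
<programlisting>
```shell
pattern="-v"                     # a value that looks like an option
line="use -v for verbose"

# Without "--", grep would treat the pattern as its -v option.
if printf '%s\n' "$line" | grep -- "$pattern" >/dev/null ; then
    dashdash_match="yes"
else
    dashdash_match="no"
fi
printf 'pattern found: %s\n' "$dashdash_match"
```
</programlisting>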
361      </section>
362
363      <section xml:id="use_export">
364          <title>&tag_kshonly;&tag_performance;Use <literal>$ export FOOBAR=val #</literal> instead of
365          <literal>$ FOOBAR=val ; export FOOBAR #</literal></title>
          <para>Use <literal>$ export FOOBAR=val #</literal> instead of <literal>$ FOOBAR=val ; export FOOBAR #</literal>;
          this is much faster.</para>
368      </section>
369
370
371      <section xml:id="use_subshell_around_set_dashdash_usage">
372          <title>Use a subshell (e.g. <literal>$ ( mycmd ) #</literal>) around places which use
373              <literal>set -- $(mycmd)</literal> and/or <literal>shift</literal></title>
          <para>Use a subshell (e.g. <literal>$ ( mycmd ) #</literal>) around places which use
          <literal>set -- $(mycmd)</literal> and/or <literal>shift</literal>, unless the variable
          affected is either a local one or it is guaranteed that this variable will no longer be used
          (be careful with loadable functions, e.g. ksh/ksh93's <literal>autoload</literal>!).
378          </para>
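          <para>The sketch below illustrates the idea with a command substitution, which is
          also a subshell. The word list stands in for the output of a hypothetical
          <literal>mycmd</literal>:</para>
<programlisting>
```shell
set -- original1 original2       # the caller's positional parameters

# The subshell gets its own copy of the positional parameters.
first_word=$(
    set -- $(printf 'alpha beta gamma')
    printf '%s\n' "$1"
)

remaining_args=$#
printf 'first word: %s, caller still has %s arguments\n' \
    "$first_word" "$remaining_args"
```
</programlisting>
          <para>The caller's two arguments survive because <literal>set --</literal> only ran
          inside the subshell.</para>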
379      </section>
380
381
382      <section xml:id="be_careful_with_tabs_in_script_code">
          <title>Be careful with using TABs in script code, they are not portable
          between editors or platforms</title>
          <para>Be careful with using TABs in script code; they are not portable
          between editors or platforms.</para>
          <para>If you use ksh93, use <literal>$'\t'</literal> to include TABs in sources, not the TAB character itself.</para>
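          <para>A small sketch, assuming a shell with <literal>$'...'</literal> support
          (such as ksh93) and a made-up TAB-separated record:</para>
<programlisting>
```shell
tab=$'\t'                        # visible in the source, unlike a raw TAB
record="name${tab}value"

key=${record%%"${tab}"*}         # text before the first TAB
val=${record#*"${tab}"}          # text after the first TAB

printf 'key=%s val=%s\n' "$key" "$val"
```
</programlisting>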
388      </section>
389
390
391      <section xml:id="centralise_error_exit">
           <title>If you have multiple points where your application exits with an error
           message, create a central function for this purpose</title>
           <para>If you have multiple points where your application exits with an error
           message, create a central function for this, e.g.
396<programlisting>
397if [ -z "$tmpdir" ] ; then
398        print -u2 "mktemp failed to produce output; aborting."
399        exit 1
400fi
401if [ ! -d $tmpdir ] ; then
402        print -u2 "mktemp failed to create a directory; aborting."
403        exit 1
404fi
405</programlisting>
406should be replaced with
407<programlisting>
408function fatal_error
409{
410    print -u2 "${progname}: $*"
411    exit 1
412}
413# do something (and save ARGV[0] to variable "progname")
414if [ -z "$tmpdir" ] ; then
415        fatal_error "mktemp failed to produce output; aborting."
416fi
417if [ ! -d "$tmpdir" ] ; then
418        fatal_error "mktemp failed to create a directory; aborting."
419fi
420</programlisting>
421          </para>
422      </section>
423
424
425      <section xml:id="use_set_o_nounset">
426          <title>&tag_kshonly; Think about using <literal>$ set -o nounset #</literal> by default</title>
427          <para>Think about using <literal>$ set -o nounset #</literal> by default (or at least during the
428    script's development phase) to catch errors where variables are used
429    when they are not set (yet), e.g.
430<screen>
431$ <userinput>(set -o nounset ; print ${foonotset})</userinput>
432<computeroutput>/bin/ksh93: foonotset: parameter not set</computeroutput>
433</screen>
434           </para>
435      </section>
436
437
438      <section xml:id="avoid_eval_builtin">
439          <title>Avoid using <literal>eval</literal> unless absolutely necessary</title>
440          <para>Avoid using <literal>eval</literal> unless absolutely necessary.  Subtle things
441          can happen when a string is passed back through the shell
442          parser.  You can use name references to avoid uses such as
443          <literal>eval $name="$value"</literal>.
444          </para>
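          <para>A minimal sketch of the name reference approach, assuming ksh93 (or another
          shell that supports <literal>typeset -n</literal>). The function
          <literal>set_by_name</literal> and the variable <literal>result</literal> are hypothetical:</para>
<programlisting>
```shell
# Hypothetical helper: stores a value in the variable whose name is passed in.
function set_by_name
{
    typeset -n ref="$1"          # "ref" becomes an alias for the named variable
    ref="$2"
}

set_by_name result "hello world"
printf '%s\n' "$result"
```
</programlisting>
          <para>Unlike the <literal>eval</literal> version, the value is never passed back
          through the shell parser, so blanks and special characters in it are harmless.</para>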
445      </section>
446
447
448      <section xml:id="use_concatenation_operator">
449          <title>&tag_ksh93only;Use the string/array concatenation operator <literal>+=</literal></title>
450          <para>Use <literal>+=</literal> instead of manually adding strings/array elements, e.g.
451<programlisting>
452foo=""
453foo="${foo}a"
454foo="${foo}b"
455foo="${foo}c"
456</programlisting>
457should be replaced with
458<programlisting>
459foo=""
460foo+="a"
461foo+="b"
462foo+="c"
463</programlisting>
464          </para>
465      </section>
466
467      <section xml:id="use_source_not_dot">
          <title>&tag_ksh93only;Use <literal>source</literal> instead of '<literal>.</literal>' (dot)
469          to include other shell script fragments</title>
470          <para>Use <literal>source</literal> instead of '<literal>.</literal>'
471          (dot) to include other shell script fragments - the new form is much
472          more readable than the tiny dot and a failure can be caught within the script.</para>
473      </section>
474
475
476      <section xml:id="use_builtin_localisation_support">
477          <title>&tag_ksh93only;&tag_performance;&tag_l10n;Use <literal>$"..."</literal> instead of
478          <literal>gettext ... "..."</literal> for strings that need to be localized for different locales</title>
          <para>Use <literal>$"..."</literal> instead of <literal>gettext ... "..."</literal> for strings that need to be
          localized for different locales. <literal>gettext</literal> requires a
          <literal>fork()+exec()</literal> and
          reads the whole catalog each time it is called, creating a huge overhead for localisation.
          The <literal>$"..."</literal> form is also easier to use: you only have to put a
          <literal>$</literal> in front of the string and it will be localised.
485          </para>
486      </section>
487
488
489      <section xml:id="use_set_o_noglob">
490          <title>&tag_kshonly;&tag_performance;Use <literal>set -o noglob</literal> if you do not need to expand files</title>
          <para>If you don't expect to expand files, you can do <literal>set -f</literal>
          (<literal>set -o noglob</literal>) as well.  This way the need to use <literal>""</literal> is
          greatly reduced.</para>
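          <para>The effect can be sketched as follows; the variable names are hypothetical.
          Note that <literal>noglob</literal> only disables file name expansion, word
          splitting still takes place:</para>
<programlisting>
```shell
pattern='*'

set -o noglob                    # same as "set -f"
set -- $pattern                  # the pattern is NOT expanded to file names
noglob_arg=$1
set +o noglob                    # restore file name expansion

printf 'argument: %s\n' "$noglob_arg"
```
</programlisting>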
494      </section>
495
496
497      <section xml:id="use_empty_ifs_to_handle_spaces">
498          <title>&tag_ksh93only;Use <literal>IFS=</literal> to avoid problems with spaces in filenames</title>
          <para>Unless you want to do word splitting, put <literal>IFS=</literal>
          at the beginning of a command.  This way spaces in
          file names won't be a problem.  You can do
          <literal>IFS='delims' read -r line</literal>
          to override <envar>IFS</envar> just for the <literal>read</literal> command.  However,
          you can't do this for the <literal>set</literal> builtin.</para>
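          <para>A short sketch of both uses, with made-up input lines. The
          <literal>read</literal> commands run inside a block whose output is captured, so the
          example does not depend on which side of a pipeline runs in the current shell:</para>
<programlisting>
```shell
# IFS=':' applies to this single read command only.
fields=$(printf 'root:x:0\n' |
    { IFS=':' read -r user pass uid ; printf '%s %s' "$user" "$uid" ; })

# An empty IFS disables splitting, so surrounding blanks survive.
kept=$(printf '  two words  \n' |
    { IFS= read -r line ; printf '[%s]' "$line" ; })

printf '%s\n%s\n' "$fields" "$kept"
```
</programlisting>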
505      </section>
506
507
508      <section xml:id="set_locale_when_comparing_against_localised_output">
509          <title>Set the message locale if you process output of tools which may be localised</title>
          <para>Set the message locale (<envar>LC_MESSAGES</envar>) if you process output of tools which may be localised.</para>
          <example><title>Set <envar>LC_MESSAGES</envar> when testing for specific output of the <filename>/usr/bin/file</filename> utility:</title>
<programlisting>
# set French as default message locale
export LC_MESSAGES=fr_FR.UTF-8

...

# test whether the file "/tmp" has the filetype "directory" or not
# we set LC_MESSAGES to "C" to ensure the returned message is in English
if [[ "$(LC_MESSAGES=C file /tmp)" = *directory ]] ; then
    print "is a directory"
fi
</programlisting>
          <note><para>The environment variable <envar>LC_ALL</envar> always
          overrides any other <envar>LC_*</envar> environment variables
          (and <envar>LANG</envar>, too),
          including <envar>LC_MESSAGES</envar>.
          If there is a chance that <envar>LC_ALL</envar> may be set,
          replace <envar>LC_MESSAGES</envar> with <envar>LC_ALL</envar>
          in the example above.</para></note>
531          </example>
532      </section>
533
534      <section xml:id="cleanup_after_yourself">
          <title>Clean up after yourself.</title>
          <para>Clean up after yourself. For example, ksh/ksh93 have an <literal>EXIT</literal> trap which
          is very useful for this.
          </para>
          <note><para>
          Note that the <literal>EXIT</literal> trap is executed for a subshell and each subshell
          level can run its own <literal>EXIT</literal> trap, for example
542<screen>
543$ <userinput>(trap "print bam" EXIT ; (trap "print snap" EXIT ; print "foo"))</userinput>
544<computeroutput>foo
545snap
546bam</computeroutput>
547</screen>
548          </para></note>
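          <para>A typical use is removing temporary files. The sketch below assumes that
          <literal>mktemp</literal> is available; the variable names are hypothetical. The trap
          is set inside a subshell here only so that the effect can be checked afterwards:</para>
<programlisting>
```shell
tmpfile=$(mktemp)                      # scratch file for this demonstration

(
    trap 'rm -f -- "$tmpfile"' EXIT    # runs when this subshell exits
    printf 'scratch data\n' > "$tmpfile"
    # ... work with "$tmpfile" here ...
)

if [ -e "$tmpfile" ] ; then
    cleanup_state="left behind"
else
    cleanup_state="removed"
fi
printf 'temporary file %s\n' "$cleanup_state"
```
</programlisting>
          <para>In a real script the trap would normally be set once near the top, so that
          the cleanup also runs when the script exits early with an error.</para>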
549      </section>
550
551      <section xml:id="use_proper_exit_code">
552          <title>Use a proper <literal>exit</literal> code</title>
          <para>Explicitly set the exit code of a script; otherwise the exit code
          from the last command executed will be used, which may trigger problems
          if the value is unexpected.</para>
556      </section>
557
558
559      <section xml:id="shell_lint">
560          <title>&tag_ksh93only;Use <literal>shcomp -n scriptname.sh /dev/null</literal> to check for common errors</title>
561          <para>Use <literal>shcomp -n scriptname.sh /dev/null</literal> to
          check for common problems (such as insecure, deprecated or ambiguous constructs) in shell scripts.</para>
563      </section>
564  </section><!-- end of general -->
565
566
567
568
569
570  <section xml:id="functions">
571      <title>Functions</title>
572
573      <section xml:id="use_functions">
574          <title>Use functions to break up your code</title>
575          <para>Use functions to break up your code into smaller, logical blocks.</para>
576      </section>
577
578      <section xml:id="do_not_reserved_keywords_for_function_names">
          <title>Do not use function names which are reserved keywords in C/C++/Java or the POSIX shell standard</title>
          <para>Do not use function names which are reserved keywords (or function names) in C/C++/Java or the POSIX shell standard
          (to avoid confusion and/or conflicts with future changes/updates to the shell language).
582          </para>
583      </section>
584
585      <section xml:id="use_ksh_style_function_syntax">
586          <title>&tag_kshonly;&tag_performance;Use ksh-style <literal>function</literal></title>
587          <para>It is <emphasis>highly</emphasis> recommended to use ksh style functions
588          (<literal>function foo { ... }</literal>) instead
589          of Bourne-style functions (<literal>foo() { ... }</literal>) if possible
590          (and local variables instead of spamming the global namespace).</para>
591
          <warning><para>
          The difference between old-style Bourne functions and ksh functions is one of the major differences
          between ksh88 and ksh93: ksh88 allowed variables to be local for Bourne-style functions, while ksh93
          conforms to the POSIX standard and uses the global scope for variables declared in
          Bourne-style functions.</para>
          <para>Example (note that "<literal>integer</literal>" is an alias for "<literal>typeset -li</literal>"):
<programlisting>
# new style function with local variable
$ ksh93 -c 'integer x=2 ; function foo { integer x=5 ; } ; print "x=$x" ; foo ; print "x=$x" ;'
x=2
x=2
# old style function with an attempt to create a local variable
$ ksh93 -c 'integer x=2 ; foo() { integer x=5 ; } ; print "x=$x" ; foo ; print "x=$x" ;'
x=2
x=5
</programlisting>
610
611          <uri xlink:href="http://www.opensolaris.org/os/project/ksh93-integration/docs/ksh93r/general/compatibility/">usr/src/lib/libshell/common/COMPATIBILITY</uri>
612          says about this issue:
613<blockquote><para>
614Functions, defined with name() with ksh-93 are compatible with
615the POSIX standard, not with ksh-88.  No local variables are
616permitted, and there is no separate scope.  Functions defined
617with the function name syntax, maintain compatibility.
618This also affects function traces.
619</para></blockquote>
(This issue also affects <filename>/usr/xpg4/bin/sh</filename> in Solaris 10 because it is based on ksh88. This is a bug.)
621          </para></warning>
622
623      </section>
624
625
626      <section xml:id="use_proper_return_code">
627          <title>Use a proper <literal>return</literal> code</title>
          <para>Explicitly set the return code of a function; otherwise the exit code
          from the last command executed will be used, which may trigger problems
          if the value is unexpected.</para>
631          <para>The only allowed exception is if a function uses the shell's <literal>errexit</literal> mode to leave
632          a function, subshell or the script if a command returns a non-zero exit code.
633          </para>
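          <para>A minimal sketch of a function with explicit return codes, written in ksh
          style. The function <literal>is_usable_file</literal>, its return values and the
          sample path are hypothetical:</para>
<programlisting>
```shell
# Hypothetical helper: returns 0 only for a readable, non-empty file.
function is_usable_file
{
    if [[ ! -r "$1" ]] ; then
        return 1                 # missing or not readable
    fi
    if [[ ! -s "$1" ]] ; then
        return 2                 # readable but empty
    fi
    return 0
}

is_usable_file /nonexistent/path
rc_missing=$?
printf 'return code: %s\n' "$rc_missing"
```
</programlisting>
          <para>Because every path through the function ends in an explicit
          <literal>return</literal>, callers never see the status of an unrelated command.</para>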
634      </section>
635
636      <section xml:id="use_fpath_to_load_common_code">
637          <title>&tag_kshonly;Use <envar>FPATH</envar> to load common functions, not <literal>source</literal></title>
638          <para>
          Use the ksh <envar>FPATH</envar> (function path) feature to load functions which are shared between scripts
          and not <literal>source</literal>; this allows such functions to be loaded on demand and not all at once.</para>
641      </section>
642
643  </section><!-- end of functions -->
644
645
646
647
648  <section xml:id="if_for_while">
649      <title><literal>if</literal>, <literal>for</literal> and <literal>while</literal></title>
650
651      <section xml:id="if_for_while_format">
652          <title>Format</title>
653          <para>To match <literal>cstyle</literal>, the shell token equivalent to the <literal>C</literal>
654          "<literal>{</literal>" should appear on the same line, separated by a
655          "<literal>;</literal>", as in:
<programlisting>
if [ "$x" = "hello" ] ; then
    echo $x
fi

if [[ "$x" = "hello" ]] ; then
    print $x
fi

for i in 1 2 3 ; do
    echo $i
done

for ((i=0 ; i &lt; 3 ; i++)) ; do
    print $i
done

while [ $# -gt 0 ] ; do
    echo $1
    shift
done

while (( $# &gt; 0 )) ; do
    print $1
    shift
done
</programlisting>
683          </para>
684      </section>
685
686
687      <section xml:id="test_builtin">
688          <title><literal>test</literal> Builtin</title>
689          <para>DO NOT use the test builtin. Sorry, executive decision.</para>
690          <para>In our Bourne shell, the <literal>test</literal> built-in is the same as the "["
691          builtin (if you don't believe me, try "type test" or refer to <filename>usr/src/cmd/sh/msg.c</filename>).</para>
692          <para>
693          So please do not write:
694<programlisting>
695if test $# -gt 0 ; then
696</programlisting>
697instead use:
698<programlisting>
699if [ $# -gt 0 ] ; then
700</programlisting>
701          </para>
702      </section>
703
704
705      <section xml:id="use_ksh_test_syntax">
706           <title>&tag_kshonly;&tag_performance;Use "<literal>[[ expr ]]</literal>" instead of "<literal>[ expr ]</literal>"</title>
           <para>Use "<literal>[[ expr ]]</literal>" instead of "<literal>[ expr ]</literal>" if possible,
           since it skips word splitting and pattern expansion of its operands and
           adds operators not available in the Bourne shell, such as short-circuit
           <literal>&amp;&amp;</literal> and <literal>||</literal>.
711           </para>
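           <para>A minimal sketch of both points (the variable names and values are made up for
           this illustration): the operand needs no quoting because it is not split into words, and
           the two conditions are combined inside a single test:
<programlisting>
file="my report.txt"    # hypothetical value containing a blank
result="other"

# $file is neither split at the blank nor pattern-expanded inside [[ ]]
if [[ -n $file &amp;&amp; $file == *.txt ]] ; then
    result="text file"
fi
printf "%s\n" "${result}"
</programlisting>
           </para>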
712      </section>
713
714
715      <section xml:id="use_posix_arithmetic_expressions">
716          <title>&tag_kshonly; Use "<literal>(( ... ))</literal>" for arithmetic expressions</title>
          <para>Use "<literal>(( ... ))</literal>" instead of "<literal>[ expr ]</literal>"
          or "<literal>[[ expr ]]</literal>" for arithmetic expressions.
719          </para>
720          <para>
721          Example: Replace
722<programlisting>
723i=5
724# do something
725if [ $i -gt 5 ] ; then
726</programlisting>
727with
728<programlisting>
729i=5
730# do something
731if (( i &gt; 5 )) ; then
732</programlisting>
733          </para>
734      </section>
735
736
737      <section xml:id="compare_exit_code_using_math">
          <title>&tag_kshonly;&tag_performance;Compare exit code using arithmetic expressions</title>
739          <para>Use POSIX arithmetic expressions to test for exit/return codes of commands and functions.
740          For example turn
741<programlisting>
742if [ $? -gt 0 ] ; then
743</programlisting>
744into
745<programlisting>
746if (( $? &gt; 0 )) ; then
747</programlisting>
748         </para>
749      </section>
750
751
752      <section xml:id="use_builtin_commands_in_loops">
753         <title>&tag_bourneonly; Use builtin commands in conditions for <literal>while</literal> endless loops</title>
         <para>Make sure that your shell has a "<literal>true</literal>" builtin (as ksh93 does) before
         writing endless loops like <literal>$ while true ; do do_something ; done #</literal>.
         Otherwise each loop cycle runs a <literal>|fork()|+|exec()|</literal>-cycle to execute
         <filename>/bin/true</filename>.
758         </para>
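         <para>One way to check this is sketched below with the <literal>type</literal> command
         (the variable name is made up for this illustration; the exact wording printed by
         <literal>type</literal> differs between shells, so the pattern used here is an assumption
         that fits the output of ksh93 and bash):
<programlisting>
# "type" reports how the shell would run "true"
case "`type true`" in
    *builtin*) true_is_builtin="yes" ;;   # no fork()+exec() per loop cycle
    *)         true_is_builtin="no"  ;;   # e.g. "true is /usr/bin/true"
esac
printf "%s\n" "${true_is_builtin}"
</programlisting>
         </para>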
759      </section>
760
761
762      <section xml:id="single_line_if_statements">
763         <title>Single-line if-statements</title>
764         <para>It is permissible to use <literal>&amp;&amp;</literal> and <literal>||</literal> to construct
765         shorthand for an "<literal>if</literal>" statement in the case where the if statement has a
766         single consequent line:
767<programlisting>
768[ $# -eq 0 ] &amp;&amp; exit 0
769</programlisting>
770instead of the longer:
771<programlisting>
772if [ $# -eq 0 ]; then
773  exit 0
774fi
775</programlisting>
776         </para>
777      </section>
778
779
780      <section xml:id="exit_status_and_if_for_while">
781         <title>Exit Status and <literal>if</literal>/<literal>while</literal> statements</title>
782         <para>Recall that "<literal>if</literal>" and "<literal>while</literal>"
783         operate on the exit status of the statement
784         to be executed. In the shell, zero (0) means true and non-zero means false.
         The exit status of the last command which was executed is available in the <literal>$?</literal>
786         variable. When using "<literal>if</literal>" and "<literal>while</literal>",
787         it is typically not necessary to use
788         <literal>$?</literal> explicitly, as in:
789<programlisting>
790grep foo /etc/passwd &gt;/dev/null 2>&amp;1
791if [ $? -eq 0 ]; then
792  echo "found"
793fi
794</programlisting>
795Instead, you can more concisely write:
796<programlisting>
797if grep foo /etc/passwd &gt;/dev/null 2>&amp;1; then
798  echo "found"
799fi
800</programlisting>
801Or, when appropriate:
802<programlisting>
803grep foo /etc/passwd &gt;/dev/null 2>&amp;1 &amp;&amp; echo "found"
804</programlisting>
805         </para>
806      </section>
807
808  </section><!-- end of if/for/while -->
809
810
811
812
813
814
815  <section xml:id="variables">
816  <title>Variable types, naming and usage</title>
817
818      <section xml:id="names_should_be_lowercase">
819          <title>Names of local, non-environment, non-constant variables should be lowercase</title>
820          <para>Names of variables local to the current script which are not exported to the environment
821          should be lowercase while variable names which are exported to the
822          environment should be uppercase.</para>
          <para>The only exceptions are global constants (i.e. global readonly variables,
          e.g. <literal>$ float -r M_PI=3.14159265358979323846 #</literal> (taken from &lt;math.h&gt;)),
          which may use uppercase names, too.
826          </para>
827
828          <warning><para>
              Uppercase variable names should be avoided because there is a good chance
              of naming collisions with special variable names used by the shell
              (e.g. <literal>PWD</literal>, <literal>SECONDS</literal> etc.).
832          </para></warning>
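          <para>A short sketch of the convention (all names and values are made up for this
          illustration):
<programlisting>
retry_count=0                           # used only inside this script: lowercase
export MYAPP_CONFIG_DIR="/etc/myapp"    # exported to child processes: uppercase
readonly MAX_RETRIES=3                  # global constant: uppercase is allowed

# child processes see the exported variable but not the script-local one
child_view="$(sh -c 'printf "%s|%s" "${MYAPP_CONFIG_DIR}" "${retry_count}"')"
printf "%s\n" "${child_view}"
</programlisting>
          </para>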
833      </section>
834
835      <section xml:id="do_not_reserved_keywords_for_variable_names">
          <title>Do not use variable names which are reserved keywords/variable names in C/C++/Java or the POSIX shell standard</title>
          <para>Do not use variable names which are reserved keywords in C/C++/Java or the POSIX shell standard
838          (to avoid confusion and/or future changes/updates to the shell language).
839          </para>
840          <note>
841            <para>The Korn Shell and the POSIX shell standard have many more
842            reserved variable names than the original Bourne shell. All
843            these reserved variable names are spelled uppercase.
844            </para>
845          </note>
846      </section>
847
848      <section xml:id="use_brackets_around_long_names">
849          <title>Always use <literal>'{'</literal>+<literal>'}'</literal> when using variable
850          names longer than one character</title>
          <para>Always use <literal>'{'</literal>+<literal>'}'</literal> when using
          variable names longer than one character, unless a simple variable name is
          followed by a blank, <literal>/</literal>, <literal>;</literal>, or <literal>$</literal>
          character. This avoids problems with arrays and compound variables and
          accidental misinterpretation by users or the shell. For example,
856<programlisting>
857print "$foo=info"
858</programlisting>
859should be rewritten to
860<programlisting>
861print "${foo}=info"
862</programlisting>
863          </para>
864      </section>
865
866
867      <section xml:id="quote_variables_containing_filenames_or_userinput">
868          <title><emphasis>Always</emphasis> put variables into quotes when handling filenames or user input</title>
869          <para><emphasis>Always</emphasis> put variables into quotes when handling filenames or user input, even if
870          the values are hardcoded or the values appear to be fixed. Otherwise at
871          least two things may go wrong:
872          <itemizedlist>
          <listitem><para>a malicious user may be able to exploit a script's inner workings to
          inject his/her own code</para></listitem>
875          <listitem><para>a script may (fatally) misbehave for unexpected input (e.g. file names
876          with blanks and/or special symbols which are interpreted by the shell)</para></listitem>
877          </itemizedlist>
878          </para>
879
880          <note><para>
          As an alternative, a script may set <literal>IFS='' ; set -o noglob</literal> to turn off the
          interpretation of any field separators and the pattern globbing.
883          </para></note>
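          <para>The second problem can be sketched as follows (the function and the file name are
          made up for this illustration; the default <envar>IFS</envar> is assumed):
<programlisting>
# print the number of arguments received
count_args()
{
    printf "%d\n" $#
}

filename="my file.txt"                      # hypothetical name containing a blank

unquoted="$(count_args ${filename})"        # split at the blank: two arguments
quoted="$(count_args "${filename}")"        # passed as one argument
printf "unquoted=%d quoted=%d\n" "${unquoted}" "${quoted}"
</programlisting>
          </para>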
884      </section>
885
886
887
888      <section xml:id="use_typed_variables">
889          <title>&tag_kshonly;&tag_performance;Use typed variables if possible.</title>
890          <para>For example the following is very
891          inefficient since it transforms the integer values to strings and back
892          several times:
893<programlisting>
894a=0
895b=1
896c=2
897# more code
if [ $a -lt 5 -o $b -gt $c ] ; then do_something ; fi
899</programlisting>
900This could be rewritten using ksh constructs:
901<programlisting>
902integer a=0
903integer b=1
904integer c=2
905# more code
906if (( a &lt; 5 || b &gt; c )) ; then do_something ; fi
907</programlisting>
908          </para>
909      </section>
910
911
912      <section xml:id="store_lists_in_arrays">
913          <title>&tag_ksh93only; Store lists in arrays or associative arrays</title>
914          <para>Store lists in arrays or associative arrays - this is usually easier
915          to manage.</para>
916          <para>
917    For example:
918<programlisting>
919x="
920/etc/foo
921/etc/bar
922/etc/baz
923"
924echo $x
925</programlisting>
926can be replaced with
927<programlisting>
928typeset -a mylist
929mylist[0]="/etc/foo"
930mylist[1]="/etc/bar"
931mylist[2]="/etc/baz"
932print "${mylist[@]}"
933</programlisting>
or, appending entries ksh93-style to a normal (non-associative) array:
935<programlisting>
936typeset -a mylist
937mylist+=( "/etc/foo" )
938mylist+=( "/etc/bar" )
939mylist+=( "/etc/baz" )
940print "${mylist[@]}"
941</programlisting>
942          </para>
943          <note>
944              <title>Difference between expanding arrays with mylist[@] and mylist[*] subscript operators</title>
945              <para>
946              Arrays may be expanded using two similar subscript operators, @ and *. These subscripts
947              differ only when the variable expansion appears within double quotes. If the variable expansion
948              is between double-quotes, "${mylist[*]}" expands to a single string with the value of each array
949              member separated by the first character of the <envar>IFS</envar> variable, and "${mylist[@]}"
              expands each element of the array to a separate string.
951              </para>
952              <example><title>Difference between [@] and [*] when expanding arrays</title>
953<programlisting>
954typeset -a mylist
955mylist+=( "/etc/foo" )
956mylist+=( "/etc/bar" )
957mylist+=( "/etc/baz" )
958IFS=","
959printf "mylist[*]={ 0=|%s| 1=|%s| 2=|%s| 3=|%s| }\n" "${mylist[*]}"
960printf "mylist[@]={ 0=|%s| 1=|%s| 2=|%s| 3=|%s| }\n" "${mylist[@]}"
961</programlisting>
962<para>will print:</para>
963<screen>
964<computeroutput>mylist[*]={ 0=|/etc/foo,/etc/bar,/etc/baz| 1=|| 2=|| 3=|| }
965mylist[@]={ 0=|/etc/foo| 1=|/etc/bar| 2=|/etc/baz| 3=|| }
966</computeroutput>
967</screen>
968              </example>
969          </note>
970      </section>
971
972
973      <section xml:id="use_compound_variables_or_lists_for_grouping">
974          <title>&tag_ksh93only; Use compound variables or associative arrays to group similar variables together</title>
975          <para>Use compound variables or associative arrays to group similar variables together.</para>
976          <para>
977    For example:
978<programlisting>
979box_width=56
980box_height=10
981box_depth=19
982echo "${box_width} ${box_height} ${box_depth}"
983</programlisting>
984could be rewritten to ("associative array"-style)
985<programlisting>
986typeset -A -E box=( [width]=56 [height]=10 [depth]=19 )
987print -- "${box[width]} ${box[height]} ${box[depth]}"
988</programlisting>
or ("compound variable"-style)
990<programlisting>
991box=(
992    float width=56
993    float height=10
994    float depth=19
995    )
996print -- "${box.width} ${box.height} ${box.depth}"
997</programlisting>
998          </para>
999      </section>
1000  </section><!-- end of variables -->
1001
1002
1003
1004
1005
1006
1007
1008  <section xml:id="io">
1009  <title>I/O</title>
1010
1011      <section xml:id="avoid_echo">
1012          <title>Avoid using the "<literal>echo</literal>" command for output</title>
1013          <para>The behaviour of "<literal>echo</literal>" is not portable
1014          (e.g. System V, BSD, UCB and ksh93/bash shell builtin versions all
1015          slightly differ in functionality) and should be avoided if possible.
          POSIX defines the "<literal>printf</literal>" command as a replacement
          which provides more flexible and portable behaviour.</para>
1018
1019          <note>
1020              <title>&tag_kshonly;Use "<literal>print</literal>" and not "<literal>echo</literal>" in Korn Shell scripts</title>
              <para>Korn shell scripts should prefer the "<literal>print</literal>"
              builtin, which was introduced as a replacement for "<literal>echo</literal>".</para>
1023              <caution>
                  <para>Use <literal>$ print -- "${varname}" #</literal> when there is the slightest chance that the
                  variable "<literal>varname</literal>" may contain symbols like "-". Or better, use "<literal>printf</literal>"
                  instead. For example,
1027<programlisting>
1028integer fx
1029# do something
1030print $fx
1031</programlisting>
may fail if "<literal>fx</literal>" contains a negative value. A better way may be to use
1033<programlisting>
1034integer fx
1035# do something
1036printf "%d\n" fx
1037</programlisting>
1038                  </para>
1039              </caution>
1040          </note>
1041      </section>
1042
1043      <section xml:id="use_redirect_not_exec_to_open_files">
1044          <title>&tag_ksh93only;Use <literal>redirect</literal> and not <literal>exec</literal> to open files</title>
1045          <para>Use <literal>redirect</literal> and not <literal>exec</literal> to open files - <literal>exec</literal>
1046          will terminate the current function or script if an error occurs while <literal>redirect</literal>
1047          just returns a non-zero exit code which can be caught.</para>
1048<para>Example:
1049<programlisting>
1050if redirect 5&lt;/etc/profile ; then
1051    print "file open ok"
1052    head &lt;&amp;5
1053else
1054    print "could not open file"
1055fi
1056</programlisting>
1057           </para>
1058      </section>
1059
1060      <section xml:id="group_identical_redirections_together">
1061          <title>&tag_performance;Avoid redirections per command when the output goes into the same file,
1062          e.g. <literal>$ echo "foo" &gt;xxx ; echo "bar" &gt;&gt;xxx ; echo "baz" &gt;&gt;xxx #</literal></title>
          <para>Each of the redirections above triggers an
          <literal>|open()|,|write()|,|close()|</literal>-sequence. It is much
          more efficient (and faster) to group the redirection into a block,
          e.g. <literal>$ { echo "foo" ; echo "bar" ; echo "baz" ; } &gt;xxx #</literal></para>
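          <para>Written over several lines, a sketch of the grouped form looks like this (the
          output file is a made-up temporary file, the <literal>mktemp</literal> utility is assumed
          to be available, and <literal>printf</literal> is used in place of <literal>echo</literal>):
<programlisting>
outfile="$(mktemp "${TMPDIR:-/tmp}/grouped.XXXXXX")"    # hypothetical output file

# the file is opened once for the whole block
{
    printf "foo\n"
    printf "bar\n"
    printf "baz\n"
} &gt;"${outfile}"

line_count=$(( $(wc -l &lt;"${outfile}") ))
printf "%d lines written\n" "${line_count}"
rm "${outfile}"
</programlisting>
          </para>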
1067      </section>
1068
1069
1070      <section xml:id="avoid_using_temporary_files">
1071          <title>&tag_performance;Avoid the creation of temporary files and store the values in variables instead</title>
          <para>Avoid the creation of temporary files and store the values in variables instead, if possible.</para>
1073          <para>
1074    Example:
1075<programlisting>
1076ls -1 &gt;xxx
1077for i in $(cat xxx) ; do
1078    do_something ;
1079done
1080</programlisting>
1081can be replaced with
1082<programlisting>
1083x="$(ls -1)"
1084for i in ${x} ; do
1085    do_something ;
1086done
1087</programlisting>
1088           </para>
1089           <note><para>ksh93 supports binary variables (e.g. <literal>typeset -b varname</literal>) which can hold any value.</para></note>
1090      </section>
1091
1092
1093      <section xml:id="create_subdirs_for_multiple_temporary_files">
          <title>If you create more than one temporary file, create a unique subdirectory</title>
          <para>If you create more than one temporary file, create a unique subdirectory for
          these files and make sure the directory is writable. Make sure you clean up
          after yourself (unless you are debugging).
1098          </para>
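          <para>A minimal sketch, assuming the <literal>mktemp</literal> utility with its
          <literal>-d</literal> option is available (the name prefix and the file names are made
          up for this illustration):
<programlisting>
# create a private, uniquely named directory for all temporary files
tmpdir="$(mktemp -d "${TMPDIR:-/tmp}/myscript.XXXXXX")" || exit 1

# clean up when the script exits (disable this line while debugging)
trap 'rm -rf "${tmpdir}"' EXIT

printf "step 1\n" &gt;"${tmpdir}/first"
printf "step 2\n" &gt;"${tmpdir}/second"
</programlisting>
          </para>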
1099      </section>
1100
1101
1102      <section xml:id="use_dynamic_file_descriptors">
1103          <title>&tag_ksh93only;Use {n}&lt;file instead of fixed file descriptor numbers</title>
          <para>When opening a file use {n}&lt;file, where <envar>n</envar> is an
          integer variable, rather than specifying a fixed descriptor number.</para>
          <para>This is highly recommended in functions so that fixed file
          descriptor numbers do not interfere with the calling script.</para>
1108<example><title>Open a network connection and store the file descriptor number in a variable</title>
1109<programlisting>
1110function cat_http
1111{
1112    integer netfd
1113
1114...
1115
1116    # open TCP channel
1117    redirect {netfd}&lt;&gt;"/dev/tcp/${host}/${port}"
1118
1119    # send HTTP request
1120    request="GET /${path} HTTP/1.1\n"
1121    request+="Host: ${host}\n"
1122    request+="User-Agent: demo code/ksh93 (2007-08-30; $(uname -s -r -p))\n"
1123    request+="Connection: close\n"
1124    print "${request}\n" &gt;&amp;${netfd}
1125
1126    # collect response and send it to stdout
1127    cat &lt;&amp;${netfd}
1128
1129    # close connection
1130    exec {netfd}&lt;&amp;-
1131
1132...
1133
1134}
1135</programlisting>
1136</example>
1137      </section>
1138
1139
1140      <section xml:id="use_inline_here_documents">
1141          <title>&tag_ksh93only;&tag_performance;Use inline here documents
1142          instead of <literal>echo "$x" | command</literal></title>
1143          <para>Use inline here documents, for example
1144<programlisting>
command &lt;&lt;&lt; "$x"
1146</programlisting>
1147       rather than
1148<programlisting>
1149print -r -- "$x" | command
1150</programlisting>
1151          </para>
1152      </section>
1153
1154
1155      <section xml:id="use_read_r">
1156          <title>&tag_ksh93only;Use the <literal>-r</literal> option of <literal>read</literal> to read a line</title>
          <para>Use the <literal>-r</literal> option of <literal>read</literal> to read a line.
          You never know when a line will end in <literal>\</literal>, and without
          <literal>-r</literal> such a line is joined with the line that follows it,
          so more than one input line is read.</para>
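          <para>The difference can be sketched with a two-line input whose first line ends in a
          backslash (the variable names are made up for this illustration):
<programlisting>
# with -r the backslash is kept and exactly one line is read
read -r raw_line &lt;&lt;'EOF'
first\
second
EOF

# without -r the backslash joins the first line with the second one
read cooked_line &lt;&lt;'EOF'
first\
second
EOF

printf "raw=%s cooked=%s\n" "${raw_line}" "${cooked_line}"
</programlisting>
          </para>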
1161      </section>
1162
1163
1164      <section xml:id="print_compound_variables_using_print_C">
1165          <title>&tag_ksh93only;Print compound variables using <literal>print -C varname</literal> or <literal>print -v varname</literal></title>
1166          <para>Print compound variables using <literal>print -C varname</literal> or
1167          <literal>print -v varname</literal> to make sure that non-printable characters
1168          are correctly encoded.</para>
1169<example><title>Print compound variable with non-printable characters</title>
1170<programlisting>
1171compound x=(
1172    a=5
1173    b="hello"
1174    c=(
1175        d=9
1176        e="$(printf "1\v3")" <co xml:id="co.vertical_tab1" />
1177    )
1178)
1179print -v x
1180</programlisting>
1181<para>will print:</para>
1182<screen>
1183<computeroutput>(
1184        a=5
1185        b=hello
1186        c=(
1187                d=9
1188                e=$'1\0133' <co xml:id="co.vertical_tab2" />
1189        )
1190)</computeroutput>
1191</screen>
1192<calloutlist>
1193  <callout arearefs="co.vertical_tab1 co.vertical_tab2">
1194    <para>vertical tab, <literal>\v</literal>, octal=<literal>\013</literal>.</para>
1195  </callout>
1196</calloutlist>
1197</example>
1198      </section>
1199
1200      <section xml:id="command_name_before_redirections">
1201          <title>Put the command name and arguments before redirections</title>
1202          <para>Put the command name and arguments before redirections.
1203          You can legally do <literal>$ &gt; file date</literal> instead of <literal>date &gt; file</literal>
1204          but don't do it.</para>
1205      </section>
1206
1207      <section xml:id="enable_gmacs_editor_mode_for_user_prompts">
1208          <title>&tag_ksh93only;Enable the <literal>gmacs</literal> editor
1209          mode when reading user input using the <literal>read</literal> builtin</title>
          <para>Enable the <literal>gmacs</literal> editor mode before reading user
          input using the <literal>read</literal> builtin to enable the use of
          cursor+backspace+delete keys in the edit line.</para>
1213<example><title>Prompt user for a string with gmacs editor mode enabled</title>
1214<programlisting>
1215set -o gmacs <co xml:id="co.enable_gmacs" />
1216typeset inputstring="default value"
1217...
1218read -v<co xml:id="co.read_v" /> inputstring<co xml:id="co.readvar" />?"Please enter a string: "<co xml:id="co.prompt" />
1219...
1220printf "The user entered the following string: '%s'\n" "${inputstring}"
1221
1222...
1223</programlisting>
1224<calloutlist>
1225  <callout arearefs="co.enable_gmacs">
1226    <para>Enable gmacs editor mode.</para>
1227  </callout>
1228  <callout arearefs="co.read_v">
1229    <para>The value of the variable is displayed and used as a default value.</para>
1230  </callout>
1231  <callout arearefs="co.readvar">
1232    <para>Variable used to store the result.</para>
1233  </callout>
1234  <callout arearefs="co.prompt">
    <para>Prompt string which is written to stderr.</para>
1236  </callout>
1237</calloutlist>
1238</example>
1239      </section>
1240  </section><!-- end of I/O -->
1241
1242
1243
1244
1245
1246
1247  <section xml:id="math">
1248  <title>Math</title>
1249
1250      <section xml:id="use_builtin_arithmetic_expressions">
1251          <title>&tag_kshonly;&tag_performance;Use builtin arithmetic expressions instead of external applications</title>
1252          <para>Use builtin (POSIX shell) arithmetic expressions instead of
1253          <filename>expr</filename>,
1254          <filename>bc</filename>,
1255          <filename>dc</filename>,
1256          <filename>awk</filename>,
1257          <filename>nawk</filename> or
1258          <filename>perl</filename>.
1259          </para>
1260          <note>
1261              <para>ksh93 supports C99-like floating-point arithmetic including special values
1262              such as
1263              <simplelist type="inline">
1264              <member>+Inf</member>
1265              <member>-Inf</member>
1266              <member>+NaN</member>
1267              <member>-NaN</member>
1268              </simplelist>.
1269              </para>
1270          </note>
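          <para>A minimal comparison (the variable names are made up for this illustration): both
          assignments compute the same value, but only the first one starts an external process:
<programlisting>
i=7

# external utility: one fork()+exec() per calculation
slow=$(expr "${i}" \* 6)

# builtin arithmetic expansion: evaluated inside the shell
fast=$(( i * 6 ))

printf "slow=%d fast=%d\n" "${slow}" "${fast}"
</programlisting>
          </para>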
1271      </section>
1272
1273
1274      <section xml:id="use_floating_point_arithmetic_expressions">
1275          <title>&tag_ksh93only; Use floating-point arithmetic expressions if
1276          calculations may trigger a division by zero or other exceptions</title>
1277          <para>Use floating-point arithmetic expressions if calculations may
1278          trigger a division by zero or other exceptions - floating point arithmetic expressions in
1279          ksh93 support special values such as <literal>+Inf</literal>/<literal>-Inf</literal> and
1280          <literal>+NaN</literal>/<literal>-NaN</literal> which can greatly simplify testing for
1281          error conditions, e.g. instead of a <literal>trap</literal> or explicit
1282          <literal>if ... then... else</literal> checks for every sub-expression
1283          you can check the results for such special values.
1284          </para>
1285          <para>Example:
1286<screen>
1287$ <userinput>ksh93 -c 'integer i=0 j=5 ; print -- "x=$((j/i)) "'</userinput>
1288<computeroutput>ksh93: line 1: j/i: divide by zero</computeroutput>
1289$ <userinput>ksh93 -c 'float i=0 j=-5 ; print -- "x=$((j/i)) "'</userinput>
1290<computeroutput>x=-Inf</computeroutput>
1291</screen>
1292          </para>
1293      </section>
1294
1295
1296      <section xml:id="use_printf_a_for_passing_float_values">
1297          <title>&tag_ksh93only; Use <literal>printf "%a"</literal> when passing floating-point values</title>
1298          <para>Use <literal>printf "%a"</literal> when passing floating-point values between scripts or
1299          as output of a function to avoid rounding errors when converting between
1300          bases.</para>
1301          <para>
1302    Example:
1303<programlisting>
1304function xxx
1305{
1306    float val
1307
1308    (( val=sin(5.) ))
1309    printf "%a\n" val
1310}
1311float out
1312(( out=$(xxx) ))
1313xxx
1314print -- $out
1315</programlisting>
1316This will print:
1317<programlisting>
-0x1.eaf81f5e09933226af13e5563bc6p-01
-0.9589242747
1320</programlisting>
1321          </para>
1322      </section>
1323
1324
1325      <section xml:id="put_constants_into_readonly_variables">
1326         <title>&tag_kshonly;&tag_performance;Put constant values into readonly variables</title>
         <para>Put constant values into readonly variables.</para>
1328         <para>For example:
1329<programlisting>
1330float -r M_PI=3.14159265358979323846
1331</programlisting>
1332or
1333<programlisting>
1334float M_PI=3.14159265358979323846
1335readonly M_PI
1336</programlisting>
1337          </para>
1338      </section>
1339
1340
1341      <section xml:id="avoid_unnecessary_string_number_conversions">
         <title>&tag_kshonly;&tag_performance;Avoid string to number
         (and/or number to string) conversions in arithmetic
         expressions</title>
         <para>Avoid string to number and/or number to string conversions in
         arithmetic expressions to avoid performance degradation
         and rounding errors.</para>
1348         <example><title>(( x=$x*2 )) vs. (( x=x*2 ))</title>
1349<programlisting>
1350float x
1351...
1352(( x=$x*2 ))
1353</programlisting>
1354<para>
will convert the variable "x" (stored in the machine's native
<literal>|long double|</literal> datatype) to a string value in base10 format,
apply pattern expansion (globbing), then insert this string into the
arithmetic expression and parse the value, which converts it back into the internal <literal>|long double|</literal> datatype format.
This is both slow and generates rounding errors when converting the floating-point value between
the internal base2 and the base10 representation of the string.
1361</para>
1362<para>
1363The correct usage would be:
1364</para>
1365<programlisting>
1366float x
1367...
1368(( x=x*2 ))
1369</programlisting>
1370<para>
i.e. omit the '$' because it is (at least) redundant within arithmetic expressions.
1372</para>
1373         </example>
1374
1375
1376         <example><title>x=$(( y+5.5 )) vs. (( x=y+5.5 ))</title>
1377<programlisting>
1378float x
1379float y=7.1
1380...
1381x=$(( y+5.5 ))
1382</programlisting>
1383<para>
will calculate the value of <literal>y+5.5</literal>, convert it to a
base-10 string value and assign the value to the floating-point variable
<literal>x</literal>, which will convert the string value back to the
internal <literal>|long double|</literal> datatype format.
1388</para>
1389<para>
1390The correct usage would be:
1391</para>
1392<programlisting>
1393float x
1394float y=7.1
1395...
1396(( x=y+5.5 ))
1397</programlisting>
1398<para>
1399i.e. this will save the string conversions and avoid any base2--&gt;base10--&gt;base2-conversions.
1400</para>
1401          </example>
1402      </section>
1403
1404
1405      <section xml:id="set_lc_numeric_when_using_floating_point">
1406         <title>&tag_ksh93only;Set <envar>LC_NUMERIC</envar> when using floating-point constants</title>
         <para>Set <envar>LC_NUMERIC</envar> when using floating-point constants to avoid problems with radix-point
         representations which differ from the representation used in the script. For example, the <literal>de_DE.*</literal> locales
         use ',' instead of '.' as the default radix point symbol.</para>
1410         <para>For example:
1411<programlisting>
# Make sure all math stuff runs in the "C" locale to avoid problems with alternative
# radix point representations (e.g. ',' instead of '.' in de_DE.*-locales). This
# needs to be set _before_ any floating-point constants are defined in this script.
if [[ "${LC_ALL}" != "" ]] ; then
    export \
        LC_MONETARY="${LC_ALL}" \
        LC_MESSAGES="${LC_ALL}" \
        LC_COLLATE="${LC_ALL}" \
        LC_CTYPE="${LC_ALL}"
    unset LC_ALL
fi
1423export LC_NUMERIC=C
1424...
1425float -r M_PI=3.14159265358979323846
1426</programlisting>
1427          </para>
1428
1429          <note><para>The environment variable <envar>LC_ALL</envar> always overrides all other <envar>LC_*</envar> variables,
1430          including <envar>LC_NUMERIC</envar>. The script should always protect itself against custom <envar>LC_NUMERIC</envar> and
1431          <envar>LC_ALL</envar> values as shown in the example above.
1432          </para></note>
1433      </section>
1434
1435
1436
1437  </section><!-- end of math -->
1438
1439
1440
1441
1442
1443
1444  <section xml:id="misc">
1445  <title>Misc</title>
1446
1447      <section xml:id="debug_use_lineno_in_ps4">
1448          <title>Put <literal>[${LINENO}]</literal> in your <envar>PS4</envar></title>
          <para>Put <literal>[${LINENO}]</literal> in your <envar>PS4</envar> prompt so that you get line
          numbers when you run with <literal>-x</literal>. If you are looking at performance
          issues, put <literal>$SECONDS</literal> in the <envar>PS4</envar> prompt as well.</para>
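          <para>A short sketch of such a prompt (the exact layout of the prompt string is a matter
          of taste and only an assumption here; the variable <literal>count</literal> is made up
          for this illustration):
<programlisting>
# single quotes: the variables are expanded each time a trace line is printed
PS4='[${LINENO}] ${SECONDS}s: '

set -x                      # start tracing
count=$(( 2 + 3 ))          # traced with line number and elapsed seconds
set +x                      # stop tracing
</programlisting>
          </para>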
1452      </section>
1453
1454  </section><!-- end of misc -->
1455
1456
1457
1458
1459</section><!-- end of RULES -->
1460
1461
1462
1463
1464</article>
1465