usd/04.csh/csh.3

 Copyright (c) 1980, 1993
 The Regents of the University of California. All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
 notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.
 3. Neither the name of the University nor the names of its contributors
 may be used to endorse or promote products derived from this software
 without specific prior written permission.

 THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 SUCH DAMAGE.

 @(#)csh.3 8.1 (Berkeley) 6/8/93
 $FreeBSD$

Shell control structures and command scripts

Introduction

It is possible to place commands in files and to cause shells to be
invoked to read and execute commands from these files,
which are called
 "shell scripts." We here detail those features of the shell useful to the writers of such
scripts.

Make

It is important to first note what shell scripts are
 not useful for.
There is a program called
 make which is very useful for maintaining a group of related files
or performing sets of operations on related files.
For instance a large program consisting of one or more files
can have its dependencies described in a
 makefile which contains definitions of the commands used to create these
different files when changes occur.
Definitions of the means for printing listings, cleaning up the directory
in which the files reside, and installing the resultant programs
are easily, and most appropriately, placed in this
 makefile. This approach is preferable to keeping a group of shell
procedures to maintain these files.

Similarly when working on a document a
 makefile may be created which defines how different versions of the document
are to be created and which options of
 nroff or
 troff are appropriate.

Invocation and the argv variable

A
 csh command script may be interpreted by saying

% csh script ...

where
 script is the name of the file containing a group of
 csh commands and
`...' is replaced by a sequence of arguments.
The shell places these arguments in the variable
 argv and then begins to read commands from the script.
These parameters are then available through the same mechanisms
which are used to reference any other shell variables.

If you make the file
`script'
executable by doing

chmod 755 script

and place a shell comment at the beginning of the shell script
(i.e. begin the file with a `#' character)
then a `/bin/csh' will automatically be invoked to execute `script' when
you type

script

If the file does not begin with a `#' then the standard shell
`/bin/sh' will be used to execute it.
This allows you to convert your older shell scripts to use
 csh at your convenience.

Variable substitution

After each input line is broken into words and history substitutions
are done on it, the input line is parsed into distinct commands.
Before each command is executed a mechanism known as
 "variable substitution" is done on these words.
Keyed by the character `$' this substitution replaces the names
of variables by their values.
Thus

echo $argv

when placed in a command script would cause the current value of the
variable
 argv to be echoed to the output of the shell script.
It is an error for
 argv to be unset at this point.

A number of notations are provided for accessing components and attributes
of variables.
The notation

$?name

expands to `1' if name is
 set or to `0'
if name is not
 set. It is the fundamental mechanism used for checking whether particular
variables have been assigned values.
All other forms of reference to undefined variables cause errors.
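
As a minimal sketch of this test, a script can supply a default value
for a variable only when the user has not already set it.
The variable name `backup' and its default value are assumptions made
for this illustration.

```csh
#
# Hypothetical example: give `backup' a default value if it is not set.
if (! $?backup) set backup = ~/backup
echo backups go to $backup
```

Because `$?backup' never causes an error, the test is safe whether or not
the variable has a value.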

The notation

$#name

expands to the number of elements in the variable
 name. Thus

% set argv=(a b c)
% echo $?argv
1
% echo $#argv
3
% unset argv
% echo $?argv
0
% echo $argv
Undefined variable: argv.
%


It is also possible to access the components of a variable
which has several values.
Thus

$argv[1]

gives the first component of
 argv or in the example above `a'.
Similarly

$argv[$#argv]

would give `c',
and

$argv[1-2]

would give `a b'. Other notations useful in shell scripts are

$n

where
 n is an integer as a shorthand for

$argv[n]

the
 nth parameter and

$*

which is a shorthand for

$argv

The form

$$

expands to the process number of the current shell.
Since this process number is unique in the system it can
be used in generation of unique temporary file names.
The form

$<

is quite special and is replaced by the next line of input read from
the shell's standard input (not the script it is reading). This is
useful for writing shell scripts that are interactive, reading
commands from the terminal, or even writing a shell script that
acts as a filter, reading lines from its input file.
Thus the sequence

echo 'yes or no?\c'
set a=($<)

would write out the prompt `yes or no?' without a newline and then
read the answer into the variable `a'. In this case `$#a' would be
`0' if either a blank line or end-of-file (^D) was typed.

One minor difference between `$n' and `$argv[n]'
should be noted here.
The form
`$argv[n]'
will yield an error if
 n is not in the range
`1-$#argv'
while `$n'
will never yield an out of range subscript error.
This is for compatibility with the way older shells handled parameters.

Another important point is that it is never an error to give a subrange
of the form `n-'; if there are fewer than
 n components of the given variable then no words are substituted.
A range of the form `m-n' likewise returns an empty vector without giving
an error when m exceeds the number of elements of the given variable,
provided the subscript n is in range.
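
One way this property might be used is sketched below: a script that
treats its first argument as a command name and applies it to each
remaining argument.
The script and the variable names `cmd' and `f' are hypothetical.

```csh
#
# Hypothetical example: run the command named by the first argument
# once for each of the remaining arguments.
set cmd = $argv[1]
foreach f ($argv[2-])
    $cmd $f
end
```

If only the command name is given, `$argv[2-]' substitutes no words and
the loop body is never executed.
If no arguments at all are given, the reference to `$argv[1]' is an
out of range subscript and yields an error.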

Expressions

In order for interesting shell scripts to be constructed it
must be possible to evaluate expressions in the shell based on the
values of variables.
In fact, all the arithmetic operations of the language C are available
in the shell
with the same precedence that they have in C.
In particular, the operations `==' and `!=' compare strings
and the operators `&&' and `||' implement the boolean and/or operations.
The special operators `=~' and `!~' are similar to `==' and `!=' except
that the string on the right side can have pattern matching characters
(like *, ? or []) and the test is whether the string on the left matches
the pattern on the right.

The shell also allows file enquiries of the form

-? filename

where `?' is replaced by one of a number of single characters.
For instance the expression primitive

-e filename

tells whether the file
`filename'
exists.
Other primitives test for read, write and execute access to the file,
whether it is a directory, or has non-zero length.
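
A short sketch of how these enquiries might be combined follows.
It assumes the primitives `-d' (directory), `-z' (zero length) and `-r'
(read access) as listed in the manual section for the shell; the
variable name `f' is hypothetical.

```csh
#
# Hypothetical example: report on the file named by the first argument.
set f = $argv[1]
if (! -e $f) then
    echo $f does not exist
else if (-d $f) then
    echo $f is a directory
else if (-z $f) then
    echo $f is empty
else if (-r $f) then
    echo $f is readable
endif
```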

It is possible to test whether a command terminates normally,
by a primitive of the
form `{ command }' which returns true, i.e. `1' if the command
succeeds exiting normally with exit status 0, or `0' if the command
terminates abnormally or with exit status non-zero.
If more detailed information about the execution status of a command
is required, it can be executed and the variable `$status' examined
in the next command.
Since `$status' is set by every command, it is very transient.
It can be saved if it is inconvenient to use it only in the single
immediately following command.
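
The following sketch shows both approaches.
The file names `a.c' and `b.c' and the variable name `rc' are
placeholders chosen for this illustration.

```csh
#
# Hypothetical example: a.c and b.c are placeholder file names.
cmp -s a.c b.c
set rc = $status        # save at once; the next command resets $status
echo comparison finished
if ($rc != 0) echo files differ, cmp returned $rc

# The same test written with the `{ command }' primitive.
if ({ cmp -s a.c b.c }) echo files are identical
```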

For a full list of expression components available see the manual
section for the shell.

Sample shell script

A sample shell script which makes use of the expression mechanism
of the shell and some of its control structure follows:

% cat copyc
#
# Copyc copies those C programs in the specified list
# to the directory ~/backup if they differ from the files
# already in ~/backup
#
set noglob
foreach i ($argv)

        if ($i !~ *.c) continue  # not a .c file so do nothing

        if (! -r ~/backup/$i:t) then
                echo $i:t not in backup... not cp\'ed
                continue
        endif

        cmp -s $i ~/backup/$i:t  # to set $status

        if ($status != 0) then
                echo new backup of $i
                cp $i ~/backup/$i:t
        endif
end


This script makes use of the
 foreach command, which causes the shell to execute the commands between the
 foreach and the matching
 end for each of the values given between `(' and `)' with the named
variable, in this case `i' set to successive values in the list.
Within this loop we may use the command
 break to stop executing the loop
and
 continue to prematurely terminate one iteration
and begin the next.
After the
 foreach loop the iteration variable
(i in this case)
has the value at the last iteration.

We set the variable
 noglob here to prevent filename expansion of the members of
 argv. This is a good idea, in general, if the arguments to a shell script
are filenames which have already been expanded or if the arguments
may contain filename expansion metacharacters.
It is also possible to quote each use of a `$' variable expansion,
but this is harder and less reliable.

The other control construct used here is a statement of the form

if ( expression ) then
 command
 ...
endif

The placement of the keywords here is
 not flexible due to the current implementation of the shell.†

† The following two formats are not currently acceptable to the shell:

if ( expression ) # Won't work!
then
 command
 ...
endif

and

if ( expression ) then command endif # Won't work

The shell does have another form of the if statement of the form

if ( expression ) command

which can be written

if ( expression ) \
 command

Here we have escaped the newline for the sake of appearance.
The command must not involve `|', `&' or `;'
and must not be another control command.
The second form requires the final `\' to
 immediately precede the end-of-line.

The more general
 if statements above also admit a sequence of
 else-if pairs followed by a single
 else and an
 endif, e.g.:

if ( expression ) then
 commands
else if (expression ) then
 commands
...

else
 commands
endif


Another important mechanism used in shell scripts is the `:' modifier.
We can use the modifier `:r' here to extract a root of a filename or
`:e' to extract the
 extension. Thus if the variable
 i has the value
`/mnt/foo.bar'
then
% echo $i $i:r $i:e
/mnt/foo.bar /mnt/foo bar
%

shows how the `:r' modifier strips off the trailing `.bar' and
the `:e' modifier leaves only the `bar'.
Other modifiers will take off the last component of a pathname leaving
the head `:h' or all but the last component of a pathname leaving the
tail `:t'.
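
For the same value of
 i used above, a session illustrating these two modifiers might look like
this:

```csh
% set i = /mnt/foo.bar
% echo $i:h $i:t
/mnt foo.bar
%
```
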
These modifiers are fully described in the
 csh manual pages in the User's Reference Manual.
It is also possible to use the
 "command substitution" mechanism described in the next major section to perform modifications
on strings to then reenter the shell's environment.
Since each usage of this mechanism involves the creation of a new process,
it is much more expensive to use than the `:' modification mechanism.‡

‡ It is also important to note that
the current implementation of the shell limits the number of `:' modifiers
on a `$' substitution to 1.
Thus

% echo $i $i:h:t
/a/b/c /a/b:t
%

does not do what one would expect.

Finally, we note that the character `#' lexically introduces a shell
comment in shell scripts (but not from the terminal).
All subsequent characters on the input line after a `#' are discarded
by the shell.
This character can be quoted using `'' or `\' to place it in
an argument word.

Other control structures

The shell also has control structures
 while and
 switch similar to those of C.
These take the forms

while ( expression )
 commands
end

and

switch ( word )

case str1:
 commands
 breaksw

 ...

case strn:
 commands
 breaksw

default:
 commands
 breaksw

endsw

For details see the manual section for
 csh. C programmers should note that we use
 breaksw to exit from a
 switch while
 break exits a
 while or
 foreach loop.
A common mistake to make in
 csh scripts is to use
 break rather than
 breaksw in switches.
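
As an illustration, the sketch below classifies its first argument as a
yes or no answer.
The script itself is hypothetical; note that two
 case labels may precede a single group of commands, and that each group ends with
 breaksw rather than
 break.

```csh
#
# Hypothetical example: classify the first argument as a yes/no answer.
switch ($argv[1])

case y:
case yes:
    echo affirmative
    breaksw

case n:
case no:
    echo negative
    breaksw

default:
    echo please answer yes or no
    breaksw

endsw
```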

Finally,
 csh allows a
 goto statement, with labels looking like they do in C, i.e.:

loop:
 commands
 goto loop


Supplying input to commands

Commands run from shell scripts receive by default the standard
input of the shell which is running the script.
This is different from previous shells running
under UNIX. It allows shell scripts to fully participate
in pipelines, but mandates extra notation for commands which are to take
inline data.

Thus we need a metanotation for supplying inline data to commands in
shell scripts.
As an example, consider this script which runs the editor to
delete leading blanks from the lines in each argument file:

% cat deblank
# deblank -- remove leading blanks
foreach i ($argv)
ed - $i << 'EOF'
1,$s/^[ ]*//
w
q
'EOF'
end
%

The notation `<< 'EOF''
means that the standard input for the
 ed command is to come from the text in the shell script file
up to the next line consisting of exactly `'EOF''.
The fact that the `EOF' is enclosed in `'' characters, i.e. quoted,
causes the shell to not perform variable substitution on the
intervening lines.
In general, if any part of the word following the `<<' which the
shell uses to terminate the text to be given to the command is quoted
then these substitutions will not be performed.
In this case since we used the form `1,$' in our editor script
we needed to ensure that this `$' was not variable substituted.
We could also have ensured this by preceding the `$' here with a `\',
i.e.:

1,\$s/^[ ]*//

but quoting the `EOF' terminator is a more reliable way of achieving the
same thing.
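
The difference between the quoted and unquoted forms can be seen in the
following sketch, which passes the same line of inline data to
 cat twice.
The variable name `who' is hypothetical.

```csh
#
# Hypothetical example contrasting the two forms of `<<'.
set who = $argv[1]

# Terminator not quoted: $who is replaced by its value.
cat << EOF
Report for $who
EOF

# Terminator quoted: the line is passed through unchanged.
cat << 'EOF'
Report for $who
'EOF'
```

The first command prints the value of the first argument after
`Report for', while the second prints the characters `$who' literally.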

Catching interrupts

If our shell script creates temporary files, we may wish to catch
interruptions of the shell script so that we can clean up
these files.
We can then do

onintr label

where
 label is a label in our program.
If an interrupt is received the shell will do a
`goto label'
and we can remove the temporary files and then do an
 exit command (which is built in to the shell)
to exit from the shell script.
If we wish to exit with a non-zero status we can do

exit(1)

e.g. to exit with status `1'.
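
Putting these pieces together, a script along the following lines would
remove its temporary file whether it finishes normally or is interrupted.
This is only a sketch: the label `cleanup', the variable `tmp' and the
file name `/tmp/sortit.$$' are assumptions made for the example.

```csh
#
# Hypothetical example: sort the first argument via a temporary file.
set tmp = /tmp/sortit.$$
onintr cleanup

sort $argv[1] > $tmp
cat $tmp
rm -f $tmp
exit(0)

cleanup:
rm -f $tmp
exit(1)
```

The explicit `exit(0)' keeps the normal path from falling through into
the commands after the label.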

What else?

There are other features of the shell useful to writers of shell
procedures.
The
 verbose and
 echo options and the related
 -v and
 -x command line options can be used to help trace the actions of the shell.
The
 -n option causes the shell only to read commands and not to execute
them and may sometimes be of use.

One other thing to note is that
 csh will not execute shell scripts which do not begin with the
character `#', that is shell scripts that do not begin with a comment.
Similarly, the `/bin/sh' on your system may well defer to `csh'
to interpret shell scripts which begin with `#'.
This allows shell scripts for both shells to live in harmony.

There is also another quotation mechanism using `"' which allows
only some of the expansion mechanisms we have so far discussed to occur
on the quoted string and serves to make this string into a single word
as `'' does.