Software tools

We discussed non-software tools: how easily you can combine them, and how multi-purpose they are. You can hit just about anything with a hammer.

In contrast, software is often presented as large monolithic programs. These do have many features, but if the implementers didn't happen to think of the feature you want, it's not easily added.

The "software tools" idea is about writing small, simple programs which do one thing well, and having powerful and general ways to combine them.

Cute quotation: "Unix is user-friendly; it's just choosy about who its friends are."

Summary: "Do one thing well."

Tools which do just one thing can be combined in arbitrary ways.


One thing a bit odd in unix is that program output doesn't contain headers.

Consider the "who" command. Example output:

ajr      console  Jan  8 06:28
ajr      ttyp1    Jan  8 09:25
ajr      ttyp2    Jan  8 09:26
(The "who" output is more exciting on a system with multiple users, especially if no one's on the console and creating multiple terminal windows; try it on cslinux or seawolf (as appropriate to the campus you're taking this course on...).)

We can see how many entries there are by using the "word count" program "wc", with the option "-l" which means "only display the line count":

% who | wc -l
       3
% 
On many non-unix systems we would expect output with a header, identifying the columns, like this:
User     Terminal  Login time
------------------------------
ajr      console  Jan  8 06:28
ajr      ttyp1    Jan  8 09:25
ajr      ttyp2    Jan  8 09:26
But this would cause problems for the software tools model. In the "who | wc -l" case, the line count above would be off by two; in fact we would get funny results from many tools. For example, a "grep" (display only lines matching a search expression) to see who is logged in and has a "-" in their logname would also display the header separation line, or if a user were named "ogi", then "who | grep ogi" would also display the header line.
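
A minimal sketch of the problem, using made-up "who" output held in shell variables so that it behaves the same on any system (the variable names are only for illustration):

```shell
# Simulated "who" output, without a header.
plain='ajr      console  Jan  8 06:28
ajr      ttyp1    Jan  8 09:25
ajr      ttyp2    Jan  8 09:26'

# The same output as a header-printing system might show it.
with_header='User     Terminal  Login time
------------------------------
'"$plain"

# $(( ... )) strips the padding some versions of wc put before the number.
plain_count=$(( $(printf '%s\n' "$plain" | wc -l) ))
header_count=$(( $(printf '%s\n' "$with_header" | wc -l) ))
echo "without header: $plain_count"    # 3
echo "with header: $header_count"      # 5, off by two

# "ogi" matches no user here, but it does match "Login" in the header line.
printf '%s\n' "$with_header" | grep ogi
```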


Software tools

Software tools principles, after Doug McIlroy (with some text from Ian Darwin):

  1. Write small programs that do one thing well.
  2. Expect the output of every program to become the input to another, as yet unknown, program.
  3. Make programs' input formats easy to generate or type.
    If every file has the same format, users only need one set of tools. If the format is simple, the tools are easy to write.
    If everything in the system is a file, users can go further with one set of tools.
  4. Use programs to write programs.
Don't force people to use the system in one way.
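
As a small illustration of principles 2 and 4, here is a sketch in which sed writes a tiny shell program and sh then runs it. The names and the greeting are invented for the example.

```shell
# sed turns each input line into an echo command: a program writing a program.
generated=$(printf '%s\n' alice bob | sed 's/.*/echo "Hello, &"/')
printf '%s\n' "$generated"

# The generated text becomes the input to another program, the shell.
output=$(printf '%s\n' "$generated" | sh)
printf '%s\n' "$output"
```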


Filters
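
A "filter" is a program which reads its standard input, transforms it in some way, and writes the result to its standard output.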

e.g. grep: print lines which match a pattern.

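Some ways data goes INTO a command: on the command line, where file name patterns are expanded by the shell ("globbing") before the command ever sees them. There is special treatment of '.' at the beginning of a file name: it must be matched explicitly, so "*" does not match ".profile".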
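
A sketch of globbing in a scratch directory. The file names are invented, and "mktemp -d" is assumed to be available (it is on most current systems, though it is not in every unix).

```shell
old=$PWD
dir=$(mktemp -d)            # scratch directory, removed at the end
cd "$dir" || exit 1
touch notes.txt gcd.c .hidden

visible=$(echo *)           # the shell expands * before echo runs
dotfiles=$(echo .h*)        # a leading '.' has to be matched explicitly
echo "$visible"             # gcd.c notes.txt
echo "$dotfiles"            # .hidden

cd "$old" && rm -rf "$dir"
```

Note that echo itself knows nothing about patterns; it just prints the arguments the shell hands it.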


Here are some filters. All of these, and everything else, have man pages. Get used to reading man pages, especially to find obscure options.

I frequently read man pages. The on-line help in unix is very comprehensive. There's a lot to know and you don't have to remember it all.

grep

    who | grep ajr
    grep /~ajr/209/ /var/httpd/log/access_log
    lpq | grep ajr | cut -f1 | xargs lprm
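
To try grep without a real log file, here is a sketch using a few made-up log lines (the hosts and paths are hypothetical):

```shell
log='host1 GET /~ajr/209/index.html
host2 GET /~bob/index.html
host3 GET /~ajr/209/notes.html'

# Print only the lines which match the pattern.
matches=$(printf '%s\n' "$log" | grep '/~ajr/209/')
printf '%s\n' "$matches"

# grep -c prints the number of matching lines instead of the lines themselves.
count=$(printf '%s\n' "$log" | grep -c '/~ajr/209/')
echo "$count"
```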

tr

    tr '\015' '\012' <file.mac >file.unix
    tr A-Z a-z
    tr a-zA-Z n-za-mN-ZA-M
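
A sketch showing what the last two tr commands do to a sample string (the string is arbitrary). The third command is the "rot13" rotation, so applying it twice gives back the original.

```shell
lower=$(echo 'Hello, World' | tr A-Z a-z)
rot13=$(echo 'Hello, World' | tr a-zA-Z n-za-mN-ZA-M)
back=$(echo "$rot13" | tr a-zA-Z n-za-mN-ZA-M)

echo "$lower"    # hello, world
echo "$rot13"    # Uryyb, Jbeyq
echo "$back"     # Hello, World
```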

head, tail

    last | head
    tail /var/log/messages
    tail -40 /var/log/messages
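
A sketch with five lines of made-up input, so the effect is easy to see. It uses the "-n count" spelling; "tail -40" above is the older way of writing "tail -n 40".

```shell
first_two=$(printf '%s\n' 1 2 3 4 5 | head -n 2)
last_two=$(printf '%s\n' 1 2 3 4 5 | tail -n 2)

printf '%s\n' "$first_two"    # lines 1 and 2
printf '%s\n' "$last_two"     # lines 4 and 5
```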

sort

    sort
    sort -k2
    sort -n
    sort -n -k3
lots of other options, such as -f (case-insensitive) and -r (reverse)
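
A sketch comparing these on three made-up lines. The data is chosen so that the plain sort gives the same order whatever your locale settings are.

```shell
data='9 apple
10 pear
200 fig'

lexical=$(printf '%s\n' "$data" | sort)       # compares characters: "10" < "200" < "9"
numeric=$(printf '%s\n' "$data" | sort -n)    # compares numbers: 9 < 10 < 200
by_name=$(printf '%s\n' "$data" | sort -k2)   # sorts on the second field onward

printf '%s\n' "$lexical"
printf '%s\n' "$numeric"
printf '%s\n' "$by_name"
```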

uniq

    tr -cs a-zA-Z0-9 '\012' <file | tr A-Z a-z | sort | uniq -c
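
A sketch running the same pipeline on one invented sentence, with a "sort -rn | head -n 1" added on the end to pick out the most frequent word. The second command shows why the sort is needed: uniq only merges duplicate lines which are adjacent.

```shell
text='the cat and the hat and the bat'

top=$(echo "$text" | tr -cs a-zA-Z0-9 '\012' | tr A-Z a-z |
      sort | uniq -c | sort -rn | head -n 1)
echo "$top"        # count 3, word "the" (uniq -c pads the count with spaces)

# Without sort, no two equal words are adjacent, so uniq removes nothing.
unsorted=$(( $(echo "$text" | tr -cs a-zA-Z0-9 '\012' | uniq | wc -l) ))
echo "$unsorted"   # 8
```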

sed

    s/Fred/Wilma/
    s/Fred/Wilma/g
    s/Fred[a-z]*/Wilma/g
    5d, 10q, /pat/d
regular expressions: . (any single character), [...] (any one of the listed characters), * (zero or more of the preceding item)
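
The lines above are sed commands; each is given to sed as an argument. A sketch on invented input:

```shell
line='Fred and Freddy met Fred'

first=$(echo "$line" | sed 's/Fred/Wilma/')          # first match on the line only
every=$(echo "$line" | sed 's/Fred/Wilma/g')         # every match on the line
words=$(echo "$line" | sed 's/Fred[a-z]*/Wilma/g')   # also swallows trailing lowercase letters

echo "$first"    # Wilma and Freddy met Fred
echo "$every"    # Wilma and Wilmady met Wilma
echo "$words"    # Wilma and Wilma met Wilma

deleted=$(printf '%s\n' one two three | sed 2d)      # delete line 2
quit=$(printf '%s\n' one two three | sed 2q)         # quit after line 2
nopat=$(printf '%s\n' one two three | sed '/t/d')    # delete lines matching /t/
```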


Here are some other fundamental unix tools:

echo

provide output, e.g.
    echo Please enter repeat count:
    echo -n 'Please enter repeat count: '
-> note how it takes any number of arguments, outputs them separated by spaces.

Use "tr" to convert x's to y's in xylophone:
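
One way to do it, presumably along the lines intended here, is to let echo supply the input and tr do the conversion:

```shell
result=$(echo xylophone | tr x y)
echo "$result"    # yylophone
```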

cat

various options depending on unix version, such as -n to number the lines, -s to eliminate multiple blank lines; note that a plain "cat" is just a buffer, used as a data-wise no-op
(cat actually is a filter, could be in the section above)

ls

ls dir or ls file; ls -d to avoid descending into a directory
use xargs to make it read stdin in any interesting way
-a, -l, -i, -q, -t, -r
how options combine: ls -lart
ls strangely (and unsimply) acts differently by default based on whether its output is a "tty" or not, but there are options -C to force columnar output and -1 to force one file per line (mnemonic: "one column")

cp

either 2 args or multi args plus directory; -p, -r

mv

similar argument forms to cp; always behaves as if -p were given

rm

-r, -f

cmp

also cmp -l (list every byte which differs, not just the first)

diff

also diff -b (ignore changes in the amount of white space), also -c (context diff)

comm

-> students enrolled in CSC 209 before and after the drop date (fictional)
% comm -1 students newstudentlist
% comm -12 students newstudentlist
% comm -13 students newstudentlist
% comm -23 students newstudentlist
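
A sketch with two small invented class lists, written to a scratch directory (this assumes "mktemp -d" is available). Both files must already be sorted for comm to work.

```shell
dir=$(mktemp -d)
printf '%s\n' alice bob carol dave > "$dir/students"        # before the drop date
printf '%s\n' alice carol erin    > "$dir/newstudentlist"   # after the drop date

stayed=$(comm -12 "$dir/students" "$dir/newstudentlist")    # lines in both files
added=$(comm -13 "$dir/students" "$dir/newstudentlist")     # lines only in the second file
dropped=$(comm -23 "$dir/students" "$dir/newstudentlist")   # lines only in the first file

printf '%s\n' "$stayed"     # alice, carol
printf '%s\n' "$added"      # erin
printf '%s\n' "$dropped"    # bob, dave

rm -rf "$dir"
```

The digits name the columns to suppress: column 1 is lines only in the first file, column 2 is lines only in the second, and column 3 is lines in both.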

join

join newstudentlist grades
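
A sketch with invented data (again assuming "mktemp -d"). join matches lines on their first field by default, and both files must be sorted on that field.

```shell
dir=$(mktemp -d)
printf '%s\n' alice carol erin > "$dir/newstudentlist"
printf '%s\n' 'alice 85' 'bob 60' 'carol 91' > "$dir/grades"

# Only names which appear in both files are output.
joined=$(join "$dir/newstudentlist" "$dir/grades")
printf '%s\n' "$joined"

rm -rf "$dir"
```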

idea of "-" file name: many commands accept "-" as a file name meaning "read the standard input here"
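
A sketch of "-" in use, with invented names and a scratch directory from "mktemp -d" (assumed available). The unsorted list is sorted on the fly and handed to comm as its first "file".

```shell
dir=$(mktemp -d)
printf '%s\n' alice carol erin > "$dir/newstudentlist"

# "-" tells comm to read its first input from the pipe.
common=$(printf '%s\n' carol bob alice | sort | comm -12 - "$dir/newstudentlist")
printf '%s\n' "$common"

rm -rf "$dir"
```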

-> Summary: small programs that do one thing well.


Find

find /u/ajr/web/270/example -name mergen.c -print
find /u/ajr/web/270/notes -mtime -30 -exec ls -ld '{}' ';'
find /u/ajr/web/270/notes -type f -mtime -30 -exec ls -ld '{}' ';'
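
To experiment with find without those directories, here is a sketch which builds a tiny tree in a scratch directory. The file names are made up and "mktemp -d" is assumed to be available.

```shell
dir=$(mktemp -d)
mkdir "$dir/example" "$dir/notes"
touch "$dir/example/mergen.c" "$dir/notes/week1" "$dir/notes/week2"

# Search the whole tree for a file with this exact name.
by_name=$(find "$dir" -name mergen.c -print)

# Everything under notes modified within the last 30 days, including
# the directory itself...
recent_all=$(( $(find "$dir/notes" -mtime -30 -print | wc -l) ))

# ...and the same, restricted to plain files by "-type f".
recent_files=$(( $(find "$dir/notes" -type f -mtime -30 -print | wc -l) ))

echo "$by_name"
echo "$recent_all $recent_files"    # 3 2

rm -rf "$dir"
```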


A little more about the shell

We can further our understanding of command-line arguments by attempting to cat a file called "-a":

% cat -a
cat: illegal option -- a
usage: cat [-benstuv] [-] [file ...]
% 
Note that the "-" detection is lexical; file names have more complex semantics.

So use another path name which refers to the same file, but does not have the property that the zeroth character of the string is '-'.

Example 1: cat ./-a
Example 2: cat /u/ajr/-a

Another method: getopt(), the library function used to parse the command-line options, treats the argument "--" as meaning "that's all the options", after which you can safely say "-a" to refer to a file named "-a".

cat -- -a
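
A sketch which creates such a file in a scratch directory and reads it both ways (the file contents are invented, and "mktemp -d" is assumed to be available):

```shell
old=$PWD
dir=$(mktemp -d)
cd "$dir" || exit 1
echo 'hello from -a' > ./-a

via_path=$(cat ./-a)       # the argument no longer begins with '-'
via_dashes=$(cat -- -a)    # "--" ends the options, so -a is a file name
echo "$via_path"
echo "$via_dashes"

cd "$old" && rm -rf "$dir"
```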

Trivial shell scripts via "sh file":

"sh" (the shell) will take file names as arguments just like all those filters above, and will process the contents of the file just like typing it in.

Example shell script which compiles gcd.c and tests it with several arguments:

gcc -Wall gcd.c
./a.out 3 4
./a.out 12 0
./a.out 0 12
./a.out 12 18
./a.out 18 12

If this is in a file "testgcd", execute the list of commands by saying "sh testgcd".
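
If you don't have gcd.c handy, the same idea can be tried with a sketch like this one. The script name "demo" and its contents are invented, and "mktemp -d" is assumed to be available.

```shell
dir=$(mktemp -d)

# Put two ordinary commands in a file...
cat > "$dir/demo" <<'EOF'
echo first command
echo second command | tr a-z A-Z
EOF

# ...and have sh process the file just like typing them in.
output=$(sh "$dir/demo")
printf '%s\n' "$output"

rm -rf "$dir"
```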

I/O redirection:

