CS 279 - Week 5 Lab - 9-19-12

Wildcards in pathnames, part 2
-------------------------------
*   [cset]
    *   this matches any single character in cset

    [moxie] - matches a single m, o, x, i, or e

    *   you can have ranges --
        [0-9] same as [0123456789]
        [a-z] and [A-Z] and [d-f]
        and these match any single element from the range,
        inclusive

    *   there are predefined sets (character classes):
        each name is written as an opening square bracket,
        a colon, the name, another colon, and a closing
        square bracket -- and all of THAT goes WITHIN a
        set of square brackets (possibly with other options)

        [[:digit:]]
        same as [0-9] 
        same as [0123456789]

        [[:alpha:]] will match the same set as
            [a-zA-Z] plus any other characters considered
            letters in your locale

        [[:upper:]] and [[:lower:]] match the upper
            and lowercase letters, respectively

        [[:space:]] matches "whitespace" chars such as
            space, newline, and more
        [[:blank:]] matches space and tab

        [[:cntrl:]] matches control characters

        (see more on pp. 75-76)

    *   you can put a ! as the first character inside the
        square brackets to indicate you want to match a
        single character that ISN'T one of those

        starts with an uppercase letter,
        ends with anything BUT a digit:

        [A-Z]*[![:digit:]]
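
*   a minimal sketch for trying the cset patterns above
    (the scratch directory and the file names are made up
    for illustration; LC_ALL=C is set on the assumption
    that you want [A-Z] to mean strictly uppercase):

```shell
export LC_ALL=C                  # keep [A-Z] strictly uppercase
demo=$(mktemp -d)                # scratch directory, safe to delete later
cd "$demo" || exit 1
touch BAGEL Bagel37X Bagel7 Gz G bagel oops7

# starts with an uppercase letter, ends with anything BUT a digit
matches=$(echo [A-Z]*[![:digit:]])
echo "$matches"                  # BAGEL Bagel37X Gz
```

    G is left out because the pattern needs at least two
    characters: one for [A-Z] and one for [![:digit:]]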
        
*   note the following limitations on these wildcards:
    *   a / in the actual pathname MUST be matched by an
        explicit / in the pattern (not a wildcard)

    *   a . in the actual pathname that comes at the 
        beginning or follows a / must similarly
        be matched by an explicit . in the pattern
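
*   a small sketch of those two limitations (again in a
    made-up scratch directory; the file names are only
    examples):

```shell
export LC_ALL=C
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir gn
touch gnu.1 gn.1 gn/x.1 .a file.a

flat=$(echo gn*.1)        # * does not cross the / in gn/x.1
nested=$(echo */*.1)      # an explicit / in the pattern reaches it
nodot=$(echo *.[acAC])    # * does not match the leading . of .a
dot=$(echo .[acAC])       # an explicit leading . does

echo "$flat"              # gn.1 gnu.1
echo "$nested"            # gn/x.1
echo "$nodot"             # file.a
echo "$dot"               # .a
```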

EXAMPLES...
   gn*.1     gnu.1 - YES
             gneiss.1 - YES
             geewhiz.1 - NO
             gn.1 - YES
             gnu - NO
             gn/x.1 - NO

  ~/.[[:alpha:]]*   ~/.login - YES
                    ~/..login - NO
                    ~/.mailrc - YES
                    ~/login - NO

  */doit*           one/doit - YES
                    two/doit.c - YES
                    three/doit.cpp - YES
                    doit - NO

  zz?               zz1 - YES
                    zztop - NO
                    zz - NO

  [A-Z]*[![:digit:]] BAGEL - YES
                     Bagel# - YES
                     bagel - NO
                     Bagel37X - YES
                     oops7 - NO
                     G - NO
                     Gz - YES

   *.[acAC]          file.a - YES
                     .a   <-- NO?! <-- correct, .a can only
                                   be matched by a pattern
                                   starting with an explicit .
                     file - NO
                     stuff.ac - NO

*   this is not built into the UNIX file mechanism
    -- it is recognized by various shells
*   those shells call it by different names --
    pathname expansion, filename expansion, globbing, and
    maybe more
*   the shell expands the pattern into a sorted list of
    matching filenames, each passed as a separate argument,
    BEFORE the command is run
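
*   one way to see that the shell does the expanding, not
    the command -- a sketch in which count_args is a
    made-up helper that only reports how many arguments
    it was handed:

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch one.c two.c "my file.c"

# hypothetical helper: prints the number of arguments it received
count_args() { echo "$#"; }

n=$(count_args *.c)          # the shell hands over the matching names
literal=$(count_args '*.c')  # quoted, so the pattern is passed as-is
echo "$n $literal"           # 3 1
```

    the helper never sees the * itself unless it is quoted;
    note that "my file.c" still counts as ONE argument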

STANDARD FILES and REDIRECTION
*   many UNIX commands accept input from the keyboard
    and write output to the screen.

*   each UNIX process gets 3 files associated with it:
    standard input     (stdin,  file descriptor 0),
    standard output    (stdout, file descriptor 1),
    and standard error (stderr, file descriptor 2)

*   by default, standard input is the keyboard
    by default, standard output is the screen
    by default, standard error is the screen

*   UNIX lets you redirect standard output with a
    > -- when you put > and a file name at the end of a
    command, that command's standard output goes into the
    file you named (overwriting it if it already exists)
    (you redirected standard output to that file)

    UNIX lets you redirect standard input with a <
    -- when you put < and a desired input file name after
    a command, that command reads its standard input from
    that file instead of the keyboard
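
    a short sketch of both (fruit.txt and sorted.txt are
    made-up file names, created in a scratch directory):

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# standard output goes into fruit.txt instead of the screen
printf 'pear\napple\nfig\n' > fruit.txt

# sort reads standard input from fruit.txt instead of the keyboard
sort < fruit.txt

# both at once: input from one file, output into another
sort < fruit.txt > sorted.txt
cat sorted.txt
```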

*   in bash, 
    (and some other shells)
    you can redirect standard error by putting 2>
    (2 followed by > with NOOOO space in between)
    followed by the file you want standard error to
    go into

    ls *.c 2> errors.txt
    little.sh 2> little-errors
    
    hey, and 2>> appends the error messages to a stated 
    file
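
    a sketch of 2> versus 2>> (no-such-dir and also-missing
    are made-up names, assumed NOT to exist, so that ls has
    something to complain about):

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# each ls fails; its complaint goes into errors.txt, not onto the screen
ls no-such-dir  2>  errors.txt || echo "first ls failed"
ls also-missing 2>> errors.txt || echo "second ls failed"

error_lines=$(wc -l < errors.txt)
echo $error_lines        # 2: one message from each ls
```

    with 2> on the second command instead of 2>>, only the
    second message would be left in errors.txt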

*   what's the precedence between redirection and pipes?

    from p. 57:
    "first the redirections are associated with commands,
    and then the commands with their redirections are
    passed through the pipes"

    grep "pest" phones | sort > hangups

    *   first > hangups is associated with sort,
        then grep is done and its output becomes
        sort's input, with sort's output going into
        hangups...
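
    *   a sketch to try this out (the contents of phones
        here are made up; matched and empty are
        hypothetical file names used only for the contrast):

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'zed pest 555-0101\nann ok 555-0102\nbob pest 555-0103\n' > phones

# > hangups belongs to sort only; grep still writes into the pipe
grep "pest" phones | sort > hangups
cat hangups

# contrast: here > matched belongs to grep, so nothing goes
# down the pipe and sort has no input to write into empty
grep "pest" phones > matched | sort > empty
```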

*   filter - a program that expects standard input
    and produces standard output -- these kinds
    of programs are great for pipes!
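
    a tiny sketch of a filter (shout is a made-up shell
    function, not a standard command; it just wraps tr):

```shell
# hypothetical filter: reads standard input, writes an
# uppercased copy to standard output
shout() { tr 'a-z' 'A-Z'; }

result=$(printf 'pest\nant\n' | sort | shout)
echo "$result"
```

    this prints ANT and then PEST; because shout only
    touches stdin and stdout, it can sit anywhere in a pipe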


*   Quotation 

    *   how do you indicate when special characters
        are special and when they aren't?

        metacharacters is another term for special
        characters...

    *   you can quote or escape them when you don't
        want them to be special

    *   3 rules that generally hold between the different
        shells:

        1. a backslash \ generally acts as an escape
           for the special character following it

           (or it may GIVE a special meaning to 
           a NON-special character following it)
           ^ these are predefined...

           echo \$TERM
           $TERM
           echo $TERM
           xterm-256color

           echo -e "hi there! \n how are you?"
           hi there!
            how are you?

        2. when text is enclosed in double quotes,
           MOST special characters are just data characters,
           BUT those that specify substitutions -- such
           as $ -- are still special, and behave as special

           echo "my term is $TERM"
  
        3. when text is enclosed in single quotes,
           all characters are treated as data characters
           (no substitutions)

           echo 'my term is $TERM'
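
    the three rules side by side, as a sketch (TERM is set
    to xterm here only so the output is predictable; yours
    will likely differ):

```shell
TERM=xterm

escaped=$(echo \$TERM)               # rule 1: \ makes $ a plain character
double=$(echo "my term is $TERM")    # rule 2: $TERM is still substituted
single=$(echo 'my term is $TERM')    # rule 3: no substitution at all

echo "$escaped"                      # $TERM
echo "$double"                       # my term is xterm
echo "$single"                       # my term is $TERM
```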