Bourne Shell: Chapter 6a

    6. UNIX TOOLS

    6.1 Processes

    A process is the execution of a command by UNIX. Processes can also be executed by the operating system itself. Like the file structure, the process structure is hierarchical: it contains parents, children, and even a root. A parent can fork (or spawn) a child process, and that child can, in turn, fork other processes. When the operating system starts, the first thing it does is create a single process with PID number 1 (PID stands for Process IDentification). This process holds the same position in the process structure as the root directory holds in the file structure: it is the ancestor of every process that a user works with. It forks a process for each terminal, and each of these processes becomes a Shell process when a user logs on.

    6.2 Executing a Command

    When you give a command to the Shell, it will fork a process to execute the command. While the child process is executing the command, the parent will go to sleep. Sleeping means that the process will not use any CPU time. It remains inactive until it is awakened. When the child process has finished executing the command, it dies. The parent process, which is running the Shell, wakes up and prompts you for another command. When you request a process to run in the background (by ending the command line with an &), the Shell forks a child process but does not go to sleep while the child runs. The parent process reports the PID of the child process and then prompts you for another command right away. The child and parent are now independent processes.
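    The foreground and background cases described above can be sketched in a few lines. This is only an illustration: the sleep commands stand in for real work, and the $! variable (PID of the most recent background command) and the wait command are assumptions brought in for the example, since the text above does not cover them.

```shell
# Foreground: the shell sleeps until the child process has finished.
sleep 1

# Background: the shell forks the child and carries on immediately.
sleep 1 &
bg_pid=$!                       # PID of the background child
echo "started background process $bg_pid"

# Optional: put the shell to sleep until that particular child ends.
wait "$bg_pid"
bg_status=$?
echo "background process $bg_pid finished with status $bg_status"
```

    Without the wait, the script would simply continue while the background child keeps running on its own.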

    6.3 Process Identification

    The Unix operating system assigns a unique process identification number (PID) to each process. A process keeps the same PID for as long as it exists. During one session, the same process is always executing the login Shell. When you execute another command, a new process is forked and a new PID is assigned to that process. When that child process is finished you are returned to the login process, which is running the Shell, and that parent process has the same PID as when you logged on. The Shell stores its PID in a Shell variable called $$. The PID can also be shown with the process status (ps) command. The format for ps is as follows:
     
        Command Format:  ps [options]                              
                                                                   
        See on-line manual for options                             
    
    With no options given the ps command will give you certain information about processes associated with the controlling terminal. The output consists of a short listing containing the process id, terminal id, cumulative execution time, and the command name. Otherwise, options will control the display.

    Sample session:

        $echo $$
        8347
        $ps
          PID TTY      TIME COMMAND
         8347 rt021a0  0:03 ksh
         8376 rt021a0  0:06 ps
        $

    The PID numbers of the Shell are the same in the sample session because the shell will substitute its own PID number for $$. The Shell makes the substitution before it forks a new process to execute the echo command. Therefore, echo will display the PID number of the process that called it, not the PID of the process that is executing it.

    The -l option will display more information about the processes.

    Sample session:

        $ps -l
            F S UID  PID PPID  C PRI NI    ADDR  SZ WCHAN TTY     TIME COMD
        f0000 S 115 8347  309  2  30 20 1009000 140 94014 rt021a0 0:03 ksh
        f0000 O 115 8386 8347 16  68 20 1308000  72       rt021a0 0:01 ps
        $ps -l
            F S UID  PID PPID  C PRI NI    ADDR  SZ WCHAN TTY     TIME COMD
        f0000 S 115 8347  309  1  30 20 1009000 140 94014 rt021a0 0:03 ksh
        f0000 O 115 8387 8347 26  73 20 1146000  72       rt021a0 0:01 ps
        $
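    The point about when $$ is substituted can be checked with a small experiment. This is a sketch, not part of the original session: the variable names are made up for the example, and sh -c is used only to start a second shell whose own PID can be compared.

```shell
parent_pid=$$

# The current shell replaces $$ before echo ever runs, so echo prints
# the PID of the shell that called it.
seen_by_echo=$(echo $$)

# Single quotes hide $$ from the current shell. The new sh process does
# the substitution itself and reports its own, different PID.
child_pid=$(sh -c 'echo $$')

echo "this shell: $parent_pid"
echo "echo saw:   $seen_by_echo"
echo "new shell:  $child_pid"
```

    The first two numbers should agree, while the third belongs to the separately forked shell.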

    6.4 grep: A Pattern Matching Filter

    The grep utility can search through a file to see if it contains a specified string of characters. The utility will not change the file it searches but displays each line that contains the string. The format for the command is:
     
      Command Format:  grep [options] limited_regular-expression [file]         
                                                                                
        Use the man command for a complete list of options                      
    
    The grep utility searches files for a pattern and displays all lines that contain the pattern. It uses limited regular expressions (these are expressions that have string values that use a subset of all the possible alphanumeric and special characters) like those used with ed to match the patterns. Be careful using the characters $, *, [, ^, |, (, ), and \ in the regular expression, because they also have special meaning to the shell and may be evaluated by it before grep sees them. It is good practice to enclose the regular expression in single quotes. This will prevent the shell from evaluating these special characters. The grep utility will assume standard input if no files are given. Normally, each line found in the file will be displayed to standard output.

    Sample session:

        $grep 'disc' memo

    This command will search the file "memo" for the string "disc". It will include words like discover and indiscreet because they contain the characters "disc". The single quote marks are not necessary here, and for this example they wouldn't have made any difference. They do allow you to include spaces in the search pattern.
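    A runnable version of this search might look like the following sketch. The contents of the memo file are invented for the illustration, and mktemp is assumed to be available for creating a scratch file.

```shell
# Hypothetical memo file with three lines of sample text.
memo=$(mktemp)
printf '%s\n' \
    'We hope to discover the cause soon.' \
    'That remark was indiscreet.' \
    'Nothing else to report today.' > "$memo"

# Every line containing the characters "disc" is printed; the file
# itself is left unchanged.
matches=$(grep 'disc' "$memo")
echo "$matches"

# With no file name, grep reads standard input instead.
from_stdin=$(grep 'disc' < "$memo")

rm -f "$memo"
```

    The first two lines are selected because "discover" and "indiscreet" both contain "disc"; the third line is skipped.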

    6.4.1 More on Regular Expressions

    The grep command can be best understood by a discussion of regular expressions. Let's create a database of phone numbers called phone.lis and then use regular expressions to search through the database. Here is a listing of the contents of phone.lis:

    Sample session:

        $cat phone.lis
        Smith, Joan     7-7989
        Adams, Fran     2-3876
        StClair, Fred   4-6122
        Jones, Ted      1-3745
        Stair, Rich     5-5972
        Benson, Sam     4-5587
        $

    The format for the records in this database is:

        Last name, First name    #-####

    A tab character separates the name from the phone number.

    Using the database (phone.lis) above, what grep command would we use to search through the database and get all the records that have a person whose name contains an "S"? An alphabetic character represents itself.

    Sample session:

        $grep S phone.lis
        Smith, Joan     7-7989
        StClair, Fred   4-6122
        Stair, Rich     5-5972
        Benson, Sam     4-5587
        $

    This grep command searched for the string "S" and then listed all the lines in phone.lis that matched.

    A single . (dot) is used to represent any single character.

    Sample session:

        $grep .S phone.lis
        Benson, Sam     4-5587
        $

    A $ represents the end of the line.

    Sample session:

        $grep 5$ phone.lis
        Jones, Ted      1-3745
        $

    A ^ represents the beginning of the line.

    Sample session:

        $grep ^S phone.lis
        Smith, Joan     7-7989
        StClair, Fred   4-6122
        Stair, Rich     5-5972
        $

    Regular expressions must get to grep in order for them to be evaluated properly. Let's say we want to get the records of employees that have a phone number that begins with a "4". What does the following expression do? (The white space in front of the 4 was typed as a tab character.)

    Sample session:

        $grep     4 phone.lis
        StClair, Fred   4-6122
        Jones, Ted      1-3745
        Benson, Sam     4-5587
        $

    Why did we get the record of Ted Jones? The shell treated the tab character as ordinary white space between arguments and threw it away, so the search was actually made for a "4" alone. This is the same as if we had entered $grep 4 phone.lis. We must prevent the shell from evaluating these characters. This is done with the \ (backslash) character, typed directly in front of the tab, as shown in the next example.

    Sample session:

        $grep \    4 phone.lis
        StClair, Fred   4-6122
        Benson, Sam     4-5587
        $

    Now it worked properly. It searched for a tab character followed by the number 4.

    The [] (left and right brackets) are used to identify a set of characters.

    Sample session:

        $grep \[AF] phone.lis
        Adams, Fran     2-3876
        StClair, Fred   4-6122
        $

    Why do [] need to be quoted? The shell also uses brackets to generate file names, so an unquoted [AF] could be replaced by the name of a file called A or F in the current directory before grep ever sees it. In the previous example the search makes a match on "A" or "F".

    A - (dash) can indicate inclusion. For example, we want to make a match on a phone number that has a 1, 2, 3, or 4. How can this be done? Here's an example:

    Sample session:

        $grep \[1-4] phone.lis
        Adams, Fran     2-3876
        StClair, Fred   4-6122
        Jones, Ted      1-3745
        Stair, Rich     5-5972
        Benson, Sam     4-5587
        $

    A ^ character looks for all characters NOT inside the [] brackets. For example,

        [^0-9]      matches all non-digits
        [^a-zA-Z]   matches all non-alphabetic characters

    Note: \, *, and $ lose their meta-character meanings inside the []. Also the ^ character is special only if it appears first.

    What is the following command searching for?

    Sample session:

        $grep '[^789]$' phone.lis
        Adams, Fran     2-3876
        StClair, Fred   4-6122
        Jones, Ted      1-3745
        Stair, Rich     5-5972
        $
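    The quoting problem with the tab character can be reproduced in a script. The sketch below rests on some assumptions: it rebuilds phone.lis in a scratch file made with mktemp, it assumes a single tab between name and number, and it keeps the tab in a variable (here called tab) because a literal tab is hard to see in a listing. Expanding that variable without quotes lets the shell discard the tab, much as it does with a tab typed on the command line.

```shell
# Scratch copy of phone.lis, one tab between name and number (assumed).
phone=$(mktemp)
printf '%s\t%s\n' \
    'Smith, Joan'   7-7989 \
    'Adams, Fran'   2-3876 \
    'StClair, Fred' 4-6122 \
    'Jones, Ted'    1-3745 \
    'Stair, Rich'   5-5972 \
    'Benson, Sam'   4-5587 > "$phone"

tab=$(printf '\t')              # a single tab character

# Deliberately unquoted: the shell drops the tab, so grep searches for "4".
plain_4=$(grep ${tab}4 "$phone")

# Quoted: the tab reaches grep, which searches for tab followed by "4".
tab_4=$(grep "${tab}4" "$phone")

# Bracket expressions, quoted so the shell leaves them alone.
set_af=$(grep '[AF]' "$phone")
not_789=$(grep '[^789]$' "$phone")

printf 'unquoted:\n%s\nquoted:\n%s\n' "$plain_4" "$tab_4"
rm -f "$phone"
```

    The unquoted search also picks up the Jones record, because its number merely contains a 4. The quoted search returns only the two numbers that begin with 4.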

    6.4.2 Closure

    The * (asterisk) represents zero or more of the character preceding the asterisk.

        A*        represents 0 or more As.
        AA*       represents 1 or more As.
        [0-9]*$   represents 0 or more digits at the end of a line (last four digits in a phone number).
        .*        represents 0 or more of any character.

    How would you write a grep command using regular expressions to find the last name starting with an "S" and the first name with an "F"?

        ^S        Begins with an "S"
        .*, F     Any number of characters before ", F"

    Sample session:

        $grep ^S.\*,\ F phone.lis
        StClair, Fred   4-6122
        $

    Note: The * (asterisk) and the space were each quoted with a backslash so the shell didn't try to evaluate them. It is very desirable to quote the entire string to keep the shell from doing an expansion or substitution. It also increases readability of the regular expression, as in the following example:

    Sample session:

        $grep '^S.*, F' phone.lis
        StClair, Fred   4-6122
        $
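    The two ways of quoting a closure pattern can be compared side by side. As before, this is a sketch under assumptions: phone.lis is rebuilt in a scratch file created with mktemp, with a tab assumed between name and number, and the variable names are made up for the example.

```shell
# Scratch copy of phone.lis, one tab between name and number (assumed).
phone=$(mktemp)
printf '%s\t%s\n' \
    'Smith, Joan'   7-7989 \
    'Adams, Fran'   2-3876 \
    'StClair, Fred' 4-6122 \
    'Jones, Ted'    1-3745 \
    'Stair, Rich'   5-5972 \
    'Benson, Sam'   4-5587 > "$phone"

# Whole pattern in single quotes: the shell passes it on untouched.
quoted=$(grep '^S.*, F' "$phone")

# Same pattern with the asterisk and the space escaped one at a time.
escaped=$(grep ^S.\*,\ F "$phone")

# x* also accepts zero x's, so every line in the file qualifies.
zero_ok=$(grep 'x*' "$phone")

echo "$quoted"
rm -f "$phone"
```

    Both forms hand grep the same pattern. The last search is a reminder that a closure on its own never rules a line out, which is why AA* rather than A* is needed to demand at least one A.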

    6.4.3 Some Nice grep Options

    The grep utility provides several options that modify how the search is performed.

        -c   Report count of matching lines only
        -v   Print those lines that don't match the pattern

    What will these lines print?

    Sample session:

        $grep -c '[J-Z]' phone.lis
        5
        $

    Why did we get this result? Let's analyze the command. In English, this command could be interpreted to mean "Tell me how many records in the file phone.lis contain a letter from the set J through and including Z." Look at the phone.lis file and see that five records fit this restriction. So the answer is 5. Now look at another example and see what this one does.

    Sample session:

        $grep -v '[J-Z]' phone.lis
        Adams, Fran     2-3876
        $

    Why is this the only record that was found? The -v option says to select records that don't match the pattern. This is the same pattern as in the previous example, so this time the records that don't match it are selected. The "Adams" record is the only one that doesn't make a match. It doesn't have a character from the set J through Z.
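    The -c and -v examples can be tried out with the sketch below. It assumes a scratch copy of phone.lis built with mktemp and a tab between name and number. It also sets LC_ALL=C, an addition that is not in the original session, so that the range J-Z covers only the upper-case letters regardless of the local language settings.

```shell
# Scratch copy of phone.lis, one tab between name and number (assumed).
phone=$(mktemp)
printf '%s\t%s\n' \
    'Smith, Joan'   7-7989 \
    'Adams, Fran'   2-3876 \
    'StClair, Fred' 4-6122 \
    'Jones, Ted'    1-3745 \
    'Stair, Rich'   5-5972 \
    'Benson, Sam'   4-5587 > "$phone"

# -c: report only how many lines contain an upper-case letter J through Z.
count=$(LC_ALL=C grep -c '[J-Z]' "$phone")

# -v: print the lines that do NOT contain such a letter.
others=$(LC_ALL=C grep -v '[J-Z]' "$phone")

echo "matching lines: $count"
echo "non-matching:   $others"
rm -f "$phone"
```

    Since the file has six records and five of them match, the -v search is left with exactly one line.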

    6.4.4 Summary of Regular Expression Characters

        ^        Beginning of the line
        $        End of the line
        *        0 or more of the preceding character
        .        Any single character
        [...]    A range of characters
        [^...]   Exclusion range of characters
