################################################################
####
#### This function is a korn shell emulation of the "awk" capability
#### of extracting and reordering fields from an input stream.
####
#### Syntax: k93awk [-FX] '{ print $# ... }' < file
####
#### Where -FX allows the field delimiter to be defined as some
#### character X.
####
#### One or more fields can be returned based on the positional
#### parameters defined in the required parameter. The "print"
#### statement is required to simulate the "awk" syntax.
####
################################################################
function k93awk {
    [[ "_${1:?${0}: undefined parameter}" = _-F? ]] && IFS="${1#-F}" && shift
    VARS="${1// /}"
    VARS="${VARS/#\{print(*)\}/print -r -- \1}"
    while read -r -- LINE
    do  eval set -- ${LINE}
        eval eval "\"${VARS//,/ }\""
    done
}
This function is the functional equivalent of using awk to extract and
reorder fields from a stream of data. It assumes the same type of
arguments used with the awk command will be used with this function,
i.e.:
awk -F: '{ print $3, $2, $1 }'
Ask the reader to interpret the script line by line, and character by
character as necessary.
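For reference, the awk invocation shown above can be tried on a couple
of lines of input. This is only a small sketch: the sample records and
the variable name "reordered" are made up for illustration.

```shell
# Sample input is invented; any colon-delimited text would do.
reordered=$(printf '%s\n' 'root:x:0' 'daemon:x:1' | awk -F: '{ print $3, $2, $1 }')
printf '%s\n' "${reordered}"
# 0 x root
# 1 x daemon
```

The k93awk function is meant to produce the same output when called
with the same arguments in a ksh93 shell.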
Line 1: function k93awk {
Korn shell method of defining a function. The POSIX method looks like:
k93awk () {
}
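The two definition styles can be put side by side. This is a minimal
sketch with hypothetical function names (greet_ksh, greet_posix); it
runs in ksh93 and in bash, which also accepts the "function" keyword.

```shell
# Korn shell style definition
function greet_ksh {
    printf '%s\n' "ksh style: ${1}"
}

# POSIX style definition
greet_posix() {
    printf '%s\n' "POSIX style: ${1}"
}

greet_ksh one       # ksh style: one
greet_posix two     # POSIX style: two
```

In ksh93 the two forms are not fully interchangeable: a function
defined with the "function" keyword gets its own ${0} and its own
typeset variables, which is why k93awk can use ${0} in its error
message.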
Line 2: [[ "_${1:?${0}: undefined parameter}" = _-F? ]]
&& IFS="${1#-F}"
&& shift
Double square bracket test performing a string comparison of
positional parameter 1 against a string pattern. The test checks
whether the first positional parameter of the function consists of
"-F" followed by exactly one character. If the test is true, a field
delimiter was specified on the command line and the IFS variable is set
using a value extracted from positional parameter 1. Positional
parameter 1 is then "shifted" off the command line.
- string 1: "_${1:?${0}: undefined parameter}"
- Double quotes allow variable substitution between the quotes.
Double quotes suppress the special meaning of all metacharacters except
$ (dollar sign), \ (backslash), ` (backtick), and " (double quote).
- Uses positional parameter 1 = function command line argument 1.
- Curly braces delimit the parameter name. They are required here
because the test operator ":?" follows the name; the unbraced form $1
is a shorthand that only covers simple substitution.
- Performs a variable test on $1 to test for "unset or null"
condition.
- If $1 is unset or null, the error message, following the variable
test operator ":?", is printed to standard error and the script exits.
${0} is replaced with the function name.
- The underscore at the beginning of the string is added to both
sides of the "=" to avoid an error in the event of a null string.
- string 2: _-F? (unquoted, so that "?" is treated as a pattern character)
- The underscore at the beginning of the second string is to match
the underscore at the beginning of the first string.
- The pattern -F? translates to a - (dash), followed by a capital
F, followed by any single character represented by the "?".
- The double ampersand is a boolean "and" that says if the previous
expression is true, execute the next expression in the sequence.
- IFS="${1#-F}"
Sets the IFS (Internal Field Separator) variable to positional
parameter 1 using the "delete smallest matching pattern from the left"
operator. The deleted pattern is the "-F", which leaves the field
delimiter character specified when the function was called.
- Again, the double ampersand is a boolean "and" that says if the
previous expression is true, execute the next expression in the
sequence.
- shift
Deletes 1 positional parameter off the command line and shifts
all remaining positional parameters one position to the left.
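The option handling described for line 2 can be sketched in isolation.
The helper name parse_delim and the variable DELIM are hypothetical;
DELIM stands in for IFS so that the sketch does not change field
splitting in the calling shell.

```shell
# parse_delim is a hypothetical helper that mirrors line 2 of k93awk.
parse_delim() {
    DELIM=' '       # assumed default when no -FX is given
    [[ "_${1:?parse_delim: undefined parameter}" = _-F? ]] && DELIM="${1#-F}" && shift
    printf '%s\n' "delimiter=[${DELIM}] program=[${1}]"
}

parse_delim -F: '{ print $1 }'    # delimiter=[:] program=[{ print $1 }]
parse_delim '{ print $1 }'        # delimiter=[ ] program=[{ print $1 }]

# With no argument the ":?" test aborts; a subshell contains the exit.
( parse_delim ) 2>/dev/null || printf '%s\n' 'missing argument rejected'
```

When the pattern does not match, the "&&" chain simply stops, so the
first argument is left in place and treated as the print statement.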
Line 3: VARS="${1// /}"
Sets the variable "VARS" to the value of positional parameter 1 after
performing a substitution of all occurrences of a space with nothing.
This removes all spaces from the value of $1 before assigning it to the
variable VARS.
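The effect of this substitution can be seen on a sample argument. This
is a small sketch; the variable PROGRAM is a hypothetical stand-in for
positional parameter 1.

```shell
# PROGRAM plays the role of $1 in the function.
PROGRAM='{ print $3, $2, $1 }'
VARS="${PROGRAM// /}"          # replace every space with nothing
printf '%s\n' "${VARS}"        # {print$3,$2,$1}
```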
Line 4: VARS="${VARS/#\{print(*)\}/print -r -- \1}"
Performs another substitution on the value contained in the variable
VARS. Since all spaces were removed by the previous substitution, all
characters will be scrunched together. This substitution replaces the
pattern only if it matches at the beginning of the string (the "/#"
operator anchors the match there). The search pattern is a literal
opening curly brace followed by the word "print", followed by the
marked pattern of 0 or more characters, followed by a literal closing
curly brace. This search pattern is substituted with the replacement
string "print -r -- ", followed by the contents of the first marked
pattern in the search, referenced as \1. This substitution replaces the
awk "print" command with the equivalent korn shell print command and
leaves the $1,$2,$3 ... as is. The marked pattern contains the
$1,$2,$3 ... specified by the programmer.
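The \1 back-reference in the replacement string is a ksh93 feature.
The same result can be reached with a different technique, prefix and
suffix stripping, which also works in other shells. The sketch below is
that alternative, not the method used in the function; the variable
names PREFIX, SUFFIX and BODY are hypothetical.

```shell
VARS='{print$3,$2,$1}'          # value left by the previous line
PREFIX='{print'
SUFFIX='}'
BODY=${VARS#"${PREFIX}"}        # strip the leading {print
BODY=${BODY%"${SUFFIX}"}        # strip the trailing }
VARS="print -r -- ${BODY}"
printf '%s\n' "${VARS}"         # print -r -- $3,$2,$1
```

Either way, VARS ends up holding a korn shell print command followed
by the field references, still separated by commas.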
Line 5: while read -r -- LINE
Executes the "read" command as the test statement in a while loop.
Reads one entire line from standard input and assigns it to the variable
LINE. The while loop will continue looping until the read command hits
an EOF.
The options "-r" and "--" are used with the read command because the
contents of the input stream are unknown. This will cause the read
command to process each line exactly as written without interpreting
escape or option characters.
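The difference the "-r" option makes can be shown with a line that
contains backslashes. This is a minimal sketch; the function names
keep_raw and cook are hypothetical, and the "--" is left out here for
brevity.

```shell
# keep_raw and cook are hypothetical names used only for this comparison.
keep_raw() { while read -r LINE; do printf '%s\n' "${LINE}"; done; }
cook()     { while read LINE;    do printf '%s\n' "${LINE}"; done; }

printf '%s\n' 'C:\temp\new' | keep_raw    # C:\temp\new
printf '%s\n' 'C:\temp\new' | cook        # C:tempnew
```

Without "-r" each backslash is consumed as an escape character, so the
data that reaches LINE is no longer what was in the input stream.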
Line 6: do eval set -- ${LINE}
The "do" command is a required keyword of the while construct and
signifies the beginning of the loop commands. The first executable
statement of the loop follows the do command and is the "set"
statement, run through "eval".
Because ${LINE} is not quoted, the shell splits its value into words
using the characters in IFS. The "set" statement assigns each of those
words to a positional parameter beginning at $1. The "--" following the
"set" statement indicates there are no more options following, meaning
that if a word begins with a dash (-), it will not be treated as an
option to the "set" statement. The "eval" causes the resulting "set"
command line to be processed by the shell a second time, so input
containing shell metacharacters will be interpreted rather than taken
literally.
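The field splitting performed by "set" can be sketched on its own. The
helper name third_field is hypothetical, the sample lines are made up,
and the "eval" is omitted so that only the splitting is shown.

```shell
# third_field is a hypothetical helper; the parentheses run it in a
# subshell so the IFS change does not leak into the caller.
third_field() (
    IFS="${1}"          # delimiter, as -FX would set it
    LINE="${2}"
    set -- ${LINE}      # unquoted on purpose: split LINE on IFS
    printf '%s\n' "${#} fields, third is ${3}"
)

third_field : 'root:x:0:0'      # 4 fields, third is 0
third_field ' ' '-x -y -z'      # 3 fields, third is -z
```

The second call shows the purpose of the "--": without it, a line
beginning with "-x" would be taken as an option to "set".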
Line 7: eval eval "\"${VARS//,/ }\""
This is a complex statement interpreted in three steps. The eval
command causes the statement to be reprocessed by the shell. Two eval
commands cause the statement to be reprocessed a third time.
In the first shell interpretation of the command, a substitution is
performed on the contents of the VARS variable. All commas are
substituted with a space. This is because "awk" uses commas to separate
the items of a print statement, which are output with a space between
them by default.
In the second shell interpretation of the command, the $1 $2 $3 ...
specified in the command line argument to the function are substituted
with the values of the positional parameters created with the "set"
command in the previous statement.
In the third shell interpretation of the command, the korn shell "print"
statement is executed. This "print" command was substituted originally
back at line 4.
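The three interpretations can be followed in a small sketch. The helper
name reorder is hypothetical, and "echo" is used in place of the korn
shell "print -r --" so that the sketch also runs in bash.

```shell
# reorder is a hypothetical helper: $1 is an awk-style field list, the
# remaining arguments play the role of the fields created by "set".
reorder() {
    VARS="echo ${1}"                    # e.g. echo $3,$2,$1
    shift
    # pass 1: ${VARS//,/ } becomes  echo $3 $2 $1  inside literal quotes
    # pass 2: the first eval expands $3 $2 $1 to the field values
    # pass 3: the second eval runs the resulting echo command
    eval eval "\"${VARS//,/ }\""
}

reorder '$3,$2,$1' a b c        # c b a
reorder '$2,$1' hello world     # world hello
```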
Line 8: done
The "done" statement ends the while loop.