Notes on sed

ed

When you open a file using ed, it displays the number of characters in the file and positions you at the last line.

By default, a command affects only the current line.

Commands:

p: print, d: delete

n: move to nth line

Move to line containing word: /word/

/regular/d, 1d

g/regular/d: delete all the lines that contain the regular expression

Substituting text:

[address]s/pattern/replacement/flag

s/regular/complex/g

This command changes all occurrences on the current line.

To make it apply to all lines, use the global command, putting g before the address.

g/regular/s/regular/complex/g

The "g" at the beginning is the global command that means make the changes on all lines matched by the address. The "g" at the end is a flag that means change each occurrence on a line, not just the first.

If the address and the pattern are the same, it can be written like:

g/regular/s//complex/g

ed test < ed-script

g/re/p stands for "global regular expression print."


Sed

SYNOPSIS

-n, --quiet, --silent suppress automatic printing of pattern space

-e script, --expression=script add the script to the commands to be executed

-f script-file, --file=script-file

add the contents of script-file to the commands to be executed

-i[suffix], --in-place[=suffix]

edit files in place (makes backup if extension supplied)

-l N, --line-length=N

specify the desired line-wrap length for the `l' command

-r, --regexp-extended

use extended regular expressions in the script.

-s, --separate

consider files as separate rather than as a single continuous long stream.

-u, --unbuffered

load minimal amounts of data from the input files and flush the output buffers more often

COMMAND SYNOPSIS

Zero-address “commands”

: label Label for b and t commands.

#comment The comment extends until the next newline (or the end of a -e script fragment).

} The closing bracket of a { } block.

Zero- or One- address commands

= Print the current line number.

a \

text

Append text, which has each embedded newline preceded by a backslash.

i \

text

Insert text, which has each embedded newline preceded by a backslash.

q

Immediately quit the sed script without processing any more input, except that if auto-print is not disabled the current pattern space will be printed.

Q

Immediately quit the sed script without processing any more input.

r filename Append text read from filename.

R filename Append a line read from filename.

Commands which accept address ranges

{ Begin a block of commands (end with a }).

b label Branch to label; if label is omitted, branch to end of script.

t label

If a s/// has done a successful substitution since the last input line was read and since the last t or T command, then branch to label; if label is omitted, branch to end of script.

T label

If no s/// has done a successful substitution since the last input line was read and since the last t or T command, then branch to label; if label is omitted, branch to end of script.

c \

text

Replace the selected lines with text, which has each embedded newline preceded by a backslash.

d

Delete pattern space. Start next cycle.

D

Delete up to the first embedded newline in the pattern space. Start next cycle, but skip reading from the input if there is still data in the pattern space.

h H

Copy/append pattern space to hold space.

g G

Copy/append hold space to pattern space.

x

Exchange the contents of the hold and pattern spaces.

l

List out the current line in a “visually unambiguous” form.

n N

Read/append the next line of input into the pattern space.

p

Print the current pattern space.

P(capital)

Print up to the first embedded newline of the current pattern space.

s/regexp/replacement/

Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement. The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.

w filename

Write the current pattern space to filename.

W filename

Write the first line of the current pattern space to filename.

y/source/dest/

Transliterate the characters in the pattern space which appear in source to the corresponding character in dest.

Sed was meant to execute scripts exclusively and cannot be used interactively. Sed differs from ed primarily in that it is stream-oriented. By default, all of the input to sed passes through and goes to standard output. The input file itself is not changed.

Sed is stream-oriented, and ed is not. In ed, a command without an address affects only the current line. But Sed goes through the file, a line at a time, such that each line becomes the current line, and the commands are applied to it. The result is that sed applies a command without an address to every line in the file.

In ed you use addressing to expand the number of lines that are the object of a command; in sed, you use addressing to restrict the number of lines affected by a command.

Command-Line Syntax

sed [-e] 'instruction' file

The -e option is necessary only when you supply more than one instruction on the command line. It tells sed to interpret the next argument as an instruction. When there is a single instruction, sed is able to make that determination on its own.

sed 's/MA/Massachusetts/' list

To specify multiple instructions on the command line:

sed 's/ MA/, Massachusetts/; s/ PA/, Pennsylvania/' list

sed -e 's/ MA/, Massachusetts/' -e 's/ PA/, Pennsylvania/' list

sed -f scriptfile inputfile

Scripting

In sed and awk, each instruction has two parts: a pattern and a procedure. The pattern is a regular expression delimited with slashes (/). A procedure specifies one or more actions to be performed.

Saving output

sed -f sedscr list > newlist

Do not redirect the output to the file you are editing or you will clobber it. (The “>” redirection operator truncates the file before the shell does anything else.) If you want the output file to replace the input file, do that as a separate step, using the mv command.

Suppressing automatic display of input lines

The default operation of sed is to output every input line. The -n option suppresses the automatic output. When specifying this option, each instruction intended to produce output must contain a print command, p.

sed -n -e 's/MA/Massachusetts/p' list
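A minimal sketch of the effect of -n, using a made-up two-line input in place of the list file (which is not shown in these notes). Without -n the matching line should appear twice, once from the p flag and once from the default output; with -n only the substituted line is printed.

```shell
# Hypothetical sample data standing in for the "list" file.
input=$(printf 'John Daggett, Boston MA\nAlice Ford, Tulsa OK\n')

# Default output plus the p flag: the changed line is printed twice.
without_n=$(printf '%s\n' "$input" | sed 's/MA/Massachusetts/p')

# -n suppresses the default output, so only the p flag prints.
with_n=$(printf '%s\n' "$input" | sed -n 's/MA/Massachusetts/p')

printf '%s\n---\n%s\n' "$without_n" "$with_n"
```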

Mixing options (POSIX)

You can build up a script by combining both the -e and -f options on the command line. The script is the combination of all the commands in the order given.

Using sed and awk Together

awk -F, '{
    print $4 ", " $0
}' $* |
sort |
awk -F, '
$1 == LastState {
    print "\t" $2
}
$1 != LastState {
    LastState = $1
    print $1
    print "\t" $2
}'


Basic principles of how sed works:

All editing commands in a script are applied in order to each line of input.

Commands are applied to all lines (globally) unless line addressing restricts the lines affected by editing commands.

The original input file is unchanged; the editing commands modify a copy of each input line, and the copy is sent to standard output.


Sed is implicitly global: it applies commands to every input line.

Line addresses are used to supply context for, or restrict, an operation.

/Sebastopol/s/CA/California/g


A sed command can specify zero, one, or two addresses. An address can be a regular expression describing a pattern, a line number, or a line addressing symbol.

If no address is specified, then the command is applied to each line.

If there is only one address, the command is applied to any line matching the address.

If two comma-separated addresses are specified, the command is performed on the first line matching the first address and all succeeding lines up to and including a line matching the second address.

If an address is followed by an exclamation mark (!), the command is applied to all lines that do not match the address.

1d, $d

The line number refers to an internal line count maintained by sed. This counter is not reset for multiple input files. Thus, no matter how many files were specified as input, there is only one line 1 and only one last line in the input stream. Last line can be specified using the addressing symbol $.
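A small sketch of the shared line counter, assuming two throwaway files created only for this demonstration (the names first and second are hypothetical). With both files on the command line, line 3 is the first line of the second file and $ matches only the very last line of the combined input.

```shell
# Two scratch files in a temporary directory (illustrative names).
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/first"
printf 'b1\nb2\n' > "$dir/second"

# Line numbers and $ are counted across the whole input stream, not per file.
first_line=$(sed -n '1p' "$dir/first" "$dir/second")
third_line=$(sed -n '3p' "$dir/first" "$dir/second")
last_line=$(sed -n '$p' "$dir/first" "$dir/second")

echo "$first_line $third_line $last_line"
rm -rf "$dir"
```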

/^$/d deletes only blank lines

50,$d, 1,/^$/d


An exclamation mark (!) following an address reverses the sense of the match. The following script deletes all lines except those inside tbl input:

/^\.TS/,/^\.TE/!d
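To see the negated range at work, here is a sketch run against a made-up troff fragment (the sample text is an assumption, not from the book). Only the lines from .TS through .TE should survive.

```shell
# Hypothetical input containing one tbl block.
input=$(printf 'intro text\n.TS\ncol1 col2\n.TE\nclosing text\n')

# Delete every line that is NOT inside the .TS/.TE range.
tbl_only=$(printf '%s\n' "$input" | sed '/^\.TS/,/^\.TE/!d')

printf '%s\n' "$tbl_only"
```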


Grouping Commands

Braces ({}) are used in sed to nest one address inside another or to apply multiple commands at the same address. You can nest addresses if you want to specify a range of lines and then, within that range, specify another address.

To delete blank lines only inside blocks of tbl input, use the following command:

/^\.TS/,/^\.TE/{
/^$/d
s/^\.ps 10/.ps 8/
s/^\.vs 12/.vs 10/
}

The opening curly brace must end a line and the closing curly brace must be on a line by itself. Be sure there are no spaces after the braces.

Basic sed Commands: d (delete), a (append), i (insert), and c (change)

[address]command

[line-address]command

Placing multiple commands on the same line is highly discouraged because sed scripts are difficult enough to read even when each command is written on its own line.

Comment #

Substitution

[address]s/pattern/replacement/flags

where the flags that modify the substitution are:

n A number (1 to 512) indicating that a replacement should be made for only the nth occurrence of the pattern.

g Make changes globally on all occurrences in the pattern space. Normally only the first occurrence is replaced.

p Print the contents of the pattern space.

w file Write the contents of the pattern space to file.

In the replacement section, only the following characters have special meaning:

& Replaced by the string matched by the regular expression.

\n Matches the nth substring (n is a single digit) previously specified in the pattern using “\(” and “\)”.

\ Used to escape the ampersand (&), the backslash (\), and the substitution command’s delimiter when they are used literally in the replacement section. In addition, it can be used to escape the newline and create a multiline replacement string.


The print and write flags are typically used when the default output is suppressed (the -n option). In addition, if a script contains multiple substitute commands that match the same line, multiple copies of that line will be printed or written to file.
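The flags and replacement metacharacters above can be tried on a single line. This is only a sketch with invented sample strings; the variable names are illustrative.

```shell
line='one two one two one'

first=$(echo "$line" | sed 's/one/1/')       # no flag: first occurrence only
second=$(echo "$line" | sed 's/one/1/2')     # numeric flag: second occurrence only
all=$(echo "$line" | sed 's/one/1/g')        # g flag: every occurrence
amp=$(echo "$line" | sed 's/two/[&]/g')      # & stands for the matched text

# \1 and \2 refer to the two \( \) groups in the pattern.
swap=$(echo 'key=value' | sed 's/\(.*\)=\(.*\)/\2=\1/')

printf '%s\n' "$first" "$second" "$all" "$amp" "$swap"
```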

Append, Insert, and Change

The syntax of these commands is unusual for sed because they must be specified over multiple lines.

append [line-address]a\

text

insert [line-address]i\

text

change [address]c\

text


Each of these commands requires a backslash following it to escape the first end of line. The text must begin on the next line. To input multiple lines of text, each successive line must end with a backslash, with the exception of the very last line.

/<Larry’s Address>/i\

4700 Cross Court\

French Lick, IN

The append and insert commands can be applied only to a single line address, not a range of lines. The change command, however, can address a range of lines. In this case, it replaces all addressed lines with a single copy of the text.

/^From /,/^$/c\
<Mail Header Removed>

removes the entire mail-message header and replaces it with the line “<Mail Header Removed>.” Note that you will see the opposite behavior when the change command is one of a group of commands, enclosed in braces, that act on a range of lines.

/^From /,/^$/{
s/^From //p
c\
<Mail Header Removed>
}

will output “<Mail Header Removed>” for each line in the range.


The change command clears the pattern space, having the same effect on the pattern space as the delete command. No command following the change command in the script is applied.

The insert and append commands do not affect the contents of the pattern space. The supplied text will not match any address in subsequent commands in the script, nor can those commands affect the text. No matter what changes occur to alter the pattern space, the supplied text will still be output appropriately. This is also true when the default output is suppressed — the supplied text will be output even if the pattern space is not. Also, the supplied text does not affect sed’s internal line counter.
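A short sketch of insert, append, and change applied to single-line addresses, using a made-up three-line input. The inserted text should appear before the matching line, the appended text after it, and the changed line should be replaced outright.

```shell
# Hypothetical three-line input.
input=$(printf 'first\nsecond\nthird\n')

# Each command ends with a backslash; its text starts on the next line.
result=$(printf '%s\n' "$input" | sed '
/second/i\
inserted before second
/second/a\
appended after second
/third/c\
third was changed
')

printf '%s\n' "$result"
```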


List

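The list command (l) displays the contents of the pattern space, showing nonprinting characters as ASCII codes. The book describes these as two-digit codes; POSIX specifies three-digit octal escapes (for example \033 for the escape character).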

Transform

[address]y/abc/xyz/

The replacement is made by character position, “a” is replaced by “x” anywhere on the line, regardless of whether or not it is followed by a “b”.
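A quick sketch contrasting y with s on an invented string. The y command maps each character independently, so it changes characters even though the string "abc" never appears; the equivalent s command leaves the line alone.

```shell
text='a cab, a bag'

# y maps a->x, b->y, c->z one character at a time.
transformed=$(echo "$text" | sed 'y/abc/xyz/')

# s looks for the literal string "abc", which does not occur here.
substituted=$(echo "$text" | sed 's/abc/xyz/')

printf '%s\n' "$transformed" "$substituted"
```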

Print command (p)

/^\.Ah/{
p
s/ "//g
s/^\.Ah //p
}

The substitute command’s print flag differs from the print command in that it is conditional upon a successful substitution.


Print Line Number

An equal sign (=) following an address prints the line number of the matched line.

[line-address]=

This command cannot operate on a range of lines.

#n print line number and line with if statement

/ if/{

=

p

}

As the first line of a script, #n suppresses the default output of lines, like the -n option.

Next

The next command (n) outputs the contents of the pattern space and then reads the next line of input without returning to the top of the script. Its syntax is:

[address]n

The next command changes the normal flow of control: it causes the next line of input to replace the current line in the pattern space. Subsequent commands in the script are applied to the replacement line, not the current line. If the default output has not been suppressed, the current line is printed before the replacement takes place.

/^\.H1/{
n
/^$/d
}

Match any line beginning with the string ‘.H1’, then print that line and read in the next line. If that line is blank, delete it.
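Here is the same script run against a small invented document, as a sketch. Only the blank line directly after the .H1 line should disappear; the later blank line is untouched because no .H1 precedes it.

```shell
# Hypothetical input: a heading, a blank line, then body text with another blank line.
input=$(printf '.H1 "Heading"\n\nbody text\n\nmore text\n')

out=$(printf '%s\n' "$input" | sed '/^\.H1/{
n
/^$/d
}')

printf '%s\n' "$out"
```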

Reading and Writing Files

[line-address]r file

[address]w file

The read command reads the contents of file into the pattern space after the addressed line. It cannot operate on a range of lines. The write command writes the contents of the pattern space to the file.

The read command will not complain if the file does not exist. The write command will create a file if it does not exist; if the file already exists, the write command will overwrite it each time the script is invoked. If there are multiple instructions writing to the same file in one script, then each write command appends to the file.

sed '$r closing' $* | pr | lp

/^<Company-list>/r company.list
/^<Company-list>/d

/Northeast$/{

s///

w region.northeast

}

The substitute command matches the same pattern as the address and removes it.

Quit(q)

The quit command (q) causes sed to stop reading new input lines (and stop sending them to the output). Its syntax is:

[line-address]q

sed '100q' test

sed -n "
/^\.de *$mac/,/^\.\.$/{
p
/^\.\.$/q
}" $file
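A minimal sketch of quit on a generated five-line input (the data is made up). The first command behaves like head -2; the second prints the first matching line and stops reading, which is the same idea as the macro-extraction script above.

```shell
# Stop after the second line; lines 1 and 2 are printed by the default output.
first_two=$(printf '1\n2\n3\n4\n5\n' | sed '2q')

# With -n, print the first line containing 3 and quit without reading further.
first_match=$(printf '1\n2\n3\n4\n5\n' | sed -n '/3/{
p
q
}')

printf '%s\n---\n%s\n' "$first_two" "$first_match"
```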


Advanced sed Commands

Multiline Pattern Space

Sed has the ability to look at more than one line in the pattern space. This allows you to match patterns that extend over multiple lines. The three multiline commands (N,D,P) all correspond to lowercase basic commands (n,d,p).


Append Next Line:N

The multiline Next (N) command creates a multiline pattern space by reading a new line of input and appending it to the contents of the pattern space. The original contents of pattern space and the new input line are separated by a newline. The embedded newline character can be matched in patterns by the escape sequence “\n”. In a multiline pattern space, the metacharacter “^” matches the very first character of the pattern space, and not the character(s) following any embedded newline(s). Similarly, “$” matches only the final newline in the pattern space, and not any embedded newline(s). After the Next command is executed, control is then passed to subsequent commands in the script.

The Next command differs from the next command, which outputs the contents of the pattern space and then reads a new line of input. The next command does not create a multiline pattern space.

s/Owner and Operator Guide/Installation Guide/

/Owner/{

N

s/ *\n/ /

s/Owner and Operator Guide */Installation Guide\

/

}


$!N excludes the last line ($) from the Next command.


Multiline Delete

The delete command (d) deletes the contents of the pattern space and causes a new line of input to be read with editing resuming at the top of the script. The Delete command (D) works slightly differently: it deletes a portion of the pattern space, up to the first embedded newline. It does not cause a new line of input to be read; instead, it returns to the top of the script, applying these instructions to what remains in the pattern space.


# reduce multiple blank lines to one; version using d command

/^$/{

N

/^\n$/d

}

When a blank line is encountered, the next line is appended to the pattern space. Then we try to match the embedded newline. Note that the positional metacharacters, ^ and $, match the beginning and the end of the pattern space, respectively.


With this script, where there is an even number of consecutive blank lines, all of them are removed; only when the number is odd is a single blank line preserved. That is because the delete command clears the entire pattern space. Once the first blank line is encountered, the next line is read in, and both are deleted. If a third blank line is encountered, and the next line is not blank, the delete command is not applied, and thus a blank line is output. If we use the multiline Delete command (D rather than d), we get the result we want:

/^$/{
N
/^\n$/D
}
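The difference between the two versions can be checked on a tiny input with two consecutive blank lines (an invented example). With d both blank lines should vanish; with D exactly one should remain.

```shell
# Version using d: an even run of blank lines is removed entirely.
with_d=$(printf 'a\n\n\nb\n' | sed '/^$/{
N
/^\n$/d
}')

# Version using D: only the first of the two lines is dropped, and the script
# restarts on what remains, so one blank line is kept.
with_D=$(printf 'a\n\n\nb\n' | sed '/^$/{
N
/^\n$/D
}')

printf '%s\n---\n%s\n' "$with_d" "$with_D"
```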


Multiline Print:(P)

The multiline Print command outputs the first portion of a multiline pattern space, up to the first embedded newline.

The Print command(P) frequently appears after the Next command and before the Delete command. These three commands can set up an input/output loop that maintains a two-line pattern space yet outputs only one line at a time. The purpose of this loop is to output only the first line in the pattern space, then return to the top of the script to apply all commands to what had been the second line in the pattern space.

/UNIX$/{

N

/\nSystem/{

s// Operating &/

P

D

}

}


Examples

# Scribe font change script.

s/@f1(\([^)]*\))/\\fB\1\\fR/g

/@f1(.*/{

N

s/@f1(\(.*\n[^)]*\))/\\fB\1\\fR/g

P

D

}

This can be translated as: Once making a substitution across two lines, print the first line and then delete it from the pattern space. With the second portion remaining in the pattern space, control passes to the top of the script where we see if there is an “@f1(” remaining on the line.


Hold That Line(hold space)

The pattern space is a buffer that contains the current input line. There is also a set-aside buffer called the hold space. The contents of the pattern space can be copied to the hold space and the contents of the hold space can be copied to the pattern space. A group of commands allows you to move data between the hold space and the pattern space. The hold space is used for temporary storage.

The most frequent use of the hold space is to have it retain a duplicate of the current input line while you change the original in the pattern space.

Command Abbreviation Function

Hold h or H Copy or append contents of pattern space to hold space.

Get g or G Copy or append contents of hold space to pattern space.

Exchange x Swap contents of hold space and pattern space.


Each of these commands can take an address that specifies a single line or a range of lines.

# Reverse flip

/1/{

h

d

}

/2/{

G

}

The hold command followed by the delete command is a fairly common pairing. Without the delete command, control would reach the bottom of the script and the contents of the pattern space would be output.
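A sketch of the reverse-flip script on made-up input in which lines of 1s alternate with lines of 2s. Each 1-line is held and deleted, then appended after the following 2-line, so each pair should come out swapped.

```shell
# Hypothetical input: 1-lines alternating with 2-lines.
flipped=$(printf '1\n2\n11\n22\n111\n222\n' | sed '/1/{
h
d
}
/2/{
G
}')

printf '%s\n' "$flipped"
```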


A Capital Transformation

# capitalize statement names

/the .* statement/{

h

s/.*the \(.*\) statement.*/\1/

y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/

G

s/\(.*\)\n\(.*the \).*\( statement.*\)/\2\1\3/

}


Building Blocks of Text

The hold space can be used to collect a block of lines before outputting them.

/^$/!{
H
d
}
/^$/{
x
s/^\n/<p>/
s/$/<\/p>/
G
}


Examples:

sed -n '10,20p' access.log


Resources

sed & awk, 2nd Edition

http://linux.about.com/od/commands/l/blcmdl1_sed.htm


Notes on Shell Programming

Special Parameters

$#

The number of arguments passed to the program; or the number of parameters set by executing the set statement

$*

Collectively references all the positional parameters as $1, $2, ...

$@

Same as $*, except when double-quoted ("$@") collectively references all the positional parameters as "$1", "$2", ...

$0

The name of the program being executed

$$

The process id number of the program being executed

grep -v "$name" phonebook > /tmp/phonebook$$

$!

The process id number of the last program sent to the background for execution

$?

The exit status of the last command not executed in the background

$-

The current option flags in effect


Other Variables Used by the Shell

CDPATH

The directories to be searched whenever cd is executed without a full path as argument.

ENV

The name of a file that the shell executes in the current environment when started interactively.

FCEDIT

The editor used by fc. If not set, ed is used.

HISTFILE

If set, it specifies a file to be used to store the command history. If not set or if the file isn't writable, $HOME/.sh_history is used.

HISTSIZE

If set, specifies the number of previously entered commands accessible for editing. T

HOME

IFS

The Internal Field Separator characters; used by the shell to delimit words when parsing the command line, for the read and set commands, when substituting the output from a back-quoted command, and when performing parameter substitution. Normally, it contains the three characters space, horizontal tab, and newline.

PATH

PPID

The process id number of the program that invoked this shell (that is, the parent process).

PS1: The primary command prompt, normally "$ ".

PS2: The secondary command prompt, normally "> ".

PWD

Parameter Substitution

$parameter or ${parameter}

To access the tenth and greater arguments, you must write ${10}.

shift

Whatever was previously stored inside $2 is assigned to $1, $3 to $2, and so on. The old value of $1 is irretrievably lost. When this command is executed, $# (the number of arguments) is also automatically decremented by one.

You can shift more than one "place" at once by writing a count immediately after shift, as in

shift 3
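A small sketch of both points, wrapped in a hypothetical function show_args so the positional parameters of the calling script are left alone. It assumes a POSIX shell, where "$10" is read as ${1} followed by a literal 0.

```shell
# show_args is an illustrative helper, not a standard command.
show_args() {
    tenth_wrong="$10"       # parsed as ${1} followed by the character 0
    tenth_right="${10}"     # braces are needed for the tenth argument
    before_shift=$#
    shift 3                 # drop $1..$3; the old $4 becomes $1
    after_shift=$#
    new_first=$1
}

show_args a b c d e f g h i j
echo "$tenth_wrong $tenth_right $before_shift $after_shift $new_first"
```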

${parameter:-value}

Substitute the value of parameter if it's set and non-null; otherwise, substitute value.

${parameter-value}

Substitute the value of parameter if it's set; otherwise, substitute value.

${parameter:=value}

Substitute the value of parameter if it's set and non-null; otherwise, substitute value and also assign it to parameter.

${parameter=value}

Substitute the value of parameter if it's set; otherwise, substitute value and also assign it to parameter.

${parameter:?value}

Substitute the value of parameter if it's set and non-null; otherwise, write value to standard error and exit. If value is omitted, write parameter: parameter null or not set instead.

${parameter?value}

Substitute the value of parameter if it's set; otherwise, write value to standard error and exit. If value is omitted, write parameter: parameter null or not set instead.

${parameter:+value}

Substitute value if parameter is set and non-null; otherwise, substitute null.

${parameter+value}

Substitute value if parameter is set; otherwise, substitute null.

${#parameter}

Substitute the length of parameter. If parameter is * or @, the result is not specified.

${parameter#pattern}

Substitute the value of parameter with pattern removed from the left side. The smallest portion of the contents of parameter matching pattern is removed. Shell filename substitution characters (*, ?, [...], !, and @) may be used in pattern.

${parameter##pattern}

Same as #pattern except the largest matching pattern is removed.

${parameter%pattern}

Same as #pattern except pattern is removed from the right side.

${parameter%%pattern}

Same as ##pattern except the largest matching pattern is removed from the right side.

Quoting

'…' Removes special meaning of all enclosed characters

"…" Removes special meaning of all enclosed characters except $, `, and \

\c

Removes special meaning of the character c that follows; inside double quotes it removes the special meaning of $, `, ", newline, and \ when one of them follows, but is otherwise not interpreted; used for line continuation if it appears as the last character on a line (the newline is removed).

Using the Backslash for Continuing Lines

The Backslash Inside Double Quotes

The backslash inside these quotes removes the meaning of characters that otherwise would be interpreted inside double quotes (that is, other backslashes, dollar signs, back quotes, newlines, and other double quotes). If the backslash precedes any other character inside double quotes, the backslash is ignored by the shell and passed on to the program.

Command Substitution

The Back Quote `command`

The $(...) Construct $(command)

variable=$(command)

The expr Command (for older systems)

expr 10 + 20 / 2

Each operator and operand given to expr must be a separate argument.

expr 17 * 6

Here the shell sees the unquoted * and substitutes the names of all the files in the current directory, so expr receives a list of filenames instead of the operator. Escape the asterisk:

expr 17 \* 6

Use command substitution to assign the output from expr back to a variable:

i=$(expr $i + 1)    # add 1 to i

Like the shell's built-in integer arithmetic, expr evaluates only integer expressions. Use awk or bc if you need floating-point calculations.

Decisions

user="$1"

if who | grep "^$user " > /dev/null

then

echo "$user is logged on"

fi

if command

then

command

else

command

fi


if command

then

command

elif command2

then

elif commandn

then

else

fi

The test Command

test String Operators

Operator Returns TRUE (exit status of 0) if

string1 = string2, string1 != string2, string

-n string

string is not null (and string must be seen by test).

-z string

string is null (and string must be seen by test).

An Alternative Format for test

[ expression ]

Spaces must appear after the [ and before the ].

[ "$name" = julio ]

Integer Operators

int1 -eq int2

int1 -ge int2

int1 -gt int2

int1 -le int2

int1 -lt int2

int1 -ne int2

File Operators

-d file file is a directory.

-e file file exists.

-f file file is an ordinary file.

-r file file is readable by the process.

-s file file has nonzero length.

-w file file is writable by the process.

-x file file is executable.

-L file file is a symbolic link.

The Logical Negation Operator !

The Logical AND Operator -a

The Logical OR Operator -o

[ -f "$mailfile" -a -r "$mailfile" ]

Parentheses

Use parentheses in a test expression to alter the order of evaluation; and make sure that the parentheses are quoted because they have a special meaning to the shell.

[ \( "$count" -ge 0 \) -a \( "$count" -lt 10 \) ]

The exit Command

The case Command

hour=$(date +%H)

case "$hour"

in

0? | 1[01] ) echo "Good morning";;

1[2-7] ) echo "Good afternoon";;

* ) echo "Good evening";;

esac

The -x Option for Debugging Programs

You can trace the execution of any program by typing sh -x followed by the name of the program.

The Null Command :

The && and || constructs implement short-circuit logic: they enable you to execute a command based on whether the preceding command succeeds or fails.

sort bigdata > /tmp/sortout && mv /tmp/sortout bigdata

[ -z "$EDITOR" ] && EDITOR=/bin/ed

grep "$name" phonebook || echo "Couldn't find $name"
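A minimal sketch of both constructs, using invented variables (editor, found) so nothing in the real environment is changed. The command after && runs only when the test succeeds; the command after || runs only when grep fails.

```shell
# editor is a stand-in for EDITOR so the real variable is left alone.
editor=""
[ -z "$editor" ] && editor=/bin/ed          # test succeeds, so the assignment runs

# grep -q prints nothing and only sets the exit status.
found=yes
echo "apple" | grep -q "pear" || found=no   # grep fails, so the assignment runs

echo "$editor $found"
```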


The for Command

for i in 1 2 3

do

echo $i

done

The $@ Variable

Whereas the shell replaces the value of $* with $1, $2, ..., if you instead use the special shell variable "$@" it will be replaced with "$1", "$2", ... .
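The difference shows up as soon as an argument contains a space. This sketch uses two hypothetical helper functions (count_args and demo) and assumes the default IFS.

```shell
# count_args simply reports how many arguments it received.
count_args() { echo "$#"; }

demo() {
    unquoted_star=$(count_args $*)    # split again on spaces: 3 arguments
    quoted_at=$(count_args "$@")      # original grouping kept: 2 arguments
}

demo "first arg" second
echo "$unquoted_star $quoted_at"
```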

The for Without the List

for var

do

command

done

The shell automatically sequences through all the arguments typed on the command line, just as if you had written

for var in "$@"

do

command

done

The while Command

The while loop is often used in conjunction with the shift command to process a variable number of arguments typed on the command line.

while [ "$#" -ne 0 ]

do

echo "$1"

shift

done

The until Command

until who | grep "^$user " > /dev/null

do

sleep 60

done

break

When the break is executed, control is sent immediately out of the loop, where execution then continues as normal with the command that follows the done.

break n: the n innermost loops are exited immediately.

continue n: the commands in the innermost n loops are skipped, but execution of the loops then continues as normal.

Executing a Loop in the Background

An entire loop can be sent to the background for execution simply by placing an ampersand after the done.

for file in memo[1-4]

do

run $file

done &

I/O Redirection on a Loop

Input/output redirected into the loop applies to all commands in the loop that read their data from standard input or write to standard output. You can also redirect the standard error output from a loop by tacking on a 2> file after the done.

You can override redirection of the entire loop's input or output by explicitly redirecting the input and/or output of commands inside the loop. To force input or output of a command to come from or go to the terminal, use the fact that /dev/tty always refers to your terminal.

for file

do

echo "Processing file $file" > /dev/tty

done > output

Piping Data Into and Out of a Loop

A command's output can be piped into a loop, and the entire output from a loop can be piped into another command in the expected manner.

for i in 1 2 3 4

do

echo $i

done | wc -l

Typing a Loop on One Line

for i in 1 2 3 4; do echo $i; done

if commands can also be typed on the same line using a similar format:

if [ 1 = 1 ]; then echo yes; fi

The getopts Command

getopts options variable

The getopts command is designed to be executed inside a loop. Each time through the loop, getopts examines the next command line argument and determines whether it is a valid option. This determination is made by checking to see whether the argument begins with a minus sign and is followed by any single letter contained inside options. If it does, getopts stores the matching option letter inside the specified variable and returns a zero exit status.


If the letter that follows the minus sign is not listed in options, getopts stores a question mark inside variable before returning with a zero exit status. It also writes an error message to standard error.

If no more arguments are left on the command line or if the next argument doesn't begin with a minus sign, getopts returns a nonzero exit status.

To indicate to getopts that an option takes a following argument, you write a colon character after the option letter on the getopts command line.

getopts mt: option


If getopts doesn't find an argument after an option that requires one, it stores a question mark inside the specified variable and writes an error message to standard error. Otherwise, it stores the actual argument inside a special variable called OPTARG.


Another special variable called OPTIND is initially set to one and is updated each time getopts returns to reflect the number of the next command-line argument to be processed.

while getopts mt: option

do

case "$option"

in

m) mailopt=TRUE;;

t) interval=$OPTARG;;

\?) echo "Usage: mon [-m] [-t n] user"

exit 1;;

esac

done

shiftcount=$((OPTIND - 1))

shift $shiftcount

user=$1
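The fragment above can be exercised without a real mon program by wrapping it in a function. This is a sketch: the function name parse_mon_args and the default values (FALSE, 60) are assumptions made for the demonstration, and OPTIND is reset so the function can be called more than once.

```shell
# parse_mon_args is a hypothetical wrapper around the getopts loop.
parse_mon_args() {
    mailopt=FALSE      # assumed default
    interval=60        # assumed default
    OPTIND=1           # restart option parsing for this call

    while getopts mt: option
    do
        case "$option" in
            m)  mailopt=TRUE ;;
            t)  interval=$OPTARG ;;
            \?) echo "Usage: mon [-m] [-t n] user" >&2
                return 1 ;;
        esac
    done

    shift $((OPTIND - 1))   # discard the options that were processed
    user=$1
}

parse_mon_args -m -t 600 fred
echo "$mailopt $interval $user"
```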


Reading and Printing Data

The read Command read variables

You can use the -r option to read to prevent the shell from interpreting the backslash character.

while read -r line

Special echo Escape Characters \c

\c tells echo to leave the cursor right where it is after displaying the last argument and not to go to the next line.

The printf Command

printf "format" arg1 arg2 ...

printf Conversion Specification Modifiers

- Left justify value.

+ Precede integer with + or -.

(space) Precede positive integer with space character.

# Precede octal integer with 0, hexadecimal integer with 0x or 0X.

Width Minimum width of field; * means use next argument as width.

Precision Minimum number of digits to display for integers; maximum number of characters to display for strings; * means use next argument as precision.
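Each modifier in the list above can be checked with a one-line printf. The values are arbitrary, and the brackets are only there to make the padding visible.

```shell
printf '[%-5d]\n' 42           # left justify in a field of 5: [42   ]
printf '[%+d]\n' 42            # explicit sign:                [+42]
printf '[% d]\n' 42            # space before positive value:  [ 42]
printf '[%#o] [%#x]\n' 8 255   # octal and hex prefixes:       [010] [0xff]
printf '[%*d]\n' 6 42          # width taken from an argument: [    42]
printf '[%.4d]\n' 42           # precision on an integer:      [0042]
printf '[%.3s]\n' abcdef       # precision on a string:        [abc]
```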


More on Subshells

A subshell can't change the value of a variable in a parent shell, nor can it change its current directory.

The . Command

The command . file executes the contents of file in the current shell; no subshell is spawned to run it.
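A quick way to see the difference is to run the same one-line file both ways. This sketch uses a scratch file from mktemp; the variable name greeting is hypothetical.

```shell
unset greeting
tmp=$(mktemp)                      # scratch file standing in for "file"
echo 'greeting=hello' > "$tmp"

sh "$tmp"                          # child shell: the assignment is lost
before=${greeting:-unset}

. "$tmp"                           # current shell: the assignment sticks
after=$greeting

rm -f "$tmp"
echo "before=$before after=$after"
```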


The (...) and { ...; } Constructs

They are used to group a set of commands together.

The first form causes the commands to be executed by a subshell, the latter form by the current shell.

For { ...; }, if the commands enclosed in the braces are all to be typed on the same line, a space must follow the left brace, and a semicolon must appear after the last command.
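The two grouping forms can be compared side by side. A minimal sketch with made-up variable names:

```shell
x=outer
start=$PWD

( x=inner; cd / )          # subshell: both changes vanish afterwards
after_parens=$x
if [ "$PWD" = "$start" ]; then same_dir=yes; else same_dir=no; fi

{ x=inner; }               # current shell: the assignment persists
after_braces=$x

echo "after (...): x=$after_parens, directory unchanged: $same_dir"
echo "after {...}: x=$after_braces"
```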

.profile File

/etc/profile and $HOME/.profile

I/O Redirection

< file, > file

>| file

Redirect standard output to file; file is created if it doesn't exist and zeroed if it does; the noclobber (-C) option to set is ignored.

>> file

Like >, only output is appended to file if it already exists.

Redirect the standard output for a command to standard error by writing

command >&2

Collect the standard output and the standard error output from a program into the same file.

command >foo 2>>foo or command >foo 2>&1

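Because the shell evaluates redirections from left to right on the command line, the last example cannot be written as command 2>&1 > foo. That form first points standard error at wherever standard output currently goes (typically the terminal) and only then redirects standard output to foo, so the error output never reaches the file.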
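The ordering rule can be observed directly. In this sketch, emit is a hypothetical helper that writes one line to each stream, and a command substitution stands in for the terminal so the escaped line can be captured.

```shell
emit () { echo out; echo err >&2; }   # hypothetical helper for the demo
tmp=$(mktemp)

emit >"$tmp" 2>&1                     # correct order: both streams go to the file
both=$(cat "$tmp")

leaked=$(emit 2>&1 >"$tmp")           # wrong order: stderr follows the old stdout
only_out=$(cat "$tmp")

rm -f "$tmp"
printf 'correct order, file holds:\n%s\n' "$both"
printf 'wrong order, file holds: %s; escaped: %s\n' "$only_out" "$leaked"
```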

Inline Input Redirection

command <<word

word

The shell uses the lines that follow as the standard input for command, until a line that contains just word is found.

The shell performs parameter substitution for the redirected input data, executes back-quoted commands, and recognizes the backslash character. However, any other special characters, such as *, |, and ", are ignored. If you have dollar signs, back quotes, or backslashes in these lines that you don't want interpreted by the shell, you can precede them with a backslash character. Alternatively, if you want the shell to leave the input lines completely untouched, you can precede the word that follows the << with a backslash.

If the first character that follows the << is a dash (-), leading tab characters in the input will be removed by the shell.
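The difference between a plain and a backslash-quoted terminator word shows up as soon as the input contains a dollar sign. A small sketch; the variable name and the EOF terminator are arbitrary choices.

```shell
name=world

expanded=$(cat <<EOF
hello $name
EOF
)

literal=$(cat <<\EOF
hello $name
EOF
)

echo "$expanded"   # hello world
echo "$literal"    # hello $name
```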

<& digit Standard input is redirected from the file associated with file descriptor digit.

>& digit Standard output is redirected to the file associated with file descriptor digit.

<&- Standard input is closed.

>&- Standard output is closed.

<> file Open file for both reading and writing.

Shell Archives

One or more related shell programs can be put into a single file and then shipped to someone.

cat > program_name <<\THE-END-OF-DATA

THE-END-OF-DATA

The exec Command

Because the exec'ed program replaces the current one, there's one less process hanging around; also, startup time of an exec'ed program is quicker, due to the way the Unix system executes processes.


exec can be used to close standard input and reopen it with any file that you want to read. To change standard input to file:

exec < file

Redirection of standard output is done similarly: exec > report

If you use exec to reassign standard input and later want to reassign it someplace else, you can simply execute another exec. To reassign standard input back to the terminal, you would write

exec < /dev/tty
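When no terminal is available (for example in a script run from cron), an alternative to /dev/tty is to save the original descriptor before reassigning it. This is a sketch of that variation rather than the method described above; file descriptor 3 and the scratch file are arbitrary choices.

```shell
tmp=$(mktemp)

exec 3>&1          # save the current standard output as descriptor 3
exec > "$tmp"      # from here on, standard output goes to the file
echo "into the report"
exec >&3 3>&-      # restore standard output, then close descriptor 3

report=$(cat "$tmp")
rm -f "$tmp"
echo "report contains: $report"
```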


The set Command

General Format: set options args

This command is used to turn on or off options as specified by options. It is also used to set positional parameters, as specified by args.

Each single letter option in options is enabled if the option is preceded by a minus sign (-), or disabled if preceded by a plus sign (+).

set Options

--

The -- option tells set not to interpret any subsequent arguments on the command line as options. It also prevents set from displaying all your variables if no other arguments follow.

-a

Automatically export all variables that are subsequently defined or modified.

-v

Print each shell command line as it is read.

-x

Print each command and its arguments as it is executed.

set with No Arguments

Using set to Reassign Positional Parameters

There is no way to directly assign a value to a positional parameter.

These parameters are initially set on execution of the shell program. The only way they may be changed is with the shift or the set commands. If words are given as arguments to set on the command line, those words will be assigned to the positional parameters $1, $2, and so forth. The previous values stored in the positional parameters will be lost forever. So

set a b c

assigns a to $1, b to $2, and c to $3. $# also gets set to 3.
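A short sketch of the reassignment, which also shows why -- matters when the first word starts with a minus sign:

```shell
set a b c
count=$#               # 3
second=$2              # b

set -- -v              # "--" ends option parsing, so -v becomes $1
dash_arg=$1            # -v (the verbose option is NOT turned on)

echo "count=$count second=$second dash_arg=$dash_arg"
```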

The IFS Variable

IFS stands for Internal Field Separator.

echo "$IFS" | od -b
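Changing IFS temporarily is a common way to split a delimited string into fields. A minimal sketch combining it with set; the sample record is made up.

```shell
record="root:x:0:0"    # made-up sample, colon-separated
old_ifs=$IFS

IFS=:
set -- $record         # unquoted on purpose so the shell splits on IFS
IFS=$old_ifs           # restore the separator right away

nfields=$#
first=$1
third=$3
echo "$nfields fields; first=$first third=$third"
```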

The readonly Command

readonly -p prints a list of your read-only variables

The unset Command

unset x removes x from the environment.

In general, unset erases the definitions of the variables or functions named as its arguments. Read-only variables cannot be unset. The -v option to unset specifies that variable names follow, whereas the -f option specifies function names. If neither option is used, the arguments are assumed to be variable names.

The eval Command

eval command-line

The net effect is that the shell scans the command line twice before executing it.

pipe="|"

ls $pipe wc -l

The shell takes care of pipes and I/O redirection before variable substitution, so it never recognizes the pipe symbol inside pipe.

eval ls $pipe wc -l

The first time the shell scans the command line, it substitutes | as the value of pipe. Then eval causes it to rescan the line, at which point the | is recognized by the shell as the pipe symbol.

Get the last parameter: eval echo \$$#
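Both uses of eval can be reproduced with harmless commands. In this sketch, echo and tr stand in for ls and wc so that the result does not depend on the current directory.

```shell
pipe="|"

plain=$(echo hello $pipe tr a-z A-Z)           # | is just another argument to echo
rescanned=$(eval echo hello $pipe tr a-z A-Z)  # eval rescans, so | becomes a pipe

set -- a b c
last=$(eval echo \$$#)                         # first scan yields: echo $3

echo "plain:     $plain"
echo "rescanned: $rescanned"
echo "last:      $last"
```

Note that the \$$# idiom is only dependable for up to nine parameters, because a sequence such as $10 is read as $1 followed by 0.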

The wait Command

wait process-id

where process-id is the process id number of the process you want to wait for. If omitted, the shell waits for all child processes to complete execution.
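wait also hands back the exit status of the process it waited for. A small sketch, using a background subshell with an arbitrary exit status of 3:

```shell
( sleep 1; exit 3 ) &      # background job that finishes with status 3
pid=$!                     # process id of the most recent background job

status=0
wait "$pid" || status=$?   # wait returns the job's exit status

echo "background job $pid finished with status $status"
```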

The trap Command

trap commands signals

Commonly Used Signal Numbers

0 Exit from the shell

1 Hangup

2 Interrupt (for example, the Delete key or Ctrl+C)

15 Software termination signal (sent by kill by default)

trap "rm $WORKDIR/work1$$ $WORKDIR/dataout$$; exit" 1 2

The value of WORKDIR and $$ will be substituted at the time that the trap command is executed. If you wanted this substitution to occur at the time that either signal 1 or 2 was received (for example, WORKDIR may not have been defined yet), you can put the commands inside single quotes:

trap 'echo logged off at $(date) >>$HOME/logoffs' 0

trap 'rm $WORKDIR/work1$$ $WORKDIR/dataout$$; exit' 1 2
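The timing difference between the two quoting styles can be observed by changing WORKDIR after the trap is set. In this sketch the directory names are made up and never touched, the trap only records a value, and signal 15 is sent to the current shell with kill.

```shell
WORKDIR=/tmp/early
trap "seen=$WORKDIR" 15      # double quotes: $WORKDIR is expanded right now
WORKDIR=/tmp/late
kill -15 $$                  # deliver signal 15 to this shell
double_quoted=$seen          # /tmp/early

trap 'seen=$WORKDIR' 15      # single quotes: expanded when the trap fires
kill -15 $$
single_quoted=$seen          # /tmp/late

trap - 15                    # put the default action back
echo "double quotes saw $double_quoted, single quotes saw $single_quoted"
```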

trap with No Arguments

Executing trap with no arguments results in the display of any traps that you have changed.

Ignoring Signals

If the command listed for trap is null, the specified signal will be ignored when received.

trap "" 2

If you ignore a signal, all subshells also ignore that signal. However, if you specify an action to be taken on receipt of a signal, all subshells will still take the default action on receipt of that signal.


If you execute trap : 2

and then execute your subshells, then on receiving the interrupt signal the current shell will do nothing (it will execute the null command), but all active subshells will be terminated (they will take the default action—termination).

Resetting Traps

After you've changed the default action to be taken on receipt of a signal, you can change it back again with trap if you simply omit the first argument:

trap 1 2

Functions

name () { command; ... command; }

Removing a Function Definition

To remove the definition of a function from the shell, you use the unset command with the -f option.

unset -f nu
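Defining, calling, and removing a function can be sketched in a few lines. The function name greet is hypothetical, and command -v is used here only to check whether the name is still known to the shell.

```shell
greet () { echo "hello, $1"; }     # hypothetical function for the demo

msg=$(greet world)                 # call it like any other command

unset -f greet                     # remove the definition again
if command -v greet >/dev/null 2>&1; then defined=yes; else defined=no; fi

echo "msg=$msg, still defined: $defined"
```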


Material

Unix Shell Programming, Third Edition

