Tips on Subversion

Remove projects from a Subversion repository:
I once used the following commands to import a couple of projects into my SVN repository.
svnadmin create /var/svn_repos
svn import /home/yuanyun/code/svn/project1 file:///var/svn_repos/project1/trunk -m "importing project1"
svn import /home/yuanyun/code/svn/project2 file:///var/svn_repos/project2/trunk -m "importing project2"

Later I wanted to remove project1 from the Subversion repository.
Solution:

Use 'svnadmin dump' and 'svndumpfilter' to dump and exclude unwanted project, then use 'svnadmin load' to load the filtered dumpfile into a new repository.
Steps:
svn list file:///var/svn_repos

cd /home/yuanyun/temp/repos
svnadmin dump /var/svn_repos > svn_repos.dump
svndumpfilter exclude project1 < svn_repos.dump > filtered-svn_repos.dump

svnadmin create /var/filtered_svn_repos
svnadmin load /var/filtered_svn_repos < /home/yuanyun/temp/repos/filtered-svn_repos.dump

svn list file:///var/filtered_svn_repos

Resources:
http://svnbook.red-bean.com
http://www.svnforum.org/2017/viewtopic.php?p=23940#23940


Notes on Trac

Trac is a lightweight, open source issue tracking and project management tool that builds on Subversion by adding flexible web-based issue tracking, wiki, and reporting functionalities.

Install Trac on Ubuntu:

http://trac.edgewall.org/wiki/0.11/TracOnUbuntu

Install Software Packages

sudo apt-get install apache2 libapache2-mod-python libapache2-svn python-setuptools subversion python-subversion

You may need to install Python Easy Install first (http://peak.telecommunity.com/DevCenter/EasyInstall):

python ez_setup.py

sudo easy_install Trac

Setting Up a Trac Project

sudo mkdir -p /var/data/trac

sudo trac-admin /var/data/trac/myproject initenv

sudo chown -R www-data:www-data /var/data/trac

By default, Trac will use an embedded SQLite database, which is sufficient for small projects.

Running Trac on the Standalone Server

tracd --port 8000 /var/data/trac/myproject

Your project will then be available at its own URL: http://localhost:8000/myproject.

or:

tracd -p 8000 \
--auth project1,/data/svn/svn-digest-auth-file,"Subversion repository" \
--auth project2,/data/svn/svn-digest-auth-file,"Subversion repository" \
/data/trac/project1 /data/trac/project2

Integrate Trac with Apache Server

You can set up authentication to use the same Apache basic or digest authentication that you use for your Subversion installation: simply use the same authentication type (AuthType), realm name (AuthName), and authentication file (AuthUserFile or AuthDigestFile).

LoadModule python_module modules/mod_python.so

<Location /trac>

SetHandler mod_python

PythonInterpreter main_interpreter

PythonHandler trac.web.modpython_frontend

PythonOption TracEnvParentDir /var/data/trac

PythonOption TracUriRoot /trac

PythonOption PYTHON_EGG_CACHE /tmp

</Location>

# use the following for one authorization for all projects

<LocationMatch "/trac/[^/]+/login">

AuthType Basic

AuthName "Myproject Repository"

AuthUserFile /etc/apache2/myproject.passwd

Require valid-user

</LocationMatch>

Restart Apache: sudo apache2ctl restart

Access your project: https://servername/trac/myproject

Administering the Trac Site

trac-admin /var/data/trac/myproject permission add yuanyun TRAC_ADMIN

Trac Plugins

http://trac.edgewall.org/wiki/TracPlugins

Plugins are packaged as Python eggs. To use egg based plugins in Trac, you need to have setuptools installed.

easy_install http://trac-hacks.org/svn/usermanagerplugin/0.11

If Trac reports permission errors after installing a zipped egg and you would rather not bother providing an egg cache directory writable by the web server, you can get around it by simply unzipping the egg. Just pass --always-unzip to easy_install:

easy_install --always-unzip TracSpamFilter-0.2.1dev_r5943-py2.4.egg

You should end up with a directory having the same name as the zipped egg (complete with .egg extension) and containing its uncompressed contents.

Enabling the plugin

We can enable a plugin through the Trac web admin pages or by editing the project configuration file, trac.ini:

[components]

tracspamfilter.* = enabled

Tailoring the Trac Web Site: Using the Wiki Function

Headings

Headings and subheadings can be written using "=" and "==".

Lists

You can use "*" for unordered lists, as shown here:

 * Cats

   * Burmese

   * Siamese

Preformatted text can be displayed using {{{...}}}.

Trac tickets:

Trac issue tickets can be referenced using either the "ticket" link type (ticket:123) or the # shorthand form (#123).

Source code files:

Trac provides many ways to set up links to your source code. At the simplest level, files in the source code repository can be referred to using the "source:" link type.

Using the Trac Ticket Management System

Creating a New Ticket

Customizing Trac Ticket Fields

Trac lets you customize the ticket screen via its Admin pages.

Browsing the Source Code Repository

Using RSS and ICalendar


Useful plugins:

http://trac-hacks.org/wiki/UserManagerPlugin

http://trac-hacks.org/wiki/StractisticsPlugin

http://trac-hacks.org/svn/masterticketsplugin/0.11/

http://trac-hacks.org/wiki/TagsPlugin

http://stackoverflow.com/questions/194361/what-are-some-recommended-plugins-for-trac

http://trac-hacks.org/wiki/TicketImportPlugin


Taking Notes from Learning Perl

Introduction

Perl is sometimes called the "Practical Extraction and Report Language". Larry Wall created Perl in the mid-1980s.

#!/usr/bin/env perl

use 5.010;

while (<>) { chomp; print join("\t", (split /:/)[0, 2, 1, 5] ), "\n";}

CPAN is the Comprehensive Perl Archive Network, your one-stop shop for Perl. Start at http://search.cpan.org/ or http://kobesearch.cpan.org/ to browse or search the archive.

Comments run from a pound sign (#) to the end of the line. There are no "block comments" in Perl.

On Unix systems, if the very first two characters on the first line of a text file are #!, what follows is the name of the program that actually executes the rest of the file.

@lines = `perldoc -u -f atan2`; foreach (@lines) { s/\w<([^>]+)>/\U$1/g; print;}

Scalar Data

Numbers

All Numbers Have the Same Format Internally

Internally, Perl computes with double-precision floating-point values. This means that there are no integer values internal to Perl: an integer constant in the program is treated as the equivalent floating-point value.

Floating-Point Literals, Integer Literals

Nondecimal Integer Literals

Octal (base 8) literals start with a leading 0, hexadecimal (base 16) literals start with a leading 0x, and binary (base 2) literals start with a leading 0b. Perl allows underscores for clarity within integer literals: 61_298_040, 0x50_65_72_7C

Numeric Operators:

Exponentiation operator (**): 2**3 is 8.

Modulus operator (%): both operands are first reduced to their integer values, so 10.5 % 3.2 is computed as 10 % 3.
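The literal forms and numeric operators above can be checked with a few lines. This is a minimal sketch; the variable names are illustrative and do not come from the book.

```perl
use strict;
use warnings;

# Octal, hexadecimal, and binary literals that all spell decimal 255.
my @same = (0377, 0xFF, 0b1111_1111);

# Underscores inside a literal are only visual separators.
my $big = 61_298_040;

# Exponentiation.
my $cube = 2 ** 3;

# Modulus reduces both operands to integers first: 10 % 3.
my $rem = 10.5 % 3.2;

print "@same $big $cube $rem\n";   # 255 255 255 61298040 8 1
```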

Strings

Single-Quoted String Literals

Any character other than a single quote or a backslash between the quote marks stands for itself inside a string.

Double-Quoted String Literals

Double-quoted string backslash escapes:

\n, \r, \t, \f, \b, \\, \"

\a Bell

\e Escape (ASCII escape character)

\l Lowercase next letter

\L Lowercase all following letters until \E

\u Uppercase next letter

\U Uppercase all following letters until \E

\Q Quote nonword characters by adding a backslash until \E

\E End \L, \U, or \Q

Another feature of double-quoted strings is that they are variable interpolated.

String Operators

String values can be concatenated with the . operator.

String repetition operator consists of the single lowercase letter x: "fred" x 3 # is "fredfredfred"

The copy count (the right operand) is first truncated to an integer value (4.8 becomes 4) before being used.

Automatic Conversion Between Numbers and Strings

When a string value is used where an operator needs a number (say, for multiplication), Perl automatically converts the string to its equivalent numeric value. Trailing nonnumber stuff and leading whitespace are discarded.
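A short sketch of the string operators and the automatic conversion rules described above. The numeric warnings are switched off here only so the demonstration runs quietly; the variable names are illustrative.

```perl
use strict;
use warnings;
no warnings 'numeric';   # "12fred34" would otherwise trigger a warning

my $triple  = "fred" x 3;           # repetition
my $fives   = 5 x 4.8;              # count truncated to 4, and 5 is used as the string "5"
my $product = "12fred34" * " 3";    # trailing junk and leading space are ignored
my $mixed   = "Z" . 5 * 7;          # * binds tighter than ., so this is "Z" . 35

print "$triple $fives $product $mixed\n";   # fredfredfred 5555 36 Z35
```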

Perl’s Built-in Warnings

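Turn on warnings from the command line with $ perl -w my_program, or request them on the #! line: #!/usr/bin/perl -w. That works even on non-Unix systems, where it is traditional to write #!perl -w.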

With Perl 5.6 and later, you can turn on warnings with a pragma: use warnings; If you get a warning message you don't understand, you can get a longer description of the problem with the diagnostics pragma: use diagnostics;

A further optimization can be had by using one of Perl’s command-line options, -M, to load the pragma only when needed instead of editing the source code each time to enable and disable diagnostics:

$ perl -Mdiagnostics ./my_program

Scalar Variables

Scalar variable names begin with a dollar sign followed by a Perl identifier: $fred. Scalar variables in Perl are always referenced with the leading $.

Scalar Assignment, Binary Assignment Operators

$fred = 17; $barney *= 3; $str .= " ";

Output with print

You can give print a series of values, separated by commas: print "The answer is ", 6 * 7, ".\n";

Interpolation of Scalar Variables into Strings: $barney = "fred ate a $meal";

To put a real dollar sign into a double-quoted string, precede the dollar sign with a backslash, or use a single-quoted string instead. The variable name will be the longest possible variable name that makes sense at that part of the string. When that is not what you want, Perl provides a delimiter for the variable name in a manner similar to the shell: enclose the name in curly braces.

print "fred ate $n ${what}s.\n";
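The interpolation rules can be sketched as follows. The values of $n and $what are made up for the example.

```perl
use strict;
use warnings;

my $n    = 3;
my $what = "brontosaurus steak";

# Without the braces Perl would look for a variable named $whats.
my $plain = "fred ate $n ${what}s.";

# Two ways to get a literal dollar sign.
my $escaped = "That costs \$5";
my $single  = 'That costs $5';

print "$plain\n";   # fred ate 3 brontosaurus steaks.
```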

Operator Precedence and Associativity

Associativity   Operators
left            parentheses and arguments to list operators
left            ->
(none)          ++ -- (autoincrement and autodecrement)
right           **
right           \ ! ~ + - (unary operators)
left            =~ !~
left            * / % x
left            + - . (binary operators)
left            << >>
(none)          named unary operators (-X filetests, rand)
(none)          < <= > >= lt le gt ge (the "unequal" ones)
(none)          == != <=> eq ne cmp (the "equal" ones)
left            &
left            | ^
left            &&
left            ||
right           ?: (ternary)
right           = += -= .= (and similar assignment operators)
left            , =>
(none)          list operators (rightward)
right           not
left            and
left            or xor

Numeric and string comparison operators

Comparison                  Numeric   String
Equal                       ==        eq
Not equal                   !=        ne
Less than                   <         lt
Greater than                >         gt
Less than or equal to       <=        le
Greater than or equal to    >=        ge

The if Control Structure: if ($name gt 'fred') {}

Boolean Values

Because the string '0' is the exact same scalar value as the number 0, Perl has to treat them both the same. That means that the string '0' is the only nonempty string that is false.
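A small sketch of the truth rules. The helper truth() is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Report how Perl treats a value in a Boolean test.
sub truth { return $_[0] ? "true" : "false" }

print "0     is ", truth(0),     "\n";   # false
print "'0'   is ", truth('0'),   "\n";   # false
print "''    is ", truth(''),    "\n";   # false
print "undef is ", truth(undef), "\n";   # false
print "'00'  is ", truth('00'),  "\n";   # true: not the same string as '0'
print "'0.0' is ", truth('0.0'), "\n";   # true
```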

Getting User Input: $line = <STDIN>;

Each time you use <STDIN> in a place where a scalar value is expected, Perl reads the next complete text line from standard input (up to the first newline), and uses that string as the value of <STDIN>. The string value of <STDIN> typically has a newline character on the end of it.

The chomp Operator: chomp($text = <STDIN>);

The undef Value, The defined Function

$madonna = <STDIN>; if ( defined($madonna) ) {}

Lists and Arrays

Accessing Elements of an Array: $fred[0] = "yabba";

If the subscript is not an integer already, it is automatically truncated to an integer by dropping the fractional part. The last element index is $#rocks, so this appends an element: $rocks[ $#rocks+1] = 'hard rock'; Negative array indices count from the end of the array, so -1 is the last element.

List Literals: (1, 2, 3), ( ), (1..100)

(1.7..5.7) # both values are truncated, (5..1) # empty list; .. only counts "uphill"
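The indexing and range rules above can be sketched like this, assuming a small @rocks array like the one used elsewhere in these notes.

```perl
use strict;
use warnings;

my @rocks = qw/ bedrock slate lava /;

my $last_index = $#rocks;             # 2
$rocks[$#rocks + 1] = 'hard rock';    # append one element
my $last   = $rocks[-1];              # counts from the end
my $second = $rocks[1.7];             # subscript truncated to 1

my @up   = (1.7 .. 5.7);              # same as (1 .. 5)
my @down = (5 .. 1);                  # empty: .. only counts uphill

print "$last_index $last $second (@up) (@down)\n";   # 2 hard rock slate (1 2 3 4 5) ()
```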

The qw Shortcut

qw stands for "quoted words" or "quoted by whitespace". Lists of simple words are common, and the qw shortcut makes it easy to generate them without typing a lot of extra quote marks: qw( fred barney betty wilma dino ). Perl actually lets you choose any punctuation character as the delimiter: qw##, qw[]

List Assignment

($fred, $barney) = ($barney, $fred); # swap those values

Use the at sign (@) before the name of the array (and no index brackets after it) to refer to the entire array at once. You can read this as “all of the”. @rocks = qw/ bedrock slate lava /; @quarry = (@rocks, "crushed rock", @tiny, $dino);

It’s also worth noting that an array name is replaced by the list it contains. An array doesn’t become an element in the list because these arrays can contain only scalars, not other arrays.

The pop and push Operators, and shift and unshift Operators

The push and pop operators add to and remove from the end of an array, and the unshift and shift operators perform the corresponding actions on the "start" of the array.
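A minimal sketch of the four operators working on one array; the array contents are arbitrary.

```perl
use strict;
use warnings;

my @array = (5 .. 9);

my $popped = pop @array;       # removes 9 from the end
push @array, 0;                # array is now 5 6 7 8 0

my $shifted = shift @array;    # removes 5 from the start
unshift @array, 4;             # array is now 4 6 7 8 0

print "$popped $shifted (@array)\n";   # 9 5 (4 6 7 8 0)
```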

Interpolating Arrays into Strings

Elements of an array are automatically separated by spaces upon interpolation. A single element of an array will be replaced by its value.

The foreach Control Structure

The foreach loop steps through a list of values, executing one iteration for each value: foreach $rock (qw/ bedrock slate lava /) {}

The control variable is not a copy of the list element; it actually is the list element. That is, if you modify the control variable inside the loop, you'll be modifying the element itself. After the loop has finished, the control variable has the value it had before the loop started, or undef if it hadn't had a value.
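Both effects, the aliasing and the restored control variable, show up in a short sketch. The starting value "unchanged" is only a marker for the demonstration.

```perl
use strict;
use warnings;

my @rocks = qw/ bedrock slate lava /;
my $rock  = "unchanged";

foreach $rock (@rocks) {
    $rock = uc $rock;     # modifies the array element itself
}

print "@rocks\n";         # BEDROCK SLATE LAVA
print "$rock\n";          # unchanged
```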

Perl’s Favorite Default: $_

If you omit the control variable from the beginning of the foreach loop, Perl uses its favorite default variable, $_.

foreach (1..10) { print; } # print uses $_ by default

The reverse, sort Operators

reverse returns the reversed list; it doesn't affect its arguments. The sort operator sorts a list in the internal character ordering.

Scalar and List Context

As Perl is parsing your expressions, it always expects either a scalar value or a list value. Expressions in Perl always return the appropriate value for their context. In a list context, an array gives the list of its elements. But in a scalar context, it returns the number of elements in the array.

Using List-Producing Expressions in Scalar Context

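In a scalar context, reverse returns a reversed string: it concatenates its arguments and reverses the characters of the result. The behavior of sort in a scalar context is undefined, so don't rely on its return value there.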

Using Scalar-Producing Expressions in List Context

If an expression doesn't normally have a list value, the scalar value is automatically promoted to make a one-element list.

Forcing Scalar Context: scalar

print "I have ", scalar @rocks, " rocks!\n"; # gives a number
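The same expression giving different results in the two contexts can be sketched as follows; the sample lists are illustrative.

```perl
use strict;
use warnings;

my @rocks = qw/ talc quartz jade obsidian /;

my $count = @rocks;                                # scalar context: element count
my @copy  = @rocks;                                # list context: the elements

my @backwards = reverse qw/ yabba dabba doo /;     # list context: reversed list
my $backwards = reverse qw/ yabba dabba doo /;     # scalar context: reversed string

my $message = "I have " . scalar(@rocks) . " rocks!";

print "$count (@backwards) $backwards\n";   # 4 (doo dabba yabba) oodabbadabbay
```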

<STDIN> in List Context

<STDIN> returns the next line of input in a scalar context. In list context, this operator returns all of the remaining lines up to the end of file. Each line is returned as a separate element of the list.

chomp(@lines = <STDIN>);

Subroutines

The name of a subroutine is another Perl identifier with a sometimes-optional ampersand (&) in front.

The subroutine name comes from a separate namespace.

Defining a Subroutine: sub functionName{}

Invoking a Subroutine: &functionName

Invoke a subroutine from within any expression by using the subroutine name (with the ampersand): &marine;

Return Values

All Perl subroutines have a return value. Whatever expression is evaluated last in a subroutine is automatically also the return value.

Arguments

Perl automatically stores the parameter list in the special array variable named @_ for the duration of the subroutine. The first subroutine parameter is stored in $_[0], the second in $_[1], and so on.
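A minimal sketch of a subroutine that reads its arguments from @_ and relies on the last evaluated expression as its return value. The name larger_of is hypothetical.

```perl
use strict;
use warnings;

sub larger_of {
    # No explicit return: the value of the branch that runs is returned.
    if ($_[0] > $_[1]) {
        $_[0];
    } else {
        $_[1];
    }
}

my $first  = &larger_of(10, 15);
my $second = &larger_of(7, 3);

print "$first $second\n";   # 15 7
```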

Private Variables in Subroutines: my($m, $n) = @_;

By default, all variables in Perl are global variables; that is, they are accessible from every part of the program. But you can create private variables called lexical variables with the my operator.

Those lexical variables can actually be used in any block, not merely in a subroutine's block; they can be used in the block of an if, while, or foreach. Without the parentheses, my declares only a single lexical variable:

my $fred, $barney; # WRONG! Fails to declare $barney

my($fred, $barney); # declares both

Variable-Length Parameter Lists

sub max { my($max_so_far) = shift @_; foreach (@_) { if ($_ > $max_so_far) {$max_so_far = $_;}} $max_so_far;}

The use strict Pragma

A pragma is a hint to a compiler, telling it something about the code. In this case, the use strict pragma tells Perl’s internal compiler that it should enforce some good programming rules for the rest of this block or source file.

The return Operator, Nonscalar Return Values

Omitting the Ampersand

If the compiler sees the subroutine definition before invocation, or if Perl can tell from the syntax that it's a subroutine call, the subroutine can be called without an ampersand, just like a built-in function.

If the subroutine has the same name as a Perl built-in, you must use the ampersand to call it. With an ampersand, you're sure to call the subroutine; without it, you can get the subroutine only if there's no built-in with the same name.

Persistent, Private Variables

With state, we can still have private variables scoped to the subroutine but Perl will keep their values between calls.

sub running_sum {state $sum = 0; state @numbers;

foreach my $number ( @_ ) { push @numbers, $number;$sum += $number; }

say "The sum of (@numbers) is $sum";}

There's a slight restriction on arrays and hashes as state variables, though. We can't initialize them in list context as of Perl 5.10 (the restriction was lifted in Perl 5.28): state @array = qw(a b c); # Error in Perl 5.10 through 5.26

Input and Output

Input from Standard Input: chomp($line = <STDIN>);

Since the line-input operator will return undef when you reach end-of-file, this is handy for dropping out of loops:

while (defined($line = <STDIN>)) { print "I saw $line";}

And the shortcut looks like this: while (<STDIN>) { print "I saw $_";}

Input from the Diamond Operator: <>

Another way to read input is with the diamond operator: <>. This is useful for making programs that work like standard Unix utilities, with respect to the invocation arguments.

If you give no invocation arguments, the program should process the standard input stream. As a special case, if you give just a hyphen as one of the arguments, that means standard input as well, as in: fred - betty.

With the diamond operator, instead of getting the input from the keyboard, the program gets it from the user's choice of input.

while (<>) {chomp; print "It was $_ that I saw!\n";}

The Invocation Arguments

The @ARGV array is a special array that is preset by the Perl interpreter as the list of the invocation arguments.

The diamond operator looks in @ARGV to determine what filenames it should use. If it finds an empty list, it uses the standard input stream; otherwise it uses the list of files that it finds.

Output to Standard Output

say "..."; is equivalent to print "...\n"; (say is available in Perl 5.10 and later).

Since print is looking for a list of strings to print, its arguments are evaluated in list context. Since the diamond operator (as a special kind of line-input operator) returns a list of lines in a list context, these two one-liners work like the Unix utilities:

print <>; # source code for 'cat'

print sort <>; # source code for 'sort'

Formatted Output with printf

The printf operator takes a format string followed by a list of things to print: printf "Hello, %s; your password expires in %d days!\n", $user, $days_to_expire;

Arrays and printf: printf "The items are:\n".("%10s\n" x @items), @items;
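The trick of building the format string from the array size can be sketched with sprintf, which takes the same arguments as printf but returns the string. The @items values are made up for the example.

```perl
use strict;
use warnings;

my @items = qw( wilma dino pebbles );

# @items is used in scalar context by x, giving one "%10s\n" per item.
my $format = "The items are:\n" . ("%10s\n" x @items);

# Then @items is used in list context to supply the values.
my $report = sprintf $format, @items;

print $report;
```

Each item is right-justified in a field ten characters wide, one per line.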

Filehandles

A filehandle is the name in a Perl program for an I/O connection between your Perl process and the outside world.

Perl already uses six special filehandle names for its own purposes: STDIN, STDOUT, STDERR, DATA, ARGV, and ARGVOUT

Opening a Filehandle

Use the open operator to tell Perl to ask the operating system to open the connection between your program and the outside world: open CONFIG, "dino"; (or "<dino" to read, ">fred" to create, ">>logfile" to append)

In modern versions of Perl (starting with Perl 5.6), you can use a “three-argument” open: open CONFIG, "<", "dino";

open always tells us if it succeeded or failed, by returning true for success or false for failure.

my $success = open LOG, ">>logfile"; # capture the return value

if ( ! $success) {}

Closing a Filehandle: close BEDROCK;

Perl will automatically close a filehandle if you reopen it (that is, if you reuse the filehandle name in a new open) or if you exit the program.

Fatal Errors with die: $!

The die function prints out the message you give it and makes sure that your program exits with a nonzero exit status.

if ( ! open LOG, ">>logfile") { die "Cannot create logfile: $!"; }

In general, when the system refuses to do something we've requested (like opening a file), $! will give you a reason. If you use die to indicate an error that is not the failure of a system request, don't include $!.

There’s one more thing that die will do for you: it will automatically append the Perl program name and line number to the end of the message, so you can easily identify which die in your program is responsible for the untimely exit.

If you don’t want the line number and file revealed, make sure that the dying words have a newline on the end.

if (@ARGV < 2) {die "Not enough arguments\n";}
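The two forms of die message can be observed by trapping them with eval. This is a sketch under assumptions: the path below is hypothetical and assumed not to exist, and a lexical filehandle with three-argument open is used instead of the bareword style in these notes.

```perl
use strict;
use warnings;

my $missing = "/no/such/dir/logfile";   # hypothetical path, assumed absent

# Failure of a system request: include $! for the reason.
my $ok = eval {
    open my $log, ">>", $missing
        or die "Cannot create logfile: $!";
    1;
};
my $with_location = $@;    # ends with " at FILE line N." plus a newline

# Not a system failure: no $!, and the trailing newline
# suppresses the file name and line number.
eval { die "Not enough arguments\n" };
my $without_location = $@;

print $with_location, $without_location;
```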

Warning Messages with warn

The warn function works just like die, except that it doesn't actually quit the program.

Using Filehandles

if ( ! open PASSWD, "/etc/passwd") {die "How did you get logged in? ($!)";}

while (<PASSWD>) {chomp; ...}

A filehandle open for writing or appending may be used with print or printf, appearing immediately after the keyword but before the list of arguments: print LOG "Captain's log, stardate 3.14159\n"; # output goes to LOG

Changing the Default Output Filehandle

The default output filehandle may be changed with the select operator. By default, the output to each filehandle is buffered. Setting the special $| variable to 1 will set the currently selected filehandle (that is, the one selected at the time that the variable is modified) to always flush the buffer after each output operation.

select LOG;

$| = 1; # don't keep LOG entries sitting in the buffer

select STDOUT;

Reopening a Standard Filehandle

If you were to reopen a filehandle, the old one would be closed for you automatically. If one of the three system filehandles (STDIN, STDOUT, or STDERR) fails to reopen, Perl kindly restores the original one. That is, Perl closes the original one (of those three) only when it sees that opening the new connection is successful.

Output with say

It’s the same as print, except that it adds a newline to the end. say BEDROCK "Hello!";

Hashes

Hash Element Access

$hash{$some_key}, $family_name{"fred"} = "flintstone";

The Hash As a Whole

To refer to the entire hash, use the percent sign (%) as a prefix. For convenience, a hash may be converted into a list, and back again. Assigning to a hash is a list-context assignment, where the list is made of key-value pairs:

%some_hash = ("foo", 35, "bar", 12.4...);

The value of the hash (in a list context) is a simple list of key-value pairs: @any_array = %some_hash;

Hash Assignment: %new_hash = %old_hash;, %inverse_hash = reverse %any_hash;

The Big Arrow: my %last_name = ("fred" => "flintstone",...)

Hash Functions

The keys and values Functions

Perl doesn't maintain the order of elements in a hash. But, whatever order the keys are in, the values will be in the corresponding order.

The each Function

The each function returns a key-value pair as a two-element list: while ( ($key, $value) = each %hash ) { print "$key => $value\n";}

Typical Use of a Hash

The exists Function

To see whether a key exists in the hash, use the exists function. if (exists $books{"dino"}) {}

The delete Function

The delete function removes the given key. delete $books{$person};
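The hash functions above can be tried together in one short sketch. The %books contents are invented for the example.

```perl
use strict;
use warnings;

my %books = (fred => 3, wilma => 1, barney => 0);

# exists tests for the key, not for a true value.
my $has_barney = exists $books{barney} ? "yes" : "no";   # yes, although the value is 0
my $has_dino   = exists $books{dino}   ? "yes" : "no";   # no

delete $books{barney};
my @people = sort keys %books;        # sort because hash order is not maintained

# each walks the pairs; collect them so they can be printed in a stable order.
my @pairs;
while ( my ($person, $count) = each %books ) {
    push @pairs, "$person => $count";
}

# reverse turns key-value pairs into value-key pairs.
my %by_count = reverse %books;

print "$has_barney $has_dino (@people)\n";   # yes no (fred wilma)
print join(", ", sort @pairs), "\n";         # fred => 3, wilma => 1
```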

Hash Element Interpolation

foreach $person (sort keys %books) {if ($books{$person}) {print "$person has $books{$person} items\n";}}

But there’s no support for entire hash interpolation; "%books" is just the six characters of (literally) %books.

The %ENV hash

Your program can look at the environment to get information about its surroundings. Perl stores this information in the %ENV hash: print "PATH is $ENV{PATH}\n";

In the World of Regular Expressions

Don’t confuse regular expressions with shell filename-matching patterns, called globs. A typical glob is what you use when you type *.pm to the Unix shell to match all filenames that end in .pm.

if (/abba/) {print "It matched!\n";}

About Metacharacters

The dot (.) is a wildcard character that matches any single character except a newline. The star (*) matches the preceding item zero or more times. The plus (+) matches the preceding item one or more times.

The question mark (?) means that the preceding item is optional.

Grouping in Patterns()

We can use back references to refer to text that we matched in the parentheses such as \1,\2. To know which group gets which number, just count the order of the opening parenthesis and ignore nesting: if (/y((.)(.)\3\2) d\1/) {}

Perl 5.10 has a new way to denote back references: \g{N}. With the \g{N} notation, we can also use negative numbers to specify a relative back reference: if (/(.)(.)\g{-1}11/) {}
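Both back-reference patterns can be exercised against sample strings. The test strings are assumptions chosen for the illustration.

```perl
use strict;
use warnings;

# Group 1 is the outer parentheses, groups 2 and 3 the inner dots.
my $numbered = "yabba dabba doo" =~ /y((.)(.)\3\2) d\1/ ? "match" : "no match";

# \g{-1} refers to the group just before it, here the second (.).
my $relative = "xaa11" =~ /(.)(.)\g{-1}11/ ? "match" : "no match";
my $mismatch = "xab11" =~ /(.)(.)\g{-1}11/ ? "match" : "no match";

print "$numbered, $relative, $mismatch\n";   # match, match, no match
```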

Alternatives: |

Character Classes: [abcwxyz], [a-zA-Z]

A character class, a list of possible characters inside square brackets ([]), matches any single character from within the class. It matches just one single character, but that one character may be any of the ones listed. A caret (^) at the start of the character class can be used to specify the characters left out: [^def]

Character Class Shortcuts and Negating the Shortcuts

\d([0-9]), \D;

\w( [A-Za-z0-9_]), \W

\s([\f\t\n\r ]), \s matches just a single character from the class, so it’s usual to use either \s* for any amount of whitespace (including none at all), or \s+ for one or more whitespace characters.

As of Perl 5.10, the \h shortcut matches only horizontal whitespace, [\t ]. The \v shortcut matches only vertical whitespace, [\f\n\r]. The \R shortcut matches any sort of linebreak, meaning that you don't have to think about which operating system produced the text. [\d\D] means any digit or any nondigit, that is to say, any character at all. [^\d\D] matches anything that's not either a digit or a nondigit, which is nothing at all.

Matching with Regular Expressions

Matches with m//

/fred/ is actually a shortcut for the m// (pattern match) operator. You could write that same expression as m(fred), m<fred>, m{fred}, or m[fred] using those paired delimiters, or as m,fred,, m!fred!, m^fred^, or many other ways using nonpaired delimiters. The shortcut is that if you choose the forward slash as the delimiter, you may omit the initial m, that is: /fred/.

Option Modifiers

Case-Insensitive Matching with /i

Matching Any Character with /s

By default, the dot (.) doesn't match newline. If you might have newlines in strings, and you want the dot to be able to match them, the /s modifier will do the job. It changes every dot in the pattern to act like the character class [\d\D] does, which is to match any character, even if it is a newline.

Adding Whitespace with /x

/x allows you to add arbitrary whitespace to a pattern to make it easier to read. Combining Option Modifiers: if (/barney.*fred/is) {}

Anchors: ^,$,\b,\B

The caret anchor (^) marks the beginning of the string, while the dollar sign($) marks the end. /^\s*$/ matches a blank line.

The word-boundary anchor, \b, matches at either end of a word: /\bfred\b/, /\bhunt/, /stone\b/. The nonword-boundary anchor is \B; it matches at any point where \b would not match: /\bsearch\B/.

The Binding Operator, =~

Matching against $_ is merely the default; the binding operator, =~, tells Perl to match the pattern on the right against the string on the left: if ($some_other =~ /\brub/) {}

Interpolating into Patterns

The regular expression is double-quote interpolated, just as if it were a double-quoted string.

while (<>) {if (/^($what)/) {print "We saw $what in beginning of $_";}}

The value of $what could also be taken from the command-line arguments in @ARGV: my $what = shift @ARGV;

The Match Variables:$1,$2 etc

Parentheses also trigger the regular expression engine's memory. The memory holds the part of the string matched by the part of the pattern inside parentheses. Since these variables hold strings, they are scalar variables; in Perl, they have names like $1 and $2.

if (/(\S+) (\S+), (\S+)/) {print "words were $1 $2 $3\n";}

The Persistence of Memory

These match variables generally stay around until the next successful pattern match. That is, an unsuccessful match leaves the previous memories intact, but a successful one resets them all. This correctly implies that you shouldn’t use these match variables unless the match succeeded; otherwise, you could be seeing a memory from some previous pattern: if ($wilma =~ /(\w+)/) {my $wilma_word = $1;}

Noncapturing Parentheses:(?:)

Perl’s regular expressions have a way to use parentheses to group things but not trigger the memory variables. We call these noncapturing parentheses: (?:).

if (/(?:bronto)?saurus (?:BBQ )?(steak|burger)/) {print "Fred wants a $1\n";}

Named Captures:(?<LABEL>PATTERN)

Perl 5.10 lets us name the captures directly in the regular expression. It saves the text it matches in the hash named %+: the key is the label we used and the value is the part of the string that it matched. To label a match variable, we use (?<LABEL>PATTERN), then we can refer to it as $+{LABEL}.

Perl also lets us use the Python syntax (?P<LABEL>...)

if( $names =~ m/((?<name2>\w+) (and|or) (?<name1>\w+))/ ) {say "I saw $+{name1} and $+{name2}";}

We can also use \g{label} to refer to named captures in back references. Instead of \g{label}, we can write \k<label>.

if( $names =~ m/(?<last_name>\w+) and \w+ \g{last_name}/ ) {say "I saw $+{last_name}";}
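Named captures and a named back reference can be sketched as follows; the sample strings are invented for the example.

```perl
use strict;
use warnings;

my $names = 'Fred or Barney';
my ($first, $second);

# Labels survive reordering or adding parentheses, unlike $1 and $2.
if ( $names =~ m/(?<name1>\w+) (?:and|or) (?<name2>\w+)/ ) {
    ($first, $second) = ( $+{name1}, $+{name2} );
}

# \k<last_name> must match the same text the label captured.
my $family;
if ( 'Fred Flintstone and Wilma Flintstone'
        =~ m/(?<last_name>\w+) and \w+ \k<last_name>/ ) {
    $family = $+{last_name};
}

print "$first, $second, $family\n";   # Fred, Barney, Flintstone
```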

The Automatic Match Variables:$&, $`,$'

The part of the string that actually matched the pattern is automatically stored in $&. Whatever came before the matched section is in $`, and whatever was after it is in $'.

if ("Hello there, neighbor" =~ /\s(\w+),/) {print "That was ($`)($&)($').\n";}

There is a price for using $&, $`, and $': once you use any one of these automatic match variables anywhere in your entire program, other regular expressions will run a little more slowly (Perl 5.20 and later largely remove this penalty). So if possible, don't use them. If the only one you need is $&, just put parentheses around the whole pattern and use $1 instead.

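General Quantifiers: {m,n} matches between m and n times, {m,} at least m times, and {n} exactly n times. The {,n} form is not accepted before Perl 5.34; write {0,n} instead.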

Regular expression precedence

Regular expression feature          Example
Parentheses (grouping or memory)    (...), (?:...), (?<LABEL>...)
Quantifiers                         a* a+ a? a{n,m}
Anchors and sequence                abc ^a a$
Alternation                         a|b|c
Atoms                               a [abc] \d \1

A Pattern Test Program

while (<>) { chomp;if (/YOUR_PATTERN_GOES_HERE/) {print "Matched: |$`<$&>$'|\n";} else {print "No match: |$_|\n";}}

Processing Text with Regular Expressions

Substitutions with s///

if (s/fred/wilma/) {}

s/(\w+) (\w+)/$2, $1/; s/,.*een//; s/^/huge, /;

Global Replacements with /g

s/// will make just one replacement, even if others are possible. The /g modifier tells s/// to make all possible nonoverlapping replacements.

To turn any arbitrary whitespace into a single space: s/\s+/ /g; To strip leading and trailing whitespace: s/^\s+//; s/\s+$//; or, in one step: s/^\s+|\s+$//g;
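The whitespace substitutions can be sketched on a sample string. The (my $copy = $text) =~ s/// idiom is used here only to keep the original string intact for comparison.

```perl
use strict;
use warnings;

my $text = "  Input   data\t may have\n extra whitespace.  ";

# Collapse every run of whitespace into one space.
(my $collapsed = $text) =~ s/\s+/ /g;

# Then strip what is left at both ends.
(my $trimmed = $collapsed) =~ s/^\s+|\s+$//g;

print "[$collapsed]\n";   # [ Input data may have extra whitespace. ]
print "[$trimmed]\n";     # [Input data may have extra whitespace.]
```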

Different Delimiters: s///, s###,s{}{}

the delimiters don’t have to be the same kind around the string as they are around the pattern: s<fred>#barney#;

Option Modifiers: /g,/i, /x, and /s

The Binding Operator: $file_name =~ s#^.*/##s;

Case Shifting: \U,\L,\E, \u,\l

The \U escape forces what follows to all uppercase, and the \L escape forces lowercase. By default, these affect the rest of the (replacement) string, or you can turn off case shifting with \E: s/(\w+) with (\w+)/\U$2\E with $1/i;

When written in lowercase (\l and \u), they affect only the next character: s/(fred|barney)/\u$1/gi;

these escape sequences are available in any double-quotish string:

print "Hello, \L\u$name\E, would you like to play a game?\n";
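The case-shifting escapes can be sketched on sample strings; the input values are made up for the example.

```perl
use strict;
use warnings;

# \U ... \E uppercases only the text between them.
(my $swapped = "I saw Barney with Fred.")
    =~ s/(\w+) with (\w+)/\U$2\E with $1/i;

# \u affects only the next character.
(my $capped = "fred and barney") =~ s/(fred|barney)/\u$1/gi;

# Combined: lowercase everything, then capitalize the first letter.
my $name     = "FRED";
my $greeting = "Hello, \L\u$name\E!";

print "$swapped\n";    # I saw FRED with Barney.
print "$capped\n";     # Fred and Barney
print "$greeting\n";   # Hello, Fred!
```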

The split Operator

Another operator that uses regular expressions is split, which breaks up a string according to a pattern: @fields = split /separator/, $string; Here's a rule that seems odd at first, but it rarely causes problems: leading empty fields are always returned, but trailing empty fields are discarded.

It's also common to split on whitespace, using /\s+/ as the pattern: my @args = split /\s+/, $some_input; The default for split is to break up $_ on whitespace: my @fields = split; # like split /\s+/, $_, except that a leading empty field is skipped

The join Function

The join function doesn’t use patterns, and it glues together a bunch of pieces to make a single string: my $result = join $glue, @pieces;

m// in List Context

When you use split, the pattern specifies the separator: the part that isn’t the useful data. Sometimes it’s easier to specify what you want to keep. When a pattern match (m//) is used in a list context, the return value is a list of the memory variables created in the match.

my($first, $second, $third) = /(\S+) (\S+), (\S+)/;

The /g modifier lets it match at more than one place in a string. In this case, a pattern with a pair of parentheses will return a memory from each time it matches: my @words = ($text =~ /([a-z]+)/ig);

In fact, if there is more than one pair of parentheses, each match may return more than one string.

my %last_name = ($data =~ /(\w+)\s+(\w+)/g);

Each time the pattern matches, it returns a pair of memories. Those pairs of values then become the key-value pairs in the newly created hash.
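As a sketch of the hash-building idiom, here it is with some sample data; the names in $data are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical input: first name, whitespace, last name, repeated.
my $data = "Barney Rubble Fred Flintstone Wilma Flintstone";

# Each match contributes one (key, value) pair to the list.
my %last_name = ($data =~ /(\w+)\s+(\w+)/g);

foreach my $first (sort keys %last_name) {
    print "$first => $last_name{$first}\n";
}
```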

Nongreedy Quantifiers

The four quantifiers(*,+,?,{m,n}) are all greedy. That means that they match as much as they can, only to reluctantly give some back if that’s necessary to allow the overall pattern to succeed.

For each of the greedy quantifiers, though, there’s also a nongreedy quantifier available: +?,*?,??,{m,n}?.

For example, +? matches one or more times (just as the plus does), except that it prefers to match as few times as possible, rather than as many as possible.

To remove all of the tags <BOLD> and </BOLD>: s#<BOLD>(.*?)</BOLD>#$1#g;

The same goes for *? and for curly-brace forms such as {5,10}? or {8,}?. Even the question mark has a nongreedy form: ?? matches either once or not at all, but it prefers not to match anything.
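To see why the tag-stripping substitution needs the nongreedy form, compare it with the greedy version on a string containing two tagged words. The sample text is an assumption made for this sketch.

```perl
use strict;
use warnings;

my $html = "Fred and <BOLD>Velma</BOLD>, not <BOLD>Wilma</BOLD>";

# Greedy: .* runs from the first <BOLD> all the way to the last </BOLD>.
(my $greedy = $html) =~ s#<BOLD>(.*)</BOLD>#$1#g;

# Nongreedy: .*? stops at the nearest </BOLD>, so each pair is handled separately.
(my $lazy = $html) =~ s#<BOLD>(.*?)</BOLD>#$1#g;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```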

Matching Multiple-Line Text:/m

Classic regular expressions were used to match just single lines of text, and the anchors ^ and $ are normally anchors for the start and end of the whole string. The /m regular expression option lets them match at internal newlines as well (think m for multiple lines): print "Found 'wilma' at start of line\n" if /^wilma\b/im;

We can read an entire file into one variable, then add the file’s name as a prefix at the start of each line:

my $lines = join '', <FILE>; $lines =~ s/^/$filename: /gm;
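The same substitution can be tried without a real file. In this sketch the file name and the two-line string stand in for data that would normally come from a filehandle.

```perl
use strict;
use warnings;

my $filename = "fred03.dat";                  # hypothetical file name
my $lines    = "first line\nsecond line\n";   # stands in for join '', <FILE>

# With /m, ^ matches at the start of every line; /g repeats the substitution.
(my $prefixed = $lines) =~ s/^/$filename: /gm;

print $prefixed;
```

Without /m, only the very start of the string would get the prefix.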

Updating Many Files

Perl supports a way of in-place editing of files with a little extra help from the diamond operator (<>) and $^I.

chomp(my $date = `date`); $^I = ".bak";

while (<>) {s/^Author:.*/Author: Randal L. Schwartz/;s/^Phone:.*\n//; s/^Date:.*/Date: $date/; print;}

By default the variable $^I is undef, and everything is normal. But when it’s set to some string, it makes the diamond operator (<>) even more magical than usual.

Recall the diamond’s magic: it will automatically open and close a series of files for you, or read from the standard-input stream if there aren’t any filenames given. But when there’s a string in $^I, that string is used as a backup filename’s extension.

Let’s say it’s time for the diamond to open our file fred03.dat. It opens it like before, but now it renames it, calling it fred03.dat.bak. We’ve still got the same file open, but now it has a different name on the disk. Next, the diamond creates a new file and gives it the name fred03.dat. And now the diamond selects the new file as the default for output, so that anything that we print will go into that file.

Some folks use a tilde (~) as the value for $^I since that resembles what emacs does for backup files. Another possible value for $^I is the empty string. This enables in-place editing, but doesn’t save the original data in a backup file.
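A self-contained sketch of in-place editing with $^I. It works on a scratch file in a temporary directory rather than on real data; the file contents and the use of local to set $^I and @ARGV from inside the program are choices made for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch file so the sketch touches nothing real.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/fred03.dat";
open my $out, '>', $file or die "Cannot create $file: $!";
print {$out} "Author: someone\nPhone: 555-1212\nDate: unchanged\n";
close $out;

{
    # local restores $^I and @ARGV when this block ends.
    local ($^I, @ARGV) = ('.bak', $file);
    while (<>) {
        s/^Author:.*/Author: Randal L. Schwartz/;
        s/^Phone:.*\n//;      # drop the whole Phone line
        print;                # goes into the new fred03.dat, not to the screen
    }
}

# Read the edited file back to see the result.
open my $in, '<', $file or die "Cannot read $file: $!";
my $edited = do { local $/; <$in> };
close $in;
print $edited;
```

The original contents remain available in fred03.dat.bak next to the edited file.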

In-Place Editing from the Command Line

perl -p -i.bak -w -e 's/Randall/Randal/g' fred*.dat

The -p option tells Perl to write a program for you. It looks something like this: while (<>) {print;}

If you want even less, you could use -n instead; that leaves out the automatic print statement. The next option is -i.bak, which you might have guessed sets $^I to ".bak" before the program starts. If you don’t want a backup file, you can use -i alone, with no extension. The -w option turns on warnings. The -e option says “executable code follows.”

So the command line looks like the following program: $^I = ".bak"; while (<>) {s/Randall/Randal/g;print;}

More Control Structures

The unless Control Structure, and The else Clause with unless

The until Control Structure

Expression Modifiers

In order to have a more compact notation, an expression may be followed by a modifier that controls it. Only a single expression is allowed on either side of the modifier.

print "$n is a negative number.\n" if $n < 0;

$i *= 2 until $i > $j;

print " ", ($n += 2) while $n < 10;

&greet($_) foreach @person;

The Naked Block Control Structure

A naked block is a block without a keyword or condition. One of its features is that it provides a scope for temporary lexical variables.

The elsif Clause, Autoincrement and Autodecrement: ++,--

The for Control Structure:for (initialization; test; increment) {}

The Secret Connection Between foreach and for: inside the Perl parser, the keyword foreach is exactly equivalent to the keyword for.

Loop Controls

Perl has three loop-control operators you can use in loop blocks to make the loop do all sorts of tricks.

The last Operator

The last operator immediately ends execution of the loop, like break in C. It applies to the innermost currently running loop block.

The next Operator

The next operator jumps to the inside of the bottom of the current loop block, like continue in C.

# Analyze words in the input file or files

while (<>) {foreach (split) {$total++; next if /\W/; $valid++; $count{$_}++;}} # count each separate word

The redo Operator

The redo operator says to go back to the top of the current loop block, without testing any conditional expression or advancing to the next iteration. It has no equivalent in C.

Labeled Blocks

When you need to work with a loop block that’s not the innermost one, use a label. To label a loop block, just put the label and a colon in front of the loop. Then, inside the loop, you may use the label after last, next, or redo as needed: LINE: while (<>) { foreach (split) { last LINE if /__END__/; }} # bail out of the LINE loop
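The loop controls can be sketched on an in-memory list instead of standard input. The sample lines and the rule for skipping words are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical input lines standing in for what <> would read.
my @lines = ("fred barney", "wilma __END__ betty", "dino");
my @seen;

LINE: foreach my $line (@lines) {
    foreach my $word (split ' ', $line) {
        last LINE if $word eq '__END__';   # leave both loops at once
        next if $word =~ /^b/;             # skip this word, continue the inner loop
        push @seen, $word;
    }
}

print "@seen\n";
```

A plain last in place of last LINE would end only the inner loop, so dino would still be processed.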

The Ternary Operator, expression ? if_true_expr : if_false_expr

Logical Operators: &&,||, and, or

The Value of a Short-Circuit Operator: the value of a short-circuit logical operator is the last part evaluated, not just a Boolean value. my $last_name = $last_name{$someone} || '(No last name)';

The defined-or Operator://

The defined-or operator, // (available since Perl 5.10), short-circuits when it finds a defined value, no matter whether that value on the lefthand side is true or false: my $last_name = $last_name{$someone} // '(No last name)';
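The difference between || and // shows up when the lefthand value is defined but false. A minimal sketch, with a made-up hash:

```perl
use strict;
use warnings;

my %count = (fred => 0);                     # hypothetical data: defined but false

my $with_or  = $count{fred}   || 'unknown';  # 0 is false, so the default wins
my $with_dor = $count{fred}   // 'unknown';  # 0 is defined, so it is kept
my $missing  = $count{barney} // 'unknown';  # no such key: undef, default wins

print "$with_or, $with_dor, $missing\n";
```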

open CHAPTER, $filename or die "Can't open '$filename': $!";

Perl Modules

Comprehensive Perl Archive Network (CPAN), which is a worldwide collection of servers and mirrors containing thousands of modules of reusable Perl code.

Finding Modules

To find modules that don’t come with Perl, start at either CPAN Search (http://search.cpan.org) or Kobes’ Search (http://kobesearch.cpan.org/). Before you go looking for a module, you should check whether it is already installed. One way is to just try to read the documentation with perldoc.

Installing Modules

You can simply download the distribution, unpack it, and run a series of commands from the shell: perl Makefile.PL; make install, or perl Build.PL; ./Build install. If you can’t install modules in the system-wide directories, you can specify another directory with a PREFIX argument to Makefile.PL:

$ perl Makefile.PL PREFIX=/Users/fred/lib

A better way: use one of the modules that comes with Perl, CPAN.pm. From the command line, you can start up CPAN.pm, from which you can issue commands: $ perl -MCPAN -e shell

cpan Module::CoreList LWP CGI::Prototype

Using Simple Modules

The File::Basename Module:use File::Basename

During compilation, Perl sees that line and loads up the module.

perldoc File::Basename; my $basename = basename $name; my $dirname = dirname $name;

Using Only Some Functions from a Module

Simply give File::Basename, in your use declaration, an import list showing exactly which function names it should give you, and it’ll supply those and no others.

use File::Basename qw/ basename /;

And here, we’ll ask for no new functions at all: use File::Basename qw/ /; use File::Basename ();

use File::Basename qw/ /; my $dirname = File::Basename::dirname $name;

The File::Spec Module

File::Spec is an object-oriented module.

use File::Spec; my $new_name = File::Spec->catfile($dirname, $basename);
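Putting the two modules together, here is a sketch that splits a path and rebuilds a new one. The path and the "new-" prefix are hypothetical; the output shown by print depends on the platform's directory separator.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);   # import only what is needed
use File::Spec;

my $name = "/usr/local/bin/perl";          # hypothetical path

my $base = basename($name);                # last component of the path
my $dir  = dirname($name);                 # everything before it

# catfile is a class method: it joins the pieces with the right separator.
my $new_name = File::Spec->catfile($dir, "new-$base");

print "$base in $dir -> $new_name\n";
```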

Databases and DBI

Once you install DBI, you also have to install a DBD (Database Driver).

use DBI; $dbh = DBI->connect($data_source, $username, $password);$sth = $dbh->prepare("SELECT * FROM foo WHERE bla");

$sth->execute();@row_ary = $sth->fetchrow_array; $sth->finish;$dbh->disconnect();

File Tests

File tests and their meanings

File test Meaning

-r(w,x,o) File or directory is readable,writable,executable,owned by this (effective) user or group

-R(W,X,O) File or directory is readable,writable,executable,owned by this real user or group

-e File or directory name exists

-z File exists and has zero size (always false for directories)

-s File or directory exists and has nonzero size (the value is the size in bytes)

-f,-d,-l, -S,-p,-b,-c Entry is a plain file,directory,symbolic link, socket,named pipe,block-special file,character-special file.

-u,-g File or directory is setuid,setgid

-k File or directory has the sticky bit set

-t The filehandle is a TTY (as reported by the isatty() system function; filenames can’t be tested by this test)

-T,-B File looks like a “text” file,a “binary” file

-M,-A,-C Modification age, Access age, Inode-modification age (measured in days)

The -s returns the length of the file, measured in bytes.

The -t file test returns true if the given filehandle is a TTY—in short, if it’s interactive because it’s not a simple file or pipe. When -t STDIN returns true, it generally means that you can interactively ask the user questions. If it’s false, your program is probably getting input from a file or pipe, rather than a keyboard.

If you omit the operand, the default operand is the file named in $_.

foreach (@lots_of_filenames) {print "$_ is readable\n" if -r; # same as -r $_}

my $size_in_k = (-s) / 1024;

Testing Several Attributes of the Same File

-r $file and -w $file: This is an expensive operation, though. Each time you perform a file test, Perl asks the filesystem for all of the information about the file (Perl’s actually doing a stat each time).

The virtual filehandle _ (just the underscore) uses the information from the last file lookup that a file test operator performed. Perl only has to look up the file information once now: if( -r $file and -w _ )
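A small sketch of the underscore filehandle, run against a temporary file so that it does not depend on anything already on disk. The file contents are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; removed automatically at exit.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "hello\n";
close $fh;

# The first test does the real lookup; the later tests reuse its result via _.
my $usable = (-r $file and -w _) ? 1 : 0;
my $size   = -s _;

print "usable=$usable size=$size bytes\n";
```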

Stacked File Test Operators

Perl 5.10 lets us “stack” our file test operators by lining them all up before the filename: if( -r -w -x -o -d $file )

The stat and lstat Functions

the stat function returns pretty much everything that the stat Unix system call returns. my($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size, $atime, $mtime, $ctime, $blksize, $blocks) = stat($filename);

If you need the information about the symbolic link itself, use lstat rather than stat. If the operand isn’t a symbolic link, lstat returns the same things that stat would.

The localtime Function: my $now = localtime

In a list context, localtime returns a list of numbers: my($sec, $min, $hour, $day, $mon, $year, $wday, $yday, $isdst) = localtime $timestamp; Both localtime and gmtime default to using the current time value if you don’t supply a parameter.
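Two details of the list form are easy to trip over: $mon counts from 0 and $year counts from 1900. The sketch below uses gmtime with a fixed timestamp of 0 (the epoch) so that the result does not depend on the local time zone; localtime returns the same kind of list.

```perl
use strict;
use warnings;

my ($sec, $min, $hour, $day, $mon, $year, $wday, $yday, $isdst) = gmtime 0;

# Adjust the month and the year before printing.
my $date = sprintf "%04d-%02d-%02d", $year + 1900, $mon + 1, $day;

my @day_names = qw(Sun Mon Tue Wed Thu Fri Sat);   # $wday 0 is Sunday
print "$date was a $day_names[$wday]\n";
```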

Bitwise Operators: &,|,^,<<,>>,~

Directory Operations

Moving Around the Directory Tree

The chdir operator changes the working directory: chdir "/etc" or die "cannot chdir to /etc: $!";

Globbing

cat >show-args

foreach $arg (@ARGV) {print "one arg is $arg\n";}

perl show-args *.pm

my @pm_files = glob "*.pm";

Anything you can type on the command line can also be given as the (single) argument to glob, including multiple patterns separated by spaces:

my @all_files_including_dot = glob ".* *";

An Alternate Syntax for Globbing

my @all_files = <*>; ## exactly the same as my @all_files = glob "*";

The value between the angle brackets is interpolated similarly to a double-quoted string, which means that Perl variables are expanded to their current Perl values before being globbed: my $dir = "/etc"; my @dir_files = <$dir/* $dir/.*>;

Angle brackets mean both filehandle reading and globbing, so how does Perl tell them apart? A filehandle has to be a Perl identifier. So, if the item between the angle brackets is strictly a Perl identifier, it’s a filehandle read; otherwise, it’s a globbing operation.

Directory Handles

A directory handle looks and acts like a filehandle. You open it with opendir, you read from it with readdir, and you close it with closedir. But instead of reading the contents of a file, you’re reading the names of files (and other things) in a directory.

my $dir_to_process = "/etc"; opendir DH, $dir_to_process or die "Cannot open $dir_to_process: $!";

foreach $file (readdir DH) {print "one file in $dir_to_process is $file\n";}closedir DH;

The names are returned in no particular order, and the list includes all files, not just those matching a particular pattern. It also includes the dot files, and particularly the dot and dot-dot entries.

The filenames returned by the readdir operator have no pathname component. It’s just the name within the directory.

opendir SOMEDIR, $dirname or die "Cannot open $dirname: $!";

while (my $name = readdir SOMEDIR) { next if $name =~ /^\./; # skip over dot files

$name = "$dirname/$name"; # patch up the path

next unless -f $name and -r $name; # only readable files

}
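The directory-reading loop can be sketched end to end against a temporary directory. The file names are hypothetical, and this version uses a lexical directory handle and an explicit defined test, which are choices made for the example rather than part of the notes above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directory with two plain files and one dot file.
my $dirname = tempdir(CLEANUP => 1);
for my $n ('a.txt', '.hidden', 'b.txt') {
    open my $out, '>', "$dirname/$n" or die "Cannot create $n: $!";
    close $out;
}

opendir my $dh, $dirname or die "Cannot open $dirname: $!";
my @files;
while (defined(my $name = readdir $dh)) {
    next if $name =~ /^\./;               # skip dot files, including . and ..
    my $path = "$dirname/$name";          # readdir returns bare names only
    next unless -f $path and -r $path;    # only readable plain files
    push @files, $name;
}
closedir $dh;

@files = sort @files;                     # readdir promises no particular order
print "@files\n";
```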

Recursive Directory Listing: using File::Find

Manipulating Files and Directories

Removing Files: unlink "slate", "bedrock", "lava"; unlink glob "*.o";

Renaming Files: rename "old", "new";

foreach my $file (glob "*.old") { my $newfile = $file; $newfile =~ s/\.old$/.new/;

if (-e $newfile) {warn "can't rename $file to $newfile: $newfile exists\n";

} elsif (rename $file, $newfile) { ## success, do nothing

} else { warn "rename $file to $newfile failed: $!\n";}}

(my $newfile = $file) =~ s/\.old$/.new/;

Links and Files

link "chicken", "egg" or warn "can't link chicken to egg: $!";

symlink "dodgson", "carroll" or warn "can't symlink dodgson to carroll: $!";

To find out where a symbolic link is pointing, use the readlink function. This will tell you where the symlink leads, or it will return undef if its argument wasn’t a symlink:

my $perl = readlink "/usr/local/bin/perl"; # Maybe tells where perl is

Making and Removing Directories

The rmdir operator fails for nonempty directories.

my $temp_dir = "/tmp/scratch_$$"; # based on process ID

mkdir $temp_dir, 0700 or die "cannot create $temp_dir: $!";

unlink glob "$temp_dir/* $temp_dir/.*"; rmdir $temp_dir;

Modifying Permissions: chmod 0755, "fred", "barney";

Changing Ownership: chown $user, $group, glob "*.o";

Use the getpwnam function to translate the user name into a number, and the corresponding getgrnam to translate the group name into its number:

defined(my $user = getpwnam "merlyn") or die "bad user";defined(my $group = getgrnam "users") or die "bad group"; chown $user, $group, glob "/home/merlyn/*";

Changing Timestamps

you can use the utime function to fudge the books a bit. The first two arguments give the new access time and modification time, while the remaining arguments are the list of filenames to alter to those timestamps.

my $now = time; my $ago = $now - 24 * 60 * 60; utime $now, $ago, glob "*";

The third timestamp (the ctime value) is always set to “now” whenever anything alters a file, so there’s no way to set it with the utime function.

Strings and Sorting

Finding a Substring with index: $where = index($big, $small[,$start_position]);

my $where1 = index($stuff, "w"); my $where2 = index($stuff, "w", $where1 + 1);

my $last_slash = rindex("/etc/passwd", "/");

Manipulating a Substring with substr: $part = substr($string, $initial_position, $length);

The selected portion of the string can be changed if the string is a variable:

my $string = "Hello, world!"; substr($string, 0, 5) = "Goodbye"; # $string is now "Goodbye, world!"

substr($string, -20) =~ s/fred/barney/g;
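A short sketch of index, rindex, and substr working together. The strings are sample data; index returns -1 when the substring is not found.

```perl
use strict;
use warnings;

my $stuff  = "Howdy world!";
my $where1 = index($stuff, "w");                # first w
my $where2 = index($stuff, "w", $where1 + 1);   # next w after that
my $where3 = index($stuff, "w", $where2 + 1);   # no more: -1

my $last_slash = rindex("/etc/passwd", "/");    # search from the right

# substr as an lvalue: the replacement need not have the same length.
my $string = "Hello, world!";
substr($string, 0, 5) = "Goodbye";
my $tail = substr($string, -6);                 # negative start counts from the end

print "$where1 $where2 $where3 $last_slash\n$string\n$tail\n";
```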

Formatting Data with sprintf

Using sprintf with “Money Numbers”

sub big_money {my $number = sprintf "%.2f", shift @_;

# Add one comma each time through the do-nothing loop

1 while $number =~ s/^(-?\d+)(\d\d\d)/$1,$2/;

# Put the dollar sign in the right place

$number =~ s/^(-?)/$1\$/; $number;}
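For reference, here is the same subroutine laid out over several lines and called on a few sample amounts. The input values are made up for this sketch.

```perl
use strict;
use warnings;

sub big_money {
    my $number = sprintf "%.2f", shift @_;
    # Add one comma each time through the do-nothing loop
    1 while $number =~ s/^(-?\d+)(\d\d\d)/$1,$2/;
    # Put the dollar sign in the right place
    $number =~ s/^(-?)/$1\$/;
    $number;
}

my @formatted = map { big_money($_) } (12345678.9, 1234.5, -0.5);
print "$_\n" for @formatted;
```

Each pass of the loop inserts a comma before the last group of three digits that is not yet separated, so the loop stops once fewer than four digits remain in front of the first comma or the decimal point.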

Advanced Sorting

You’ll tell Perl what order you want by making a sort-definition subroutine, or sort subroutine for short.

The sort subroutine is defined like an ordinary subroutine (well, almost). This routine will be called repeatedly, each time checking on a pair of elements from the list you’re sorting.

sub by_number { # a sort subroutine, expect $a and $b

if ($a < $b) { -1 } elsif ($a > $b) { 1 } else { 0 }}

my @result = sort by_number @some_numbers;

the spaceship operator (<=>) compares two numbers and returns −1, 0, or 1 as needed to sort them numerically: sub by_number { $a <=> $b }

three-way string-comparison operator: cmp: sub ASCIIbetically { $a cmp $b } sub case_insensitive { "\L$a" cmp "\L$b" }

For efficiency reasons, $a and $b aren’t copies of the data items. They’re actually new, temporary aliases for elements of the original list, so if we changed them, we’d be mangling the original data. You can make the code even simpler by replacing the name of the sort routine with the entire sort routine “inline,” like so: my @numbers = sort { $a <=> $b } @some_numbers;

my @descending = reverse sort { $a <=> $b } @some_numbers;

my @descending = sort { $b <=> $a } @some_numbers;

Sorting a Hash by Value

sub by_score { $score{$b} <=> $score{$a} }

my @winners = sort by_score keys %score;

Sorting by Multiple Keys

my @winners = sort by_score_and_name keys %score;

sub by_score_and_name {$score{$b} <=> $score{$a} # by descending numeric score

or $a cmp $b # ASCIIbetically by name

}
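A runnable sketch of the two-key sort, written inline. The scores are sample data chosen so that two players tie.

```perl
use strict;
use warnings;

# Hypothetical scores; barney and bamm_bamm tie at 195.
my %score = (barney => 195, fred => 205, dino => 30, bamm_bamm => 195);

my @winners = sort {
    $score{$b} <=> $score{$a}    # by descending numeric score
        or $a cmp $b             # ties broken ASCIIbetically by name
} keys %score;

print "@winners\n";
```

The or works here because <=> returns 0 (false) for a tie, so the second comparison runs only when the first cannot decide.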

Smart Matching and given-when

The Smart Match Operator

The smart match operator, ~~, looks at both of its operands and decides on its own how it should compare them. If the operands look like numbers, it does a numeric comparison. If they look like strings, it does a string comparison. If one of the operands is a regular expression, it does a pattern match.

Smart Match Precedence

Example Type of match

%a ~~ %b Hash keys identical

%a ~~ @b At least one key in %a is in @b

%a ~~ /Fred/ At least one key matches pattern

%a ~~ 'Fred' Hash key existence exists $a{Fred}

@a ~~ @b Arrays are the same

@a ~~ /Fred/ At least one element matches pattern

@a ~~ 123 At least one element is 123, numerically

@a ~~ 'Fred' At least one element is 'Fred', stringwise

$name ~~ undef $name is not defined

$name ~~ /Fred/ Pattern match

123 ~~ '123.0' Numeric equality with “numish” string

'Fred' ~~ 'Fred' String equality

123 ~~ 456 Numeric equality

The given Statement

It’s Perl’s equivalent to C’s switch statement.

given( $ARGV[0] ) {when( /fred/i ) { say 'Name has fred in it' }...

default { say "I don't see a Fred" } }

Unless you say otherwise, there is an implicit break at the end of each when block, and that tells Perl to stop the given-when construct and move on with the rest of the program.

If you use continue at the end of a when instead, Perl tries the succeeding when statements too, repeating the process it started before.

given( $ARGV[0] ) {when( $_ ~~ /fred/i ) { say 'Name has fred in it'; continue }...

when( $_ ~~ 'Fred' ) { say 'Name is Fred'; break } # OK now!

default { say "I don't see a Fred" }}

The smart match operator finds things that are the same (or mostly the same), so it doesn’t work with comparisons for greater than or less than. You can mix and match dumb and smart matching.

when with Many Items

foreach my $name ( @names ) {given( $name ) {...}}

foreach ( @names ) { when( /fred/i ) { say 'Name has fred in it'; continue } default{}}

Process Management

The system Function

system 'ls -l $HOME'; system "long_running_command with parameters &";

system 'for i in *; do echo == $i ==; cat $i; done';

Avoiding the Shell

The system operator may also be invoked with more than one argument, in which case, a shell doesn’t get involved, no matter how complicated the text:

system "tar", "cvf", $tarfile, @dirs;

system "/bin/csh", "-fc", $command_line;

The return value of the system operator is based upon the exit status of the child command. In Unix, an exit value of 0 means that everything is okay, and a nonzero exit value usually indicates that something went wrong:

unless (system "date") { print "We gave you a date, OK!\n";}

!system "rm -rf files_to_delete" or die "something went wrong";
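To look at the child's actual exit code rather than just success or failure, shift the return value of system right by eight bits (the same value is also left in $?). This sketch runs the Perl interpreter itself, found through $^X, as a stand-in child command so that it does not depend on any particular external program.

```perl
use strict;
use warnings;

# List form of system: no shell is involved.
my $ok_status = system $^X, '-e', 'exit 0';   # 0 means success
my $raw       = system $^X, '-e', 'exit 3';   # nonzero means failure
my $exit_code = $raw >> 8;                    # recover the child's exit value

print "first: $ok_status, second: exit code $exit_code\n";
```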

The exec Function

The system function creates a child process, which then scurries off to perform the requested action while Perl naps. The exec function causes the Perl process itself to perform the requested action.

exec "bedrock", "-o", "args1", @ARGV;

When we reach the exec operation, Perl locates bedrock, and “jumps into it.” At that point, there is no Perl process anymore, just the process running the bedrock command. When bedrock is finished, there’s no Perl to come back to, so we’d get a prompt back if we invoked this program from the command line.

Because Perl is no longer in control once the requested command has started, it doesn’t make any sense to have any Perl code following the exec, except for handling the error when the requested command cannot be started:

exec "date";die "date couldn't run: $!";

The Environment Variables

In Perl, the environment variables are available via the special %ENV hash. At the start of your program’s execution, %ENV holds values it has inherited from its parent process. Modifying this hash changes the environment variables, which will then be inherited by new processes and possibly used by Perl as well.

Using Backquotes to Capture Output:chomp(my $no_newline_now = `date`);

The value between backquotes is interpreted as a double-quoted string, meaning that backslash-escapes and variables are expanded appropriately.

foreach (@functions) {$about{$_} = `perldoc -t -f $_`;}

Using Backquotes in a List Context: my @who_lines = `who`

Using the same backquoted string in a list context yields a list containing one line of output per element.

Processes as Filehandles

The syntax for launching a concurrent (parallel) child process is to put the command as the “filename” for an open call, and either precede or follow the command with a vertical bar, which is the “pipe” character.

open DATE, "date|" or die "cannot pipe from date: $!";

open MAIL, "|mail merlyn" or die "cannot pipe to mail: $!";

With the vertical bar on the right, the command is launched with its standard output connected to the DATE filehandle opened for reading, similar to the way that the command date | your_program would work from the shell. In the second example, with the vertical bar on the left, the command’s standard input is connected to the MAIL filehandle opened for writing, similar to what happens with the command your_program | mail merlyn. In either case, the command is now launched and continues independently of the Perl process.

to get data from a filehandle opened for reading: my $now = <DATE>;

to send data to the mail process: print MAIL "The time is now $now";

open F, "find / -atime +90 -size +1000 -print|" or die "fork: $!";

while (<F>) {chomp; printf "%s size %dK last accessed on %s\n", $_, (1023 + -s $_)/1024, -A $_;}

Getting Down and Dirty with Fork

defined(my $pid = fork) or die "Cannot fork: $!";

unless ($pid) { exec "date"; die "cannot exec date: $!";} # Child process is here

waitpid($pid, 0);# Parent process is here

Here we’ve checked the return value from fork, which will be undef if it failed. Usually it will succeed, causing two separate processes to continue to the next line, but only the parent process has a nonzero value in $pid, so only the child process executes the exec function. The parent process skips over that and executes the waitpid function, waiting for that particular child to finish.
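The same fork, exec, and waitpid pattern, extended to read the child's exit code from $? afterwards. This is a sketch that assumes a Unix-like system; the child here is the Perl interpreter itself (via $^X) with a made-up exit value, so no external command is required.

```perl
use strict;
use warnings;

defined(my $pid = fork) or die "Cannot fork: $!";

unless ($pid) {
    # Child process is here: replace it with another program.
    exec $^X, '-e', 'exit 7';
    die "cannot exec: $!";
}

# Parent process is here: wait for that particular child.
waitpid($pid, 0);
my $child_exit = $? >> 8;    # exit code of the child that was waited for

print "child $pid exited with $child_exit\n";
```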

Sending and Receiving Signals

kill 2, 4201 or die "Cannot signal 4201 with SIGINT: $!";

unless (kill 0, $pid) {warn "$pid has gone away!";}

my $temp_directory = "/tmp/myprog.$$"; # create files below here

mkdir $temp_directory, 0700 or die "Cannot create $temp_directory: $!";

sub clean_up { unlink glob "$temp_directory/*"; rmdir $temp_directory;}

sub my_int_handler {&clean_up; die "interrupted, exiting...\n";}

$SIG{'INT'} = 'my_int_handler';

....

&clean_up;

The assignment into the special %SIG hash activates the handler (until revoked). The key is the name of the signal (without the constant SIG prefix), and the value is a string naming the subroutine, without the ampersand. From then on, if a SIGINT comes along, Perl stops whatever it’s doing and jumps immediately to the subroutine. If the subroutine returns rather than exits, execution resumes right where it was interrupted. This can be useful if the interrupt actually needs to interrupt something rather than cause it to stop.

Just set a flag in the interrupt procedure, and check it at the end of each line’s processing:

my $int_count;sub my_int_handler { $int_count++ } $SIG{'INT'} = 'my_int_handler'; $int_count = 0;

while (<SOMEFILE>) { ... some processing that takes a few seconds ...

if ($int_count) { print "[processing interrupted...]\n"; last; }} # interrupt was seen!

Some Advanced Perl Techniques

Trapping Errors with eval

When a normally fatal error happens during the execution of an eval block, the block is done running, but the program doesn’t crash. If the eval caught a fatal error, $@ will hold what would have been the program’s dying words: eval { $barney = $fred / $dino }; print "An error occurred: $@" if $@;

Because the eval is an expression, it has a return value. If there’s no error, it’s like a subroutine: the value is that of the last expression evaluated. If the eval traps a fatal error, the return value is either undef or an empty list, depending upon the context: my $barney = eval { $fred / $dino };
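A minimal sketch of both outcomes, with made-up numbers: one division that fails and one that succeeds.

```perl
use strict;
use warnings;

my ($fred, $dino) = (10, 0);               # hypothetical values; $dino is zero

my $barney = eval { $fred / $dino };       # fatal error is trapped
my $error  = $@;                           # the would-be dying words

my $pebbles = eval { $fred / 4 };          # no error: value of the last expression
my $clean   = $@;                          # empty string after a successful eval

print "barney is undefined\n" unless defined $barney;
print "error was: $error";
print "pebbles is $pebbles\n";
```

Copying $@ right after each eval is a defensive habit, since a later eval resets it.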

The exit operator terminates the program at once, even if it’s called from a subroutine inside an eval block.

Picking Items from a List with grep

my @odd_numbers = grep { $_ % 2 } 1..1000; my @matching_lines = grep { /\bfred\b/i } <FILE>;

Transforming Items from a List with map

my @formatted_data = map { &big_money($_) } @data;

print "The money numbers are:\n", map { sprintf("%25s\n", &big_money($_) ) } @data;
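A small sketch of grep and map on in-memory lists; the sample strings are assumptions. In scalar context, grep returns the number of items selected rather than the items themselves.

```perl
use strict;
use warnings;

my @odd     = grep { $_ % 2 } 1 .. 10;     # keep items for which the block is true
my @squares = map  { $_ * $_ } @odd;       # transform every item

# Scalar context: count the matches. \b keeps "Alfred" from matching.
my $fred_count = grep { /\bfred\b/i } ("Fred Flintstone", "Alfred", "barney");

print "@odd\n@squares\n$fred_count\n";
```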

Unquoted Hash Keys

If the hash key is made up of nothing but letters, digits, and underscores without starting with a digit, you may be able to omit the quote marks in the curly braces of a hash element reference: instead of $score{"fred"}, you could write simply $score{fred}. The same applies to the left side of the big arrow: my %score = ( barney => 195,...);

Slices

Perl can index into a list as if it were an array. This is a list slice.

my($name, $card_num, $addr, $home, $work, $count) = split /:/;

my $mtime = (stat $some_file)[9]; my($card_num, $count) = (split /:/)[1, 5];

Array Slice

my @numbers = @names[ 9, 0, 2, 1, 0 ]; print "Bedrock @names[ 9, 0, 2, 1, 0 ]\n";

Hash Slice

A slice is always a list.

my @three_scores = ($score{"barney"}, $score{"fred"}, $score{"dino"});

my @three_scores = @score{ qw/ barney fred dino/ };
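The three kinds of slice side by side, as a sketch with made-up data. Note the @ sigil on the array and hash slices: a slice is a list, even when it is taken from a hash.

```perl
use strict;
use warnings;

# List slice: pick fields 1 and 5 straight out of split's result.
my $line = "fred:1234:bedrock:555-1212:555-2121:3";   # hypothetical record
my ($card_num, $count) = (split /:/, $line)[1, 5];

# Array slice: indices may repeat or be negative.
my @names  = qw( zero one two three );
my @picked = @names[ 2, 0, -1 ];

# Hash slice: several values in one step, in the order requested.
my %score        = (barney => 195, fred => 205, dino => 30);
my @three_scores = @score{ qw/ barney fred dino / };

print "$card_num $count\n@picked\n@three_scores\n";
```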

