If you are using the Opera web browser, use full screen mode to see the presentation in slide show format.
Location of presentation: http://www.carrotpatch.org/reference/regexp-tutorial/
Match a pattern:
/match_pattern/mods
m{match_pattern}mods
Substitute:
s/match_pattern/replacement_text/mods
s{match_pattern}{replacement_text}mods
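The two delimiter styles above are interchangeable. A minimal sketch, assuming a made-up `$line` value and the standard `i` and `g` modifiers in place of `mods`:

```perl
use strict;
use warnings;

# Hypothetical input, chosen only to illustrate the syntax.
my $line = 'Error: disk full; error: retry';

# Match with the i (case-insensitive) modifier, once per delimiter style.
my $matched_slash = $line =~ /^error/i  ? 1 : 0;
my $matched_brace = $line =~ m{^error}i ? 1 : 0;

# Substitute on a copy with the g (global) and i modifiers.
(my $fixed = $line) =~ s{error}{warning}gi;

print "$matched_slash $matched_brace\n";   # 1 1
print "$fixed\n";                          # warning: disk full; warning: retry
```

Brace delimiters are mainly a convenience when the pattern itself contains slashes, as in paths or URLs.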
if ($msgstring =~ /^(.+)\((.+)\)$/) { $messageID = $1; $messageName = $2; }
if ($msgstring =~ m{
        ^
        ([^()]+)   # $1 - anything but ()
        \(
        (.+)       # $2 - anything
        \)
        $
    }x) { $messageID = $1; $messageName = $2; }
An example of parsing an HTML tag is given later, with a quick-and-dirty regexp, and a more complete regexp afterwards.
if ($msgstring =~ m{
        ^
        (.+?)      # $1 (non-greedy)
        \(
        (.+)       # $2 (greedy by default)
        \)
        $
    }x) { $messageID = $1; $messageName = $2; }
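The difference between the greedy and non-greedy forms only shows up when the text inside the outer parentheses contains parentheses of its own. A small sketch, assuming a made-up message string:

```perl
use strict;
use warnings;

# Hypothetical message whose name itself contains parentheses.
my $msgstring = 'MSG42(Disk (sda) full)';

# Greedy $1 runs to the LAST opening parenthesis that still lets the rest match.
my ($greedyID, $greedyName) = $msgstring =~ /^(.+)\((.+)\)$/;

# Non-greedy $1 stops at the FIRST opening parenthesis.
my ($lazyID, $lazyName) = $msgstring =~ /^(.+?)\((.+)\)$/;

printf "greedy: %s | %s\n", $greedyID, $greedyName;   # greedy: MSG42(Disk  | sda) full
printf "lazy:   %s | %s\n", $lazyID,   $lazyName;     # lazy:   MSG42 | Disk (sda) full
```

For this input the negated character class `([^()]+)` yields the same `$1` as the non-greedy version, because it cannot step over the first `(` at all.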
Verbal description of desired regexp:
Less accurate but simplified description:
$newString =~ s{
    (<img
      [^>]* \s+
      src \s*=\s*)    # $1 - stuff up to quote
    (['"])            # $2 - quote
    ([^/'"]
      [^'"]*)         # $3 - attribute value
    (\2
      [^>]* /?>
    )                 # $4 - rest
}{$1$2$docroot$3$4}gx;
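To see what the quick-and-dirty substitution does, it can be run against a short sample. This is only a sketch: the `$docroot` value and the HTML snippet are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical values, for illustration only.
my $docroot   = '/var/www/';
my $newString = q{<img src="images/a.png" alt="A"> <img src='/abs/b.png'>};

$newString =~ s{
    (<img
      [^>]* \s+
      src \s*=\s*)    # $1 - stuff up to quote
    (['"])            # $2 - quote
    ([^/'"]
      [^'"]*)         # $3 - attribute value
    (\2
      [^>]* /?>
    )                 # $4 - rest
}{$1$2$docroot$3$4}gx;

# Only the relative path gets the prefix; the path starting with / is left alone.
print "$newString\n";
# <img src="/var/www/images/a.png" alt="A"> <img src='/abs/b.png'>
```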
Introducing lookahead assertion:
$newString =~ s{
    (<img
      [^>]* \s+
      src \s*=\s*)        # $1 - stuff up to quote
    (['"])                # $2 - quote
    ((?!/)                # Not a /
      (?: (?!\2) . )*)    # $3 - attribute value (no char may be the quote)
    (\2
      [^>]* /?>
    )                     # $4 - rest
}{$1$2$docroot$3$4}gx;
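A lookahead assertion tests only the position where it is written. To keep every character of a value from being the opening quote, the assertion has to be repeated before each `.`. The sketch below contrasts the two placements on a made-up tag; the simplified `src=` pattern is an assumption for brevity, not the full pattern above.

```perl
use strict;
use warnings;

# Hypothetical tag, for illustration only.
my $tag = q{<img src="a.png" alt="x">};

# Lookahead checked once: only the first character is guaranteed not to be the quote,
# so the greedy .* runs on to the last matching quote in the tag.
my (undef, $checkedOnce) = $tag =~ m{ src= (['"]) ( (?!\1) .* ) \1 }x;

# Lookahead checked before every character: the value stops at the closing quote.
my (undef, $checkedEvery) = $tag =~ m{ src= (['"]) ( (?: (?!\1) . )* ) \1 }x;

print "$checkedOnce\n";    # a.png" alt="x
print "$checkedEvery\n";   # a.png
```

Either placement still puts the prefix in the right spot for the substitution, since `$docroot` is inserted before the value; the difference matters when the captured value itself is used.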
$newString =~ s{
    (<img
      (?: \s+ \w+ \s*=\s* (['"]) (?: (?!\2) . )* \2 )*   # $2 - quote of an earlier attribute
      \s+ src \s*=\s*)                                   # $1 - stuff up to quote
    (['"])                                               # $3 - quote
    ((?!/)                                               # Not a /
      (?: (?!\3) . )*)                                   # $4 - attribute value
    (\3
      (?: \s+ \w+ \s*=\s* (['"]) (?: (?!\6) . )* \6 )*   # $6 - quote of a later attribute
      \s*/?>
    )                                                    # $5 - rest
}{$1$3$docroot$4$5}gx;
m{ A*AAA \s+ (?:Hawaii|Bermuda) }x
m{ \b AAAA* \s+ (?:Hawaii|Bermuda) }x
'hihihihihihi' =~ / (?: (?: hi )+ )* /x
Ways that each parenthesized expression can match at the start of the string:
There are at least 20 different ways to match at the start of the string.
Note that the whole pattern can also match zero times at any position in the string.
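The count can be checked with a short recursion. This sketch assumes that each distinct way of splitting the matched `hi` copies into runs for the inner `(?: hi )+` counts as a different way to match; the helper name `ways` is made up for this example.

```perl
use strict;
use warnings;

# Number of ways (?:(?:hi)+)* can consume exactly $n copies of "hi":
# the first inner run takes 1..$n copies, the remainder is split recursively.
sub ways {
    my ($n) = @_;
    return 1 if $n == 0;                    # zero iterations of the outer group
    my $total = 0;
    $total += ways($n - $_) for 1 .. $n;    # first run takes $_ copies
    return $total;
}

my $copies = length('hihihihihihi') / 2;    # 6 copies of "hi"

# A match at the start of the string may consume anywhere from 0 to 6 copies.
my $all = 0;
$all += ways($_) for 0 .. $copies;

print "$all\n";   # 64
```

Under that assumption the number doubles with each additional `hi`, which is why nested quantifiers like this one can make a failing match so expensive.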
qr/^' (?: [^\\\'] | \\[\\\'])* '$/x,
Note the cases do not have to be literally "normal" and "special"; the important aspect is that there are two different cases that are matched by two distinct regexp patterns.
Note the regexp for a single-quoted string given earlier follows this format.
Note this is just the general outline of the unrolled loop pattern; it should be tailored for each particular regexp.
my $chars        = qr/ [^\\\'] /x;
my $escapedChars = qr/ \\[\\\'] /x;
my $re           = qr{^' $chars* (?: $escapedChars $chars* )* ' }x;
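The alternation form and the unrolled form should accept the same strings. A sketch that checks this on a few made-up candidates; the trailing `$` anchor on the unrolled pattern is an assumption added here so that the two patterns are directly comparable.

```perl
use strict;
use warnings;

# Alternation form: each iteration matches one normal or one escaped character.
my $alternation = qr/^' (?: [^\\\'] | \\[\\\'])* '$/x;

# Unrolled form: runs of normal characters, separated by escaped characters.
my $chars        = qr/ [^\\\'] /x;
my $escapedChars = qr/ \\[\\\'] /x;
my $unrolled     = qr{^' $chars* (?: $escapedChars $chars* )* ' $ }x;   # $ added (assumption)

# Hypothetical test strings: two valid single-quoted strings, two invalid ones.
my @results;
for my $candidate (q{'plain'}, q{'it\'s'}, q{'bad'quote'}, q{'trailing\'}) {
    my $byAlternation = $candidate =~ $alternation ? 1 : 0;
    my $byUnrolled    = $candidate =~ $unrolled    ? 1 : 0;
    push @results, "$byAlternation$byUnrolled";
    printf "%-14s alternation=%d unrolled=%d\n", $candidate, $byAlternation, $byUnrolled;
}
```

The first two candidates match under both patterns and the last two match under neither. The unrolled form reaches the same verdict while consuming each run of normal characters in a single step rather than one alternation attempt per character.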