02 - Perl Programming Regular Expression
Danairat T.
Line ID: Danairat FB: Danairat Thanabodithammachari +668-1559-1446

Perl Regular Expressions
• Regular expressions are a powerful, flexible, and efficient tool for text processing. They work like a mini programming language.
• You can use regular expressions to verify whether input matches a text pattern, to find a pattern within a larger body of text, and to replace text matching the pattern with other text.

Regular Expressions - Topics
• Match Operator
  – Match Operator Modifiers
• Substitution Operator
  – Substitution Operator Modifiers
• Translation Operator
  – Translation Operator Modifiers
• Regular Expression Elements
  – Metacharacters
  – Character Classes
  – Anchors
  – Pattern Quantifiers
  – Pattern Match Variables
  – Backreferencing

Match Operator
• The match operator is written m//.
• Use the match operator to test whether a string matches a given pattern. The basic form of the operator is m/PATTERN/;
• The =~ operator binds a variable to the pattern and is true when the pattern matches.
• The !~ operator binds a variable to the pattern and is true when the pattern does NOT match.

MatchEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";
if ($myString =~ m/one/) {
    print "match.";
}
exit(0);

Results:-
match.

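The !~ operator is described above but not exercised in MatchEx01.pl. The following is a minimal sketch that contrasts the two binding operators; the pattern "two" and the result variable names are illustrative assumptions, not part of the original slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";

# =~ is true when the pattern is found somewhere in the string.
my $hasOne = ($myString =~ m/one/) ? 1 : 0;

# !~ is true when the pattern is NOT found (hypothetical pattern "two").
my $lacksTwo = ($myString !~ m/two/) ? 1 : 0;

print "contains 'one': $hasOne\n";
print "does not contain 'two': $lacksTwo\n";
```
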
Match Operator
• The m can be omitted when the delimiters are slashes, leaving only //.

MatchOmitTheMEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";
if ($myString =~ /one/) {
    print "match.";
}
exit(0);

Results:-
match.

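Match Operator
• An explicit m sometimes makes the code clearer, because it lets you choose delimiters other than / and so avoid escaping slashes inside the pattern.

MatchWithMEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "/usr/local/lib";
# Without m, every slash in the pattern must be escaped.
if ($myString =~ /\/usr\/local\/lib/) {
    print "match without m\n";
}
# With m and parentheses as delimiters, no escaping is needed.
if ($myString =~ m(/usr/local/lib)) {
    print "match with m\n";
}
exit(0);

Results:-
match without m
match with m
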
Match Operator Modifiers

Modifier  Meaning
g         Match globally, i.e., find all occurrences.
i         Do case-insensitive pattern matching.
m         Treat the string as multiple lines, so that ^ and $ match at every line boundary.
o         Compile the pattern only once. Use this modifier only when the pattern contains a variable whose value will not change while the program runs, for example a pattern used inside a loop.
s         Treat the string as a single line, so that . also matches a newline.
x         Allow white space and comments in the pattern for clarity.

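The /s and /x modifiers from the table are not demonstrated elsewhere in these slides. The sketch below shows their effect on a two-line string; the sample text and variable names are assumptions chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: two lines joined by a newline.
my $text = "first line\nsecond line";

# Without /s the dot does not match a newline, so this fails.
my $withoutS = ($text =~ /first.*second/) ? 1 : 0;

# With /s the dot also matches the newline, so this succeeds.
my $withS = ($text =~ /first.*second/s) ? 1 : 0;

# With /x, white space and comments inside the pattern are ignored.
my $withX = ($text =~ / ^ first   # literal text at the start
                        \s+       # one or more whitespace characters
                        line      # literal text
                      /x) ? 1 : 0;

print "$withoutS $withS $withX\n";
```

Note that a comment inside a /x pattern must not contain the pattern delimiter, otherwise Perl ends the pattern early.
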
Match Operator Modifiers
• Normally, a match finds only the first valid match for a regular expression. With the /g modifier in list context, all matches for the expression are returned in a list.

GlobalMatchEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";
foreach my $myMatch ($myString =~ /e/g) {
    print "match.\n";
}
exit(0);

Results:-
match.
match.
match.

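As a complement to the list-context example, /g can also be used in scalar context, where each evaluation continues from the previous match. The sketch below uses the built-in pos() function to record where each match ended; the array names are illustrative assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";

# List context: /g returns every match at once.
my @matches = ($myString =~ /e/g);
my $count   = scalar @matches;

# Scalar context: each evaluation finds the next match; pos() reports
# the offset just past the most recent match.
my @positions;
while ($myString =~ /e/g) {
    push @positions, pos($myString);
}

print "count: $count\n";
print "positions: @positions\n";
```
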
Match Operator Modifiers
• The /i modifier makes the match case-insensitive.

CaseInsensitiveGlobalMatchEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";
foreach my $myMatch ($myString =~ /e/ig) {
    print "match.\n";
}
exit(0);

Results:-
match.
match.
match.
match.

Match Operator Modifiers
• When the /m modifier is used, ^ (beginning) and $ (end) match at every internal line boundary, not only at the start and end of the whole string.

MultilinesEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString =<<END_OF_LINES;
Hello
Everyone
Everyone
END_OF_LINES

foreach my $myMatch ($myString =~ /^e/igm) {
    print "match.\n";
}
exit(0);

Results:-
match.
match.

Substitution Operator
• The substitution operator is written s///.
• The substitution operator is an extension of the match operator that replaces the matched text with new text. The basic form of the operator is s/PATTERN/REPLACEMENT/;

SubstituteEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";
# s/// returns the number of substitutions made.
my $myCount = $myString =~ s/Hello/Hi/;
print "$myString \n";
print "$myCount \n";
exit(0);

Results:-
Hi Everyone
1

Substitution Operator
• The substitution operator also works with text in other languages, such as Thai.

SubstituteEx02.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";
# The replacement text is Thai for "Hello".
my $myCount = $myString =~ s/Hello/สวัสดี/;
print "$myString \n";
print "$myCount \n";
exit(0);

Results:-
สวัสดี Everyone
1

Substitution Operator Modifiers

Modifier  Meaning
g         Match globally, i.e., find all occurrences.
i         Do case-insensitive pattern matching.
m         Treat the string as multiple lines, so that ^ and $ match at every line boundary.
o         Compile the pattern only once. Use this modifier only when the pattern contains a variable whose value will not change while the program runs, for example a pattern used inside a loop.
s         Treat the string as a single line, so that . also matches a newline.
x         Allow white space and comments in the pattern for clarity.
e         Evaluate the replacement as a Perl expression and use its return value as the replacement text.

Substitution Operator Modifiers
• The substitution operator can convert character case by combining the \L and \u escapes with the /i and /g modifiers.

ChangeCaseEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "hELlo eveRyoNe";
# \w matches any word (alphanumeric) character
# + matches one or more of the preceding element
# \L lowercases the captured word, then \u uppercases its first letter
my $myCount = $myString =~ s/(\w+)/\u\L$1/ig;
print "$myString \n";
print "$myCount \n";
exit(0);

Results:-
Hello Everyone
2

Substitution Operator Modifiers
• Use substitution with /m to match at the start of each line in multiline text.

MultiLinesSubstituteEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString =<<END_OF_LINES;
Hello
Everyone
Everyone
END_OF_LINES

$myString =~ s/^every/Any/igm;
print $myString . "\n";
exit(0);

Results:-
Hello
Anyone
Anyone

Substitution Operator Modifiers
• The /e modifier causes Perl to evaluate the REPLACEMENT text as a Perl expression and then use its value as the replacement string. For example, the following converts a date from slashed day/month/year format into an eight-digit YYYYMMDD string:
    $c =~ s{(\d+)/(\d+)/(\d+)}{sprintf("%04d%02d%02d",$3,$2,$1)}e;
• We have to use sprintf in this case; otherwise, a single-digit day or month would leave the result shorter than the required eight digits. For example, 26/3/2000 would become 2000326 instead of 20000326.

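The date conversion above can be tried as a complete script. This is a minimal sketch assuming day/month/year input, as the 26/3/2000 example implies; the variables $padded and $plain are illustrative names that contrast the /e form with plain interpolation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input in day/month/year form.
my $c = "26/3/2000";

# /e evaluates the replacement as Perl code; sprintf zero-pads each field.
(my $padded = $c) =~ s{(\d+)/(\d+)/(\d+)}{sprintf("%04d%02d%02d", $3, $2, $1)}e;

# Without /e the captures are simply interpolated, so nothing is padded.
(my $plain = $c) =~ s{(\d+)/(\d+)/(\d+)}{$3$2$1};

print "$padded\n";   # 20000326
print "$plain\n";    # 2000326
```
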
Translation Operator
• The tr operator performs character-by-character translation. The following expression replaces each a with e, each b with d, and each c with f in the variable $sentence. The expression returns the number of characters translated.
    $sentence =~ tr/abc/edf/;
• Most of the special regular expression codes do not apply in tr. However, the dash still denotes a range ("between"). This statement converts a string to upper case.
    $sentence =~ tr/a-z/A-Z/;

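The two tr statements above can be run as follows. This is a small sketch with a made-up sample sentence; $changed and $upper are illustrative variable names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sentence = "a cab back";   # hypothetical sample text

# tr returns the number of characters it translated.
my $changed = ($sentence =~ tr/abc/edf/);
print "$sentence ($changed characters changed)\n";

# A range on each side maps a-z onto A-Z, i.e. upper-cases the string.
(my $upper = $sentence) =~ tr/a-z/A-Z/;
print "$upper\n";
```

The letter k is outside the search list, so it is neither changed nor counted.
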
Translation Operator Modifiers

Modifier  Meaning
c         Complement SEARCHLIST.
d         Delete found but unreplaced characters.
s         Squash duplicate replaced characters.

Translation Operator Modifiers
• The /c modifier complements SEARCHLIST, so the translation applies to every character that is NOT listed in it.

TrEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";
# Replace every character that is not a letter with a dash.
my $myCount = $myString =~ tr/a-zA-Z/-/c;
print "$myString \n";
print "$myCount \n";
exit(0);

Results:-
Hello-Everyone
1

Translation Operator Modifiers
• The /d modifier deletes every character in the search list that has no corresponding replacement character.

TrEx02.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = 'He@l*lo E%very$one';
# tr does not interpolate variables, so $ and % are literal here.
my $myCount = $myString =~ tr/@$%*//d;
print "$myString \n";
print "$myCount \n";
exit(0);

Results:-
Hello Everyone
4

Translation Operator Modifiers
• The /s modifier squashes each run of the same translated character into a single character.

TrEx03.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";
# The replacement list is empty, so letters keep their value;
# only repeated runs such as "ll" are squashed.
my $myCount = $myString =~ tr/a-zA-Z//s;
print "$myString \n";
print "$myCount \n";
exit(0);

Results:-
Helo Everyone
13

Metacharacters

Symbol  Atomic  Meaning
\       Varies  Treats the following character as a real (literal) character
^       No      True at beginning of string (or line, if /m is used)
$       No      True at end of string (or line, if /m is used)
|       No      Alternation match.
.       Yes     Match one character except the newline character.
(...)   Yes     Grouping (treat as one unit).
[...]   Yes     Looks for a set and/or range of characters, defined as a single character class. The [...] represents only a single character.

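To illustrate how [...], (...) and | from the table relate, here is a short sketch that filters a made-up word list. The list and the array names @class and @group are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical word list.
my @words = ("gray", "grey", "griy", "greyhound");

# [ae] matches exactly one character, either a or e.
my @class = grep { /^gr[ae]y$/ } @words;

# (a|e) groups an alternation; ^ and $ anchor the whole string.
my @group = grep { /^gr(a|e)y$/ } @words;

print "@class\n";
print "@group\n";
```

Both patterns accept the same words here; a character class is usually the simpler choice when each alternative is a single character.
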
Metacharacters
• The \ introduces an escape sequence, such as \t for a tab character.

UsingBackSlashEx03.pl
#!/usr/bin/perl
use strict;
use warnings;

print "Please enter word: ";
my $myWord = <STDIN>;
chomp($myWord);
if ($myWord =~ /\t/) {
    print "matched.";
}
exit(0);

Results:-
<Enter text containing a [tab] to match the pattern>

Metacharacters
• The ^ matches at the beginning of the string.

MatchBeginningEx03.pl
#!/usr/bin/perl
use strict;
use warnings;

print "Please enter word: ";
my $myWord = <STDIN>;
chomp($myWord);
if ($myWord =~ /^The/) {
    print "matched.";
}
exit(0);

Results:-
<Enter text starting with “The” to match the pattern>

Metacharacters
• The $ matches at the end of the string.

MatchEndingEx03.pl
#!/usr/bin/perl
use strict;
use warnings;

print "Please enter word: ";
my $myWord = <STDIN>;
chomp($myWord);
# The period is escaped so that it matches a literal "." only.
if ($myWord =~ /\.$/) {
    print "matched.";
}
exit(0);

Results:-
<Enter text ending with “.” to match the pattern>

Metacharacters
• The | performs an alternation match.

MatchSelectionEx03.pl
#!/usr/bin/perl
use strict;
use warnings;

print "Please enter word: ";
my $myWord = <STDIN>;
chomp($myWord);
if ($myWord =~ /apple|orange/) {
    print "matched.";
}
exit(0);

Results:-
<Enter “apple” or “orange” to match the pattern; the match is case-sensitive>

Metacharacters
• The period . matches any single character except a newline.

UsingDotEx03.pl
#!/usr/bin/perl
use strict;
use warnings;

print "Please enter word: ";
my $myWord = <STDIN>;
chomp($myWord);
if ($myWord =~ /b.ll/) {
    print "matched.";
}
exit(0);

Results:-
<Enter bill, bull, or ball to match the pattern>

Character Classes

Code  Matches
\d    A digit, same as [0-9]
\D    A nondigit, same as [^0-9]
\w    A word character (alphanumeric), same as [a-zA-Z_0-9]
\W    A non-word character, same as [^a-zA-Z_0-9]
\s    A whitespace character, same as [ \t\n\r\f]
\S    A non-whitespace character, same as [^ \t\n\r\f]
\C    Match a character (byte)
\pP   Match P-named (Unicode) property
\PP   Match non-P
\X    Match extended Unicode sequence

Character Classes

Code  Matches
\l    Lowercase the next character only
\u    Uppercase the next character only
\L    Lowercase until \E
\U    Uppercase until \E
\Q    Disable pattern metacharacters until \E
\E    End case modification or quoting

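The \Q escape is most useful when a pattern interpolates text that may contain metacharacters. The sketch below compares a raw and a quoted interpolation; the strings and variable names are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical search text containing a regex metacharacter (+).
my $needle   = "1+1=2";
my $haystack = "We know that 1+1=2 holds.";

# Unquoted, + means "one or more", so the literal text is not found.
my $raw = ($haystack =~ /$needle/) ? 1 : 0;

# \Q...\E disables metacharacters in the interpolated variable.
my $quoted = ($haystack =~ /\Q$needle\E/) ? 1 : 0;

# \u and \L also work in ordinary double-quoted strings.
my $name = "\u\LpERL";

print "$raw $quoted $name\n";
```
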
Anchors
Anchors don't match any characters; they match places within a string.

Assertion  Meaning
^          Matches at the beginning of the string (or line, if /m is used)
$          Matches at the end of the string (or line, if /m is used)
\b         Matches at a word boundary (between \w and \W)
\B         Matches at a non-word boundary
\A         Matches at the beginning of the string
\Z         Matches at the end of the string or before a final newline
\z         Matches only at the end of the string
\G         Matches where the previous m//g left off (only works with the /g modifier)

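The difference between \b and \B, and between \Z and \z, is easiest to see on a string that ends with a newline. This is a minimal sketch with made-up sample text and variable names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "cat concat\n";   # hypothetical sample, ends with a newline

# \b requires a word boundary, so only the standalone word matches.
my @whole = ($text =~ /\bcat\b/g);

# \B requires the absence of a boundary: the "cat" inside "concat".
my @inner = ($text =~ /\Bcat/g);

# \Z may match before a trailing newline; \z only at the very end.
my $bigZ   = ($text =~ /concat\Z/) ? 1 : 0;
my $smallZ = ($text =~ /concat\z/) ? 1 : 0;

print scalar(@whole), " ", scalar(@inner), " $bigZ $smallZ\n";
```
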
Pattern Quantifiers
• Pattern quantifiers specify how many times the preceding element may match. By default a quantifier is maximal (greedy) and matches as many times as it can. Each quantifier also has a minimal form, written with a question mark immediately after the quantifier, which makes Perl match as few times as possible.

Maximal  Minimal  Allowed range
{n,m}    {n,m}?   Must occur at least n times but no more than m times
{n,}     {n,}?    Must occur at least n times
{n}      {n}?     Must match exactly n times
*        *?       0 or more times (same as {0,})
+        +?       1 or more times (same as {1,})
?        ??       0 or 1 time (same as {0,1})

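A short sketch of maximal versus minimal matching, using a made-up string with two tag-like sections. The variable names are illustrative assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";   # hypothetical sample

# Maximal (greedy): .+ runs to the last > in the string.
my ($greedy) = $html =~ /<(.+)>/;

# Minimal: .+? stops at the first > that lets the match succeed.
my ($minimal) = $html =~ /<(.+?)>/;

print "greedy:  $greedy\n";
print "minimal: $minimal\n";
```
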
Character Classes
• Example

MatchChrClassEx01.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello 111Every2343one";
if ($myString =~ /^(\w+)(\s+)(\d+)(\w+)(\d+)(\w+)$/) {
    print "match." . "\n";
}
exit(0);

Results:-
match.

Character Classes
• Example

MatchChrClassEx02.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello 111Every2343one";
if ($myString =~ /^(\w+)(\s+)(\d{1,3})(\w+)(\d{1,4})(\w+)$/) {
    print "match." . "\n";
}
exit(0);

Results:-
match.

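It can be instructive to see what each group in this pattern actually captures, because \w also matches digits and the greedy (\w+) groups give back characters only when forced to. The following sketch reuses the string and pattern from the example; the @groups array is an illustrative addition.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello 111Every2343one";

# In list context a successful match returns the captured groups.
my @groups = ($myString =~ /^(\w+)(\s+)(\d{1,3})(\w+)(\d{1,4})(\w+)$/);

# Print each capture in brackets so that white space is visible.
print join("", map { "[$_]" } @groups), "\n";
```

The fourth group takes "Every234" and leaves a single digit for the fifth, since \d{1,4} needs only one digit to succeed.
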
Pattern Match Variable $1, $2, …
• Parentheses () not only group elements in a regular expression, they also remember the text those elements matched.
• Every match from a parenthesized element is saved to a special, read-only variable identified by a number.
• Use \1, \2, ... to recall a match within the matching pattern.
• Use $1, $2, ... to recall a match outside of the matching pattern.

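A common use of \1 inside a pattern is detecting a doubled word. This is a minimal sketch under that assumption; the sample sentence and variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "This is is a test";   # hypothetical sample with a doubled word

my ($doubled, $fixed) = ("", $line);

# \1 inside the pattern must match the same text that group 1 captured.
if ($line =~ /\b(\w+)\s+\1\b/) {
    $doubled = $1;   # $1 is used outside the pattern
}

# In the replacement part, $1 refers to the captured word.
$fixed =~ s/\b(\w+)\s+\1\b/$1/;

print "doubled word: $doubled\n";
print "fixed: $fixed\n";
```
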
Pattern Match Variable $1, $2, …
• Example:-

PatternMatchVarEx03.pl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Everyone Hello";
# Swap the two words by recalling the captured groups in reverse order.
my $myCount = $myString =~ s/(\w+)\s(Hello)/$2 $1/;
print "$myString \n";
print "$myCount \n";
exit(0);

Results:-
Hello Everyone
1

Pattern Match Variable: Backreferencing
The backreferencing variables are:-
• $+ Returns the last parenthesized pattern match
• $& Returns the entire matched string
• $` Returns everything before the matched string
• $' Returns everything after the matched string
Using $&, $` and $' can slow down your program noticeably, particularly on older versions of Perl.

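The four variables listed above can be inspected with a short script. This is a sketch only; the pattern and the names of the copies are assumptions chosen so that each variable holds a different value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $myString = "Hello Everyone";

my ($before, $matched, $after, $last) = ("", "", "", "");
if ($myString =~ /(E)(very)/) {
    # Copy the special variables right away: the next successful match
    # overwrites them.
    ($before, $matched, $after, $last) = ($`, $&, $', $+);
}

print "before:  [$before]\n";
print "matched: [$matched]\n";
print "after:   [$after]\n";
print "last group: [$last]\n";
```
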
Thank you
