Saturday, June 13, 2009

Regular Expressions in PowerShell

Regular expressions are among the most powerful pattern-matching tools available for text parsing. Almost all programming languages support them in some form, and Perl in particular is known for its feature-rich regex engine.

Windows PowerShell inherits excellent regular expression support from the .NET Framework.

Regex pattern matching in PowerShell is easy. We will go through some examples of regular expressions in Windows PowerShell.

Prerequisite: Some knowledge of regular expressions would be good.


PS C:\Scripts> 'Hello World' -match '^H'
True
PS C:\Scripts> 'ello World' -match '^H'
False

The -match operator returns True if a match is found; otherwise it returns False.
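
One caveat worth knowing: the True/False result applies when the left-hand side is a single string. When it is a collection, -match acts as a filter and returns the matching elements instead. A minimal sketch, with the sample strings and variable names chosen purely for illustration:

```powershell
# Scalar input: -match returns a Boolean.
$isMatch = 'Hello World' -match '^H'

# Collection input: -match returns the elements that match.
$lines = 'Hello World', 'ello World', 'Hi there'
$startsWithH = $lines -match '^H'

$isMatch        # True
$startsWithH    # Hello World and Hi there, one per line
```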

Let's see one more example


PS C:\Scripts> '2009-Jun-13' -match '\d{4}-[A-Za-z]{3}-\d{2}'
True
PS C:\Scripts> '2009-June-13' -match '\d{4}-[A-Za-z]{3}-\d{2}'
False
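
Note that the pattern above is unanchored, so it only needs to match somewhere inside the string. If the goal is to validate the whole string, anchoring with ^ and $ is one option. A small sketch; the test strings and variable names are made up for illustration:

```powershell
$unanchored = '\d{4}-[A-Za-z]{3}-\d{2}'
$anchored   = '^\d{4}-[A-Za-z]{3}-\d{2}$'

$r1 = 'x2009-Jun-131' -match $unanchored   # True: the pattern matches inside the string
$r2 = 'x2009-Jun-131' -match $anchored     # False: extra characters before and after
$r3 = '2009-Jun-13'   -match $anchored     # True: the whole string fits the pattern
```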

The -replace operator is used for search and replace in PowerShell.


PS C:\Scripts> '2009-June-13' -replace "2009", "2010"
2010-June-13
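
Keep in mind that the first argument to -replace is itself a regular expression, not a literal string, and that the replacement can be omitted to delete the matched text. A brief sketch using the same date string (the variable names are only for illustration):

```powershell
# The search term is a regex: \d+ matches each run of digits.
$masked = '2009-June-13' -replace '\d+', '#'      # #-June-#

# With no replacement argument, the matches are removed.
$noDashes = '2009-June-13' -replace '-'           # 2009June13

# If nothing matches, the input comes back unchanged.
$same = '2009-June-13' -replace 'xyz', 'abc'      # 2009-June-13
```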

How do you capture part of a match and reuse it in the replacement? Let's see how.


PS C:\Scripts> '2009-Jun-13' -replace "(\d{4})-([A-Za-z]{3})-(\d{2})", "$2 $3 $1"

PS C:\Scripts> '2009-Jun-13' -replace "(\d{4})-([A-Za-z]{3})-(\d{2})", '$2 $3 $1'
Jun 13 2009

In a replacement string, the .NET regex engine substitutes $1, $2, and so on with the text captured by the corresponding groups. These are regex substitution tokens, not PowerShell variables. Inside a double-quoted string, however, PowerShell expands $1, $2, and $3 as ordinary variables before the regex engine ever sees them. Since they are undefined here, they expand to empty strings, which is why the first statement above printed nothing but spaces. To make it work, use a single-quoted replacement string, as in the second statement.

To get a better idea of this, let's see another example.


PS C:\Scripts> $1="How"
PS C:\Scripts> $2="are"
PS C:\Scripts> $3="you"
PS C:\Scripts>
PS C:\Scripts> '2009-Jun-13' -replace "(\d{4})-([A-Za-z]{3})-(\d{2})", "$2 $3 $1"
are you How
PS C:\Scripts> '2009-Jun-13' -replace "(\d{4})-([A-Za-z]{3})-(\d{2})", '$2 $3 $1'
Jun 13 2009
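
If you do need a double-quoted replacement, for example to mix a real PowerShell variable with captured groups, one option is to escape the dollar sign of each group reference with a backtick. The sketch below assumes a hypothetical $separator variable, and also shows named groups (with names chosen here for illustration) as a more readable alternative:

```powershell
$separator = '/'   # hypothetical variable, for illustration only

# `$2 reaches the regex engine as the literal text $2; $separator is expanded by PowerShell.
$mixed = '2009-Jun-13' -replace '(\d{4})-([A-Za-z]{3})-(\d{2})', "`$2$separator`$3$separator`$1"

# Named groups are referenced as ${name} in the replacement (single-quoted here).
$named = '2009-Jun-13' -replace '(?<year>\d{4})-(?<month>[A-Za-z]{3})-(?<day>\d{2})', '${month} ${day} ${year}'

$mixed   # Jun/13/2009
$named   # Jun 13 2009
```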

Now let's see how to use captured groups with the -match operator. Unlike search and replace, a successful -match stores its captures in the automatic variable $matches. This is a hashtable, not an array: key 0 holds the entire match, and keys 1, 2, and so on hold the captured groups.


PS C:\Scripts> '2009-Jun-13' -match '(\d{4}).*'
True
PS C:\Scripts> $matches

Name Value
---- -----
1 2009
0 2009-Jun-13

PS C:\Scripts> $matches[1]
2009
PS C:\Scripts>
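
Named groups work with -match as well: the group name becomes the key in $matches. Because $matches is only updated when a match succeeds, it is safer to read it inside an if block. A minimal sketch; the group and variable names are chosen here for illustration:

```powershell
$date = '2009-Jun-13'

if ($date -match '(?<year>\d{4})-(?<month>[A-Za-z]{3})-(?<day>\d{2})') {
    $year  = $matches['year']    # 2009
    $month = $matches.month      # Jun (property-style access also works on hashtables)
    $day   = $matches['day']     # 13
    $formatted = "$month $day, $year"
    $formatted                   # Jun 13, 2009
}
```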

Regular expressions in PowerShell are case-insensitive by default. For case-sensitive pattern matching, use the following operators:

1. -cmatch
2. -creplace


PS C:\Scripts> 'HELLO' -match '[A-Z]'
True
PS C:\Scripts> 'HELLO' -match '[a-z]'
True
PS C:\Scripts>
PS C:\Scripts> 'HELLO' -cmatch '[A-Z]'
True
PS C:\Scripts> 'HELLO' -cmatch '[a-z]'
False
PS C:\Scripts>
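
The same distinction applies to -replace and -creplace. There is also an explicit case-insensitive form, -imatch, and the .NET inline option (?i) can switch case sensitivity off inside a pattern. A short sketch with made-up input and variable names:

```powershell
$insensitive = 'Hello hello' -replace  'hello', 'Bye'   # Bye Bye
$sensitive   = 'Hello hello' -creplace 'hello', 'Bye'   # Hello Bye

$explicit = 'HELLO' -imatch '[a-z]'       # True: same behavior as -match
$inline   = 'HELLO' -cmatch '(?i)[a-z]'   # True: (?i) makes the pattern ignore case again
```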

2 comments:

  1. Regular expressions are a technique used for parsing. The concept is not easy to understand and requires a lot of concentration. You need to get the basics right. This post does the job for those who want to learn regular expressions in PowerShell.

  2. Yes. Those people should buy a book
