PowerShell Problem Solver: PowerShell String Parsing with Regular Expressions - Technology Portal

3/20/2015

PowerShell Problem Solver: PowerShell String Parsing with Regular Expressions

In a recent article called PowerShell string parsing with substrings I showed you some ways to parse strings with PowerShell. For many PowerShell beginners, splitting strings works just fine. But eventually you will realize you want more control, and this is where regular expressions come into play. I am not going to try to teach you regular expressions from scratch. There is an entire chapter in the 2nd edition of PowerShell in Depth that covers regular expressions in PowerShell. You should also take a few minutes to look at the help topic about_regular_expressions.
A regular expression is a way of using a pattern to describe some piece of data. Granted, coming up with the pattern can be time-consuming. But let’s see if our string challenge can help shed some light on the subject. If you recall I am starting with a string, presumably from some log.
The goal is to extract O’Hicks, Jeffery(X.) from the string. I’ve intentionally modified my name to throw in a different character because you might face something similar. The examples I am going to show you should work for simpler strings as well. If you know for a fact what your data will look like, you might even be able to get by with simpler patterns. But enough chat. The simplest way to test whether a pattern matches is with the –Match operator.
The stuff to the right of the –match operator is the regular expression pattern. Here’s how it breaks down. Note that the escape sequences are case-sensitive: \S and \s mean different things.
  • \S means get any non-whitespace character
  • + means get one or more instances of the preceding, in this case a non-whitespace character
  • , means a literal comma
  • \s means a single whitespace
  • \S+ is a repeat of the first part.
The –match operator will return True or False. If True, you can look at the built-in $matches variable to see what matched.
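Here is a minimal sketch of that test. The sample line in $s is a hypothetical stand-in for the log string, and the pattern is simply the pieces from the list above put together, so treat both as assumptions rather than the exact values I used.

```powershell
# Hypothetical sample line (an assumption); your real log string will differ.
$s = "2015-03-20 logon: O'Hicks, Jeffery(X.)- session opened"

# -match returns $true or $false when the left side is a single string.
$found = $s -match '\S+,\s\S+'
$found

# On success, index 0 of the automatic $matches hashtable holds the matched text.
# With this sample line the match still carries a trailing hyphen.
if ($found) { $matches[0] }
```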

This is similar to what I came up with by splitting in the previous article.

As before, I can now parse out the relevant part of the string.
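A sketch of that clean-up step, again using a hypothetical sample line in which the match picks up a trailing hyphen. The variable names $raw and $name are mine for illustration, and the trimming approach is only one of several that would work.

```powershell
# Hypothetical sample line (an assumption, not the original log string).
$s = "2015-03-20 logon: O'Hicks, Jeffery(X.)- session opened"

if ($s -match '\S+,\s\S+') {
    # The loose pattern grabs everything up to the next whitespace.
    $raw = $matches[0]

    # Keep everything up to and including the last closing parenthesis.
    $name = $raw.Substring(0, $raw.LastIndexOf(')') + 1)
    $name
}
```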

The thing about regular expression patterns is that they float and can match anywhere in the string, unless you use anchors. Or you can fine-tune your pattern.
There are a few additions here.
  • \b indicates a word boundary, the position between a word character and a non-word character, which usually means there is a space before the character
  • \w means a word character: a letter, digit, or underscore
  • {1} means exactly one of the preceding, in this case word, characters
  • \. means a literal period. The period is a special regular expression character so I need to escape it with a \ so the regex engine treats it literally.
  • \) means a literal closing parenthesis. Parentheses are special characters too, so if you mean a ) you need to escape it.
As you can see, now I have exactly the result I need without any additional parsing.
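As a sketch, a pattern assembled from the pieces in the list might look like the one below. The exact pattern and the sample line are assumptions on my part; the point is that the tighter ending stops the match right after the closing parenthesis.

```powershell
# Hypothetical sample line (an assumption).
$s = "2015-03-20 logon: O'Hicks, Jeffery(X.)- session opened"

# Word boundary, name, comma, space, name, then one word character, a period
# and a closing parenthesis. \S+ backs off until \w{1}\.\) can match "X.)".
$pattern = '\b\S+,\s\S+\w{1}\.\)'

if ($s -match $pattern) {
    $matches[0]        # the matched text, with no trailing hyphen
    $matches.Values    # same text, read from the hashtable's values
}
```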

I can get the value from $matches.values. Still with me? Let’s spin your head a bit more and let me show you the REGEX object. This object starts out as a regular expression pattern.
This is a variation on what I used before.
  • \( is a literal opening parenthesis
  • . means any single character
  • * means zero or more instances of the preceding, so .* is a way of saying anything and everything
  • \) is a literal closing parenthesis
The REGEX object gives you a bit more control. Pipe $rx to Get-Member and you should see something like this:
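A minimal sketch of creating the object and inspecting it. The pattern is my guess at a variation built from the pieces listed above, so treat it as an assumption.

```powershell
# Cast a pattern string to a [regex] object (System.Text.RegularExpressions.Regex).
# The pattern itself is an assumed variation, not necessarily the original.
[regex]$rx = '\S+,\s\S+\(.*\)'

# List the methods the object exposes, such as IsMatch, Match, Matches,
# Replace and Split.
$rx | Get-Member -MemberType Method
```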

If you merely want to test and see if the pattern matches in the string you can do this.
Or use the Match method.

As I did with –Match I can get the value and parse it.
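A sketch of both approaches, assuming a hypothetical sample line and the assumed pattern with the literal parentheses:

```powershell
# Hypothetical sample line and assumed pattern.
$s = "2015-03-20 logon: O'Hicks, Jeffery(X.)- session opened"
[regex]$rx = '\S+,\s\S+\(.*\)'

# IsMatch only answers yes or no.
$rx.IsMatch($s)

# Match returns a Match object with Success, Index, Length and Value properties.
$m = $rx.Match($s)
$m.Success

# The matched text is in the Value property.
$m.Value
```

Unlike the –Match operator, the IsMatch and Match methods are case-sensitive unless you create the object with the IgnoreCase option.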

As you can see it all comes down to the pattern.
Since I’ve already found a pattern that matches exactly what I want without any additional parsing, I might as well use that.
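Something like this, where the sample line is hypothetical and the pattern is the one assembled from the earlier list (both assumptions):

```powershell
# Hypothetical sample line (an assumption).
$s = "2015-03-20 logon: O'Hicks, Jeffery(X.)- session opened"

# Reuse the tighter pattern as a [regex] object.
[regex]$rx = '\b\S+,\s\S+\w{1}\.\)'

# Value holds the matched text, so no extra parsing is needed.
$result = $rx.Match($s).Value
$result
```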

Or if you want to show off your head-spinning PowerShell skills, try this one-liner:
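One possible one-liner, sketched with the static Match method of the .NET Regex class. The input string and pattern are the same hypothetical ones used for illustration, not necessarily the original command.

```powershell
# Static Regex.Match(input, pattern) call; .Value returns the matched text.
[regex]::Match("2015-03-20 logon: O'Hicks, Jeffery(X.)- session opened", '\b\S+,\s\S+\w{1}\.\)').Value
```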
The advantage to a regular expression is that you can find matching patterns within strings without having to worry about how long the string is or how it might be formatted. You still need to know your data and it must be consistent. If some lines show the data I want as “O’Hicks, Jeffery(X.)” but others show it as “O’Hicks, Jeffery{X.}” or “O’Hicks, Jeffery X”, your regular expression pattern will be a bit more complicated.
I know many Windows IT Pros are new to regular expressions and find them difficult, but like anything it simply takes practice. So the next time you are looking to parse some string for some nugget of information, see if regular expressions can make your life easier. I strongly believe that if you want to be taken seriously as a PowerShell professional, you need to develop at least some basic proficiency with regular expressions. We’ll wrap up this mini-series next time with another aspect of regular expressions – named captures.
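To give a feel for that extra complexity, here is a sketch of one pattern that tolerates all three variations by using alternation. The sample lines and the pattern are hypothetical, and real data might call for something stricter or looser.

```powershell
# Alternation covers three endings: (X.), {X.}, or a space and a single letter.
$pattern = '\b\S+,\s\w+(\(\w\.\)|\{\w\.\}|\s\w\b)'

# Hypothetical sample lines, one per variation.
$samples = @(
    "logon: O'Hicks, Jeffery(X.) - session opened",
    "logon: O'Hicks, Jeffery{X.} - session opened",
    "logon: O'Hicks, Jeffery X - session opened"
)

# Collect the matched text from every line that matches.
$results = foreach ($line in $samples) {
    if ($line -match $pattern) { $matches[0] }
}
$results
```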
