The correct syntax when using capturing groups in PowerShell

When working with regular expressions in PowerShell, capturing groups allow you to extract specific parts of a matched pattern. However, there are two important syntax considerations when referencing these groups in your code.

But first, let us understand what a capturing group is.

What is a Capturing Group?

A capturing group in PowerShell is a part of a regular expression enclosed in parentheses (). It allows you to “capture” the matched text inside the parentheses and reference it later.

These groups can then be used, for example, together with the -replace operator when working with strings. The first capturing group is numbered 1, the second 2, and so on, and the groups are referenced in the replacement string with the $1, $2, etc., notations.
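As a minimal sketch of this numbering (the input string and pattern are made up for illustration), the following swaps two words by referencing both groups in the replacement string:

```powershell
# Two capturing groups: the first word and the second word.
# Single quotes keep PowerShell from treating $1 and $2 as its own variables.
$swapped = 'John Smith' -replace '(\w+) (\w+)', '$2, $1'

Write-Host $swapped # Will print Smith, John
```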

However, when working with this notation, you have to pay attention to the following two edge cases.

The use of the backtick ` in your strings

Consider the following example:

$newValue = "ThisIsAString"
$result = "key=ValueToBeReplaced--" -replace "(key=)[a-zA-Z0-9]+(--)", "`$1$newValue`$2"

Write-Host $result # Will print key=ThisIsAString--

We have the $newValue variable, which holds the text that should replace everything between key= and --. For that, we put key= into the first capturing group and -- into the second.

We want to create a new string containing the value of the first capturing group, followed by the $newValue, and finishing with the second capturing group. For this, we use the following string: "`$1$newValue`$2".

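The first edge case here is the use of the backtick `. Inside a double-quoted string, PowerShell would otherwise expand $1 and $2 as its own variables before the regular expression engine ever sees them. Escaping the dollar sign with a backtick keeps $1 and $2 as literal text in the replacement string, so they are interpreted as capturing group references.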
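To see what the backtick prevents, here is a small sketch of the same replacement without it. It assumes that no variables named $1 or $2 exist in the session and that strict mode is off, in which case PowerShell expands both to empty strings:

```powershell
$newValue = "ThisIsAString"

# Without the backticks, PowerShell expands $1 and $2 itself before -replace runs.
# Both are unset variables here (assumption), so the replacement string
# becomes just "ThisIsAString" and the captured text is lost.
$result = "key=ValueToBeReplaced--" -replace "(key=)[a-zA-Z0-9]+(--)", "$1$newValue$2"

Write-Host $result # Will print ThisIsAString
```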

The pitfall of the $n notation

Now consider adding a digit at the beginning of the value of the $newValue variable:

$newValue = "1ThisIsAString"
$result = "key=ValueToBeReplaced--" -replace "(key=)[a-zA-Z0-9]+(--)", "`$1$newValue`$2"

Write-Host $result # Will print $11ThisIsAString--

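As you can see, once PowerShell has expanded $newValue, the replacement string reads $11ThisIsAString$2, so the regular expression engine looks for capturing group 11 instead of group 1 followed by the digit 1. Since there is no such capturing group in our regular expression, $11 is inserted as literal text.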

The correct approach with ${}

Updating the string to use the ${n} notation for capturing groups ensures that the group references are interpreted correctly, even if they are followed by a value that starts with a digit. The braces mark exactly where the group number ends.

$newValue = "1ThisIsAString"
$result = "key=ValueToBeReplaced--" -replace "(key=)[a-zA-Z0-9]+(--)", "`${1}$newValue`${2}"

Write-Host $result # Will print key=1ThisIsAString--
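If you would rather avoid the backtick, one possible variation is to keep the group references in single-quoted strings, which PowerShell never expands, and concatenate the variable between them. This is only a sketch of the same idea; the ${n} notation is still what separates the group number from the digit that follows:

```powershell
$newValue = "1ThisIsAString"

# Single-quoted pieces are passed through untouched; only $newValue is a PowerShell variable.
$replacement = '${1}' + $newValue + '${2}'
$result = "key=ValueToBeReplaced--" -replace "(key=)[a-zA-Z0-9]+(--)", $replacement

Write-Host $result # Will print key=1ThisIsAString--
```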

Conclusion

The two takeaways of this article are:

  1. Always escape the dollar sign with a backtick ` when referencing capturing groups inside double-quoted strings.
  2. Always use the ${n} notation when referencing capturing groups that are followed by text or variables to avoid unexpected behavior.