The correct syntax when using capturing groups in PowerShell
When working with regular expressions in PowerShell, capturing groups allow you to extract specific parts of a matched pattern. However, there are two important syntax considerations when referencing these groups in your code.
But first, let us understand what a capturing group is.
What is a Capturing Group?
A capturing group in PowerShell is a part of a regular expression enclosed in parentheses (). It allows you to “capture” the matched text inside the parentheses and reference it later.
These groups can then be used, for example, together with the -replace operator when working with strings. The first capturing group is numbered 1, the second 2, and so on, and the groups can be referenced in the replacement string with the $1, $2, etc., notation.
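As a minimal sketch of numbered groups in action, the following example reorders the parts of a date. The sample date, the pattern, and the variable name $swapped are illustrative assumptions and not part of the original example. The replacement string is single-quoted, so PowerShell passes $1, $2, and $3 to the regex engine unchanged.

```powershell
# Three capturing groups: year (1), month (2), day (3).
$swapped = '2024-05-17' -replace '(\d{4})-(\d{2})-(\d{2})', '$3/$2/$1'

Write-Host $swapped # Will print 17/05/2024
```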
However, when working with this notation, you have to pay attention to the following two edge cases.
The use of the backtick ` in your strings
Consider the following example:
$newValue = "ThisIsAString"
$result = "key=ValueToBeReplaced--" -replace "(key=)[a-zA-Z0-9]+(--)", "`$1$newValue`$2"
Write-Host $result # Will print key=ThisIsAString--
We have the $newValue variable, whose value should replace the text between key= and --. To do that, we put key= into one capturing group and -- into a second.
We want to create a new string containing the value of the first capturing group, followed by the value of $newValue, and finishing with the second capturing group. For this, we use the following replacement string: "`$1$newValue`$2".
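The first edge case here is the use of the backtick `. Inside a double-quoted string, PowerShell would otherwise expand $1 and $2 as its own variables before the regular expression engine ever sees them. The backtick escapes the dollar sign, so it reaches the regex engine literally and is interpreted as a capturing group reference, while $newValue is still expanded by PowerShell.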
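To see what the backtick changes, the sketch below runs the same replacement with three differently written replacement strings. It assumes a default session in which no variables named 1 or 2 are defined and strict mode is off. The names $text, $pattern, $unescaped, $escaped, and $singleQuoted are illustrative only.

```powershell
$newValue = 'ThisIsAString'
$text     = 'key=ValueToBeReplaced--'
$pattern  = '(key=)[a-zA-Z0-9]+(--)'

# No backtick: PowerShell expands $1 and $2 as (undefined, hence empty) variables,
# so the regex engine only receives "ThisIsAString" and both groups are lost.
$unescaped = $text -replace $pattern, "$1$newValue$2"

# Backtick: the dollar signs survive string expansion and reach the regex engine.
$escaped = $text -replace $pattern, "`$1$newValue`$2"

# Alternative: single-quoted literals are never expanded, so no backtick is needed.
$singleQuoted = $text -replace $pattern, ('$1' + $newValue + '$2')

Write-Host $unescaped    # Will print ThisIsAString
Write-Host $escaped      # Will print key=ThisIsAString--
Write-Host $singleQuoted # Will print key=ThisIsAString--
```

The single-quoted variant trades the backticks for string concatenation. Which one reads better is a matter of taste.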
The pitfall of the $n notation
Now consider adding a number at the beginning of the value of the $newValue variable:
$newValue = "1ThisIsAString"
$result = "key=ValueToBeReplaced--" -replace "(key=)[a-zA-Z0-9]+(--)", "`$1$newValue`$2"
Write-Host $result # Will print $11ThisIsAString--
As you can see, once PowerShell has expanded $newValue, the replacement string reads $11ThisIsAString$2, so the regex engine takes the group reference to be $11. Since there is no capturing group 11 in our regular expression, $11 is printed as literal text.
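One way to confirm this is to inspect the replacement string on its own, before it is handed to -replace. The short sketch below does that. The variable name $replacement is an illustrative assumption.

```powershell
$newValue = '1ThisIsAString'

# PowerShell expands $newValue and keeps the escaped dollar signs as literals.
$replacement = "`$1$newValue`$2"

Write-Host $replacement # Will print $11ThisIsAString$2
```

The digit from $newValue ends up directly after $1, which is why the regex engine sees a reference to group 11.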
The correct approach with ${}
Updating the string to use the ${} notation for capturing groups ensures that the group references are interpreted correctly, even if they are followed by a value that starts with a digit.
$newValue = "1ThisIsAString"
$result = "key=ValueToBeReplaced--" -replace "(key=)[a-zA-Z0-9]+(--)", "`${1}$newValue`${2}"
Write-Host $result # Will print key=1ThisIsAString--
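The braces work the same way when the replacement string is assembled from single-quoted pieces, and the same ${} syntax also accepts group names. The sketch below shows both variants. The group names prefix and suffix, as well as the variables $text, $numbered, and $named, are hypothetical and chosen only for illustration.

```powershell
$newValue = '1ThisIsAString'
$text     = 'key=ValueToBeReplaced--'

# Numbered groups with braces, built from single-quoted literals (no backticks needed).
$numbered = $text -replace '(key=)[a-zA-Z0-9]+(--)', ('${1}' + $newValue + '${2}')

# Named groups: (?<name>...) defines the group, ${name} references it.
$named = $text -replace '(?<prefix>key=)[a-zA-Z0-9]+(?<suffix>--)', ('${prefix}' + $newValue + '${suffix}')

Write-Host $numbered # Will print key=1ThisIsAString--
Write-Host $named    # Will print key=1ThisIsAString--
```

Named groups can make longer patterns easier to read, since the replacement string no longer depends on counting parentheses.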
Conclusion
The two takeaways of this article are:
- Always escape the dollar sign with the backtick ` when referencing capturing groups inside double-quoted strings.
- Always use the ${n} notation when referencing capturing groups that are followed by text or variables to avoid unexpected behavior.