Regular Expression Capture Groups in C#

We all know regular expressions are great for matching patterns and parsing strings, but if you really want to take your Regex game to the next level, spend some time looking at capture groups. Using capture groups–more specifically, named capture groups–makes it easier to manipulate results and replacements, and it also has a fortunate side-effect of improving readability.

To create a named capture group in a .net regular expression, use the syntax “(?<Name>pattern).” The name acts like an inline comment and allows you to reference the group using that name by using “${Name}” in Result and Replace statements.

Let’s look at an example that uses Replace. Social Security numbers are a sensitive piece of information that is often masked when displaying details. Let’s use a regular expression to hide part of the number.

var ssn = "123-45-6789";
var re = new Regex(@"\d{3}-\d{2}-(?<lastFour>\d{4})");
var masked = re.Replace(ssn, "xxx-xx-${lastFour}");
Console.WriteLine("{0} -> {1}", ssn, masked);
123-45-6789 -> xxx-xx-6789

Another common scenario is to extract a piece of data from a string. Here’s another quick example that extracts the month and year from a date string (great for grouping items in a reporting scenario!).

var date = "01/15/2012";
var re = new Regex(@"(?<month>\d{1,2})/(?<day>\d{1,2})/(?<year>\d{4})");
var monthYear = re.Match(date).Result("${year}-${month}");
Console.WriteLine("{0} -> {1}", date, monthYear);
01/15/2012 -> 2012-01
Advertisement

Value Extraction with Regular Expressions in C#

Regular expressions are one of my favorite things in programming. Each time I write one, it’s like a challenging little brain teaser. One of the things that I commonly use them for is to extract data out of a string.

In the past, I’ve done this by instantiating a Regex with a pattern, checking for matches, getting a MatchCollection, iterating through its matches, and, finally, pulling my “value” out of the match’s group. That’s a whole lot of work to extract a piece of data, and I’ve always suspected there’s an easier way.

I figured out how to do this elegantly just the other day, and I was thrilled. I was working with an alphanumeric text field that was left-padded with 0s. I needed to strip the 0s, and my mind instantly went to regular expressions. Using the static Result method, you can specify capture groups for the output. So, getting my value could be done in a single operation!

// trim leading 0s 
if (value.StartsWith("0")) 
{ 
    value = Regex.Match(value, "^0+(.*)$").Result("$1"); 
}

For those of you who may not be as regular expression savvy, here’s what’s going on:

  • ^ – the beginning of the string; we use this so that we don’t match on a subset of the string
  • 0+ – one or more 0s
  • (.*) – zero or more characters; the parentheses indicate that this is a capture group
  • $ – the end of the string; we again use this so that we don’t match on a subset of the string
  • $1 – $n can be used to output the value of a capture group

Wonderful!

%d bloggers like this: