Value Extraction with Regular Expressions in C#

Regular expressions are one of my favorite things in programming. Each time I write one, it’s like a challenging little brain teaser. One of the things that I commonly use them for is to extract data out of a string.

In the past, I’ve done this by instantiating a Regex with a pattern, checking for matches, getting a MatchCollection, iterating through its matches, and, finally, pulling my “value” out of the match’s group. That’s a whole lot of work to extract a piece of data, and I’ve always suspected there’s an easier way.
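For comparison, here is a minimal sketch of that longer route. The class name, the method name, and the leading-zero pattern are all hypothetical, chosen only for illustration; the Regex, MatchCollection, Match, and Group types come from System.Text.RegularExpressions.

```csharp
using System.Text.RegularExpressions;

public static class VerboseExtraction
{
    // Hypothetical helper: returns the text after any leading zeros,
    // or the input unchanged when the pattern does not match.
    public static string AfterLeadingZeros(string input)
    {
        // 1. Instantiate a Regex with a pattern (illustrative pattern).
        var regex = new Regex("^0+(.*)$");

        // 2. Check for a match.
        if (!regex.IsMatch(input))
        {
            return input;
        }

        // 3. Get the MatchCollection and iterate through its matches.
        MatchCollection matches = regex.Matches(input);
        foreach (Match match in matches)
        {
            // 4. Pull the value out of the first capture group.
            return match.Groups[1].Value;
        }

        return input;
    }
}
```

Calling VerboseExtraction.AfterLeadingZeros("000123") walks through all four steps just to hand back the captured group, which is the overhead I wanted to avoid.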

I figured out how to do this elegantly just the other day, and I was thrilled. I was working with an alphanumeric text field that was left-padded with 0s. I needed to strip the 0s, and my mind instantly went to regular expressions. The static Regex.Match method returns a Match, and calling Result on that Match lets you supply a replacement pattern that references capture groups. So, getting my value could be done in a single statement!

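// trim leading 0s (requires: using System.Text.RegularExpressions;)
if (value.StartsWith("0"))
{
    // Result("$1") expands to whatever the first capture group matched
    value = Regex.Match(value, "^0+(.*)$").Result("$1");
}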

For those of you who may not be as regular expression savvy, here’s what’s going on:

  • ^ – the beginning of the string; we use this so that we don’t match on a subset of the string
  • 0+ – one or more 0s
  • (.*) – zero or more characters (by default, any character except a newline); the parentheses indicate that this is a capture group
  • $ – the end of the string; we again use this so that we don’t match on a subset of the string
  • $1 – in the replacement pattern passed to Result, $n outputs the value of the nth capture group, so $1 is the first (and here the only) group
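If you want to reuse the idea, here is a small sketch that wraps it in a helper. The class and method names are my own inventions for illustration. It checks Match.Success instead of StartsWith, because Result throws (a NotSupportedException, according to the .NET documentation) when it is called on a failed match. The second method is a made-up example showing that $n works with more than one capture group.

```csharp
using System.Text.RegularExpressions;

public static class CaptureExamples
{
    // Hypothetical helper: strips leading zeros, leaving the input
    // untouched when it does not start with a zero.
    public static string TrimLeadingZeros(string value)
    {
        Match match = Regex.Match(value, "^0+(.*)$");

        // Only call Result on a successful match; it throws otherwise.
        return match.Success ? match.Result("$1") : value;
    }

    // Hypothetical helper: reorders "yyyy-MM" into "MM/yyyy" by
    // referencing two capture groups in the replacement pattern.
    public static string SwapYearMonth(string value)
    {
        Match match = Regex.Match(value, @"^(\d{4})-(\d{2})$");
        return match.Success ? match.Result("$2/$1") : value;
    }
}
```

With this in place, CaptureExamples.TrimLeadingZeros("007") gives "7", and a value made up entirely of zeros comes back as an empty string, so decide whether that is what you want for your field. For plain zero stripping, value.TrimStart('0') does the same job without a regular expression; the Result approach earns its keep once the pattern involves more than one group.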

Wonderful!
