Regular expressions with ASCII values

I was writing some unit tests today for a data format. The format uses the ASCII control characters FS, GS, RS, and US (file, group, record, and unit separator) as delimiters.

A sample format might look like this:

1.03:1[us]00[rs]2[us]01[rs]10[us]01[gs]
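Each bracketed placeholder in that sample stands for a single control character. As a rough sketch of how the sample could be built in C# with the real characters (the class, constant, and method names here are my own for illustration, not part of the format):

```csharp
using System;

public static class Separators
{
    // ASCII control characters; their Unicode code points are the same values.
    public const char FS = '\u001C'; // file separator (decimal 28)
    public const char GS = '\u001D'; // group separator (decimal 29)
    public const char RS = '\u001E'; // record separator (decimal 30)
    public const char US = '\u001F'; // unit separator (decimal 31)

    // Hypothetical helper: builds the sample record with real control characters
    // in place of the [us], [rs], and [gs] placeholders.
    public static string BuildSample()
    {
        return "1.03:1" + US + "00" + RS + "2" + US + "01" + RS + "10" + US + "01" + GS;
    }
}
```

A string built this way is handy as input for a test, since the control characters are invisible when typed or pasted directly.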

So, in my test, I wanted to verify that my string started with “1.03:” and ended with “[rs]10[us]someValue[gs]”. At first I didn’t know how to match those pesky control characters. After a bit of Googling, I found the answer, and it’s actually pretty simple: in a regular expression, \u followed by four hexadecimal digits matches the character with that Unicode code point. The ASCII control characters keep the same values in Unicode, so a quick ASCII table lookup gives FS = \u001C, GS = \u001D, RS = \u001E, and US = \u001F. With that, I came up with this regular expression:

Regex.IsMatch(contents, @"^1\.03:.*?\u001E10\u001F0*" + expected + @"\u001D$")
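For use in more than one test, the check could be wrapped in a small helper. The following is only a sketch under a few assumptions of mine: the class and method names are made up, the pattern is anchored with ^ and $ and escapes the dot in the version prefix so it matches the start and end of the string literally, and the expected value goes through Regex.Escape so that any regex metacharacters in it are treated as plain text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormatAssertions
{
    // Hypothetical helper: returns true when contents starts with "1.03:" and
    // ends with RS "10" US <optional leading zeros> expected GS.
    public static bool EndsWithExpectedValue(string contents, string expected)
    {
        // In a verbatim string the \uXXXX escapes reach the regex engine as-is,
        // and the engine interprets them as the control characters.
        string pattern = @"^1\.03:.*?\u001E10\u001F0*"
                         + Regex.Escape(expected)
                         + @"\u001D$";
        return Regex.IsMatch(contents, pattern);
    }
}
```

With the sample record from earlier, calling the helper with an expected value of "1" or "01" should pass, while "2" should not, because the value after the last unit separator is 01.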

Thanks for being so awesome, regular expressions!

Author: Adam Prescott

I'm enthusiastic and passionate about creating intuitive, great-looking software. I strive to find the simplest solutions to complex problems, and I embrace agile principles and test-driven development.
